Wiley and Royal Economic Society are collaborating with JSTOR to digitize, preserve and extend access to The
Economic Journal.
The notion of 'long-run' is inextricably linked with the concept of 'equilibrium'
in economics, although in much time series econometrics long-run analysis is
conducted without providing an explicit account of the type of equilibrium
theory that may underlie it. This 'atheoretical' approach to long-run modelling
has received a further impetus from cointegration analysis, in particular the
unrestricted VAR approach of Johansen (1988, 1991). In this paper I shall
argue against such a purely statistical approach to long-run modelling, and
discuss the alternative theory-based procedures that could be employed in
practice.
While we have learned a great deal from the theoretical literature on unit
roots and cointegration, empirical applications of this methodology have
focused on the statistical properties of the underlying economic time series,
often at the expense of theoretical insights and economic reasoning. What is
needed is a more satisfactory integration of the cointegration analysis with the
various structural economic modelling approaches that are extant in the
literature. Already important first steps have been made along these lines.1
Estimation of long-run relations can be approached at different levels,
depending on the extent to which short-run dynamics predicted (possibly) by
the economic theory under consideration are taken into account in the
specification of an economic model. Cointegration analysis, at least in the form
it has been implemented so far, does not take account of what theory may
predict concerning short-run dynamics, on the grounds that theory is concerned
with long-run behaviour only, and that such short-run effects are of lower order
of importance and in practice can be adequately modelled within an
unrestricted VAR framework. In contrast, rational expectations and RBC
models impose the short-run dynamics predicted by the theory.
The plan of the paper is as follows. Section I deals with the theoretical issues
involved in establishing links between the different notions of equilibrium in
economics and the long-run. Here I argue in favour of formulating long-run
relations as the steady state solutions of intertemporal optimisation problems
* In preparing this draft I have benefited from discussions with Michael Binder, Graham Elliott, Cheng
Hsiao, Yongcheol Shin, and Ron Smith, and comments by Terry Barker, Peter Boswijk, Paul Fisher, Brian
Henry, Huw Dixon and an anonymous referee. Partial financial support from the ESRC and the Isaac
Newton Trust Fund of Trinity College, Cambridge is gratefully acknowledged.
1 Hsiao (1995), Pesaran and Shin (1995a), and Wickens (1996) discuss the estimation and hypothesis
testing problems in the context of traditional structural models with unit-root forcing variables. Ogaki
(1992), Ogaki and Park (1995), Kashyap and Wilcox (1993), Clarida (1994), Croix and Urbain (1995)
consider the implications of the unit-root processes for estimation of long-run relations based on Euler
equations, obtained from intertemporal-rational expectations optimising models. Finally, Hercowitz and
Sampson (1991), King et al. (1991), Mellander et al. (1992), and Söderlind and Vredin (1995) examine
cointegration properties of some simple real business cycle models.
has pairs of solutions which satisfy the familiar regularity conditions, namely
5 For expositional simplicity I am considering a first-order system. But this does not involve any loss of
generality as all higher-order systems can always be reduced to the first-order canonical form given in (1).
See, for example, Binder and Pesaran (1995a, pp. 140-2), and Wickens (1995).
© Royal Economic Society 1997
Over the past decade, largely under the influence of the unit root literature,
views about how to estimate long-run relations and how to make inferences
about them have undergone important changes. Before the emergence of the
unit root literature, the traditional approach was to model relationships
between $y_t$ (the decision or endogenous variables), and $x_t$ (the forcing or
exogenous variables) in the form of stationary distributed lags or
autoregressive-distributed lags (ARDL), and then use standard asymptotic maximum
likelihood theory for estimation and inference on the long-run relations implicit
in the ARDL model. (For a survey of this literature see, for example, Hendry
et al. (1984).) The cointegrating literature, pioneered by Granger (1986), and
Engle and Granger (1987), by focussing on the relationships between unit-root
processes, seems to suggest that in the presence of unit roots, this traditional
approach is no longer applicable and should not be followed. Instead, the
long-run analysis must be conducted either in the unrestricted VAR framework of
Johansen (1988, 1991, 1995) and Reinsel and Ahn (1992), or the
semi-parametric, triangular form advocated by Phillips (1991), and its various
modifications/extensions in Phillips and Hansen (1990), Phillips and Loretan
(1991), Phillips (1995) and Saikkonen (1991, 1993).9 This literature implicitly
suggests that the traditional methods of estimation and inference are wrong. In
this section I shall argue that as far as estimation and inference involving
long-run parameters are concerned, this conclusion is premature. The main
contribution of the cointegration literature, however, has been to focus
attention on the issue of testing for the existence of long-run relations, often
taken for granted in the traditional approach.
Given the complexity of the issues involved I shall advance the arguments in
steps; starting first with the case where there is only one long-run relation (i.e.
n = 1 in the notation of the previous section); distinguishing between the cases
where the forcing variables are determined exogenously and when they are not.
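$$ \begin{pmatrix} u_t \\ e_t \end{pmatrix} \sim \text{iid}(0, \Sigma), \qquad \Sigma = \begin{pmatrix} \sigma_{uu} & \sigma_{ue} \\ \sigma_{ue} & \sigma_{ee} \end{pmatrix}. \qquad (7) $$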
Suppose that there exists a single long-run relationship between yt and xt. (The
case of multiple long-run relations will be considered in Section II.2.) Then it
9 For further references and excellent reviews of the unit-root and cointegration literature see Campbell
and Perron (1991) and Watson (1994).
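$$ \Delta y_t = c - (1 - \phi)\, y_{t-1} + [\gamma + (\sigma_{ue}/\sigma_{ee})(1 - \rho)]\, x_{t-1} + (\sigma_{ue}/\sigma_{ee})\, \Delta x_t + \eta_t, \qquad (8) $$
where $c = a(1 - \phi) - b(\sigma_{ue}/\sigma_{ee})(1 - \rho)$, and $x_t$ is distributed independently of $\eta_t$.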
When $\delta = 0$, equations (8) and (6) are 'observationally' equivalent to the
original bivariate model (5) and (6); the main difference being that in (8), $x_t$
can be treated as strictly exogenous, even if $\sigma_{ue} \neq 0$, but this is not true of $x_t$ in
(5). Irrespective of whether $x_t$ is strictly exogenous in (5), using (8) the
long-run relationship between $y_t$ and $x_t$ is given by
Yt = +xt + t (9)
where II(I), and x= /I0,(O
applicable even if the $x_t$s are endogenous, irrespective of whether they are I(1) or
not.
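As a worked check, the long-run coefficient implied by the error correction form (8) can be recovered by setting the differenced terms and the error to zero. The labels $\mu$ and $\theta$ for the long-run intercept and slope are chosen here purely for illustration, and the algebra assumes $\phi \neq 1$ and reads (8) as $\Delta y_t = c - (1-\phi) y_{t-1} + [\gamma + (\sigma_{ue}/\sigma_{ee})(1-\rho)] x_{t-1} + (\sigma_{ue}/\sigma_{ee}) \Delta x_t + \eta_t$:

```latex
\begin{aligned}
0 &= c - (1-\phi)\,y + \Bigl[\gamma + \tfrac{\sigma_{ue}}{\sigma_{ee}}(1-\rho)\Bigr]x \\
\Longrightarrow\quad y &= \mu + \theta x,
\qquad \mu = \frac{c}{1-\phi},
\qquad \theta = \frac{\gamma + (\sigma_{ue}/\sigma_{ee})(1-\rho)}{1-\phi}.
\end{aligned}
```

Under this reading, a unit root in $x_t$ ($\rho = 1$) removes the endogeneity correction from the slope, leaving $\theta = \gamma/(1-\phi)$.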
The order-augmented traditional ARDL approach has the additional
advantage that it does not require pre-testing of the regressors for the presence
of unit roots, a problem that afflicts other approaches to estimation of long-run
relations, such as the fully modified OLS approach of Phillips and Hansen
(1990). The pre-testing is particularly problematic in the
unit-root-cointegration literature where the power of the unit-root tests is typically very
low, and there is a switch in the distribution function of the test statistics as one
or more roots of the $x_t$ process approach unity. It is useful to illustrate this point
in the context of the following simple bivariate model recently analysed by
Cavanagh et al. (1995):
$$ y_t = a + \gamma x_{t-1} + \varepsilon_t, \qquad (12) $$
s-1
0 = y + (II-P) U
where $\Phi(L) = \Phi_0 - \Phi_1 L - \cdots - \Phi_p L^p$, and all the roots of $\det[\Phi(z)] = 0$ fall
outside the unit circle, and $\Phi(1) = A - B - C$, and suppose that the $k \times 1$
vector, $x_t$, follows the VAR(s) process,
$$ \Delta x_t = b_0 + (D b_1)\, t - D x_{t-1} + \sum_{i=1}^{s-1} R_i\, \Delta x_{t-i} + e_t, \qquad t = 1, 2, \ldots, T, \qquad (16) $$
the underlying variables are I(1). For a proof of this result in the case of univariate models see Chan and Wei
(1988).
When the $x_t$s are not strictly exogenous, long-run relations are no longer
given by (3) and the indirect effects of $x_t$ on $y_t$ resulting from the non-zero
correlation between $u_t$ and $e_t$ must also be taken into account. Following a
similar procedure as in Section II.1 we obtain the following structural long-run
relations:14
$$ (A - B - C)\, y_t^* = (\Gamma + \Sigma_{ue} \Sigma_{ee}^{-1} D)\, x_t^* + \eta_t^*, \qquad (17) $$
where $\Sigma_{ue} = E(u_t e_t')$ and $\Sigma_{ee} = E(e_t e_t')$. When $\Sigma_{ue} \neq 0$, these relations reduce to
the theoretical specification given by (3) only if the $x_t$s are I(1) and are not
themselves cointegrated, i.e. if $D = 0$. This case corresponds to identifying the
$x_t$s as the model's common stochastic trends (cf. Stock and Watson (1988)).
In general, when the $x_t$s are not strictly exogenous, the econometric analysis of
the long-run relations can be carried out by embedding (17) in a VAR model
that combines the two sets of equations given in (15) and (16) for $y_t$ and $x_t$. Let
$z_t = (y_t', x_t')'$, then the combined model can be written as:
p-i
L kxn kx k
which is the VAR(p) specification that underlies the cointegration analysis of
Johansen.15 Notice that the long-run relationships implied by (18) are the same
as those given by (17).16 The estimation of (18) can now proceed by the ML
method, taking account of the long-run restrictions implied by the economic
theory on the elements of the coefficient matrices $A$, $B$, $C$, $\Gamma$, and the unit root
restrictions (if any) involving the elements of $D$. Pesaran and Shin's (1995a)
long-run structural modelling approach is directly applicable to (18), and is
made fully operational (with and without I(1) exogenous variables) in Microfit
4.0 (see Pesaran and Pesaran (1996)).
In the absence of any a priori restrictions on $\Pi$, the long-run relations are not
identified, and the best that can be done is to test rank restrictions on $\Pi$,
assuming that the $z_t$s are known to be I(1). In the case where the underlying
theoretical model has a unique stable solution, $A - B - C$ will be non-singular
14 For expositional simplicity I have abstracted from the possible effects of the deterministic trends in the
long run relations.
15 In many applications of Johansen's procedure the trend coefficients in (18) are specified independently
of the long-run matrix coefficients, $\Pi$. However, as argued in Pesaran and Shin (1995a), the restrictions on
the coefficients of the trends implicit in formulation (18) that require the trend-coefficients to lie in the space
spanned by the columns of $\Pi$, ensure that the nature of the trends in the levels of $z_t$ remains invariant to the
rank of $\Pi$. But when the trend-coefficients are left unrestricted the model (18) has the unsatisfactory property
that $z_t$ will have linear deterministic trends when $\Pi$ is full rank, but $z_t$ will contain quadratic trends when
$\Pi$ is rank deficient.
16 Using (18), and ignoring the deterministic variables the long-run relations are given by $\Pi z_t^* = v_t^*$
where $u_t^* = \Sigma_{ue} \Sigma_{ee}^{-1} e_t^* + \eta_t^*$, with $E(\eta_t^* \mid x_t^*) = 0$. Hence, it readily follows that
$(A - B - C)\, E(y_t^* \mid x_t^*) = (\Gamma + \Sigma_{ue} \Sigma_{ee}^{-1} D)\, x_t^*$.
The main advantage of the above 'full' system approach over the traditional
simultaneous equations system approach lies in the fact that, in principle, the
'full' system approach allows one to test the theory's prediction that the
number of long-run relations is in fact equal to the number of the decision
variables, n. Unfortunately, the implementation of this test in practice has
encountered two major difficulties: First, to ensure that all the variables are
I(1) an important element of pre-testing will be involved. Secondly, when the
number of variables in the 'full' system exceeds 10, the cointegration test tends
to have rather poor small sample properties particularly for reasonable choices
for the order of the underlying VAR model.
Abstract: Although recent articles have stressed the importance of testing for unit roots and cointegration in time-series
analysis, practitioners have been left without a straightforward procedure to implement this advice. I propose using the
autoregressive distributed lag model and bounds cointegration test as an approach to dealing with some of the most commonly
encountered issues in time-series analysis. Through Monte Carlo experiments, I show that this procedure performs better
than existing cointegration tests under a variety of situations. I illustrate how to implement this strategy with two step-by-
step replication examples. To further aid users, I have designed software programs in order to test and dynamically model
the results from this approach.
Replication Materials: The data, code, and any additional materials required to replicate all analyses in this
article are available on the American Journal of Political Science Dataverse within the Harvard Dataverse Network, at:
https://doi.org/10.7910/DVN/MPQQC0.
Recent work in the time-series literature has stressed the importance of testing for unit roots as well as the existence of long-run relationships, or cointegration, between variables.1 Since the presence or absence of each of these characteristics ultimately determines the appropriate model, failure to perform such pretesting makes spurious inferences more likely. Even with existing tools designed to identify unit roots and test for cointegration, short series, the weak power of statistical tests, and the dangers of overfitting make pretesting time-series data particularly problematic. Although recent articles have helped to identify these issues (Grant and Lebo 2016; Keele, Linn, and Webb 2016), users have been left without a straightforward solution about how to deal with such problems.2

I propose using the autoregressive distributed lag model and associated bounds testing procedure (ARDL-bounds) developed by Pesaran, Shin, and Smith (2001) as a comprehensive approach to model specification and cointegration testing. Depending on the results of the cointegration test, this strategy absolves users from having to distinguish between stationary (henceforth I(0)) and first-order nonstationary (I(1)) regressors. This is an advantage since unit root testing is difficult in short series and introduces “a further degree of uncertainty into the analysis” (Pesaran, Shin, and Smith 2001, 289). The ARDL-bounds procedure involves the following:

1. Ensuring the dependent variable is I(1).
2. Ensuring the independent variables are not explosive or higher orders of integration than I(1).
3. Estimating the ARDL model in error correction form, and ensuring there is no autocorrelation.
4. Performing the bounds test for cointegration. Three possibilities result: (a) all regressors are I(1) and cointegrating, (b) all regressors are I(0) (by definition, they cannot cointegrate), or (c) indeterminate. An indeterminate result may
Andrew Q. Philips is assistant professor, Department of Political Science, University of Colorado at Boulder, UCB 333, Boulder, CO
80309-0333 (andrew.philips@colorado.edu).
I would like to thank Lorena Barberia, Allyson Benton, Harold Clarke, Peter Enns, Nathan Favero, Eric Guntermann, Mark Pickup, Joe
Ura, B. Dan Wood, and participants of the Texas A&M methodology brownbag lunches. Special thanks go to Soren Jordan, Paul Kellstedt,
and Guy D. Whitten. Despite this helpful advice, any errors and omissions remain my own.
1 Covariance stationary series exhibit constant mean, variance, and covariance. A linear combination of two or more first-order nonstationary
series that yields a stationary series is said to be cointegrating.
2 Grant and Lebo (2016) provide two solutions, including the one discussed herein. However, their discussion is brief.
American Journal of Political Science, Vol. 00, No. 0, xxxx 2017, Pp. 1–15
© 2017, Midwest Political Science Association DOI: 10.1111/ajps.12318
still find cointegration among some of the independent variables, although further testing and respecification (in Step 3) is required.

Surprisingly, while this method is popular in other fields (over 5,300 cites on Google Scholar as of September 2016), it has been cited and implemented only twice among American Political Science Review, American Journal of Political Science, Journal of Politics, and Political Analysis: Dickinson and Lebo (2007) and Grant and Lebo (2016).

Four contributions stand out in this article. First, I discuss why an additional time-series procedure is necessary, given recent debates about the role of error correction models (Esarey 2016; Grant and Lebo 2016; Helgason 2016; Keele, Linn, and Webb 2016). Second, I use Monte Carlo experiments to compare the performance of the ARDL-bounds cointegration test against existing alternatives, under a variety of scenarios that practitioners typically encounter. I also examine how well the model recovers substantively interesting effects, such as long-run multipliers or adjustment parameters. Third, I demonstrate the utility of the ARDL-bounds approach and the merits of dynamic interpretation through two replications. Finally, I conclude with guidelines for implementing this procedure and introduce software programs designed to help practitioners with cointegration testing and exploring the substantive implications of their results.

Unit Roots and Cointegration in Time-Series

Consider a general autoregressive distributed lag ARDL($p$, $q$) model where a series, $y_t$, is a function of a constant term, $\alpha_0$, past values of itself stretching back $p$ periods, contemporaneous and lagged values of an independent variable, $x_t$, of lag order $q$, and an independent, identically distributed (i.i.d.) error term:

$$ y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{j=0}^{q} \beta_j x_{t-j} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2). \qquad (1) $$

The data generation process for the dependent and independent variables determines how Equation (1) is estimated. If variables on both the left- and right-hand sides are I(0), they will exhibit constant mean, variance, and covariance, and the ARDL($p$, $q$) shown in Equation (1) may be used.3 Since additional lags may induce multicollinearity, lag order restrictions are often imposed. A common restriction is the ARDL(1,1) model:

$$ y_t = \alpha_0 + \alpha_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \epsilon_t. \qquad (2) $$

The contemporaneous effect of $x_t$ on $y_t$ is given by $\beta_0$. The magnitude of $\alpha_1$ informs us about the “memory” of $y_t$ (De Boef and Keele 2008). Assuming $0 < \alpha_1 < 1$, larger values indicate that movements in $y_t$ take longer to dissipate.4 The long-run effect (or long-run multiplier) is the total effect that a change in $x_t$ has on $y_t$. It is given as $\kappa_1 = (\beta_0 + \beta_1)/(1 - \alpha_1)$, and its variance is typically approximated using the delta method.

The generalized error correction model (GECM) may also be used if all variables are I(0); the most common form is the one-step GECM:

$$ \Delta y_t = \alpha_0 + \alpha_1^* y_{t-1} + \beta_0 \Delta x_t + \beta_1^* x_{t-1} + \epsilon_t, \qquad (3) $$

where the first difference of $y_t$ is a function of a constant term, $\alpha_0$, its own lag, $y_{t-1}$, the first difference of $x_t$ and its lag, $x_{t-1}$, and an i.i.d. error term, $\epsilon_t$. Although the GECM is algebraically equivalent to the ARDL(1,1) model, interpretation changes. Contemporaneous effects of a change in $x_t$ on $y_t$ are still given by $\beta_0$. The rate of adjustment, or the speed at which the total effect of a change of $x_t$ accumulates in $y_t$, is given by $\alpha_1^*$. It is used in calculating the long-run multiplier, $\kappa_1 = -\beta_1^*/\alpha_1^*$. Although obtaining variance estimates of the short-run effect is straightforward, the variance around $\kappa_1$ must be approximated using the Bewley transformation or the delta method (De Boef and Keele 2008).

The GECM is also ideal for when the dependent and independent variables are I(1) and cointegrating. In our bivariate example, if there exists some linear combination of the two I(1) series that results in a stationary series, they are said to be cointegrating. Testing is often performed using the Engle-Granger “two-step” approach (Engle and Granger 1987), which involves regressing $y_t$ on $x_t$:

$$ y_t = \kappa_0 + \kappa_1 x_t + z_t. \qquad (4) $$

If both variables are I(1), there exists one cointegrating relationship if the residuals in Equation (4), $z_t$, are stationary.5 More generally, a sufficient condition in which to use an error correction model is if all variables are I(1) and cointegrating.6

3 The stationarity condition for $y_t$ is given as $|\sum_{i=1}^{p} \alpha_i| < 1$. Such variables are said to be covariance stationary.
4 Values of $\alpha_1$ greater than one suggest an explosive series or a model mis-specification. Values less than zero suggest the series is overcorrecting or oscillating; this is rare in the social sciences.
5 This is true for any $k$ series, which can have up to $k - 1$ cointegrating relationships.
6 This condition is sufficient but not necessary; one could use other models (e.g., first differences). I focus on I(1) series since higher orders of integration are rare in political science, although this
HAVE YOUR CAKE AND EAT IT TOO? 3
Even if both series are I(1), there may not always be an underlying cointegrating relationship between them. Practitioners often conflate re-equilibration with error correction and fail to test for cointegration (Grant and Lebo 2016).7 Even if $x_t$ and $y_t$ are I(1), without cointegration, there cannot be a long-run relationship between them since (rewriting Equation 4) the linear combination of the series, $z_t = (y_{t-1} - \kappa_0 - \kappa_1 x_{t-1})$, will not be stationary. If all variables are I(1) but not cointegrating, the series can only be analyzed in first differences since a short-run relationship may still exist.

The recommendations above are straightforward in theory. In practice, identifying the correct model is nontrivial. For one, unit root tests often have size distortions and low power in small samples, making it difficult to determine whether a variable is I(0) or I(1) (Choi 2015; Maddala and Kim 1998). This difficulty is compounded since users must test each variable in order to use models such as the GECM. Series may be so highly autoregressive (near-integrated) that testing procedures cannot distinguish them from an I(1) series (De Boef and Granato 1997). Moreover, series may be fractionally integrated. While some scholars argue that these are common in political science (Box-Steffensmeier and Smith 1998; Grant and Lebo 2016; Lebo, Walker, and Clarke 2000), others remain skeptical (Keele, Linn, and Webb 2016; Pickup 2009).8 In other words, with short series (less than 100), we are often at the mercy of our tests, and we risk choosing models that are not reflective of the characteristics of our data.

As recent work has shown, many scholars have overlooked the crucial steps of testing for unit roots and cointegration (Grant and Lebo 2016). Others find that complex model specifications tend to overfit and perform poorly in small samples (Esarey 2016; Keele, Linn, and Webb 2016). While these important contributions have identified potential problems, they leave users without a clear and easy-to-implement solution. As I show in the next section, a procedure already exists that greatly eases unit root testing, includes a test for cointegration, and is simple to estimate. Moreover, when combined with dynamic simulations, these models can provide additional substantive interpretations.

A Comprehensive Approach to Time-Series Analysis

While the autoregressive distributed lag (ARDL) model and associated bounds test of Pesaran, Shin, and Smith (2001) comprise an approach already popular in economics, it remains relatively unknown in political science. It is ideal for four reasons. First, although we may suspect that all regressors are I(1), an initial model can be estimated without having to rely on unit root testing to distinguish between I(0) or I(1) regressors. Restrictions on the independent variables can then be imposed to avoid spurious conclusions of cointegration. Second, the one-step procedure for the initial cointegration test is similar to the GECM, making it easy to estimate. Third, the cointegration test is often straightforward to interpret. Fourth, this framework provides a comprehensive approach for practitioners.

The ARDL-bounds approach is shown in schematic form in Figure 1.9 As shown in step a, users must first establish whether the dependent variable is I(1). To mitigate difficulties with unit root testing, users should employ a suite of unit root tests and account for the possibility of periodicity, drift, and deterministic trends. If the dependent variable is stationary, then cointegration is not possible and any I(1) regressors must be first differenced (step f). After ensuring that all independent variables are stationary (step c), we must also check that no autocorrelation remains in the residuals (step i). As shown by step h in Figure 1, if there is autocorrelation, we can incorporate lags of the dependent and independent variables, or lagged first differences if a regressor is I(1). Lag structures are typically chosen based on theoretical expectations about the data generation process, and by minimizing information criteria such as the Akaike Information Criterion (AIC) and Schwarz-Bayesian Information Criterion (SBIC). If no autocorrelation remains, the resulting ARDL model is one where all variables are I(0), as shown in step j, a version of which was shown in Equation (1). There is no need to check for cointegration since all variables are stationary.

If the dependent variable is I(1), there may be cointegration. As shown in step b in Figure 1, we do not have to establish whether the regressors are I(0) or I(1); we of course suspect I(1), since we are testing for cointegration. However, we must ensure that there are no explosive series, seasonal unit roots, or series higher than I(1) in any of the variables. Violation of these conditions invalidates

excludes the possibility of multi-cointegration (Enders 2010, 380–82).
7 While cointegrating relationships can be estimated using GECMs, estimating GECMs does not necessarily mean two or more series are cointegrated.
8 Helgason (2016) and Esarey (2016) investigate treating data as fractionally integrated versus I(1) through Monte Carlo simulations.
9 For brevity, I do not consider fractionally integrated relationships. I discuss strategies for handling these data in the supporting information.
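The algebraic equivalence between the ARDL(1,1) model in Equation (2) and the one-step GECM in Equation (3) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the coefficient values are arbitrary, and the mapping $\alpha_1^* = \alpha_1 - 1$, $\beta_1^* = \beta_0 + \beta_1$ is obtained by subtracting $y_{t-1}$ from both sides of the ARDL(1,1) model and regrouping terms.

```python
import random


def ardl11_to_gecm(alpha0, alpha1, beta0, beta1):
    """Map ARDL(1,1) coefficients to one-step GECM coefficients.

    Subtracting y_{t-1} from both sides of the ARDL(1,1) model and adding and
    subtracting beta0 * x_{t-1} gives alpha1_star = alpha1 - 1 and
    beta1_star = beta0 + beta1.
    """
    return {"alpha0": alpha0, "alpha1_star": alpha1 - 1.0,
            "beta0": beta0, "beta1_star": beta0 + beta1}


def lrm_from_ardl(alpha1, beta0, beta1):
    """Long-run multiplier from the ARDL(1,1) form."""
    return (beta0 + beta1) / (1.0 - alpha1)


def lrm_from_gecm(alpha1_star, beta1_star):
    """Long-run multiplier from the GECM form."""
    return -beta1_star / alpha1_star


def simulate_ardl(x, e, alpha0, alpha1, beta0, beta1, y0=0.0):
    """Generate y recursively from the ARDL(1,1) model in levels."""
    y = [y0]
    for t in range(1, len(x)):
        y.append(alpha0 + alpha1 * y[-1] + beta0 * x[t] + beta1 * x[t - 1] + e[t])
    return y


def simulate_gecm(x, e, alpha0, alpha1_star, beta0, beta1_star, y0=0.0):
    """Generate y recursively from the GECM: build the change, then cumulate."""
    y = [y0]
    for t in range(1, len(x)):
        dy = (alpha0 + alpha1_star * y[-1] + beta0 * (x[t] - x[t - 1])
              + beta1_star * x[t - 1] + e[t])
        y.append(y[-1] + dy)
    return y


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.gauss(0, 1) for _ in range(50)]
    e = [rng.gauss(0, 1) for _ in range(50)]
    g = ardl11_to_gecm(0.2, 0.5, 0.2, 0.3)  # arbitrary illustrative values
    y_a = simulate_ardl(x, e, 0.2, 0.5, 0.2, 0.3)
    y_g = simulate_gecm(x, e, g["alpha0"], g["alpha1_star"], g["beta0"], g["beta1_star"])
    print(max(abs(a - b) for a, b in zip(y_a, y_g)))  # differs only by rounding
    print(lrm_from_ardl(0.5, 0.2, 0.3), lrm_from_gecm(g["alpha1_star"], g["beta1_star"]))
```

Both parameterizations generate the same series and the same long-run multiplier; only the interpretation of the coefficients differs.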
[Figure 1: Flowchart of the ARDL-bounds procedure, steps (a) through (r).]
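The three outcomes of the bounds test in Figure 1 (conclude cointegration, conclude stationary regressors, or indeterminate) amount to comparing a test statistic with a lower I(0) bound and an upper I(1) bound. The sketch below is a minimal illustration with hypothetical function names; the bounds must be taken from published critical value tables, and the numbers in the usage lines are placeholders, not tabulated values.

```python
def bounds_f_decision(f_stat, i0_bound, i1_bound):
    """Classify a bounds F-statistic against user-supplied critical bounds.

    Below the I(0) bound: fail to reject the null of no cointegration.
    Above the I(1) bound: reject the null and conclude cointegration.
    In between: the test is indeterminate and further testing is needed.
    """
    if i0_bound > i1_bound:
        raise ValueError("the I(0) bound must not exceed the I(1) bound")
    if f_stat < i0_bound:
        return "no cointegration"
    if f_stat > i1_bound:
        return "cointegration"
    return "indeterminate"


def bounds_t_decision(t_stat, i0_bound, i1_bound):
    """Classify a bounds t-statistic; the logic runs opposite to the F-test.

    The t bounds are negative, with the I(1) bound further from zero, so a
    statistic below the I(1) bound rejects the null of no cointegration.
    """
    if i1_bound > i0_bound:
        raise ValueError("the I(1) bound must not exceed the I(0) bound")
    if t_stat < i1_bound:
        return "cointegration"
    if t_stat > i0_bound:
        return "no cointegration"
    return "indeterminate"


if __name__ == "__main__":
    # Placeholder bounds for illustration only.
    print(bounds_f_decision(6.0, i0_bound=4.0, i1_bound=5.0))
    print(bounds_t_decision(-3.2, i0_bound=-2.9, i1_bound=-3.5))
```

An "indeterminate" result is the branch that sends the analyst back to unit root testing of the individual regressors and re-estimation.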
The resulting model appears as:

$$ \Delta y_t = \alpha_0^* + \theta_0 y_{t-1} + \theta_1 x_{t-1} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \sum_{j=0}^{q} \beta_j \Delta x_{t-j} + \epsilon_t. \qquad (9) $$

After estimating the ARDL-bounds model in Equation (9) and ensuring white noise residuals (steps g and k), the next step is to conduct the bounds test (step n). It tests the null hypothesis of no cointegration between the dependent variable and any regressors included in the cointegrating equation (Pesaran, Shin, and Smith 2001, 294–95). Only regressors that enter into the equation in levels (e.g., $x_{t-1}$) in Equation (9) can (potentially) cointegrate with $y_t$. The bounds F-test consists of running a Wald test or F-test on the following restriction from Equation (9):

$$ H_0: \theta_0 = \theta_1 = 0 \qquad (10) $$

under the null hypothesis that no cointegrating relationship exists between $x_t$ and $y_t$. Rejecting $H_0$ indicates that there is a cointegrating relationship between the series.

… the t- and F-statistics can be found in Pesaran, Shin, and Smith (2001, 300–304), and small-sample critical values for the F-statistic can be found in Narayan (2005, 1987–90). No small-sample critical values are currently available for the t-test, so in small samples it should only be used for confirmatory purposes. Interpretation of the bounds test is illustrated in Figure 2. Three possibilities result.

If the value of the F-statistic is lower than the stationary critical value, then we cannot reject the null hypothesis that there is no cointegrating relationship (step q in Figure 1); in fact, we can conclude that all independent variables appearing in levels are stationary, without having to conduct any further unit root testing. If this is the case, the final model specification is the first difference of the dependent variable regressed on up to $l$ lags of the independent variables appearing in levels, as well as up to $p$ and $q$ lags of the first differences of the dependent and independent variables necessary to remove autocorrelation (step r):

$$ \Delta y_t = \alpha_0 + \sum_{k=0}^{l} \delta_k x_{t-k} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \sum^{q} \cdots $$
model in error correction form is correctly specified, and investigates failing to detect cointegration when it exists
that cointegration exists between the dependent variable (Type II error).
and any independent variables appearing in levels.

If the F-statistic is between the stationary and I(1) critical values, the test is inconclusive. There could be a mix of stationary and I(1) regressors, and cointegration among the I(1) variables and the dependent variable may still exist. However, further testing is required. As shown by step m in Figure 1, the next step is to conduct unit root tests for each independent variable. Since I(0) variables cannot possibly have a cointegrating relationship with an I(1) dependent variable, they should only enter into the model in first-differenced form.¹⁴ After rerunning the ARDL model in error correction form (step e), conduct the bounds test for cointegration (step n) on the remaining I(1) regressors. If a conclusive result is reached, no further testing is required. If the test is still inconclusive, the next step is to start excluding combinations of I(1) regressors from the cointegrating equation (having a coefficient in Equation 9) and repeat steps e and n. If, after iterating through the possible combinations of independent variables, there is still no conclusive result from the bounds test, then we can conclude no cointegration. Since short-run effects between I(1) variables may still exist, the final model can be estimated in first differences.

Evaluating the t-statistic is exactly the opposite of the F-statistic; if the value of the t-statistic is lower than the I(1) critical value, then we can reject the null hypothesis of no cointegrating relationship. If the value of the t-statistic falls above the I(0) critical value, then we cannot reject the null hypothesis. Just as with the F-statistic, if the t-statistic falls between the bounds, the test is inconclusive, and more precise testing of the regressors is necessary. That is to say, we would next use unit root testing to isolate out only the I(1) variables and iterate through them as needed in order to conclude either cointegration (step o) or all I(0) regressors (step q).

Monte Carlo Evidence

To evaluate the ability of the bounds cointegration test to avoid Type I error, I generated an I(1) dependent variable, $y_t$, for series of length T = 35, 50, 80.¹⁵ Next, four independent variables, $x_{kt}$ (where k = 1, 2, 3, 4), were generated. These were completely unrelated to $y_t$, or to one another:

$$y_t = y_{t-1} + \mu_t. \tag{12}$$

$$x_{kt} = \phi_k x_{k,t-1} + \eta_{kt}. \tag{13}$$

The stochastic components $\mu_t$ and $\eta_{kt}$ are i.i.d. and independent from each other. As discussed earlier, detection of stationary variables is difficult in short series. To see the consequences of erroneously including an I(0) regressor when all other variables are I(1), I allow the autoregressive parameter for $x_{1t}$, $\phi_1$, to vary from 0.0 to 1.0 by increments of 0.10. All other independent variables are I(1) (i.e., $\phi_k = 1$ for all $k \neq 1$). Next, I ran the ARDL-bounds model:

$$\Delta y_t = \alpha_0 + \theta_0 y_{t-1} + \theta_1 x_{1,t-1} + \cdots + \theta_k x_{k,t-1} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \sum_{j=0}^{q_1} \beta_{1j} \Delta x_{1,t-j} + \cdots + \sum_{j=0}^{q_k} \beta_{kj} \Delta x_{k,t-j} + \epsilon_t. \tag{14}$$

The number of lagged first differences of $y_t$ and each $x_{kt}$ to include in Equation (14) was determined via SBIC for each of the 500 simulations conducted for all combinations of T, k, and $\phi_1$.¹⁶ After estimating Equation (14), an F-test of the null hypothesis that $\theta_0 = \theta_1 = \cdots = \theta_k = 0$ was conducted for each simulation. The resulting statistic was compared against the associated critical values of the bounds test from Narayan (2005, 1988). Since these series were independently generated, evidence of cointegration (an F-statistic greater than the I(1) critical value) is an incorrect rejection of the null hypothesis and thus a form of Type I error.¹⁷
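A minimal Python sketch of one cell of this experiment follows. It is an illustration under stated assumptions, not the article's code: the function names are hypothetical, the lag structure is fixed at one lagged difference of the dependent variable and the contemporaneous difference of each regressor rather than chosen by SBIC, the shocks are standard normal, and the I(1) upper-bound critical value must be supplied by the caller (for example, looked up in Narayan 2005).

```python
import numpy as np


def simulate_series(T, k, phi1, rng):
    """Equations (12)-(13): a random-walk y and k unrelated AR(1) regressors."""
    y = np.cumsum(rng.standard_normal(T))      # y_t = y_{t-1} + shock
    phis = np.ones(k)                          # all regressors are I(1) ...
    phis[0] = phi1                             # ... except x_1, whose AR term varies
    shocks = rng.standard_normal((T, k))
    x = np.zeros((T, k))
    for t in range(1, T):
        x[t] = phis * x[t - 1] + shocks[t]
    return y, x


def _ssr(dep, X):
    """Sum of squared OLS residuals."""
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    resid = dep - X @ coef
    return float(resid @ resid)


def bounds_f_statistic(y, x):
    """F-statistic for H0: every lagged-level coefficient in Equation (14) is zero.

    Assumption: fixed lags (p = 1, q_k = 0) instead of SBIC selection.
    """
    dy, dx = np.diff(y), np.diff(x, axis=0)
    dep = dy[1:]                                     # Delta y_t
    levels = np.column_stack([y[1:-1], x[1:-1]])     # y_{t-1}, x_{k,t-1}
    short_run = np.column_stack([dy[:-1], dx[1:]])   # Delta y_{t-1}, Delta x_{k,t}
    const = np.ones((dep.size, 1))
    unrestricted = np.column_stack([const, levels, short_run])
    restricted = np.column_stack([const, short_run])
    ssr_u, ssr_r = _ssr(dep, unrestricted), _ssr(dep, restricted)
    q = levels.shape[1]                              # number of restrictions
    df = dep.size - unrestricted.shape[1]            # residual degrees of freedom
    return ((ssr_r - ssr_u) / q) / (ssr_u / df)


def rejection_rate(T, k, phi1, upper_bound, n_sims=500, seed=0):
    """Share of simulations whose F-statistic exceeds the I(1) upper bound."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        y, x = simulate_series(T, k, phi1, rng)
        hits += bounds_f_statistic(y, x) > upper_bound
    return hits / n_sims
```

Because the series are generated independently, `rejection_rate` estimates the Type I error rate for one combination of T, k, and the autoregressive parameter of the first regressor. Calling it for values of `phi1` from 0.0 to 1.0 in steps of 0.1 would trace out one line of the kind the experiment reports.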
[Figure 3 (plots not reproduced): a grid of panels, three rows by four columns. Horizontal axis in each panel: value of the autoregressive parameter of x1t, from 0 to 1. Vertical axis: proportion of simulations, from 0 to 1.]
Note: Each plot shows the proportion of simulations finding (at p < .05) evidence of one cointegrating relationship with up
to k regressors and different numbers of observations across varying amounts of autoregression in x1t , using each of the four
cointegration testing procedures.
I compare the performance of the bounds test to two other procedures. I included the Engle-Granger two-step procedure by implementing an augmented Dickey-Fuller unit root test on the residual series, $z_t$, from the cointegrating equation: $y_t = \gamma_0 + \gamma_1 x_{1t} + \cdots + \gamma_k x_{kt} + z_t$.¹⁸ I also used the Johansen procedure for cointegration to test for the existence of a single cointegrating relationship, using both the multiple trace testing procedure as well as the number of cointegrating ranks as chosen by minimizing SBIC (Johansen 1995).¹⁹ Although cointegration tests are only supposed to be run on all-I(1) series, the purpose of this Monte Carlo experiment is to evaluate test performance when a stationary regressor is erroneously included, given that in small series an autoregressive I(0) variable may be indistinguishable from an I(1) series.

¹⁸ The same lag restrictions were placed on the additional augmenting lags of $\Delta y_{t-i}$ needed to remove autocorrelation, as determined by minimizing SBIC. Critical values are from MacKinnon (1994).

¹⁹ Lag-order selection was the same as the Engle-Granger procedure. Results of $r \neq 1$ were recorded as no evidence of Type I error.

The results from the first Monte Carlo experiment are shown in Figure 3. The level of autoregression, $\phi_1$, in the single stationary series, $x_{1t}$, is on the horizontal axis. The proportion of simulations finding evidence of cointegration is on the vertical axis; higher values indicate Type I error. When there are only 35 observations, it is clear that the bounds test is the only cointegration procedure that comes close to the conventional 5% rejection rate (shown by the thin black line). As the number of independent variables increases (each column shows the number of k regressors), all tests tend to have increased Type I error. For instance, when there are four regressors, we find spurious evidence of cointegration about 60% of the time when using the Engle-Granger test; surprisingly, its high rate of Type I error does not change as T increases. This finding underscores recent work on overfitting in short time-series (Helgason 2016; Keele,
8 ANDREW Q. PHILIPS
[Figure (plots not reproduced): three panels. Vertical axis in each panel: Number of X Variables (1, 2, 3). Horizontal axis: Proportion Finding Evidence of Cointegration at < 0.05 Level, from 0 to 1.]
Linn, and Webb 2016). Despite this, the bounds test excels at correctly failing to reject the null hypothesis of no cointegration under all scenarios. Only the Johansen test appears to have the same low rate of Type I error, but only when the level of autoregression in $x_{1t}$ approaches a unit root process.

The performance of the bounds test is notable in a number of ways. Not surprisingly, I find evidence that it, along with other cointegration tests, performs poorly in small samples. However, this is only when the length of the series is small and the number of regressors large. Even then, the rate of Type I error using the bounds test is often half that of the other cointegration tests, and it remains robust to erroneously including an I(0) regressor. Only the Johansen-BIC test has a similar level of Type I error, but only when all variables are at or near I(1). The fact that the performance of the bounds test is barely affected by autoregression indicates that it is a good test for cointegration in small samples; this is exactly when we might erroneously include an I(0) variable. Finally, while the Engle-Granger procedure is robust to autoregression in a single regressor, it has much larger Type I error as the number of regressors increases. Taken together, this evidence suggests that the bounds cointegration test has lower Type I error than other tests, and it remains robust to short series, multiple regressors, and erroneously including stationary regressors.

I next explore the likelihood that the bounds test fails to detect cointegration when it exists (Type II error). As before, I vary the number of regressors and the number of observations. However, now the independent variables cointegrate with the dependent variable:²⁰

$$x_{kt} = x_{k,t-1} + \eta_{kt}. \tag{15}$$

$$u_t = 0.75 u_{t-1} + \mu_t. \tag{16}$$

$$y_t = 0.25 x_{1t} + \cdots + 0.25 x_{kt} + u_t. \tag{17}$$

The errors $\eta_{kt}$ and $\mu_t$ are independent. This data generation process yields an adjustment parameter of −0.25 and a long-run multiplier of 0.25 for each of the k independent variables. The cointegration tests are the same as in the previous experiment, and conducted on 1,000 simulations across each combination of observations and regressors.

²⁰ A proof of this is in the supporting information.

The results of the second experiment are shown in Figure 4. Each bar depicts the proportion of simulations in which a particular cointegration test finds a cointegrating relationship, across each combination of observations and regressors. Higher values correspond with a lower rate of Type II error. For all tests, as the length of the series increases, Type II error decreases. In addition, as the number of cointegrating regressors increases, the Engle-Granger test correctly identifies cointegration at a greater rate than other tests. The bounds test has the largest Type II error rate when T = 35, although this improves sharply as the series lengthen. In addition, the proportion of simulations correctly identifying cointegration varies significantly across tests; the Engle-Granger procedure has between one-third and one-half the rate of Type II error as the bounds test, and the bounds test has about one-half the Type II error as the Johansen tests.

A number of important findings stand out from these two experiments on cointegration. The bounds test has the lowest Type I error across all scenarios; moderate Type I error (20%) occurs only when there are four regressors and 50 observations or fewer. While the bounds test is largely unaffected, the Johansen test tends to experience a rapid increase in Type I error rates when an I(0) regressor is included. Although the Engle-Granger test has the lowest Type II error rates, the bounds test tends to perform better than the Johansen tests in all scenarios, except for a single regressor or short series.

In the supporting information, I conduct eight additional Monte Carlo experiments. These include varying the adjustment parameter and long-run multiplier, using fractionally (co)integrated series, and examining the percentage of time a given cointegration test correctly or incorrectly diverges from the other three cointegration tests. I also examine the ability of the GECM and ARDL-bounds models to recover substantively interesting effects (e.g., short- and long-run effects, or the adjustment parameter). Many of the findings are consistent with those above; interested readers are directed to the brief summary in Table 1 in the supporting information.

Taken together, the Monte Carlo results suggest that the bounds test offers an ideal compromise between Type I and Type II error. Given calls for more conservative cointegration tests (Grant and Lebo 2016), the bounds test seems the prudent choice since it strongly avoids spurious cointegration, yet can still identify true cointegrating relationships, at least for weakly exogenous regressors.²¹ I show two applications of this approach below.

²¹ Were the regressors endogenous, methods such as the Johansen approach should be used.

Application I: Kelly and Enns (2010)

Kelly and Enns (2010) examine how income inequality affects public mood liberalism and support for welfare policy. The authors find that in the long run, increases in inequality are associated with the public becoming more conservative and less supportive of welfare. They find no evidence that policy liberalism, income inequality, unemployment, or inflation has any effect on public mood in the short run. There are two reasons to believe these results may be suspect. First, the number of observations is small. Second, although Kelly and Enns perform unit root testing on the dependent variable, the authors make no mention of testing the regressors.

I replicated their model of public support for welfare policy.²² Results from their GECM are shown in Table 1, Model 1. First, I ensured that the dependent variable is I(1) (see step a in the Figure 1 schematic). Results from five unit root tests are shown in Table 2. While we can reject the null hypothesis of an I(1) series using the augmented Dickey-Fuller test, more powerful ones such as the Dickey-Fuller Generalized Least Squares (DF-GLS) and Elliott-Rothenberg-Stock (ERS) tests find evidence of a unit root process.²³ Although the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test also provides mixed evidence, we can tentatively confirm that the data-generating process of the dependent variable is I(1).²⁴

²² See Table 1, Model 4, on page 864 in their article.

²³ The augmented Dickey-Fuller and Phillips-Perron tests suffer from size distortions and weak power, and they are often outperformed by the ERS and DF-GLS tests (Choi 2015, 37–54; Enders 2010, 234–37; Maddala and Kim 1998, 98–103).

²⁴ I examine the consequences of concluding stationarity in the supporting information. Although the final model differs, the substantive results remain unchanged.

After ensuring that all regressors are first-order nonstationary or less (step b in Figure 1), I then estimated the ARDL model in error correction form (step e).²⁵ Using SBIC, I found that the lag structure in the original model used by Kelly and Enns (2010) was optimal, given the data. This specification produced white noise residuals, as evidenced by a battery of post-estimation diagnostics. Thus, the ARDL-bounds model shown in Model 2 in Table 1 is identical to the original ECM in Model 1.

²⁵ Unit root tests of the first difference of policy liberalism and inequality rejected the I(2) null hypothesis; results are in the supporting information.

Since the model appears to be dynamically stable, we next use the bounds test to identify whether a cointegrating relationship exists between support toward welfare policy, policy liberalism, and income inequality (step n in Figure 1). An F-test that the parameters on the variables appearing in lagged levels (Welfare$_{t-1}$, Policy Liberalism$_{t-1}$, and Income Inequality$_{t-1}$) are jointly equal to zero yields an F-statistic of 4.15. Although Narayan (2005) provides the small-sample critical values necessary to evaluate this statistic, these are also available in Stata and R using the programs pssbounds and pss,
TABLE 1 Results of the ARDL-Bounds Model for Welfare Policy Mood (Kelly and Enns 2010)
respectively (Jordan and Philips 2016; Philips 2016b). The critical values for 33 observations and two regressors are a lower stationary bound of 4.183 and an upper I(1) bound of 5.333. Strictly speaking, the F-statistic is below the stationary lower bound, so we might conclude that all regressors are stationary (step q in Figure 1). However, given that the test result was so close to the I(0) lower bound of the test, we may want to treat the result as inconclusive, which means that further testing is needed.²⁶

²⁶ Moreover, the one-sided bounds t-test on the significance of the lagged dependent variable, −3.46, falls between the asymptotic upper I(0) and lower I(1) critical bounds of −2.86 and −3.53, respectively; this supports the “inconclusive” decision.

Although the results of the cointegration test were borderline inconclusive with both policy liberalism and income inequality, a single regressor may still cointegrate with welfare policy mood. The next step is to test that the regressors are I(1), since any I(0) regressor can easily be excluded from the cointegrating equation (step m in Figure 1). Unit root testing (available in the supporting information) indicated that both policy liberalism and income inequality are I(1).

Since unit root testing did not narrow down which series should not appear in the cointegrating equation, I estimated two different models (step n). In Model 3, I test to see whether only income inequality has a cointegrating relationship with public mood toward welfare.
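The comparisons applied to Model 2 above (an F-statistic of 4.15 against bounds of 4.183 and 5.333, and a t-statistic of −3.46 against bounds of −2.86 and −3.53) amount to a simple three-way decision rule. The Python sketch below is only an illustration of that rule; the function names and the returned labels are hypothetical, and the critical values must still be looked up for the relevant sample size and number of regressors.

```python
def bounds_f_decision(f_stat, i0_bound, i1_bound):
    """Classify a bounds F-statistic against the I(0) and I(1) critical values."""
    if f_stat > i1_bound:
        return "cointegration"
    if f_stat < i0_bound:
        return "all regressors I(0)"
    return "inconclusive"


def bounds_t_decision(t_stat, i0_bound, i1_bound):
    """Classify the one-sided t-statistic on the lagged dependent variable.

    The t-test runs in the opposite direction: more negative values favor
    cointegration, so i1_bound is the more negative of the two bounds.
    """
    if t_stat < i1_bound:
        return "cointegration"
    if t_stat > i0_bound:
        return "fail to reject no cointegration"
    return "inconclusive"


# Values reported in the text for the Kelly and Enns (2010) replication.
print(bounds_f_decision(4.15, 4.183, 5.333))   # all regressors I(0)
print(bounds_t_decision(-3.46, -2.86, -3.53))  # inconclusive
```

The strict reading of the F-test and the reading of the t-test disagree here, which is consistent with treating the overall result as inconclusive and moving on to unit root tests of the regressors.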
HAVE YOUR CAKE AND EAT IT TOO? 11
Therefore, policy liberalism does not appear in levels in Model 3. In order to produce white noise residuals (steps g and k), the lagged first difference of policy liberalism was included. Because Model 3 reflects a data-generating process where only income inequality is cointegrating, evidence of cointegration in Model 3 would indicate that income inequality, not policy liberalism, is cointegrating with public mood toward welfare. An F-test of the significance of the lagged variables in Model 3 yields an F-statistic of 1.72. Since this is below the critical value of 5.290 for the I(0) lower bound and 6.175 for the I(1) upper bound, we can conclude that income inequality and public mood toward welfare are not cointegrating.

TABLE 2 Public Mood Toward Welfare Is I(1) (Kelly and Enns 2010)

Unit Root Test                                            Welfare
Augmented Dickey-Fuller (with drift)                      −2.05*
Phillips-Perron                                           −1.94
Dickey-Fuller GLS (with trend)                            −2.55
Elliott-Rothenberg-Stock                                  −2.55
Kwiatkowski-Phillips-Schmidt-Shin (H0 = stationary)       0.49* (no lag), 0.29 (1 lag)
Conclusion                                                I(1)

Note: Thirty-three observations with 1-year lag are included for all tests unless otherwise noted. H0 = series contains a unit root for all tests except KPSS. * p < .05.

Next, I test to see whether only policy liberalism has a cointegrating relationship with public mood toward welfare. Therefore, in Model 4, income inequality does not appear in levels. To produce white noise residuals, one lag of the first difference of income inequality was included. For Model 4, a rejection of the null hypothesis using the bounds test would suggest that policy liberalism, not income inequality, is cointegrating with public mood toward welfare. An F-test of the significance of the lagged variables yields an F-statistic of 3.57. Since this falls below the I(0) critical value of 5.290 (as well as the upper I(1) critical value, 6.175), we can conclude that policy liberalism and public mood toward welfare are not cointegrating.

Since neither income inequality nor policy liberalism on its own appears to have a cointegrating relationship with welfare policy mood (nor do the three variables all together, as found in Model 2), we can conclude that there is no cointegration (step q). Since the two independent variables may still affect public mood toward welfare in the short run, we may run a model of first differences (step r). This is shown in Model 5 in Table 1. The results indicate that income inequality and policy liberalism do not have a statistically significant effect on the public’s feelings toward welfare policy in the short run, a similar conclusion to what Kelly and Enns (2010) find.²⁷

²⁷ What differs is that the authors find evidence of a long-run effect, whereas the ARDL-bounds approach does not.

This replication is informative since it shows how one should proceed, given an inconclusive bounds test result. After finding that all regressors were I(1), I proceeded to iterate through two different models, excluding one of the regressors from the cointegrating equation in Models 3 and 4. Since there was no evidence for cointegration when isolating out income inequality and policy liberalism, the final model was one of first differences since the error correction framework is no longer appropriate.

While suggestive, this replication does not completely overturn the findings of Kelly and Enns (2010). Short series introduce a large amount of uncertainty into cointegration tests, so it seems reasonable that different researchers might come to different conclusions.²⁸ Overall, given the best available methods, there appear to be null findings in their model of public mood toward welfare.²⁹

²⁸ The Monte Carlo results show that while the bounds test tends to avoid spurious conclusions of cointegration in small samples, it also tends to have a high rate of false negatives; thus, it is hard to ascertain whether their result holds.

²⁹ However, I find evidence of cointegration using this same approach when examining Kelly and Enns’s other dependent variable, public mood liberalism, as detailed in the supporting information.

Application II: Volscho and Kelly (2012)

Volscho and Kelly (2012) use a GECM to probe the determinants of the rise in top income shares in the United States from 1949 to 2008. I examine their power resource model, which investigates whether the share of income of the top 1% is determined by political and institutional factors. Results from their original model are shown in Table 3, Model 1. As Volscho and Kelly find, increases in Democratic strength in Congress, union membership, and the presence of divided government tend to decrease the share of income held by the superrich, but only in the long run. In contrast, Democratic presidents have no effect.

To implement the ARDL-bounds procedure, I first ensured that the dependent variable, Top 1% Share, was I(1), as shown in Table 4 (step a in Figure 1). After confirming that the regressors are I(1) or less (step b), I used
SBIC to assist in lag selection for the ARDL model in error correction form, the result of which is shown in Model 2 (step e). Although the authors may have had theoretical reasons to use the “dead-start” GECM, I find instead that a model of contemporaneous short-run effects has a lower SBIC. While theory should always guide model specification, users must ensure that the residuals are white noise in order to run the bounds test; in this example, both the dead-start and standard GECM yielded white noise residuals.³⁰

³⁰ Therefore, one could use the bounds test on either model.

TABLE 3

                                        (1)                 (2)
                                        Original GECM       ARDL-Bounds
Top 1% Share_{t−1}                      −0.36** (0.09)      −0.30** (0.07)
ΔDemocratic President_t                                     1.47** (0.53)
ΔDemocratic President_{t−1}             0.14 (0.56)
Democratic President_{t−1}              −0.20 (0.36)        0.11 (0.34)
Δ% Congressional Democrat_t                                 −0.03 (0.04)
Δ% Congressional Democrat_{t−1}         0.05 (0.04)
% Congressional Democrat_{t−1}          −0.12** (0.04)      −0.12** (0.03)
ΔDivided Government_t                                       0.37 (0.46)
ΔDivided Government_{t−1}               −0.11 (0.50)
Divided Government_{t−1}                −0.93* (0.42)       −0.83* (0.37)
ΔUnion Membership_t                     0.29 (0.28)         0.04 (0.28)
Union Membership_{t−1}                  −0.11** (0.03)      −0.09** (0.02)
Constant                                15.05** (3.83)      13.30** (2.81)
Observations                            60                  61
Adjusted R²                             0.20                0.29
Breusch-Godfrey χ² of AR(1)             1.39                3.19
  AR(2)                                 1.39                3.21
  AR(3)                                 2.79                5.03
Durbin’s Alternative χ² of AR(1)        1.16                2.76
  AR(2)                                 1.14                2.72
  AR(3)                                 2.29                4.31
Cumby-Huizinga χ² of AR(1)–AR(3)        4.41                5.09
Shapiro-Wilk z                          0.17                0.99

Note: Dependent variable is the share of income of the top 1%. Model 1 shows results from Volscho and Kelly (2012), and Model 2 shows results using ARDL-bounds procedure, with lag structure determined by minimizing SBIC. Standard errors are in parentheses. * p < .05, ** p < .01.

TABLE 4 Top 1% Share Is I(1) (Volscho and Kelly 2012)

Unit Root Test                                            Top 1% Share
Augmented Dickey-Fuller (with drift)                      0.02
Phillips-Perron                                           −0.21
Dickey-Fuller GLS (with trend)                            −1.35
Elliott-Rothenberg-Stock                                  −1.35
Kwiatkowski-Phillips-Schmidt-Shin (H0 = stationary)       2.20*
Conclusion                                                I(1)

Note: T = 60 with 1-year lag included for all tests. H0 = series contains a unit root for all tests except KPSS. * p < .05.

Since Model 2 contains white noise residuals, we can move on to cointegration testing using the bounds test (step n in Figure 1). An F-test of the joint significance of the five lagged variables (the four regressors plus the dependent variable) yields an F-statistic of 5.02. Critical values for 61 observations and four regressors are 3.068 and 4.274 for the lower and upper bounds, respectively. Since the F-statistic is greater than the I(1) upper bound, we can conclude that there is a cointegrating relationship (step o). As further confirmation, we can use the bounds t-test; the t-statistic on the lagged dependent variable is −4.01, which is below the critical value of the I(1) lower
bound (−3.99). Thus, there is strong evidence that all four regressors are cointegrating with the dependent variable.

The largest difference between Volscho and Kelly’s (2012) original model and the ARDL-bounds model is the significance of the short-run effect of a Democratic president. To see whether this leads to different conclusions than the ones made by the authors, in the supporting information I use dynamic simulations to help interpret how changes in one regressor affect the dependent variable over time. Model-based dynamic simulations are growing in popularity in political science (King, Tomz, and Wittenberg 2000; Williams and Whitten 2012), and they are especially valuable for examining complex model specifications such as autoregressive relationships with interactions (Williams and Whitten 2011) or dynamic compositional dependent variables (Philips, Rutherford, and Whitten 2015, 2016). The ARDL-bounds procedure’s lag structure makes it a prime candidate for dynamic simulations. Using the program dynpss to create dynamic simulations of the ARDL-bounds model (Philips 2016a), I find that in the short run, moving from a Republican to a Democratic president increases the income concentration of the top 1%. However, this effect loses statistical significance after 4 years, it is not statistically significantly different from the predictions using Volscho and Kelly’s (2012) GECM, and the long-run effect is nearly zero.³¹ These results are available in the supporting information.

³¹ This is confirmed analytically by calculating the long-run multiplier, which is 0.36 and is not statistically significantly different from zero.

In summary, I find evidence for cointegration in the power resources model of Volscho and Kelly (2012). While the ARDL-bounds model had slight specification differences, the substantive findings do not change, as evidenced by dynamic simulations. Institutional and political factors may affect the income share of the top 1%, but only in the long run.

Discussion and Conclusion

The two examples above represent a variety of situations that the ARDL-bounds approach is designed to handle. For the Kelly and Enns (2010) replication, I find no evidence of cointegration. Using the steps outlined in Figure 1, I find no evidence that policy liberalism and income inequality affect welfare policy mood in the long or short run. For the Volscho and Kelly (2012) replication, I find evidence of cointegration; these findings support the authors’ conclusions about the long-run effect of institutions and politics on the concentration of income of the top 1%. In the supporting information, I also replicate Ura (2014) and find evidence of cointegration.

Although the examples above are representative of most situations practitioners are likely to encounter, I briefly review how users should proceed, given their own theoretically specified model:

1. Unit root testing of the dependent variable. If the dependent variable is I(1), proceed with the ARDL in error correction form.³²
2. Ensure that no independent variables are of an order of integration higher than I(1). The main advantage of the bounds approach is that users do not have to make difficult decisions between I(0) and I(1) regressors; the results of the bounds test inform us of these characteristics. However, users must ensure that no variables are integrated more than I(1), are explosive, or contain seasonal unit roots.³³
3. Estimate the ARDL in error correction form. Since the bounds testing procedure relies on white noise residuals, add lags of the first differences of the dependent variable and regressors as needed. Use theory and information criteria to aid in lag specification. Ensure that the residuals are white noise.
4. Test the joint significance of all lagged variables appearing in levels using a Wald/F-test. Use small-sample critical values of the bounds test in Narayan (2005). As an auxiliary test, use the one-sided t-test of the lagged dependent variable using asymptotic critical values in Pesaran, Shin, and Smith (2001).
5. If the results of the bounds test:
   (a) Suggest cointegration: All variables appearing in levels appear to be I(1) and have a cointegrating relationship with the dependent variable.
   (b) Suggest stationarity: All regressors appearing in levels are I(0) and cannot possibly be in a cointegrating relationship. A model of first differences must be estimated since the variables may still affect the dependent variable in the short run.
   (c) Are inconclusive: Each regressor should be tested for a unit root. Only I(1) variables can appear in levels in the error correction model. Stationary variables may still appear in first differences.³⁴ Repeat Steps 3 and 4. If the resulting statistic is still inconclusive, combinations of variables appearing in levels may need to be tested. Continue testing until (5a) or (5b) is reached.
6. Interpretation. Use dynamic simulations and analytical calculations for hypothesis testing.

³² If the dependent variable is I(0), it is not first differenced, leading to a lagged dependent variable model as shown in the Figure 1 schematic.

³³ While the test statistics can be adjusted to account for deterministic trends in the dependent variable, it is advisable to identify and detrend instead.

³⁴ I(0) variables could appear in levels in the final model without risking spurious regression.

While the ARDL-bounds procedure provides a comprehensive approach to modeling time-series and testing for cointegration, it is not a remedy for all problems. First, like all time-series models, it tends to perform poorly in small samples. As a precaution against overfitting, Keele, Linn, and Webb (2016, 40) suggest a minimum of between 10 and 20 observations per parameter.³⁵ However, as shown by Monte Carlo simulations, the bounds cointegration test tends to perform at least as well as other cointegration tests in small samples. Second, this single-equation model imposes a causal ordering and assumes weak exogeneity of the regressors (Pesaran, Shin, and Smith 2001, 293), a disadvantage shared with GECMs. Users unwilling to impose a causal ordering should consider alternative methods such as vector error correction models, which can account for multiple cointegrating relationships. Third, the cointegration test serves as a substitute for unit root testing to distinguish between I(0) and I(1) regressors only when the test results fall outside of the critical bounds. Given an inconclusive test result, users must use unit root tests on all regressors and identify the stationary, I(1), and I(1)-and-cointegrating variables through an iterative process, as shown in the Kelly and Enns (2010) replication. Last, this procedure still requires balanced equations (Grant and Lebo 2016; Keele, Linn, and Webb 2016); although stationary regressors can appear in levels in the final model, I(1) regressors that are not cointegrating cannot appear in levels in the final model without risk of spurious regression.

³⁵ I address concerns about overfitting in the supporting information.

To aid in the use of this approach, this article has provided a step-by-step guide for practitioners that can be used with any software package that contains unit root, autocorrelation, and the F- and t-tests necessary for the bounds test (e.g., R, Stata, or EViews). In addition, in the supporting information, I discuss software programs in Stata and R designed to help users test for cointegration and create dynamic simulations.³⁶

³⁶ In Stata, these are pssbounds for displaying critical values of the bounds test and dynpss for creating dynamic simulations of the ARDL-bounds model (Philips 2016a, 2016b). The pss package implements these commands in R (Jordan and Philips 2016).

This article was motivated by a series of recent articles in the time-series literature that stress the importance of careful unit root and cointegration testing. To achieve this, I have advocated for the autoregressive distributed lag bounds approach. I have shown that the ARDL-bounds procedure starts with a theoretically specified model and moves step-by-step to arrive at an informed conclusion. Through careful testing and model specification, the ARDL-bounds procedure is a powerful approach to a difficult problem in applied time-series analysis.

References

Balke, Nathan S., and Thomas B. Fomby. 1997. “Threshold Cointegration.” International Economic Review 38(3): 627–45.
Box-Steffensmeier, Janet M., and Renee M. Smith. 1998. “Investigating Political Dynamics Using Fractional Integration Methods.” American Journal of Political Science 42(2): 661–89.
Choi, In. 2015. Almost all about unit roots: Foundations, developments, and applications. Cambridge: Cambridge University Press.
De Boef, Suzanna, and Jim Granato. 1997. “Near-Integrated Data and the Analysis of Political Relationships.” American Journal of Political Science 41(2): 619–40.
De Boef, Suzanna, and Luke Keele. 2008. “Taking Time Seriously.” American Journal of Political Science 52(1): 184–200.
Dickinson, Matthew J., and Matthew J. Lebo. 2007. “Reexamining the Growth of the Institutional Presidency, 1940–2000.” Journal of Politics 69(1): 206–19.
Enders, Walter. 2010. Applied econometric time series. 3rd ed. New York: John Wiley and Sons.
Engle, Robert F., and Clive W. J. Granger. 1987. “Co-integration and Error Correction: Representation, Estimation, and Testing.” Econometrica 55(2): 251–76.
Esarey, Justin. 2016. “Fractionally Integrated Data and the Autodistributed Lag Model: Results from a Simulation Study.” Political Analysis 24(1): 42–49.
Grant, Taylor, and Matthew J. Lebo. 2016. “Error Correction Methods with Political Time Series.” Political Analysis 24(1): 3–30.
Helgason, Agnar Freyr. 2016. “Fractional Integration Methods and Short Time Series: Evidence from a Simulation Study.” Political Analysis 24(1): 59–68.
Johansen, Soren. 1995. Likelihood-based inference in cointegrated vector autoregressive models. Oxford: Oxford University Press.
Jordan, Soren, and Andrew Q. Philips. 2016. “pss: R Package to Perform the Bounds Test for Cointegration and Create Dynamic Simulations.” https://github.com/andyphilips/pss. R package version 1.3.9.
Keele, Luke, Suzanna Linn, and Clayton M. Webb. 2016. “Treating Time with All Due Seriousness.” Political Analysis 24(1): 31–41.
Kelly, Nathan J., and Peter K. Enns. 2010. “Inequality and the Dynamics of Public Opinion: The Self-Reinforcing Link between Economic Inequality and Mass Preferences.” American Journal of Political Science 54(4): 855–70.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44: 347–61.
Lebo, Matthew J., Robert W. Walker, and Harold D. Clarke. 2000. “You Must Remember This: Dealing with Long Memory in Political Analyses.” Electoral Studies 19(1): 31–48.
MacKinnon, James G. 1994. “Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests.” Journal of Business and Economic Statistics 12(2): 167–76.
Maddala, Gangadharrao S., and In-Moo Kim. 1998. Unit roots, cointegration, and structural change. Cambridge: Cambridge University Press.
Narayan, Paresh Kumar. 2005. “The Saving and Investment Nexus for China: Evidence from Cointegration Tests.” Applied Economics 37(17): 1979–90.
Pesaran, M. Hashem, Yongcheol Shin, and Richard J. Smith. 2001. “Bounds Testing Approaches to the Analysis of Level Relationships.” Journal of Applied Econometrics 16(3): 289–326.
Philips, Andrew Q. 2016a. “dynpss: Stata Module to Dynamically Simulate Autoregressive Distributed Lag (ARDL) Models.” https://andyphilips.github.io/dynpss/.
Philips, Andrew Q. 2016b. “pssbounds: Stata Module to Conduct the Pesaran, Shin, and Smith (2001) Bounds Test for Cointegration.” http://andyphilips.github.io/pssbounds/.
Philips, Andrew Q., Amanda Rutherford, and Guy D. Whitten. 2015. “The Dynamic Battle for Pieces Of Pie—Modeling Party Support in Multi-Party Nations.” Electoral Studies 39: 264–74.
Philips, Andrew Q., Amanda Rutherford, and Guy D. Whitten. 2016. “Dynamic Pie: A Strategy for Modeling Trade-Offs in Compositional Variables over Time.” American Journal of Political Science 60(1): 268–83.
Pickup, Mark. 2009. “Testing for Fractional Integration in Public Opinion in the Presence of Structural Breaks: A Comment on Lebo and Young.” Journal of Elections, Public Opinion and Parties 19(1): 105–16.
Ura, Joseph Daniel. 2014. “Backlash and Legitimation: Macro Political Responses to Supreme Court Decisions.” American Journal of Political Science 58(1): 110–26.
Volscho, Thomas W., and Nathan J. Kelly. 2012. “The Rise of the Super-Rich: Power Resources, Taxes, Financial Markets, and the Dynamics of the Top 1 Percent, 1949 to 2008.” American Sociological Review 77(5): 679–99.
Williams, Laron K., and Guy D. Whitten. 2011. “Dynamic Simulations of Autoregressive Relationships.” Stata Journal 11(4): 577–88.
Williams, Laron K., and Guy D. Whitten. 2012. “But Wait, There’s More! Maximizing Substantive Inferences from TSCS Models.” Journal of Politics 74(3): 685–93.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher’s website:
1. Programs to Assist in Implementing the Pesaran, Shin and Smith (2001) ARDL Procedure
2. Summary of Monte Carlo Results
3. Additional Monte Carlo Results
4. Proof of the Equivalence of the Triangular Error-Correction Representation to the Standard Representation
5. Three Replications
The relationship between trade, FDI and economic growth in Tunisia: An application of the autoregressive distributed lag model
Abstract:
This paper examines the dynamic causal relationships between foreign direct investment (FDI), trade and economic growth in Tunisia by applying the autoregressive distributed lag (ARDL) bounds testing approach to cointegration for the period from 1970 to 2008. The bounds tests suggest that the variables of interest are bound together in the long run when foreign direct investment is the dependent variable. The associated equilibrium correction term is also significant, confirming the existence of a long-run relationship. The results also indicate that, in the short run, there is no significant Granger causality from FDI to economic growth, from economic growth to FDI, from trade to economic growth or from economic growth to trade.
1. Introduction
Trade and FDI inflows are widely regarded as important drivers of economic growth. Trade upgrades skills through the importation and adoption of superior production technology and innovation. Exporters acquire innovative and advanced production technology either by acting as subcontractors to foreign enterprises or through competition in international markets. Producers of import substitutes face competition from foreign firms and are pushed to adopt more capital-intensive production facilities to withstand the strong competition in developing countries, where products are usually capital-intensive (Frankel and Romer, 1999). The impact of trade openness on economic growth can therefore be positive and significant, mainly because of the accumulation of physical capital and technology transfer.
Inward FDI can play an important role by augmenting the supply of funds for domestic investment in the host country. This can occur along the production chain when foreign investors buy locally made inputs and sell intermediate inputs to local enterprises. Furthermore, inward FDI can increase the host country’s export capacity, raising the developing country's foreign exchange earnings. FDI can also encourage the creation of new jobs, enhance technology transfer and boost overall economic growth in host countries.
Most past empirical studies have dealt with the joint effect of trade and FDI on economic growth (Balasubramanyam et al., 1996; Karbasi et al., 2005), the relationship between FDI and economic growth (Lipsey, 2000) or the relationship between trade and economic growth (Pahlavani et al., 2005). These studies concluded that both FDI inflows and trade promote economic growth. However, they failed to provide conclusive results on the relationship in general, and on the direction of causality in particular, for many developing countries. The growth-enhancing effects of FDI inflows and trade vary from country to country and over time. For some countries, FDI and trade can even affect economic growth negatively (Balasubramanyam et al., 1996; Borensztein et al., 1998; Lipsey, 2000; De Mello, 1999; Xu, 2000).
Some past studies on this subject suffer from two limitations. The first is that they used cointegration techniques based on either the Engle and Granger (1987) cointegration test or the maximum likelihood test of Johansen (1988) and Johansen and Juselius (1990). However, these cointegration techniques may not be appropriate when the sample size is too small (Odhiambo, 2009). Odhiambo (2009) uses the bounds testing cointegration approach developed by Pesaran et al. (2001), which is more robust in small samples. The second limitation is that, by using cross-sectional data, some studies do not address country-specific issues (Odhiambo, 2009; Ghirmay, 2004; Caselli et al., 1996).
The current study investigates the dynamic causal relationship between trade, FDI and economic growth in Tunisia by implementing the recently developed ARDL bounds testing approach to cointegration. Trade and FDI are expressed as ratios of GDP. Economic growth is proxied by real GDP per capita. Labour and capital investment are also included in the model. The Granger procedure is used to test the direction of causality within the vector error correction model (VECM). If a set of variables is cointegrated, they must have an error correction representation in which an error correction term (ECT) is incorporated in the model (Engle and Granger, 1987). The advantage of the VECM is that it reintroduces the information lost by differencing the time series. This step is fundamental to investigating both the short-run dynamics and the long-run equilibrium.
Despite the abundant literature on FDI, trade and economic growth in many emerging and developing countries, there is little empirical work on this subject for Tunisia. By assessing the role of FDI inflows, we can draw important lessons and guidelines for policy makers in their pursuit of a more effective scheme to promote economic growth in Tunisia, which is suffering from a high rate of unemployment. What role can FDI and trade play in the new Tunisia in meeting the challenges that the revolution spawned? This study adds to the existing literature on Tunisia. It is relevant because the twin policy targets of FDI attraction and trade liberalisation have been an integral preoccupation of Tunisia since the IMF Structural Adjustment Programme of 1986 and continue to be so after the revolution of 14 January 2011.
The rest of the paper is structured as follows. Section 2 presents a brief literature review. Section 3 gives an overview of Tunisia’s foreign direct investment and regional trade agreements. Section 4 describes the data, while Section 5 deals with the estimation technique and the empirical analysis of the results. Section 6 concludes the paper.
2. A brief literature review
The literature on the impacts of FDI and trade on economic growth is very large. The effect of each of these two variables on economic growth has been studied for many countries using various sample periods and econometric methods. The results of some papers studying the effects of trade (or exports) and FDI on economic growth in developing countries are promising (Balassa, 1985; Sengupta and Espana, 1996). There is evidence for both the export-led growth hypothesis (ELGH) and the FDI-led growth hypothesis (FLGH), which rest on the idea that exports and FDI are the main drivers of economic growth.
Ghirmay et al. (2001) studied the relationship between exports and economic growth in nineteen developing countries. Their results supported a long-run relationship between the two variables in only twelve of these countries, where the promotion of exports attracted investment and increased GDP. Using a bivariate technique, Mamun and Nath (2003) found long-run unidirectional causality from exports to economic growth in Bangladesh. Narayan et al. (2007) examined the export-led growth hypothesis for Fiji and Papua New Guinea. Their results support the ELGH in the long run for Fiji, while for Papua New Guinea there is evidence of the ELGH in the short run.
Empirical studies of the FLGH have found that FDI promotion can greatly benefit host countries through the introduction of new technologies and skills, the creation of new jobs, spurring domestic competition and expanding access to international marketing networks. According to Blomstrom et al. (1992), FDI promotes economic growth when the host economy is a developed one. Boyd and Smith (1992) find that FDI may affect growth negatively owing to the misallocation of resources in the presence of pre-existing trade, price and other distortions. Borensztein et al. (1998) studied the effect of FDI on economic growth in a cross-country regression framework. According to their findings, FDI can be an important channel for the transfer of modern technology, but its effectiveness depends on the stock of human capital in the host country. According to Nair-Reichert and Weinhold (2001), the causal relationship between foreign and domestic investment and economic growth in developing countries is heterogeneous. The authors relate these results to the homogeneity assumptions imposed across countries. Using new statistical techniques and two new databases to reassess the relationship between economic growth and FDI, Carkovic and Levine (2002) found no evidence for the FLGH.
According to Anthukorala (2003), FDI had a positive effect on GDP, and there was unidirectional causality running from GDP to FDI in Sri Lanka. Baliamoune-Lutz (2004) finds that the impact of FDI on economic growth is positive and that there is a bidirectional relationship between exports and FDI in Morocco, which implies that FDI can promote exports and vice versa. Some authors have also studied the relationship between regional integration and FDI. Darrat et al. (2005) investigated the impact of FDI on economic growth in the Central and Eastern Europe (CEE) and Middle East and North Africa (MENA) regions. They found that FDI inflows stimulate economic growth in EU accession countries, while the impact of FDI on economic growth in MENA and in non-EU accession countries is either non-existent or negative. Similarly to Darrat et al. (2005), Hisarciklilar et al. (2006) do not find causality between FDI and GDP for most of the Mediterranean countries they study (Algeria, Cyprus, Egypt, Israel, Jordan, Morocco, Syria, Tunisia and Turkey) over the period 1979-2000. These countries could create an environment that attracts FDI and leads to the transfer of technology and skills, higher production, and the creation of new jobs and exports.
Research examining the impacts of exports and FDI on GDP within the same model has also produced ambiguous results. For example, according to Alici and Ucal (2003), there is evidence of the ELGH for Turkey but not of the FLGH, because spillover effects from FDI to GDP are not present. For the Latin American countries (Argentina, Brazil and Mexico), Alguacil et al. (2000) found that the FLGH is confirmed but not the ELGH: FDI promotes economic growth and trade. Dritsaki and Adamopoulos (2004) found a unidirectional causal relationship from FDI to economic growth and a bidirectional causal relationship between exports and economic growth for Greece. According to Yao (2006), there is a strong relationship between exports, FDI and economic growth for China. Rahman (2007) re-examined the effects of exports, FDI and expatriates’ remittances on the real GDP of some Asian countries (Bangladesh, India, Pakistan and Sri Lanka) using the ARDL technique for cointegration for the period 1976-2006. The ARDL technique confirmed a cointegrating relationship among the variables in these three countries. The short-run net effects of exports on the real GDP of Bangladesh are more visible than those of FDI. The same applies to India, with some minor exceptions for relatively stronger short-run effects. In the case of Pakistan, FDI was found to exert net restrictive effects on real GDP, though not highly significant ones. For Sri Lanka, FDI was found to have consistently restrictive effects on real GDP.
Alalaya (2008) investigated the relationship between economic growth, trade and FDI for Jordan over the period 1990-2008 by applying the ARDL model for cointegration. He found a unidirectional causal effect from trade and FDI to economic growth. The estimated speed of adjustment in the model, 0.587, is relatively high and significant.
3. Overview of Tunisia’s foreign direct investment and regional trade agreements
Over recent decades, the Tunisian government has adopted many measures to attract FDI inflows, in the belief that these inflows will introduce modern technology, enhance productivity and stimulate export-led growth. Tunisia’s structural adjustment plan was launched in 1986. It encouraged standard fiscal and monetary policy reforms and the liberalization of the financial sector, and it has shaped the course of Tunisia’s economic development. A policy of gradual trade liberalization was pursued, first by implementing current account convertibility, followed by accession to the GATT agreements and by a free trade association with the European Union in 1995, which went into effect on January 1, 2008. The objective of the agreement is to eliminate customs tariffs and other trade barriers on a wide range of goods and services. However, the most important aspect of the association agreement may well be that it has served to anchor Tunisia’s commitment to reforms.
“Tunisia provided a wide range of incentives such as a tax relief up to 35 percent on
reinvested revenues and profits (30 percent starting from 2007), exemptions from customs
duties and a 10 percent reduction of VAT for imported capital goods having no Tunisian
manufacturing equivalent, a suspension of VAT and sales tax on locally produced equipment
at company start-up and an optional depreciation scheduling for capital equipment older than
seven years. Additional incentives are provided to off-shore industries or totally exporting
industries such as full exemption on corporate profits earned on export for the first ten years
and 50 percent reduction thereafter (granted also to partially exporting firms), full tax
exemption on reinvested profits and income, total exemption from customs duties on imported
capital goods, raw materials, semi finished goods and services necessary for business” (Ghali
and Rezgui, 2007).
According to Ghali and Rezgui (2007), the ratio of net FDI flows to GDP reached 2.2% in 1990. Until the first half of the 1990s, about 80 percent of FDI went to the petroleum and gas sector. Owing to the privatization program, the share of total FDI going to the petroleum and gas sector decreased to 58 percent in 1998, as FDI shifted towards the manufacturing sector.
The largest foreign investor in Tunisia is the European Union (EU). Its FDI is mainly oriented
to the development of the infrastructure network and the textiles and clothing sectors.
Trade openness is important as a vehicle for technological spillovers. To benefit from trade openness, Tunisia needs trade partners that are capable of providing it with the technology, embodied in products, machines and equipment, that the country lacks. By importing capital equipment and intermediate products from developed countries that have a larger stock of knowledge, Tunisia can improve its own stock of knowledge.
Tunisia has been a member of the WTO since March 1995. In order to benefit from trade openness, Tunisia signed a Euro-Mediterranean Association Agreement (AA) with the European Union in July 1995. It was the first of the South Mediterranean countries engaged in the Barcelona Process to sign an AA with the EU. However, the agreement was ratified and entered into force only in March 1998. The main objective of the AA is the liberalisation and facilitation of the exchange of goods, services and capital. Tunisia completed the dismantling of tariffs on industrial products in 2008.
The EU is Tunisia’s main trading partner. Tunisia’s main exports to the EU are manufactured products, raw energy and phosphate, and agricultural products. The EU accounted for about 80% of Tunisia’s exports in 2008, and these exports experienced a growth rate of more than 9% from 2003 to 2008. Tunisia’s main imports from the EU are machinery and transport equipment, textiles, chemicals and refined energy. Imports from EU countries covered nearly 65% of Tunisia’s needs in goods and grew at an estimated average annual rate of 7.2% (Boughzala, 2010).
Tunisia also has international trade relations with some Arab countries. It signed a bilateral agreement with Libya, which entered into force in 2002. It signed the Agadir agreement with Morocco, Egypt and Jordan on 25 February 2004. This committed all partners to removing substantially all tariffs on trade between them and to harmonizing their legislation with regard to standards and customs procedures. Although this agreement entered into force in July 2006, its effective implementation started only in April 2007. Tunisia also signed a free trade agreement with Turkey in November 2004. This agreement replaced the old one, which was signed in 1992, and entered into force in July 2005.
Tunisia’s Euro-Med agreement with the EU can increase the openness of the Tunisian economy and hence increase FDI inflows to Tunisia. The aim of the Mediterranean countries was to create an environment that attracts FDI, which could lead to the transfer of technology, higher production, and the creation of new jobs and exports. This objective is our main motivation for investigating the relationship between FDI and economic growth in Tunisia. In this study we examine whether the shift in FDI has had beneficial effects on employment, trade and economic growth in Tunisia.
4. Data sources and description of variables
Annual time series data on economic growth, FDI, trade, labour and capital stock covering the period 1970-2008 are used in this study. The data were obtained from different sources, including Tunisia Central Bank annual reports and quarterly bulletins, among others. In addition, different volumes of the International Financial Statistics (IFS) Yearbook, published by the International Monetary Fund, and the 2009 edition of the World Development Indicators, published online by the World Bank, were used to supplement the local data.
Economic growth, measured by real GDP per capita, is denoted by Y. FDI (F) is the ratio of real gross foreign direct investment inflows to GDP; trade openness (T) is the sum of exports and imports divided by GDP; labour (L) is measured as the total labour force; and capital stock (K) is measured by the real value of gross fixed capital formation (GFCF).
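The variable definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers are hypothetical and are not Tunisian data; the only things taken from the text are the definitions of the ratios and the later use of logarithms.

```python
import math


def openness_ratio(exports: float, imports: float, gdp: float) -> float:
    """Trade openness T: (exports + imports) / GDP."""
    if gdp <= 0:
        raise ValueError("GDP must be positive")
    return (exports + imports) / gdp


def fdi_ratio(fdi_inflows: float, gdp: float) -> float:
    """FDI variable F: FDI inflows / GDP."""
    if gdp <= 0:
        raise ValueError("GDP must be positive")
    return fdi_inflows / gdp


def log_series(values):
    """Natural logarithm of each (strictly positive) observation."""
    return [math.log(v) for v in values]


# Hypothetical figures in the same currency units (illustration only).
exports, imports, fdi, gdp = 30.0, 50.0, 2.0, 100.0
T = openness_ratio(exports, imports, gdp)
F = fdi_ratio(fdi, gdp)
print("T =", T, "F =", F, "logs =", log_series([T, F]))
```

Both ratios are unit-free, so the series can be logged directly as long as every observation is strictly positive.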
5. Econometric methodology and empirical results
In time series analysis, the variables must be tested for stationarity before running the causality test. For this purpose, we use the conventional augmented Dickey-Fuller (ADF) test, the Phillips-Perron test of Phillips and Perron (1988) and the Dickey-Fuller generalised least squares (DF-GLS) de-trending test proposed by Elliott et al. (1996).
The ARDL bounds test is based on the assumption that the variables are I(0) or I(1). Before applying this test, we therefore determine the order of integration of all variables using the unit root tests. The objective is to ensure that no variable is I(2), so as to avoid spurious results. In the presence of variables integrated of order two, the F-statistics cannot be interpreted using the critical values provided by Pesaran et al. (2001).
The results of the stationarity tests, given in Table 1, show that all variables are non-stationary in levels. The ADF, Phillips-Perron and DF-GLS tests applied to the first differences of the data series reject the null hypothesis of non-stationarity for all the variables used in this study (Table 2). We therefore conclude that all the variables are integrated of order one.
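The logic of testing in levels and then in first differences can be illustrated with a deliberately simplified sketch. It is not the ADF, Phillips-Perron or DF-GLS procedure used in the paper: it runs a plain Dickey-Fuller regression with a constant and no lagged differences, on a simulated random walk rather than on the Tunisian series, and it uses the asymptotic 5% critical value of -2.86 (small-sample values are somewhat more negative). All names are hypothetical.

```python
import math
import random


def difference(y):
    """First difference D(y_t) = y_t - y_{t-1}."""
    return [y[i] - y[i - 1] for i in range(1, len(y))]


def dickey_fuller_t(y):
    """t-statistic on rho in D(y_t) = alpha + rho * y_{t-1} + e_t."""
    x = y[:-1]                 # lagged levels
    d = difference(y)          # first differences
    n = len(d)
    if n < 3:
        raise ValueError("series too short")
    x_bar = sum(x) / n
    d_bar = sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    rho = sxd / sxx
    alpha = d_bar - rho * x_bar
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, d))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho


CRIT_5PCT = -2.86  # asymptotic 5% Dickey-Fuller value, constant, no trend

# Simulated random walk (an I(1) series by construction), fixed seed.
rng = random.Random(0)
walk, level = [], 0.0
for _ in range(200):
    level += rng.gauss(0.0, 1.0)
    walk.append(level)

t_level = dickey_fuller_t(walk)
t_diff = dickey_fuller_t(difference(walk))
print("levels:", round(t_level, 2), "reject unit root:", t_level < CRIT_5PCT)
print("first differences:", round(t_diff, 2), "reject unit root:", t_diff < CRIT_5PCT)
```

A series is treated as I(1) when the unit root null is not rejected in levels but is rejected in first differences; a series for which the null is still not rejected after differencing would be a candidate for I(2) and would rule out the bounds test.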
Table 1. ADF and DF-GLS unit root tests on log levels of variables
Variables SIC t-Stat Critical value SIC t-Stat Critical t-Stat Critical
*model without constant and trend, **model without trend, ***model with constant and trend
Table 2. ADF and DF-GLS unit root tests on first differences of log levels of variables
Variables SIC t-Stat Critical SIC t-Stat Critical value t-Stat Critical
5%
*model without constant and trend, **model without trend, ***model with constant and trend
In order to empirically analyse the long-run relationships and short-run dynamic interactions among the variables of interest (trade, FDI, labour, capital investment and economic growth), we consider a vector autoregressive (VAR) model of order p in Zt, where Zt is a column vector composed of the five variables: Zt = (Yt Kt Lt Ft Tt)’. The ARDL cointegration approach was developed by Pesaran and Shin (1999) and Pesaran et al. (2001). It has three advantages over traditional cointegration methods. First, the ARDL approach does not require all the variables under study to be integrated of the same order: it can be applied when the underlying variables are integrated of order one, order zero or fractionally integrated. Second, the ARDL test is relatively more efficient in small and finite samples. Third, the ARDL technique yields unbiased estimates of the long-run model (Harris and Sollis,
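$$
\begin{aligned}
D(\ln Y_t) ={}& a_{01} + b_{11}\ln Y_{t-1} + b_{21}\ln K_{t-1} + b_{31}\ln L_{t-1} + b_{41}\ln F_{t-1} + b_{51}\ln T_{t-1} + \sum_{i=1}^{p} a_{1i} D(\ln Y_{t-i}) \\
&+ \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_{1t}
\end{aligned}
\tag{1}
$$

$$
\begin{aligned}
D(\ln K_t) ={}& a_{02} + b_{12}\ln Y_{t-1} + b_{22}\ln K_{t-1} + b_{32}\ln L_{t-1} + b_{42}\ln F_{t-1} + b_{52}\ln T_{t-1} + \sum_{i=1}^{p} a_{1i} D(\ln K_{t-i}) \\
&+ \sum_{i=1}^{q} a_{2i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_{2t}
\end{aligned}
\tag{2}
$$

$$
\begin{aligned}
D(\ln L_t) ={}& a_{03} + b_{13}\ln Y_{t-1} + b_{23}\ln K_{t-1} + b_{33}\ln L_{t-1} + b_{43}\ln F_{t-1} + b_{53}\ln T_{t-1} + \sum_{i=1}^{p} a_{1i} D(\ln L_{t-i}) \\
&+ \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_{3t}
\end{aligned}
\tag{3}
$$

$$
\begin{aligned}
D(\ln F_t) ={}& a_{04} + b_{14}\ln Y_{t-1} + b_{24}\ln K_{t-1} + b_{34}\ln L_{t-1} + b_{44}\ln F_{t-1} + b_{54}\ln T_{t-1} + \sum_{i=1}^{p} a_{1i} D(\ln F_{t-i}) \\
&+ \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_{4t}
\end{aligned}
\tag{4}
$$

$$
\begin{aligned}
D(\ln T_t) ={}& a_{05} + b_{15}\ln Y_{t-1} + b_{25}\ln K_{t-1} + b_{35}\ln L_{t-1} + b_{45}\ln F_{t-1} + b_{55}\ln T_{t-1} + \sum_{i=1}^{p} a_{1i} D(\ln T_{t-i}) \\
&+ \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln Y_{t-i}) + \varepsilon_{5t}
\end{aligned}
\tag{5}
$$

where all variables are as previously defined in Section 4, ln(.) is the logarithm operator, D is the first difference operator, and ε1t, ..., ε5t are the error terms.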
The bounds test is based on the joint F-statistic, whose asymptotic distribution is non-standard under the null hypothesis of no cointegration. The first step in the ARDL bounds approach is to estimate the five equations (1)-(5) by ordinary least squares (OLS). For each equation, the existence of a long-run relationship among the variables is tested with an F-test for the joint significance of the coefficients of the lagged levels of the variables, i.e., H0: b1i = b2i = b3i = b4i = b5i = 0 against the alternative H1: at least one of b1i, b2i, b3i, b4i, b5i is different from zero, for i = 1, 2, 3, 4, 5. We denote the F-statistic of the test that normalizes on Y by F_Y(Y | K, L, F, T). Two sets of critical values for a given significance level can be determined (Pesaran et al., 2001). The first set is calculated on the assumption that all variables included in the ARDL model are integrated of order zero, while the second is calculated on the assumption that the variables are integrated of order one. The null hypothesis of no cointegration is rejected when the test statistic exceeds the upper critical bound, and it is not rejected when the F-statistic is below the lower critical bound. Otherwise, the cointegration test is inconclusive.
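As a rough sketch of how the joint F-statistic is formed, the function below compares the residual sum of squares of the unrestricted regression (with the five lagged levels) to that of the restricted regression (lagged levels dropped). The function name, the residual sums of squares and the parameter count are hypothetical illustrations; only the structure of the test (five restrictions on the lagged-level coefficients) comes from the text.

```python
def joint_f_statistic(rss_restricted: float, rss_unrestricted: float,
                      n_restrictions: int, n_obs: int, n_params: int) -> float:
    """F = ((RSS_r - RSS_u) / m) / (RSS_u / (n - k)).

    m is the number of restrictions (here the lagged-level coefficients
    set to zero), n the number of observations and k the number of
    parameters in the unrestricted regression.
    """
    df_resid = n_obs - n_params
    if n_restrictions <= 0 or df_resid <= 0:
        raise ValueError("need positive restrictions and residual df")
    if rss_unrestricted <= 0:
        raise ValueError("unrestricted RSS must be positive")
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_resid
    return numerator / denominator


# Hypothetical values: 5 lagged levels tested, 36 usable observations,
# 11 estimated parameters (constant, 5 levels, 5 lagged differences).
f_stat = joint_f_statistic(rss_restricted=10.0, rss_unrestricted=5.0,
                           n_restrictions=5, n_obs=36, n_params=11)
print("F =", f_stat)
```

The arithmetic is the usual one for nested OLS models, but the resulting statistic is compared with the two critical bounds of Pesaran et al. (2001) rather than with a standard F table.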
The use of this approach is guided by the short data span. We choose a maximum lag order of 2 for the conditional ARDL vector error correction model by using the Akaike information criterion (AIC). The calculated F-statistics, obtained when each variable in turn is taken as the dependent (normalized) variable in the ARDL-OLS regressions, are reported in Table 3. Their values are: for equation (1), F_Y(Y | K, L, F, T) = 1.992; for equation (2), F_K(K | Y, L, F, T) = 2.552; for equation (3), F_L(L | Y, K, F, T) = 0.762; for equation (4), F_F(F | Y, K, L, T) = 6.701; and for equation (5), F_T(T | Y, K, L, F) = 2.736. These results indicate a long-run relationship among the variables when FDI is the dependent variable, because its F-statistic (6.701) is higher than the upper-bound critical value (4.15) at the 5% level. The null hypothesis of no cointegration among the variables in equation (4) is therefore rejected. For the other equations, (1)-(3) and (5), the null hypothesis of no cointegration cannot be rejected.
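The decision rule of the bounds test can be written as a small function. The F-statistics and the 5% upper bound of 4.15 are those quoted in the text; the lower bound is not stated in this section, so the value used below is a hypothetical placeholder chosen only for illustration, and the function name is likewise an assumption.

```python
def bounds_decision(f_stat: float, lower: float, upper: float) -> str:
    """Classify a bounds-test F-statistic against the I(0) and I(1) bounds."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper:
        return "cointegration"      # null of no cointegration rejected
    if f_stat < lower:
        return "no cointegration"   # null not rejected
    return "inconclusive"           # statistic falls between the bounds


UPPER_5PCT = 4.15  # upper-bound critical value quoted in the text
LOWER_5PCT = 3.0   # hypothetical placeholder, NOT taken from the paper

# F-statistics reported in the text, keyed by the dependent variable.
f_stats = {"Y": 1.992, "K": 2.552, "L": 0.762, "F": 6.701, "T": 2.736}

for name, value in f_stats.items():
    print(name, value, bounds_decision(value, LOWER_5PCT, UPPER_5PCT))
```

With these inputs only the FDI equation is classified as cointegrated, which matches the conclusion in the text; the classification of the other equations depends on the actual lower bound reported in Table 3.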
Table 3: Results from bound tests
Lower and Upper-bound critical values are taken from Pesaran et al. (2001), Table CI(ii) Case II.
Once cointegration is established, the conditional ARDL (p, q1, q2, q3, q4) long-run model for ln(Ft) can be estimated as:

$$
\ln F_t = a_0 + \sum_{i=1}^{p} a_{1i} \ln F_{t-i} + \sum_{i=0}^{q_1} a_{2i} \ln K_{t-i} + \sum_{i=0}^{q_2} a_{3i} \ln L_{t-i} + \sum_{i=0}^{q_3} a_{4i} \ln Y_{t-i} + \sum_{i=0}^{q_4} a_{5i} \ln T_{t-i} + \varepsilon_t \tag{6}
$$
where all variables are as previously defined. The orders of the ARDL (p, q1, q2, q3, q4) model in the five variables are selected using the AIC. Equation (6) is estimated using the ARDL (1, 0, 0, 0, 0) specification. The long-run results obtained by normalizing on FDI are reported in Table 4.
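For an ARDL (1, 0, 0, 0, 0) specification, a standard textbook relation gives each long-run coefficient as the contemporaneous coefficient divided by one minus the coefficient on the lagged dependent variable. The sketch below applies that relation; the paper does not state how the entries of Table 4 were computed, so this is an assumption, and the function name and all numbers are hypothetical.

```python
def long_run_coefficients(phi: float, betas: dict) -> dict:
    """Long-run coefficients theta_j = beta_j / (1 - phi).

    phi is the coefficient on the lagged dependent variable ln(F_{t-1});
    betas maps each regressor name to its contemporaneous coefficient.
    """
    if abs(phi) >= 1.0:
        raise ValueError("a stable long-run solution requires |phi| < 1")
    return {name: beta / (1.0 - phi) for name, beta in betas.items()}


# Hypothetical short-run estimates (illustration only, not Table 4).
phi_hat = 0.5
beta_hat = {"lnK": 0.4, "lnL": -0.2, "lnY": 0.1, "lnT": -0.05}

theta = long_run_coefficients(phi_hat, beta_hat)
for name, value in theta.items():
    print(name, round(value, 3))
```

The division by (1 - phi) scales up short-run effects: the closer phi is to one, the slower the adjustment and the larger the implied long-run response.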
Table 4. Estimated long run coefficients using the ARDL approach
Variable Coefficient t-statistic Probability
The estimated coefficients of the long-run relationship are significant for capital and labour but not for trade and economic growth. Capital investment has a positive and significant impact on FDI at the 5% level. The labour force variable is negatively signed and significant at the 5% level, which is indicative of the growing unemployment problem and the low productivity of labour in Tunisia. Trade openness has a negative impact on FDI that is not significant at the 5% level. Economic growth has a positive impact on FDI that is likewise not significant at the 5% level.
Following Odhiambo (2009) and Narayan and Smyth (2008), we obtain the short-run dynamic parameters by estimating an error correction model associated with the long-run estimates. The long-run relationship between the variables indicates that there is Granger causality in at least one direction, which is determined by the F-statistic and the lagged error-correction term. The short-run causal effect is represented by the F-statistic on the explanatory variables, while the t-statistic on the coefficient of the lagged error-correction term represents the long-run causal relationship (Odhiambo, 2009; Narayan and Smyth, 2006). Only the equation for which the null hypothesis of no cointegration is rejected is estimated with an error-correction term (Narayan and Smyth, 2006; Morley, 2006).
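The vector error correction model is specified as follows:

$$
\begin{aligned}
D(\ln F_t) ={}& a_0 + \sum_{i=1}^{p} a_{1i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) \\
&+ \sum_{i=1}^{q} a_{4i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \lambda\, ECT_{t-1} + \varepsilon_t
\end{aligned}
\tag{7}
$$

$$
\begin{aligned}
D(\ln Y_t) ={}& a_0 + \sum_{i=1}^{p} a_{1i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) \\
&+ \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_t
\end{aligned}
\tag{8}
$$

$$
\begin{aligned}
D(\ln K_t) ={}& a_0 + \sum_{i=1}^{p} a_{1i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) \\
&+ \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_t
\end{aligned}
\tag{9}
$$

$$
\begin{aligned}
D(\ln L_t) ={}& a_0 + \sum_{i=1}^{p} a_{1i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln K_{t-i}) \\
&+ \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_t
\end{aligned}
\tag{10}
$$

$$
\begin{aligned}
D(\ln T_t) ={}& a_0 + \sum_{i=1}^{p} a_{1i} D(\ln T_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) \\
&+ \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln Y_{t-i}) + \varepsilon_t
\end{aligned}
\tag{11}
$$

where a1i, a2i, a3i, a4i and a5i are the short-run dynamic coefficients, and λ, the coefficient on the lagged error-correction term ECTt-1 in equation (7), measures the speed of the model’s convergence to the long-run equilibrium.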
Equations (7) to (11) are each estimated separately by OLS. The short-run dynamic
coefficients associated with the long-run relationship obtained from equation (7) are given
in Table 5. Beginning with the long-run results, the coefficient on the lagged
error-correction term is significant at the 1% level with the expected negative sign, which
confirms the result of the bounds test for cointegration. Its value is estimated at -0.69,
which implies that the speed of adjustment to equilibrium after a shock is high:
approximately 69% of the previous year's disequilibrium is corrected in the current year.
In the long run, real GDP per capita, labour, capital and trade Granger-cause FDI. This
result implies that causality runs interactively through the error-correction term from
real GDP per capita, labour, capital and trade to FDI. In the short run, only capital
investment is significant at the 5% level and has an important impact on FDI. Economic
growth and trade have negative but insignificant impacts, while the impact of labour is
positive but insignificant.
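The adjustment arithmetic behind the -0.69 estimate can be checked with a short sketch. It assumes that a disequilibrium decays geometrically, so that a share (1 + coefficient) of the gap survives each year; the function names are illustrative and are not part of the original analysis.

```python
import math


def remaining_gap(ect_coef: float, years: int) -> float:
    """Share of an initial disequilibrium still present after `years`,
    assuming geometric decay at rate (1 + ect_coef) per year."""
    return (1.0 + ect_coef) ** years


def half_life(ect_coef: float) -> float:
    """Years needed to close half of the gap under the same assumption."""
    return math.log(0.5) / math.log(1.0 + ect_coef)


ect = -0.69  # coefficient on the lagged error-correction term reported in the text
print(round(1.0 - remaining_gap(ect, 1), 2))  # share corrected within one year
print(round(half_life(ect), 2))               # half-life of a shock, in years
```

Under this assumption, roughly 90% of a shock would be absorbed within two years, which is what a "high" speed of adjustment amounts to here.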
The regression for the underlying ARDL equation (7) fits well, and the model is globally
significant at the 1% level. It also passes the diagnostic tests for serial correlation
(Durbin-Watson and Breusch-Godfrey tests), heteroskedasticity (White test) and normality of
the errors (Jarque-Bera test). The Ramsey RESET test also suggests that the model is well
specified. The results of these tests are shown in Table 6.
The stability of the long-run coefficients is tested together with the short-run dynamics.
Once the ECM given by equation (7) has been estimated, the cumulative sum of recursive
residuals (CUSUM) and the CUSUM of squares (CUSUMSQ) tests are applied to assess parameter
stability (Pesaran and Pesaran, 1997). Graphs 1 and 2 plot the results of the CUSUM and
CUSUMSQ tests. The results indicate no instability of the coefficients, because the plots
of the CUSUM and CUSUMSQ statistics stay inside the critical bands at the 5% significance
level.
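The logic of the CUSUM test can be illustrated with a deliberately simplified sketch. It is not the ECM of equation (7): it assumes a regression on a constant only (one parameter, k = 1), for which the recursive residuals have a closed form, and it uses the Brown-Durbin-Evans 5% band constant a = 0.948. The series in the usage lines are made up.

```python
import math


def recursive_residuals_mean_model(y):
    """Standardised one-step-ahead forecast errors for y_t = mu + e_t."""
    residuals = []
    running_sum = y[0]
    for t in range(1, len(y)):          # t = number of observations used so far
        mean_so_far = running_sum / t
        residuals.append((y[t] - mean_so_far) / math.sqrt(1.0 + 1.0 / t))
        running_sum += y[t]
    return residuals


def cusum_path(y, a=0.948):
    """Return (CUSUM statistic, 5% band half-width) for each recursion step."""
    n_obs, k = len(y), 1
    w = recursive_residuals_mean_model(y)
    # Full-sample residual variance equals the sum of squared recursive residuals / (T - k).
    sigma = math.sqrt(sum(x * x for x in w) / (n_obs - k))
    path, total = [], 0.0
    for j, x in enumerate(w, start=1):  # j = t - k
        total += x
        band = a * math.sqrt(n_obs - k) + 2.0 * a * j / math.sqrt(n_obs - k)
        path.append((total / sigma, band))
    return path


stable = [1.0, -1.0] * 10                      # hypothetical series, no break
shifted = [0.1, -0.1] * 10 + [5.1, 4.9] * 10   # hypothetical series with a level shift
print(all(abs(s) <= b for s, b in cusum_path(stable)))
print(any(abs(s) > b for s, b in cusum_path(shifted)))
```

A path that stays inside the band at every step is read as no evidence of parameter instability, which is the pattern described for Graph 1.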
The Chow breakpoint and Chow forecast tests are used to check for a significant structural
break in the data in 1995 and over the post-Barcelona period of 1995-2008. The
pre-Barcelona period is 1970-1995. We choose 1995 as the breakpoint because in July 1995
Tunisia, one of the South Mediterranean countries engaged in the Barcelona Process, signed
an association agreement with the EU. The F-statistics and the log-likelihood ratios do
not indicate any structural break (Table 7).
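For reference, the F-statistic of a Chow breakpoint test compares the residual sum of squares of the pooled regression with the sum of those from the two sub-period regressions. The sketch below uses the textbook formula; the input numbers are hypothetical and are not taken from Table 7.

```python
def chow_f_statistic(rss_pooled, rss_before, rss_after, n_obs, n_params):
    """Chow breakpoint F-statistic with n_params coefficients per regression.

    Under the null of no break it follows F(n_params, n_obs - 2 * n_params).
    """
    rss_split = rss_before + rss_after
    numerator = (rss_pooled - rss_split) / n_params
    denominator = rss_split / (n_obs - 2 * n_params)
    return numerator / denominator


# Hypothetical inputs: 39 annual observations (1970-2008), 7 coefficients.
f_stat = chow_f_statistic(rss_pooled=10.0, rss_before=3.0, rss_after=4.0,
                          n_obs=39, n_params=7)
print(round(f_stat, 4))
```

A small F-statistic relative to the critical value means that splitting the sample at the candidate date does not improve the fit enough to indicate a break.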
R-squared 0.43
DW-statistic 1.98
χ² statistic    Probability
Table 7. Statistical output for stability tests
[Graph 1: CUSUM statistic with 5% significance bands, 1980-2005]
Graph 2. Plot of CUSUMSQ Test for equation (7)
The results of the short-run Granger causality tests are shown in Table 8. In the short run,
the F-statistics on the explanatory variables suggest that, at the 10% level or better,
there is bidirectional Granger causality between capital investment and economic growth and
between capital investment and trade, and unidirectional Granger causality running from
capital investment to FDI and from FDI to trade. There is no Granger causality from trade to
FDI. Hisarciklilar et al. (2006) found no Granger causality in either direction between FDI
and trade for Tunisia. The results for FDI and real GDP per capita indicate no significant
Granger causality from FDI to GDP or from GDP to FDI, which is consistent with Hisarciklilar
et al. (2006). Turning to real GDP per capita and trade openness, there is also no
significant Granger causality in either direction, whereas Hisarciklilar et al. (2006) found
that causality runs from economic growth to trade. Our results support the idea that FDI
will only be growth enhancing if it affects technology permanently and positively.
We can conclude that domestic investment promotes trade, FDI and economic growth in the
short run in Tunisia. Domestic investment is the main catalyst of economic growth in
Tunisia.
Table 8. Short-run Granger causality tests (F-statistics, by variable)
(*) and (**) denote statistical significance at the 5% and 10% levels respectively.
6. Conclusion
The paper examines the dynamic causal relationships among economic growth, foreign direct
investment, trade, labour and capital investment in Tunisia over the period 1970-2008. It
applies the ARDL approach to cointegration to investigate the existence of a long-run
relation among these series, and Granger causality tests within a VECM to determine the
direction of causality between the variables. The topic merits special attention because of
the possible interrelations among the series and their implications for economic growth.
The results show that there is cointegration among the variables specified in the model
when FDI is the dependent variable. Trade openness and economic growth promote foreign
direct investment in Tunisia in the long run. In the short run, there is no significant
Granger causality from FDI to economic growth or from economic growth to FDI, and likewise
none in either direction between economic growth and trade openness.
Domestic capital investment is the catalyst of economic growth in Tunisia. This finding
has important implications for policy makers in Tunisia. The results suggest that, for FDI
to bring the anticipated positive impacts on economic growth, the Tunisian government will
need to undertake serious reforms with clear objectives and strong commitments.
References
Adamopoulos, A., Dritsaki, Ch., and Dritsaki, M. 2005. “A Causal Relationship between
Trade, Foreign Direct Investment, and Economic Growth for Greece.” American Journal of
Applied Sciences 1: 230-235.
Alalaya M.M. 2008. “ARDL Models Applied for Jordan Trade, FDI and GDP Series (1990-
2008)”. European Journal of Social Sciences – Volume 13, Number 4, 605-616.
Alia, A.A. and Ucal, M.S. 2003. “Foreign direct investment, exports and output growth of
Turkey: Causality Analysis”, Paper presented at the European Trade Study Group (ETSG)
fifth annual conference, Madrid, 11-13, Sept.
Alguacil, M.T., Cuadros A. and Orts, V. 2000. “Openness and Growth: Re-Examining
Foreign Direct Investment, Trade, and Output Linkages in Latin America.” University Jaume
I of Castellón, Spain.
Athukorala, P.P.A.W. 2003. “The Impact of Foreign Direct Investment for Economic Growth:
A Case Study in Sri Lanka.” International Conference on Sri Lanka Studies,
http://www.freewebs.com/slageconf/9thics/spprslfulp092.pdf
Balassa, B. 1985. “Exports, Policy Choices and Economic Growth in Developing Countries
after the 1973 Oil Shock.” Journal of Development Economics 18 (1): 23-35.
Balasubramanyam, V.N., Salisu M.A. and Sapsford, D. 1996. “Foreign direct investment and
growth in EP and IS countries”. The Economic Journal, 106: 92-105.
Blomstrom, M., Lipsey R. and Zejan M. 1992. “What explains Developing Country
Growth?”, NBER Working Paper Series, No. 4132.
Borensztein, E., Gregorio, J.D. and Lee, J.W. 1998. “How does foreign direct investment
affect economic growth?” Journal of International Economics, 45: 115-35.
Boughzala, M., 2010. “The Tunisia-European Union Free Trade Area Fourteen Years”.
http://www.iemed.org/anuari/2010/aarticles/Boughazala_Tunisia_EU_en.pdf
Boyd, J.H. and Smith, B.D. 1992. “Intermediation and the Equilibrium Allocation of
Investment Capital: Implications for Economics Development”, Journal of Monetary
Economics, Vol. 30, pp. 409-32.
Carkovic, M. and Levine, R. 2002. “Does Foreign Direct Investment Accelerate Economic
Growth?”, in Does Foreign Direct Investment Promote Development? Moran T.H., Graham
E.M. and Blomstrom M. (eds.), Institute for International Economics.
Caselli, F., Esquivel, G. and Lefort, F. 1996. “Reopening the convergence debate: A new
look at cross-country growth empirics”. Journal of Economic Growth 1(3).
Darrat A.F., Kherfi S. and Soliman M. 2005. “FDI and Economic Growth in CEE and
MENA Countries: A Tale of Two Regions”. 12th Economic Research Forum’s Annual
Conference, Cairo, Egypt.
De Mello, L.R., Jr., 1999. “Foreign direct investment-led growth evidence from time series
and panel data”. Oxford Economic Papers, 51: 133-151.
Dritsaki, M., C. Dritsaki and A. Adamopoulos, 2004. “A Causal Relationship between Trade,
Foreign Direct Investment and Economic Growth for Greece”. American Journal of Applied
Science, 1: 230-235.
Elliott, G., T.J. Rothenberg and J.H. Stock, 1996. “Efficient tests for an autoregressive unit
root”. Econometrica, 64: 813-36.
Engle, R.F. and Granger, C.W.J. 1987. “Cointegration and Error-correction - Representation,
Estimation and Testing”, Econometrica 55, 251-78.
Frankel, J.A. and D. Romer, 1999. “Does trade cause growth?” American Economic Review,
89: 379-99.
Ghali S., Rezgui S., 2007. “FDI Contribution to Technical Efficiency in The Tunisian
Manufacturing Sector: A combined empirical approach”. 14th Economic Research Forum’s
Annual Conference, Cairo, Egypt.
Ghirmay, T., Grabowski, R., and Sharma, S. 2001. “Exports, Investment, Efficiency, and
Economic Growth in LDCs an empirical investigation.” Applied Economics 33 (6),
Department of Economics, Southern Illinois University, Carbondale, IL.
Harris, R. and Sollis, R. 2003. “Applied Time Series Modelling and Forecasting”. Wiley,
West Sussex.
Hisarciklilar, M., Kayam, S.S. Kayalica, M.O. and Ozkale. N.L. 2006. “Foreign direct
investment and growth in Mediterranean countries”.
Karbasi, A., Mahamadi E. and Ghofrani, S. 2005. “Impact of foreign direct investment on
economic growth”. 12th Economic Research Forum’s Annual Conference, Cairo, Egypt.
Levin, A., Lin, C.F. and Chu, C. 2002. “Unit root tests in panel data: Asymptotic and finite-
sample properties”. Journal of Econometrics, 108: 1–24.
Lipsey, R.E., 2000. “Inward FDI and economic growth in developing countries”.
Transnational Corporations, 9: 61-95.
Mansouri, B., 2005. “The interactive impact of FDI and trade openness on economic growth:
Evidence from Morocco”. 12th Economic Research Forum’s Annual Conference, Cairo,
Egypt.
Morley, B. 2006. “Causality Between Economic Growth and Migration: An ARDL Bounds
Testing Approach”, Economics Letters 90, 72-76.
Nair-Reichert U. and Weinhold D. 2001. “Causality Tests for Cross-Country Panels: A New
Look at FDI and Economic Growth in Developing Countries”, Oxford Bulletin of Economics
and Statistics, Vol. 63, pp. 153-171.
Narayan, P.K. and Smyth, R. 2006. “Higher Education, Real Income and Real Investment in
China: Evidence From Granger Causality Tests”, Education Economics 14, 107-125.
Narayan, P. K., Narayan, S., Prasad, B. C., Prasad, A. 2007. "Export-led growth hypothesis:
evidence from Papua New Guinea and Fiji", Journal of Economic Studies, 34: (4), 341 -351.
Narayan, P.K. and Smyth, R. 2008. “Energy Consumption and Real GDP in G7 Countries:
New Evidence From Panel Cointegration With Structural Breaks”, Energy Economics 30,
2331-2341.
Pahlavani, M., Wilson, E., and Worthington, A.C. 2005. “Trade-GDP nexus in Iran: An
application of autoregressive distributed lag (ARDL) model”. American Journal of Applied
Science, 2: 1158-1165.
Pesaran, M. and Shin, Y. (1999), “An Autoregressive Distributed Lag Modeling Approach to
Cointegration Analysis” in S. Strom, (ed) Econometrics and Economic Theory in the 20th
Century: The Ragnar Frisch centennial Symposium, Cambridge University Press, Cambridge.
Pesaran, M.H., Shin, Y. and Smith, R.J. 2001. “Bounds testing approaches to the analysis of
level relationships.” Journal of Applied Econometrics 16: 289-326.
Phillips, P.C.B. and Perron, P. 1988. “Testing for a Unit root in Time Series Regression”,
Biometrika 75: 335-346.
Rahman, M. 2007. “Contributions of Exports, FDI and Expatriates’ Remittances to Real GDP
Of Bangladesh, India, Pakistan and Sri Lanka”. Southwestern Economic Review, 141-154.
Sengupta, J.K. and Espana, J.R. 1996. “Exports and economic growth in Asian NICs: An
Econometric analysis for Korea”. Applied Economics, 26.
Xu, B., 2000. “Multinational enterprises, technology diffusion and host country productivity
growth”. Journal of Development Economics, 62: 477-93.
Yao, S. 2006. “On Economic Growth, FDI, and Exports in China”. Applied Economics 38
(3): 339-351.
Introduction ARDL model EC representation Bounds testing Postestimation Further topics Summary
S. Kripfganz and D. C. Schneider ardl: Estimating autoregressive distributed lag and equilibrium correction models 1/44
1
Another commonly used abbreviation is ADL.
[Figure: log consumption, log income and log investment, 1960-1980]
Data: National accounts, West Germany, seasonally adjusted, quarterly, billion DM, Lütkepohl (1993, Table E.1).
ARDL model
ARDL(p, q, . . . , q) model:
$$y_t = c_0 + c_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=0}^{q} \beta_i' x_{t-i} + u_t$$
ARDL(4,1,0) regression
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .4568483 .1064085 4.29 0.000 .2450887 .6686079
L2. | .3250994 .1127767 2.88 0.005 .1006666 .5495322
L3. | .1048324 .1092992 0.96 0.340 -.11268 .3223449
L4. | -.1632413 .0853844 -1.91 0.059 -.3331616 .0066791
|
ln_inc |
--. | .4629184 .078421 5.90 0.000 .3068557 .6189812
L1. | -.202756 .0965775 -2.10 0.039 -.3949513 -.0105607
|
ln_inv | .0080284 .0118391 0.68 0.500 -.0155322 .0315889
_cons | .0373585 .0143755 2.60 0.011 .0087504 .0659667
------------------------------------------------------------------------------
lagcombs[12,4]
ln_consump ln_inc ln_inv aic
r1 1 0 0 -585.22447
r2 1 1 0 -585.39189
r3 1 2 0 -583.88179
r4 2 0 0 -590.66282
r5 2 1 0 -592.6904
r6 2 2 0 -591.62792
r7 3 0 0 -588.69069
r8 3 1 0 -590.83183
r9 3 2 0 -589.67101
r10 4 0 0 -590.03466
r11 4 1 0 -592.73282
r12 4 2 0 -592.15636
. estat ic
-----------------------------------------------------------------------------
Model | Obs ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
. | 88 -64.51057 304.3747 8 -592.7495 -572.9308
-----------------------------------------------------------------------------
Note: N=Obs used in calculating BIC; see [R] BIC note.
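The lag choice implied by the `lagcombs` matrix above can be reproduced by hand: the preferred specification is the row with the smallest AIC. The sketch below simply copies the twelve rows from the matrix; it is only an illustration of the selection rule, not of how the Stata command searches internally.

```python
# Rows copied from the lagcombs matrix: (lags of ln_consump, ln_inc, ln_inv, AIC)
lagcombs = [
    (1, 0, 0, -585.22447), (1, 1, 0, -585.39189), (1, 2, 0, -583.88179),
    (2, 0, 0, -590.66282), (2, 1, 0, -592.6904),  (2, 2, 0, -591.62792),
    (3, 0, 0, -588.69069), (3, 1, 0, -590.83183), (3, 2, 0, -589.67101),
    (4, 0, 0, -590.03466), (4, 1, 0, -592.73282), (4, 2, 0, -592.15636),
]

# Smaller (more negative) AIC is preferred.
best = min(lagcombs, key=lambda row: row[3])
print("selected lag orders:", best[:3])
```

The selected orders agree with the ARDL(4,1,0) header of the regression output shown earlier.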
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 1
. timer list 1
1: 0.01 / 1 = 0.0150
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 2
. timer list 2
2: 0.75 / 1 = 0.7520
ARDL(2,0,4) regression
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .30383 .0942165 3.22 0.002 .1161411 .491519
L2. | .3195318 .0776321 4.12 0.000 .1648808 .4741828
|
ln_inc | .3767587 .0389267 9.68 0.000 .2992128 .4543046
|
ln_inv |
--. | .0581759 .0170736 3.41 0.001 .0241635 .0921884
L1. | -.0185484 .0214624 -0.86 0.390 -.0613036 .0242068
L2. | .01012 .021505 0.47 0.639 -.0327202 .0529602
L3. | -.0146641 .0213098 -0.69 0.493 -.0571154 .0277872
L4. | -.0488136 .0174121 -2.80 0.006 -.0835003 -.0141269
|
_cons | .0416317 .0107782 3.86 0.000 .0201603 .063103
------------------------------------------------------------------------------
EC representation
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
ln_inv | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inv |
D1. | .080464 .0187106 4.30 0.000 .0432214 .1177066
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
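The entries of this table can be traced back to the level-form ARDL coefficients reported earlier (the table whose constant is .0469399). The sketch below assumes the usual reparameterization: the adjustment coefficient is minus (1 minus the sum of the autoregressive coefficients), each long-run coefficient is the sum of a regressor's lag coefficients divided by that quantity, and each short-run coefficient is minus a tail sum of the lag coefficients. The function name is illustrative.

```python
# Level-form coefficients copied from the earlier ARDL output.
phi = [0.3068554, 0.325385]                       # ln_consump, lags 1-2
beta_inc = [0.3682844]                            # ln_inc, lag 0
beta_inv = [0.0656722, -0.0375288, 0.0228142,
            -0.0129321, -0.0528173]               # ln_inv, lags 0-4


def ec_terms(phi, beta):
    """Return (adjustment coefficient, long-run coefficient, short-run coefficients)."""
    alpha = 1.0 - sum(phi)
    long_run = sum(beta) / alpha
    # Coefficients on D1., LD., L2D., ... are minus the remaining tail sums.
    short_run = [-sum(beta[j:]) for j in range(1, len(beta))]
    return -alpha, long_run, short_run


adj, lr_inc, sr_inc = ec_terms(phi, beta_inc)
_, lr_inv, sr_inv = ec_terms(phi, beta_inv)
sr_consump = [-sum(phi[j:]) for j in range(1, len(phi))]   # coefficient on LD.ln_consump

print(round(adj, 7), round(lr_inc, 6), round(lr_inv, 7))
print([round(c, 7) for c in sr_inv], [round(c, 6) for c in sr_consump])
```

Because ln_inc enters the ARDL model with zero lags, it has no short-run term in this parameterization, which is why the SR block of the table lists only ln_consump and ln_inv.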
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc |
L1. | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
|
ln_inv |
L1. | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inc |
D1. | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
D1. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3788728 .0420886 -9.00 0.000 -.4626481 -.2950975
-------------+----------------------------------------------------------------
LR |
ln_inc | .9669152 .0039557 244.44 0.000 .9590416 .9747889
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.346926 .0806726 -4.30 0.000 -.5075007 -.1863512
L2D. | -.1074193 .0790118 -1.36 0.178 -.2646883 .0498497
|
ln_inv |
D1. | .0758713 .0176989 4.29 0.000 .0406425 .1111002
LD. | .0422224 .0191523 2.20 0.030 .0041008 .080344
L2D. | .0678568 .0185208 3.66 0.000 .030992 .1047216
L3D. | .0485441 .0179609 2.70 0.008 .0127938 .0842944
|
_cons | .0504873 .0114518 4.41 0.000 .027693 .0732816
------------------------------------------------------------------------------
EC representation: Interpretation
3 The test is not directly performed on the long-run coefficients $\theta = \sum_{j=0}^{q} \beta_j / \alpha$.
4
The number of short-run coefficients only affects the finite-sample but not the asymptotic critical values
(Cheung and Lai, 1995; Kripfganz and Schneider, 2018). The elements of ω in the ec1 parameterization for
variables that have 0 lags in the ARDL model do not count towards this number.
Test decisions:
Do not reject $H_0^F$ or $H_0^t$, respectively, if the test statistic is closer to zero than the lower bound of the critical values.
Reject $H_0^F$ or $H_0^t$, respectively, if the test statistic is more extreme than the upper bound of the critical values.
The first two steps of the bounds test are implemented in the ardl postestimation command estat ectest.
By default, finite-sample critical values for the 1%, 5%, and 10% significance levels are provided. Asymptotic critical values are displayed with option asymptotic. Alternative significance levels can be specified with option siglevels(numlist).
The test statistics in step 3 have the usual asymptotic standard normal (or χ²) distributions irrespective of the integration order of the independent variables.⁵
5 The OLS estimator for the long-run coefficients θ of I(1) independent variables is “super-consistent” with convergence rate T instead of √T (Pesaran and Shin, 1998; Hassler and Wolters, 2006).
. estat ectest
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.032 4.831 | 4.958 5.843 | 7.070 8.119 | 0.000 0.000
t | -2.550 -2.899 | -2.861 -3.225 | -3.470 -3.854 | 0.000 0.000
do not reject H0 if
both F and t are closer to zero than critical values for I(0) variables
(if p-values > desired level for I(0) variables)
reject H0 if
both F and t are more extreme than critical values for I(1) variables
(if p-values < desired level for I(1) variables)
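The decision rule printed under the table can be written as a small helper applied to one statistic at a time. The critical values below are the 5% bounds from the output above; the two test statistics are hypothetical inputs chosen for illustration, since the output shown here does not print the statistics themselves. Values that fall between the two bounds are labelled inconclusive, the standard third outcome of a bounds test.

```python
def bounds_decision(stat: float, bound_i0: float, bound_i1: float) -> str:
    """Compare one statistic with its I(0) and I(1) critical bounds.

    Absolute values are used so the same rule covers the F-statistic
    (positive) and the t-statistic (negative).
    """
    s, lower, upper = abs(stat), abs(bound_i0), abs(bound_i1)
    if s < lower:
        return "do not reject H0"
    if s > upper:
        return "reject H0"
    return "inconclusive"


# 5% critical bounds from the estat ectest output; statistics are hypothetical.
print(bounds_decision(5.20, 4.958, 5.843))     # F between the two bounds
print(bounds_decision(-9.00, -2.861, -3.225))  # t far beyond the I(1) bound
```

As the output notes, an overall rejection requires both the F and the t statistic to lie beyond their I(1) bounds.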
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.341178 .0431316 -7.91 0.000 -.4270464 -.2553096
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.14358 .0782318 14.62 0.000 .9878321 1.299327
qtr | -.0036516 .0016171 -2.26 0.027 -.006871 -.0004322
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.4362663 .0851 -5.13 0.000 -.6056874 -.2668452
L2D. | -.1899566 .0825977 -2.30 0.024 -.354396 -.0255172
|
ln_inv |
D1. | .0842961 .0173889 4.85 0.000 .0496775 .1189146
LD. | .0517241 .0188448 2.74 0.008 .0142069 .0892412
L2D. | .0726232 .017972 4.04 0.000 .0368437 .1084027
L3D. | .0482872 .0173383 2.79 0.007 .0137693 .0828051
|
_cons | -.3188651 .1422961 -2.24 0.028 -.602155 -.0355753
------------------------------------------------------------------------------
. estat ectest
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.066 4.582 | 4.784 5.351 | 6.396 7.057 | 0.000 0.000
t | -3.107 -3.384 | -3.412 -3.704 | -4.014 -4.327 | 0.000 0.000
do not reject H0 if
both F and t are closer to zero than critical values for I(0) variables
(if p-values > desired level for I(0) variables)
reject H0 if
both F and t are more extreme than critical values for I(1) variables
(if p-values < desired level for I(1) variables)
Postestimation commands
Postestimation commands
6
estat dwatson is not valid for ARDL / EC models because the lagged dependent variable is not strictly
exogenous by construction.
. estat hettest
chi2(1) = 0.26
Prob > chi2 = 0.6067
chi2(54) = 52.03
Prob > chi2 = 0.5508
---------------------------------------------------
Source | chi2 df p
---------------------+-----------------------------
Heteroskedasticity | 52.03 54 0.5508
Skewness | 12.24 9 0.2000
Kurtosis | 0.02 1 0.8967
---------------------+-----------------------------
Total | 64.29 64 0.4664
---------------------------------------------------
. sktest resid
. qnorm resid
. pnorm resid
[Figures: normal quantile plot (qnorm) and standardized normal probability plot (pnorm) of the residuals]
Number of obs = 88
Number of obs = 88
Note: This is a test for a structural break in the speed-of-adjustment and long-run coefficients.
Further topics
ARDL(4) regression
------------------------------------------------------------------------------
   D.dln_inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ          |
     dln_inv |
         L1. |   -.755277   .2295731    -3.29   0.001    -1.211971   -.2985831
-------------+----------------------------------------------------------------
LR           |
       _cons |    .015006   .0060544     2.48   0.015     .0029618    .0270501
-------------+----------------------------------------------------------------
SR           |
     dln_inv |
         LD. |  -.4633003   .2005284    -2.31   0.023    -.8622152   -.0643855
        L2D. |  -.4938993   .1577325    -3.13   0.002    -.8076796    -.180119
        L3D. |  -.3133117   .1029967    -3.04   0.003    -.5182049   -.1084184
------------------------------------------------------------------------------
Note: The aim is to test whether dln_inv, the first difference of ln_inv, is nonstationary.
. estat ectest
   |       10%        |        5%        |        1%        |     p-value
   |   I(0)     I(1)  |   I(0)     I(1)  |   I(0)     I(1)  |   I(0)    I(1)
---+------------------+------------------+------------------+-----------------
 F |   3.823    3.812 |   4.677    4.659 |   6.644    6.601 |   0.026   0.025
 t |  -2.565   -2.569 |  -2.869   -2.874 |  -3.463   -3.472 |   0.017   0.017

do not reject H0 if
    both F and t are closer to zero than critical values for I(0) variables
      (if p-values > desired level for I(0) variables)
reject H0 if
    both F and t are more extreme than critical values for I(1) variables
      (if p-values < desired level for I(1) variables)
Note: The null hypothesis is that dln_inv follows a unit root process (without drift).
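The decision rule printed with the estat ectest output can be written out as a short sketch. The Python function below is purely illustrative and is not part of the ardl package: the name `bounds_decision` is hypothetical, and treating everything between the two bounds as "inconclusive" is an assumption based on the usual reading of bounds tests. It works with the approximate p-values reported in the output.

```python
def bounds_decision(p_values, level):
    """Apply the p-value form of the decision rule to the F and t statistics.

    p_values maps each statistic name to (p-value for I(0), p-value for I(1)).
    """
    # Reject H0 only if every statistic is significant against the I(1) bound.
    if all(p_i1 < level for _, p_i1 in p_values.values()):
        return "reject H0"
    # Do not reject H0 if every statistic is insignificant against the I(0) bound.
    if all(p_i0 > level for p_i0, _ in p_values.values()):
        return "do not reject H0"
    # Assumption: anything in between is treated as inconclusive.
    return "inconclusive"


# Approximate p-values copied from the estat ectest output.
reported = {"F": (0.026, 0.025), "t": (0.017, 0.017)}

print(bounds_decision(reported, level=0.05))
print(bounds_decision(reported, level=0.01))
```

With the reported p-values, the rule rejects the null hypothesis at the 5% level but not at the 1% level.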
------------------------------------------------------------------------------
   D.dln_inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dln_inv |
         L1. |   -.755277   .2295731    -3.29   0.001    -1.211971   -.2985831
         LD. |  -.4633003   .2005284    -2.31   0.023    -.8622152   -.0643855
        L2D. |  -.4938993   .1577325    -3.13   0.002    -.8076796    -.180119
        L3D. |  -.3133117   .1029967    -3.04   0.003    -.5182049   -.1084184
             |
       _cons |   .0113337   .0060208     1.88   0.063    -.0006437     .023311
------------------------------------------------------------------------------
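The constant in this regression and the constant in the LR block of the earlier ARDL(4) output can be cross-checked numerically. The sketch below assumes that the long-run constant equals the regression constant divided by the negative of the adjustment coefficient; this relation is inferred from the reported numbers, and the variable names are illustrative.

```python
# Coefficients copied from the two outputs.
adj = -0.755277           # ADJ block: coefficient on L1.dln_inv
cons_regress = 0.0113337  # _cons in the regression of D.dln_inv
cons_lr = 0.015006        # _cons reported in the LR block

# Assumed relation: long-run constant = constant / (-adjustment coefficient).
implied_lr = cons_regress / -adj
print(round(implied_lr, 6), cons_lr)
```

The implied value agrees with the reported LR constant up to rounding of the printed coefficients.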
1981q1: ...........
1981q2: ...........
1981q3: ...........
1981q4: ...........
1982q1: ...........
1982q2: ..........
1982q3: ..........
1982q4: ...........
[Figure: forecast plot, horizontal axis 1979–1982; plot not reproduced]
Note: The forecast period (1981q1 – 1982q4) is excluded from the estimation period (1961q1 – 1980q4).
------------------------------------------------------------------------------
             |             Newey-West
  ln_consump |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  ln_consump |
         L1. |   .2225557   .0931767     2.39   0.019     .0370552    .4080562
         L2. |   .2463097   .1003579     2.45   0.016     .0465125    .4461068
         L3. |   .1899566   .1013927     1.87   0.065    -.0119008    .3918141
             |
      ln_inc |   .3901642   .0400174     9.75   0.000     .3104956    .4698327
             |
      ln_inv |
         D1. |   .0842961   .0258047     3.27   0.002     .0329229    .1356693
         LD. |   .0517241   .0158053     3.27   0.002     .0202582      .08319
        L2D. |   .0726232   .0156803     4.63   0.000     .0414061    .1038404
        L3D. |   .0482872    .017342     2.78   0.007      .013762    .0828124
             |
         qtr |  -.0012458    .000383    -3.25   0.002    -.0020083   -.0004833
       _cons |  -.3188651   .1104624    -2.89   0.005    -.5387789   -.0989513
------------------------------------------------------------------------------
------------------------------------------------------------------------------
  ln_consump |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |    1.14358   .0691576    16.54   0.000     1.008033    1.279126
------------------------------------------------------------------------------
Note: This is the same long-run coefficient as earlier but with Newey-West standard errors.
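The point estimate of this long-run coefficient can be reproduced from the Newey-West regression output. The sketch below assumes that _nl_1 is the coefficient on ln_inc divided by one minus the sum of the coefficients on the lags of ln_consump; this reading is inferred from the numbers rather than stated on the slide, and the variable names are illustrative.

```python
# Level coefficients copied from the Newey-West regression output.
ar_lags = [0.2225557, 0.2463097, 0.1899566]  # L1., L2., L3. of ln_consump
beta_inc = 0.3901642                         # coefficient on ln_inc

# Long-run coefficient: level coefficient over one minus the sum of AR coefficients.
long_run_inc = beta_inc / (1.0 - sum(ar_lags))
print(round(long_run_inc, 5))
```

The result matches the reported value of 1.14358 up to rounding; the standard error would additionally require the coefficient covariance matrix, which the slides do not print.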
help ardl
help ardl postestimation
References
Cheung, Y.-W., and K. S. Lai (1995). Lag order and critical values of the augmented Dickey-Fuller test.
Journal of Business & Economic Statistics 13(3): 277–280.
Engle, R. F., and C. W. J. Granger (1987). Co-integration and error correction: representation, estimation,
and testing. Econometrica 55(2): 251–276.
Hassler, U., and J. Wolters (2006). Autoregressive distributed lag models and cointegration. Allgemeines
Statistisches Archiv 90(1): 59–74.
Kripfganz, S., and D. C. Schneider (2018). Response surface regressions for critical value bounds and
approximate p-values in equilibrium correction models. Manuscript, University of Exeter and Max Planck
Institute for Demographic Research, www.kripfganz.de.
Lütkepohl, H. (1993). Introduction to Multiple Time Series Analysis (2nd edition), Berlin, New York:
Springer.
Narayan, P. K. (2005). The saving and investment nexus for China: evidence from cointegration tests.
Applied Economics 37(17): 1979–1990.
Pesaran, M. H., and Y. Shin (1998). An autoregressive distributed-lag modelling approach to cointegration
analysis. In Econometrics and Economic Theory in the 20th Century. The Ragnar Frisch Centennial
Symposium, ed. S. Strøm, chap. 11, 371–413. Cambridge: Cambridge University Press.
Pesaran, M. H., Y. Shin, and R. Smith (2001). Bounds testing approaches to the analysis of level
relationships. Journal of Applied Econometrics 16(3): 289–326.
Frontiers in African Business Research
Studies on Economic
Development and
Growth in Selected
African Countries
Frontiers in African Business Research
Series editor
Almas Heshmati, Jönköping International Business School,
Jönköping, Sweden
This book series publishes monographs and edited volumes devoted to studies on
entrepreneurship, innovation, as well as business development and management-
related issues in Africa. Volumes cover in-depth analyses of individual countries,
regions, cases, and comparative studies. They include both a specific and a general
focus on the latest advances of the various aspects of entrepreneurship, innovation,
business development, management and the policies that set the business environ-
ment. It provides a platform for researchers globally to carry out rigorous analyses,
to promote, share, and discuss issues, findings and perspectives in various areas
of business development, management, finance, human resources, technology, and
the implementation of policies and strategies of the African continent. Frontiers in
African Business Research allows for a deeper appreciation of the various issues
around African business development with high quality and peer reviewed contri-
butions. Volumes published in the series are important reading for academicians,
consultants, business professionals, entrepreneurs, managers, as well as policy
makers, interested in the private sector development of the African continent.
Editor
Almas Heshmati
Jönköping International Business School
Jönköping University
Jönköping
Sweden
and
Department of Economics
Sogang University
Seoul
South Korea
AD Aggregate demand
ADF Augmented Dickey-Fuller
AfDB African Development Bank
AIC Akaike information criterion
ANOVA Analysis of variance
AR Autoregressive
ARDL Autoregressive distributed lag
AS Aggregate supply
ATM Automatic teller machines
AVC Agriculture value chains
CBHIS Community-based health insurance schemes
CD Cobb–Douglas function
CEO Chief executive officer
CLRM Classical linear regression models
COPIMAR Mining Cooperative of Artisan Miners
CORR Control of corruption
CPI Consumer Price Index
CS Capital structure
CUSUM Cumulative sum
CUSUMQ Cumulative sum of squares
DC Developing countries
DHS Demographic and Health Survey
EAC East African Countries
ECM Error correction model
EDPRS Economic Development and Poverty Reduction Strategy
EIA Ethiopian Investment Authority
EICV Integrated Household Living Conditions Survey
ELH Ethno-linguistic heterogeneity
EP Export performance
EPRDF Ethiopian People’s Revolutionary Democratic Front
EU European Union
FAO Food and Agriculture Organization
FDI Foreign direct investment
GDP Gross domestic product
GLR Great Lakes region
GMM Generalized method of moments
GOVEFFE Government effectiveness
HAC Heteroskedasticity and auto-correlation consistent covariance
ICMM International Council on Mining and Metals
ICT Information and communication technologies
IMF International Monetary Fund
KES Kenyan shilling
KEU Kenya Economic Update
LDE Logistic diffusion equation
LDEFO Logistic differential equation of first order
LIC Low-income countries
LM Lagrangian multiplier
LSDV Least squares dummy variable
MDG Millennium development goals
MIC Middle-income countries
MLE Maximum likelihood estimation
MNC Multinational corporation
MOFED Ministry of Finance and Economic Development
NBE National Bank of Ethiopia
NGO Non-Governmental Organization
NISR National Institute of Statistics for Rwanda
NRG New Resolutions Geophysics
OECD Organisation for Economic Co-operation and Development
OLS Ordinary least squares
OOPE Out-of-pocket healthcare expenditures
PMG Pooled mean group
POLS Pooled OLS
POLSTAB Political stability
PPP Purchasing power parity
PTA Prospective target areas
PWT Penn World Tables
R&D Research and Development
RA Representative Agent
RoL Rule of Law
RQ Regulatory Quality
SAP Structural adjustment programs
SIDA Swedish International Development Cooperation Agency
SME Small- and medium-sized enterprises
Almas Heshmati
Abstract A major policy challenge facing Africa is how to sustain a high rate of
economic growth that is both socially inclusive and environmentally sustainable.
Growth and its sustainability influence many other challenges facing the continent.
This volume is a collection of selected empirical studies on economic development
and growth in Africa. The papers were presented at the second conference on
Recent Trends in Economic Development, Finance and Management Research in
Eastern Africa, Kigali, Rwanda, June 20-22, 2016. The studies are grouped into
domains influencing economic development and growth in Africa.
Keywords Economic Development · Economic Growth · Sustainable Growth ·
Determinants of Growth · Governance and Institutions · African Countries
1.1 Background
A major policy challenge facing Africa is how to sustain a high rate of economic
growth that is both socially inclusive and environmentally sustainable.
Population aging, population growth, rapid urbanization, infrastructure for
providing services, facilitating production expansion, the need to reverse the
decline in economic growth after the 2008 global financial crisis, corruption,
inefficiency, and responding to climate change are among the other challenges
facing Africa. Against this background, Jönköping International Business School
and the University of Rwanda organize a conference on economic development in the
region every year.
This volume is a collection of selected empirical studies on economic development
and growth in Africa. The papers were selected from a set of more than 90 papers
presented at the second conference on Recent Trends in Economic Development,
Finance and Management Research in Eastern Africa, Kigali, Rwanda, June
20–22, 2016. Following a process of review and revisions, 15 papers were accepted
for publication in this edited volume on economic development and growth.
A. Heshmati (✉)
Jönköping International Business School, Jönköping University, Jönköping, Sweden
e-mail: almas.heshmati@gmail.com
A. Heshmati
Department of Economics, Sogang University, Seoul, South Korea
The studies are grouped into domains influencing economic development and
growth in Africa. The core argument for taking a multiple-approach perspective is
the need to account for the different ways in which growth and development can be
enhanced. The aim is not to identify specific determinants of growth and
development and apply them to a set of countries on the assumption that every
country is affected in the same way and by the same factors. Instead, this volume
recognizes that countries differ in their initial conditions and factor
endowments and, as such, respond differently to development and growth policies.
Together, the chapters provide a comprehensive picture of the state of
development and growth and of their country-specific determinants and policies.
They consider the heterogeneity of countries and efficient policies and practices
for growth and its distribution, both for the African continent as a whole and
for selected countries, mainly in Eastern Africa. Development and growth
represent a major challenge for governments and organizations whose aim is
development and alleviating poverty.
This volume contains a collection of empirical studies on the level of devel-
opment and growth, and their variations and determinants in Africa. The first
chapter is an introduction/summary written by the volume’s Editor. The remaining
15 chapters are inter-related studies that are grouped into five domains which
influence the level, variations, and developments on the African continent as a
whole and also in individual countries. The results can have strong implications for
the development and policies in Africa.
The second study (Chap. 9), Income Distribution and Economic Growth, by
Atnafu GEBREMESKEL, links access to bank loans and income distribution to
productivity growth of firms. Using Ethiopian manufacturing firm-level data, the
study examines how functional income distribution can influence the evolution of
productivity, thereby promoting economic growth. It employs an evolutionary
economic framework and econometric approach for the analysis. The results show
lack of strong evidence of intra-industry selection for fostering productivity growth
and structural change. The key policy lesson is that access to bank loans is of great
importance to firms for their structural transformation.
Part D. Trade, Mineral Exports, and Exchange Rate
Part D covers German SMEs’ trade with sub-Saharan Africa, contributions of
mineral exports to Rwanda’s trade, and the relationship between economic growth
and real exchange rate in low- and middle-income countries.
The first paper (Chap. 10), SME Trade with sub-Saharan Africa: The secret of
German companies’ success, by Johannes O. BOCKMANN, evaluates the degree
to which internal, micro- and macro-environmental variables explain how some
SMEs based in Germany export more successfully to sub-Saharan Africa than
others in the same category. Econometric methods are used to identify
the determinants of export performance. Estimation results indicate that
sub-Saharan Africa has specific requirements for successful exports. Knowledge
about these particular characteristics of the market enables managers and policy-
makers to improve trade relations. By focusing on the export performance of
German SMEs in SSA, this study fills a research gap since no previous study has
dealt with this specific aspect.
The second study (Chap. 11), An Assessment of the Contribution of Mineral
Exports to Rwanda’s Total Exports, by Emmanuel MUSHIMIYIMANA, assesses the
mineral industry’s contribution to Rwanda’s growing mineral exports. Mineral
exports can be a means of increasing exports for agrarian and low- and
middle-income countries. The results, based on econometric methods, show that
mineral exports are the main contributor to the increase in Rwanda’s total
exports. This implies that the Government of Rwanda needs to introduce significant
reforms in the mining sector and take Botswana and Namibia as its role models in
developing its mineral industry, which can play a role in the industrialization of
the country.
The third study (Chap. 12), Testing the External Balassa Hypothesis in Low- and
Middle-Income Countries, by Fentahun BAYLIE, analyzes the long-run relation-
ship between economic growth and the real exchange rate for 15 low- and
middle-income countries. It establishes a co-integration relationship between
growth and exchange rate by controlling for heterogeneity and cross-sectional
dependence. It implies that the productivity effect is estimated consistently and
without any bias. Moreover, the results indicate that the effect of the Balassa term
depends more on income levels than on the rate of economic growth. In general, the
power of the effect is stronger for higher income countries in the long run.
However, in the short run, fiscal policy and exchange rate volatility clearly explain
the variations in the real exchange rate.
The primary market for this edited book includes undergraduate and graduate
students, lecturers, researchers, public and private institutions, NGOs, international
aid agencies, and decision makers. This book can serve as complementary reading
to texts on economic growth, development, welfare, inequality, and poverty anal-
yses in Africa. The organizers of the annual conference on economic development
in East Africa will market the book at their annual East Africa conferences. There
are many books on growth and development in Africa, but they rarely cover such
diversity in approaches, country-specific analyses, and policy
recommendations.
This edited book is authored by African experts in the field who employ diverse
up-to-date methods to provide robust empirical results based on representative
disaggregate data at the household and firm levels and aggregate data covering
individual or multiple countries on the continent. It contains a wealth of empirical
evidence, deep analyses, and sound recommendations for policymakers and
researchers for designing and implementing effective economic policies and
strategies to achieve rapid and higher levels of development. As such, the book is a
useful resource for policymakers and researchers involved in development- and
growth-related tasks. It will also appeal to a broader audience interested in eco-
nomic development, resources, policies, economic welfare, and inclusive growth.
The Editor is grateful to a host of dedicated authors and rigorous referees
who helped in assessing the submitted papers. Many were presenters at the 2016
conference at the University of Rwanda. Special thanks go to
Bideri Ishuheri Nyamulinda, Rama Rao, and Lars Hartvigson and the remaining
members of the Organization Committee for their efforts in organizing the con-
ference. The Editor would also like to thank William Achauer at Springer Singapore
for guidance and for assessing this manuscript for publication by Springer.
Financial support by the Swedish International Development Cooperation Agency
(SIDA) to organize the conference is gratefully acknowledged.
Part I
Women’s Empowerment and Demand for
Healthcare
Chapter 2
Measuring Women’s Empowerment
in Rwanda
A. Musonera
MIFOTRA-SPIU, Kigali, Rwanda
e-mail: abdoumusonera@gmail.com
A. Heshmati (✉)
Jönköping International Business School (JIBS),
Jönköping University, Jönköping, Sweden
e-mail: almas.heshmati@gmail.com
A. Heshmati
Department of Economics, Sogang University, Seoul, South Korea
2.1 Introduction
Similarly, Golla et al. (2011) highlight women’s empowerment as one of the key
drivers in promoting their abilities, rights, and well-being which subsequently
reduce poverty and increase economic growth, productivity, and efficiency.
However, very few empirical studies use Rwandan data; examples are Ali et al.
(2014), who study the environmental and gender impact of land tenure
regularization in Africa, and Mukashimana and Sapsford (2013), who study
marital conflicts in Rwanda.
In this study, we investigate the determinants of women’s empowerment in
Rwanda, especially what determines household decision-making and self-esteem.
We address two questions: first, whether variables capturing sources of
empowerment (education, employment for cash, regular media exposure, and wealth)
have a significant positive association with women’s empowerment; and second,
whether some ‘setting’ variables (age of the respondent and children ever born)
are positively related to women’s empowerment, while others, such as residence
and age at first marriage, are negatively associated with it.
Data used in the current study are from the Demographic and Health Survey
(DHS) conducted in 2010 by the National Institute of Statistics for Rwanda (NISR
2010a, 2013). Respondents were married women aged between 15 and 49.
A multiple regression analysis was used to empirically analyze the determinants of
women’s empowerment in Rwanda. A multinomial logistic regression was also
used to examine the relationship between household decision-making, justifications
about wife beating, and women’s empowerment covariates.
We found evidence that women’s empowerment can be achieved through pro-
viding education, media exposure, labor force participation, shifting negative tra-
ditional cultural norms (such as giving respect to women with more children,
marrying girls at an earlier age), and by focusing on integrated development.
The rest of this paper is organized as follows: The next section reviews literature
on the relationship between women’s empowerment and health outcomes, labor
force participation, access to finance and cultural norms. Section 2.3 describes the
empirical strategy. After an overview of the findings in Sect. 2.4, these are dis-
cussed in Sect. 2.5. The last section gives a conclusion.
We review literature from three perspectives: The first is concerned with the defi-
nitions of women’s empowerment. The second pertains to the determinants of
women’s empowerment and the association between their empowerment and dif-
ferent health outcomes, cultural norms and the influence of labor force participation
and women’s access to finance on their empowerment. The third strand relates to
the conceptual framework.
of agency (carrying out their roles and responsibilities), and transformative (ca-
pacity to act on the restrictive aspects of roles and responsibilities and being able to
challenge them).
In the new global economy, women’s empowerment has become a central issue for
countries to achieve development goals such as economic growth, poverty
reduction, health, education, and welfare (Golla et al. 2011). Recently, there has
been renewed interest in the relationship between women’s empowerment and health
outcomes. Some of these studies focus on women’s empowerment and health care use
(Bloom et al. 2001; Fotso et al. 2009; Lee-Rife 2010; Sado et al. 2014). Women’s
empowerment has been identified as a driving force in ensuring improved maternal
health care (Sado et al. 2014). The place of delivery is mainly influenced by wealth,
education, and demographic and health covariates, while autonomy,
decision-making and freedom of movement are found to have little influence on the
place of delivery (Fotso et al. 2009).
Women’s involvement in decision-making and their attitudes toward negative
cultural norms such as domestic violence have been highlighted as the main
determinants in the use of maternal healthcare services (Sado et al. 2014).
Overall, these studies highlight the need for policy actions that focus not only on
education but also on other factors that are likely to enhance health status with the
aim of improving health outcomes for women and their families.
However, a majority of these maternal health studies mainly focus on women’s
individual-level variables such as age, education, and income or community-level
factors while little attention is paid to the effect of bargaining powers within
households. Thus, without an unbiased and accurate measurement of power,
decision-making processes, and the different paths through which they affect
reproductive health outcomes, our understanding of the covariates of maternal and
child health is incomplete.
A large and growing body of literature has investigated the association between
women’s empowerment and fertility preferences (Abadian 1996; Al Riyami et al.
2004; Larsen and Hollos 2003; Patrikar et al. 2014; Schuler et al. 1996; Upadhyay
and Karasek 2012; Upadhyay et al. 2014). Fertility preferences are mainly influ-
enced by women’s resource control, freedom of movement, and freedom from
household domination. The most striking result to emerge from the data is that all
three variables exert little influence on contraceptive use (Schuler et al. 1996).
The results are not consistent with regard to the number of children because some of
the studies show a negative relationship between women’s empowerment and the
number of children, while others show that there is a positive connection between
women’s empowerment and fertility preferences (having children or not). A few
studies also show that there is no connection between empowerment and fertility
preferences (Upadhyay et al. 2014).
Women get empowered through two pathways (different ways of being and
experience sharing) that operate individually. However, it is also found that a
woman’s potential to attain positive outcomes is accelerated when she possesses
more than one pathway (Allsopp and Tallontire 2014). The level of empowerment
in a village depends on different pathways (personal, economic, and political) and
linkages across scale ranging from personal bodies and household relations to the
community (Goldman and Little 2014). Kabeer (1999a, b) points out that women’s
empowerment is conceptualized as a three-dimensional process that encompasses
resources or pre-conditions of empowerment, agency, or process and achievements
that measure outcomes. Kabeer further argues that women’s potential to exercise
strategic life choices is conceptualized in terms of three dimensions or moments for
the social change process to be completed:
Our study set out to assess what determines women’s empowerment in Rwanda
using household decision-making and self-esteem indicators. The results will
extend our knowledge of variables which are a source and setting of empowerment.
The data used are from the 2010 Demographic and Health Survey (DHS) by the
National Institute of Statistics for Rwanda (NISR 2010b, 2013). The respondents
were married women aged between 15 and 49. A multiple regression analysis was
used to empirically analyze the determinants of women’s empowerment in Rwanda.
A multinomial logistic regression was used to examine the relationship between
household decision-making, justifications for wife beating, and women’s
empowerment covariates.
Questions on who had the final say on what to do with a respondent’s earnings,
respondent’s health care, large household purchases, and visits to family or relatives
were asked during the survey. Different responses for each question were labeled
as: others (0), joint decision (1), and decision alone (2). Then, each decision was
used as a dependent variable to determine the likelihood of that decision being
taken given different covariates of women’s empowerment using a multinomial
logistic regression.
Moreover, attitudes toward physical abuse (labeled wife beating in the survey)
were investigated using five questions on the circumstances under which the
respondent considered wife beating justified: going out without permission,
neglecting the children, arguing with her husband, burning the food, and refusing
to have sex with her husband. Responses to the questions were labeled: yes (1),
no (2), and others (0). Then, a multinomial logistic regression was used to regress
each response on the covariates of women’s empowerment and obtain the
corresponding odds ratios. The covariates were the same as those used in the
previous model of women’s empowerment, that is, age group, children ever born,
education, media exposure, employment for cash, residence, wealth, and age at
first marriage.
This baseline model is associated with models used by Kabeer and Subaiya
(2008), Sado et al. (2014), Mahmud et al. (2012), and Mahmud and Tasmeen
(2014). Kabeer and Subaiya (2008) point out that women’s empowerment is largely
determined by access and control over resources, indicators of sources of
empowerment (educational attainment, employment for cash and media exposure)
and a setting of empowerment including indicators such as a higher age at first
marriage and smaller spousal age difference.
The main weakness of Kabeer and Subaiya’s (2008) study is the paucity of data
on all indicators of women’s empowerment (only data on household
decision-making and attitudes toward wife beating was available) and some of the
covariates that were used in previous studies. Another weakness of their study is
that the results might have been affected by measuring women’s empowerment
using data which contained missing values.
Data used in our study were obtained from the Demographic and Health Survey
(DHS 2010a). The respondents were married women aged between 15 and 49.
Women’s empowerment was investigated using two indicators—household
decision-making and attitudes toward gender roles.
A. Dependent variables
The dependent variables used in our study were the cumulative empowerment
index (the main component) and its constituents, that is, the decision-making index,
the self-esteem index, decision-making (alone and jointly) and agreeing with jus-
tifications for wife beating (yes or no).
The decision-making index
Respondents were asked different questions regarding who had the final say on
different household decisions such as respondent’s health care, visits to family and
relatives, large household purchases and decision on what to do with the money that
the husband earned. The responses were coded 1 if the decision was taken by the
respondent alone, 2 if the decision was jointly taken by the respondent and her
husband, 3 if the decision was taken by the respondent and another person, 4 if the
decision was taken by the husband/partner alone, 5 if the decision was taken by
someone else, and 6 for others.
The decision-making index was computed by assigning scores to the different
responses: a score of 2 for every decision taken by the respondent alone, 1 for
every decision taken jointly, and 0 otherwise. The individual scores for the
different decisions were then added to give a total out of 10, that is, a maximum
of 2 points per decision × 5 questions.
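The scoring can be illustrated with a minimal sketch. The function name, the example respondent, and the division by the maximum score are illustrative assumptions; in particular, the text does not say whether a decision taken with another person (code 3) counts as joint, so it is scored 0 here.

```python
# Response codes described in the text: 1 = respondent alone,
# 2 = jointly with husband, 3 = respondent and another person,
# 4 = husband/partner alone, 5 = someone else, 6 = others.
# Assumption: only code 2 is treated as a joint decision.
SCORES = {1: 2, 2: 1}


def decision_making_index(responses):
    """Return (raw score, share of maximum score) for one respondent."""
    raw = sum(SCORES.get(code, 0) for code in responses)
    max_score = 2 * len(responses)  # 2 points per decision
    return raw, raw / max_score


# Hypothetical respondent answering five decision questions.
raw, share = decision_making_index([1, 2, 2, 4, 1])
print(raw, share)
```

For this hypothetical respondent, two decisions taken alone and two taken jointly give a raw score of 6 out of 10.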
The self-esteem index
Respondents were asked questions about their attitudes toward gender roles and
norms. They were also asked whether wife beating was justified under one of the
following circumstances:
• When she goes out without telling her husband.
• If she neglects the children.
• If she argues with her husband.
• If she refuses to have sex with her husband.
• If she burns the food.
In the survey, responses were coded 1 if the respondent said yes and 0 if the
respondent said no.
For the index, we reversed this coding: a score of 1 was assigned to every
response where the respondent said no and 0 to every response where the
respondent said yes. The individual scores were then added to give a total
out of five (a maximum of 1 mark × 5 questions).
Once divided by its maximum score, the value of either the decision-making index
or the self-esteem index falls in the interval 0–1, or equivalently between 0 and 100%.
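As a rough sketch of the self-esteem scoring and of the rescaling to the 0–1 interval, the two steps could be written as follows. The function names and the example answers are hypothetical, and dividing by the maximum score is an assumption about how the 0–1 range is reached.

```python
def self_esteem_index(agrees: list[int]) -> int:
    """Count the scenarios in which the respondent rejects wife beating.

    agrees[i] is the survey code for scenario i: 1 = yes (justified), 0 = no.
    Each 'no' earns one point, so five scenarios give a maximum of 5.
    """
    return sum(1 - answer for answer in agrees)


def normalize(score: float, max_score: float) -> float:
    """Rescale a raw score to the 0-1 interval (assumed: score / maximum)."""
    return score / max_score


# Hypothetical respondent who accepts two of the five justifications.
raw = self_esteem_index([0, 0, 1, 0, 1])
print(raw, normalize(raw, 5))  # 3 points out of 5, i.e. 0.6 or 60%
```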
The cumulative empowerment index
While conducting DHS, the respondents were not asked to assign weights to
different indicators of women’s empowerment. Therefore, we assumed that all the
22 A. Musonera and A. Heshmati
indicators had the same weight and then computed the cumulative empowerment
index using a nonparametric method as indicated by:
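For illustration only, one equal-weight construction consistent with this description is the simple mean of the normalized indicator scores. This is an assumption made for the sketch, not a reproduction of the authors' formula, and the function name and example values are hypothetical.

```python
def equal_weight_index(normalized_scores: list[float]) -> float:
    """Average indicator scores that are already on a 0-1 scale.

    Assumption: every indicator carries the same weight, so the composite
    is the arithmetic mean of the normalized scores.
    """
    if not normalized_scores:
        raise ValueError("at least one indicator score is required")
    return sum(normalized_scores) / len(normalized_scores)


# Hypothetical respondent: decision-making 0.4, self-esteem 0.6.
print(equal_weight_index([0.4, 0.6]))
```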
Table 2.1 depicts the relationship between women’s empowerment and its
covariates. Column 1 gives the association between the cumulative empowerment
index and its covariates. It is apparent from this column that there is a
significant positive correlation between women’s empowerment and some of its
covariates such as age, number of children ever born, education, employment for
cash, exposure to media and wealth. Younger women in their twenties are less
likely to be empowered (0.0274) than older women (0.0339). Women with more
children (five and above) are more likely to be empowered (0.160) than women
with fewer children (one or two), whose coefficient is only 0.114. The results
also indicate that women with higher education are more empowered (0.171) than
those with primary education (0.030). Similarly, employment for cash and media
exposure are positively associated with the cumulative empowerment index (see
Table 2.1, column 1). Women in wealthier families are more likely to be
empowered (0.0525) than those from poor families (0.0190).
The same direction of association is observed for the
decision-making index (see Table 2.1, column 2). These results match those
observed in previous studies. Women’s empowerment was found to be positively
associated with education levels, age, household wealth (income), and employment
status (such as in Sultana and Hossen 2013). Likewise, Khan and Noreen (2012)
found that women’s empowerment was mainly determined by age, husband’s
education, assets inherited from the father, number of children alive, and the
amount of microfinance.
On the contrary, living in a rural area and getting married at a younger age were
found to be negatively associated with both the cumulative empowerment and
decision-making indices. Moreover, the results reveal a significant positive
association between self-esteem and variables such as education, wealth, and age of the
respondent (see Table 2.1, column 3). Women with higher education had higher
levels of self-esteem (0.268) than those with primary education (0.0527). Women
from wealthier families had higher self-esteem (0.080) than those from poor
2 Measuring Women’s Empowerment in Rwanda 25
families (0.020). However, residence (rural) and age at first marriage were found to
be negatively associated with self-esteem (see Table 2.1, column 3).
These results are in agreement with those obtained by Kishor and Subaiya
(2008) who found that women in urban areas were more likely to reject wife beating
as compared to women in rural areas and younger age at first marriage was
associated with a high likelihood of accepting justifications for wife beating.
Tables 2.2 and 2.3 present odds ratios (using a multinomial logistic regression) for
respondents’ decision-making (jointly and alone) on five household decisions—
what to do with a respondent’s earnings, respondent’s health care, large household
purchases, visits to family or relatives, and what to do with the money that the
husband earns. Women in their twenties had high odds in favor of taking decisions
alone on all the five aspects as compared to older women. Table 2.2 shows that
women with more children (five and above) were more likely to take the five
Table 2.2 Odds ratios (using a multinomial logistic regression) for household decision-making (alone)

                      What to do with   Respondent’s      Large household   Visits to family  What to do with
                      respondent’s      health care       purchases         and relatives     husband’s earnings
                      earnings
Age groups
  15–19
  20–29               1.902***          1.919***          2.039***          2.117***          1.985***
                      (9.81)            (12.42)           (13.56)           (14.20)           (13.58)
  30–39               1.741***          1.805***          2.020***          2.039***          1.850***
                      (8.60)            (10.94)           (12.65)           (12.79)           (11.87)
  40–49               1.304***          1.480***          1.678***          1.482***          1.376***
                      (5.96)            (8.08)            (9.53)            (8.33)            (7.98)
Children categories
  None
  1 or 2              2.318***          2.405***          2.354***          2.524***          2.412***
                      (23.88)           (28.28)           (29.94)           (32.17)           (30.50)
  3 or 4              2.588***          2.822***          2.556***          2.839***          2.628***
                      (24.24)           (29.22)           (28.69)           (31.30)           (29.30)
  5 and above         2.806***          3.305***          2.979***          3.494***          3.101***
                      (23.68)           (30.22)           (29.49)           (33.28)           (30.47)
Education
  No education
  Primary             0.0980            0.208***          0.133*            0.151*            0.158**
                      (1.51)            (3.33)            (2.25)            (2.45)            (2.70)
  Secondary           0.149             0.0228            0.0248            0.0289            0.0267
                      (1.45)            (0.23)            (0.27)            (0.30)            (0.29)
  Higher              0.904***          0.516**           0.733***          0.707***          0.613***
                      (4.56)            (2.60)            (4.00)            (3.73)            (3.38)
Employment for cash
  No paid work
  Paid work           1.186***          −0.0383           0.127*            0.190***          0.0558
Exposure to media
  No media exposure
  Low media exposure  0.359***          0.365***          0.430***          0.463***          0.471***
                      (6.33)            (6.75)            (8.41)            (8.71)            (9.26)
  High media exposure 0.344*            0.442**           0.273             0.362*            0.464**
                      (2.20)            (2.89)            (1.89)            (2.43)            (3.25)
(continued)
household decisions alone as compared to women with fewer children. The results
also show that women with higher education had higher chances of taking decisions
alone compared to those with primary education. Media exposure was found to
increase a respondent’s likelihood of taking decisions alone for all five
questions. Likewise, women from wealthier families had higher odds of taking
decisions alone than those from poor families. Surprisingly, women with a low
age at first marriage (18–24) were found to be more likely to take decisions
alone than those with a higher age at first marriage. Employment for cash,
however, influenced only some of the decisions taken alone, while residence had
no influence on decision-making alone.
As shown in Table 2.3, the odds of joint decision-making for four of the five
questions were high among younger women as compared to older women.
Table 2.3 Odds ratios (using a multinomial logistic regression) for household decision-making (jointly)

                      What to do with   Respondent’s      Large household   Visits to family  What to do with
                      respondent’s      health care       purchases         or relatives      husband’s earnings
                      earnings
Age groups
  15–19
  20–29               2.402***          2.399***          2.019***          2.090***          2.193**
                      (4.03)            (5.66)            (3.31)            (5.25)            (2.95)
  30–39               2.613***          2.585***          2.576***          2.184***          2.461**
                      (4.33)            (6.01)            (4.16)            (5.36)            (3.24)
  40–49               2.296***          2.623***          2.568***          2.059***          2.217**
                      (3.74)            (5.95)            (4.05)            (4.88)            (2.84)
Children categories
  None
  1 or 2              2.460***          2.706***          2.162***          2.659***          2.269***
                      (10.94)           (15.54)           (7.82)            (13.56)           (6.91)
  3 or 4              2.845***          3.280***          2.610***          3.253***          2.535***
                      (12.13)           (17.92)           (9.11)            (15.83)           (7.28)
  5 and above         3.171***          3.588***          2.949***          3.824***          3.113***
                      (12.92)           (18.44)           (9.87)            (17.52)           (8.54)
Education
  No education
  Primary             0.0189            0.175*            0.177             0.0860            0.0294
                      (0.19)            (2.06)            (1.47)            (0.94)            (0.19)
  Secondary           0.0915            0.465***          0.368             0.189             −0.005
                      (0.57)            (3.49)            (1.79)            (1.21)            (−0.02)
  Higher              0.611*            1.155***          0.799             0.710*            0.137
                      (2.04)            (4.59)            (1.93)            (2.13)            (0.27)
Employment for cash
  No paid work
  Paid work           1.298***          0.553***          0.531***          0.513***          −0.000
                      (10.54)           (6.11)            (3.86)            (5.20)            (−0.00)
Exposure to media
  No media exposure
  Low media exposure  −0.0250           0.241**           −0.0912           0.277***          0.093
                      (−0.28)           (3.21)            (−0.85)           (3.38)            (0.66)
  High media exposure −0.0962           0.209             −0.00921          0.0677            0.477
                      (−0.38)           (0.99)            (−0.03)           (0.26)            (1.28)
(continued)
Surprisingly, older women were more likely to take a decision jointly on their
health care as compared to younger women. Joint decision-making was found to be
an increasing function of the number of children that a woman had. Employment
for cash increased the odds of joint decision-making on four of the five
household decisions; the exception was what to do with the husband’s earnings
(see Table 2.3). However, variables such as education, wealth, media exposure,
and residence influenced only a few of the decisions. For example, residence
(rural areas) reduced a respondent’s likelihood of jointly deciding what to do
with her earnings and about large household purchases.
Table 2.4 Odds ratios (using a multinomial logistic regression): justifications for physically abusing a wife

                      Beating justified Beating justified Beating justified Beating justified Beating justified
                      if she goes out   if she neglects   if wife argues    if wife refuses   if wife burns
                      without telling   children          with her husband  to have sex with  the food
                      her husband                                           her husband
Age group
  15–19
  20–29               0.130*            −0.056            0.003             −0.026            0.007
                      (2.18)            (−0.98)           (0.05)            (−0.44)           (0.10)
  30–39               −0.0866           −0.276***         −0.185*           −0.0731           −0.159
                      (−1.09)           (−3.59)           (−2.25)           (−0.91)           (−1.61)
  40–49               −0.229*           −0.320**          −0.299**          −0.115            −0.310*
                      (−2.14)           (−3.10)           (−2.72)           (−1.07)           (−2.34)
Children categories
  None
  1 or 2              −0.040            −0.053            0.092             0.037             0.051
                      (−0.70)           (−0.96)           (1.54)            (0.64)            (0.71)
  3 or 4              0.0267            0.0606            0.178*            0.0465            0.124
                      (0.38)            (0.89)            (2.48)            (0.66)            (1.44)
  5 and above         0.0799            0.0889            0.203*            0.0501            0.127
                      (0.97)            (1.11)            (2.41)            (0.61)            (1.25)
Education
  No education
  Primary             −0.225***         −0.230***         −0.315***         −0.320***         −0.367***
                      (−4.24)           (−4.39)           (−5.88)           (−6.02)           (−6.00)
  Secondary           −1.090***         −1.021***         −1.215***         −1.257***         −1.295***
                      (−13.35)          (−13.34)          (−14.32)          (−15.23)          (−12.03)
  Higher              −2.814***         −2.566***         −2.695***         −2.384***         −3.006***
                      (−7.62)           (−8.64)           (−7.28)           (−7.99)           (−5.09)
Employment for cash
  No paid work
  Paid work           0.0412            −0.0121           −0.115**          −0.0948*          −0.178***
                      (0.95)            (−0.29)           (−2.59)           (−2.17)           (−3.39)
Exposure to media
  No media exposure
  Low media exposure  −0.111**          −0.068            0.024             −0.0946*          0.018
                      (−2.61)           (−1.66)           (0.57)            (−2.20)           (0.36)
  High media exposure −0.0934           0.106             0.0754            0.0459            0.206
                      (−0.73)           (0.90)            (0.57)            (0.36)            (1.29)
(continued)
Table 2.4 reports the odds ratios for respondents’ attitudes toward justifications
for wife beating. Women with higher education were less likely to agree with wife
beating (for all five reasons) than those with primary education. Women from
wealthier families were less likely to agree with wife beating for all five reasons
than those from poor families. Residing in rural areas was found to increase the
odds of agreeing with wife beating for all five reasons. However, variables such as
age, children ever born, media exposure, and paid work influenced only some of the
reasons. Contrary to our expectations, age at first marriage had no influence on
attitudes toward wife beating.
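Several entries in Tables 2.2–2.4 are negative, which a true odds ratio cannot be. If those entries are read as log-odds coefficients (an assumption made here for illustration, not something the tables state), exponentiating a coefficient gives the implied odds ratio. A minimal sketch using two education entries from the first column of Table 2.4:

```python
import math


def odds_ratio(log_odds_coefficient: float) -> float:
    """Convert a logit coefficient (log-odds) into an odds ratio."""
    return math.exp(log_odds_coefficient)


# Coefficients for agreeing that beating is justified if the wife goes out
# without telling her husband, relative to women with no education.
higher_education = odds_ratio(-2.814)   # about 0.06
primary_education = odds_ratio(-0.225)  # about 0.80

print(f"higher: {higher_education:.2f}, primary: {primary_education:.2f}")
```

Under this reading, the odds of agreeing are roughly 94% lower for women with higher education and roughly 20% lower for women with primary education than for women with no education.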
et al. (2012). Employment for cash had a positive association with both the
cumulative empowerment index (0.0202) and the decision-making index (0.0332).
However, employment for cash had no association with the self-esteem index.
Regular media exposure was positively associated with both the cumulative
empowerment and decision-making indices. This can be attributed to the fact that
the media exposes women to a world outside their homes, including new ideas and
non-traditional roles for them. These results are consistent with Mahmud et al.’s
(2012) findings. Contrary to our expectations, no relationship was found between
media exposure and self-esteem. Residence (rural area) was
negatively associated with the cumulative empowerment and self-esteem indices,
but it was unrelated to the household decision-making index (see Table 2.1).
Age at first marriage had a significant negative relationship with the cumulative
empowerment and decision-making indices (see Table 2.1). One possible
explanation for this is that an early age at first marriage limits the access that a woman
has to education. She also has less time for her development and maturity without
the interference of marriage and the responsibilities of raising children. Moreover,
being young she is less likely to be accorded much power and independence in her
parents’ home. These findings are similar to those by Kishor and Subaiya (2008).
However, unlike them, our study did not find any association between self-esteem
and age at first marriage.
Wealth was found to be positively associated with the cumulative empowerment
and self-esteem indices. Women from wealthier families were more empowered and
had higher self-esteem than those from poor families. However, wealth was
positively associated with household decision-making for only the rich but was
unrelated to the poorest, poorer, and middle-income families (see Table 2.1).
Younger women (20–29) were less likely to take decisions alone and jointly as
compared to those in the 30–39 years age bracket, but women in the 40–49 years
age group were less likely to take four or five decisions alone and jointly as
compared to women in their twenties (see Table 2.1). Surprisingly, older women
were more likely to take decisions jointly about their health care than younger
women (see Tables 2.2 and 2.3). These results are in line with those of previous
studies such as those by Mahmud et al. (2012), whose findings revealed that young
and older women had lower decision-making powers while women in their
mid-twenties had high decision-making powers. This phenomenon can be
explained by the fact that there are chances that young women live in extended
families and old women are no longer involved in decision-making as most of them
rely on their adult sons.
Decision-making alone and jointly increased with the number of children for all
five decisions (see Tables 2.2 and 2.3). These results further support the findings
of Kishor and Subaiya (2008), who state that the proportion of women who take
decisions alone or jointly increases with the number of children.
As a potential source of empowerment, education was positively associated with
household decision-making, notably with decision-making alone. The odds of
women’s participation in decision-making increased with the level of education but
with variations in terms of type of participation and decisions. The results show that
households were less likely to have a say in household decision-making and tended
to regard their own voices as carrying little weight, although they were highly
likely to have access to cash to spend (Mahmud et al. 2012).
Older women were found to be less likely to agree with four of the five
justifications for wife beating. Education was negatively associated with agreeing with
justifications for wife beating for all five reasons (see Table 2.4). Women with
higher education were less likely to agree with wife beating for any of the five
reasons as compared to those with lower education levels (primary education).
These findings are in agreement with Kishor and Subaiya’s (2008) findings which
show that the higher the education level, the lower the likelihood of a woman
agreeing that wife beating was justified for any reason, and the higher the likelihood
of her agreeing with the fact that it was a woman’s right to refuse sex with her
husband.
Women with paid work were less likely to agree with justifications for wife
beating for three of the five reasons (see Table 2.4). Women with regular exposure
to the media were less likely to agree with wife beating for two of the five reasons.
Women residing in rural areas were more likely to agree with justifications for wife
beating for all the five reasons. Wealth reduced the odds in favor of saying yes to
justifications for wife beating for all the five reasons. Women from wealthier
families were less likely to agree with justifications for wife beating for all five
reasons as compared to women from poor families.
2.6 Conclusions
The most obvious finding of this study is that education, age of the respondent,
media exposure, employment for cash, and wealth had a positive relationship
with women’s empowerment. Our study also found that education, wealth, age, and
the number of children had higher explanatory power for women’s empowerment than
the other variables. Taken together, the findings suggest that women’s
empowerment can be achieved by providing women with education, labor force
participation, and media exposure, by shifting negative traditional cultural
norms, and by focusing on integrated development.
The main weakness of this study is the paucity of data on all indicators of
women’s empowerment (only data on household decision-making and attitudes
toward wife beating was available) and some of the covariates that were used in
previous studies. Another weakness is that the results might have been affected by
missing values in the data used to measure women’s empowerment. As society is
evolving fast through education, technology, urbanization, and globalization,
continuous improvement in survey structures is required; there is also a need to
collect data on women’s empowerment indicators that have not been taken into
account in previous surveys.
More studies need to be carried out on the uncovered aspects of women’s
empowerment, especially the relationship between women’s empowerment and
variables such as fertility, health care, contraceptive use, and microfinance.
Women’s autonomy and their determination to participate in the labor force, as well
as their contribution to economic growth and well-being also need to be considered.
References
Abadian S (1996) Women’s autonomy and its impact on fertility. World Dev 24(12):1793–1809
Al Riyami A, Afifi M, Mabry RM (2004) Women’s autonomy, education and employment in
Oman and their influence on contraceptive use. Reprod Health Matters 12(23):144–154
Ali DA, Deininger K, Goldstein M (2014) Environmental and gender impacts of land tenure
regularisation in Africa: pilot evidence from Rwanda. J Dev Econ 110:262–275
Allendorf K (2007) Do women’s land rights promote empowerment and child health in Nepal?
World Dev 35(11):1975–1988
Allsopp MS, Tallontire A (2014) Pathways to empowerment? Dynamics of women’s participation
in global value chains. J Cleaner Prod 107:114–121
Alsop R, Heinsohn N (2005) Measuring empowerment in practice: structuring, analysing and
framing indicators. World Bank Policy Research Working Paper 3510
Bloom SS, Wypij D, Gupta MD (2001) Dimensions of women’s autonomy and the influence of
maternal health care utilization in a north Indian city. Demography 38(1):67–78
El-Halawany HS (2009) Higher education and some upper Egyptian women’s negotiation of
self-autonomy at work and home. Res Comp Int Educ 4(4):423–436
Faridi MZ, Chaudhry IS, Anwar M (2009) The social-economic and demographic determinants of
women’s work participation in Pakistan: evidence from Bahawalpur district. MPRA Paper
22831
Fotso JC, Ezeh AC, Essendi H (2009) Maternal health in resource-poor urban settings: How does
women’s autonomy influence the utilization of obstetric care services? Reprod Health, 16 June
2009
Ganle JK, Afriyie K, Segbefia AO (2015) Microcredit: empowerment and disempowerment of
rural women in Ghana. World Dev 66:335–345
Ghuman SJ, Lee HJ, Smith HL (2004) Measurement of women’s autonomy according to women
and their husbands: results from five Asian countries. Soc Sci Res 35:1–28
Goldman MJ, Little JS (2014) Innovative grassroots NGOs and the complex processes of women’s
empowerment: an empirical investigation from northern Tanzania. World Dev 66:762–777
Golla AM, Malhotra A, Nanda P, Mehra R (2011) Understanding and measuring women’s
economic empowerment: definitions, framework and indicators. International Center for
Research on Women (ICRW)
Abstract In the 2000s, the Government of Rwanda initiated health sector reforms
aimed at increasing access to health care. Despite these reforms, there has not been
a corresponding increase in demand for health services, as only about 30% of the
sick use modern care (NISR in Preliminary results of interim demographic and
health survey 2010. NISR, Kigali, 2011). The objective of this paper was to
examine the factors influencing the demand for outpatient care in Rwanda and
to suggest appropriate measures to improve utilization of health services. The data
are from the Integrated Household Living Conditions Survey (EICV2) conducted in
2005 by the National Institute of Statistics of Rwanda (NISR). A structural model of
demand for health care is estimated to measure the demand effects of covariates.
The findings indicate that health insurance is a significant determinant of outpatient
medical care. In addition, the price of health care and household income are among
the main drivers of utilization of health care. Women are more likely to seek
outpatient health care as compared to men. Two main policy recommendations
emerge from these findings. First, the government should reduce out-of-pocket
healthcare expenditures (OOPE) through subsidies for public health facilities.
Second, the government should reduce the premiums for community-based health
insurance schemes (CBHIs) to increase coverage rates.
Keywords Outpatient · Health insurance · Endogeneity · User fees · Logit model
JEL Classification Codes I10 · I11 · I12 · I13 · D12
3.1 Introduction
The theoretical model for analyzing human capital and health and its effect on
productivity, earnings, and labor supply was first developed by Grossman (1972).
The premise of his theory was that an increase in a person’s stock of health raises
his or her productivity in both market and non-market activities. Better health
brings large productivity and wage benefits. There is evidence to
show that sickness can have adverse effects on learning, and that these impacts can
later influence economic outcomes (Bhargava et al. 2001). Better health can make
workers more productive either through fewer days off or through increased
productivity while working. Improved nutrition and reduced diseases, particularly in
early childhood, lead to improved cognitive development, enhancing the ability to
learn. Healthy children also gain more from school because they are absent for
fewer days due to ill health.
While health is determined by many factors including medical care, food,
housing conditions, and exercising, it is accepted that medical care is one of the key
determinants in the health production function (McKeown 1976). Santerre and
Neun (2010) argue that as a firm uses various inputs such as capital and labor to
manufacture a product, an individual uses healthcare inputs to produce health.
When other factors are held constant, an individual’s health status indicates the
maximum amount of health that can be generated from the quantity of medical care
consumed.
Considering the importance of medical care, both policymakers and researchers
have directed much attention to the question of how broad access to health services
can be ensured (Lindelow 2002). Early policy and research initiatives focused on
the need to improve physical access through an expansion of the network of health
facilities. This consisted of improving healthcare delivery including healthcare
professionals, equipment, and buildings. A growing literature on health care,
however, points out that supply is not sufficient and this means that providing
maximum access to health care remains a challenge for governments in many
low-income countries.
In Rwanda, access to health care was identified as an important objective for
formulating public policies since good health is recognized as a necessary condition
for enjoying economic and social opportunities. The country has developed a
healthcare system that is open to all Rwandans regardless of
socioeconomic status. For instance, in the Rwanda Economic Development and
Poverty Reduction Strategy (EDPRS, 2008), access to health care is one of the
strategies for eradicating poverty. The strategy’s objective is promoting health care
among the entire population, increasing geographical accessibility, increasing the
availability and affordability of drugs, and improving the quality of services.
Increased accessibility to health care has several benefits particularly among the
poor segments of the population (The World Bank 2001). The millennium
development goals (MDGs) also recognize health as an essential ingredient in the social
and economic progress of any country. However, despite improvements in access to
3 Determinants of Demand for Outpatient Health Care in Rwanda 43
health care through community-based health insurance schemes (CBHIs) and other
insurance providers, it is not known why healthcare utilization has remained low in
Rwanda.
To increase access to health services, the Government of Rwanda initiated a
number of health policies and other economic stimulus efforts, some of them
targeting the supply side of the market while other policies are aimed at increasing
service utilization. The policies include Vision 2020, the Economic Development
and Poverty Reduction Strategy (EDPRS) 2008–2012, One-Cow-One-Family, the
Social Security Policy 2009, and the Health Policy 2004 (Ministry of Health 2009).
These policies are meant to increase access to health services and hence improve the
health status of the population. The reforms are also meant to strengthen the
healthcare system and make it more accessible. Despite these reforms, fewer than two
out of five sick people sought formal health care in Rwanda (NISR 2011). The
ineffectiveness of previous policies aimed at increasing healthcare utilization is due
to their implementation without adequate evidence about the factors influencing
health service utilization in Rwanda. The aim of this study is to examine the factors
that influence demand for outpatient healthcare services in Rwanda.
Although economic theory offers potential factors that influence demand for
health care, there is lack of a quantitative assessment of their effects in Rwanda.
Evidence on these factors is needed for implementing policies designed to improve
health service utilization in the country. To our knowledge, no studies have been
done in Rwanda in recent years to determine the factors influencing healthcare
demand. The only available evidence on this is from studies by Jayaraman et al.
(2008) and Shimeles (2010), which focus on maternal health care and on effects of
CBHIs at the district level. In countries in which estimates of demand for health
care exist, research results provide conflicting evidence of the demand effects of
price, income, and insurance suggesting that more studies are needed.
Most studies on demand for health care do not address the problems of
endogeneity (reverse causality) and heterogeneity (variations in the estimated effect size
due to unobservables). Failure to address these problems leads to biased estimates
(Kabubo-Mariara et al. 2009; Lawson 2004; Rosenzweig and Schultz 1982).
McCool et al. (1994) point out that differences in data, model specifications, and/or
empirical methods can contribute to diversity in demand estimates and hinder
clarity in healthcare financing policies. Our paper addresses these estimation
problems by providing rigorous evidence on outpatient healthcare demand
determinants in Rwanda that policymakers can use for improving health service
utilization across all the regions.
Healthcare services are demanded as an input into the production of health that is
part of an individual’s utility function together with other goods. Empirically, an
analysis of health services examines their determinants based on the microeconomic
44 C.M. Ruhara and U.M. Kioko
3.3 Methodology
Following Grossman (1972), individuals maximize their utility over health and
other goods subject to market and non-market factors. Health is one of the several
commodities over which individuals have well-defined preferences. Market factors
include availability of health inputs and their prices, insurance, and household
incomes. Non-market factors include household characteristics, location or
distance, and individual characteristics such as age, education, health status, and the
perceptions that they have about the quality of health services (Ajakaiye and
Mwabu 2007; Appleton and Song 1999; Bategeka et al. 2009). Assuming that
health care is a consumption good, a consumer’s problem can be expressed as
\[
H = H(Z, I, S, C, A, h_s, P_h, NO) \tag{3.3}
\]
Given $H = H(Z, I, S, C, A, h_s, P_h, NO)$
Solving the maximization problem yields a demand function for health care
specified as
\[
D_h = f(I, B, A, S, C, h_s, P_h, NO) \tag{3.5}
\]
where $D_h$ refers to the demand for outpatient care; $I$ is health insurance; $B$ is
the budget or income; $A$ is household assets; $S$ represents socio-demographic
variables; $C$ represents community characteristics, including distance to the
health facility; $h_s$ is the household composition; $P_h$ is the price of health
care; and $NO$ are household non-observable characteristics.
Equation 3.5 is a structural outpatient healthcare demand equation that includes
an endogenous variable among the independent variables. The endogenous variable
is health insurance because of reverse causality between demand for health care and
insurance while exogenous variables include the monetary price for health care,
income, age, gender, educational attainment of the individual, household size,
location, and region. In our study, the demand for outpatient care is discrete rather
than continuous because patients seek or do not seek health care. In Eqs. 3.1 and
3.2, a health investment good is purchased only for the purpose of improving health
so that it enters an individual’s utility function only through H.
In the demand for outpatient model, insurance is assumed to improve access to
health services. In addition, the heterogeneity of health insurance due to a nonlinear
interaction of demand for health services with unobservable and omitted variables
could bias the estimates. Our study assumes that demand for health services has
only one endogenous variable and demand for outpatient refers to any curative
outpatient service provided by a physician or any other medical staff. Given the
dichotomous nature of outpatient care, the estimation adopts a binary discrete
model, where health care is either sought or not. Assuming that the errors are
distributed logistically, we adopt a logit regression method to estimate both
outpatient and inpatient healthcare demands. The dependent variable takes one of
two values: 1 if an individual uses outpatient health care and 0 if the individual
does not use any health services. The logit regression is also preferred
because most of the studies on demand for health services use a logit regression
(see Hahn 1994; Lépine and Nestour 2008). This relationship can be expressed as
\[
Y_i =
\begin{cases}
1 & \text{if the event takes place (the individual seeks outpatient service)} \\
0 & \text{if the event has not taken place (the individual has not sought treatment)}
\end{cases}
\]
Equation 3.5 expressing the demand for health care can be rewritten as
\[
y_i^{*} = x_i'\beta + \varepsilon_i
\]
where $y_i^{*}$ is a latent variable showing the probability that medical care is sought or
not sought, $x_i'$ is a vector of characteristics related to the individual, household, and
community, and $\varepsilon_i$ is the error term.
48 C.M. Ruhara and U.M. Kioko
The values zero and 1 are used because they allow the definition of probability
of occurrence of an event as the mathematical expectation of the variable Y. This
can be expressed as
$$
\operatorname{logit}(p_i) = \beta_0 + \sum_{j=1}^{3} \beta_j X_{ji} = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i \tag{3.10}
$$
where
$Y_i$ is an indicator of the choice of modern health care (outpatient) by the $i$th
household member,
$X_{1i}$ is a vector of characteristics related to the individual, such as age, education, and
sex,
$X_{2i}$ is a vector of characteristics related to the household, such as income and insurance,
and
$X_{3i}$ is a vector of community-level characteristics, such as the presence of a
medical specialist and the distance from the household to the health facility.
If in Eq. 3.10 $\beta_j > 0$, then an increase in $X_{ji}$ (for instance, household income),
with all other exogenous variables unchanged, will increase the log-odds
of individual $i$ seeking health services. If $\beta_j < 0$, then an increase in $X_{ji}$ (for
example, the user fee) will reduce the log-odds. If $\beta_j = 0$, then the variable has no
effect.
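The sign rule above can be illustrated with a short sketch. The baseline log-odds of -1.0 and the coefficient of 0.5 are hypothetical values chosen only for illustration, and the function names are not from the original study.

```python
import math

def odds_ratio(beta: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in a regressor."""
    return math.exp(beta)

def probability(log_odds: float) -> float:
    """Inverse logit: map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical values (assumptions, not estimates from the study).
baseline_log_odds = -1.0
beta = 0.5

p0 = probability(baseline_log_odds)          # probability before the change
p1 = probability(baseline_log_odds + beta)   # probability after a one-unit increase

print("odds ratio:", round(odds_ratio(beta), 3))
print("probability before and after:", round(p0, 3), round(p1, 3))
```

A positive coefficient gives an odds ratio above 1 and a higher probability of seeking care, a negative one gives an odds ratio below 1, and a zero coefficient leaves the odds unchanged.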
3 Determinants of Demand for Outpatient Health Care in Rwanda 49
However, in the case of Eq. 3.10, the $\beta$s indicate changes in the logistic index, with
the sign of each $\beta$ indicating the direction of the eventual change in the probability of
seeking care from a given health facility. Equation 3.10 is the structural form of the
probabilistic healthcare demand function. In this equation, as in the recent literature,
one of the independent variables, health insurance, is endogenous, and the
estimation has to address this problem. Endogeneity is due to reverse causality between
health insurance and demand for health care. To obtain unbiased and
consistent estimates, the endogenous variable must therefore be instrumented. The
instrumental variable should be correlated with the endogenous regressor but
not directly related to the dependent variable (Ajakaiye and Mwabu 2007).
Because health insurance in Eq. 3.10 is endogenous, estimating the equation
without accounting for this may suffer from simultaneity, owing to the
possibility of reverse causality between demand for health care and health
insurance. Endogeneity of health insurance
arises because the decision to purchase health insurance and the utilization of health
services are intertwined. First, since insurance reduces the effective price of medical
care, insured people tend to consume more health services (Rashad and Markowitz
2009). Second, even if individuals cannot perfectly predict their future health needs,
they are likely to have information about their health status which could lead them
to anticipate higher use of health services and then decide to buy health insurance.
Thus, healthcare utilization not only depends on an individual’s health insurance
coverage, but the level of coverage may also be influenced by the anticipated
utilization of health services (Jütting 2004). Manning et al. (1987) argue that
treating insurance as exogenous in demand for healthcare models produces biased
results because people who anticipate consuming more health services have an
obvious incentive to obtain insurance cover, whether by selecting a more generous
option at the place of employment, by working for an employer with a generous
insurance plan, or by purchasing generous coverage privately.
The existing literature suggests useful methods for dealing with the endogeneity
problem. A common approach is the
two-stage residual inclusion (2SRI) regression method, which is appropriate for
nonlinear models. The procedure is used to address problems relating to
measurement error, simultaneity, and omitted variables. This method requires
identifying an observable variable or instrument that is correlated with the endogenous
variable but uncorrelated with the error term (Ajakaiye and Mwabu 2007; Kioko
2008; Rosenzweig and Schultz 1982; Strauss and Thomas 1995; Wooldridge 2002).
The difficulty, however, is identifying an observable variable, $z_i$, that satisfies two
conditions. First, the selected variable must be uncorrelated with the error term. This
means that $\mathrm{cov}(z_i, \varepsilon) = 0$, that is, $z_i$ is exogenous in the
structural equation (see Wooldridge 2002; Behrman and Deolalikar 1988; Griliches and
Mairesse 1998; Ackerberg and Caves 2003). The second requirement involves the
relationship between the identified instrument, $z_i$, and health insurance.
The identified variable should have an impact on health
insurance, that is, $z_i$ must be relevant. Checking this requires regressing health insurance
on all the exogenous variables, including the instrument (Greene 2007; Jowett
et al. 2004; Wooldridge 2002). In this first-stage regression, the instruments should have
significant coefficients when the choice variable (health insurance) is regressed on the
identifying variables together with all other exogenous variables (Ackerberg and Caves 2003).
In the first stage, we estimated the reduced form of health insurance on all
exogenous variables, including the instrumental variables. In the second stage, we
regressed demand for health care on all independent variables plus insurance and
the insurance residuals obtained from the first-stage regression (Palmer et al. 2008;
Terza et al. 2008).
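The two stages described above can be sketched as follows. This is a minimal illustration on simulated data, not the authors' estimation code: the data-generating values, the variable names, and the helper functions `solve`, `ols`, and `logit_fit` are all assumptions made for the example, and the first stage is fit here as a linear projection by ordinary least squares. Standard errors are omitted; in applied work they would need to account for the estimated residual, for instance by bootstrapping both stages.

```python
import math
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def sigmoid(v):
    v = max(-30.0, min(30.0, v))  # clip the index to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-v))

def logit_fit(X, y, iterations=25):
    """Logit maximum likelihood by Newton-Raphson, starting from zero."""
    k = len(X[0])
    b = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for r, yi in zip(X, y):
            p = sigmoid(sum(bi * xi for bi, xi in zip(b, r)))
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * r[i] * r[j]
        b = [bi + si for bi, si in zip(b, solve(info, grad))]
    return b

# Simulated data (purely illustrative; not the EICV2 survey).
random.seed(0)
n = 2000
z, x, ins, d = [], [], [], []
for _ in range(n):
    zi = 1.0 if random.random() < 0.5 else 0.0       # instrument, e.g. employment status
    xi = random.gauss(0.0, 1.0)                      # exogenous covariate
    ui = random.gauss(0.0, 1.0)                      # unobserved health need
    ii = 1.0 if -0.5 + 1.0 * zi + ui > 0 else 0.0    # insurance depends on z and u
    di = 1.0 if random.random() < sigmoid(-1.0 + 0.5 * ii + 0.3 * xi + ui) else 0.0
    z.append(zi)
    x.append(xi)
    ins.append(ii)
    d.append(di)

# Stage 1: project insurance on the exogenous covariate and the instrument.
X1 = [[1.0, zi, xi] for zi, xi in zip(z, x)]
g = ols(X1, ins)
resid = [ii - sum(gj * v for gj, v in zip(g, row)) for ii, row in zip(ins, X1)]

# Stage 2: logit of demand on insurance, the covariate, and the first-stage residual.
X2 = [[1.0, ii, xi, ri] for ii, xi, ri in zip(ins, x, resid)]
b_2sri = logit_fit(X2, d)
b_naive = logit_fit([[1.0, ii, xi] for ii, xi in zip(ins, x)], d)

print("naive insurance coefficient:", round(b_naive[1], 3))
print("2SRI insurance coefficient: ", round(b_2sri[1], 3))
print("coefficient on the residual:", round(b_2sri[3], 3))
```

The instrument enters only the first stage, and the second stage keeps the observed insurance variable while adding the residual as an extra regressor, which is what distinguishes residual inclusion from replacing insurance with its fitted value.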
Following Ajakaiye and Mwabu (2007) and Kabubo-Mariara et al. (2009), we
can reformulate the demand for health services in the form of simultaneous
equations as
$$
D = \delta_d Z_1 + \beta_j I_j + \varepsilon_{ij}, \quad j = 1, \ldots, 2 \tag{3.11}
$$
$$
I = \delta_j Z + \varepsilon_2 \tag{3.12}
$$
where $D$ and $I$ are demand for health care and health insurance, respectively. $Z$ is a
vector of independent variables consisting of the $Z_1$ covariates that belong to the
demand for health services function and a vector of instrumental variables that
affect insurance but have no direct impact on demand for health services. $\delta$ and $\beta$
are parameters to be estimated and $\varepsilon$ is a disturbance term. Equation 3.11 is the
structural equation to be estimated while Eq. 3.12 is the linear projection of the
potentially endogenous variable $I$ on all the exogenous variables. The system of
equations assumes that there is only one endogenous regressor in the demand
equation.
A major challenge of the instrumental variable approach is obtaining a valid
instrument for identifying the effect of endogenous variables in a structural model.
Once a potential instrument is identified, it is important to test its suitability by
assessing whether it has three properties: relevance, strength, and exogeneity
(Kabubo-Mariara et al. 2009; Stock 2010). An instrument satisfying all
three properties is said to be strong and valid. Following Meer and
Harvey (2004), and after testing for validity and strength, employment
status and community health association membership were used as instruments
for insurance.
We tested for the endogeneity of insurance and the validity of the instruments. First,
we carried out the test for endogeneity of health insurance. If insurance were
exogenous, there would be no justification for estimating the structural model of
demand for health care because the simple logit models would yield unbiased estimates. We
used the Durbin–Wu–Hausman test, and the statistic was significant at the 10% level.
We also conducted the Wald test of exogeneity of the insurance variable, which
was significant at the 1% level. We therefore rejected the null
hypothesis that insurance is exogenous. Second, the coefficients on the insurance
residuals were also significant at the 1% level in the demand for medical care
equations. Third, we tested the impact of the instruments on the dependent variable.
These were found to be insignificant. Fourth, the strength of the instruments was
tested by considering their impact on the endogenous variable. As
the coefficients on the instruments were large and significant at the 1% level, the
instruments were judged to be strong. In addition, we conducted an F-test of the role of
the instruments in explaining the endogenous variable. While an F-statistic of at least 10 is
recommended (Kioko 2008; Staiger and Stock 1997), the minimum eigenvalue
statistic for the F-test was 133.04, so the null hypothesis of weak
instruments was rejected.
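The rule of thumb cited above can be made concrete with a small sketch. It uses the textbook F-statistic for the joint significance of excluded instruments, computed from restricted and unrestricted residual sums of squares; the sums of squares below are hypothetical numbers chosen so that the arithmetic is easy to follow, and the function names are assumptions. Only the value 133.04 comes from the text.

```python
def first_stage_f(ssr_restricted, ssr_unrestricted, q, n, k):
    """F-statistic for the joint significance of q excluded instruments.

    ssr_restricted:   residual sum of squares of the first stage without the instruments
    ssr_unrestricted: residual sum of squares of the first stage with the instruments
    n: number of observations; k: number of parameters in the unrestricted model
    """
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k))

def is_strong(f_stat, threshold=10.0):
    """Rule of thumb cited in the text: an F-statistic of at least 10."""
    return f_stat >= threshold

# Hypothetical sums of squares, used only to illustrate the calculation.
f_example = first_stage_f(ssr_restricted=110.0, ssr_unrestricted=100.0, q=1, n=102, k=2)
print("example F-statistic:", f_example, "strong:", is_strong(f_example))

# Statistic reported in the text.
print("reported statistic passes the rule of thumb:", is_strong(133.04))
```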
A second estimation issue is heterogeneity bias, which arises when unobserved
factors interact with the variable of interest and thus bias the results.
These include unobservable preferences and health endowments of individuals
that influence their demand for health care (Kabubo-Mariara et al. 2009; Schultz
2008). Even with valid instruments, in practice it is not easy to separate the impact
of endogenous variables from the effect of unobservables in a structural model.
Failure to take heterogeneity into account could lead to unreliable estimates.
In our study, heterogeneity may arise from at least three sources. First, a risk
reduction effect, where the preferred level of utilization is greater under the
financial certainty created by insurance than under uncertainty (Meza
1983). Second, an access effect, where insurance may extend an individual’s
opportunity set by giving access to health care that would otherwise not be
available. Nyman (2005) has argued that the pooling effect of insurance provides access
to expensive medical technologies that would otherwise not be affordable. Third, an
income transfer effect, where insurance creates an ex-post transfer of income from
the healthy to the ill, which may increase utilization through an income effect on
the demand for medical care (Nyman 2005). All three sources are channels through
which health insurance may affect demand for health services for reasons known to
the individual but not to the researcher.
To handle the problem of heterogeneity, we used the control function approach
(CFA) (Florens et al. 2008). This involved estimating a reduced form for insurance
and obtaining its residual ($I^*$), which is identical to the residual included
in 2SRI using an instrument for insurance. Assuming that the unobserved
component was linear in the insurance residual ($I^*$), we introduced the interaction
of insurance and its residual ($I I^*$) as a second control variable to eliminate
endogeneity bias even where the reduced-form insurance equation was
heteroscedastic (Card 2001).
Introducing the control function variables (insurance residual and interaction)
gives
$$
D = \beta_0 + \delta_d Z_1 + \tau I^* + \gamma (I I^*) + \varepsilon_1 \tag{3.13}
$$
where $I^*$ are the fitted residuals from the reduced form of the insurance variable,
which is explained by $Z_1$; all other variables are as defined earlier. $\tau I^*$ captures the
nonlinear indirect effects of insurance ($I$) on demand for health services ($D$),
because the fitted residuals serve as a control for unobservable variables which are
correlated with insurance. Inclusion of both $I^*$ and the interaction term $I I^*$ controls
for unobservable factors and therefore purges the coefficients of their
effects (Ajakaiye and Mwabu 2007; Card 2001). If any
unobservable variable is linear in $I^*$, only the intercept in Eq. 3.13 is
affected by inclusion of the unobservable variable, and therefore the 2SRI estimates
are efficient without the interaction term ($I I^*$). The 2SRI estimates will be unbiased
and consistent if at least one of two conditions holds: first, the expected value of
the interaction between insurance and its fitted residuals is zero; second, the
expectation of the interaction between insurance and the fitted residuals is linear.
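As a small illustration of how the control function regressors are assembled, the sketch below builds one row of the design matrix. The function name and the numeric values are hypothetical. The row also carries the insurance indicator itself, on the assumption that the control function column of Table 3.1 (which reports coefficients for insurance, its residual, and their interaction) reflects the regressors actually used.

```python
def control_function_row(z1, insurance, residual):
    """Build one row of regressors: constant, Z1 covariates, I, I*, and I x I*."""
    return [1.0, *z1, insurance, residual, insurance * residual]

# Hypothetical insured individual: two covariates and a first-stage residual of 0.4.
row_insured = control_function_row([0.2, 1.0], insurance=1.0, residual=0.4)

# Hypothetical uninsured individual: the interaction term vanishes.
row_uninsured = control_function_row([0.2, 1.0], insurance=0.0, residual=-0.3)

print(row_insured)
print(row_uninsured)
```

For a binary insurance indicator the interaction equals the residual for the insured and zero for the uninsured, so the interaction coefficient lets the effect of the unobservables differ between the two groups.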
The data used in this paper are drawn from the Integrated Household Living
Conditions Survey (EICV2) conducted in 2005 by the National Institute of
Statistics of Rwanda (NISR). This nationally representative survey collected data
from 7,620 households and 34,819 individuals, at both the household
and individual levels. EICV2 aimed to enable the government to assess the impact
of the policies and programs it had implemented to improve
the living conditions of the population in general.
The survey covered all 30 districts in Rwanda and collected data on a wide
spectrum of socioeconomic indicators (labor, housing, health, agriculture, debt,
livestock, and expenditure and consumption) in different areas, regions, and locations
in the country. Household-level information included consumption expenditures on
health and OOPE (consultation, laboratory tests, hospitalization, and medication
costs). Individual-level information included socioeconomic indicators and
insurance status. There were also a number of community variables such as distance to
the nearest health facility. To improve the reliability of the data, the recall period for
the use of health services was two weeks prior to the survey. In this paper, demand
for healthcare services was estimated for a single visit because the survey did not
capture multiple visits to health facilities. Hence, the demand for outpatient care was
limited to the last consultation or admission.
In Table 3.1, the Wald chi2 tests measuring goodness of fit indicate that the
estimated models give an adequate description of the data: the statistics are highly
significant, implying that each model’s parameters are jointly different from zero. The
2SRI results are reported in columns 4–5 of Table 3.1, while the first-stage
regression estimates are given in Table 3.3 in Annexure 1. Columns 6–7 of
Table 3.1 present the results for outpatient care demand after correcting for
heterogeneity of insurance. After the inclusion of the insurance residuals and the
interaction between insurance residuals and insurance, the results remain close to the
2SRI results in terms of the signs of the coefficients, although they differ in
magnitude. The significance of the coefficient on insurance residuals suggests that
insurance is endogenous to outpatient medical care. The coefficient on the
Table 3.1 Logistic demand estimates for outpatient care: Dependent variable is probability of an
outpatient visit

| Explanatory variables | Baseline estimates | z-statistics | 2SRI estimates | z-statistics | Control function estimates | z-statistics |
|---|---|---|---|---|---|---|
| Household income | 0.00030 | 3.50*** | 0.0004 | 3.60*** | 0.003 | 3.40*** |
| User fees | −1.108 | −26.74*** | −0.980 | −15.40** | −1.43 | −18.9*** |
| Quality of health care (=1) | −0.011 | −0.27 | −0.010 | −0.41 | −0.004 | −0.11 |
| Health insurance (=1) | 0.492 | 13.26*** | 0.921 | 1.87* | 4.106 | 29.29*** |
| Distance to the health facility | −0.434 | −8.00*** | −0.072 | −5.2*** | −0.239 | −4.29*** |
| Household size | −0.019 | −2.52** | 0.004 | 1.79* | −0.017 | −2.31** |
| Age | 0.013 | 2.57** | 0.056 | 1.91* | −0.0008 | −0.74 |
| Age squared | −0.001 | −2.90** | −0.0051 | −2.79** | −0.0002 | −1.80* |
| Primary (=1) | 0.006 | 1.89* | 0.021 | 3.2** | 0.018 | 2.4** |
| Secondary (=1) | 0.030 | 2.90* | 0.040 | 1.95* | 0.028 | 1.99* |
| Tertiary (=1) | 0.002 | 5.8*** | 0.008 | 4.12*** | 0.067 | 2.02** |
| Male (=1) | −0.163 | −4.44*** | −0.023 | −3.66*** | −0.148 | −3.85*** |
| Urban (=1) | −0.311 | −4.19*** | −0.340 | −5.15*** | −0.164 | −2.14** |
| Kigali region (=1) | −0.035 | −0.45 | −0.070 | −1.43 | −0.024 | −0.26 |
| Southern region (=1) | −0.066 | 1.23 | −0.204 | −2.67** | −0.063 | −1.18 |
| Western region (=1) | 0.027 | 0.53 | 0.024 | 2.40** | 0.035 | 0.68 |
| Northern region (=1) | 0.195 | 3.25*** | 0.17 | 3.54*** | 0.164 | 2.73** |
| Insurance residuals | – | – | −1.3 | −4.7*** | −2.869 | 19.05*** |
| Interaction of insurance and insurance residuals | – | – | – | – | −1.269 | −6.88*** |
| Constant | −2.644 | −24.56 | −1.789 | −5.67 | −2.411 | −25.62 |
| Number of observations | 5040 | | 5040 | | 5040 | |
| LR chi2(19) | 5880.20*** | | 5889.70*** | | 5897.44*** | |
| Log likelihood | −3020.4388 | | −3016.3138 | | −3006.2254 | |

Durbin–Wu–Hausman chi-sq = 0.054*
F(1, 5040) = 133.88
Source Researcher’s construction
Note ***, **, and * = significant at the 1, 5, and 10% levels, respectively
Alzheimer’s disease are more prevalent in women than men. In line with this,
Ahmad (2001) further adds that gender differences in healthcare utilization for
women were related to specific diseases such as cardiovascular and chronic
illnesses.
Some research has shown that women use less outpatient health care than men
because of the time they spend taking care of the elderly and other people with
disabilities. Caregivers, especially elderly women caregivers, were found to neglect
their own health in order to fulfill this responsibility (Fredman et al. 2008). These
responsibilities made it difficult for severely disadvantaged women to take steps to
improve their living situations and health behaviors, so that they consumed fewer health
services than men. Similarly, Oxaal and Cook (1998) show that constraints on
access made poor women and girls less likely to obtain appropriate
care and to seek adequate treatment. Their paper notes that the factors
limiting access for women include the socioeconomic status of the household, time
constraints, composition of households, intra-household resource allocation and
decision-making, less education and employment, legal or social constraints on
access to care, heavy work burdens, and the opportunity costs of time in seeking
care.
Given these results, a number of recommendations emerge. Since user fees are
an impediment to using health care in Rwanda, the government should reduce user
fees in health facilities through increased budget allocations for all health facilities,
particularly in the public sector, where the poor go for medical care. From 2003,
OOPE increased gradually to reach 32.2% of total health expenditure in 2010.
High OOPE has a variety of negative consequences, including household
impoverishment. Subsidies on user fees should target vulnerable groups such as children,
women, and low-income households. The government should also consider
subsidizing private health facilities to increase access to high-quality services by
low-income households. The subsidies will help reduce the effect of income
inequalities on healthcare utilization.
Health insurance is an important determinant of healthcare-seeking behavior in
Rwanda. Thus, policies that increase health insurance coverage will substantially
increase health service utilization. The 2013 health insurance coverage rate in
Rwanda was 73%, the highest in the East African Community, but the high
premiums associated with this coverage are not sustainable. The government should
subsidize health insurance to make it accessible to the most disadvantaged people.
The current premium for CBHIs of $4.5 per person per year should be
reduced. The premium more than doubled in 2011, from $1.7 to $4.5, and this
reduced the coverage rate from 91 to 73%. In addition, holding other factors
constant, healthcare expenditure represented 10% of total household
expenditure under the earlier premium, whereas under the new premium it
represents 26% of household expenditure.
This will cause households to incur catastrophic expenditures and push them into
poverty. Further, with an average household size of 6.6 persons, this
premium per individual does not seem sustainable given that 44.9% of the
population lives on less than $1 per day.
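The premium figures quoted above can be combined in a short worked calculation. The per-person premiums and the average household size come from the text; the assumption that every household member is enrolled, and the function name, are introduced here only for illustration.

```python
OLD_PREMIUM = 1.7      # USD per person per year before 2011 (from the text)
NEW_PREMIUM = 4.5      # USD per person per year from 2011 (from the text)
HOUSEHOLD_SIZE = 6.6   # average household size (from the text)

def household_premium(premium_per_person, household_size):
    """Annual premium bill, assuming every household member is enrolled."""
    return premium_per_person * household_size

increase_factor = NEW_PREMIUM / OLD_PREMIUM
old_bill = household_premium(OLD_PREMIUM, HOUSEHOLD_SIZE)
new_bill = household_premium(NEW_PREMIUM, HOUSEHOLD_SIZE)

print(f"premium increase factor: {increase_factor:.2f}")
print(f"average household bill: ${old_bill:.2f} -> ${new_bill:.2f} per year")
```

Under this assumption the average household's annual bill rises from about $11.2 to about $29.7, which is the sense in which the premium "more than doubled".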
Annexure 1
Table 3.3 Determinants of demand for health insurance, first-stage regression (demand for
outpatient care model)

| Explanatory variables | Estimates | Standard errors | z-statistics |
|---|---|---|---|
| Employment status (=1) | 0.0510 | 0.0064 | 7.90*** |
| Household income | 0.0034 | 0.0004 | 8.50*** |
| User fees | −0.0278 | 0.0231 | −1.20 |
| Quality of health care (=1) | 0.0033 | 0.0069 | 0.47 |
| Distance to the health facility | −0.0483 | 0.0108 | −4.47*** |
| Household size | −0.0132 | 0.0013 | −10.58*** |
| Age | 0.0072 | 0.0008 | 9.20*** |
| Age squared | −0.0001 | 0.00001 | −6.00*** |
| Primary (=1) | 0.0023 | 0.0045 | 5.10*** |
| Secondary (=1) | 0.0052 | 0.0085 | 0.61 |
| Tertiary (=1) | 0.0023 | 0.0087 | 0.26 |
| Male (=1) | 0.0068 | 0.0058 | 1.17 |
| Urban (=1) | 0.0847 | 0.0138 | 6.13*** |
| Kigali (=1) | −0.0385 | 0.0129 | −2.98*** |
| Southern (=1) | −0.0624 | 0.0088 | −7.04*** |
| Western (=1) | 0.0555 | 0.0087 | 6.32*** |
| Northern (=1) | 0.0582 | 0.0099 | 5.87*** |
| Constant | 0.3250 | 0.0174 | 18.62*** |

Number of observations = 5040
F(18, 27934) = 56.19***
Source Researcher’s construction
Note ***, **, and * = significant at the 1, 5, and 10% levels, respectively
References
Bategeka LO, Asekeny L, Musiime JA (2009) The determinants of birth weight in Uganda.
AERC, Nairobi
Behrman JR, Deolalikar AB (1988) Health and nutrition. In: Chenery H, Srinivasan TN
(eds) Handbook of development economics. Elsevier Science Publishers BV, North Holland
Bhargava A, Jamison D, Lau L, Murray C (2001) Modeling the effects of health on economic
growth. J Health Econ 20:423–440
Blomqvist AG, Carter RAL (1997) Is healthcare really a luxury? J Health Econ 16(2):207–229
Blunch N (2004) Maternal literacy and numeracy skills and child health in Ghana. Paper presented
at the Northeast Universities Development Consortium Conference, HEC Montreal
Buchmueller TC, Grumbach K, Kronick R, Kahn JG (2005) The effect of health insurance on
medical care utilization and implications for insurance expansion: a review of the literature.
Med Care Res Rev 62(1):3–10
Card D (2001) Estimating the return to schooling: progress on some persistent econometric
problems. Econometrica 69(5):1127–1160
Cunningham P, Kemper P (1998) Ability to obtain medical care for the uninsured: How much does
it vary across communities? JAMA 280(10):921–927
De Bethune X, Alfani S, Lahaye JP (1989) The influence of an abrupt price increase on health
service utilization: evidence from Zaire. Health Policy Plan 4(1):76–81
Diop F, Yazbeck A, Bitrán R (1995) The impact of alternative cost recovery schemes on access
and equity in Niger. Health Policy Plan 10:223–240
Elo I (1992) Utilization of maternal health-care services in Peru; the role of women’s education.
Population Studies Center, University of Pennsylvania
Florens JP, Heckman JJ, Meghir C, Vytlacil E (2008) Identification of treatment effects using control
functions in models with continuous, endogenous treatment and heterogeneous effects.
Econometrica 76(5):1191–1206
Fredman L, Cauley JA, Satterfield S (2008) Caregiving, mortality, and mobility decline: the health,
aging, and body composition (Health ABC) study. Arch Intern Med 168:2154–2162
Greene WL (2007) Econometric analysis. Macmillan Publishing Company, New York
Griliches Z, Mairesse J (1998) Production functions: the search for identification. National Bureau
of Economic Research, Working paper no. 5067
Grossman M (1972) On the concept of health capital and the demand for health. J Polit Econ
80(2):223–235
Hahn B (1994) Healthcare utilization: the effect of extending insurance to adults on medicaid or
uninsured. Med Care 32(3):227–239
Jayaraman AS, Chandrasekhar T, Gebreselassie T (2008) Factors affecting maternal healthcare
seeking in Rwanda. USAID, Working paper
Jones AM, Koolman X, Doorslaer EV (2006) The impact of having supplementary private health
insurance on the use of specialists in European countries. Annales d’Economie et de Statistique
83/84:251–275
Jowett M, Deolalikar A, Martinsson P (2004) Health insurance and treatment seeking behaviour:
evidence from a low-income country. Health Econ 13:845–857
Jütting JP (2004) Do community-based health insurance schemes improve poor people’s access to
healthcare? Evidence from rural Senegal. World Dev 32:273–288
Katz L, Kling J, Liebman J (2001) Moving to opportunity in Boston: early results of a randomized
mobility experiment. Q J Econ 116(2):607–654
Kioko MU (2008) The economic burden of malaria in Kenya: a household level investigation. PhD
thesis, University of Nairobi
Lawson D (2004) A microeconomic analysis of health, healthcare and chronic poverty, The
University of Nottingham (unpublished)
Lépine A, Nestour A (2008) Healthcare utilization in rural Senegal: the factors before the
extension of health insurance to farmers. International Labor Office, Research paper no. 2
Lindelow M (2002) Healthcare demand in rural Mozambique: evidence from 1996/97 household
survey. International Food Policy Research Institute (IFPRI), FCND discussion paper no. 126
Litvack JI, Bodart C (1993) User fees plus quality equals improved access to health care: results of
a field experiment in Cameroon. Soc Sci Med 37:369–383
Manji JE, Moses SF, Bradley NJ, Nagelkerke MA, Plummer FA (1992) Impact of user fees on
attendance at a referral centre for sexually transmitted diseases in Kenya. Lancet 340:463–466
Manning WG, Newhouse JP, Duan N, Keeler EB, Leibowitz A (1987) Health insurance and the
demand for medical care: evidence from a randomized experiment. Am Econ Rev
77(3):251–277
McCool JH, Kiker BF, Ng YC (1994) Estimates of the demand for medical care under different
functional forms. J Appl Econometrics 9(2):201–218
McKeown T (1976) The role of medicine: dream, mirage or nemesis? Basil Blackwell, Oxford
Meer J, Harvey SR (2004) Insurance and the utilization of medical services. Soc Sci Med
58:1623–1632
Meza D (1983) Health insurance and the demand for healthcare. J Health Econ 2:47–54
Miller L (1994) Medical schools put women in curricula. Wall Street J B1
Ministry of Health (MoH), Republic of Rwanda (2009) Health sector strategic plan (unpublished)
Ministry of Health Rwanda (2009) Rwanda health financing policy review of Rwanda—options
for universal coverage. World Health Organization
Mocan NH, Tekin E, Jeffrey SZ (2004) The demand for medical care in urban China. World Dev
32(2):289–304
Mwabu KMJD, Nd’enge GK (2009) The consequences of fertility for child health in Kenya:
endogeneity, heterogeneity and the control function approach. AERC, Nairobi
Mwabu GJ, Wang’ombe B, Nganda B (2003) The demand for medical care in Kenya. African
Development Bank, Oxford
National Institute of Statistics of Rwanda (NISR) (2011) Preliminary results of interim
demographic and health survey 2010. NISR, Kigali
Nyman JA (2005) Health insurance theory: the case of the missing welfare gain, University of
Minnesota (unpublished)
Oxaal Z, Cook S (1998) Health and poverty gender analysis. Swedish International Development
Co-operation (unpublished)
Palmer T, Thompson J, Tobin M, Sheehan N, Burton P (2008) Adjusting for bias and unmeasured
confounding in Mendelian randomization studies with binary responses. Int J Epidemiol
37(5):1161–1168
Phelps CE, Newhouse JP (1974) Coinsurance, the price of time, and the demand for medical
services. Rev Econ Stat 56:334–342
Rashad IK, Markowitz S (2009) Incentives in obesity and health insurance. Inquiry 46:418–432
Ridde V (2003) Fees-for-services, cost recovery, and equity in a district of Burkina Faso operating
the Bamako Initiative. Bull World Health Organ 81:532–538
Ringel JS, Hosek SD, Ben AV, Mahnovski S (2002) The elasticity of demand for healthcare.
A review of the literature and its application to the military health system. National Defense
Research Institute (unpublished)
Rosenzweig MR, Schultz TP (1982) The behavior of mothers as inputs to child health: the
determinants of birth weight, gestation, and the rate of fetal growth. In: Fuchs VR
(ed) Economic aspects of health. The University of Chicago Press, Chicago, pp 53–92
Saksena P, Xu K, Elovaino R, Perrot J (2010) Health services utilization and out-of-pocket
expenditure at public and private facilities in low-income countries. World Health
Organization, Background paper no. 20, Geneva
Santerre RE, Neun SP (2010) Health economics: theories, insights and industry studies, 5th ed.
Cengage Learning
Sauerborn R, Nougtara A, Latimer E (1994) The elasticity of demand for healthcare in Burkina
Faso: differences across age and income groups. Health Policy Plan 9(2):186–192
Schultz TP (2008) Population policies, fertility, women’s human capital. In Schultz TP, Strauss J
(eds) Handbook of development economics, vol 4, Chap. 52. Elsevier, Amsterdam
Shimeles A (2010) Community based health insurance schemes in Africa: the case of Rwanda.
African Development Bank group, Working paper no. 120
Kokeb G. Giorgis
Abstract This study sheds light on the effect of institutional variables on economic
growth in sub-Saharan African countries. It empirically analyzes the impact of
institutional quality, proxied by control of corruption, government effectiveness,
and the protection of property rights index, among others, on economic growth in
21 sub-Saharan African countries over the sample period 1996–2012.
The methodology is based on the
first-differenced GMM estimator proposed by Arellano and Bond (Rev Econ Stud
58(2):277–297, 1991) for dynamic panel data, which accounts for
individual fixed effects, heteroskedasticity, and autocorrelation in the presence of
endogenous covariates. The results indicate that improving
institutional quality, specifically protecting property rights, on average made a positive
contribution to growth in output per capita in the sampled countries, though its effect
was small. Control of corruption and
government effectiveness also had positive effects on growth, but these were
statistically insignificant. These findings agree with some of the studies conducted so far
on the effect of institutions on growth.
4.1 Introduction
After the independence of many African countries in the 1950s and 1960s there was
a widely held expectation that poor countries in Africa would ‘catch up,’ that is,
converge in per capita income terms with developed countries. However, this
proved to be an unrealistic expectation: more than half a century after
independence, the continent is still the poorest in the world by any standard where more
record taking into account the effects of institutions using the Arellano-Bond GMM
estimator. The regression is based on data from 21 sub-Saharan Africa countries
employing panel data covering the period 1996–2012.
This paper is organized as follows: Sect. 4.2 provides a brief review of the
theoretical and empirical literature; Sect. 4.3 presents descriptive statistics on
growth and institutional patterns in sub-Saharan Africa during the sample period;
Sect. 4.4 describes the empirical methodology; Sect. 4.5 presents the results;
and Sect. 4.6 concludes.
The growth literature uses three major theories to explain differences in output per
capita among nations. The first comprises the neo-classical and endogenous growth theories,
which have long recognized that differences in output per capita are
intimately related to differences in the amount of human capital, physical capital,
and technology that workers and firms in a country have access to. For instance,
the Solow model emphasizes capital accumulation as a major driver of growth
(Solow 1956), while Grossman and Helpman’s (1991) theoretical model highlights
the quality of the capital stock as a means of boosting growth. The second is the
geographic theory, which explains how a country’s geographic location affects its
growth; this is linked to market access and climatic conditions. Theoretical and
empirical research has so far found strong causality between geographic location
and the level of income in a country. The third and most recent theory takes an
institutional approach and emphasizes the importance of institutions in affecting
growth.
Institutions are often seen as providing the ‘rules of the game’ that set up
baseline conditions for human interactions and consequently have an impact on
social, economic, and political relationships in a society. Institutions include the
moral, ethical, and behavioral norms of a society, and as such they matter for growth
and development (Nelson and Sampat 2001).
To empirically analyze the effect of institutions on economic growth, it is
important to identify which types of institutions are more important in affecting
economic growth. Different researchers and international organizations including
the Heritage Foundation have different classifications of institutions depending on
their respective objectives. According to literature, there are at least three types of
institutions: political, economic, and financial. The quality of each of these types of
institutions is measured through different variables. For example, the main variables
of economic institutions are protection of property rights; regulation and the
business freedom index; freedom in doing business; financial freedom; investment
freedom; and the quality of the regulation system. The main variables for political
66 K.G. Giorgis
institutions are the rule of law, which comprises the rule of law index, control of
corruption and freedom from corruption, and other variables.
Our study used the main economic and political institutional indicators which
are expected to have an impact on economic growth in the context of Africa. With
this objective, the three indicators used are ‘protection of property rights,’ ‘control
of corruption,’ and ‘government effectiveness.’
When it comes to the extent to which institutional aspects such as property
rights, incentive structures, and transaction costs affect economic growth, North
(1981) was a pioneer who developed the contract and predatory theory by
extending the neo-classical theory to include institutional variables. The contract
theory states that if contracts are well enforced, then they contribute beneficially to
the efficiency of business and society. If a state provides the legal framework that
reduces transaction costs in the presence of some institutions, productivity and
innovation increase. On the other side, the predatory theory treats the state as a
vehicle for collecting monopolistic rents and transferring the resources among
different groups in order to maximize incomes.
growth. That is, as institutional quality increased by 1%, the government’s fiscal
burden decreased by 0.03%. Similarly, Naude (2004) sheds light on the same
objective, but this time using data from 44 African countries and employing both
single-year cross-section data and panel data covering the period 1970–90. For
comparative purposes, the study used different econometric estimation methods
including a dynamic Arellano-Bond GMM estimator. Moreover, the study used
three proxies for institutional quality: ethno-linguistic heterogeneity (ELH),
corruption and graft, and the incidence of revolutions and coups. Based
on the GMM estimator, the author concluded that none of these had a significant
impact on growth, but the results supported Acemoglu et al.’s (2001) ‘reversal of
fortune’ thesis, namely that settler mortality (instrumenting for the quality of
institutions) is inversely related to economic growth.
Likewise, a study by Valeriani and Peluso (2011) analyzed the impact of
institutions on economic growth and examined whether the eventual impact differed
depending on the level of development in a country. They used panel data from
1950 to 2009 for 181 countries (both developing and developed) through a pooled
regression model and a fixed effects model. They employed institutional indicators
of civil liberties, number of veto players, and quality of government and found that
institutional quality impacted economic growth in a positive way. This was true for
all three institutional indicators that were examined. The only difference between
how developing and developed countries were affected by institutional quality was
the size of the impact, not its direction. On a more specific level, out of
the three institutional indicators, improved civil liberties had a greater effect on
economic growth in developing countries, whereas the number of veto players
assumed more importance for developed countries’ economies.
With a similar objective, a study by Dushko et al. (2011) used cross-country data
from 212 groups of countries and geographic regions and applied different
econometric models (OLS, G2SLS, 2SLS). It used the rule of law, revolutions, and
Freedom House ratings as well as war casualties as indicators of institutional
quality. Their study found that in all the models used, institutional quality had a
positive and significant effect in enhancing GDP per capita on average for the
sampled countries during the study period.
Acemoglu and Robinson (2010) investigated whether political or economic
institutions should be given primacy. Even though their study emphasized that
differences in prosperity across countries were due to differences in economic
institutions, it also underscored that without building strong political institutions it
was not possible to build strong economic institutions which could facilitate growth
because economic institutions are the outcome of a political process. Hence, the
study deduced that solving the problem of development entailed understanding what
instruments can be used to push a society from a bad to a good political equilibrium.
Unlike Acemoglu and Robinson’s (2010) study, Glaeser et al.’s (2004) study
had the objective of exploring the causal link between institutions and growth. It
confirmed that rather than political institutions, human capital had a causal effect on
economic growth. Importantly, in that framework, institutions did not directly affect
growth.
Table 4.2 shows that the average measures of institutional quality for the study
period were not greater than 40% implying that sub-Saharan African countries had
poor quality institutions. One can also see that there was a huge difference between
the sampled countries. For example, regarding controlling corruption the minimum
figure is 10% while the maximum is around 78% which shows that there was a clear
difference among countries in the region concerning controlling corruption; this was
also true for the other two variables.
Figure 4.1 shows the index for control of corruption proposed by the Worldwide
Governance Indicators (WGI) of the World Bank, which could serve as a proxy for a
country’s level of institutional development. It indicates the degree of corruption
within a given political system by taking into consideration financial corruption
(import and export licenses, exchange controls, tax assessments, or police
protection), as well as the following forms of corruption: patronage, nepotism, job
reservations, ‘favor-for-favors,’ and secret party funding. On average, for the
sample period, corruption was worst in Nigeria, Niger, Chad, Cameroon, and
Kenya, while Botswana, South Africa, Mauritius, and Namibia had relatively better
control over corruption.
Fig. 4.1 Rank of sub-Saharan African countries by average control of corruption (1996–2012).
Source Author’s calculations based on WGI data, the World Bank
[Figure not reproduced: vertical axis 0–80; horizontal axis lists the 21 sample countries.]
Fig. 4.2 Average rank of sub-Saharan African countries by government effectiveness
(1996–2012). Source Author’s calculations based on WGI data, the World Bank
[Figure not reproduced: vertical axis 0–80; horizontal axis lists the 21 sample countries.]
Another proxy for institutional development is the quality of government policies,
which is analyzed through WGI’s government effectiveness index. This is a
multi-dimensional index which reflects both the quality of public services and of
civil services. It accounts for the quality of policies formulated and implemented,
for political pressures and also for the government’s credibility. The country with
the lowest level of government effectiveness was Togo, followed by Chad and
Nigeria. South Africa and Botswana registered the highest average on the
government effectiveness index during 1996–2012 (Fig. 4.2).
Figure 4.3 shows the average percentage of protection of property rights for the
sampled countries in the study period. Botswana and Mauritius were relatively
better in the protection of property rights. Rwanda, Chad, and Togo showed poor
performance.
Following North (1981) our paper assesses the effect of institutions on economic
growth. For this, one can incorporate a proxy for institutions in the neo-classical
growth model. To do so we started with the aggregate production function which
describes how inputs (labor, physical and human capital, and technology) are
combined to produce the output:
4 The Impact of Institutions on Economic Growth in Sub-Saharan … 71
Fig. 4.3 Average rank of protection of property rights for sub-Saharan African countries
(1996–2012) (%). Source Author’s calculations based on WGI data, the World Bank
[Figure not reproduced: vertical axis 0–80; horizontal axis lists the 21 sample countries.]
where Y is output, K is physical capital, H is human capital, L is labor, and the
parameter A represents the level of technology in the economy. Human capital
is the knowledge, skills, and abilities of people who are or who may be involved in
the production process, while the labor force is the number of people who are able to
work. Rewriting Eq. 4.1 in per capita form yields:
$$A_t = A_0 \, k_t^{\phi_1 (I - I^*)} \, h_t^{\phi_2 (I - I^*)} \qquad (4.3)$$
where A0 represents the basic level of technology, and I* and I denote the best-quality
institutions and the country’s current level of institutional quality, respectively. The
traditional growth model assumes that economies function close to best-quality
institutions, so that in these models I = I*. This reduces the effect of institutional
quality to zero. However, since North (1981), more recent growth theories recognize
the importance of institutions. Accordingly, the term I − I* measures the degree to
which a country’s institutions fall short of the best conditions.
Therefore, substituting Eq. 4.3 in the equation on the production function per
worker, and rewriting it, gives the following:
$$y_{it} = A_0 \, k_t^{\theta + \phi_1 (I - I^*)} \, h_t^{\beta + \phi_2 (I - I^*)} \qquad (4.4)$$
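As a numerical illustration of Eq. 4.4, the short sketch below evaluates output per worker for two values of the institutional gap I − I*. The function name and every parameter value (A0, the two capital exponents, and the two institutional sensitivities) are hypothetical and chosen only for illustration; they are not estimates from this chapter.

```python
def output_per_worker(k, h, inst_gap, a0=1.0, theta=0.3, beta=0.3, phi1=0.1, phi2=0.1):
    """Evaluate Eq. 4.4 with inst_gap = I - I* (zero at best-quality institutions).

    All default parameter values are hypothetical and for illustration only.
    """
    k_exponent = theta + phi1 * inst_gap   # return to physical capital
    h_exponent = beta + phi2 * inst_gap    # return to human capital
    return a0 * k ** k_exponent * h ** h_exponent


# Same capital stocks, two institutional settings.
y_best = output_per_worker(k=10.0, h=5.0, inst_gap=0.0)    # I = I*
y_weak = output_per_worker(k=10.0, h=5.0, inst_gap=-0.5)   # I below I*
print(f"best-quality institutions: {y_best:.3f}")
print(f"weaker institutions:       {y_weak:.3f}")
```

With positive sensitivities and capital stocks above one, a shortfall from best-quality institutions lowers both exponents and therefore output per worker, which is the mechanism the equation is meant to capture.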
To study the dynamics of output per capita, taking the log of Eq. 4.4, differentiating
with respect to time (t), and rearranging gives the following:
and adding an error term $\varepsilon$ gives the growth rate of output per capita as follows:
The coefficient estimates for $\pi_1$ and $\pi_2$ measure the return to physical and human
capital investments, while coefficient $\pi_3$ measures an increasing return to physical
and human capital investments as the country’s institutional quality improves.
Therefore, Eq. 4.6 is used to test the impact of institutions on growth where $\pi_3$
measures the effect of a change in institutional quality on growth through a change
in the productivity of both human and physical capital.
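The step from Eq. 4.4 to a growth equation can be sketched as follows. The sketch treats the institutional gap I − I* as constant over time, which is an assumption made here for illustration; the chapter’s own Eqs. 4.5 and 4.6 may be arranged or parameterized differently.

```latex
\begin{align*}
\ln y_{it} &= \ln A_0
  + \left[\theta + \phi_1 (I - I^*)\right] \ln k_t
  + \left[\beta + \phi_2 (I - I^*)\right] \ln h_t \\
\frac{d \ln y_{it}}{dt} &=
  \left[\theta + \phi_1 (I - I^*)\right] \frac{d \ln k_t}{dt}
  + \left[\beta + \phi_2 (I - I^*)\right] \frac{d \ln h_t}{dt}
\end{align*}
```

In this form the growth rate of output per capita depends on the growth rates of physical and human capital, and the terms multiplied by (I − I*) carry the institutional effect, which is the role the text assigns to the third coefficient.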
To investigate the impact of institutions on economic growth, we used the
first-differenced GMM estimator proposed by Arellano and Bond (1991) for
dynamic panel data. Thus, Eq. 4.6 can further be rewritten in dynamic panel
specification as follows:
where $y_{it} = \ln GDP_{it}$ represents the natural logarithm of real GDP per capita
expressed in constant 2000 US dollars for country i at time t, and hence $\Delta y_{it}$ is
the growth rate of GDP per capita as discussed earlier. $I_{i,t}$ stands for the
institutional variables for country i at time t (controlling of corruption, government
effectiveness, and protection of property rights). $X_{i,t}$ represents both physical and
human capital variables as discussed earlier and other macroeconomic control
variables. $\eta_i$ signifies the individual fixed effects specific to each country, which
are constant over time. $\varepsilon_{it} \sim N(0, \sigma^2)$ is a random disturbance term.
Using OLS to estimate Eq. 4.7 raises several concerns. First, the
presence of the lagged dependent variable $\ln GDP_{i,t-1}$, which is correlated with the
fixed effects $\eta_i$, gives rise to a dynamic panel bias (Nickell 1981). The coefficient
estimate for lagged ln GDP is inflated by attributing to it predictive power that actually
belongs to the country’s fixed effects. Moreover, estimating a panel
data model with a lagged dependent variable leads to biased results, at least in
samples with a short time dimension (Judson and Owen 1999).
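The dynamic panel bias described above can be illustrated with a small simulation. The sketch below is not the chapter’s model: it generates an artificial AR(1) panel with country fixed effects (all sizes, the true coefficient of 0.5, and the function names are hypothetical) and compares pooled OLS with the within (fixed effects) estimator. Pooled OLS overstates the lag coefficient because it absorbs the fixed effects, while the within estimator understates it when the time dimension is short, which is the case analyzed by Nickell (1981).

```python
import numpy as np


def simulate_panel(n=500, t=6, rho=0.5, burn=50, seed=0):
    """Simulate y[i, s] = rho * y[i, s-1] + eta[i] + eps[i, s]; returns an (n, t) array."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=n)              # country fixed effects
    y = eta / (1.0 - rho)                 # start each unit at its long-run mean
    out = np.empty((n, t))
    for s in range(burn + t):
        y = rho * y + eta + rng.normal(size=n)
        if s >= burn:
            out[:, s - burn] = y
    return out


def slope(x, z):
    """OLS slope of z on x without an intercept (inputs already demeaned)."""
    return float(np.sum(x * z) / np.sum(x * x))


y = simulate_panel()
lag, cur = y[:, :-1], y[:, 1:]

# Pooled OLS ignores eta_i, so the lag coefficient absorbs the fixed effects.
pooled = slope((lag - lag.mean()).ravel(), (cur - cur.mean()).ravel())

# Within estimator: demean by country; biased downward when T is small.
within = slope((lag - lag.mean(axis=1, keepdims=True)).ravel(),
               (cur - cur.mean(axis=1, keepdims=True)).ravel())

print(f"true rho = 0.5, pooled OLS = {pooled:.3f}, within = {within:.3f}")
```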
Therefore, the alternative solution is to use the generalized method of moments
(GMM) estimator developed by Arellano and Bond (1991). It is an efficient estimator
for dynamic panels and is popular in empirical growth research as it allows
relaxing some of the OLS assumptions. The Arellano and Bond estimator corrects
endogeneity in the lagged dependent variable and provides consistent estimates.
Moreover, it accommodates auto-correlation and heteroskedasticity, among others
(Roodman 2006).
The first step of the GMM procedure is to take first differences of Eq. 4.7 to remove
the individual effects $\eta_i$, which gives the following:
In the differenced Eq. 4.8, there is still a correlation between $\Delta \varepsilon_{i,t}$ and
$\Delta \ln GDP_{i,t-1}$, which can be addressed by instrumenting $\Delta \ln GDP_{i,t-1}$. Finding a
valid external instrument is very difficult; hence, GMM draws instruments from
within the dataset, that is, lagged values of the dependent variable and, in case of
endogeneity, of the independent variables. Thus, the GMM procedure gains efficiency
compared to OLS by exploiting additional moment restrictions.
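The idea of drawing instruments from within the dataset can be sketched with a simpler relative of the Arellano-Bond estimator. The code below is not Arellano-Bond GMM: it is a just-identified instrumental variable estimator in the spirit of Anderson and Hsiao, which uses a single instrument (the level lagged twice) for the lagged difference, whereas Arellano-Bond stacks all available lags in a GMM framework. The simulated panel, its dimensions, and the true coefficient of 0.5 are hypothetical.

```python
import numpy as np


def simulate_panel(n=3000, t=8, rho=0.5, burn=50, seed=1):
    """Simulate y[i, s] = rho * y[i, s-1] + eta[i] + eps[i, s]; returns an (n, t) array."""
    rng = np.random.default_rng(seed)
    eta = 0.5 * rng.normal(size=n)        # country fixed effects
    y = eta / (1.0 - rho)
    out = np.empty((n, t))
    for s in range(burn + t):
        y = rho * y + eta + rng.normal(size=n)
        if s >= burn:
            out[:, s - burn] = y
    return out


y = simulate_panel()
dy = np.diff(y, axis=1)       # first differences remove the fixed effects eta_i

d_cur = dy[:, 1:].ravel()     # dependent variable: difference at time t
d_lag = dy[:, :-1].ravel()    # regressor: difference at time t-1 (endogenous)
z = y[:, :-2].ravel()         # instrument: level at time t-2

# OLS on the differenced equation is biased because the differenced error
# contains eps_{t-1}, which also enters the lagged difference.
ols_diff = float(np.sum(d_lag * d_cur) / np.sum(d_lag * d_lag))

# Just-identified IV: the level at t-2 is uncorrelated with the differenced error.
iv = float(np.sum(z * d_cur) / np.sum(z * d_lag))

print(f"true rho = 0.5, differenced OLS = {ols_diff:.3f}, IV = {iv:.3f}")
```

Using more lags as instruments, as Arellano-Bond does, adds moment conditions and can improve precision, at the cost of a larger instrument count.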
The regression outputs from Eq. 4.7 are short-term estimates in the context of
economic growth. Since the effect of different factors should be evaluated in the
long run, it is also vital to compute the long-run coefficients. Hence, transforming
Eq. 4.7 yields the following:
First, Eq. 4.8 is estimated using the Arellano-Bond first-difference GMM
estimator to obtain the short-run coefficients. Second, the long-run coefficients and the
error correction term are computed and tested for significance using the Wald
test. The short-run equations correspond to Models 1 to 3 in Table 4.4,
while the corresponding long-run equations (Models 1 to 3) are given in
Table 4.5.
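The second step can be sketched as follows. The sketch assumes the standard long-run transformation of a dynamic model, in which each short-run coefficient is divided by one minus the coefficient on the lagged dependent variable and the convergence coefficient equals that lagged coefficient minus one; the function name is hypothetical. Applied to the Model 1 short-run estimates in Table 4.4, this reproduces the Model 1 entries of Table 4.5 to three decimals.

```python
def long_run_coefficients(short_run, lag_coef):
    """Divide each short-run coefficient by (1 - lag_coef)."""
    if not -1.0 < lag_coef < 1.0:
        raise ValueError("lag coefficient must lie strictly between -1 and 1")
    scale = 1.0 - lag_coef
    return {name: coef / scale for name, coef in short_run.items()}


# Model 1 short-run estimates from Table 4.4.
lag_coef = 0.156
short_run = {"lnGFCF": -0.067, "lnTrade": 0.093}

long_run = long_run_coefficients(short_run, lag_coef)
convergence = lag_coef - 1.0   # coefficient on lagged GDP in growth-rate form

for name, value in long_run.items():
    print(f"{name}: {value:.3f}")
print(f"convergence coefficient: {convergence:.3f}")
```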
Finally, the consistency of the GMM estimator depends on the validity of
the moment conditions, which can be checked using two specification tests:
the Hansen test of over-identifying restrictions, whose joint null
hypothesis is that the instruments are valid, and the Arellano-Bond test, whose
null hypothesis is no second-order serial correlation in the error term. Both tests
are applied to ascertain the consistency of the estimator.
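The decision rule for the two tests can be written out as a short check. The p values below are those reported in Table 4.4; the 5% threshold is a conventional choice assumed here, and the variable and function names are hypothetical. In both tests a p value above the threshold means the null hypothesis (valid instruments, no second-order serial correlation) is not rejected.

```python
ALPHA = 0.05  # conventional significance threshold (an assumption of this sketch)

# p values reported in Table 4.4.
diagnostics = {
    "Model 1": {"hansen_p": 0.122, "ar2_p": 0.839},
    "Model 2": {"hansen_p": 0.476, "ar2_p": 0.901},
    "Model 3": {"hansen_p": 0.357, "ar2_p": 0.563},
}


def passes_diagnostics(d, alpha=ALPHA):
    """True when neither null hypothesis is rejected at level alpha."""
    return d["hansen_p"] > alpha and d["ar2_p"] > alpha


results = {model: passes_diagnostics(d) for model, d in diagnostics.items()}
for model, ok in results.items():
    print(f"{model}: {'not rejected' if ok else 'rejected'}")
```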
Table 4.3 represents the various macroeconomic variables and national accounts
data. To capture institutional quality, we used some of the vital indicators from the
WGI database: controlling of corruption, government effectiveness, and protection
of property rights. The dependent variable is represented by real GDP per capita in
21 sub-Saharan African countries. The analyzed period is 1996–2012, covering a
series of financial and economic crises.
As can be seen from the short-run estimates in Table 4.4, Model 1 is the baseline
equation where, besides the lagged level of GDP, we also introduce classical growth
determinants such as gross fixed capital formation and trade openness as a
percentage of GDP (export to GDP ratio). Both the lagged level of GDP and the
export to GDP ratio have the expected signs and are significant, while gross fixed
capital formation has an unexpected negative sign.
An increase in trade openness (exports as a percentage of GDP) by 1% will raise
GDP per capita by 0.093%. Gross fixed capital formation (as a percentage of
GDP) is negatively related to GDP per capita because of the crowding out effect;
in this case, domestic investments are much more important than public
investments. Similarly, the long-run estimates (Table 4.5, Model 1) show that
an increase in trade openness by 1% will raise GDP per capita by 0.11%, while
a 1% increase in gross fixed capital formation will reduce GDP per capita by
0.08%. Further, the catch-up term has the expected negative sign, and it is
statistically significant.

Table 4.4 Institutions and economic growth: short-run estimations. Dependent variable: real per
capita GDP (logarithm)

| Regressors | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| L.lngdpc | 0.156*** (0.059) | 0.132** (0.065) | 0.106** (0.058) |
| lnGFCF | −0.067*** (0.018) | −0.07*** (0.018) | −0.063*** (0.023) |
| lnTrade | 0.093** (0.027) | 0.107*** (0.027) | 0.113*** (0.027) |
| lnschooling | | 0.003 (0.016) | −0.003 (0.014) |
| Government effectiveness | | 0.001 (0.001) | 0.0004 (0.002) |
| Controlling of corruption | | −0.0001 (0.001) | −0.0002 (0.0008) |
| Property rights | | 0.003*** (0.001) | 0.002*** (0.001) |
| lnGovernmentConsumption | | | −0.089** (0.051) |
| N | 315 | 315 | 315 |
| No. of instruments | 75 | 79 | 80 |
| Hansen J statistic (p value) | 0.122 | 0.476 | 0.357 |
| Serial correlation test AR2 (p value) | 0.839 | 0.901 | 0.563 |

Note
Robust standard errors in brackets
*, ** and *** denote significance levels of 10, 5 and 1%
N represents the number of panel observations
Method used is Arellano and Bond’s (1991) first difference GMM
Instruments, Arellano-Bond type: the dependent variable from lags 2 to 5. Standard instruments:
the level of all other regressors
The Hansen test reports the validity of the instrumental variables. The null hypothesis is that
the instruments are not correlated with the residuals (for robust estimations Stata reports the
Hansen J statistic instead of the Sargan test)
For the Arellano-Bond test, the null hypothesis is that of no serial correlation between residuals
In Table 4.4, Model 2, proxies for institutional quality (the index for
control of corruption, the government effectiveness index, and the index for
protection of property rights) are added to the classical growth determinants
to see the effect of institutions on economic growth. The results show that the index
for protection of property rights has a positive though negligible impact on
growth in GDP per capita. However, the government effectiveness and control of
corruption indices are not statistically significant although they have the expected
sign. Similarly, the gross enrollment rate (schooling), which is a proxy for human
capital, has the expected positive sign in the presence of institutions even though
it is insignificant. As expected, gross fixed capital formation and trade openness remain
highly significant, and the impact of gross fixed capital formation on growth even
increases slightly in the presence of institutions.

Table 4.5 Institutions and economic growth: long-run estimations. Dependent variable: real per
capita GDP (logarithm)

| Regressors | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| L.lngdpc (convergence coefficient) | −0.844*** (0.059) | −0.868*** (0.065) | −0.894*** (0.058) |
| lnGFCF | −0.079** (0.036) | −0.081** (0.041) | −0.071** (0.034) |
| lnTrade | 0.110** (0.053) | 0.123*** (0.027) | 0.126*** (0.035) |
| lnschooling | | 0.004 (0.02) | −0.003 (0.014) |
| Government effectiveness | | 0.002 (0.01) | 0.0004 (0.002) |
| Controlling of corruption | | 0.002 (0.001) | −0.0002 (0.001) |
| Property rights | | 0.004*** (0.001) | 0.002*** (0.001) |
| lnGovernmentConsumption | | | −0.090** (0.056) |

Note
Standard errors in brackets
*, ** and *** denote significance levels of 10, 5 and 1%
From Table 4.5, Model 2, the long-run effect of the institutional variable
‘protection of property rights’ is slightly higher than its short-run estimate:
it increases from 0.003 to 0.004% for a 1% increase in the quality of
protection of property rights, indicating that even in the long run its effects are
negligible. Besides, the introduction of institutional variables slightly raises the
speed of convergence to the steady state from 0.84 to 0.87.
To test the robustness of the models, we introduced one control variable, the
general government final consumption expenditure (as a percentage of GDP), in
Model 3 (Tables 4.4 and 4.5). The impact of institutions (protection of property
rights) on growth remained significant after the introduction of this macroeconomic
policy variable. The impact of corruption and government effectiveness on economic
growth remained insignificant.
Two major concerns in using GMM estimators are the validity of the instruments
and serial correlation in the residuals. The p values obtained using the Hansen test
(see Table 4.4) do not reject the exogeneity of the instruments used, that
is, the instrument sets were orthogonal to the error term and were therefore valid for
estimation. Similarly, to address serial correlation of the residuals, we
tested for auto-correlation of second order in the errors. As
can be seen from Table 4.4, the Arellano and Bond test did not reject the null
hypothesis of the absence of second-order auto-correlation.
4.6 Conclusion
References
Acemoglu D, Robinson JA (2010) Why Africa is poor? Econ Hist Dev Reg 25:21–50
Acemoglu D, Johnson S, Robinson JA (2001) The colonial origins of comparative development.
Am Econ Rev 91(5):1369–1401
Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and
an application to employment equations. Rev Econ Stud 58(2):277–297
Dushko J, Darko L, Risto F, Cane K (2011) Institutions and growth revisited: OLS, 2SLS, G2SLS,
Random Effects IV regression and panel fixed (within) regression with cross country data
Glaeser E, Porta RL, Lopez-de-Silanes F, Shleifer A (2004) Do institutions cause growth? J Econ Growth
9:271–303
Grossman GM, Helpman E (1991) Comparative advantage and long-run growth. Am Econ Rev 80
(4):796–815
Hall RE, Jones CI (1999) Why do some countries produce so much more output per worker than
others? Quart J Econ 114:83–116
Judson R, Owen A (1999) Estimating dynamic panel data models: a guide for macroeconomists.
Econ Lett 65:9–15
Naude WA (2004) The effects of policy, institutions and geography on economic growth in Africa: an
econometric study based on cross-section and panel data. J Int Dev 16:821–849
Nelson RR, Sampat BN (2001) Making sense of institutions as a factor shaping economic
performance. J Econ Behav Organ 44:31–54
Neuhaus M (2006) The impact of FDI on economic growth. Physica-Verlag, Wurzburg
Nickell S (1981) Biases in dynamic models using fixed effects. Econometrica 49:1417–1426
North DC (1981) Structure and change in economic history. Norton, New York
North DC (1990) Institutions, institutional change and economic performance. Cambridge University Press, New
York
Redek T, Sušjan A (2005) The impact of institutions on economic growth: the case of transition
economies. J Econ Issues 39(4):995–1027
Rodrik D, Subramanian A, Trebbi F (2004) Institutions rule: the primacy of institutions over geography and
integration in economic development. J Econ Growth 9:131–165
Roodman D (2006) How to do xtabond2: an introduction to difference and system GMM in
STATA. Center for Global Development, Working paper no. 103
Solow RM (1956) A contribution to the theory of economic growth. Quart J Econ 70:65–94
Valeriani E, Peluso S (2011) The impact of institutional quality on economic growth and development: an
empirical investigation. J Knowl Manage 6:1–25
Chapter 5
Fiscal Effects of Aid in Rwanda
Abstract This paper analyzes the dynamic relationship between foreign aid and
domestic fiscal variables in Rwanda using a co-integrated vector auto-regressive model
for quarterly data over the period 1990Q1–2015Q4. The results show that aid and fiscal
variables form a long-run stationary relationship, that aid is a significant element of the
long-run fiscal equilibrium, and that the hypothesis of aid exogeneity is not statistically
supported; anticipated aid appears to have been taken into account in budget planning.
Aid is associated with increased tax effort, higher public spending, and lower domestic
borrowing. Aid has contributed to improved fiscal performance in Rwanda, although
slow growth in tax revenue and regular aid shortfalls have prevented sustaining a balanced
budget inclusive of aid. In terms of policy, continued efforts by donors to coordinate aid
delivery systems, make aid more transparent, and support improvements in government
fiscal statistics will all contribute to improving fiscal planning. Recipients need to know
how much aid is available to finance spending and how this is delivered, that is, whether
through donor projects or government budgets.
The views expressed in this paper are those of the authors. They do not necessarily represent the
views of the Bank of Uganda, University of Kigali and the National Bank of Rwanda or their
affiliated organizations.
T. Bwire (✉)
Bank of Uganda, Kampala, Uganda
e-mail: tbwire@bou.or.ug
C. Tamwesigire
University of Kigali, Kigali, Rwanda
e-mail: ctamwesigire@yahoo.com
P. Munyankindi
National Bank of Rwanda, Kigali, Rwanda
e-mail: pmunyankindi@bnr.rw
5.1 Introduction
The underlying economic rationale for foreign aid to developing countries can be
traced back to Chenery and Strout’s (1966) two-gap model. In their model,
investments are the cornerstone of growth, but they require domestic savings and, at
least initially, imported capital goods. Low-income countries are constrained by two
gaps: insufficient domestic savings to provide the resources needed for financing the
level of investments required to achieve their target growth rates and insufficient
foreign exchange earnings (as they are unlikely to have sufficient export earnings)
to finance capital imports. As these savings and foreign exchange gaps constrain
growth, capital flows (of which foreign aid is one form) are an important source of
development finance (Franco-Rodriguez et al. 1998; McGillivray and Morrissey
2000) as they relax savings and foreign exchange constraints.
Aid is premised on different development constraints. However, the fact that
most of the aid that is spent in a country goes to (or through) the government or
finances the provision of public goods and services that would otherwise place
demands on the budget (Franco-Rodriguez et al. 1998; McGillivray 1994, 2001)
makes understanding its effects on central government fiscal behavior a necessary
condition for its effective and successful deployment.
Fiscal response models (hereafter FRM) offer important insights into how foreign
aid donors expect their efforts to impact the fiscal behavior of a recipient
government. The new incentives and conditions created by the
addition of foreign aid to the actions of the state change how the state
uses the fiscal tools of tax revenue, expenditure, and public debt, but the
direction of these changes is uncertain. Aid packages come with strong pressures to
spend, so there is an expectation that aid will increase spending (O’Connell et al.
2008). Moreover, reforms linked to aid conditionalities are expected to increase tax
revenues, either by influencing tax effort or by affecting tax rates or the
tax base (Morrissey 2015). Perhaps because donors’ conditionality often requires
recipient governments to reduce budget deficits (Adam and O’Connell 1999;
McGillivray and Morrissey 2000), aid is also expected to lower domestic
borrowing. However, these are general expectations and may not always
hold true.
In the 10 years before 2008, total official development assistance (ODA) as
a share of GDP averaged 29.7% (The World Bank 2008). Over the same period, a
World Bank report puts foreign direct investment and domestic savings as shares
of GDP at a dismal 0.23 and −1.4%, respectively, on average.
With the new G-8 initiative on debt forgiveness and donors’ increased focus on
the poorest countries, the level of support to Rwanda was scaled up until 2012 when
the country suffered aid suspension. During FY 2013–14, Rwanda’s budget and
sector support as well as project financing, grants, and loans accounted for 11.6% of
GDP and 40% of government spending. This is illuminating and clearly highlights
the importance of ODA in sustaining Rwanda’s broad growth prospects, making it
an interesting case study for the effects of foreign aid.
Studies that have investigated the effect of aid on the fiscal behavior of recipient
countries are reviewed and discussed in Morrissey (2015). As echoed in Riddell
(2007), the debate suggests that country-based evidence provides the only reliable
backdrop for exploring aid–fiscal behavior dynamics as experiences between
countries vary due to their different institutional foundations. Our paper investigates
the fiscal effects of foreign aid in Rwanda using a quarterly dataset for
1990Q1–2015Q4. The advantage of this quarterly data is that aid is measured by the
Ministry of Finance and Economic Planning and should therefore be closer than donors’
measures to the aid recorded in the budget. A potential disadvantage is that
quarterly data may not correspond fully with the annual budget planning cycle.
Nonetheless, as shown in Bwire et al. (2016), quarterly data give qualitative results
similar to those obtained from annual data, and these are in general consistent with
what is known about the fiscal effects of aid.
The rest of the paper is organized as follows. Section 5.2 provides a brief
literature review, while the data, econometric methodology, and aid-related
hypotheses of interest are presented in Sect. 5.3. Section 5.4 discusses the empir-
ical results. Section 5.5 gives the conclusion and policy recommendations.
There is significant empirical literature on the impact of aid on the fiscal behavior of
aid recipients. A detailed review of this literature is provided in McGillivray and
Morrissey (2004) and Morrissey (2015). An important distinction is made between
fungibility and fiscal response studies.
Fungibility studies analyze the effects of foreign aid on the composition of
government spending. Aid is said to be fungible if the recipients fail to use it in the
manner intended by the donor. As presented in World Bank (1998), the underlying
assumption is that donors grant aid to finance public investments as increments to
the capital stock, which are the principal determinants of growth; fungibility arises
when recipients divert the aid to finance government consumption spending. This is
considered undesirable because such a diversion reduces the effectiveness of aid.
However, to the extent that consumption spending is a necessary complement to
investment spending (recurrent spending, such as on nurses and medicines for a
healthcare center, is required to operate investments), the assumption that
fungibility diminishes the effectiveness of aid may be misleading.
Analogously, fungibility is said to occur if aid intended to finance a particular
sector, such as health or education services that would otherwise be funded by tax
revenues, releases domestic resources for spending in other sectors of the
economy. In this case, fungibility arises because donors and recipients have
differing expenditure allocation preferences. Evidence as to whether aid has been
fungible or not and whether fungibility limits aid effectiveness is imprecise largely
due to data limitations. Morrissey (2015) details the practical difficulties of directly
linking aid, donor intentions, and sector spending, given the need to distinguish
82 T. Bwire et al.
in aid and this was not because aid was fungible but because investment spending
was linked to borrowing and declined as borrowing was reduced, whereas recurrent
spending was linked to tax and this increased as revenues increased.
Morrissey et al. (2007) extended this approach with official Kenyan data for
1964–2004 and estimated two relationships: the fiscal effects of aid grants and
loans, and the impact of aid on growth. They found that aid grants were associated
with increased spending, while loans were a response to unanticipated deficits; that
is, if spending exceeded revenues (tax and grants), the government sought loans to
finance the deficit. Aid grants were positively associated with growth through
financing government spending, and loans were negatively associated with growth
perhaps because they were associated with deficits. There was no evidence that aid
affected tax revenue or that tax had an effect on growth (except indirectly via
financing spending).
Martins (2010) provides a comprehensive application of the CVAR method
using quarterly data for Ethiopia over 1993–2008. He finds evidence of a long-run
positive relationship between aid and development spending, but not between aid
and recurrent spending (hence, no evidence that aid is fungible). Domestic
borrowings increased in response to shortfalls in revenue (tax and grants), and there
was no evidence that aid reduced tax efforts. Further, aid grants adjusted to the level
of development spending.
Bwire et al. (2013, 2016) formulated a set of testable hypotheses for the fiscal
effects of aid (budgetary constraints, a balanced budget, aid additionality/illusion,
tax revenue displacement, and aid-domestic borrowing substitution) in Uganda
within the CVAR framework on both annual and quarterly fiscal data. They found
that aid was a significant element in the long-run fiscal equilibrium and did not find
evidence supporting the assumption that aid was exogenous in the fiscal equilib-
rium. Aid was associated with increased tax efforts, lower domestic borrowings,
and increased public spending. Further investigation of the long-run relation among
the fiscal variables revealed support for the existence of a budget constraint and a
non-balanced budget excluding aid. Mascagni and Timmis (2014) applied a CVAR
analysis to Ethiopian government data over 1960–2009: Aid (grants and loans) was
positively related to tax revenue; tax did not adjust to aid but aid was an adjusting
variable, implying that donors rewarded Ethiopia when tax revenues were
increasing. Table 5.1 presents the results of selected country-specific studies on the
dynamic effect of aid.
Our study used a CVAR model to evaluate hypotheses of interest relating to the
interaction of aid with domestic fiscal aggregates in Rwanda based on quarterly
time series data for the period 1990Q1–2015Q4. In particular, the study evaluates whether
a fiscal equilibrium exists among the fiscal variables, including aid; whether aid
forms part of this equilibrium relation; whether donor governments react to
fiscal disequilibrium; whether donors’ aid allocation is influenced by past fiscal
conditions in Rwanda; and whether aid influences fiscal conditions in Rwanda. It
also estimates the long-run impact of aid on domestic fiscal aggregates.
5.3.1 Data
Our study used quarterly time series data (1990Q1–2015Q4) in Rwandan francs
reported at constant 2011 prices. Fiscal data on foreign aid, tax revenue, domestic
borrowings from the banking system, and recurrent and capital government
spending are from Rwanda’s Ministry of Finance and Economic Planning. The
non-tax revenue component of domestic revenue and other forms of borrowing are
omitted from the system as we are not estimating an identity. Aid data capture total
net disbursements from all donors as recorded by the government and comprise
capital and budgetary grants. As these data are from the fiscal authorities, they are
assumed to fairly measure the actual aid known to the fiscal authorities and should be
capable of affecting budget planning. Nonetheless, while this is true for all on-budget or
program support, caution is needed because the appropriate treatment of capital
grants is more complicated. Some of these grants may be on-budget, such as
sector projects that are known to the government, especially if matching funds are
required; some may be known and influence spending allocations, such as health
projects that permit the government to reduce its own health spending; and some
may be genuinely off-budget, such as technical assistance in an area that the government
would not otherwise fund, which is spent either within the donor country
or under the control of the donors, or project aid over which the donors retain control.
Some previous applications (Martins 2010; Morrissey 2001) have disaggregated
aid into grants and loans because, in principle, they may have different effects (governments
prefer grants because they do not have to be repaid; loans may encourage
fiscal planning for future servicing and repayment costs), so that there could be an aid
aggregation bias. However, as argued in McGillivray and Morrissey (2001) and
Bwire et al. (2013), in practice, such a bias is likely to be minor. Aid loans are long
term, and governments currently in power are unlikely to be around when repayments
are due, so loans could be treated as grants. Indeed, the share of aid loans/GDP fell from
4.7% through the 1990s to 3.9% in the last 15 years. Over the same time, the share of
aid grants/GDP rose sharply from an average of 1.6% to 6.3%. Thus, capital grants are
similar to budgetary grants and are treated as grants or aid in this study.
Raw data are reported in Fig. 5.1. A visual inspection of the data reveals two
important features. First, levels were low and relatively persistent until the start of
the 2000s, after which spending and revenue followed a clear upward trend while aid
followed only a slight, irregular upward trend. Aid was generally low during the 1990s,
turning negative in 1994 (perhaps reflecting the genocide), but increased dramatically
between 2000 and 2010. It increased erratically until 2015, dropping sharply
during 2012 when the country suffered an aid suspension. As a share of spending, aid
was equivalent to 28.1% through the 1990s, increased steadily through the 2000s to
43.7% and averaged 30.7% over the last five years. Within years, aid tended to be
highest in the fourth quarter (or sometimes the second) and this was also the case,
but less pronounced, for tax revenue. Domestic borrowings were negative
throughout most of the mid-1990s and late 2000s.
[Figure: quarterly series, 1990Q1 to 2015Q4, for Tax Revenue, Aid, Domestic Financing, Recurrent Spending, and Capital Spending; vertical axis in Rwandan francs (billions, 2011 prices), from −40 to 240]
Fig. 5.1 Series in levels. Source: Ministry of Finance and Economic Planning, Rwanda
Second, all variables typically trended over time, suggesting a multiplicative
rather than additive model specification, which a log transformation brings
back into additive form. However, as argued in Bwire et al. (2013) and Juselius
et al. (2011), such a transformation is innocuous only if the data points in the
series are strictly positive, or at least not too close to zero. In our sample,
log transformation of the domestic borrowing series and of some data points in the
aid series is problematic, with serious consequences for estimation. First, it
generates lost observations, shortening an already small sample. This alone
weakens the power of the tests, making the CVAR analysis less reliable. Second,
the omission of non-positive observations will be nonrandom, leading to a
selection bias. And third, the trending in the data begins from the early 2000s,
a shift that might be lost with log transformation.
Given this, all series are left in non-log specifications. However, while a trade-off
in the choice between log and non-log specifications might matter, as we show, our
analysis gives results that are consistent with what is known about the fiscal impact
of aid in some of the previous country-specific applications, particularly those that
typically used log transformations due to trends in the variables. This in itself
suggests that there is little to be gained from log over non-log specifications.
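The loss of observations described above can be illustrated with a minimal sketch. The series below is hypothetical (invented values, not the Rwandan data), and `log_or_none` is a helper introduced only for this illustration.

```python
import math

# Hypothetical quarterly values (billions of francs); negatives mimic
# net repayments of domestic borrowing or negative net aid.
domfin = [4.2, -1.5, 0.0, 3.1, -2.7, 6.0, 1.2, -0.4]

def log_or_none(value):
    """Natural log for strictly positive values, None otherwise."""
    return math.log(value) if value > 0 else None

logged = [log_or_none(v) for v in domfin]
kept = [v for v in logged if v is not None]
lost = len(domfin) - len(kept)

print(f"observations: {len(domfin)}, usable after logs: {len(kept)}, lost: {lost}")
# The lost observations are exactly the non-positive ones, so the surviving
# sample is selected on the sign of the variable rather than at random.
```

In this toy series half of the observations drop out, and all of them are quarters with zero or negative values, which is the nonrandom omission the text warns about.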
\[ \Delta x_t = \Pi x_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta x_{t-i} + \Psi d_t + \varepsilon_t \tag{5.1} \]
\[ \Pi = \alpha \beta' \tag{5.2} \]
where $\alpha$ and $\beta$ are both $(n \times r)$, and $r$ is the rank of $\Pi$, corresponding to the number
of linearly independent relationships among the variables in $x_t$. The fiscal equilibrium,
thought of as the statistical analogue of the budgetary equilibrium in fiscal
response models, is defined by the parameters in $\beta$. It follows then that $\beta' x_{t-1}$
measures the extent to which the budget is out of equilibrium and $\alpha$ measures the
long-run rate at which each of the variables adjusts to restore the equilibrium.
Coefficients in the $\Gamma_i$ matrices allow short-run adjustment in each of the variables to
differ from that given by their long-run rates (defined by the coefficients in $\alpha$) and
hence, potentially at least, accommodate a wide range of dynamic responses.
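As a hedged numerical illustration of the decomposition of the long-run matrix into adjustment and equilibrium parameters, the sketch below uses the cointegrating vector normalized on Domfin and the adjustment coefficients reported in Table 5.4. The lagged levels `x_lag` are hypothetical, and deterministic terms and short-run dynamics are left out for simplicity.

```python
import numpy as np

# Variable order follows the columns of Table 5.4.
names = ["Domfin", "C_spending", "K_spending", "Aid", "Tax_Rev"]

# Cointegrating vector normalized on Domfin and its adjustment coefficients,
# copied from Table 5.4 (n = 5 variables, r = 1 relation).
beta = np.array([[-1.000, 1.886, 1.359, -0.835, -2.533]]).T   # shape (5, 1)
alpha = np.array([[-0.378, -0.079, 0.067, 0.215, 0.097]]).T   # shape (5, 1)

# Long-run matrix as the product alpha beta'. With r = 1 it has rank 1.
Pi = alpha @ beta.T
rank = np.linalg.matrix_rank(Pi)

# Hypothetical lagged levels, invented purely for illustration.
x_lag = np.array([5.0, 60.0, 40.0, 45.0, 70.0])

# beta' x_{t-1}: how far the system is from the long-run relation.
disequilibrium = float(beta[:, 0] @ x_lag)

# Error-correction term: alpha * (beta' x_{t-1}), which equals Pi x_{t-1}.
correction = alpha[:, 0] * disequilibrium

print("rank of Pi:", rank)
for name, value in zip(names, correction):
    print(f"{name:>11}: {value:+.3f}")
```

Each printed value is the part of that variable's quarterly change attributable to the lagged disequilibrium alone, under the stated simplifications.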
The VECM in Eq. (5.1) is particularly attractive in the current context, since it
provides a natural framework in which parallels between the economics and
econometrics of fiscal response models can be exploited. Specifically, the frame-
work not only facilitates a statistical investigation of the role of aid in the budgets of
recipient countries but also shows whether fiscal conditions in recipient countries
affect aid-allocation behavior in donor countries. Because these economic
hypotheses of interest represent parameter restrictions within VECM, they can be
evaluated formally. In what follows, these economic issues of interest are set out as
a number of key propositions.
As discussed earlier, insofar as aid represents an injection of foreign finance, it
relaxes budget constraints. Aid allocated for financing debt or domestic con-
sumption is unlikely to achieve longer term effects on the budget, in which case the
impact of aid will be confined to the short run. In contrast, where aid is used as a
source of investment for development projects such as health care or infrastructure,
there may be more long-term effects on the budget as such investments spawn
further spending (aid illusion) or increased tax revenues. Since development pro-
jects of this sort are likely to have come about as a result of aid’s incorporation into
the process of budgetary planning, it is convenient to think of the aid’s long-run
effects and its incorporation into budgetary planning synonymously. Clearly,
whether aid is anticipated or not has a decisive bearing on the uses to which it is put
and thus the (short and/or long run) effects that it has.
The economic distinction between short and long run ties in neatly with the
VECM’s econometric formulation which in turn offers insights into the role of aid
in an empirical setting. The correspondence between the economics and econo-
metrics of aid in fiscal response is central to our paper, since it provides the basis for
the empirical testing of a range of economic hypotheses relating to the effects that
aid has in developing countries.
As can be deduced from the discussion earlier, the co-integrating relation is the
statistical analogue of the budgetary equilibrium in fiscal response models. Hence,
the fiscal response theory predicts the presence of a single co-integrating relation
(i.e., a stationary linear combination of the variables in $x_t$) such that $\beta$ is an $n \times 1$
vector, the coefficients of which quantify the budgetary equilibrium. Of course, this
presupposes that all variables in $x_t$ are integrated of order 1, I(1). Where a variable
is I(0), it will form a stationary linear combination with itself, so that there can exist
5 Fiscal Effects of Aid in Rwanda 89
at most $n$ of these stationary linear combinations; $n = r$ implies that all variables are
I(0). As Johansen (1992) demonstrates, each of the $r$ columns of $\alpha$ corresponds to
the $r$ rows of $\beta'$, so that inference on the number of co-integrating vectors (nonzero
rows in $\beta'$) can be evaluated by hypothesis testing on the adjustment coefficients
(nonzero columns in $\alpha$) using likelihood ratio methods. Specifically, standard tests
for co-integration are equivalent to testing that the $\alpha_i$'s are insignificantly small for
$r = 1, \ldots, n$. This leads us to the first set of co-integration hypothesis tests, which
amount to zero restrictions on each of the $n$ columns of $\alpha$ in Eq. (5.2): $H_c(r)$:
$\alpha_r = 0$, where $r = 1, \ldots, n$.
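One common way to carry out likelihood ratio tests on the co-integration rank is the Johansen trace statistic. The chapter does not print the formula, so the sketch below is an assumption about the implementation rather than the authors' code; the eigenvalues and the effective sample size are hypothetical.

```python
import math

def trace_statistic(eigenvalues, r, n_obs):
    """Johansen trace statistic for H0: rank <= r.

    Uses the n - r smallest eigenvalues: -T * sum(ln(1 - lambda_i)).
    """
    ordered = sorted(eigenvalues, reverse=True)
    return -n_obs * sum(math.log(1.0 - lam) for lam in ordered[r:])

# Hypothetical eigenvalues for a five-variable system (not from the chapter).
eigs = [0.45, 0.18, 0.09, 0.04, 0.01]
n_obs = 102  # assumed effective sample: 104 quarters minus two lags

stats = [trace_statistic(eigs, r, n_obs) for r in range(len(eigs) + 1)]
for r, s in enumerate(stats[:-1]):
    print(f"H0: rank <= {r}: trace = {s:.2f}")
```

Each statistic would be compared with the appropriate critical value for the chosen deterministic specification, and the rank is typically taken as the first value of r that is not rejected.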
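To assist the exposition, consider a VAR(2) in VECM form with unrestricted
constant partitioned conformably as mentioned earlier:
\[
\begin{bmatrix} \Delta x_{1t} \\ \Delta x_{2t} \end{bmatrix}
= \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix}
\begin{bmatrix} \beta_{11}' & \beta_{12}' \\ \beta_{21}' & \beta_{22}' \end{bmatrix}
\begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \end{bmatrix}
+ \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}
\begin{bmatrix} \Delta x_{1,t-1} \\ \Delta x_{2,t-1} \end{bmatrix}
+ \Psi d_t
+ \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}
\tag{5.3}
\]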
1 Where variables are found to be I(0), Rahbek and Mosconi (1999) suggest a tractable modification
to ensure that the limiting distributions of the co-integration test statistics are invariant to the
presence of the stationary regressors included in $d_t$.
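\[
\begin{bmatrix} \Delta x_{1t} \\ \Delta x_{2t} \end{bmatrix}
= \begin{bmatrix} \alpha_{11} \\ \alpha_{21} \end{bmatrix}
\begin{bmatrix} \beta_{11}' & \beta_{12}' \end{bmatrix}
\begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \end{bmatrix}
+ \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}
\begin{bmatrix} \Delta x_{1,t-1} \\ \Delta x_{2,t-1} \end{bmatrix}
+ \Psi d_t
+ \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}
\tag{5.4}
\]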
2 In practice, long-run exclusion tests are applied in conjunction with co-integration tests to
determine whether multiple co-integrating vectors were indeed due to the presence of stationary
variables in $x_t$ or multiple co-integrating relations among I(1) variables in $x_t$. Since the latter case is
implausible from an economic viewpoint, it is ruled out in the following development.
5.4.1 Preliminaries
The unrestricted model was estimated with a restricted trend and an unrestricted
constant—implying no quadratic growth in the data (Bwire et al. 2016; Juselius
2006). The lag-length was determined as the minimum number of lags that met the
crucial assumption of time independence of the residuals based on a Lagrange
multiplier (LM) test, starting with k = 5—this being quarterly frequency data.
The Schwarz Bayesian criterion (SC) suggests two lags, while both the Hannan-Quinn
(HQ) criterion and the Akaike information criterion (AIC) favor five lags. With two lags, the
LM test does not reject the null hypothesis of no serial correlation in the residuals,
suggesting, inter alia, that the underlying CVAR model has to be estimated using
two lags. In addition, this captures many more dynamics of the system. VAR model
residuals are finally subjected to a battery of residual misspecification tests
(Godfrey 1988), but as shown in Annexure 5.1, the histograms portray reasonably
normal distribution behavior.
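The disagreement between criteria noted above is typical, because SC penalizes extra lags more heavily than HQ, which in turn penalizes more heavily than AIC. A minimal sketch on simulated data follows; the data-generating process, the function names, and the textbook form of the criteria (log determinant of the residual covariance plus a penalty) are assumptions for illustration, not the chapter's estimation code.

```python
import numpy as np

def resid_cov(x, k, max_lag):
    """ML residual covariance of a VAR(k) with intercept, estimated by OLS.

    Every lag length is fitted on the same sample (the first max_lag rows
    are held back) so that the criteria are comparable across k.
    """
    t_eff = x.shape[0] - max_lag
    y = x[max_lag:]
    lagged = [x[max_lag - i: max_lag - i + t_eff] for i in range(1, k + 1)]
    z = np.hstack([np.ones((t_eff, 1))] + lagged)
    coef, *_ = np.linalg.lstsq(z, y, rcond=None)
    resid = y - z @ coef
    return resid.T @ resid / t_eff, t_eff

def select_lags(x, max_lag):
    """Lag length chosen by AIC, HQ and SC for k = 1, ..., max_lag."""
    n = x.shape[1]
    scores = {"AIC": [], "HQ": [], "SC": []}
    for k in range(1, max_lag + 1):
        sigma, t_eff = resid_cov(x, k, max_lag)
        logdet = np.linalg.slogdet(sigma)[1]
        n_par = k * n * n  # lag coefficients only; the intercept is common to all k
        scores["AIC"].append(logdet + 2.0 * n_par / t_eff)
        scores["HQ"].append(logdet + 2.0 * np.log(np.log(t_eff)) * n_par / t_eff)
        scores["SC"].append(logdet + np.log(t_eff) * n_par / t_eff)
    return {name: int(np.argmin(vals)) + 1 for name, vals in scores.items()}

# Simulated three-variable VAR(2) with 104 observations (hypothetical data).
rng = np.random.default_rng(0)
n_obs, n_var, max_lag = 104, 3, 5
x = np.zeros((n_obs, n_var))
for t in range(2, n_obs):
    x[t] = 0.5 * x[t - 1] - 0.2 * x[t - 2] + rng.standard_normal(n_var)

chosen = select_lags(x, max_lag)
print(chosen)
```

When all criteria are computed on the same sample, the lag chosen by SC is never larger than the one chosen by HQ, which is never larger than the AIC choice for samples of this size. In practice the choice would still be checked against residual autocorrelation tests, as the text describes.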
\[ H_0: \beta = (\beta_1', \beta_2) \tag{5.5} \]
Table 5.4 Estimates of long-run relationships for different normalizations of the fiscal
equilibrium

Domfin      C_spending  K_spending  Aid         Tax_Rev
Coefficients of co-integrating relationship ($\beta$)
−1.000      1.886       1.359       −0.835      −2.533
(NA)        (7.029)     (4.826)     (−5.946)    (−6.473)
0.530       −1.000      −0.721      0.443       1.343
(7.506)     (NA)        (−5.101)    (6.636)     (9.660)
0.736       −1.387      −1.000      0.614       1.863
(5.745)     (−5.686)    (NA)        (4.411)     (10.739)
−1.198      2.259       1.629       −1.000      −3.034
(−8.314)    (8.690)     (5.182)     (NA)        (−7.596)
−0.395      0.745       0.537       −0.330      −1.000
(−6.147)    (8.591)     (8.568)     (−5.159)    (NA)
Adjustment coefficients ($\alpha$)
−0.378      −0.079      0.067       0.215       0.097
(−5.467)    (−2.768)    (3.417)     (6.146)     (6.695)

Note The rows of ($\beta$) represent different normalizations of the only uncovered co-integrating
relationship (t-ratios in parentheses). The adjustment coefficients ($\alpha$) are those obtained from
normalizing the co-integrating vector on Domfin; p-values in brackets
Overall, more than three-fourths of the aid contributed to spending, which is plausible
and is consistent with aid being fully additional. Note that our measure of aid
included project grants and not all of these are included directly as government
spending, so there is no implication that aid has not been additional.
Relative to the coefficient on aid, the coefficient on tax revenue is larger: 1.34 for
recurrent spending and 1.86 for capital spending, suggesting that spending
over-responds to tax revenue. One interpretation is overoptimism regarding the
sustainability of tax increases: the government commits to spending expected
revenues, and if these are not realized, it resorts to some other deficit financing. This,
however, is reflective of poor budget management. The $\alpha$ coefficients suggest that
current spending and domestic financing adjust quite quickly to disequilibrium.
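Because the rows of Table 5.4 are renormalizations of a single co-integrating vector, each row should equal the first row rescaled so that the chosen variable has coefficient −1. The sketch below checks this using only the printed coefficients; `renormalize` is a helper name introduced for this illustration.

```python
names = ["Domfin", "C_spending", "K_spending", "Aid", "Tax_Rev"]

# Rows of the beta block of Table 5.4, in the order printed.
table = [
    [-1.000, 1.886, 1.359, -0.835, -2.533],
    [0.530, -1.000, -0.721, 0.443, 1.343],
    [0.736, -1.387, -1.000, 0.614, 1.863],
    [-1.198, 2.259, 1.629, -1.000, -3.034],
    [-0.395, 0.745, 0.537, -0.330, -1.000],
]

def renormalize(vector, j):
    """Rescale the vector so that element j equals -1."""
    scale = -vector[j]
    return [v / scale for v in vector]

base = table[0]
max_gap = 0.0
for j, name in enumerate(names):
    rebuilt = renormalize(base, j)
    gap = max(abs(a - b) for a, b in zip(rebuilt, table[j]))
    max_gap = max(max_gap, gap)
    print(f"normalized on {name:>10}: " + "  ".join(f"{v:+.3f}" for v in rebuilt))

print(f"largest gap from the printed table: {max_gap:.4f}")
```

The small remaining gaps reflect rounding in the printed table. The tax revenue coefficients of 1.34 and 1.86 quoted in the text are the Tax_Rev entries of the rows normalized on recurrent and capital spending.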
The results for the co-integrating relations in Table 5.4 imply that all variables
are significant, so to provide an empirical content to the structural analysis
underlying the causal link between aid and domestic fiscal variables, we now focus
on two types of long-run parameter restrictions described in Propositions II and III
earlier.
Table 5.5 gives the results of the test for Proposition II, that is, long-run variable
exclusion (zero restrictions on each $\beta_i$), and Proposition III, that is, weak exogeneity
(zero restrictions on each $\alpha_i$) for r = 1, based on the likelihood ratio
(LR) test distributed as $\chi^2(r)$. Consistent with the results in Table 5.4, the null
hypothesis of the exclusion of the long-run variable is rejected for all variables
(robust to small sample bias correction). Of particular interest is that aid is a
significant element of a long-run fiscal equilibrium, so it supports spending, just like
tax revenue and domestic borrowings.
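A hedged sketch of how such an LR test could be computed for r = 1 is given below. It assumes the standard form of the statistic, which compares the eigenvalues of the restricted and unrestricted models; the eigenvalues and the sample size are hypothetical and are not taken from Table 5.5.

```python
import math

def lr_statistic(n_obs, restricted, unrestricted):
    """LR statistic T * sum(ln((1 - lam_restricted) / (1 - lam_unrestricted)))."""
    return n_obs * sum(
        math.log((1.0 - lam_r) / (1.0 - lam_u))
        for lam_r, lam_u in zip(restricted, unrestricted)
    )

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical inputs: one eigenvalue (r = 1) with and without the restriction.
n_obs = 102
stat = lr_statistic(n_obs, restricted=[0.36], unrestricted=[0.45])
p_value = chi2_sf_1df(stat)
print(f"LR = {stat:.2f}, p-value = {p_value:.4f}")
```

With a single zero restriction and r = 1 the reference distribution has one degree of freedom, which is why the closed-form tail probability based on `math.erfc` suffices here.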
Long-run weak exogeneity is also rejected for all variables in the system and,
importantly, for aid at conventional levels. As in Franco-Rodriguez et al. (1998), this
is consistent with fiscal planners having a target for aid revenue that is taken into
account while forming the budget. Bwire et al. (2013) had a similar result for
Uganda. As in Uganda, donors incorporate government spending
in deciding how much aid to allocate to Rwanda (Bwire et al. 2013; Foster and
Killick 2006: 19). Fiscal planners in Rwanda have a forward-looking view and have
achieved reasonable success in getting more aid allocated as budget support and
released early in the budget year. In Rwanda, the rejection of weak exogeneity of aid
suggests that aid has been responsive to within-year budget planning.
In Rwanda, endogeneity of both current and capital spending, as suggested by the
results, appears counterintuitive as spending is very difficult to reverse once
implemented (especially if it involves increases in public payrolls or statutory
expenditures). However, it implies that government spending is planned based on
expected revenues, whereas the allocation is affected when the revenue outcome is
realized; that is, spending allocations respond to revenue outturn. While it is
surprising that weak exogeneity of domestic borrowings is also rejected,
both trend developments in Fig. 5.1 and estimates of the long-run relation in
Table 5.4 suggest that it is determined by factors other than domestic fiscal variables,
that is, it depends on aid outturn but not tax revenue.
Turning to the direction of causality, two issues are of interest: (1) whether past
values of the fiscal variables do not influence current values of aid, whether in terms
of long or short-run behaviors; and (2) whether aid is Granger non-causal for the
domestic budget. Results of block exogeneity, given in Table 5.6, suggest that
domestic fiscal variables influence current values of aid, allowing for the possibility
in particular, that government sets spending targets according to its development
objectives and then tries to find aid resources to finance these ambitions, albeit with
some level of unpredictability. This, however, should be interpreted with caution as
it does not imply that the authorities have control over aid allocations by donors
(aid commitments). Instead, as in Eifert and Gelb (2005), the disbursements could
be a reaction to the government’s ability to meet a donor’s administrative
requirements and/or other policy preconditions. As has been the case elsewhere, it
may also reflect exercising incentive clauses by donors in response to events over
which the Rwandan government has some direct control in the context of an
ongoing aid relationship (O’Connell et al. 2008).
The hypothesis that aid is Granger non-causal for the domestic budget is rejected
for domestic financing and for capital spending, although the rejection is only weakly
significant. This is consistent with estimates of the long-run parameters and implies in part that
the level of domestic debt is strongly influenced by the level of aid outturn, such that
the higher the level of aid outturn, the lower the fiscal deficit to finance, or that aid
enhances the authorities’ ability to repay domestic debt. Elsewhere, the weak results
arise because, over time, aid as a share of the budget has become numerically small as
the country strives to become self-reliant.
This paper assessed the dynamic relationship between foreign aid and domestic
fiscal variables in Rwanda over 1990Q1 to 2015Q4 using a CVAR model. An
investigation of the long-run relation between the fiscal variables provided inter-
esting insights into the fiscal dynamics in Rwanda.
Aid and fiscal variables form a long-run stationary relationship. Aid is a sig-
nificant element in the long-run fiscal equilibrium, and the hypothesis of aid exo-
geneity is not statistically supported; that is, anticipated aid appears to have been
taken into account in budget planning. Rwandan budget planners may have had a
target for aid revenue or the donors incorporated government spending in deciding
how much aid to allocate to Rwanda or a combination of both. This implies that the
government set its spending targets according to its own development objectives
and then tried to find resources to finance these ambitions in a priority order of
domestic revenue, aid, and domestic borrowings. As improved public finance
management and reduced domestic borrowings are common policy conditions
attached to aid, the results suggest that aid was either associated with or caused
beneficial policy responses in Rwanda.
Aid was associated with increased tax efforts, lower domestic borrowings, and
increased public spending. Although the results suggest that spending was less than
proportional to incremental aid, this was most probably because our measure of aid
included project grants and not all of these are included directly as government
spending, so this is consistent with aid being fully additional. It is evident that
spending was higher than it could have been in the absence of aid. As tax revenue
share of GDP relative to sub-Saharan African standards remained small over the
period, the government was unable to maintain a budget balance including aid, so
domestic borrowings remained frequent (with repayments in years of high aid).
These results suggest some policy implications. Corroborations from the trend
analysis and estimates of the long-run coefficients suggest that domestic borrowings
remain responsive to the uncertainties associated with aid inflows. Spending targets
appear to have been formed according to anticipated aid and shortfalls in aid
outturns induced domestic borrowings. If donors ensured that aid disbursements
were more reliable and predictable, the Rwandan authorities could improve fiscal
planning and reduce the instability associated with unanticipated deficits and the
need to resort to costly domestic borrowings. Of course, some of the aid volatility
arises because of absorption problems or failure to comply with conditionalities, so
the Rwandan authorities also have a significant role to play in ensuring a stable aid
relationship.
A comprehensive analysis of the relationship between aid and government
spending requires reliable data on aid received by the government. This is a
deficiency in almost all government statistics, including Rwanda's, and should be
addressed. Project grants related to donor-operated projects cannot increase recorded
spending as they do not go through the budget. If the government is aware of
donor projects, this could reduce government spending in that area. Therefore,
continued efforts by donors to coordinate aid delivery systems, make aid more
transparent, and support improvement in government fiscal statistics will contribute
to improving fiscal planning. Recipients need to know how much aid is available to
finance spending and how this is delivered, that is, whether through donor projects
or government budgets.
Acknowledgements The authors are grateful to an anonymous referee for constructive comments
and participants of the EABEW 2016 conference held in Kigali on 20–22 June who heard evolving
versions. The usual disclaimers apply. The data used in the analysis are available on request.
Annexure 1
1. Residual plots
Figure 5.2 is a panel containing four plots for each error correction model
equation: (a) actual and fitted values (top left); (b) standardized residuals (bot-
tom left); (c) auto-correlations (top right); and (d) histogram (bottom right).
Overlaid on the histogram is the estimated density function of the standardized
residuals (appears as a dotted line in print) and the density of the standard
normal distribution. It also contains some statistics: the univariate normality test
by Doornik and Hansen (DH) (2008) and Kolmogorov–Smirnov
(KS) (Lilliefors 1967) test for normality, and the Jarque-Bera test computed by
the RATS’ statistics instruction (Dennis 2006).
The actual and fitted residuals show an outlying observation in about 2013 in
virtually all residuals, except for tax revenue and domestic financing. This
notwithstanding, the histograms portray reasonably normal distribution
behavior.
[Figure: one panel per error correction equation, each with four plots: actual and fitted values, standardized residuals, autocorrelations (lags up to 24), and a histogram of the standardized residuals; time axes run from 1990 to 2014. Normality statistics printed on the histograms:]
DTAX_REV: SB-DH ChiSqr(2) = 79.51 [0.00]; K-S = 0.92 [5% C.V. = 0.09]; J-B ChiSqr(2) = 191.29 [0.00]
DAID: SB-DH ChiSqr(2) = 165.18 [0.00]; K-S = 0.92 [5% C.V. = 0.09]; J-B ChiSqr(2) = 588.97 [0.00]
DK_SPEND: SB-DH ChiSqr(2) = 17.07 [0.00]; K-S = 0.87 [5% C.V. = 0.09]; J-B ChiSqr(2) = 43.61 [0.00]
DCURRENT_SPEND: SB-DH ChiSqr(2) = 69.08 [0.00]; K-S = 0.96 [5% C.V. = 0.09]; J-B ChiSqr(2) = 2843.89 [0.00]
DDOMFIN: SB-DH ChiSqr(2) = 104.81 [0.00]; K-S = 0.92 [5% C.V. = 0.09]; J-B ChiSqr(2) = 340.04 [0.00]
Fig. 5.2 Actual, fitted, and standardized residuals, auto-correlations, and histograms
choose the rank so that the largest unrestricted root is far from a unit root; that is,
it has modulus lower than 1. The model here is defined for p = 5, k = 1,
implying p × k = 5 roots in the characteristic polynomial (i.e., we assume full
rank of the $\Pi$ matrix). These are shown in Fig. 5.3, and as expected, all roots are
inside the unit circle.
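The root check described above can be sketched as follows. The coefficient matrix is hypothetical (it is not the estimated Rwandan model), and `companion_moduli` is a helper name introduced for this illustration; the roots are the eigenvalues of the companion matrix of the VAR.

```python
import numpy as np

def companion_moduli(coef_matrices):
    """Moduli of the eigenvalues of the companion matrix of a VAR(k)."""
    k = len(coef_matrices)
    p = coef_matrices[0].shape[0]
    companion = np.zeros((p * k, p * k))
    companion[:p, :] = np.hstack(coef_matrices)      # first block row: A_1 ... A_k
    if k > 1:
        companion[p:, :-p] = np.eye(p * (k - 1))     # identity blocks below
    return np.sort(np.abs(np.linalg.eigvals(companion)))[::-1]

# Hypothetical VAR(1) coefficient matrix for p = 5 variables (k = 1),
# so the companion matrix is 5 x 5 and there are p * k = 5 roots.
a1 = 0.4 * np.eye(5) + 0.1 * np.ones((5, 5))
moduli = companion_moduli([a1])

print(np.round(moduli, 3))
print("all inside the unit circle:", bool(np.all(moduli < 1.0)))
```

A largest modulus close to 1 would suggest a remaining unit root, which is the situation the rank choice is meant to avoid.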
References
Adam C, O’Connell S (1999) Aid, taxation, and development in Sub-Saharan Africa. Econ Polit
11:225–253
Bulir A, Javier Hamman A (2003) Aid volatility: an empirical assessment. IMF Staff Papers
50(1):64–89
Bwire TM, Morrissey O, Lloyd T (2013) A time series analysis of the impact of foreign aid on
Central Government’s fiscal budget in Uganda. WIDER Working Paper No. 2013/101
Bwire TM, Morrissey O, Lloyd T (2016) Fiscal reforms and the fiscal effects of aid in Uganda.
J Dev Stud, forthcoming
Chenery H, Strout W (1966) Foreign assistance and economic development. Am Econ Rev
56:679–753
Dennis GJ (2006) CATS in RATS: cointegration analysis of time series, version 2. Estima,
Evanston, Illinois, USA
Doornik JA (1998) Approximations to the asymptotic distribution of cointegration tests. J Econ
Surv 12:573–593
Doornik JA, Hansen H (2008) An omnibus test for univariate and multivariate normality. Oxford
Bull Econ Stat 70:927–939
Eifert B, Gelb A (2005) Improving the dynamics of aid: towards more predictable budget support.
The World Bank, Washington, DC, Policy Research Working Paper 3732
Engle RF, Hendry DF, Richard JF (1983) Exogeneity. Econometrica 51:277–304
Foster M, Killick T (2006) What would doubling aid do for macroeconomic management in
Africa? Overseas Development Institute, London, ODI Working Paper 264
Franco-Rodriguez S, McGillivray M, Morrissey O (1998) Aid and the public sector in Pakistan:
evidence with endogenous aid. World Dev 26:1241–1250
Abstract This study examines the impact of economic stability measures (inflation
and unemployment rates) on real gross domestic product (GDP) in Rwanda. It uses
quarterly data for the period 2000Q1–2015Q4 collected from the Ministry of
Finance and Economic Planning, the Central Bank of Rwanda, and the National
Institute of Statistics of Rwanda (NISR). The study concludes that inflation and
unemployment have a long-run negative and significant relationship with real gross
domestic product. In the long run, not all coefficients are significant at the 5% level;
only the inflation coefficient and the error correction term are significant. Real gross
domestic product increases when inflation falls (p-value of 0.00266) and
when unemployment falls (p-value of 0.09882). The coefficient of 0.0483 from the error
correction model means that the effect of a shock is reduced by 4.83% each quarter,
or by 19.32% over four quarters. This further means that the effect will end after about 20
quarters, that is, after a five-year period. It has to be highlighted that there is a weak
relationship between real gross domestic product and both inflation and unemployment
rates.
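The adjustment arithmetic in the abstract can be reproduced with a short sketch. It takes the quoted coefficient of 0.0483 as given; the geometric-decay calculation is an alternative reading added here for comparison and is not a claim made by the study.

```python
import math

adjustment = 0.0483  # error-correction coefficient quoted in the abstract

per_quarter_pct = 100 * adjustment               # share of the gap closed per quarter
four_quarter_linear_pct = 4 * per_quarter_pct    # simple sum over four quarters
linear_quarters = 1 / adjustment                 # quarters needed at a constant speed

# Geometric reading: a fraction (1 - adjustment) of the gap survives each quarter.
remaining_after_20 = (1 - adjustment) ** 20
half_life = math.log(0.5) / math.log(1 - adjustment)

print(f"per quarter: {per_quarter_pct:.2f}%")
print(f"over four quarters (linear): {four_quarter_linear_pct:.2f}%")
print(f"quarters to full adjustment (linear): {linear_quarters:.1f}")
print(f"share of shock remaining after 20 quarters (geometric): {remaining_after_20:.2f}")
print(f"half-life in quarters (geometric): {half_life:.1f}")
```

The linear reading matches the figures of 19.32% and roughly 20 quarters in the abstract, while under geometric decay part of the shock would still remain after five years.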
JEL Classification E4 E5 E6
6.1 Introduction
For all countries, both developed and developing, one of the fundamental objectives
of macroeconomic policy is economic stability. Economic stability refers to an
economy that experiences constant growth and low inflation. Advantages of having
a stable economy include increased productivity, improved efficiencies, and low
unemployment. The common signs of instability are extended time in a recession or
crisis, rising inflation, and volatility in currency exchange rates. An unstable
economy leads to a decline in consumer confidence, stunted economic growth, and
reduced international investments. The main goals of any government usually
include economic growth, price stability, and low unemployment. The most
important means of moving toward these goals are detailed tax policies, spending,
regulation, and government management. However, the macroeconomic levers of
the fiscal stance and monetary policy also play a part. Attaining sustainable
economic growth coupled with price stability continues to be the central objective of
macroeconomic policies for most countries in the world today. Among others, the
emphasis on price stability in conducting monetary policy is with a view to
promoting sustainable economic growth as well as strengthening the purchasing power
of the domestic currency (Umaru and Zubairu 2012).
The question of whether inflation is harmful to economic growth has recently been
a subject of intense debate among policymakers and macroeconomists. Several
studies have estimated a negative relationship between inflation and economic
growth. Studies that base their arguments on real business cycle theories should
also ground them in country-level evidence (Pradana and Rathnayaka 2013).
Luppu (2009) established a positive relationship between inflation and GDP
growth in Romania in the short run. This implies that as inflation increases, GDP
also increases in the short run and, conversely, that when inflation decreases, GDP
decreases. Drukker et al. (2005) found that if the inflation rate is below
19.16%, increases in inflation do not have a statistically significant effect on growth,
but when inflation is above 19.16%, a further increase in inflation decreases
long-run growth.
Mallik and Chowdury (2001) indicate a long-run positive relationship between
the GDP growth rate and inflation in four South Asian countries. The bone of
contention is whether inflation is necessary for economic growth or detrimental
to it.
World economic growth and inflation rates have been fluctuating. Likewise,
inflation rates have been dominating when compared to growth rates over many
years; hence, the relationship between inflation and economic growth has continued
to be one of the most significant macroeconomic problems (Madhukar and
Nagarjuna 2011). Similarly, Ahmed (2010) maintains that this relationship has been
argued in economic literature, and these arguments show differences in relation to
the condition of the world economic order. In accordance with these policies,
increases in total demand have led to increases in production and inflation too.
6 Relationship Between Inflation and Real Economic Growth in Rwanda 105
In the 1970s, countries with high inflation, especially Latin American countries,
started experiencing a decrease in growth rates which led to the emergence of views
which stated that inflation had negative effects and not positive effects on economic
growth. Evidence from Asian countries such as India shows that GDP growth
increased from 3.5% in the 1970s to 5.5% in the 1980s, while the inflation rate
accelerated steadily from an annual average of 1.7% during the 1950s to 6.4% in
the 1960s and further to 9.0% in the 1970s before easing marginally to 8.0% in the
1980s (Prasanna and Gopakumar 2010). Similarly, Xiao (2009) shows that from
1961 to 1977, China’s real GDP growth and real GDP per capita growth averaged
4.84 and 2.68%, respectively. Since 1978, China’s economy has grown steadily,
although the growth rate has fluctuated from year to year; from 1978 to 2007, the
growth rates of China’s real GDP and real GDP per capita were recorded at 9.992
and 8.69%, respectively.
A study by Stein (2010) shows that in East African countries, Kenya had five
years of very positive economic development with four consecutive years of above
4% growth. The same study shows that Uganda was one of the fastest growing
economies in Africa with sustained growth averaging 7.8% since 2000 with the
annual inflation rate decreasing from 5.1% in 2006 to 3.5% in 2009. The average
annual real GDP growth rate for Rwanda in 1990–99 was −0.1%, but from 2006 to
2009, the country had an annual average growth rate of 7.3%.
Since the late 1970s, the Tanzanian economy has experienced many internal and
external shocks. Kilindo (1997) documents the issues and maintains that all sectors
of the economy were affected by shocks, whose manifestations included large
budget deficits and an imbalance between productive and non-productive activities.
He also argues that the signs closely associated with these were high rates of
inflation, large balance of payment (BOP) deficits, declining domestic savings,
growing government expenditure, falling agricultural produce, and decreased
utilization of industrial capacity, which in turn hindered economic growth.
Macroeconomists, central bankers, and policymakers have often emphasized the
costs associated with high and variable inflation. Inflation imposes negative
externalities on the economy when it interferes with its efficiency. Examples of
these inefficiencies are not hard to find, at least at the theoretical level. Inflation can
lead to uncertainty about future profitability of investment projects (especially when
high inflation is also associated with increased price variability). This leads to more
conservative investment strategies than would otherwise be the case, ultimately
leading to lower levels of investments and economic growth. Inflation may also
reduce a country’s international competitiveness by making its exports relatively
more expensive, thus impacting its balance of payments (Gokal and Hanif 2004).
The conventional view in macroeconomics holds that permanent and predictable
changes in inflation rates are neutral, and they do not affect real activity in the long
run. However, a substantial body of evidence suggests that sustained high inflation
rates can have adverse consequences for real economic growth even in the long run.
Nowadays, a consensus among economists seems to be that high rates of inflation
cause ‘problems’ not just for some individuals but for aggregate economic
106 F. Nkikabahizi et al.
performance. However, there is much less agreement about the precise relationship
between inflation and economic performance and the mechanism by which inflation
affects economic activity. The effects of permanent increases in the inflation rate for
long-run activity seem to be quite complicated.
The consensus about the adverse effects of inflation on real economic growth
reveals only a small part of the whole picture. Recently, intensive research has
focused on the nonlinear relationship between these two variables. That is, at lower
rates of inflation, the relationship is not significant or even positive, but at higher
rates, inflation has a significantly negative effect on growth. Bruno and Easterly
(1998) demonstrate that a number of economies have experienced sustained
inflation of 20–30% without suffering any apparently major adverse consequences.
However, once the rate of inflation exceeds some critical level (estimated at 40%),
significant declines occur in the level of real activity. The relationship between
inflation and economic growth is one of the most important economic controversies
among economists, policymakers, and monetary authorities. In particular, the core
of the argument is whether inflation is necessary for economic growth or harmful
to it. Although the relationship between inflation and economic growth has been
widely examined, it remains debated in the economic literature.
This section discusses empirical studies of the relationship between inflation and
economic growth. Previous studies were concerned not only with finding a simple
relationship between inflation and economic growth, but also with whether the
relationship held in the long run or was just a short-run phenomenon, with the
causal direction of the relationship, and with whether the relationship was linear
or nonlinear.
Adam Smith founded the classical theory. He recognized three factors of production:
land, labor, and capital. His production function can be expressed as Y = f(L, K, T),
where Y is output, L is labor, K is capital, and T is land. Smith considered saving as the
most important factor affecting the growth rate. In classical theories, there is no direct
explanation of inflation and its tax effect on profit levels and output. But the
relationship between the two variables is implicitly negative by a reduction in firms’ profit
levels and savings through higher wage costs (Gokal and Hanif 2004).
In 1936, John Maynard Keynes wrote The General Theory of Employment,
Interest and Money, which established the foundation of Keynesianism. Keynesians
believe that the government has to intervene to reach full production. They believe
that intervention by the government in the economy through expansionary
economic policies will boost investment and promote demand to reach full
production. The Keynesian model is based on aggregate demand (AD) and aggregate
supply (AS) curves. In this model, the AS curve is upward sloping in the short run,
so that a change in the demand side of the economy affects both price and output
(Dornbusch et al. 1996).
Dornbusch et al. (1996) have also argued that AD and AS yield an adjustment
path which shows an initial positive relationship between inflation and economic
growth but which eventually turns negative toward the latter part of the adjustment
path. The initial positive relationship between inflation and economic growth is due
to the time inconsistency problem. Producers feel that only the prices of their
products have increased, while the other producers are operating at the same price
level. However, in reality, overall prices have increased. Therefore, the producers
continue with more and more output. Moreover, according to Blanchard and
Kiyotaki (1987), inflation and economic growth are positively related because of
firms’ agreement to supply on an agreed price. So a firm has to produce even at
increased prices. Later on, the relationship becomes negative. This describes the
phenomenon of stagflation, that is, output decreases or remains the same when
prices increase (Gokal and Hanif 2004).
‘Stagflation’ is a phenomenon that incorporates high inflation and low growth or
high unemployment; this dominated almost all developed countries in the middle of
the 1970s. Monetarism was proposed by Milton Friedman. For this school of
thought, money supply is the only factor that determines price levels in an
economy. Monetarists argue that government intervention should manage the growth
rate of money supply to harmonize it with the growth rate of output in the long run.
They also argue that inflation will occur when money supply increases faster than the
rate of growth of national income. But the effect of money supply is different in the
long run and the short run. In the short run, money supply has the dominant influence
on real variables (real GDP and employment) and price levels. But in the long run, the
influence of variations in the money supply is primarily on price levels and on
other nominal variables, not on real variables such as real output and
employment (Richard 1998).
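The monetarist claim above is commonly summarized with the quantity equation of
exchange. The chapter does not write it out, so the symbols below (M for money
supply, V for velocity, P for the price level, Y for real output, g for growth rates,
and π for inflation) are assumptions introduced here purely for illustration.

```latex
M V = P Y
\quad\Longrightarrow\quad
g_M + g_V \approx \pi + g_Y
\quad\Longrightarrow\quad
\pi \approx g_M - g_Y \quad \text{when velocity is constant } (g_V = 0).
```

Read this way, inflation arises when the money supply grows faster than real income,
which is the statement made in the text; long-run neutrality corresponds to g_Y being
unaffected by g_M.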
Monetarism also introduces the concept of anticipation and divides the Phillips
curve into a short-run and a long-run curve (Gokal and Hanif 2004).
For this theory, the Phillips curve holds in the short run but not in the long run. In
the long run, anticipated inflation will be consistent with actual inflation. So
inflation will not influence unemployment, output, and other real economic
variables. This concept is called neutrality of money. Gokal and Hanif (2004) explain
the concepts of neutrality and super-neutrality as follows: neutrality holds if the
equilibrium values of real variables, including the level of GDP, are independent of
the level of the money supply in the long run, and super-neutrality holds when real
variables, including the GDP rate of growth, are independent of the rate of growth in
the money supply in the long run. Inflation will be harmless in the case of neutrality
and super-neutrality. But this may not be true in reality. Inflation is bad for the
economy because it affects capital accumulation, investments, and exports and
hence, affects output.
between inflation and return rate on capital will depend on the relationship between
the real money balance and investment. As discussed in the part of neoclassical
theory and as also discussed in Mundell (1963) and Tobin’s (1965) models and in
Haslag (1997) and Stockman (1981), if real money balances substitute investment,
inflation will decrease the return on real money balances, but the return on
investment will increase. But if real money balances complement investment,
inflation will have a negative effect on growth.
Like theoretical models, existing empirical studies too reflect different views on the
relationship between inflation and output growth. Their findings differ depending
on the data period and countries, suggesting that the association between inflation
and growth is not stable. Still, economists now widely accept the existence of a
nonlinear and concave relationship between these two variables. The traditional
point of view did not consider inflation an important factor in the growth
equation. This is reflected in the studies of Dorrance (1963) and Johanson (1967),
who did not find any significant impact of inflation on growth in the 1960s.
Nevertheless, the traditional point of view changed when high and chronic inflation
was present in many countries in the 1970s; as a result, different researchers showed
that inflation had a negative impact on output growth.
Fischer (1993) and De Gregorio (1992, 1996) investigated the link between
inflation and growth in time series, cross-sectional and panel datasets for a large
number of countries. The main result of these works is that there is a negative
impact of inflation on growth. Fischer (1993) argues that inflation hampers the
efficient allocation of resources due to harmful changes in relative prices. At
the same time, relative prices appear to be one of the most important channels in the
process of efficient decision making.
Barro (1987) studied the relationship between inflation and economic growth.
He used 30 years of data for 100 countries, from 1960 to 1990. He included other
determinants of economic growth besides inflation. To analyze the data, he used a
system of regression equations. The regression results indicated that an increase in
average inflation by 10% per year led to a reduction in the growth rate of real per
capita GDP by 0.2–0.3% per year and a decrease in the ratio of investment to GDP
by 0.4–0.6%. But the result is statistically significant only when high inflation
experiences are included in the sample.
Investigations into the existence and nature of the link between inflation and
growth have had a long history. Although economists now widely accept that
inflation has a negative effect on economic growth, researchers did not detect this
effect in data in the 1950s and the 1960s. A series of studies in the IMF Staff Papers
around 1960 found no evidence of damage from inflation (Bhatia 1960; Dorrance
1963, 1966; Wai 1959). Johanson (1967) found no conclusive empirical evidence
for either a positive or a negative association between the two variables. Therefore,
a popular view in the 1960s was that the effect of inflation on growth was not
particularly important.
Motley (1994) includes inflation in his model to examine the effect of inflation
on the real GDP growth rate. He extended the model developed by Mankiw et al.
(1992), which was based on the Solow growth model by allowing for the possibility
that inflation tended to reduce the rate of technical change. The result indicates a
negative relationship between inflation and the growth rate of real GDP. Khan and
Senhadji (2001) analyzed the relationship between inflation and economic growth
separately for industrial and developing countries. They used new econometric
techniques initially developed by Chan and Tsay (1998) and Hansen (2000) to
show the existence of threshold effects in the relationship between inflation and
economic growth. The authors used an unbalanced panel dataset containing 140
countries for the period 1960–98. The estimated value of the threshold was 1–3%
for industrial countries and 11–12% for developing countries. The results thus
indicated that the threshold for industrial countries was lower than that for
developing countries. They also indicated that inflation below the threshold level
had no effect on growth, whereas inflation rates above the threshold level had a
significant negative effect on growth.
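The idea of a threshold effect can be made concrete with a two-regime regression. The
form below is a simplified sketch and not the exact specification of the studies cited
above, which may differ (for example, in how inflation is transformed); the symbols g
for growth, π for inflation, k for the threshold, X for other controls, and I(·) for an
indicator function are assumptions introduced here.

```latex
g_{it} = \beta_0
 + \beta_1\,(\pi_{it} - k)\,I(\pi_{it} \le k)
 + \beta_2\,(\pi_{it} - k)\,I(\pi_{it} > k)
 + \theta' X_{it} + \varepsilon_{it}
```

In this notation, the finding reported in the text corresponds to an estimate of β1 that
is not significantly different from zero and an estimate of β2 that is significantly
negative, with k estimated from the data.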
Mubarik (2005) also estimated the threshold level of inflation for Pakistan. He
found a 9% threshold level of inflation as inflation above this level affected the
economic growth negatively. But inflation below the estimated level was conducive
to economic growth.
Some other studies have shown that the link between inflation and growth is
significant only for certain levels of inflation. For instance, Bruno and Easterly
(1995) studied the inflation–growth relationship for 26 countries over 1961–92.
They found a negative relationship between inflation and growth when the level of
inflation exceeded some threshold. At the same time, they showed that the impact of
low and moderate inflation on growth was quite ambiguous. They argue that in this
case inflation and growth were influenced jointly by different demand and supply
shocks, and thus no stable pattern existed.
Numerous empirical studies have found that the inflation–growth interaction is
nonlinear and concave. Fischer (1993) was the first to investigate this nonlinear
relationship. He used cross-sectional data covering 93 countries and used the
growth accounting framework to detect the channels through which inflation
impacted growth. He found that inflation influenced growth by decreasing
productivity growth and investment. Moreover, he also showed that the
effect of inflation was nonlinear with breaks at 15 and 40%. Sarel (1995) found
evidence of a structural break in the interaction between inflation and growth. He
used the fixed effects technique on a panel data sample covering 87
countries over 21 years (1970–90). His main result is that the estimated threshold
level equaled 8%; inflation above this level had a negative, powerful, and robust
impact on growth.
Mubarik (2005) analyzed the causal relationship between inflation and economic
growth. His test results indicated that causality between the two variables was
unidirectional, that is, inflation caused GDP growth but not vice versa. Chimobi
(2010) studied inflation and economic growth in Nigeria and found unidirectional
causality from inflation to growth. Erbaykal and Okuyan (2008) analyzed the causal
relationship between inflation and economic growth in the framework of the
causality test. Their results indicated no causality running from economic
growth to inflation, whereas there was causality running from inflation to
economic growth.
In addition to unidirectional causality from inflation to economic growth and
bilateral causality, there are also studies which indicate unidirectional causality
from growth to inflation. Gokal and Hanif (2004) studied inflation and economic
growth in Fiji. They concluded that Granger causality runs one way, from growth to
inflation but not from inflation to growth, that is, it is unidirectional. Datta
and Kumar (2011) examined the relationship between inflation and economic
growth in Malaysia with data from 1971 to 2007. Their findings show that there
exists short-run causality between the variables and that the direction of causality is
from inflation to economic growth, whereas in the long run economic growth
Granger-causes inflation.
Finally, there are also studies which indicate no causality relationship between
inflation and economic growth. Kigume (2011) studied inflation and economic
growth in Kenya from 1963 to 2000. The Granger causality test of his study showed
no causality relation between these two variables.
Many authors such as Luppu (2009) and Mallik and Chowdury (2001) who carried
out research on related subjects found that both inflation and real economic growth
were positively related in the long run, while Pradana and Rathnayaka (2013) show
the existence of a negative relationship between inflation and economic growth.
Drukker et al. (2005) and Bruno and Easterly (1998) indicate that whether inflation
has a positive or a negative impact on economic growth depends on the inflation
rate. Empirical and theoretical evidence thus suggests that the relationship between
inflation and economic growth may be positive, negative, or absent,
which leads to ambiguity about the exact relationship.
In Rwanda, the inflation rate is likely to be stable, which does not stop its
economy from improving as the inflation rate is low in the short run. Existing studies
of the relationship between these variables and economic growth, however, do not
focus on the Rwandan economy, and they all focus on the long-run relationship rather
than the short-run relationship. Our study, which examines the relationship between
inflation and economic growth in Rwanda, will enable other scholars and even
macroeconomists and authorities to know the exact relationship between inflation
and real economic growth in Rwanda, and help macroeconomic policymakers to set
strategies leading to economic stability in Rwanda.
Our paper examines the relationship between inflation and economic growth and
analyzes the causality relationship between the two. The research findings are
significant for monetary policy authorities, business owners, and investors. They are
also important for policymakers as they can get to know the link between inflation
and GDP which will help them decide and set strategies concerning variables by
taking into account the fact that all these variables have an impact on a country’s
well-being. As for researchers, apart from contributing to knowledge about
Rwandan society, inflation, and GDP, our study will also give them an
opportunity to learn about the correlation between inflation and economic growth in
the world, particularly in Rwanda, and its effect on investment decisions and business
performance in Rwanda.
The purpose of our study is to investigate the relationship between inflation and
economic growth in Rwanda and determine whether there is a turning point or a
threshold level of inflation at which the inflation effect on economic growth
switches from positive or insignificant to negative. For the purpose of economic
stability, unemployment rates are also taken into account in our study.
Our study seeks to answer the following questions: (i) Is there a significant
relationship between inflation and unemployment and economic growth? If so, is
the relationship positive or negative? (ii) Is the causality relationship between
inflation and unemployment and economic growth bidirectional, unidirectional
(either from inflation to economic growth or from economic growth to inflation), or
absent? (iii) Is the Rwandan economy stable?
We believe Granger’s (1969) model is simple and is also accurate in supporting the
specificity of the effect of inflation on economic growth in Rwanda. This leads us to
formulate this model in detail so that it is consistent with the hypotheses of the
study, assuming that an increase in inflation rate has a negative effect on economic
growth as the dependent variable. For the economic stability measure, the
unemployment rate is added to the model. The empirical model used for testing the
relationship between real GDP and the inflation and unemployment rates can be
specified by a simple model as
where RGDPt is the Rwandan real gross domestic product, INFRt is inflation rate,
and UNERt is unemployment rate.
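The specification referred to above is not reproduced in this text. A linear form
consistent with the variable definitions, and with the logged series LRGDP used in the
later sections, might be written as follows. The coefficient names, the error term, and
the choice of a log-level form are assumptions, not the authors’ confirmed equation.

```latex
\mathrm{LRGDP}_t = \beta_0 + \beta_1\,\mathrm{INFR}_t + \beta_2\,\mathrm{UNER}_t + \varepsilon_t,
\qquad \beta_1 < 0,\quad \beta_2 < 0 \quad \text{(expected signs)}
```

Here LRGDP_t is taken to be the natural logarithm of RGDP_t, and the expected signs
restate the study’s hypothesis that higher inflation and higher unemployment reduce
real GDP.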
Next, we estimate the following co-integration equations by VAR:
Both long- and short-term relationships were tested using the Johansen
co-integration test and ECM, respectively. VAR was used to estimate all the
parameters.
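The co-integration and error correction equations are likewise not reproduced here. A
generic single-equation error correction form for this set of variables is sketched
below; the lag length p, the coefficient symbols, and the error correction term ECT are
illustrative assumptions rather than the authors’ estimated system.

```latex
\Delta \mathrm{LRGDP}_t = \gamma_0
 + \sum_{i=1}^{p} \gamma_{1i}\,\Delta \mathrm{LRGDP}_{t-i}
 + \sum_{i=1}^{p} \gamma_{2i}\,\Delta \mathrm{INFR}_{t-i}
 + \sum_{i=1}^{p} \gamma_{3i}\,\Delta \mathrm{UNER}_{t-i}
 + \lambda\,\mathrm{ECT}_{t-1} + \varepsilon_t
```

In such a form, ECT_{t-1} is the lagged residual of the long-run relation, and λ
measures the share of a deviation from the long-run relation that is corrected each
period; convergence requires λ to be negative and less than one in absolute value.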
The data used for this study is basically time series data covering the period
2000–15. The two macroeconomic variables included in this study are inflation rate
and unemployment rate as independent variables and the real gross domestic product
at market prices as an indicator to measure economic growth. Data was sourced from
the Central Bank of Rwanda (BNR), the National Institute of Statistics in Rwanda
(NISR), a World Bank report, and the Ministry of Finance and Economic Planning.
We used the following methodology: a test of lags, an analysis of
the stationarity of the series, the Johansen co-integration test, the Granger causality
test, the Chow test for structural breaks, and the short-run relationship model
specification by ECM. We performed an economic interpretation of the
co-integration relation between the variables. We used GRETL for the econometric
analysis, and VAR was adopted for estimating the parameters. The unit root test was
initially performed to find the stationarity properties of each time series. An
augmented Dickey–Fuller (ADF) unit root test was used for this purpose. If any
variable was not stationary in levels, stationarity was then tested on its first
difference. If the variables were stationary at their first difference, the long-run
association of the variables was tested using the co-integration technique. In short,
the stationarity check used the augmented Dickey–Fuller unit root test, while the
Johansen co-integration test was used to confirm the existence of long-run
Investigations into the existence and nature of the link between inflation and growth
have a long history. Although economists now widely accept that
inflation has a negative effect on economic growth, researchers did not detect this
effect in data in the 1950s and the 1960s. A series of studies in the IMF Staff Papers
around the 1960s found no evidence of damage from inflation (Bhatia 1960; Dorrance
1963, 1966; Wai 1959). Johanson (1967) quoted in Ferdous and Shahid (2013) found
no conclusive empirical evidence for either a positive or a negative association
between the two variables. Therefore, a popular view in the 1960s was that the effect
of inflation on growth was not particularly important. Most empirical findings have
established an inverse relationship between inflation and the GDP growth rate. The
persistent increase in general prices of goods and services over time impedes efficient
resource allocation by obscuring the signaling role of relative price changes which is
an important guide to effective decision making (Fischer 1993 quoted in Enu et al.
2013). Inflation makes an economy’s exports relatively expensive, affecting the
balance of payments (BOP) negatively and thereby reducing a country’s international
competitiveness.
In the short run, the relationship between economic growth and unemployment rate
may be a loose one. It is not unusual for the unemployment rate to show a sustained
decline sometime after other broad measures of economic activity have turned
positive. Hence, it is commonly referred to as a lagging economic indicator. Over
an extended period of time, there is a negative relationship between changes in the
rates of real GDP growth and unemployment. This long-run relationship between
the two economic variables was most famously pointed out in the early 1960s by
economist Arthur Okun. ‘Okun’s law’ has been included in a list of ‘core ideas’ that
are widely accepted in the economics profession. Okun’s law, which economists
have expanded upon since it was first articulated, states that real GDP growth about
equal to the rate of potential output growth is usually required to maintain a stable
unemployment rate (Levine 2013). Ernst and Berg (2009) as cited in Mosikari
(2013) explain that high growth is associated with a high degree of employment
intensity which is a necessary condition for the reduction of poverty. See Table 6.1.
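Okun’s law as stated above is often written in a difference form. The version below is
a common textbook sketch, not an equation taken from the chapter; the symbols u for
the unemployment rate, g for real GDP growth, g* for potential output growth, and the
coefficient β are assumptions introduced here.

```latex
\Delta u_t \approx -\beta\,\bigl(g_t - g^{*}_t\bigr), \qquad \beta > 0
```

When actual growth equals potential growth, the right-hand side is zero and the
unemployment rate is stable; growth above potential lowers unemployment, which is
the negative long-run relationship described in the text.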
The unit root test was used to examine the stationarity of the datasets. This enabled
us to avoid the problems of spurious results that are associated with non-stationary
time series models. We used the specific unit root test to check the stationarity of
variables, that is, augmented Dickey–Fuller (ADF). The ADF test is based on the
following regression:
where α is the constant, δ is the slope coefficient, t is a linear time trend, and μ is
the error term (Granger 1969 as cited in Iqbal et al. 2012).
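The regression itself is not reproduced in this text. In its standard form with a
constant and a linear time trend, the Dickey–Fuller regression can be written as
below, with α for the constant, δ for the slope coefficient on the lagged level, and
μ_t for the error term; the series name Y and the trend coefficient β are symbols
assumed here for the sketch.

```latex
\Delta Y_t = \alpha + \beta\,t + \delta\,Y_{t-1} + \mu_t
```

The unit root hypothesis corresponds to δ = 0, and stationarity to δ < 0.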
The simple Dickey–Fuller regression may suffer from auto-correlation in its error
term. To tackle the auto-correlation problem, Dickey and Fuller developed the
augmented test, called the ADF test, which adds lagged differences of the series to
the regression:
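The augmented regression is also missing from this text. Its standard form is sketched
below, with α for the constant, δ for the coefficient on the lagged level, and μ_t for
the error term; the series name Y, the trend coefficient β, the lag order p, and the
coefficients γ_i are symbols assumed here.

```latex
\Delta Y_t = \alpha + \beta\,t + \delta\,Y_{t-1}
 + \sum_{i=1}^{p} \gamma_i\,\Delta Y_{t-i} + \mu_t
```

The lagged differences absorb serial correlation in the error term, so that the test
of δ = 0 remains valid.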
The hypotheses are as follows. Null hypothesis (H0): the variable has a unit root and
is not stationary. Alternative hypothesis (H1): the variable does not have a unit root
and is stationary. To make a non-stationary variable stationary, we difference it once
if it is I(1), or twice if it is I(2), that is, if the series has two unit roots. H0 is
rejected and the series is taken to be stationary when the p-value is below 5%. The
same rule applies when the calculated ADF statistic is greater in absolute value than
the ADF critical value.
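The decision rule above can be illustrated with a small simulation. The sketch below
is not the GRETL procedure used in the study: it computes the simplest Dickey–Fuller
t-statistic (constant only, no trend, no lagged differences) in plain Python on a
simulated stationary series. The function names, the simulated data, and the
approximate large-sample 5% critical value of −2.86 are assumptions made for
illustration.

```python
import random


def dickey_fuller_t(y):
    """t-statistic on delta in: dy_t = alpha + delta * y_{t-1} + u_t."""
    lagged = y[:-1]                                   # y_{t-1}
    diff = [b - a for a, b in zip(y[:-1], y[1:])]     # dy_t = y_t - y_{t-1}
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_d = sum(diff) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diff))
    delta = sxy / sxx                                 # OLS slope
    alpha = mean_d - delta * mean_x                   # OLS intercept
    rss = sum((d - alpha - delta * x) ** 2 for x, d in zip(lagged, diff))
    std_err = (rss / (n - 2) / sxx) ** 0.5            # standard error of delta
    return delta / std_err


def simulate_ar1(phi, n, seed):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


# Approximate large-sample 5% critical value for the constant-only case
# (an assumption for this sketch; software reports exact values and p-values).
CRITICAL_5PCT = -2.86

series = simulate_ar1(phi=0.5, n=500, seed=1)         # stationary by construction
t_stat = dickey_fuller_t(series)
rejects_unit_root = t_stat < CRITICAL_5PCT            # True means: treat as stationary
print(round(t_stat, 2), rejects_unit_root)
```

A statistic more negative than the critical value leads to rejection of H0, which is
the same comparison as the absolute-value rule stated in the text.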
The lag length was selected from the vector auto-regression estimates using the
Akaike information criterion (AIC): the lower the AIC value, the better the model, so
we chose the lag with the lowest AIC value for the whole model. The lag value
selected was therefore 10, which had the lowest AIC value compared to the others.
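The selection rule just described amounts to taking the minimum of a table of AIC
values. The sketch below shows that step only; the AIC figures are made-up
placeholders, not the values estimated in the study, and the function names are
hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_lag(aic_by_lag):
    """Return the lag order with the smallest AIC value."""
    return min(aic_by_lag, key=aic_by_lag.get)


# Hypothetical AIC values per candidate lag (placeholders for illustration only).
aic_by_lag = {1: -3.10, 2: -3.25, 4: -3.60, 8: -3.95, 10: -4.20}

best_lag = select_lag(aic_by_lag)
print(best_lag)
```

With quarterly data for 2000–15 (64 observations), a lag of 10 in a three-variable VAR
uses many parameters, so in practice it is common to compare the AIC choice with
other criteria; that comparison is outside this sketch.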
The Durbin–Watson statistic (0.044) is less than 1, which is evidence of positive
auto-correlation. In a regression analysis using time series data with multiple
interrelated data series, auto-correlation in the variables of interest is typically
modeled with vector auto-regression (VAR).
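To show why a Durbin–Watson value close to zero signals positive auto-correlation,
the sketch below computes the statistic from its definition. The two residual series
are invented for illustration and are not the residuals of the study’s regression.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals[:-1], residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Invented residuals that drift slowly (positive auto-correlation).
persistent = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.5, 0.0, -0.5]
# Invented residuals that flip sign every period (negative auto-correlation).
alternating = [1.0, -1.0, 1.0, -1.0]

dw_persistent = durbin_watson(persistent)     # well below 2
dw_alternating = durbin_watson(alternating)   # well above 2
print(dw_persistent, dw_alternating)
```

The statistic is roughly 2(1 − ρ), where ρ is the first-order auto-correlation of the
residuals, so values near 0 correspond to ρ near 1 and values near 2 to no
auto-correlation.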
The ADF test shows that once LRGDP is transformed into its first difference, the null
hypothesis is rejected and the series becomes stationary, whereas INFR and UNER are
stationary in levels. LRGDP is therefore integrated of order one, I(1), while INFR
and UNER are I(0). All the results from the ADF test are given in Table 6.2.
As the time series variables are stationary, there is no need to test for
co-integration using the Engle–Granger and Johansen tests, because the
co-integration test is equivalent to examining whether the residuals of a regression
between two non-stationary series are stationary (Gujarati 2004).
The values in brackets represent the standard errors associated with the estimated
coefficient of Eq. (6.10).
Economic interpretations
All variables in the co-integrating equation have expected signs. Inflation rate,
which is a measure of macroeconomic instability, has a negative sign. This implies
that as inflation increases by 1%, RGDP reduces by 0.67%: inflation discourages
investments and therefore leads to a contraction in real economic activity. Similarly,
the unemployment rate also has a negative sign which means that when the
unemployment rate increases by 1%, real GDP declines by 7.05%. However, the
direction of causality may not necessarily run from unemployment to RGDP, since
unemployment tends to be high during recessions because firms often lay off some
workers. The appropriate method of analysis is using the error correction model
(ECM) that leads to the real impact of all independent variables on LRGDP.
Economic interpretations
Just like in the long-run model, the variables in the ECM have the expected signs. The
probability of DINFR_{t-1} (0.00266) is less than 5%, meaning that DINFR_{t-1} is
significantly negatively related to LRGDP, since an increase of 1% in INFR reduces
The AR roots graph helps test whether the inverse roots of the AR characteristic
polynomial lie inside the unit circle. As shown in Fig. 6.1, the AR roots graph
confirms that the estimated VAR model was stable over the period of the study
(also see Table 6.3). We note that the residuals are normally distributed, since the
associated probability (0.3539) is greater than 5%.
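The stability check behind an AR roots graph can be sketched numerically. The inverse roots of the AR characteristic polynomial are the eigenvalues of the VAR's companion matrix, and the model is stable when all of them have modulus below one. The function names below are illustrative, and the lag matrices passed in are assumed to come from an already estimated VAR.

```python
import numpy as np


def companion_eigenvalues(coef_mats):
    """Eigenvalues of the companion matrix built from lag matrices A_1..A_p."""
    p = len(coef_mats)
    k = coef_mats[0].shape[0]
    top = np.hstack(coef_mats)  # first block row: [A_1 A_2 ... A_p]
    if p == 1:
        companion = top
    else:
        # identity blocks shift y_{t-1}, ..., y_{t-p+1} down by one position
        bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
        companion = np.vstack([top, bottom])
    return np.linalg.eigvals(companion)


def is_stable(coef_mats):
    """True when every inverse root lies strictly inside the unit circle."""
    return bool(np.all(np.abs(companion_eigenvalues(coef_mats)) < 1.0))
```

Plotting the returned eigenvalues against the unit circle reproduces the kind of graph described above.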
Our research estimated a VAR model to trace the impact of economic stability
measures (inflation rate and unemployment) on Rwandan real economic growth
(RGDP). The results show a long-run negative and significant relationship between
inflation, unemployment and Rwandan real economic growth (RGDP).
A short-run negative
relationship was also found between real economic growth and both inflation and
unemployment. In the long run, the related standard error for each coefficient was
greater than 5%; thus, the coefficient was not significant. In the short run, only the
coefficient of unemployment was not significant.
In countries like Rwanda, which are characterized by relatively high economic
growth and stability, macroeconomic conditions do not suffer from an inflation
impact; otherwise, inflation and unemployment influence RGDP and thereby have a
long-term negative impact on economic growth. Policymaking bodies should
therefore aim at macroeconomic policies which provide cost efficiency and a
route for steady and sustainable growth. Overall, the Rwandan economy was
stable over the period of study.
References
Ahmed S (2010) An empirical study on inflation and economic growth in Bangladesh. OIDA Int J
Sustain Dev 2(3):41–48
Barro R (1987) Determinants of economic growth, a cross country empirical study. MIT Press,
Cambridge, Mass
Bhatia RJ (1960) Inflation, deflation, and economic development. Int Monetary Fund 8(1):
101–114
Blanchard OJ, Kiyotaki N (1987) Monopolistic competition and the effects of aggregate demand.
Am Econ Rev 77(4):647–666
Bourbonnais R (2007) Econométrie, 6ème édn. Dunod, Paris
Bruno M, Easterly W (1995) Inflation crises and long-run growth. NBER Working Papers
No. 5209, National Bureau of Economic Research. Available at: http://ideas.repec.org/p/nbr/
nberwo/5209.html
Bruno M, Easterly W (1998) Inflation crisis and long-run growth. J Monetary Econ 41:3–26
Chan KS, Tsay RS (1998) Limiting properties of the least squares estimator of a continuous
threshold autoregressive model. Biometrika 85:413–426
Chimobi OP (2010) Inflation and economic growth in Nigeria. J Sustain Dev 3(2):159–166
Datta K, Kumar C (2011) Relationship between inflation and economic growth in Malaysia.
International conference on economics and finance research IPEDR Vol. 4, No. 2, pp 415–416
De Gregorio J (1992) The effect of inflation on economic growth. Eur Econ Rev 36(2–3):417–425
De Gregorio J (1996) Inflation, growth and Central Banks: theory and evidence. The World Bank,
Policy Research Working Paper No. 1575
Dornbusch R, Fischer S, Kearney C (1996) Macroeconomics. The McGraw-Hill Companies Inc.,
Sydney
Dorrance GS (1966) Inflation and growth. Int Monetary Fund 13(1):82–102
Dorrance S (1963) The effect of inflation on economic development. Int Monetary Fund 10(1):
1–47
Drukker D, Hernandez-Verme P, Gomis-Porqueras P (2005) Threshold effects in the relationship
between inflation and growth: a new panel-data approach. Working paper presented at the
11th International Conference on Panel Data, Texas A&M University, College Station, Texas
Enu P, Attah-Obeng P, Hagan E (2013) The relationship between GDP growth rate and
inflationary rate in Ghana: an elementary statistical approach. Acad Res Int 4(5):310–318
Erbaykal E, Okuyan H (2008) Does inflation depress economic growth? Evidence from Turkey.
Int Res J Finance and Econ 13(17):40–48
Ernst C, Berg J (2009) The role of employment and labour markets in the fight against poverty. In:
Promoting Pro-Poor Growth, Employment. OECD. http://www.oecd.org/dac/povertyreduction/
43514554.pdf Accessed 15 Apr 2017
Ferdous M, Mahbuba Shahid E (2013) Study on nature of inflation and its relationship with GDP
growth rate: a Case Study on Bangladesh. IOSR J Econ Finance 1(3):40–49
Fischer S (1993) The role of macroeconomic factors in growth. J Monetary Econ 32(3):485–511
Gokal V, Hanif S (2004) Relationship between Inflation and Economic Growth in Fiji. Reserve
Bank of Fiji Suva, Fiji, Economics Department. Working Paper No. 4
Granger CWJ (1969) Investigating causal relationships by econometric models and cross-spectral
methods. Econometrica 37(3):424–438
Gujarati DN (2004) Basic econometrics, 4th edn. Tata McGraw Hill
Hansen BE (2000) Sample splitting and threshold estimation. Econometrica 68:575–603
Haslag JH (1997) Output, growth, welfare, and inflation: a survey. Econ Rev Second Q, Int
Monetary Fund 8(11–12):1011–1014
Iqbal N, Din M, Ghani E (2012) Fiscal decentralisation and economic growth: role of democratic
institutions. Pak Develop Rev 52(3):176–196
Johnson HG (1967) Is inflation a retarding factor in economic growth? In: Fiscal and monetary
problems in developing states. Proceedings of the third Rehovoth conference. Praeger, New
York, pp 121–130
Khan MS, Senhadji SA (2001) Threshold effects in the relationship between inflation and growth.
Int Monetary Fund 48(1):1–21
Kigume RW (2011) The relationship between inflation and economic growth in Kenya. Int J Bus
Soc Sci 3(10). Available at: http://ir-library.ku.ac.ke/handle/123456789/2124
Kilindo A (1997) Fiscal operations, money supply and inflation in Tanzania. Afr Econ Res
Consortium 65(3):1–7
Levine L (2013) Economic growth and the unemployment rate. Congressional research service,
7-5700, R 42063, CRS report for congress, pp 1–10
Levinsohn J (2008) Two policies to alleviate unemployment in South Africa. Center for
International Development, at Harvard University, CID Working Paper No. 166
Luppu DV (2009) The correlation between inflation and economic growth in Romania. Luccrari
Stiintifice, p 53
Madhukar S, Nagarjuna B (2011) Inflation and growth rates in India and China: a perspective of
transition economies. Int Conf Econ Finance Res 4(97):489–490
Mallik G, Chowdhury A (2001) Inflation and economic growth: evidence from four South Asian
countries. Asia-Pac Dev J 8(1):123–135
Mankiw NG, Romer D, Weil DN (1992) A contribution to the empirics of economic growth.
Quart J Econ 107(2):407–437
122 F. Nkikabahizi et al.
Mosikari TJ (2013) The effect of unemployment rate on gross domestic product: case of South
Africa. Mediterr J Soc Sci 4(6):429–434
Motley B (1994) Growth and inflation: a cross-country study. Center for economic policy research,
Stanford University, CEPR Publication No. 395, pp 15–28
Mubarik YA (2005) Inflation and growth: an estimate of the threshold level of inflation in
Pakistan. SBP-Res Bull 1(1):35–43
Mundell R (1963) Inflation and real interest. J Polit Econ 71(3):280–283
Pradana MBJ, Rathnayaka MKTR (2013) Testing the link between inflation and economic growth:
evidence from Asia. Mod Econ 4:87–92
Prasanna S, Gopakumar K (2010) An empirical analysis of inflation and economic growth in India.
Int J Sustain Dev 15(2):4–5
Richard T (1998) Macroeconomics theories and policies, 6th edn. University of North Carolina at
Chapel Hill
Sarel M (1995) Nonlinear effects of inflation on economic growth. Int Monetary Fund 43(1):
199–215
Sidrauski M (1967) Inflation and economic growth. J Polit Econ 75(6):796–810
Stein P (2010) The economics of Tanzania, Kenya, Uganda, Rwanda and Burundi. Report
prepared for Swed Fund International AB, pp 12–32
Stockman AC (1981) Anticipated inflation and the capital stock in a cash-in-advance economy.
J Monetary Econ 8:387–393
Tobin J (1965) Money and economic growth. Econometrica 33(4):671–684
Umaru A, Zubairu J (2012) The effect of inflation on the growth and development of the Nigerian
economy: an empirical analysis. Int J Bus Soc Sci 3(10):187–188
Wai UT (1959) The relationship between inflation and economic development: a statistical
inductive study. Int Monetary Fund 7(2):302–317
Xiao J (2009) The relationship between inflation and economic growth of China: empirical study
from 1978–2007. Lund University, Sweden, pp 1–56
Chapter 7
Macroeconomic, Political,
and Institutional Determinants of FDI
Inflows to Ethiopia: An ARDL Approach
Addis Yimer
Abstract Based on the lines of the eclectic theoretical framework of Foreign direct
investment (FDI) flows, this study investigates the macroeconomic, political, and
institutional determinants of FDI inflows to Ethiopia for the period 1970–2013.
Using the ARDL modeling approach, it finds that political and institutional factors
are crucial both in the long run and the short run in FDI inflows to the country. On
the macroeconomic side, the market size of the country, availability of natural
resources, openness to trade, and depreciation in the nominal exchange rate are
found to positively affect FDI inflows to the country. On the other hand, macroeconomic
instability is found to affect FDI inflows negatively. In addition, better
political stability, government effectiveness and regulatory quality, and better
performance of the rule of law are found to positively affect FDI inflows to the
country. A careful liberalization of the foreign exchange market and that of external
trade, sustaining the current growth momentum of the economy, improving insti-
tutional quality, and strengthening the political stability of the country, among
others, are fundamental areas that the government could work on to strengthen
Ethiopia’s position in FDI inflows on the continent.
Keywords ARDL · Determinants · Ethiopia · FDI · Macroeconomic stability ·
Political · Institutional
7.1 Introduction
Foreign direct investment (FDI) plays an important role in the growth process of
poor nations (UNCTAD 2013). Not only does it provide the much needed capital
for filling the saving-investment and foreign exchange gaps in these countries, but it
is also important for generating employment opportunities and transferring technology
and managerial know-how. In addition, by providing access to foreign
markets and building capacity through the transfer of technology, FDI improves the
integration of the host country into the global economy, thus fostering growth.
A. Yimer
Department of Economics, Addis Ababa University, Addis Ababa, Ethiopia
e-mail: addisyimer@gmail.com
The Ethiopian economy has to grow at least at an annual growth rate of 11% for
more than two decades so that it can attain the per capita income levels that have
been achieved today by most sub-Saharan African (SSA) countries (UNDP 2011).
However, the country’s domestic sources of finance are limited and cannot help it
achieve such a level of growth. In 2013, its gross domestic capital formation as a
share of GDP was around 33%, with gross domestic savings lagging behind at
around 6%. One alternative for filling this savings gap is through loans and
development assistance from multilateral agencies such as the World Bank and
IMF. However, as noted by Astatike and Assefa (2005) such a source of foreign
finance is unstable in nature.
Acknowledging this fact, the current Ethiopian government has opened several
economic sectors to foreign investors so that they fill the desired saving-investment
gap. The government has issued several investment incentives, including tax hol-
idays, duty-free imports of capital goods, and export tax exemptions to encourage
FDI. Further, the Ethiopian Investment Authority (EIA) has been established to
service investors and streamline investment procedures. In addition to liberalizing
investments, other areas of the external sector have also been liberalized through
unilateral, multilateral, and regional liberalization.
However, despite all these efforts, Ethiopia is not a major recipient of FDI
inflows. The country’s average share of global FDI inflows was only 0.01% in
2000–2013. In the same period, its annual average share in FDI inflows to the SSA
region was only 2%. The central question, therefore, is: why does Ethiopia not
attract much FDI?
There exists a very large body of literature on the determinants of FDI flows.
While most of these are cross-country studies of the developing world in general,
little has been done to investigate the determinants of FDI flows to Ethiopia
specifically. While cross-country studies are able to identify the factors that drive
FDI and examine its impact across countries, they fail to provide in-depth analyses
of the country-specific factors that are crucial in attracting FDI. Even the few studies
done on Ethiopia (which are by and large unpublished Masters’ theses) deal with
the economic determinants of FDI flows and ignore the role of political, gover-
nance, and institutional determinants of FDI flows to the host country. To the best
of our knowledge, ours is among the first studies that try to capture the effects of a
wide range of political and institutional quality indicators in the host country for
attracting FDI inflows. Among other things, most studies also share the problem of
a short series of data and omission of relevant macroeconomic variables in their
models. They are not theoretically and empirically systematic either. Our study
attempts to address these gaps.
The rest of the paper is organized as follows. Section 7.2 presents the trends in
FDI inflows to Ethiopia. Section 7.3 gives a review of the theoretical and empirical
Net FDI inflows to Ethiopia were a mere US$3.9 million in 1970, representing a
negligible share in global investment flows. This figure increased substantially
to US$953 million in 2013, although its share in global FDI flows was still a
fraction of a percent. This increase in FDI inflows to the country may be explained by factors
that characterized the economic and political landscape that prevailed over the
period under study. This period mainly witnessed two distinct political regimes.
The first period, 1974–1991, was that of the Derg regime, under which the socialist ideology
of a centralized command economic system controlled the sphere of socioeconomic
policy making in the country. As noted by Geda (2008), this regime was mainly
characterized by a deliberate repression of market forces and socialization of the
production and distribution process and adoption of a ‘hard control’ regime. In this
period, the country’s economic performance was highly irregular due to its
dependence on the agricultural sector (which is vulnerable to the vagaries of nature)
and the intense conflict that characterized the period (see Geda 2008). The second
period, post-1991 to the present, started with the coming to power of the Ethiopian
People's Revolutionary Democratic Front (EPRDF) in 1991, after the demise of
the Derg. In terms of socioeconomic policies, there was a significant move away from
the doctrines of the command system in favor of a free market.
The regime has adopted structural adjustment policies of market liberalization
with the support of the World Bank and IMF (see Geda 2008). Economic perfor-
mance during this period has substantially improved not only by the Derg’s stan-
dards but also by African standards. The improvements in economic performance in
this period appear to be a combined result of the reforms, favorable weather con-
ditions, and better political stability and relative peace that have prevailed (see Geda
2008). Likewise, FDI inflows to the country have also registered a significant
increase in this period. They increased from a period’s average of US$5.9 million
during the Derg regime to around US$270 million in the EPRDF regime
(UNCTAD 2013). Despite ups and downs (due to the global financial crisis in
2008 and deteriorating peace as a result of the war with Eritrea in 1998–2000,
among other things), net FDI inflows reached a level of nearly US$1 billion by
2013 (Fig. 7.1). As argued in a report of the Ethiopian Investment Commission
(2014), this was mainly due to the various liberalization policies, better economic
performance, and a stable political sphere that characterized the period.
[Figure: line chart of annual FDI inflows, 1971–2013; vertical axis from -200 to 1000; end-point label 953]
Fig. 7.1 FDI inflows to Ethiopia (1970–2013) (in million US$). Source Author’s computation
based on World Development Indicators (2015b) and UNCTAD (2013)
[Figure: line chart, 1970–2012; vertical axis from -5.00 to 25.00; data label 6.28]
Fig. 7.2 Ethiopia’s FDI inflows as percentage of gross fixed capital formation. Source Author’s
computation based on World Development Indicators (2015b) and UNCTAD (2013)
Total FDI inflows as a percentage of gross fixed capital formation in the country
were around 0.7% in 1990. This reached a little over 6% in 2013, despite the ups
and downs over the years. However, this is not a very big increase (see Fig. 7.2).
Looking at the distribution of FDI inflows by sector, manufacturing led the list
(with a 70.6% share of the total FDI inflows) followed by the service sector (10.7%)
and agriculture (8.7%) (Ethiopian Investment Commission 2014).
(to the ‘South’) representing its crucial aspects. Krugman (1979) notes that tech-
nological progress raises the marginal product of capital and provides an incentive
for FDI. On the other hand, this process may be reversed through technology
transfer. Mainstream trade theories usually underlie this type of analysis. Recent
theories of trade such as that of the ‘economies of specialization’ which emphasize
the existence of intra-industry (as well as intra-firm) trade, also provide scope for an
analysis of FDI (see, for instance, Ocampo’s 1986 survey).
Notwithstanding Vernon’s contribution, building on Hymer’s original contri-
bution a second wave of refinements to the neo-classical capital movement/portfolio
theory of FDI has also come into being with the emergence of explanations based
on the ideas of ‘international firm’ and ‘industrial organization.’ The fact that
decision making about FDI takes place within the context of oligopolistic firm
structures and that such an investment includes a package of other inputs such as
intermediate imports and capital flows has led to the development of alternative
explanations grounded in the theory of industrial organization (see Agarwal 1980;
Dunning 1993; Helleiner 1989). In this approach as set out by Hymer, foreign firms
are seen as having an advantage over local ones. The foreign firms’ pursuit of FDI
is explained by the theory of internalization. This is characterized by the desire to
minimize transaction costs à la Coase (1937), tackle risks and uncertainties,
increase control and market power, achieve economies of scale, and ensure
advantageous transfer pricing (Buckley and Casson 1976; Hymer 1976). In this
approach, oligopoly is seen as mitigating, rather than creating market imperfections
(Helleiner 1989).
Dunning’s (1993) work, which he terms the ‘eclectic paradigm,’ represents a
culmination of this trend toward a refinement of FDI theories. Without departing
much from the Heckscher–Ohlin–Samuelson theory of trade for explaining the
spatial distribution of multinational firms, Dunning’s paradigm summarizes this
strand of theory under an ‘ownership-specific, location and internalization’
(OLI) framework (see Dunning 1993). Framed in a micro-macroeconomic frame-
work, Dunning’s (1981, 1988, 1993) approach provides a flexible and popular
framework where he argues that FDI is determined by three sets of advantages
which direct investments should have over the other institutional mechanisms
available for a firm in satisfying the needs of its customers at home and abroad. The
first of the advantages is an ownership (O)-specific one which includes the
advantage that a firm has over its rivals in terms of its brand name, patent, or
knowledge of technology and marketing. This allows the firm to compete with other
firms in the markets that it serves regardless of the disadvantages of being foreign.
The second is location (L)-specific advantages, which relate to the importance for a
firm of operating and investing in the host country; these are the advantages that make the
chosen foreign country a more attractive site for FDI than others. The third
advantage is the internalization (I) advantage, which relates to the preference for
a ‘bundled’ FDI approach over ‘unbundled’ product licensing, capital lending, or
technical assistance (Wheeler and Mody 1992). These refer to the superior com-
mercial benefits for firms resulting from the exploitation of ownership and
location-specific advantages by investing in foreign affiliates that they control,
rather than through transactions with unrelated firms located abroad. Helleiner
(1989) notes that ‘this “eclectic” theory of direct investment drawing on firm-
specific attributes, location advantages and internalization advantages—is widely
accepted.’ There also exists an international trade version of FDI determination
(termed the macro-approach) which is associated with Kojima’s (1973) work. The
Kojima model argues that FDI may be explained by the ‘comparative disadvantage’
of industry in the investing countries. According to Kojima’s theory, this may be
mitigated by investing in a foreign industry, which may be able to achieve com-
parative advantages in the production of a particular product and potentially even
export back to the home country. Naturally, this type of FDI will also have the
effect of increasing trade volumes (Kojima 1973).
In sum, theories of the determinants of FDI cover a range of explanations: the
pure capital movement, product cycle, industrial organization, the stagnation thesis,
and other political considerations. In the African context, the pure capital theory
does not work since the assumptions do not hold. Neither is Krugman’s hypothesis
workable since it is more relevant for countries with a good industrial base and
infrastructure. On the other hand, the concentration of multinational corporations in
the mining sectors in most African countries and, to a good degree, the importance
of the colonial history in determining their spatial pattern (see Geda 2002) might be
taken as lending support to the importance of the ‘eclectic’ approach. This theo-
retical insight is used in identifying FDI determinants in the empirical analysis and
construction of our model.
and Nigeria received foreign investments targeted at the oil and minerals sectors of
their economies (Basu and Srinivasan 2002). Though natural resource abundance is
a common factor which explains much of the FDI inflows, a few successful African
countries have also managed to attract FDI by creating favorable economic, social,
and political environments (Basu and Srinivasan 2002; UNCTAD 1998). For
instance, countries such as Mauritius and Seychelles have managed to attract FDI
by tailoring their FDI policies through liberalization, export orientation, tax, and
other investment incentives. Moreover, some countries such as Lesotho and
Swaziland have attracted FDI because they are near South Africa and investors
wanting to serve the large market in South Africa have located their subsidiaries in
these countries (Basu and Srinivasan 2002; UNCTAD 1998).
Asiedu (2002) analyzed 34 countries in sub-Saharan Africa over 1980–2000.
Using a panel data analysis, she found that openness to trade, higher incomes and
better growth prospects, and better institutional frameworks and infrastructure were
‘rewarded’ with more investments. Later studies by Asiedu (2003, 2006) show the
significant role of a country’s market size and natural resource endowment in
enhancing FDI. Lower inflation, good infrastructure, an educated population,
openness, less corruption, political stability, and a reliable legal system were also
found to have similar positive effects on FDI flows into the continent in these
studies. Asiedu and Gyimah-Brempong (2008) validated these findings to a large
extent and noted that countries that were small or lacked natural resources could
attract FDI by improving their institutions and policy environments.
Based on a co-integration analysis for 1970–2000 using data from 19 SSA
countries, Bende-Nabende (2002) found market growth, export-oriented policies,
and liberalization as the most dominant long-run determinants of FDI in Africa. In
line with Bende-Nabende (2002), focusing on manufactured goods, primary com-
modities, and services, Kandiero and Chitiga (2003) analyzed the impact of
openness on FDI flows to Africa in 51 African countries. Their findings indicate
that FDI responds significantly to increased openness in the whole economy in
general and in the service sector in particular.
Using fixed and random effects models on a panel dataset for 29 African
countries over the period 1975–1999, Onyeiwu and Shrestha (2004) identified
economic growth, inflation, openness of the economy, international reserves, and
natural resource availability as important determinants of FDI to Africa. Contrary to
conventional wisdom, political rights and infrastructure were found to be unim-
portant in their study. Krugell (2005) also empirically tested the significance of a
number of hypothesized determinants of FDI in sub-Saharan Africa. The pooled
cross-country and time-series estimation covered the period 1980–1999 in 17
countries. Krugell’s results are in line with the findings mentioned earlier, partic-
ularly with respect to economic growth and openness.
Abdoul (2012) estimated a model of FDI determination using five-year panel
data with the system-GMM technique over 1970–2009 for 53 African countries. He
found that larger countries attracted more FDI. However, regardless of their size,
more open and politically stable countries that offered higher returns to investments
also attracted FDI. FDI inflows were also found to be persistent in the sense that
countries that manage to attract FDI today are likely to attract more FDI in the
future. Using cross-country data for 53 African countries for the period 1996–2008,
Anyanwu (2012) found market size (proxied by urban population as a percentage
of total population and GDP per capita of the host country), openness to trade, the
rule of law, foreign aid, natural resources, and past FDI inflows (increased
agglomeration) to have a positive effect on FDI inflows. He also found domestic
financial development to have a negative effect on FDI inflows. Further, he found
that East and Southern African sub-regions appeared positively disposed to
obtaining higher levels of inward FDI.
Among the most recent FDI studies on Africa, Geda and Yimer (2015) have
estimated a model of FDI determination for Africa based on a new analytical
country classification of African economies as ‘Fragile, Factor, and Investment
driven’ economies. Using a panel co-integration approach over 1996–2012 they
found market size, availability of natural resources, openness to international trade,
a stable macroeconomic environment, better infrastructure, and an effective
bureaucracy to have a strong positive impact on attracting FDI to the continent. On
the other hand, they also found that political and macroeconomic instability and
high financial and transfer risks had a negative effect on attracting FDI to the
continent. However, the effect of these factors varied significantly across the ana-
lytical country classification that they developed (Geda and Yimer 2015). Among
all determinants of FDI, only government effectiveness and natural resource abundance
were found to be important across all countries. They stress the importance
of emphasizing different policies in different countries or country groups.
Country case studies on Africa, which invariably use time series analyses, have
reported results that are similar to those in recent cross section-based studies
reviewed earlier. Among these, Astatike and Assefa (2005) examined determinants
of FDI in Ethiopia over 1974–2001 using a time series analysis. Their empirical
analysis shows that economic growth, export orientation (openness), and liberal-
ization had a significant positive impact on FDI, while macroeconomic instability
(measured by inflation) and a low level of physical infrastructure (measured by
telephone lines per 1000 people) had a negative impact. Similarly, using a time
series analysis for Cameroon, Sunday and Lydie (2006) show that the level of
infrastructure development (increased electricity production and the ratio of paved
roads) was the most significant determinant of FDI in the country. Market size
(GDP per capita), openness, human capital development, and the rate of economic
growth were also important but were found to be less significant. Exchange rate,
political risk, the rate of inflation, debt burden, agglomeration effect, and the cre-
ation of an export-processing zone did not have any influence on FDI in Cameroon.
Seetanah and Rojid (2011) examined the determinants of FDI in Mauritius using
reduced-form demand for the inward FDI function. In their study, openness, wages,
and the quality of labor in the host country were important. Size of the market was
reported to have a relatively lesser impact on FDI; this is probably related to the
limited size of the population and the good export opportunities from Mauritius to
other African countries, especially in the SADC/COMESA regions. The significant
coefficient of the lagged dependent variable in their model suggests the presence of
dynamism in the system. Finally, Okpara (2012), using Granger causality and an
error correction model investigated the determinants of FDI flows to Nigeria during
1970–2009. He found that natural resource abundance, fiscal incentives, favorable
government policies, exchange rate, and infrastructural development had a positive
and statistically significant effect on FDI flows to Nigeria. Though statistically
insignificant, market size and trade openness were found to have a positive sign
while political risk was found to have a negative sign. Further, the statistically
significant error correction term revealed that past foreign investment flows could
significantly stimulate current investment inflows.
In sum, both the theoretical discussion in the previous section and the brief
review of empirical studies in this section show that market size, openness of the
economy, natural resource endowments, and political and macroeconomic stability
are important determinants of FDI flows to Africa. These are important factors that
any model about determinants of FDI flows to Africa needs to consider. However,
when examined in light of the theoretical FDI literature, none of these African studies
formulates its empirical model by explicitly following one or the other strand of the
literature. The variables used in their models suggest the use of
Dunning’s eclectic paradigm, but without stating which variable serves as a proxy for
which theoretical concept. This is partly a result of missing theoretical discussions
and formulations in almost all these studies.
explanatory variables and remove the problems that may arise due to the presence
of auto-correlation and endogeneity. The ARDL co-integration estimates short-run
and long-run relationships simultaneously and provides unbiased and efficient
estimates. The appropriateness of using the ARDL model is that it is based on a
single equation framework. The ARDL model takes sufficient numbers of lags and
directs the data-generating process in a general to specific modeling framework
(Harvey 1981). Unlike other multivariate co-integration techniques such as
Johansen and Juselius (1990), the ARDL model permits the co-integration rela-
tionship to be estimated by OLS once the lag order of the model is identified. The
error correction model (ECM) can also be drawn by using the ARDL approach
(Pesaran and Shin 1999). ECM allows drawing outcomes for long-run estimates
while other traditional co-integration techniques do not provide such types of
inferences. As noted by Pesaran and Shin (1999), ECM joins together short-run
adjustments with long-run equilibrium without losing long-run information.
These advantages of the ARDL technique over other standard co-integration
techniques justify the application of ARDL approach in our study to analyze the
relationship among the FDI model’s variables.
In order to examine the long-run relationship and the dynamic interaction between
FDI and institutions, our study employs an ARDL modeling approach. According
to Pesaran et al. (2001) the ARDL approach requires three steps:
The first step is estimating the long-run relationship among the variables. This is
done by testing the significance of the lagged levels of the variables in the error
correction form of the underlying ARDL model. Following Pesaran et al. (2001),
our ARDL model can be written as:
where LFDI is log of FDI, LRGDP is log of real GDP, RES is log of natural
resource abundance, INF is log of the domestic annual inflation rate, LDEBGDP is
log of external debt to GDP ratio, LOPNES is log of openness, LNER is log of
nominal exchange rate, Polinst is an indicator of political stability, and quality of
institutions in the host country. As there is a high degree of multi-collinearity
among the six political and institutional indicators, we used each of the political and
institutional indicators separately. Hence, the variable Polinst indicates in all of the
three steps a model that incorporates only a single political and institutional indi-
cator among the macroeconomic variables. The selection of the optimum lagged
orders of the ARDL models is based on the Schwarz Bayesian Criterion (SBC). In
order to test co-integration among the variables, the Wald F-statistics for testing the
joint hypotheses has to be compared with the critical values as tabulated by Pesaran
et al. (2001).
The joint hypotheses to be tested are as follows:
$$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0$$
$$H_1: \beta_i \neq 0, \quad i = 1, 2, \dots, 8$$
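As a sketch only, assuming the standard conditional error correction form of Pesaran et al. (2001), the coefficients tested in the hypotheses above attach to the one-period lagged levels of the variables. The short-run coefficients $\delta_{ji}$ and the lag orders $p, q, \dots, w$ are illustrative notation, not taken from the study:

$$
\begin{aligned}
\Delta LFDI_t = \alpha_0 &+ \sum_{i=1}^{p} \delta_{1i}\, \Delta LFDI_{t-i} + \sum_{i=0}^{q} \delta_{2i}\, \Delta LRGDP_{t-i} + \dots + \sum_{i=0}^{w} \delta_{8i}\, \Delta Polinst_{t-i} \\
&+ \beta_1 LFDI_{t-1} + \beta_2 LRGDP_{t-1} + \beta_3 LRES_{t-1} + \beta_4 LINF_{t-1} \\
&+ \beta_5 LDEBGDP_{t-1} + \beta_6 LOPNES_{t-1} + \beta_7 LNER_{t-1} + \beta_8 Polinst_{t-1} + \varepsilon_t
\end{aligned}
$$

Under this form, the F-statistic tests the joint exclusion of the eight lagged-level terms.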
If the F-statistic is higher than the upper bound critical value, the null hypothesis
(H0) is rejected, indicating that there is a long-run relationship between the lagged
level variables in the model. In contrast, if the F-statistic falls below the lower
bound, then H0 cannot be rejected and no long-run relationship exists. However, if
the F-statistic falls between the upper and lower bound critical values, the
inference is inconclusive. In this case, the order of integration of each variable
should be determined before any inference can be made.
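The three-way decision rule described above can be sketched in a few lines. This is a minimal illustration, not the study's procedure: the function name is hypothetical, and the critical values in the usage lines are placeholders rather than values from the Pesaran et al. (2001) tables.

```python
def bounds_test_decision(f_stat: float, lower: float, upper: float) -> str:
    """Classify a bounds-test F-statistic against lower/upper critical values."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper:
        # H0 of no long-run relationship is rejected
        return "cointegration"
    if f_stat < lower:
        # H0 cannot be rejected
        return "no cointegration"
    # Between the bounds: the integration order of each variable must be checked
    return "inconclusive"


# Placeholder critical values, for illustration only
LOWER, UPPER = 2.5, 3.8
for f in (5.0, 3.0, 1.2):
    print(f, bounds_test_decision(f, LOWER, UPPER))
```

Keeping the "inconclusive" outcome explicit avoids silently treating a statistic between the bounds as evidence for either hypothesis.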
In the second step, once the co-integration is established, the conditional ARDL
(p, q, r, s, t, u, v, w) long-run model of the determinants of LFDIt can be estimated
as follows:
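$$
\begin{aligned}
LFDI_t = \alpha_0 &+ \sum_{i=1}^{p} \beta_{1i}\, LFDI_{t-i} + \sum_{i=0}^{q} \beta_{2i}\, LRGDP_{t-i} + \sum_{i=0}^{r} \beta_{3i}\, LRES_{t-i} + \sum_{i=0}^{s} \beta_{4i}\, LINF_{t-i} \\
&+ \sum_{i=0}^{t} \beta_{5i}\, LDEBGDP_{t-i} + \sum_{i=0}^{u} \beta_{6i}\, LOPNES_{t-i} \\
&+ \sum_{i=0}^{v} \beta_{7i}\, LNER_{t-i} + \sum_{i=0}^{w} \beta_{8i}\, Polinst_{t-i} + \varepsilon_t
\end{aligned}
$$

In the third step, the short-run dynamics are obtained from the error correction model associated with the long-run estimates:

$$
\begin{aligned}
\Delta LFDI_t = \alpha_0 &+ \sum_{i=1}^{p} \delta_{1i}\, \Delta LFDI_{t-i} + \sum_{i=0}^{q} \delta_{2i}\, \Delta LRGDP_{t-i} + \sum_{i=0}^{r} \delta_{3i}\, \Delta LRES_{t-i} \\
&+ \sum_{i=0}^{s} \delta_{4i}\, \Delta LINF_{t-i} + \sum_{i=0}^{t} \delta_{5i}\, \Delta LDEBGDP_{t-i} + \sum_{i=0}^{u} \delta_{6i}\, \Delta LOPNES_{t-i} \\
&+ \sum_{i=0}^{v} \delta_{7i}\, \Delta LNER_{t-i} + \sum_{i=0}^{w} \delta_{8i}\, \Delta Polinst_{t-i} + \theta\, ECM_{t-1} + \varepsilon_t
\end{aligned}
$$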
RGDP: Real GDP is a measure of the size of the host market, which also represents
the host country’s economic conditions and the potential demand for output.
Following the literature, real GDP is used to proxy for market size. Since this
variable is used as an indicator of the market potential for products of foreign
investors, the expected sign is positive.
RES: Natural resource availability. The availability of natural resources might be
a major determinant of FDI to the host country. FDI takes place when a country
richly endowed with natural resources lacks the capital or technical skills
needed to extract them and/or sell them on the world market. Foreign firms embark on vertical
FDI in the host country to produce raw materials and/or inputs for their production
processes at home. This means that certain FDI may be less related to profitability
or market size of the host country than natural resources which are unavailable to
the domestic economy of foreign firms. As posited by the eclectic theory, all else
being equal, countries that are endowed with natural resources receive more FDI.
As noted by Asiedu (2002) very few studies on the determinants of FDI control for
natural resource availability (except Morisset 2000; Geda and Yimer 2015). The
omission of natural resources from estimations, especially for African countries
may cause the estimates to be biased (Asiedu 2002). Given the absence of fuel and
other petroleum related resources in the country, the share of mining and quarrying
value added (current US$) is used to capture the availability of natural resource
endowments. This variable is considered acknowledging the fact that a good share
of FDI inflows to the country found its way to this sector.
OPNES: Trade openness as measured by total trade as a percentage of GDP. In
literature, the degree of liberalization of the trade regime in the host country is
regarded as a very important factor that promotes FDI inflows. This proxy is
important for foreign direct investors who are motivated by the export market. More
open economies usually follow ‘appropriate’ trade and exchange rate policies and
espouse a relatively liberal investment regime (Geda and Yimer 2015).
DEBGDP: External debt as a percentage of GDP. External debt is considered a
component of financial risk, influencing FDI inflows negatively (Nonnenberg and
Mendonca 2004). In addition, heavily indebted countries represent higher transfer
risks—the risk of potential restrictions on the ability to transfer funds across
national boundaries. Transfer risks are an important component of country risks and
a variable closely monitored by foreign investors. Higher transfer risks may cause
foreign capital to move out of a country and new FDI flows to be re-routed to safer
locations. The sign associated with DEBGDP is expected to be negative.
INF: Annual inflation rate. This is another important variable of macroeconomic
stability indicators which may affect FDI. It represents changes in the general price
level or inflationary conditions in the economy. In our study, the impact of inflation
rates on FDI is expected to be negative.
NER: The nominal exchange rate. The effect of changes in exchange rates on
FDI flows is ambiguous. Elbadawi and Mwega (1997), among others, used the real
exchange rate as an indicator of a country’s international competitiveness,
hypothesizing that a real depreciation would attract larger FDI flows. However, it
may be argued that unless the purpose of FDI flows to a country is to build an
export platform overvalued exchange rates should not represent a considerable
hurdle to foreign investors. On the contrary, depreciation increases the costs of
imported inputs and reduces the foreign currency value of profit remittances, both
of which have adverse effects on the profitability of FDI projects. This effect will
dominate if FDI is undertaken primarily to serve the domestic market. Thus, if we
assume that a prospective investor uses the previous year’s change in the exchange
rate as a guide to its evolution in the near future, we would expect a negative sign
on the variable ΔNER (since an increase in the index represents a depreciation).
As noted by Schneider and Frey (1985) political instability and the frequent
occurrence of disorder ‘create an unfavorable business climate which seriously
7 Macroeconomic, Political, and Institutional Determinants of FDI … 137
erodes the risk-averse foreign investors’ confidence in the local investment climate
and thereby repels FDI away.’ Political stability, as argued by Asiedu (2002), is a
significant factor in location decisions of multinational corporations (MNCs),
especially in their decisions to invest in African states.
Our study used the Worldwide Governance Indicators (WGI) research dataset of
the Political Risk Services (2015) to capture the effect of political instability and
quality of institutions in attracting FDI inflows to the host country. This dataset
summarizes the views on the quality of governance provided by a large number of
enterprises, citizens, and expert survey respondents in industrial and developing
countries. This data was gathered from a number of survey institutes, think tanks,
non-governmental organizations, international organizations, and private sector
firms.
The WGI project constructs aggregate indicators of six broad dimensions of
governance: Voice and accountability; political stability and absence of violence/
terrorism; government effectiveness; regulatory quality; the rule of law; and control
of corruption. The six aggregate indicators are based on 31 underlying data sources
reporting the perceptions of governance of a large number of survey respondents
and expert assessments worldwide.1
Voice and accountability (VOIACC): Reflects perceptions about the extent to
which a country’s citizens are able to participate in selecting their government, as
well as freedom of expression, freedom of association, and a free media.
Political stability and absence of violence/terrorism (POLSTAB): Reflects per-
ceptions about the likelihood that the government will be destabilized or over-
thrown by unconstitutional or violent means including politically-motivated
violence and terrorism.
Government effectiveness (GOVEFFE): Reflects perceptions about the quality
of public services, the quality of civil services and the degree of its independence
from political pressures, the quality of forming and implementing policies and the
credibility of the government’s commitment to such policies.
Regulatory quality (RQ): Reflects perceptions about the government’s ability to
formulate and implement sound policies and regulations that permit and promote
private sector development.
Rule of law (RoL): Reflects perceptions about the extent to which agents have
confidence in and abide by the rules of society, in particular, the quality of contract
enforcement, property rights, the police and the courts, as well as the likelihood of
crime and violence.
Control of corruption (CORR): Reflects perceptions about the extent to which
public power is exercised for private gain, including both petty and grand forms of
corruption as well as the ‘capture’ of the state by elites and private interests.
Political and institutional risk rating, as provided by the International Country
Risk Guide of Political Risk Services (2015), awards the highest value to the lowest
risk and the lowest value to the highest risk and provides a means for assessing the
political and institutional framework of countries. The expected signs for all the
institutional variables are positive, which indicates that better quality institutions
will stimulate more foreign investments.
1 Details on the underlying data sources, the aggregation method, and the interpretation of the
indicators can be found in Kaufmann et al.’s (2010) WGI methodology paper.
As there is a high correlation among the political and institutional indicators and
the possibility of a high degree of multi-collinearity among them, we used each of
the political and institutional indicators separately and hence estimated six separate
models (see Annexure 1 for the correlation matrix).
In an econometric analysis, before carrying out any estimation, a test for
stationarity2 of the variables in the model is undertaken. We found some of the
variables to be integrated of order one, I(1), and others to be integrated of
order zero, I(0) (see Table 7.1).
Once checked for the unit root tests, the next step in the bounds test approach for
co-integration is estimating the ARDL model using the appropriate lag length. One
of the most important issues in applying ARDL is choosing the order of the dis-
tributed lag functions. Pesaran et al. (2001) have shown that the Schwarz Bayesian
Criterion (SBC) should be used in preference over other model specification criteria
because it often has more parsimonious specifications: the small data sample in our
current study further reinforces this point. Since we had 43 annual observations, we
chose two as the maximum lag length in the ARDL model.
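As an illustration of lag selection by the SBC with a maximum lag of two, the sketch below compares candidate lag orders using a common Gaussian form of the criterion in which lower values are preferred. Some software reports the criterion with the opposite sign convention, so this form is an assumption. The residual sums of squares are made-up numbers, not results from the study, and the function names are hypothetical.

```python
import math


def sbc(rss: float, n_obs: int, n_params: int) -> float:
    """Schwarz Bayesian Criterion in the 'lower is better' form."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


def select_lag(candidates: dict, n_obs: int) -> int:
    """Return the lag order with the smallest SBC.

    candidates maps lag order -> (residual sum of squares, number of parameters).
    All candidate models should be fitted on the same estimation sample.
    """
    return min(candidates, key=lambda lag: sbc(candidates[lag][0], n_obs, candidates[lag][1]))


# Hypothetical fits for lag orders 0, 1 and 2 with 43 observations
fits = {0: (12.0, 2), 1: (7.5, 3), 2: (7.3, 4)}
best = select_lag(fits, n_obs=43)
print(best)
```

With a small sample, the ln(n) penalty per parameter means a second lag is retained only when it reduces the residual sum of squares substantially, which is why the criterion tends toward parsimonious specifications.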
For all the models, the bounds test for co-integration rejects the null hypothesis of
no long-run relationship among the variables, as the F-statistic is greater
than the upper bound critical value even at the one percent significance level.
This indicates the presence of a long-run relationship among the variables of interest
in each of the models estimated (Table 7.2).
In the standard least squares model, the coefficient variance-covariance matrix is
derived with a key assumption that the error terms are conditionally homoskedastic
and serially uncorrelated (White 1980). In cases where this assumption is
relaxed to allow for heteroskedasticity or auto-correlation, the expression for the
covariance matrix will be different and our inferences based on it will be misleading
(Roecker 1991; White 1980; Wooldridge 2000, among others).
Given that heteroskedasticity and serial correlation are customary
problems in time series analysis, it is necessary to estimate the coefficient
covariance under the assumption that the residuals are conditionally
heteroskedastic and auto-correlated (Newey and West 1987). The coefficient
2 In this study, the Augmented Dickey-Fuller unit root testing procedure (which does not take into
account a structural break in the data) and the Lumsdaine and Papell (1997) unit root test (which
captures two structural breaks in a series) are used. Though the latter is not reported here, both tests
are in conformity.
have proposed a more general covariance estimator that is consistent in the presence
of both heteroskedasticity and auto-correlation of unknown form. This procedure is
followed in our study. Tables 7.3 and 7.4 present the long-run and short-run
determinants of FDI inflows to Ethiopia based on the ARDL approach.
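A covariance estimator of the Newey and West (1987) type can be sketched as follows. This is a minimal illustration under stated assumptions: Bartlett kernel weights, no small-sample correction, and a user-chosen truncation lag. The study does not report these settings, and the function name is hypothetical.

```python
import numpy as np


def newey_west_cov(X, resid, max_lag):
    """HAC covariance of OLS coefficients with Bartlett weights.

    X is the (n, k) regressor matrix, resid the (n,) OLS residuals,
    max_lag the truncation lag (0 gives White's heteroskedasticity-only estimator).
    """
    X = np.asarray(X, dtype=float)
    u = np.asarray(resid, dtype=float)
    xu = X * u[:, None]                      # row t holds x_t * u_t
    S = xu.T @ xu                            # lag-0 term
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)  # Bartlett kernel weight
        G = xu[lag:].T @ xu[:-lag]           # sum over t of x_t u_t u_{t-lag} x_{t-lag}'
        S += weight * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ S @ XtX_inv             # sandwich form


# Usage on simulated data (43 observations, intercept plus one regressor)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(43), rng.normal(size=43)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=43)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
cov = newey_west_cov(X, y - X @ beta, max_lag=2)
print(np.sqrt(np.diag(cov)))                 # HAC standard errors
```

The declining Bartlett weights keep the estimated matrix positive semi-definite, which is the main reason they are used instead of equal weights on the lagged cross-products.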
(A) The long-run model
In line with previous empirical studies on Africa, most of the explanatory
variables have their expected signs in the long run. Market size (as proxied by
GDP), trade openness (as proxied by trade as a percentage of GDP), resource
abundance, and depreciation in the official exchange rate are found to have a
significant positive impact on FDI inflows in the long run.
The significant positive long-run coefficient on the GDP variable is in line with
theory and suggests the presence of market seeking FDI inflows to the country.
Given that Ethiopia is home to more than 90 million people and a rising
middle-class population this may not be surprising.
The positive sign of the resource abundance indicator variable, as proxied by the
mining and quarrying value added, indicates the presence of resource seeking FDI
inflows to the country. This is not surprising given that a good share of FDI inflows
to the country found their way to this sector.
The significant positive coefficient on the exchange rate variable may indicate, as
noted by Elbadawi and Mwega (1997) among others, that depreciation in Ethiopia’s
exchange rate is affecting the inflows of FDI positively.
On the other hand, macroeconomic instability as proxied by the inflation rate
was found to affect FDI inflows negatively. The significant negative coefficient of
the inflation variable in the long run implies that foreign investors prefer investing
their money in countries where they perceive better macroeconomic stability.
Similarly, the significant positive coefficient of the trade openness variable suggests
that liberalization in the external trade sector of the country has encouraged FDI
inflows; this also supports the proposition that foreign investors are more likely to
invest in countries that have opened up to the outside world (see Onyeiwu and
Shrestha 2004; Asiedu 2006; Anyanwu 2012; among others).
In addition, better political stability and absence of violence/terrorism, govern-
ment effectiveness in forming and implementing quality policies and the credibility
of the government’s commitment to such policies, regulatory quality with regard to
the ability of the government to formulate and implement sound policies and
regulations that permit and promote private sector development, and better per-
formance of the rule of law affect FDI inflows into the country positively.
(B) The short-run model
In line with previous empirical studies on Africa, most of the macroeconomic
determinants of FDI inflows have their theoretical expected signs in all the models
in the short run. Market size, natural resource abundance, and trade openness were
found to affect FDI inflows in a significant positive way. The positive sign of the
natural resource availability variable as proxied by the mining and quarrying value
added indicates the presence of resource seeking FDI flows to the country. This is
not surprising given that a good share of FDI inflows to the country found their way
to this sector.
The consistent negative coefficient of the inflation variable in all the models in
the short run implies that foreign investors prefer investing their money in countries
where they perceive better macroeconomic stability. Similarly, the significant
positive coefficient of the trade openness variable suggests that liberalization in the
external trade sector of the country has encouraged FDI inflows and also supports
the proposition that foreign investors are more likely to invest in countries that have
opened up to the outside world (see Onyeiwu and Shrestha 2004; Asiedu 2006;
Anyanwu 2012; Geda and Yimer 2015; among others).

Table 7.4 The short-run model: Error correction model’s (ECM) results
Dependent variable: D(log of net FDI inflows)
Sample: 1970–2013; no. of observations: 43

Variables                             Model 1            Model 2            Model 3            Model 4            Model 5            Model 6
ARDL order                            (1,1,0,0,0,0,2,0)  (1,1,0,0,0,0,1,0)  (1,0,0,1,0,0,1,0)  (1,1,0,0,0,0,1,0)  (1,0,0,1,0,0,1,0)  (1,1,0,1,0,0,1,0)
D(Log of real GDP per capita)         4.06***            3.66***            1.57*              3.93***            1.22*              2.7**
D(Log of natural resource abundance)  2.36**             2.50**             1.33*              2.27**             1.78*              1.46
D(Log inflation)                      −1.28**            −2.06*             −1.75              −1.81              −2.24*             −1.38
D(Log of external debt to GDP ratio)  −0.29              −0.61              −0.19              −0.65              −0.23              −0.72
D(Log of openness)                    0.19*              0.18*              0.02               0.16               0.04               0.04
D(Log of nominal exchange rate)       −3.36**            −0.98              −0.06              −1.52              0.44               −1.08
D(Rule of law)                        4.38**
D(Political stability)                                   −0.26
D(Government effectiveness)                                                 4.19***
D(Control of corruption)                                                                       0.66
D(Regulatory quality)                                                                                             5.09**
D(Voice and accountability)                                                                                                          4.02**
ECMt−1                                −0.92***           −0.88***           −0.84***           −0.79***           −0.90***           −0.78***

Note ***, ** and * indicate 1, 5 and 10% level of significance respectively
In addition, except for control of corruption and political stability, all the other
political and institutional indicators have their a priori expected significant positive
signs. Among the political and institutional indicators, better regulatory quality,
better performance of the rule of law, and government effectiveness have a
significant positive effect on FDI inflows to the country.
As Table 7.4 shows, the error correction term (ECM) has the expected negative sign
and is highly significant, suggesting that deviations from the long-term trajectory
are corrected very quickly. The ECM coefficient shows how quickly or slowly the
relationship returns to its equilibrium path, and it should be statistically
significant with a negative sign. This holds for all the models estimated. As
noted by Banerjee et al. (1998), a highly significant error correction term is further
evidence of a stable long-term relationship.
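To make the speed of adjustment concrete, the sketch below converts an ECM coefficient into the share of a deviation corrected per year and an implied half-life. It assumes annual data and a geometric decay of the remaining gap, (1 + θ)^k after k years; this is a common reading of the coefficient rather than something stated in the study, and the function name is hypothetical. The two coefficients used are the Model 1 and Model 6 values from Table 7.4.

```python
import math


def adjustment_profile(ecm_coef: float) -> tuple:
    """Return (share of the gap closed per period, half-life in periods).

    Assumes -1 < ecm_coef < 0, so the gap shrinks monotonically.
    """
    if not -1.0 < ecm_coef < 0.0:
        raise ValueError("expected an ECM coefficient strictly between -1 and 0")
    share_per_period = -ecm_coef
    # Remaining gap after k periods is (1 + ecm_coef) ** k; solve for 0.5
    half_life = math.log(0.5) / math.log(1.0 + ecm_coef)
    return share_per_period, half_life


for label, coef in (("Model 1", -0.92), ("Model 6", -0.78)):
    share, half_life = adjustment_profile(coef)
    print(f"{label}: {share:.0%} corrected per year, half-life {half_life:.2f} years")
```

Under these assumptions, even the slowest-adjusting model closes half of a deviation in under half a year, which is consistent with the description of very quick correction.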
(C) Diagnostic and stability tests
As shown in Table 7.5 all the estimated models had a good fit. In addition, all
the models passed all the post-estimation diagnostic tests. These included tests
for normality, heteroskedasticity, serial correlation, model specification, and
stability. In analyzing the stability of
the long-run coefficients together with short-run dynamics, the cumulative sum
(CUSUM) and the cumulative sum of squares (CUSUMSQ) tests were applied (see
Annexure 2 for the results). Following Pesaran et al. (2001), the stability of the
regression coefficients was evaluated by stability tests as they can show whether or
not the regression equation is stable over time. This stability test is appropriate in
time series data, especially when we are uncertain about when structural change
might have taken place.
As can be seen in the graphs in Annexure 2, the plots of both the CUSUM and
CUSUMSQ statistics stayed within the critical bounds at the 5% significance
level and did not cross the lower or upper critical limits. This implies that the
estimated coefficients were stable and there was no structural break.
7.6 Conclusion
Based on the ARDL modeling approach along the lines of Dunning’s (1981, 1988)
‘eclectic theory,’ this study identified the main determinants of FDI flows to
Ethiopia for the period 1970–2013. The results of the empirical modeling exercise
in this study conclusively support the hypothesis that FDI in Africa is conditional
on prudent macro-policies and enabling business environments manifested through
better political stability and institutional quality. Better macroeconomic conditions,
political stability, institutional quality, and resource availability affect FDI flows to
Ethiopia positively. Depreciation in the exchange rate was also found
to affect FDI inflows positively.
Prudent fiscal and monetary policies to tackle the negative impact of inflationary
pressures on FDI inflows and a move toward a careful liberalization of the foreign
exchange market and of external trade are important policy options that the gov-
ernment could work on to boost FDI inflows to the country. In addition, sustaining
the current growth momentum of the economy and further strengthening political
stability in the country, taking sincere steps to increase transparency, controlling
corruption and improving the regulatory quality of the country’s institutions are
fundamental areas that the government could work on to strengthen the country’s
position in the FDI inflows to the continent.
Further, regarding institutional and political factors, foreign investors are
attracted to those African countries that are more democratic. To attract foreign
investors, the country needs to improve its political and social situation and elevate
its democracy from a mere electoral level to a more liberal one. What is needed,
therefore, is deep introspection and political reforms of the various institutions and
political parties seeking to govern so as to promote a sustained commitment to
democracy that will guarantee equal citizenship, political pluralism, freedom,
human rights, general respect for others, and socio-political cum economic
inclusion.
[Figure: CUSUM and CUSUMSQ plots for Models 1–6; horizontal axis covers 1984–2012]
References
Abdoul GM (2012) What drives foreign direct investments in Africa? An empirical investigation
with panel data. African Center for Economic Transformation (ACET), Accra
Agarwal JP (1980) Determinants of foreign direct investment: a survey. Rev World Econ
116(4):739–773
Anyanwu JC (2012) Why does foreign direct investment go where it goes? New evidence from
African countries. Ann Econ Fin 13(2):433–470
Asiedu E (2002) On the determinants of foreign direct investment to developing countries: Is
Africa different? World Dev 30(1):107–118
Asiedu E (2003) Foreign direct investment in Africa: the role of government policy, institutions
and political instability. KS working paper No. 23, University of Kansas
Asiedu E (2006) Foreign direct investment in Africa: the role of natural resources, market size,
government policy, institutions and political instability. World Econ 29(1):63–77
UNCTAD (2013) World investment report: trends and determinants. In: United Nations
conference on trade and development. United Nations, New York
UNDP (2011) Illicit financial flows from the least developed countries: 1990–2008, UNDP
discussion paper
Vernon R (1966) International investment and international trade in the product cycle. Quart J
Econ 80:190–207
Wheeler D, Mody A (1992) International investment location decisions: the case of US firms. J Int
Econ 33:57–76
White H (1980) A heteroskedasticity-consistent covariance matrix and a direct test for
heteroskedasticity. Econometrica 48:817–838
Wooldridge JM (2000) Introductory econometrics: a modern approach. South-Western College
Publishing, Cincinnati, OH
Part III
Capital Structure and Bank Loan Growth
Effects
Chapter 8
Firm-Specific Determinants of Insurance
Companies’ Capital Structure in Ethiopia
8.1 Introduction
Capital structure (CS) is a mix of long-term debt, specific short-term debt, common
equity, and preferred equity. It shows how a firm finances its overall operations and
growth by using different sources of funds. While looking at what constitutes CS,
debt comes in the form of bond issues or long-term notes payable and equity as
common stock, preferred stock, or retained earnings. It is in insurance companies’
interest to know about their CS patterns as they need funds to settle claims or pay
damages at the time of loss. This helps insurance companies to be sustainable
because of the nature of risks involved in their businesses and the inherent
impracticality of retaining all risks that they face during operations.
Y. Takele (✉)
College of Business and Economics, Addis Ababa University,
Addis Ababa, Ethiopia
e-mail: yitbarekt87@gmail.com
D. Beshir
Libya Oil Ethiopia Ltd, Addis Ababa, Ethiopia
The paper is structured as follows. Section 8.2 gives a brief overview of the
Ethiopian insurance sector. Section 8.3 discusses major theoretical underpinnings
of the subject. The next section addresses the link between theoretical lenses and the
variables chosen along with empirical reviews and the conceptual framework.
Section 8.5 explains the relationship among the variables, the methodology, and
data, while Sect. 8.6 analyzes the empirical results. Section 8.7 gives a conclusion.
The determinants of CS have been debated for many years and still represent one
of the unresolved issues in the corporate finance literature. Though a few of the
theories that have been developed have been empirically tested, their findings have
led to different, anomalous, and sometimes conflicting results and conclusions. This
also suggests that the different theories are not mutually exclusive making the
debates on CS more exciting (Rajan and Zingales 1995). Moreover, Morri and
Beretta (2008) emphasize the lack of a fully supported and commonly accepted
theory of CS decisions and the unfolding nature of its determinant factors.
The different studies have made immense contributions to the theory of CS.
However, these studies are inclined toward the developed economies, and less
developed countries have received little attention. This has raised concerns about
the generalizability of such works, for example, where capital markets are not well
developed or are underdeveloped. Consequently, research designs, methodologies,
and theoretical frameworks that best fit such contexts are worth undertaking. In
previous studies, antecedent variables, commonly regarded as determinants of CS
decisions, include profitability, age, agency cost, business risk, asset tangibility,
growth, non-debt tax shields, liquidity, political risks, and size. These variables,
among others, are related to firm value and risk exposure in one way or another.
Our study, therefore, investigates the determinants of decisions about CS in the
insurance industry in Ethiopia during 2005–2014. Our research identified six
hypotheses (Hai):
Ha1: There is a negative relationship between leverage and profitability in Ethiopian
insurance companies.
Ha2: There is a positive relationship between leverage and asset tangibility in
Ethiopian insurance companies.
Ha3: There is a positive relationship between leverage and growth in Ethiopian
insurance companies.
Ha4: There is a negative relationship between leverage and business risk in
Ethiopian insurance companies.
Ha5: There is a positive relationship between leverage and size of the firm in
Ethiopian insurance companies.
Ha6: There is a negative relationship between leverage and liquidity in Ethiopian
insurance companies.
8 Firm-Specific Determinants of Insurance Companies’ … 157
The emergence of modern insurance in Ethiopia can be traced back to the estab-
lishment of the Bank of Abyssinia in 1905. The bank acted as an agent for foreign
insurance companies to underwrite fire and marine policies. The first domestic
private insurance company was established in 1951 with a share capital of Eth Br
1,000,000, and in the 1960s the number of domestic private companies started
increasing (Zeleke 2007).
At present, 15 insurance companies operate in Ethiopia; all provide general
insurance services except one, which provides life insurance. One
of the insurance companies, the Ethiopian Insurance Corporation (EIC), is
state-owned, while the rest are private. Ethiopian insurance companies’ investment
activities are heavily constrained by the restrictions imposed by the National Bank
of Ethiopia’s investment proclamation which requires them to invest a majority of
their funds in government securities and bank deposits at negative real interest rates.
Moreover, lack of infrastructure, especially a stock market, has constrained
investment activities of Ethiopian insurance companies (Mezgebe 2010). Following
this, competition has become stiff in the industry and some of the private insurance
companies that want to increase their sales volumes have been granting unfair and
huge discounts to attract clients, thus attaining sales targets. This aggressive pricing
policy has led to an unhealthy spiral of premium cutting which significantly
undermines the growth and prospects of the insurance industry in Ethiopia.
The dynamic trade-off theory, on the other hand, recognizes the role of time, which
requires specifying a number of aspects that are typically ignored in a single-period
model. Of particular importance are the roles of expectations and adjustment costs.
In a dynamic model, the correct financing decision typically depends on the
financing margin that a firm anticipates in the next period (Goldstein et al. 2001).
Thus, an optimal financial choice today depends on what is expected to be optimal
in the next period.
Agency cost is another theory that predicts that CS choice is dependent on
agency cost. It advocates an investigation of the conflicting interests of managers
and equity and debt holders and its impact on CS decisions. It argues that managers
who are well placed to access superior information as compared to both debt and
equity holders, mainly due to ex-post asymmetric information (Jensen and
Meckling 1976; Jensen 1986), may make CS decisions that maximize their interests
but destroy the firm’s value.
Yet another interesting theory is the pecking order theory developed by Myers
and Majluf (1984) which states that CS is driven by a firm’s desire to finance new
investments, first internally and then with low-risk debt and finally, if all fails, with
equity. Its main thesis is an association of asymmetric information and signaling
problems with external financing.
Finally, Baker and Wurgler (2002) have suggested another theory of CS: the
‘market timing theory of CS.’ Market timing implies that firms issue new shares
when they perceive they are overvalued and that firms repurchase their own shares
when they consider these to be undervalued.
What we can deduce from these theories is that they are not mutually exclusive
and do not stand on their own; rather, there exists a thread connecting them:
information asymmetry. The exception to this could be the trade-off theory which
mainly bases itself on tax shield advantages and bankruptcy costs.
Myers’ (1984) views, empirical evidence from financial and non-financial firms
(Ahmed et al. 2010; Gill et al. 2009; Najjar and Petrov 2011; Rajan and Zingales
1995; Sharif et al. 2012; Teker et al. 2009) found that profitable firms used less debt
financing in line with the pecking order theory, while studies by Kumar et al. (2012)
and Sayeed (2011) found that profitable firms used more debt finance. As a proxy
for the measure of profitability, our study used the ratio of operating income to total
assets (return on assets) used by Booth et al. (2001), Cassar and Holmes (2003),
Mohammed Amidu (2007), and Adesola (2009).
According to Jensen and Meckling’s (1976) agency cost theory, there is a
conflict between lenders and shareholders due to the possibility of moral hazard on
the part of borrowers. This conflict creates incentives for shareholders to invest in a
suboptimal way, and lenders require tangible assets as collateral to protect them-
selves. The agency cost of debt increases when firms cannot collateralize their
debts. The outsized proportion of a firm’s assets can be used as collateral to fulfill
lenders’ requirements. In the trade-off theory, Modigliani and Miller (1963) argue a
reduction in financial distress costs for those firms with more tangible assets
because of a better chance to get debt financing. Empirical studies by Najjar and
Petrov (2011); Noulas and Genimakis (2011); Rajan and Zingales (1995); and
Titman and Wessels (1988) found that firms with a higher proportion of tangible assets
raised more debt using the same as collateral. As indicated in the studies by
Mohammed Amidu (2007) and Adesola (2009), our study also used the ratio of
fixed assets over total assets as a proxy measure of tangibility.
The pecking order theory argues that firms prefer debt financing over equity due
to its riskiness, and hence, a positive relationship between leverage and growth is
expected. However, in the static trade-off theory, growing firms face financial
distress and prefer to use equity financing. Empirical studies by Ahmed et al.
(2010); Noulas and Genimakis (2011); Kumar et al. (2012); and Sharif et al. (2012)
have found that growing firms used more debt to finance their businesses. Contrary
to this, studies by Rajan and Zingales (1995); Shah and Khan (2007); and Titman
and Wessels (1988) show that growing firms used equity financing instead of debt.
In our study, following the argument given by Dawood et al. (2011) and Onaolapo
and Kajola (2010), growth was measured as the annual percentage change in total
assets.
The static trade-off theory (Myers 1984) argues that risky firms can borrow less
as compared to less risky firms because the costs of financial distress offset the tax
shields of debt. The riskier a firm, the greater the chance of defaulting and being
exposed to such costs. That is, firms with highly volatile earnings face a risk of
earnings dropping below their debt servicing commitments, thereby incurring higher
costs of financial distress. Hence, such firms should reduce their leverage levels to
avoid the risk of bankruptcy. As indicated in Song (2005), income variability is a
measure of business risk. In our study, it is measured as the ratio of the standard
deviation of operating income over total assets.
Theoretically, the static trade-off theory states that for large companies the risk of
bankruptcy is minimized due to the economies of scale. The assets of a company
will be financed more through debt, as optimality of CS can be reached by
160 Y. Takele and D. Beshir
balancing the benefits and costs of debt (Modigliani and Miller 1958). The
empirical results of Ahmed et al. (2010), Kumar et al. (2012), and Najjar and
Petrov (2011) support the argument that the size of a firm and its leverage
are positively related. According to the pecking order theory, however, informa-
tional asymmetry for large firms is smaller, and as a result, they prefer to be
financed by equity instead of debt (Myers and Majluf 1984) because this reduces
the chances of undervaluation of the new issued equity and thus encourages the
large firms to use equity financing. In our study, as in Booth et al. (2001) and
Cassar and Holmes (2003), the natural log of total assets is used to measure the size
of the firm.
There are two different opinions about the association between liquidity and CS.
The first view, as explained in the trade-off theory, argues that firms with more
liquidity tend to use more external borrowings because of their ability to pay off
their liabilities. On the contrary, the pecking order theory holds that firms with
financial slack will prefer internal sources to debt or equity to finance future
investments (Myers 1984). Most previous studies confirm the negative relation.
Harris and Raviv (1991); Najjar and Petrov (2011); and Sharif et al. (2012) found
that firms with high liquidity ratios or more liquid assets preferred using these assets
to finance their investments and discouraged raising external funds (either equity
or debt). But Bayeh found an insignificant effect of liquidity on leverage usage
by insurance companies. As in Dawood et al. (2011), our study used the ratio of
current assets to current liabilities to capture liquidity
(see Table 8.1).
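The proxy definitions given in the preceding paragraphs can be sketched in a few lines of Python. This is only an illustration: the function names, argument names, and input figures are hypothetical, and the risk measure follows one possible reading of the text (the sample standard deviation of a firm's operating income divided by its total assets).

```python
import math
from statistics import stdev


def firm_year_proxies(operating_income, total_assets, prev_total_assets,
                      fixed_assets, current_assets, current_liabilities,
                      total_debt, total_equity):
    """Ratio proxies for one firm-year, as described in the text.

    All argument names are hypothetical; they are not taken from the study's data files.
    """
    return {
        "PF": operating_income / total_assets,                 # profitability (return on assets)
        "TN": fixed_assets / total_assets,                     # tangibility
        "GR": (total_assets - prev_total_assets) / prev_total_assets,  # annual change in total assets
        "SZ": math.log(total_assets),                          # size (natural log of total assets)
        "LQ": current_assets / current_liabilities,            # liquidity (current ratio)
        "TD/TA": total_debt / total_assets,                    # debt ratio
        "D/E": total_debt / total_equity,                      # debt-equity ratio
    }


def business_risk(operating_income_history, total_assets):
    """Assumed reading: std. dev. of operating income scaled by total assets."""
    return stdev(operating_income_history) / total_assets


# Made-up balance-sheet figures, for illustration only.
example = firm_year_proxies(10.0, 200.0, 160.0, 40.0, 90.0, 75.0, 130.0, 70.0)
example_risk = business_risk([8.0, 10.0, 12.0], 200.0)
```

Growth is returned as a fraction rather than a percentage, which matches the scale of the growth variable reported in the descriptive statistics.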
Our study used the quantitative research approach to construct an empirical model.
Multiple regression analyses were used to measure the effects of the determinants
on the output variable and to examine the associative relationships between vari-
ables in terms of the relative importance of the independent variables and predicted
values of the dependent variables.
Our study used secondary data from annual reports of insurance companies and
the National Bank of Ethiopia (NBE). As per NBE’s current information, 15 in-
surance companies are operating in the country. Since there are only a few insur-
ance companies, there was no need to take a sample from them. Accordingly, based
on the years of service, audited financial data of those insurance companies which
were operational in 2005–2014 were included in our study. The reason behind
selecting the stated period was to obtain strongly balanced data for the analysis. In
order to make the panel data model structured and balanced, the same regular
frequency of the cross-sectional data with the same start and end dates was
maintained. Six insurance companies did not have the required data for the period
and were excluded from the sampling frame. Moreover, one insurance company is
The subscript i represents the cross-sectional dimension, and t denotes the time
series dimension. The left-hand side in the equation, $Y_{i,t}$, represents the dependent
variable in the model, which is a firm’s leverage. On the right side, $X_{i,t}$ represents
the set of independent variables in the estimated model. Therefore, the expanded
forms of both models built in line with the hypothesis of the study are as follows:
DEBT model: debt ratio (total debt/total assets) as the dependent variable

(1) $(TD/TA)_{it} = \beta_0 + \beta_1 PF_{it} + \beta_2 TN_{it} + \beta_3 GR_{it} + \beta_4 RK_{it} + \beta_5 SZ_{it} + \beta_6 LQ_{it} + e$

DE model: debt–equity ratio as the dependent variable

(2) $(D/E)_{it} = \beta_0 + \beta_1 PF_{it} + \beta_2 TN_{it} + \beta_3 GR_{it} + \beta_4 RK_{it} + \beta_5 SZ_{it} + \beta_6 LQ_{it} + e$

where
TD/TA   Total debt to total assets
D/E     Debt to equity
PF      Profitability
TN      Tangibility
GR      Growth
RK      Risk
SZ      Size of the firm
LQ      Liquidity
e       Error term
The models were tested for the classical linear regression model’s (CLRM)
assumptions. Accordingly, Shapiro–Wilk, the correlation matrix, and Breusch–Pagan
tests were conducted to test normality, multi-collinearity, and heteroskedasticity,
respectively. We found no multi-collinearity problem, which would exist if the
correlation between any two independent variables was more than 0.75 (Malhotra 2008).
Moreover, Shapiro–Wilk showed that normality had been established. See
Annexure 2 for diagnostic tests.
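The correlation-matrix screen described above can be illustrated with a short Python sketch. It is not the authors' procedure: the function names and toy columns are hypothetical, and the sketch flags a pair when the absolute value of its Pearson correlation exceeds the 0.75 threshold, which is an assumption about how the rule was applied.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.75):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Toy regressors (made up): "a" and "b" move together, "c" does not.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.5], "c": [1, -1, 1, -1]}
pairs = collinear_pairs(toy)
```

An empty result for the six regressors would correspond to the finding reported in the text.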
We used the regression models and applied different tests (Breusch and Pagan
Lagrangian multiplier (LM) test, Hausman test) to choose the best model for the
panel data under the study:
• Pooled OLS (POLS) model regression,
• Pooled OLS with dummy variable (least square dummy variable: LSDV) model
regression or fixed effects regression model, and
• Random effects GLS (generalized least squares) model regression.
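The difference between pooling and the fixed effects (within) approach listed above can be seen in a minimal single-regressor sketch. The data are invented and the function names are hypothetical; the study itself uses six regressors and Stata, so this only illustrates the mechanics of demeaning by firm.

```python
from collections import defaultdict


def pooled_slope(rows):
    """OLS slope of y on x, ignoring which firm each observation belongs to."""
    n = len(rows)
    mx = sum(x for _, x, _ in rows) / n
    my = sum(y for _, _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for _, x, y in rows)
    sxx = sum((x - mx) ** 2 for _, x, _ in rows)
    return sxy / sxx


def within_slope(rows):
    """Fixed effects (within) slope: demean x and y by firm, then run OLS."""
    groups = defaultdict(list)
    for firm, x, y in rows:
        groups[firm].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mx) * (y - my) for x, y in obs)
        sxx += sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx


# Two made-up firms: within each firm y falls by 0.5 per unit of x,
# but firm B has both higher x and a higher firm-specific intercept.
rows = [("A", 1, 0.5), ("A", 2, 0.0), ("A", 3, -0.5),
        ("B", 5, 2.5), ("B", 6, 2.0), ("B", 7, 1.5)]
fe = within_slope(rows)      # recovers the within-firm slope
pols = pooled_slope(rows)    # mixes within- and between-firm variation
```

In this toy panel the pooled slope is positive while the within slope is negative, which is the kind of discrepancy the model selection tests are meant to detect.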
Before explaining the results of the regression analysis, the results of the descriptive
statistics and Pearson’s correlation coefficient matrix are briefly explained.
The mean of debt ratio (total debt to total assets) of the 80 observations was
66.8% with a standard deviation of 8.3% indicating that more than 66% of the
balance sheets of insurance companies in Ethiopia were debt-financed, while the
mean debt ratio in the USA and in the UK is 58 and 54%, respectively (Rajan and
Zingales 1995) (Table 8.2). Though theoretically it is argued that firms in
developed countries are more levered than their developing country counterparts,
mainly due to their well-developed bond markets, the findings of our study show
otherwise. This could be related to the absence of stock markets in developing
countries, which makes equity financing more unattractive. What is interesting about
the descriptive statistics of our results is the presence of high variability in the
growth, tangibility, size, and liquidity of insurance companies in Ethiopia which
may stress the need to consolidate the sector through mergers and acquisitions.

Table 8.2 Descriptive summary statistics
Variable   Obs.   Mean     Std. dev.   Min      Max
TD/TA      80     0.668    0.083       0.453    0.822
D/E        80     0.755    0.405       −0.189   1.669
gro        80     0.231    0.157       −0.066   0.670
tang       80     0.194    0.110       0.026    0.542
pr         80     0.082    0.049       −0.047   0.182
risk       80     0.141    0.099       0.025    0.432
size       80     18.914   0.843       16.965   20.294
lq         80     1.022    0.264       0.543    2.306
Source Structured review of annual financial report (generated from STATA)
Annexure 1 presents all model selection tests including the results for the POLS
model regression, the fixed effects (or LSDV) regression model, and the random
effects model regression. We used the Breusch and Pagan Lagrangian multiplier
(LM) test to decide between random effects and POLS and the Hausman test to
decide between random effects and fixed effects models.
The results of the Breusch and Pagan LM test for the DEBT model revealed that
there was very strong evidence (p-value 0.0006) at the 1% level of significance
against the null hypothesis that POLS is appropriate. This result favors the random
effects model’s estimation over the pooled OLS model. The same LM test for the
DE model showed indifference between POLS and the random effects model’s
estimations. Moreover, the results of the Hausman test showed very strong evidence
(p-value 0.0085 for the DEBT model and p-value 0.0012 for the DE model) against
the null hypothesis at the 1% level of significance suggesting fixed effects estimates
rather than random effects estimates. Accordingly, the analysis and discussion of
results are based on the fixed effects estimates.
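The p-values quoted above can be checked from the reported test statistics with the chi-square tail function. The sketch below uses only the Python standard library. Two points are assumptions rather than statements from the chapter: that the LM statistic of 10.63 reported in Annexure 1 is judged against a one-sided (halved) chi-square(1) tail, and that the Hausman statistics of 17.21 and 22.10 have 6 degrees of freedom as printed in the annexure tables. The function names are hypothetical.

```python
from math import erfc, exp, sqrt


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(x / 2.0))


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with an even number of df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


# Breusch-Pagan LM test, DEBT model (assumed one-sided, so the tail is halved).
p_lm_debt = 0.5 * chi2_sf_df1(10.63)

# Hausman tests with 6 degrees of freedom.
p_hausman_debt = chi2_sf_even_df(17.21, 6)
p_hausman_de = chi2_sf_even_df(22.10, 6)
```

Under these assumptions the three values round to the p-values reported in the text (0.0006, 0.0085, and 0.0012).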
In order to make the fixed effects estimation results robust, the modified Wald
group-wise heteroskedasticity test in the fixed effects regression model was
undertaken. The results for both the DEBT and DE models revealed very strong
The results of the fixed effects model with a robust standard error regression for the
DEBT model are presented in Table 8.3. The results show that asset tangibility,
profitability, risk, and liquidity had a negative relation with debt ratio, while growth
and firm size had a positive association with leverage. The results also indicate that
growth and tangibility were statistically significant at 5%. Moreover, profitability
and liquidity were significant at 1%, while risk and firm size were insignificant. In
addition, R2-within was 0.7165 and adjusted R2 was 0.6931 for the DEBT
model. Hence, 69.31% of the variability in leverage is explained by selected
firm-specific factors.

Table 8.3 Fixed effects estimates with a robust standard error for the DEBT model’s regression
Fixed effects (within) regression: DEBT MODEL    Number of obs. = 80
Group variable: ID                               Number of groups = 8
R2: Within  = 0.7165                             Obs. per group: min = 10
    Between = 0.8782                                             avg = 10.0
    Overall = 0.7918                                             max = 10
                                                 F(6,7) = 1792.72
corr(u_i, xb) = 0.4602                           Prob > F = 0.0000
(Std. Err. adjusted for 8 clusters in ID)
lev       Coeff.    Robust std. err.   t       p > |t|   [95% conf. interval]
gro       0.076     0.022              3.44    0.011     0.024     0.128
tang      −0.137    0.045              −3.04   0.019     −0.243    −0.030
pr        −0.583    0.100              −5.80   0.001     −0.821    −0.345
risk      −0.319    0.198              −1.61   0.151     −0.787    0.148
size      0.016     0.024              0.68    0.521     −0.014    0.074
lq        −0.120    0.016              −7.61   0.000     −0.157    −0.083
_cons     0.582     0.490              1.19    0.274     −0.577    1.741
sigma_u   0.032
sigma_e   0.029
rho       0.554 (fraction of variance due to u_i)
Source Structured review of annual financial report (generated using STATA)
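The columns of Table 8.3 are linked by simple arithmetic: the t statistic is the coefficient divided by its robust standard error, and the 95% interval is the coefficient plus or minus a critical t value times that standard error. The sketch below reproduces the profitability row under the assumption that the critical value comes from a Student's t distribution with 7 degrees of freedom (8 clusters minus one, in line with the F(6,7) statistic in the table). The constant and function name are hypothetical.

```python
# Two-sided 95% critical value of Student's t with 7 degrees of freedom
# (assumed: degrees of freedom = number of clusters - 1).
T_CRIT_95_DF7 = 2.3646


def t_and_ci(coef, se, t_crit=T_CRIT_95_DF7):
    """Return the t statistic and the confidence interval implied by coef and se."""
    t_stat = coef / se
    half_width = t_crit * se
    return t_stat, (coef - half_width, coef + half_width)


# Profitability row of Table 8.3: coefficient -0.583, robust std. err. 0.100.
t_pr, (lo_pr, hi_pr) = t_and_ci(-0.583, 0.100)
```

The results differ from the tabulated t of −5.80 and interval of −0.821 to −0.345 only in the third decimal place, which is consistent with the table's inputs having been rounded.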
The results of the fixed effects model with a robust standard error regression for the
DE model are given in Table 8.4. The results show that profitability, risk, and
liquidity had a negative relation with the debt–equity ratio, while asset tangibility,
growth, and firm size had a positive association with the debt–equity ratio. The
results also indicate that only profitability and liquidity were statistically significant
at 5%. The other explanatory variables were insignificant. In this model, the value
of R2 -within was 0.5199 and adjusted R2 was 0.4804. This shows that only 48% of
the variability in the debt–equity ratio is explained by selected firm-specific factors.
Table 8.4 Fixed effects estimates with a robust standard error for the DE model regression
Fixed effects (within) regression: DE MODEL      Number of obs. = 80
Group variable: ID                               Number of groups = 8
R2: Within  = 0.5199                             Obs. per group: min = 10
    Between = 0.7077                                             avg = 10.0
    Overall = 0.6022                                             max = 10
                                                 F(6,7) = 74.13
corr(u_i, xb) = 0.2470                           Prob > F = 0.0000
(Std. Err. adjusted for 8 clusters in ID)
lev       Coeff.    Robust std. err.   t       p > |t|   [95% conf. interval]
gro       0.283     0.189              1.49    0.179     −0.165    0.731
tang      0.061     0.731              0.08    0.936     −1.669    1.791
pr        −2.128    0.569              −3.74   0.007     −3.475    −0.781
risk      −0.802    0.803              −1.00   0.351     −2.700    1.096
size      0.220     0.114              1.93    0.095     −0.050    0.490
lq        −0.345    0.116              −2.98   0.020     −0.619    −0.072
_cons     −2.848    2.192              −1.30   0.235     −8.032    2.336
sigma_u   0.179
sigma_e   0.215
rho       0.410 (fraction of variance due to u_i)
Source Structured review of annual financial report (generated using STATA)
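The adjusted R2 values quoted for both models are close to what the textbook adjustment gives when applied to the within R2 with 80 observations and 6 regressors. Whether the authors used exactly this formula is an assumption; the function name is hypothetical.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Textbook adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Within R2 values from Tables 8.3 and 8.4, with n = 80 and k = 6.
adj_debt = adjusted_r2(0.7165, 80, 6)   # text reports 0.6931
adj_de = adjusted_r2(0.5199, 80, 6)     # text reports 0.4804
```

Note that this adjustment counts only the six slope coefficients; it does not charge degrees of freedom for the eight firm-specific intercepts of the fixed effects model.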
The empirical findings of both the models indicate that profitability and liquidity
were significant in determining Ethiopian insurance companies’ financing deci-
sions, while business risk and size of a firm were found to be insignificant in
shaping the behavior of the firm. On the other hand, asset tangibility and growth
opportunities for firms had a significant impact on the total debt ratio. However,
these factors were insignificant for the debt–equity ratio. Insurance companies in
Ethiopia rely on short-term debt due to the absence of a stock market in the country.
They also depend more on external borrowings to expand their markets.
Based on previous studies and an extensive literature review, the major theories
of CS including the static trade-off theory, the pecking order theory, and the agency
theory were selected and an attempt was made to identify the theory that best
explained the financial decision behavior of insurance companies in Ethiopia. The
results revealed that pecking order, information asymmetry, and the static trade-off
theories were all important in explaining the CS of insurance companies in
Ethiopia, even if the pecking order theory appeared to be dominant.
Considering the current growth opportunities for insurance companies in
Ethiopia, internal sources of funding might not be enough. Therefore, it is advisable
not to depend only on internal sources of funds. Having a reasonable proportion of
long-term debt in CS is considered a priority for growth in developing countries as
this helps them utilize available market opportunities. Moreover, the industry
should keep in touch with the trade-off theory since it has strong practical appeal; it
rationalizes moderate debt ratios and sets a target debt-to-equity ratio.
Future Research Direction
Macroeconomic factors (such as inflation, GDP, and interest rate), other qualitative
factors (management quality of each insurance company, policies, and procedures),
and the ownership structures of the companies which might have an impact on CS
choice and the effect of regulation on solvency and CS of insurance companies are
recommended as areas for further research. Moreover, there is a need to thoroughly
study why pecking order happens to be the dominant theory in explaining the
financing behavior of insurance companies in Ethiopia.
POLS model regression, fixed effects (or LSDV) regression model, and the random
effects model regression results of the DEBT model regression
(continued)
Variable POLS LSDV Fixed effects Random effects
7 −0.073***
8 −0.0566**
_cons 0.949*** 0.615* 0.582 0.651**
N 80 80 80 80
Note *p < 0.05; **p < 0.01; ***p < 0.001
Source Structured review of annual financial report (generated using STATA)
POLS model regression, fixed effects (or LSDV) regression model, and the
random effects model regression results of the DE model regression
Breusch and Pagan Lagrangian multiplier test for random effects: DEBT model
lev[ID, t] = xb + u[ID] + e[ID, t]
Estimated results:
        Var      sd = sqrt(Var)
lev     0.0069   0.0832
e       0.0008   0.0291
u       0.0003   0.0180
Test: Var(u) = 0
Chi2(01) = 10.63
Prob > Chi2 = 0.0006
Source Structured review of annual financial report (generated using STATA)
Breusch and Pagan Lagrangian multiplier test for random effects: DE model
lev[ID, t] = xb + u[ID] + e[ID, t]
Estimated results:
        Var      sd = sqrt(Var)
lev     0.164    0.405
e       0.046    0.215
u       0        0
Test: Var(u) = 0
Chi2(01) = 0.00
Prob > Chi2 = 1.0000
Source Structured review of annual financial report (generated using STATA)
Coefficients
        (b)             (B)              (b − B)      sqrt(diag(V_b − V_B))
        Fixed effects   Random effects   Difference   S.E.
gro     0.076           0.089            −0.013       .
tang    −0.137          −0.206           0.069        0.370
pr      −0.583          −0.645           0.062        0.014
risk    −0.319          −0.329           0.087        0.067
size    0.016           0.015            0.001        0.009
lq      −0.120          −0.147           0.027        0.012
b = consistent under H0 and Ha; obtained from xtreg
B = inconsistent under Ha and efficient under H0; obtained from xtreg
Test: H0: difference in coefficients not systematic
Chi2(6) = (b − B)′[(V_b − V_B)^(−1)](b − B)
        = 17.21
Prob > Chi2 = 0.0085
Source Structured review of annual financial report (generated using STATA)
Coefficients
        (b)             (B)              (b − B)      sqrt(diag(V_b − V_B))
        Fixed effects   Random effects   Difference   S.E.
gro     0.283           0.496            −0.213       .
tang    0.061           −0.834           0.893        0.358
pr      −2.128          −3.222           1.094        0.329
risk    −0.802          −1.530           0.728        0.639
size    0.220           0.119            0.102        0.0903
lq      −0.345          −0.674           0.329        0.115
b = consistent under H0 and Ha; obtained from xtreg
B = inconsistent under Ha and efficient under H0; obtained from xtreg
Test: H0: difference in coefficients not systematic
Chi2(6) = (b − B)′[(V_b − V_B)^(−1)](b − B)
        = 22.10
Prob > Chi2 = 0.0012
Source Structured review of annual financial report (generated using STATA)
References
Abor J (2005) The effect of capital structure on profitability: empirical analysis of listed firms in
Ghana. J Risk Financ 6:438–445
Adesola WA (2009) Testing static tradeoff theory against pecking order models of capital structure
in Nigerian quoted firms. Glob J Soc Sci 8(1):51
Ahmed N, Ahmed Z, Ahmed I (2010) Determinants of capital structure: a case of life insurance
sector of Pakistan. Eur J Econ Financ Adm Sci 24:7–12
Amidu M (2007) Determinants of capital structure of banks in Ghana: an empirical approach.
Balt J Manag 2(1):67–79
Baker M, Wurgler J (2002) Market timing and capital structure. J Finance 57(1):1–32
Booth L, Aivazian V, Demirguc-Kunt A, Maksimovic V (2001) Capital structures in developing
countries. J Finance 56(1):87–130
Cassar G, Holmes S (2003) Capital structure and financing of SMEs: Australian evidence. Account
Finance 43(2):123–147
Dawood MHAK, Moustafa ESI, El-Hennawi M (2011) The determinants of capital structure in
listed Egyptian corporations. Middle East Finance Econ 9:83–99
Ebru Ç (2011) An empirical investigation on the determinants of capital structures of Turkish
firms. J Finance Econ 9:35–42
Gill A, Biger N, Pai C, Bhutani S (2009) The determinants of capital structure in the service
industry: evidence from United States. Open Bus J 2:48–53
Goldstein R, Ju N, Leland H (2001) An EBIT-based model of dynamic capital structure. J Bus 74
(4):483–512
Harris M, Raviv A (1991) The theory of capital structure. J Finance 46(1):297–355
Jensen MC (1986) Agency costs of free cash flow, corporate finance, and takeovers.
Am Econ Rev 76(2):323–329
Jensen MC, Meckling WH (1976) Theory of the firm: managerial behavior, agency costs and
ownership structure. J Financ Econ 3(4):305–360
Kindie AB (2011) Determinants of capital structure on Ethiopian insurance companies. Thesis
Addis Ababa University, Addis Ababa
Kumar MS, Dhanasekaran M, Sandhya S, Saravanan R (2012) Determination of financial capital
structure on the insurance sector firms in India. Eur J Soc Sci 29(2):288–294
Malhotra NK (2008) Marketing research: an applied orientation, 5th edn. Pearson Education India
Mezgebe M (2010) Assessment of the reinsurance business in Developing Countries: Case of
Ethiopia. MA thesis, Graduate School of Business, University of South Africa
Modigliani F, Miller MH (1958) The cost of capital, corporation finance and the theory of
investment. Am Econ Rev 48(3):261–297
Modigliani F, Miller MH (1963) Corporate income taxes and the cost of capital: a correction. Am
Econ Rev 53(3):433–443
Morri G, Beretta C (2008) The capital structure determinants of REITs. Is it a peculiar industry?
J Eur Real Estate Res 1(1):6–57
Myers SC (1984) The capital structure puzzle. J Finance 39(3):574–592
Najjar N, Petrov K (2011) Capital structure of insurance companies’ in Bahrain. Int J Bus Manag 6
(11):138
Noulas A, Genimakis G (2011) The determinants of capital structure choice: evidence from Greek
listed companies. Appl Financ Econ 21(6):379–387
Onaolapo AA, Kajola SO (2010) Capital structure and firm performance: evidence from Nigeria.
Eur J Econ Finance Adm Sci 25:70–82
Rajan RG, Zingales L (1995) What do we know about capital structure? Some evidence from
international data. J Finance 50(5):1421–1460
Sayeed MA (2011) The determinants of capital structure for selected Bangladeshi listed
companies. Int Rev Bus Res Pap 7(2):21–36
Shah A, Khan S (2007) Determinants of capital structure: evidence from Pakistani panel data. Int
Rev Bus Res Pap 3(4):265–282
Sharif B, Naeem MA, Khan AJ (2012) Firm’s characteristics and capital structure: a panel data
analysis of Pakistan’s insurance sector. Afr J Bus Manage 6(14):4939
Solomon MA (2012) Characteristics and capital structure: a panel data analysis from Ethiopian
insurance industry. Int J Commer Manag 3(12):21–27
Song HS (2005) Capital structure determinants an empirical study of Swedish companies. CESIS
Electronic Working Paper Series 2005–25
Teker D, Tasseven O, Tukel A (2009) Determinants of capital structure for Turkish firms: a panel
data analysis. Int Res J Finance Econ 29:179–187
Titman S, Wessels R (1988) The determinants of capital structure choice. J Finance 43(1):1–19
Zeleke H (2007) Insurance in Ethiopia: historical development, present status and future
challenges. Master Printing Press, Addis Ababa, Ethiopia
Chapter 9
Income Distribution and Economic
Growth
Atnafu Gebremeskel
Abstract This paper links access to bank loans and income distribution to pro-
ductivity growth. Its main focus is on examining how functional income distribu-
tion can influence the evolution of productivity and thereby promote economic
growth. We obtained key variables and their evolution from the Ethiopian Central
Statistical Agency dataset on medium and large scale manufacturing firms. The
paper uses the evolutionary economic framework and the evolutionary theory
jointly with its evolutionary econometric approach. This sees economic growth as
an open-ended process. The major findings and conclusions of this paper are a lack of
strong evidence of evolution (intra-industry selection) to foster productivity growth
and reallocation (structural change). The employment share of each firm within an
industry entered the model with a negative sign but a significant coefficient. In
economic terms, the positive and negative coefficients of labor share within a firm
and employment share of each firm within the industry give us important infor-
mation about structural changes within the manufacturing sector. The key policy
lesson is that access to bank loans is of great importance to firms. This is partic-
ularly so for industries such as spinning, tanning and publishing in which all firms
that had access to bank loans revealed movements in their employment shares. This
is evidence of structural transformation. It is desired that future research includes
economy-wide modeling, estimation and more formalization of evolutionary eco-
nomic models to study the link between access to bank loans and its effects on
income distribution and inclusive economic growth.
Keywords Income distribution · Evolutionary economics · Evolutionary econometrics ·
Productivity · Growth
A. Gebremeskel (✉)
Department of Economics, Addis Ababa University, Addis Ababa, Ethiopia
e-mail: atnafuga@gmail.com
9.1 Introduction
incentive and growth considerations might be traded off against equity goals. On
the other hand, development economists have long expressed counter-arguments.
For example, Todaro (1997) provides four general arguments why greater
equality in developing countries may in fact be a condition for self-sustaining
economic growth: (a) dissaving and/or unproductive investments by the rich;
(b) lower levels of human capital held by the poor; (c) demand pattern of the poor
being more biased toward local goods; and (d) political rejection by the masses.
Overall, the view that inequality is necessary for accumulation and that redis-
tribution harms growth has faced challenges from many fronts. For example,
Alesina and Rodrik (1994) and Persson and Tabellini (1994) combine political
economy arguments with the traditional negative incentive effect of redistribution.
These authors maintain that inequalities affect taxation through the political pro-
cess when individuals are allowed to vote in order to choose the tax rate (or,
equivalently, vote to elect a government whose programs include a certain redis-
tributive policy). If inequalities determine the extent of redistribution, then this will
have an indirect effect on the rate of growth of the economy.
In their paper ‘Social Conflict, Growth and Income Distribution,’ Benhabib and
Rustichini (1996) explore the effect of social conflict arising due to income dis-
tribution on both short-run and long-run economic growth rates. According to them,
despite the predictions of the neo-classical theory of economic growth, poor
countries were observed to invest at lower rates and have not grown faster than rich
countries. They studied how the level of wealth and the degree of inequalities
affected growth and show how lower wealth can lead to lower growth and even to
stagnation when the incentives to domestic accumulation are weakened by redis-
tributive considerations.
Perotti (1996) contends that equality has a positive impact on growth while
Rehme (2006) argues that redistributing governments may have a relatively
stronger interest in technological advances or high economic integration. He
observes a positive association between redistribution and growth across countries.
While we can find vast literature on income inequalities and economic growth
similar to the studies mentioned earlier, they exclude the role of firms and the
mechanisms behind them for the creation and evolution of the links between in-
come distribution and economic growth. However, the existence of firms and their
actions are recognized in economic theory.
Thus, our introduction of firms in such an analysis is not arbitrary. Firms play a
central role in shaping the path of economic theory and as sources of growth in the
process of economic evolution. This argument is theoretically consistent with one
of the questions in economics (Coase 1937). Thus, any analysis which omits the
role of firms in the creation and evolution of income distribution in the growth
process cannot offer a complete description. More specifically, empirical evidence
on how firms’ financial structures can influence their productivity and thereby drive
economic growth is scarce. This study bridges this gap.
Two crucial questions with policy relevance arise for policymakers. The
first is whether inequality is a prerequisite for growth. And the second concerns the
180 A. Gebremeskel
prices. The troubled economic times after World War I, in particular the Great
Depression, also drew the attention of economists toward analyzing shorter-run
phenomena such as balance-of-payments disequilibria, inflation and unemployment.
There was a renaissance of interest in long-run economic growth after World
War II. One reason for this was that new national product data first became
available for the USA and later for other advanced industrial nations. This allowed
economists, for the first time, to measure economic growth at the national level
(Nelson 1996).
In modern times, the starting point for any study of economic growth is the
neo-classical growth model which emphasizes the role of capital accumulation.
This model, first constructed by Solow (1956) and Swan (1956), shows how eco-
nomic policy can raise an economy’s growth rate by inducing people to save more.
But the model also predicts that such an increase in growth cannot last indefinitely.
In the long run, a country’s growth rate will revert to the rate of technological
progress, which neo-classical theory takes as being exogenous. Underlying this
long-run result is the principle of diminishing marginal productivity which puts an
upper limit on how much output a person can produce simply by working with
more and more capital given the state of technology. Aghion and Howitt (1992,
1998) provide a presentation on this.
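The long-run result described above can be illustrated with a minimal numerical sketch. It assumes a discrete-time Solow-Swan model with Cobb-Douglas technology; the function names and all parameter values (saving rates of 0.20 and 0.30, technological progress g = 0.02, population growth n = 0.01, depreciation delta = 0.05, capital share alpha = 0.33) are illustrative assumptions and are not taken from the chapter.

```python
def steady_state_capital(s, g=0.02, n=0.01, delta=0.05, alpha=0.33):
    """Steady-state capital per effective worker for saving rate s (assumed Cobb-Douglas)."""
    return (s / ((1 + n) * (1 + g) - (1 - delta))) ** (1 / (1 - alpha))


def output_growth_path(s, k0, g=0.02, n=0.01, delta=0.05, alpha=0.33, periods=400):
    """Growth rate of output per worker in each period, starting from capital k0."""
    k = k0
    rates = []
    for _ in range(periods):
        # Law of motion for capital per effective worker.
        k_next = (s * k ** alpha + (1 - delta) * k) / ((1 + n) * (1 + g))
        # Output per worker is A * k**alpha, where technology A grows at rate g.
        rates.append((1 + g) * (k_next / k) ** alpha - 1)
        k = k_next
    return rates


# Start in the steady state for s = 0.20, then raise the saving rate to 0.30.
k_old = steady_state_capital(s=0.20)
rates = output_growth_path(s=0.30, k0=k_old)
print(f"growth right after the saving rate rises: {rates[0]:.4f}")
print(f"growth after 400 periods:                 {rates[-1]:.4f}")
```

Under these assumptions, growth in output per worker rises above g when the saving rate increases and then drifts back to g, which is the diminishing-returns mechanism the paragraph describes.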
The strengths of the neo-classical approach for economic growth are consider-
able. The neo-classical theory has provided a way of thinking about the factors
behind long-run economic growth in individual sectors and in the economy as a
whole. The theoretical structure has called attention to historical changes in factor
proportions and has focused analysis on the relationship between those changes
and factor prices. These key insights, and the language and formalism associated
with them, have served to guide and give coherence to research done by many
different economists around the globe. The weakness of the theoretical structure
is that it provides a grossly inadequate vehicle for analyzing technical change.
The fundamental problems with neo-classical explanations of economic growth
are: (1) despite much empirical effort devoted to the neo-classical production
function, the model still has difficulty explaining considerable inter-plant and
international differences in productivity, as well as differences between developed
economies. Even more striking is the evidence for single industries, which shows
large sectoral productivity gaps between countries (Hodgson 1996); and (2)
increasing capital creates a growing burden of depreciation. It is also noted that the
economic life of capital assets has been declining. In particular, the orthodox
formulation offers no possibility of reconciling analyses of growth undertaken at the
level of the economy or the sector with what is known about the processes of
technical change at the microeconomic level. Hodgson (1996) gives a detailed
account of this and similar arguments.
In response to some of the problems in the standard neo-classical growth theory, the
idea of an endogenous growth theory emerged in the works of Romer (1986, 1987,
1990, 1994), Lucas (1988) and a second generation variant pioneered by Aghion
and Howitt (1992, 1998). They developed the endogenous growth theory which
includes a mathematical explanation of technological advancement.
This broke from the preceding neo-classical thinking by encompassing learning
by doing and knowledge spillover effects. In these models, cumulative divergence
of national output and productivity becomes more likely than convergence and thus
seems to correspond more adequately to available data.
However, the amended aggregate production function is still the conceptual
foundation of the endogenous growth models, typically embodying features such as
increasing marginal productivity of knowledge but diminishing returns in the
production of knowledge (Hodgson 1996).
Therefore, overall, there are constant returns to capital and economies never
reach a steady state. Growth does not slow as capital accumulates, but the rate of
growth depends on the type of capital that a country invests in. Research done in
this area has focused on what increases human capital (for example, education) or
technological change (for example, innovation).
The basic paradigm in mainstream economic theory, namely that individuals take
decisions in isolation using only the information received through some general
market signals such as prices, is built on the general equilibrium model. However,
as is well known, this model guarantees neither stability nor uniqueness of
equilibrium. Since the latter is essential for macroeconomists who wish to use
comparative statics, they have had to avoid this fundamental problem by resorting to
what has become the standard paradigm in modern macroeconomics, that is, the
representative agent (RA) framework.
The basic assumption is that the behavior of the aggregate can be treated as the
behavior of an average individual. The use of such an approach has been frequently
contested and has several obvious disadvantages. Firstly, it means that one has to
ignore communication and direct interaction among agents and ultimately defines
away the problem of coordination (Hahn and Solow 1995; Leijonhufvud 1992). In
this setting, interaction and coordination occur only through prices. The role of
prices is undoubtedly important, but the price mechanism alone can work only if
information is complete; in such a case, one can ignore the influence of other
coordination and interaction mechanisms. Here, again, these difficulties can be
sidestepped by assuming that a sector of the economy can be described by an RA.
There is no simple, direct, correspondence between individual and aggregate
regularities. It may be that in some cases, aggregate choices correspond to those that
can be generated by an individual. However, even in such exceptional cases, the
individual in question cannot be thought of as maximizing anything meaningful
from the point of view of society’s welfare. Our approach is exactly the opposite
of the representative individual approach. Instead of trying to impose restrictions
on aggregate behavior, by using, for example, the first-order conditions obtained
from the maximization program of the representative individual, the claim is that
the structure of aggregate behavior (macro) actually emerges from the interaction
between the agents (micro). In other words, statistical regularities emerge as a
self-organized process at the aggregate level: complex patterns of interacting
individual behavior may generate a certain regularity at the aggregate level. The
idea of representing a society by one exemplar denies the fact that the organiza-
tional features of the economy play a crucial role in explaining what happens at the
aggregate level.
In the RA framework, the way in which markets are organized is assumed to have
no influence on aggregate outcomes. Thus, aggregate behavior, unlike that of
biological or physical systems, can be reduced to that of a glorified individual.
Such an idea has, as a corollary, the notion that collective and individual
rationality are similar. What we
suggest is that collective outcomes be thought of as a result of an interaction
between agents who may have rather simple rules of behavior and who may adapt
rather than optimize. Once one allows for direct interaction among agents, mac-
robehavior cannot, in general, be thought of as reflecting the behavior of a ‘typical’
or ‘average’ individual.
The key assumption behind the construction of the aggregate production func-
tion is that all factor markets are perfect in the sense that individuals can buy or sell
as much as they want at a given price. With perfect factor markets (and no risk), the
market must allocate the available supply of inputs to maximize total output
(discussed extensively in Gatti et al. 2007 and the literature cited there).
Evolutionary theory in economics is as old as economics itself. It was pioneered
by Veblen (1898) when he asked, ‘Why is economics not an evolutionary science?’
and suggested that the only rational approach for economists was to assume that
economies evolve. Otherwise, he argued, we can describe an economy but have no
effective theory of change and development.
Veblen started his argument by asserting that all modern sciences are evolu-
tionary sciences (1898: 374) while Alchian (1950) brought out the evolutionary
approach as an alternative framework in economics. He started by proposing a
suggestion for a modification of economic analyses to incorporate incomplete
information and uncertain foresight as axioms. In the words of Alchian, this
approach dispensed with ‘profit maximization’ and it did not rely on predictable
individual behavior that is usually assumed as a first approximation in standard
textbook treatment.
The suggested approach embodies the principles of biological evolution and
natural selection by interpreting economic systems as an adaptive mechanism which
chooses among exploratory actions generated by the adaptive pursuit of ‘success’ or
‘profit.’
Krugman (1996) describes economics as being about what individuals do: not
classes, not ‘correlations of forces’ but individual actors. This is not to deny the
relevance of higher levels of analyses, but they must be grounded in individual
behavior. Methodological individualism is of the essence. He further notes that
individuals are self-interested. He extends his argument by saying that there was
nothing in economics that inherently prevented us from allowing people to derive
satisfaction from others’ consumption, but the predictive power of economic theory
came from the presumption that normally people care about themselves.
Individuals are intelligent; they do not neglect obvious opportunities for gain. It
is often asserted that economic theory draws its inspiration from physics, and that it
should become more like biology. If that is what you think, you should do two
things. First, read a text on evolutionary theory, like John Maynard Smith’s
Evolutionary Genetics. You will be startled at how much it looks like a textbook on
microeconomics. Second, try to explain a simple economic concept, like supply and
demand, to a physicist. You will discover that our whole style of thinking, of
building up aggregative stories from individual decisions, is not at all the way they
think (Krugman 1996). Veblen and Krugman’s suggestion is that ‘evolutionary
economics is the only rational proposition’ (Boulton 2010).
The renaissance in evolutionary economics in the past two decades has brought
with it a great deal of theoretical developments and interdisciplinary import (Dopfer
and Potts 2004).
Inspired by Veblen’s theory, evolutionary economics has become one alternative
approach to economic analyses involving complex economic interactions. Recent
contributions include Nelson’s (1974) Neo-classical vs Evolutionary Theories of
Economic Growth: Critique and Prospectus. More importantly, Richard Nelson
and Sidney Winter’s seminal work An Evolutionary Theory of Economic Change
(1982), Dopfer’s The Evolutionary Foundations of Economics (2005) and
Beinhocker’s The Origin of Wealth: Evolution, Complexity and the Radical
Remaking of Economics (2006) are advancements in the theory of evolutionary
economics.
The questions to be answered before using an evolutionary theoretical frame-
work to understand how economies grow are: What is evolutionary economics?
Why evolutionary economics? What are the theoretical foundations of evolutionary
economics? Where do economies come from? (Beinhocker 2006). How do the
behaviors, relationships, institutions and ideas that underpin an economy form, and
how do they evolve over time?
Beinhocker has argued that questions about origins play a prominent role in most
sciences: just as it would be difficult to imagine modern cosmology without the
Big Bang or biology without evolution, it would be hard to believe that economics
could ever truly succeed as a science if it were not able to answer the question
‘Where do economies come from?’
Yet, the question about the origin of economies has not played a central role in
traditional economics which has tended to focus on how an economy’s output is
allocated rather than how it got there in the first place. The process of economy
formation presents us with a first-class scientific puzzle and one of the sharpest
distinctions between traditional economics and what is described as Complexity
Economics (Beinhocker 2006).
But what is evolution in economic science? A relatively narrow definition of
evolution is change in the mean characteristics of a population (Andersen 2004).
Economic growth, that is, the aggregate change in real output per person, is a
consequence of increasing the productivity of the factors of production and of
technological changes in a very wide sense. For a constant participation rate, it can
be modeled as the change in mean firm-level real output per employee, with each
firm weighted by its share in total employment. In Holm (2014) this is referred to
as the evolution of labor productivity.
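As a hedged illustration, the employment-weighted mean described here can be written out explicitly. The symbols $z_{it}$ (real output per employee of firm $i$ in period $t$), $L_{it}$ (its employment), $Y_{it}$ (its real output) and $s_{it}$ (its employment share) are notation assumed for exposition and are not taken from Holm (2014).

$$
\bar{z}_t = \sum_i s_{it}\, z_{it}, \qquad s_{it} = \frac{L_{it}}{\sum_j L_{jt}}, \qquad z_{it} = \frac{Y_{it}}{L_{it}}
\quad\Longrightarrow\quad
\bar{z}_t = \frac{\sum_i Y_{it}}{\sum_j L_{jt}}
$$

With these definitions the weighted mean equals aggregate real output per employee, so the evolution of labor productivity between two periods is $\Delta \bar{z}_t = \bar{z}_t - \bar{z}_{t-1}$.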
The key ideas of evolutionary theory are that firms at any time are viewed as
possessing various capabilities, procedures and decision rules that determine what
they do given external conditions. They also engage in various ‘search’ operations
whereby they discover, consider and evaluate possible changes in their ways of
doing things. Firms whose decision rules are profitable, given the market
environment, expand; those whose rules are unprofitable contract. The market
environment surrounding individual firms may be in part endogenous to the behavioral
system taken as a whole; for example, product and factor prices may be influenced
by the output of the industry and the demand for inputs (Nelson and Winter 1982).
According to Holm (2014), economic evolution is an open-ended process of
novelty generation and the reallocation of resources. Selection is the sorting of a
population of agents (firms) that is implicit in their differential growth rates. Firms
innovate and develop knowledge in attempts to gain decisive competitive
advantages over competitors, but firms are intentionally rational agents with
limited information, so innovation or, more generally, learning may also lead to
decreased productivity. Firms prosper or decline as a result of the interaction
between their own learning activities, the learning activities of competitors and the
external factors that set the premises for the interaction. More on this can be found
in Dosi and Nelson (2010) and Metcalfe (1998). Safarzyńska (2010) also provides an
excellent survey.
Holm (2014) explores how the evolution of productivity, or of any other
characteristic in a population of firms, can be described. According to him,
evolution can be understood as the sum of two effects, which are referred to by
different names in the literature: the inter-firm, reallocation or selection effect and
the intra-firm, learning or innovation effect. To these, the effects of entry and exit
are added; but insofar as entry is the introduction of new knowledge by
entrepreneurs and exit is the disappearance of an inferior firm, these effects are
also learning and selection. As a stylized depiction of economic evolution, Holm
(2014) expresses evolution as the total effect of selection, learning, entry and exit.
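The four effects can be sketched as an exact accounting identity for the change in employment-weighted mean productivity. The decomposition below is one common way of writing such an identity and is an assumption made for illustration; it is not necessarily the formulation used by Holm (2014). The function names and the toy firm data are hypothetical.

```python
def weighted_mean(firms):
    """Employment-weighted mean productivity; firms maps name -> (employment, productivity)."""
    total = sum(emp for emp, _ in firms.values())
    return sum(emp * prod for emp, prod in firms.values()) / total


def decompose(before, after):
    """Split the change in mean productivity into selection, learning, entry and exit."""
    total_b = sum(emp for emp, _ in before.values())
    total_a = sum(emp for emp, _ in after.values())
    mean_b = weighted_mean(before)
    parts = {"selection": 0.0, "learning": 0.0, "entry": 0.0, "exit": 0.0}
    for firm in sorted(set(before) | set(after)):
        s0 = before[firm][0] / total_b if firm in before else 0.0
        s1 = after[firm][0] / total_a if firm in after else 0.0
        if firm in before and firm in after:
            z0, z1 = before[firm][1], after[firm][1]
            # Inter-firm effect: share shifts toward or away from above-average firms.
            parts["selection"] += (s1 - s0) * (z0 - mean_b)
            # Intra-firm effect: productivity change inside continuing firms.
            parts["learning"] += s1 * (z1 - z0)
        elif firm in after:
            parts["entry"] += s1 * (after[firm][1] - mean_b)
        else:
            parts["exit"] -= s0 * (before[firm][1] - mean_b)
    return parts


# Hypothetical data: firm -> (employment, output per employee).
before = {"A": (10, 1.0), "B": (10, 2.0), "C": (5, 0.5)}
after = {"A": (8, 1.1), "B": (14, 2.2), "D": (3, 1.8)}
parts = decompose(before, after)
total_change = weighted_mean(after) - weighted_mean(before)
print(parts, total_change)
```

The four components sum to the total change in the weighted mean. In this toy example the exit term is positive because the exiting firm was below the initial average.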
Whereas inter-firm selection is driven by the process of competition,
inter-industry selection is driven by the process of structural change, which is
somewhat different. Productivity understood as physical efficiency is important in
competition among firms that produce homogeneous products, for example,
within industries. This is less the case with heterogeneous outputs, because
computing physical efficiency for heterogeneous products does not make sense.
Moreover, as the composition of demand changes over time, not least as a
consequence of economic growth itself, relative prices change as well, and this
affects inter-industry selection (Holm 2014).
Holm has emphasized the importance of indicating the basic differences between
standard growth theories and growth theories in evolutionary economics.
Evolutionary economists (for example, Richard Nelson, Eric Beinhocker, Geoffrey
Hodgson and John Foster) strongly argue that an evolutionary framework is more
encompassing than standard approaches. Carlsson and Eliasson (2003) note that
economic growth can be described at the macrolevel but never explained at that
level. Economic growth is basically a result of experimental project creation and
selection in dynamic markets and in hierarchies, and of the capacity of the economic
system to capture winners and losers. Castellacci (2007) gives a review of the
evolution of evolutionary theories in economics, which is presented in Table 9.1.
Metcalfe et al. (2006) explored an evolutionary theory of adaptive growth. They
viewed economic growth as a product of structural change and economic
self-transformation, based on processes that are closely connected with, but not
reducible to, the growth of knowledge.
Table 9.1 Contrast between new growth theories and evolutionary growth

| Issues | New growth theories | Evolutionary theories |
|---|---|---|
| What is the main level of aggregation? | Aggregate models based on neo-classical micro-foundations (methodological individualism) | Toward a co-evolution between micro-levels and macrolevels of analysis (‘non-reductionism’) |
| Representative agent or heterogeneous individuals? | Representative agent and typological thinking | Heterogeneous agents and population thinking |
| What is the mechanism of creation of innovation? | Learning by doing and searching activity by: the R&D sector; radical innovations; and general purpose technologies | Combination of various forms of learning with radical technological and organizational innovations |
| What is the dynamics of the growth process? How is history conceived? | History is a uniform-speed transitional dynamics | Toward a combination of gradualist and saltationist dynamics: history is a process of qualitative change and transformation |
| Is the growth process deterministic or unpredictable? | ‘Weak uncertainty’ (computable risk): stochastic but predictable process | ‘Strong’ uncertainty: non-deterministic and unpredictable process |
| Toward equilibrium or never ending | Toward the steady state | Never ending and ever changing |
In Eq. 9.1, b is the net firm entry-exit rate or diffusion coefficient (net in the
sense that it allows for deterioration or deaths), and K is the carrying capacity of
the environment, for example, the total market size, employment or output of an
industry or economy, over which firms compete to capture as large a share as
possible. K is a constraint, for example, the total sales of an industry, and X could
be a firm’s sales, so that X/K is the firm’s market share.
Two points must be raised about Eq. 9.1. First, X/K can be understood as any
share. If we are to work at the macrolevel, we may interpret X/K as the ratio of GDP
to capital stock. This ratio is less than 1 because at any point in time the total
national output is some fraction of inputs, the magnitude of the fraction depending
on the productivity of the economy.
Equation 9.1 can be expanded to employ the existing econometric framework
for estimation. Foster and Wild (1999b) acknowledge that applications of an LDE
of this type have been common in the literature on the economics of innovation,
following Griliches’s (1957) pioneering work. However, economists have tended to
view the LDE in terms of disequilibrium adjustment from one stable equilibrium
state to another in the economics of evolutionary growth theory.
As it stands, Eq. 9.1 depicts a smooth process that approaches its limit only as
time tends to infinity. Only in a discrete interval version of the LDE can we
generate the kinds of discontinuities that we see in historical data. However,
discrete interval dynamics are not pronounced features of most aggregated
economic data. Thus, it is unlikely that we can generate a discontinuity
endogenously in most cases.
Now, it is convenient for the purposes of an econometric investigation to rear-
range Eq. 9.1 in the following way to obtain the Mansfield (1981) variant,
employed in many such studies. Dividing both sides of Eq. 9.1 by K and rear-
ranging, we arrive at:
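$$
\begin{aligned}
X_t - X_{t-1} &= X_{t-1}\, b \left[ 1 - \frac{X_{t-1}}{K} \right] + u_t \\
\ln X_t - \ln X_{t-1} &= b - \frac{b X_{t-1}}{K} + e_t, \qquad \text{where } e_t = u_t / K
\end{aligned}
\tag{9.2}
$$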
The logarithmic approximation in Eq. 9.2 allows the logistic equation to be
estimated linearly, and the error term is corrected for the bias caused by the upward
drift of the mean of the X-series.
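To make the linear estimation concrete, the following minimal sketch simulates a noise-free series that satisfies the log-difference form of Eq. 9.2 and recovers b and K by ordinary least squares. The use of OLS on simulated data, the function names and the parameter values (b = 0.3, K = 100, starting value 5) are assumptions for illustration only; the chapter itself estimates its extended model by GMM on firm-level data.

```python
import math


def simulate_logistic(b, K, x0, periods):
    """Series satisfying ln X_t - ln X_{t-1} = b - b * X_{t-1} / K with no error term."""
    x = [x0]
    for _ in range(periods - 1):
        x.append(x[-1] * math.exp(b * (1 - x[-1] / K)))
    return x


def estimate_b_and_K(x):
    """OLS of the log difference on a constant and the lagged level."""
    y = [math.log(x[t]) - math.log(x[t - 1]) for t in range(1, len(x))]
    lagged = x[:-1]
    n = len(y)
    mean_x = sum(lagged) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(lagged, y))
    sxx = sum((xi - mean_x) ** 2 for xi in lagged)
    slope = sxy / sxx                 # estimate of -b / K
    intercept = mean_y - slope * mean_x   # estimate of b
    return intercept, -intercept / slope


series = simulate_logistic(b=0.3, K=100.0, x0=5.0, periods=40)
b_hat, K_hat = estimate_b_and_K(series)
print(f"b_hat = {b_hat:.4f}, K_hat = {K_hat:.2f}")
```

The intercept of the regression identifies b, and the ratio of the intercept to the negated slope identifies K. With real data the error term would not be zero, so the estimates would only approximate the underlying parameters.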
Equation 9.2 offers a representation of the endogenous growth of a
self-organizing system subject to time irreversibility and constrained by boundary
limits. To come up with the complete econometric model, Foster and Wild qualified
their argument in the following ways:
(a) Regulation in the economic system can restrict economic agents and their
organizations to particular market niches. This means, again, that the principle
of competitive exclusion is significantly weakened. For example, governments
restrict the issue of bank licenses, which preserves a niche which non-bank
financial institutions have difficulty entering. Typically, competition in the
economic sphere is overlaid by ‘public interest’ regulations that attempt to limit
competition;
(b) Economic sub-systems rely on an interaction with the wider economic system
in order to engage in trade. Thus, the K limit for a particular system will tend to
rise continually in line with the general expansion of economic activity; and
(c) Increasing politicization of an economic system will lead to more predator–prey
type interactions. This will tend to occur in saturation phases of LD growth.
Thus, we do not always witness smooth transitions from one LD growth path
to another but, instead, Schumpeterian ‘creative destruction’, dominated by
190 A. Gebremeskel
Thus, b and K are now themselves functions of other variables. The function b()
allows for factors that affect the diffusion coefficient, rendering it non-constant over
time, and K() takes into account the factors in the greater system that expand or
contract the capacity limit faced by the system in question. The resource
competition term, a(), is now a more general functional relationship than the simple
mechanism, containing, for example, relative prices, existing demand for a
particular product and the general economic conditions in the environment.
A potential problem with Eq. 9.3 is that as X tends to its limit, growth in X will
tend to zero so that the impact of factors in b() will also tend to zero. This is
unlikely to be the case, so it is more appropriate to allow exogenous variables that
affect the diffusion rate to influence the rate of growth of X with the same strength at
all points on the logistic diffusion:
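$$
\ln X_t - \ln X_{t-1} = [b(\cdot)] \left[ 1 - \frac{X_{t-1}}{K(\cdot)} \right] - a(\cdot) + b(\cdot) + e_t
\tag{9.4}
$$

$$
\ln X_t - \ln X_{t-1} = [b(\cdot)] \left[ 1 - \frac{X_{t-1}}{K(\cdot)} \right] - a(\cdot) + b(\cdot) + c\,(\ln X_t - \ln X_{t-1})_{t-1} + e_t
\tag{9.5}
$$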
Stockhammer et al. (2008) justify the inclusion of income distribution in the
consumption, investment, net export and government expenditure terms in Eq. 9.6
as follows. In the consumption function, wage incomes (W) and profit
incomes (R) are associated with different propensities to consume. The Kaleckian
assumption is that the marginal propensity to save is higher for capital incomes than
for wage incomes; consumption is therefore expected to increase when the wage
share rises. They argue that Keynesian as well as neo-classical investment functions
depend on output (Y) and the long-term real interest rate or some other measure of
the cost of capital. The latter is part of z1 . The authors further argue that in addition
to output and interest rate, investments are expected to decrease when the wage
share rises because future profits may be expected to fall. Moreover, it is often
argued that retained earnings are a privileged source of finance and may thus
influence investment expenditures.
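The consumption argument can be made explicit with a short worked equation. The linear consumption function and the symbols $c_W$, $c_R$ (propensities to consume out of wage and profit incomes) and $\Omega$ (the wage share) are assumptions introduced here for illustration; they are not the specification of Stockhammer et al. (2008).

$$
C = c_W W + c_R R, \qquad W = \Omega Y, \qquad R = (1 - \Omega) Y
\quad\Longrightarrow\quad
\frac{\partial C}{\partial \Omega} = (c_W - c_R)\, Y
$$

If the propensity to save is higher for capital incomes, then $c_W > c_R$ and, for given output $Y$, consumption rises with the wage share. The investment effect described above works in the opposite direction, so the sign of the overall demand effect is an empirical matter.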
They draw two policy implications from their findings. First, wage
moderation in the EU is unlikely to stimulate employment. They suggest that wage
moderation leads to a (moderate) contraction in output. Since an expansion in
output can be regarded as a necessary (but not sufficient) condition for an expansion
in employment, wage moderation (at the EU level) is not an ‘employment-friendly’
wage policy. Their second implication refers to wage coordination; they contend
that their findings suggest that demand is wage-led in the Euro area. This finding
does not extend to individual Euro member states.
Our paper takes advantage of the formalization of evolutionary economics by
Foster (1994, 2014) and Foster and Wild (1999a, b).
This section examines whether firms’ access to bank loans has any effect on growth
through¹ its effects on functional income distribution. The dataset covers the medium
and large manufacturing industries, as compiled by the Central Statistical Agency
(CSA) of Ethiopia. The available panel data cover 1996–2009, with 611 firms in
1996 and 1943 firms in 2009.
If access to bank loans affects functional income distribution, and if functional
income distribution in turn affects productivity growth, then facilitating access to
bank loans might ultimately foster growth in the economy. To examine this, we
first explore how actual firms fared over the period on some key variables and then
econometrically estimate Eq. 9.5 using the generalized method of moments (GMM).
Finally, alternative policy simulation scenarios are performed to understand the full
effect of the linkage between bank loans, income distribution and productivity
growth.
First, from firm-level data, the parameters of interest are computed for each firm
for each year:
• Employment share (EMPSHAFIRM): Intended to capture whether there is any
indication of structural change, that is, the movement of labor from less
productive to more productive sectors;
• Market share (MKTSHARE): The market is the available resource over which
firms have to compete. It is through this competition process that decisions to
invest in productivity-fostering factors are undertaken;
• Output share (OUSHA): Firms can also compete over industry output; and
• Productivity growth (GROWTHPRO): The main variable of interest. Its
growth rate is understood as the growth of mean characteristics in evolutionary
economics. Thus, growth is perceived to mean growth in productivity.
Based on these variables, our paper draws some inferences about the connection
between access to bank loans, functional income distribution and productivity
growth.
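The computation of the four variables listed above can be sketched as follows. The chapter does not state the exact formulas, so this sketch assumes that each share is the firm's value divided by the total for its industry in that year, and that productivity growth is the log change in output per employee. The record keys (firm, industry, year, employment, sales, output), the function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def add_shares_and_growth(records):
    """Return a copy of each firm-year record with the four variables attached."""
    # Industry-year totals used as denominators (an assumption, see lead-in).
    totals = defaultdict(lambda: {"employment": 0.0, "sales": 0.0, "output": 0.0})
    for r in records:
        for key in ("employment", "sales", "output"):
            totals[(r["industry"], r["year"])][key] += r[key]
    # Output per employee for every firm-year, for the growth calculation.
    productivity = {(r["firm"], r["year"]): r["output"] / r["employment"] for r in records}
    rows = []
    for r in records:
        t = totals[(r["industry"], r["year"])]
        now = productivity[(r["firm"], r["year"])]
        prev = productivity.get((r["firm"], r["year"] - 1))
        rows.append({
            **r,
            "EMPSHAFIRM": r["employment"] / t["employment"],
            "MKTSHARE": r["sales"] / t["sales"],
            "OUSHA": r["output"] / t["output"],
            # Undefined in a firm's first observed year.
            "GROWTHPRO": math.log(now / prev) if prev is not None else None,
        })
    return rows


records = [
    {"firm": "A", "industry": "food", "year": 2000, "employment": 10, "sales": 100.0, "output": 120.0},
    {"firm": "B", "industry": "food", "year": 2000, "employment": 30, "sales": 300.0, "output": 280.0},
    {"firm": "A", "industry": "food", "year": 2001, "employment": 10, "sales": 150.0, "output": 132.0},
    {"firm": "B", "industry": "food", "year": 2001, "employment": 40, "sales": 350.0, "output": 280.0},
]
rows = add_shares_and_growth(records)
for row in rows:
    print(row["firm"], row["year"], row["EMPSHAFIRM"], row["MKTSHARE"], row["OUSHA"], row["GROWTHPRO"])
```

Other denominators (for example, totals for the whole manufacturing sector) would only require changing the grouping key used for the totals.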
1
In the evolutionary growth framework, growth is mainly understood as growth of any mean
characteristics (in our case productivity growth).
The evolution of employment shares, market shares, output shares and growth in
productivity are shown in Figs. 9.1, 9.2, 9.3 and 9.4 in Annexure 2. The purpose of
these figures is to learn if there is any indication of a structural transformation process
within the manufacturing sector. If there is a change in the structure of production in
the manufacturing sector, we expect the labor share to be continuously shifting within
the industry. The shift should take place from low productivity to high productivity
industries. This would mean higher labor productivity and consequently higher labor
incomes which will form a positive feedback loop with productivity.
In Fig. 9.1, we observe movements in employment share for only 11 industries.
We identified these industries from the data as:
• Production, processing and preserving of meat, fruits and vegetables
• Manufacture of animal feed
• Manufacture of non-metallic NEC
• Manufacture of basic iron and steel
• Manufacture of other fabricated metal products
• Manufacture of pumps, compressors, valves and taps
• Manufacture of other general purpose machinery
• Manufacture of batteries
• Manufacture of bodies of motor vehicles
• Manufacture of parts and accessories
• Manufacture of furniture.
From the firm-level dataset, it was possible to learn that most of the firms within
these industries had access to bank loans. For example, overall, the 105 firms within
the production, processing and preserving of meat, fruits and vegetables industries
had access to bank loans. In the manufacture of animal feed industry, out of 98
firms, 37 had access to bank loans. Generally, all the indicated firms had access to
bank loans during the years of observation. In Fig. 9.1, we can observe that in these
industries, there is a significant movement (fluctuation) in employment shares. The
only exceptions are the spinning, tanning and publishing industries, in which all
firms had access to bank loans but no movement in employment share is
displayed.
One can argue that the reallocation of employment must be taking place within the
same sector (industry) and not across industries. If the reallocation of labor were
taking place across industries, we would have observed variations in the
employment share in the rest of the industries, but this is not evidenced.
Whether these industries are high productivity sectors and hence growth and
equality promoting can be another area of enquiry. But looking at their face value
alone, we may tentatively conclude that those industries which are related to
metallic manufacturing in particular are connected to the government (see Fig. 9.1
in Annexure 2).
[Figure: employment share by industry, 1995–2010, one panel per International Standard Industrial Classification (ISIC) industry. Recoverable panel titles: manufacture of pasta and macaroni; food NEC; distilling, rectifying and blending of spirits; wine; malt liquors and malt; soft drinks; tobacco; spinning, weaving and finishing; cordage, rope and twine; knitting mills; wearing apparel except fur; tanning and dressing of leather; footwear; wood and wood products; furniture. Vertical axis: employment share; horizontal axis: period.]
Referring to Fig. 9.3, firms’ shares in total industry output are more pronounced
than their market shares. This tells us the underlying market structure, which may
subsequently have an effect on functional income distribution and productivity
growth (see Fig. 9.3 in Annexure 2).
[Figure: market share by industry, 1995–2010, one panel per International Standard Industrial Classification (ISIC) industry. Recoverable panel titles: manufacture of pasta and macaroni; food NEC; distilling, rectifying and blending of spirits; wine; malt liquors and malt; soft drinks; tobacco; paper and paper products; publishing and printing services; basic chemicals except fertilizers; paints and varnishes; pharmaceuticals and medicinal products; soap, detergents and perfumes; chemical products NEC; rubber; plastics; glass and glass products; structural clay products; cement, lime and plaster; articles of concrete and cement; non-metallic NEC; basic iron and steel; structural metal products; cutlery and hand tools; other fabricated metal products; pumps, compressors, valves and taps; ovens; bodies for motor vehicles; furniture. Vertical axis: market share; horizontal axis: period.]
As discussed earlier, firms are at the heart of an evolutionary approach to
economic growth, and productivity growth at the firm level is key to economic
growth. We can see from Fig. 9.4 that the productivity growth rate fluctuates
(from −20 to 10%). We also note that, for example, productivity growth in the
production, processing and preserving of meat, fruits and vegetables
remained positive, which might indicate the effect of access to bank
loans (see Fig. 9.4 in Annexure 2).
[Figure: output share of each industry based on the value of output (vertical axis, 0 to .15) by year of observation (horizontal axis, 1995–2010), one panel per industry. Panels: production, processing and preserving of meat, fruit and vegetables; manufacture of edible oil; manufacture of dairy products; manufacture of flour; manufacture of animal feed; manufacture of bakery; manufacture of sugar and confectionery; manufacture of pasta and macaroni; manufacture of glass and glass products (2691); manufacture of structural clay products; manufacture of cement, lime and plaster; manufacture of articles of concrete, cement; manufacture of non-metallic NEC; manufacture of basic iron and steel; manufacture of structural metal products; manufacture of cutlery, hand tools....; manufacture of other fabricated metal products; manufacture of pumps, compressors, valves and taps (2912); manufacture of ovens (2919); manufacture of other general purpose machinery (2929). Graphs by International Standard Industrial Classification (ISIC).]
This section deals with the econometric estimation of the logistic differential
equation in Eq. 9.5. The variables entering the model are of two kinds: the
evolutionary component and the exogenous component.
We estimated Eq. 9.5 using firm-level panel data. To do so, the data were
transformed (logarithms, growth rates, lags and differences) so that they were
consistent with the evolutionary econometric framework.
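The transformations named above (logarithms, growth rates, lags and differences) can be illustrated with a minimal sketch for a single firm's yearly series. The function name, the dictionary keys and the numbers are assumptions made purely for illustration; the chapter does not specify how the transformations were implemented.

```python
import math


def transform_series(values):
    """Return log levels, one-period lags, log differences and growth rates.

    `values` holds strictly positive yearly observations for one firm.
    The first period has no predecessor, so derived entries start with None.
    """
    logs = [math.log(v) for v in values]
    lagged = [None] + values[:-1]
    log_diff = [None] + [logs[t] - logs[t - 1] for t in range(1, len(logs))]
    growth = [None] + [
        (values[t] - values[t - 1]) / values[t - 1] for t in range(1, len(values))
    ]
    return {"log": logs, "lag": lagged, "log_diff": log_diff, "growth": growth}


# Hypothetical labor-productivity series (illustrative numbers only).
example = transform_series([100.0, 110.0, 99.0])
```

In a panel setting the same function would be applied firm by firm, so that lags and differences never cross firm boundaries.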
[Figure: productivity growth (GROWTHPRO, vertical axis, −20 to 10) by period (horizontal axis, 1995–2010), one panel per industry. Panels: manufacture of paper and paper products; publishing and printing services; manufacture of basic chemicals except fertilizers; manufacture of paints and varnishes; manufacture of pharmaceuticals, medicinal...; manufacture of soap, detergents, perfumes..; manufacture of chemical products NEC; manufacture of rubber; manufacture of plastics; manufacture of glass and glass products; manufacture of structural clay products; manufacture of cement, lime and plaster; manufacture of articles of concrete, cement; manufacture of non-metallic NEC; manufacture of furniture. Graphs by International Standard Industrial Classification (ISIC).]
2 Here the complement of variable x is equal to (1 − x) (see the first term on the right-hand side of
Eq. 9.5).
change in labor productivity (LAGDELTFP) which represents the last term of Eq. 9.5
and finally, employment share of each firm (EMPSHAFIRM).
For the evolutionary approach, once the logistic differential equation in Eq. 9.5 is
formulated, it can be estimated using standard panel data econometric techniques
(random effects, fixed effects or GMM), which do not require separate treatment
here. The reported results have a Wald chi-square value of 773.57 with six
degrees of freedom and a probability value (Prob > χ²) of 0.0000 (Table 9.2).
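To see why a Wald statistic of 773.57 with six degrees of freedom is reported with a probability of 0.0000, the upper-tail chi-square probability can be computed directly. The sketch below uses the closed form that holds for an even number of degrees of freedom; the function name is an assumption for illustration and is not taken from the software used in the chapter.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    k = df // 2
    total = sum(half ** i / math.factorial(i) for i in range(k))
    return math.exp(-half) * total


# Wald statistic and degrees of freedom as reported in the text.
p_value = chi2_sf_even_df(773.57, 6)
```

The resulting value is far below any conventional significance level, which is why it prints as 0.0000 at four decimal places. This speaks only to the joint significance of the regressors, not to the size of any individual coefficient.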
The estimated results indicate that all explanatory variables entered the
estimation with statistically significant coefficients. As expected, productivity was
positively affected by the growth in labor share. However, the employment share
entered with a negative and statistically significant coefficient. We may interpret
this as a lack of labor movement from low-productivity to high-productivity industries.
The basic research question in this paper was how firm-level labor share
affects firm- and industry-level productivity, and how it affects aggregate productivity
in an economy, taking the case of Ethiopia.
The most direct interpretation of the estimated results is that evolution and
change in mean characteristics (change in productivity) are positively affected by
the growth of functional income distribution (the growth in labor share). Even though
the coefficient is small in economic terms, its statistical significance is quite
acceptable.
The other variable of interest here is employment share of each firm within an
industry, which entered the model with a negative sign but a significant coefficient.
In economic terms, the positive and negative coefficients of labor share within a
firm and the employment share of each firm within the industry tell us very
important information about structural changes in the manufacturing sector.
If structural change was evident, the employment share would have entered with
a positive effect. However, it did not do this. Therefore, this does not support the
popular view of a structural bonus hypothesis which postulates a positive
relationship between structural change and economic growth. This hypothesis was
based on the assumption that during the process of economic development,
economies upgrade from industries with comparatively low to those with a higher
value added per labor input. For example, Timmer and Szirmai (2000) have a
detailed explanation on this.
This result is supported by an almost opposite mechanism, where structural
change has a negative effect on aggregate growth; this is revealed by Baumol’s
hypothesis of unbalanced growth. Intrinsic differences between industries in their
opportunities to raise labor productivity (for a given level of demand) shift ever
larger shares of the labor force away from industries with high productivity growth
toward stagnant industries with low productivity growth and accordingly higher
labor requirements. In the long-run, the structural burden of increasing labor shares
getting employed in the stagnant industries tends to diminish the prospects for
aggregate growth of per capita income. Baumol (1967) is key literature on this.
When the complement of firms’ market share enters the regression with a
positive sign, the actual market share would enter with a negative sign.
This has a direct and clear economic meaning: when firms try to
capture the market through nominal means (for example, price competition,
advertising or other institutional arrangements), productivity is harmed. Our
major conclusion is a lack of strong evidence for intra-industry selection.
The policy lesson is that access to bank loans is of great importance to firms.
In particular, those industries (spinning, tanning and publishing) in which all firms
had access to bank loans showed movements in employment share, which is
evidence of structural transformation.
There are reasons why it is important to introduce appropriate public loan
policies, that is, ensuring a lending channel of monetary policy to work without
breaks. First, a credit aggregate can be a better indicator of monetary policy than an
interest rate or a monetary aggregate in Ethiopia. Second, monetary tightening that
reduces loans to firms can have negative distributional consequences. Particularly
for those firms for whom bank loans are a primary source of finance, ease of access
to bank loans can have economy-wide distributional consequences. More specifi-
cally, the credit policy should be such that manufacturing firms get better access to
banks.
It is desired that the future research direction includes economy-wide modeling,
estimation and more formalization of evolutionary economic models to study the
link between access to bank loans and its effects on income distribution and
inclusive economic growth.
References
Ricardo D (1815) Essay on The influence of a low price of corn upon the profits of stock, 2nd edn.
John Murray, London
Romer P (1986) Increasing returns and long-run growth. J Polit Econ 94(5):1002–1037
Romer P (1987) Crazy explanations for the productivity slowdown. NBER Macroecon Annu
2:163–202
Romer P (1990) Endogenous technological change. J Polit Econ 98(5):S71–S102
Romer P (1994) The origins of endogenous growth. J Econ Perspect 8(1):3–22
Safarzyńska K, van den Bergh JCJM (2010) Evolutionary models in economics: a survey of
methods and building blocks. J Evol Econ 20(3):329–373
Salvadori N (2003) The theory of economic growth. Edward Elgar, Cheltenham
Solow R (1956) A contribution to the theory of economic growth. Q J Econ 70(1):65–94
Stockhammer E, Onaran O, Ederer S (2008) Functional income distribution and aggregate demand
in the Euro area. Camb J Econ 33(1):139–159
Swan TW (1956) Economic growth and capital accumulation. Econ Rec 32(2):334–361
Timmer M, Szirmai A (2000) Productivity growth in Asian manufacturing: the structural bonus
hypothesis examined. Struct Change Econ Dyn 11(4):371–392
Todaro MP (1997) Economic development. Longman, London
Veblen T (1898) Why is economics not an evolutionary science? Q J Econ 12(4):373–397
Part IV
Trade, Mineral Exports and Exchange
Rate
Chapter 10
Determinants of Trade with Sub-Saharan
Africa: The Secret of German Companies’
Success
Johannes O. Bockmann
Abstract This paper evaluates the degree to which internal, micro and
macro-environmental variables explain why some small- and medium-sized
enterprises (SMEs) based in Germany export more successfully to sub-Saharan
Africa (SSA) than other firms in the same category. It derives explanatory factors
specific to the region from experts. A bivariate correlation analysis identifies
relations between (in)dependent export performance (EP) measurements. Stepwise
multiple regression equations for firms’ overall EP and overall export profitability
in the last three years highlight factors with the most significant correlations. As
evaluated in previous research and as mentioned by experts, it applies a multidi-
mensional approach, investigating variables according to the resource-based view
and the contingency paradigm. This study indicates that SSA has specific
requirements for successful exports which differ from other regions. Knowledge
about these particular characteristics of the market will enable managers and pol-
icymakers to improve trade relations. By focusing on the EP of German SMEs in
SSA, this study fills a research gap since no previous study has concentrated on this
specific aspect.
Keywords German small- and medium-sized enterprises · Export performance ·
Comparative advantages · Internal, micro and macro-environmental factors ·
Sub-Saharan Africa
10.1 Introduction
Exports represent the preferred method for entry into foreign markets (Lado et al.
2004; Sousa et al. 2014; Zhao and Zou 2002) since they offer firms a comparatively
high level of flexibility with relatively small necessary investments thus permitting
a fast entry into new markets (Katsikea et al. 2007; Leonidou 1995; Sousa and
Novello 2014). Research on export modalities is of high interest to three major
stakeholders: public policymakers, managers, and researchers (Katsikea et al. 2000;
Sousa 2004).
Scholars explain the increasing interest in exports on the basis of its positive
effect on a country’s growth alongside the business opportunities that it offers
individual firms (Dean et al. 2000). Public policymakers encourage export activities
since they foster the accumulation of foreign exchange reserves, support the
development of national industries, create new jobs, and improve productivity
(Czinkota 1994). Developed countries see cross-border economic relationships as a
necessary instrument for maintaining their standard of living (Baldauf et al. 2000).
In a detailed review of 33 articles on export performance (EP) published between
2000 and May 2015, we identified 65 internal and 35 external determinants.
However, none of the articles focused on sub-Saharan Africa (SSA). This is surprising
since these markets offer great business opportunities. According to data from the
World Bank (Catalog Sources World Development Indicators 2015), the region’s
total GDP grew by 5.72% per year on average from 2000 to 2013. Further, imports
of goods and services increased by an average of 12.05% per year from 2010 to
2012 (Catalog Sources World Development Indicators 2015; United Nations
Statistics Division 2011, 2014). In 2012, SSA countries imported US$496.50 bil-
lion worth of goods and services (United Nations Statistics Division 2014). The
increasing demand for foreign products together with a relatively high level of
uncertainty in the region makes SSA predestined for exports rather than alternative
market entry methods such as foreign direct investment (Boly et al. 2014; Riddle
2008; Sousa and Novello 2014).
Regarding the exporter’s home country, only three papers concentrated on
Germany although the country was one of the top three merchandise exporters with
a share of 7.7% of world trade in 2013 and a trade surplus of US$264 billion (WTO
2014). The main drivers of this success are Germany’s small- and medium-sized
enterprises (SMEs) (MoAE 2015), a situation which is similar to that in most
European countries (Bijmolt and Zwart 1994). According to an EU definition,
SMEs include all firms with a maximum of 250 employees (Sousa et al. 2014).
However, Katsikea et al. (2007) argue that SMEs are not just smaller versions of
large firms but that they operate differently because of their size. Therefore, an
insight into the success factors of German SMEs may be relevant for German
policymakers and executives interested in the guarantors of EP (Baldauf et al.
2000).
Between 2000 and 2013, exports from Germany to all SSA countries grew on
average by 8.8% to US$13.51 billion. 89% of German exporters with experience in
Africa plan to expand on their commitments, especially in West and Central Africa
(Foly 2013). Politicians, including the German Chancellor Angela Merkel, are
also showing an increasing interest in Africa. For example, at conferences such as
the EU–Africa summit, a steady cross-sectoral rise in demand is expected thanks to
a growing middle class (Merkel 2014). Consequently, a deeper insight into the
factors which influence German EP in SSA is necessary.
Scholars argue that further research is needed to investigate the possible pre-
dictors of EP (Baldauf et al. 2000; Fevolden et al. 2015; Navarro-García et al.
2015). A focus on the EP of SMEs is specifically important since they in particular
profit from a combination of flexibility with limited resource commitments (Sousa
et al. 2014), while their significant contributions to national economies underline
their relevance for policymakers (Sousa and Novello 2014). Further, there is a need
to investigate the specifics of EP in selected regions/countries (Navarro-García et al.
2015; Rambocas et al. 2015). Concerning Germany, Wagner (2014) maintains that
detailed company characteristics should be worked out. Sousa et al. (2008) and
Sung (2015) have identified a strong demand for more research on developing
countries (DC), such as the ones in SSA, since their share in world trade is
increasing thus offering significant opportunities in the present and future global
economic order.
In summary, the quoted views substantiate the need for additional research in the
field of EP, covering individual regions and explanatory variables. To provide
evidence on whether SSA requires different or additional internal, micro and macroeconomic
variables, this study concentrates on the factors relevant for German SMEs
targeting this region. The rest of this paper is organized as follows. A
literature review is followed by a section on methodology. The next section
presents the findings and analysis of the semi-structured interviews and questionnaire.
The last section concludes and discusses possible areas for further
research.
Research about EP goes back to Tookey’s (1964) work about factors associated
with success in exporting. In a wider context, it addresses the outcomes of export
activities, mostly at the firm or export venture level (Kahiya and Dean 2014).
Nowadays, EP is the most studied topic in the field of export marketing (Leonidou and
Katsikea 2010). Multiple aspects arise from the fact that the ‘Export performance
dialogue is spread over a large pan-discipline research landscape which includes
10.2.1 Measuring EP
Approaches for measuring EP are fragmented and uncoordinated (Kahiya and Dean
2014; Katsikea et al. 2000) and no single view prevails (Sousa 2004). An almost
philosophical approach points out that for most export start-ups pure survival is
already some measurement of success (Kahiya and Dean 2014). Indicators reflect
objective and subjective facts. While objective measures deal with absolute per-
formance, subjective ones are concerned with a firm’s expectations or its perceived
performance as compared to its competitors (Akyol and Akehurst 2003). Scholars
have identified 42 (Katsikea et al. 2000: 497) or even 50 (Sousa 2004: 9) indicators
for EP. Since no individual indicator adequately captures the phenomenon of EP
(Kahiya and Dean 2014; Lages and Lages 2004; Zou et al. 1998), there is general
agreement in favor of a multidimensional approach. Many researchers such as
Baldauf et al. (2000) and Papadopoulos and Martín-Martín (2010) prefer a multiple
approach.
10.2.2 Determinants of EP
Two major theoretical approaches to classify the determinants of EP stand out. The
resource-based view emphasizes a firm’s individual competencies as its unique
bundles of assets (Conner and Prahalad 1996; Nalcaci and Yagci 2014; Stoian et al.
2011). Accordingly, the success of a company is a result of its acquiring and
exploiting its own unique resources such as competence, experience, and size (Zou
and Stan 1998). Research also identifies how higher performance can be achieved
in comparison with other firms (Barney 2002; Dhanaraj and Beamish 2003; Singh
and Mahmood 2014).
On the other hand, the contingency paradigm proposes that environmental fac-
tors affect the companies’ strategies and EP which is then the result of a specific
company context (Sousa et al. 2008). Consequently, exports are considered an
organization’s strategic response to the interaction of external and internal factors
(Robertson and Chetty 2000; Sousa et al. 2008; Yeoh and Jeong 1995).
In the meantime, there is a general agreement that a multidimensional approach
including a range of determinants such as managerial, organizational, and envi-
ronmental aspects is most appropriate (Baldauf et al. 2000; Katsikea et al. 2000;
Rambocas et al. 2015). This is confirmed by Morgan et al. (2004) who synthesized
the different views into a robust theoretical model.
Thirty-three papers published between 2000 and May 2015 were analyzed and 65
internal variables were identified. International experience measured in years (21.21% of the
reviewed papers), firm size as represented by the number of employees (18.18%),
adapting the price strategy to market conditions (15.15%), and the number of foreign
markets served by a firm (12.12%) are the variables most frequently used to explain a business’ EP.
Most scholars extend their research scope by using qualitative and quantitative
determinants. Of the 33 papers published between 2000 and May 2015, 21
covered external variables, identifying 35 external factors. An increasing level of
competition in the foreign market influences EP, but there is no consensus on whether
the effect is positive (9.09% of reviewed papers) or negative (6.06%). Scholars are equally
inconsistent regarding the influence of distance. Two papers (6.06%) show that an
increasing distance has a positive impact, whereas one paper reports a negative
result. The foreign exchange rate also plays a multifaceted role: in one paper it has
a positive influence, whereas three papers (9.09%) found no significant effect.
Customs and tariffs (9.09%) and regulations (15.15%) are frequently named as
impacting EP negatively, while one study claimed that they were irrelevant.
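The percentages quoted above are shares of the 33 reviewed papers, so each one corresponds to a whole number of papers. The short sketch below converts between the two; the function and variable names are assumptions for illustration, and only the percentages and the totals of 33 and 21 come from the text.

```python
TOTAL_PAPERS = 33  # number of reviewed papers stated in the text


def papers_from_share(share_percent, total=TOTAL_PAPERS):
    """Convert a reported percentage of reviewed papers into a paper count."""
    return round(share_percent / 100 * total)


def share_from_papers(count, total=TOTAL_PAPERS):
    """Convert a paper count into a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# Shares reported in the text for the most frequently used internal variables.
reported_shares = {
    "international experience (years)": 21.21,
    "firm size (employees)": 18.18,
    "price adaptation": 15.15,
    "number of foreign markets": 12.12,
}
counts = {name: papers_from_share(s) for name, s in reported_shares.items()}

# Share of the reviewed papers that covered external variables (21 of 33).
external_share = share_from_papers(21)
```

Read this way, even the most frequently used variable, international experience, appears in only about one in five of the reviewed papers.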
10.3 Methodology
Our research applies pragmatism, which is not committed to any single philosophy.
The lack of studies about EP of German SMEs in SSA leads to pragmatism since it
allows a researcher to consider different points of view to get a holistic picture.
Consequently, multiple approaches are necessary to gain quantitative and qualitative
data (Collis and Hussey 2014; Saunders 2012). Actually, many EP studies (e.g.,
Freeman and Styles 2014; Rambocas et al. 2015) have applied this philosophy.
Our study is abductive since it combines both deductive and inductive elements.
The initial semi-structured interviews aimed at expanding knowledge about EP
from experts without reference to the existing theory. The respective results were
merged with the findings from existing literature into one questionnaire. Thus, for
German SMEs targeting SSA, the existing theory could be tested and modified by
new insights (Collis and Hussey 2014; Saunders 2012).
To answer the research question, multiple methods rather than a single
method were chosen to achieve a broader view (as done, e.g., by Freeman and Styles 2014;
Rambocas et al. 2015; Wagner 2014). First, semi-structured interviews were carried
out, which mainly yielded qualitative data. Subsequently, a questionnaire survey
was conducted to gain primarily quantitative, but also qualitative, data.
The aim of exploratory research is to ‘seek new insights into phenomena, to ask
questions, and to assess the phenomena in a new light’ (Saunders 2012: 670).
Consequently, this study started with semi-structured interviews examining the
factors known to influence EP as well as searching for additional ones prior to
developing a questionnaire. A good reason to include exploratory research as a first
step is the positive experience of Freeman and Styles (2014), Lacka and Stefko
(2014), and Nalcaci and Yagci (2014) who gained new insights about EP for other
regions by conducting interviews.
Explanatory research has its emphasis on clarifying the relationship between
variables. The questionnaire supports this purpose by enabling the identification of
interrelations between dependent and independent factors of EP and the
development of causal relationships between them (Saunders 2012). It tests the interaction
between existing measurements for EP relevant in other countries identified during
the literature review. The fact that researchers such as Singh and Mahmood (2014)
and Sousa and Novello (2014) have applied explanatory research in their EP studies
underlines the value of this approach.
Based on a detailed literature study to gain secondary data and information about
the current status of research activities, semi-structured interviews were chosen to
extract new insights from experts concerning the factors which influence a firm’s
EP, thus getting answers to specific key questions while providing the flexibility to
react to the flow of conversation (Saunders 2012). Freeman and Styles (2014) have
previously used a similar approach.
Subsequently, a self-completion questionnaire (Collis and Hussey 2014) was
developed to collect data for empirical tests. The nature of this questionnaire was
mainly quantitative and explanatory since the participants were asked to grade the
influence of different variables on their firm’s EP. By evaluating the data with a
bivariate correlation and multiple regression, the relationships were identified, as
previously done, for example, by Castellacci and Fevolden (2014), Fevolden et al.
(2015) and Stoian et al. (2011). Moreover, the participants were encouraged to
explain their grading and to suggest additional factors influencing EP.
The applied semi-structured interviews and questionnaire fall under the survey
strategy, which is mostly applied to gain quantitative data, although qualitative
information can also be gathered this way. A questionnaire allows an efficient
collection of standardized data from a large population, enabling comparisons and
further analysis. Moreover, it helps define the relationship between EP’s
independent and dependent factors. This strategy is generally perceived as authoritative,
comparatively easy to explain and understandable for participants.
At first, general information about the participants and their firms was derived from
answers to closed questions, followed by an inquiry regarding target markets in
SSA. Closed questions were used since the participants were surveyed on a specific
issue. In the second part, participants elaborated freely on internal and external
factors which were perceived to influence their firm’s EP (Saunders 2012).
As a sampling technique, a non-probability sample was chosen because ‘the
probability of each case being selected from the total population is not known’
(Saunders 2012: 261). More specifically, purposive sampling based on the scholar’s
judgment was applied. Although all participants had been in charge of exports to
SSA for several years and were therefore a good fit, this approach is not statistically
representative. Therefore, it was followed by a questionnaire survey (Saunders
2012). The response rate of 40% was fairly high compared to Sousa’s reviews with
30 and 25% (Sousa et al. 2008).
Table 10.1 summarizes general information about the participants, which has been
altered to ensure confidentiality.
Based on the literature review and the interviews, a Web-based questionnaire was
developed. For Easterby-Smith et al. (2015), this is an efficient way to collect data
from a large number of people, which was also important for our analysis (Collis
and Hussey 2013). First, general information about the respondents was gathered,
which was followed by questions regarding their target markets in SSA. Later, the
participants were asked to grade their EP and the respective determinants. Finally,
they could enter personal data to receive an executive summary of the findings.
The seven-point Likert scale: Answers were graded on a seven-point Likert scale
because this allows the gathering of perceptions (Navarro-García et al. 2015).
Further, this extended scale ‘has been shown to process valid psychometric measure
properties’ (Singh and Mahmood 2014: 88) and has been successfully used in
previous EP studies (e.g., Rambocas et al. 2015; Singh and Mahmood 2014; Ward
and Duray 2000).
Subjective self-reporting was employed because of the expectation (and expe-
rience) that firms are unwilling to disclose full data (Leonidou et al. 2002; Singh
and Mahmood 2014) and because of a proven correlation between subjective and
objective measures (Akyol and Akehurst 2003; Dess and Robinson 1984; Matanda
and Freeman 2009; Stoian et al. 2011).
Dependent variables of EP: Since there is no generally accepted definition for EP
(Sousa 2004; Sousa et al. 2008; Stoian et al. 2011; Wheeler et al. 2008), the
measurements for our study were developed on the basis of existing literature which
guaranteed success and facilitated comparisons with previous results.
First, the respondents were encouraged to rate their overall perceived satisfaction
with EP in SSA in the last three years on a seven-point Likert scale, ranging from
‘extremely dissatisfied’ to ‘extremely satisfied’ (similarly applied, e.g., by Akyol
and Akehurst 2003; Cadogan et al. 2012; Freeman and Styles 2014; Lee and Griffith
2004; Navarro-García et al. 2015; Sousa and Novello 2014; Sousa et al. 2014).
They were told that the overall satisfaction with EP should include the areas of
international sales growth, export business profitability, the firm’s image in foreign
markets, international expansion, and market share (Cavusgil and Zou 1994;
Navarro-García et al. 2015; Navarro et al. 2010).
Second, they were asked about their overall satisfaction with their company’s
performance in terms of export profitability in SSA in the last three years (similar
to, e.g., Cadogan et al. 2002; Dean et al. 2000; Nalcaci and Yagci 2014; Robertson
and Chetty 2000; Singh and Mahmood 2014; Sousa and Novello 2014; Stoian et al.
2011). The time frame was adapted from Cadogan et al. (2012) and
Navarro-García et al. (2015). Sousa and Novello’s (2014) and Sousa et al.’s (2014)
approach of asking for the overall satisfaction with EP and export profitability was
employed.
Independent variables of EP: The items applied to measure each construct were
based on the earlier interviews with professionals as well as existing literature.
Participants were again asked to grade internal, micro and macro-factors on a
seven-point Likert scale.
Questionnaire sampling: Only German SMEs exporting to at least one SSA
country in the last three years were considered. Following most researchers in the
field of EP such as Nalcaci and Yagci (2014), Sousa et al. (2014), and Sousa and
Novello (2014), only CEOs and managers with decision making responsibilities
regarding exports to SSA were accepted. As shown in Table 10.2, the response rate
was quite low compared with the rates reported by Sousa, possibly because the authors
of those studies exclusively or additionally sent out the questionnaire by post or
called all potential participants (Sousa 2008).
To ensure representative sampling, the number of participants should be as large
as possible (Cooper and Schindler 2014; Saunders 2012). According to Saunders
et al. (2012), however, a relatively low response rate is not necessarily bad, as a
sample size of 30 or more provides a high degree of accuracy and reliability. With
a usable sample size of 41, this condition was met. Moreover, Armstrong and Overton’s
(1977) extrapolation procedure was applied to check that no differences existed
between early and late responses (the basic details of the participants in the
questionnaire are given in Table 10.3).
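The comparison of early and late responses can be sketched as follows. The text does not say which statistical test was used for this check, so the permutation test below is an assumption chosen because it needs no distributional assumptions for a small sample; the function name and the Likert scores are hypothetical.

```python
import random
from statistics import mean


def early_late_permutation_test(scores, n_permutations=2000, seed=0):
    """Two-sided permutation test for a mean difference between waves.

    `scores` are ordered by arrival; the first half is treated as the
    early wave and the second half as the late wave.
    Returns the observed absolute mean difference and a p-value.
    """
    half = len(scores) // 2
    observed = abs(mean(scores[:half]) - mean(scores[half:]))
    rng = random.Random(seed)
    pooled = list(scores)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:half]) - mean(pooled[half:]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical seven-point Likert answers, ordered by response date.
likert_scores = [4, 5, 4, 5, 4, 5, 4, 5]
difference, p_value = early_late_permutation_test(likert_scores)
```

A large p-value, as in this toy example, is consistent with the absence of a difference between early and late respondents, which is the pattern the extrapolation check looks for as an indication that non-response bias is limited.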
A content analysis was done to quantify the orally given data. Using this widely
applied method, items of qualitative data were systematically converted into
numerical data (Collis and Hussey 2014; Easterby-Smith et al. 2015).
The factors that participants named in an open question as influencing EP are given in Table 10.4.
Besides other factors such as export promotion programs and the prohibition of
bribery, German politics and the legal environment were also considered to have an
impact on EP. A survey by Transparency International (Hardoon 2013) shows that
bribery is a serious matter in Africa and that decision makers are willing to accept
such payments. For example, 54% of the 2207 households questioned in Ghana in
2013 said that they had paid bribes; politicians were described as corrupt by 76%
(Hardoon 2013). A participant in one of the studies stated that for this reason his
firm concentrated on private customers. Two others argued that such contributions were
illegal in all European countries, but that Germany was the only country where the law
was strictly enforced. France, among others, was said not to enforce existing
legislation. In cultures where expensive presents express esteem and where decision
makers depend on special payments to support their families and tribes, German
companies have no chance of getting contracts. This supports O’Cass and Julian’s
study (2003), which states that legal and political decisions influence EP. Dean et al.
(2000) confirm that governmental agencies may support exports.
General willingness of firms to deal with the aspect of risk in Africa has not been
mentioned in previous studies.
Two participants said that time spent abroad, or rather continuous physical
presence in the target country, was essential. However, Stoian et al. (2011) could not
prove any relevance of this for Spanish exporters.
Employees’ fundamental mistrust toward SSA was mentioned as influencing EP
negatively. The attitude of employees toward a target market has been previously
researched by Nalcaci and Yagci (2014).
Data were imported from the online questionnaire provider into IBM SPSS. Of the 58 datasets received, 41 proved valid once they had been edited following Brase and Brase (2010) and Pallant (2013). The included datasets fulfilled the mathematical requirements for analysis and fit the target group:
– Excluding firms with more than 250 employees (SME threshold),
– Excluding unfinished datasets, and
– Including individuals who are involved in their firms’ exports to SSA.
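The inclusion criteria above can be read as a simple filter over the returned questionnaires. The following is a minimal sketch and not the author's SPSS procedure; the record fields (`employees`, `completed`, `involved_in_ssa_exports`) and the sample records are hypothetical.

```python
from dataclasses import dataclass

SME_MAX_EMPLOYEES = 250  # SME threshold named in the text


@dataclass
class Response:
    """One returned questionnaire (field names are illustrative assumptions)."""
    employees: int
    completed: bool
    involved_in_ssa_exports: bool


def is_valid(response: Response) -> bool:
    """Apply the three criteria: SME size, finished dataset, SSA export involvement."""
    return (
        response.employees <= SME_MAX_EMPLOYEES
        and response.completed
        and response.involved_in_ssa_exports
    )


def filter_valid(responses):
    """Keep only the responses that satisfy all three criteria."""
    return [r for r in responses if is_valid(r)]


# Hypothetical sample data, for illustration only.
sample = [
    Response(employees=120, completed=True, involved_in_ssa_exports=True),
    Response(employees=900, completed=True, involved_in_ssa_exports=True),
    Response(employees=40, completed=False, involved_in_ssa_exports=True),
    Response(employees=75, completed=True, involved_in_ssa_exports=False),
]
print(len(filter_valid(sample)))
```

Applied to the 58 returned datasets, a filter of this kind would leave the 41 valid cases reported in the text.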
Figure 10.1 gives the regions served by at least 20% of the participants’ firms.
Countries colored green (Ghana, Nigeria, and South Africa) enjoyed the
patronage of more than 60% of the German SMEs exporting to SSA. However, this
was almost equally true for the orange zone (Cameroon, Angola, Namibia,
Mozambique, Tanzania, Kenya, and Ethiopia), with a total of 50–60% of the
companies having export activities there.
Obviously, all areas colored in green and orange (except Ethiopia) are located by
the sea. German SMEs prefer exporting to countries that are easily accessible and
they avoid landlocked markets.
Dependent variables
The dependent variables, overall EP and export profitability, were graded by all participants on a scale from one (extremely dissatisfied) to seven (extremely satisfied). The results are roughly symmetric and bell-shaped, which suggests an approximately normal distribution (Figs. 10.2 and 10.3).
To test the null hypothesis that there is no correlation between overall EP and export profitability of German SMEs, a Pearson product-moment correlation coefficient was calculated following Anderson et al. (2014). Since the significance (2-tailed) is less than 0.05, the correlation is significant. The Pearson correlation shows a strong positive relationship between the variables (r = 0.682), that is, higher values of one variable are associated with higher values of the other. The two variables share 46.51% of their variance (the square of r). In view of these results, there is significant evidence to reject the formulated null hypothesis (Pallant 2013) (Table 10.6).
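The shared variance quoted above is the square of the Pearson coefficient. The sketch below shows how r can be computed from two score lists and how the reported r = 0.682 translates into 46.51%. The function names are illustrative, and the study itself used SPSS rather than this code.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(ss_x * ss_y)


def shared_variance(r):
    """Coefficient of determination: proportion of variance the two variables share."""
    return r * r


reported_r = 0.682  # value reported in the text
print(f"{shared_variance(reported_r):.2%}")  # 46.51%
```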
Independent variables
The participants were asked to grade the influence of different macro-factors on
their company’s EP in SSA from one (none) to seven (substantial). Internal and
microenvironmental factors were graded from one (much worse) to seven (much
better) in comparison with major competitors in the market.
Both measurements for EP were tested with each factor by a bivariate correlation
to describe the strength and direction of their relationship following Anderson et al.
(2014) and Pallant (2013). The following hypotheses were tested:
H0a: There is no correlation between overall EP and the ‘independent variable.’
H1a: There is a significant correlation between overall EP and the ‘independent
variable.’
H0b: There is no correlation between export profitability and the ‘independent
variable.’
H1b: There is a significant correlation between export profitability and the ‘inde-
pendent variable.’
In case of p < 0.05, the correlation is significant at the 0.05 level (2-tailed) and
H0 can be rejected. If p < 0.01, the correlation is even significant at the 0.01 level
(2-tailed) and H0 can be rejected (Pallant 2013). The relationships were charac-
terized depending on ‘r’ (Table 10.7).
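The decision rule for H0a and H0b can be written down compactly. This is a minimal sketch of the two thresholds named in the text (0.05 and 0.01, two-tailed); the function name and the returned labels are illustrative assumptions.

```python
def significance_label(p_value):
    """Classify a two-tailed p-value using the 0.05 and 0.01 thresholds."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < 0.01:
        return "significant at the 0.01 level; reject H0"
    if p_value < 0.05:
        return "significant at the 0.05 level; reject H0"
    return "not significant; H0 is not rejected"


# Hypothetical p-values, for illustration only.
for p in (0.003, 0.03, 0.20):
    print(p, "->", significance_label(p))
```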
Table 10.8 summarizes the results on internal and microenvironmental factors.
Table 10.9 summarizes the results on macro-environmental factors.
Table 10.10 Variables entered/removed during the stepwise multiple regression analysis (dependent factor overall EP)
Model | Variables entered | Variables removed
1 | Firm characteristics: willingness to deal with risks in sub-Saharan Africa caused by insufficient experience in the region is … | .
2 | SSA: level of competition | .
3 | SSA: ecological environment | .
Method: Stepwise (criteria: probability-of-F-to-enter 0.050, probability-of-F-to-remove 0.100)
As recommended by Pallant (2013), for relatively small sample sizes the model
summary is evaluated regarding the adjusted R square which helps understand the
degree to which each model represents the variance of the dependent variable. It
turns out that Model 1 explains 19.4%; Model 2, 31.9%; and Model 3, 41.1% of the
variance of overall EP. See Table 10.11.
To determine the statistical significance of the three models, the ANOVA tables
were checked. All three models reached an overall statistical significance since in
each case p < 0.01 (Pallant 2013). In each of the three models, all independent
variables had a significance value of <0.05. This indicates that all variables made a
significant statistical contribution to the prediction of overall EP (Pallant 2013).
According to Pallant (2013), an adjusted R square of 0.411 for Model 3 is quite a respectable result since it explains 41.1% of the variance in overall EP. The third factor, the ecological environment in SSA, has a part-correlation coefficient of −0.32. The squared value, 0.1024, indicates that 10.24% of the variance in overall EP is uniquely attributable to the ecological environment. The same procedure shows that the level of competition makes a unique contribution of 11.09% and willingness to deal with risks in SSA one of 22.09% (Pallant 2013; Tabachnick and Fidell 2013).
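The unique contributions quoted above follow from squaring each part (semipartial) correlation. A minimal sketch of that arithmetic is given below. Only the coefficient −0.32 is stated in the text; the magnitudes printed for the other two factors are back-calculated from the reported percentages and are therefore an inference, not reported values.

```python
import math


def unique_contribution(part_correlation):
    """Share of variance in the dependent variable uniquely explained by one predictor."""
    return part_correlation ** 2


# Stated in the text: part correlation of the ecological environment.
print(round(unique_contribution(-0.32), 4))  # 0.1024

# Back-calculated magnitudes implied by the reported 11.09% and 22.09%.
print(round(math.sqrt(0.1109), 3))
print(round(math.sqrt(0.2209), 3))
```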
Table 10.11 Model summary of stepwise multiple regression analysis with dependent factor overall EP
Model | R | R2 | Adjusted R2 | Std. error of the estimate
1 | 0.463a | 0.215 | 0.194 | 1.214
2 | 0.594b | 0.353 | 0.319 | 1.116
3 | 0.675c | 0.455 | 0.411 | 1.038
Note Significant at 1% (a), 5% (b) and 10% (c) levels of significance
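The adjusted R square values in Table 10.11 can be checked against the standard adjustment formula, 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes n = 41 (the usable sample size, with no missing values) and that Models 1 to 3 contain one, two, and three predictors respectively; both are assumptions drawn from the surrounding text rather than from the SPSS output itself.

```python
def adjusted_r_squared(r, n, k):
    """Adjusted R square from multiple R, sample size n, and number of predictors k."""
    r_squared = r * r
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


N = 41  # assumed: all valid datasets entered the regression

# Multiple R per model as reported in Table 10.11, with the assumed predictor count.
models = {1: (0.463, 1), 2: (0.594, 2), 3: (0.675, 3)}

for model, (r, k) in models.items():
    print(model, round(adjusted_r_squared(r, N, k), 3))
```

Under these assumptions the formula reproduces the tabulated values of 0.194, 0.319, and 0.411.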
Following Tabachnick and Fidell (2013), the regression equation was formulated using the unstandardized coefficients B from Model 3. The regression equation for overall EP is:
\[ Y = b_0 + b_1 x_1 + b_2 x_2 - b_3 x_3 \]
where
Y Overall EP (seven-point Likert scale)
b0 Constant term
x1 Willingness to deal with risks in SSA (seven-point Likert scale)
x2 Level of competition in SSA (seven-point Likert scale)
x3 Ecological environment in SSA (seven-point Likert scale)
\[
\begin{aligned}
\text{Overall EP} = {} & 1.111 \\
& + 0.501 \times \text{Willingness to deal with risks in SSA} \\
& + 0.281 \times \text{Level of competition in SSA} \\
& - 0.233 \times \text{Ecological environment in SSA}
\end{aligned}
\]
Because the values were entered on a seven-point Likert scale, the results are expressed on this scale as well. The equation demonstrates that the willingness of managers to deal with risks had the greatest positive influence on overall EP: an increase of one point on the Likert scale was associated with an increase in overall EP of 0.501 Likert points. Since this factor has not been researched before, no comparison with existing literature can be made.
The level of competition in SSA also had a positive influence on the dependent factor: a change of one point led to a change of 0.281 Likert points. This confirms the work of Matanda and Freeman (2009) and Sousa and Novello (2014), who identified a positive relation. However, Cadogan et al. (2012), Lee and Griffith (2004), and Navarro-García et al. (2015) found a negative relation in their research.
The ecological environment had the smallest (yet negative) influence. Higher
ecological standards resulted in a lower overall EP; an improvement by one Likert
point was associated with a decrease of 0.233. Again, a comparison with existing
literature is not possible since this factor, which emerged during the semi-structured
interviews, has not been researched before.
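The interpretation of the coefficients can be made concrete by evaluating the Model 3 equation for given ratings. The following is a minimal sketch using the coefficients reported above, with the ecological-environment coefficient taken as negative in line with the text; the function name and the example ratings are illustrative assumptions.

```python
# Unstandardized coefficients B of Model 3 as reported in the text.
CONSTANT = 1.111
B_RISK = 0.501          # willingness to deal with risks in SSA
B_COMPETITION = 0.281   # level of competition in SSA
B_ECOLOGY = -0.233      # ecological environment in SSA (negative influence)


def predict_overall_ep(risk, competition, ecology):
    """Predicted overall EP for three ratings on the seven-point Likert scale."""
    for rating in (risk, competition, ecology):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must lie between 1 and 7")
    return (
        CONSTANT
        + B_RISK * risk
        + B_COMPETITION * competition
        + B_ECOLOGY * ecology
    )


# Hypothetical respondent rating every factor at the scale midpoint.
baseline = predict_overall_ep(4, 4, 4)
print(round(baseline, 3))  # 3.307

# Raising the risk rating by one point raises the prediction by the coefficient.
print(round(predict_overall_ep(5, 4, 4) - baseline, 3))  # 0.501
```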
228 J.O. Bockmann
Table 10.12 shows the variables that were selected during the stepwise multiple
regression analysis.
To confirm statistical significance, the ANOVA table was checked again. Model 11, which explains 80.4% of the variance in export profitability, was selected since it had the highest adjusted R square (Pallant 2013) (Table 10.13).
Following Tabachnick and Fidell (2013), the subsequent regression equation
was formulated based on the unstandardized coefficient B. Regression equation for
export profitability:
\[ Y = b_1 x_1 - b_2 x_2 + b_3 x_3 + b_4 x_4 - b_5 x_5 + b_6 x_6 + b_7 x_7 + b_8 x_8 + b_9 x_9 \]
where
Y Export profitability (seven-point Likert scale)
x1 Customer sensitivity for product origin (seven-point Likert scale)
x2 Customs and tariffs in SSA (seven-point Likert scale)
x3 Psychic distance (seven-point Likert scale)
x4 Adaptation of product strategy (seven-point Likert scale)
x5 Network in industrial sector in home country (seven-point Likert scale)
x6 Updating with market information (seven-point Likert scale)
x7 Foreign exchange rate (seven-point Likert scale)
x8 Research and development (seven-point Likert scale)
x9 Dependence on intermediaries (seven-point Likert scale)
Table 10.12 Variables entered/removed during stepwise multiple regression analysis (dependent factor export profitability)
Model | Variables entered | Variables removed
1 | Firm characteristics: willingness to deal with risks in sub-Saharan Africa caused by insufficient experience in the region is … | .
2 | Relationships with customers and customer characteristics: customer sensitivity concerning product origin/image of company’s home country … | .
3 | SSA: customs and tariffs | .
4 | SSA: psychic distance | .
5 | Adaptation of product strategy to the markets of sub-Saharan Africa | .
6 | Managerial characteristics and relationships: network in the industrial sector in home country | .
7 | Firm characteristics: our firm keeps up to date with relevant export market information | .
8 | SSA: foreign exchange rate | .
9 | . | Firm characteristics: willingness to deal with risks in sub-Saharan Africa …
10 | Firm characteristics: research and development | .
11 | Relationship with foreign intermediaries: Relative dependence on intermediaries | .
Method: Stepwise (criteria: probability-of-F-to-enter 0.050, probability-of-F-to-remove 0.100)
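The entry and removal criteria listed in the table explain why a variable entered in Model 1 can be dropped again in Model 9. The sketch below illustrates one decision of a stepwise procedure under the assumption that a variable enters when its probability of F is at most 0.050 and is removed when it is at least 0.100. The order of the two checks is simplified, the p-values are hypothetical, and the code does not reproduce the SPSS routine used in the study.

```python
P_ENTER = 0.050   # probability-of-F-to-enter (assumed to be an upper bound)
P_REMOVE = 0.100  # probability-of-F-to-remove (assumed to be a lower bound)


def stepwise_decision(p_in_model, p_candidates):
    """Return ('remove', name), ('enter', name) or ('stop', None) for one step.

    p_in_model: p-values of variables currently in the model.
    p_candidates: p-values the excluded variables would have if entered.
    """
    if p_in_model:
        weakest = max(p_in_model, key=p_in_model.get)
        if p_in_model[weakest] >= P_REMOVE:
            return ("remove", weakest)
    if p_candidates:
        strongest = min(p_candidates, key=p_candidates.get)
        if p_candidates[strongest] <= P_ENTER:
            return ("enter", strongest)
    return ("stop", None)


# Hypothetical p-values: a variable already in the model has lost significance.
print(stepwise_decision({"willingness_to_deal_with_risks": 0.15},
                        {"research_and_development": 0.03}))
```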
The negative influence of customs and tariffs in SSA confirms the results from the semi-structured interviews. Although Baldauf et al. (2000) consider this factor to have a neutral influence, most researchers (e.g., Fugazza and McLaren 2014; Jordan 2014; Kahiya and Dean 2014) have found a negative influence.
Table 10.13 Model summary of stepwise multiple regression analysis with dependent factor export profitability
Model | R | R square | Adjusted R square | Std. error of the estimate
1 | 0.542a | 0.294 | 0.276 | 1.094
2 | 0.637b | 0.406 | 0.375 | 1.016
3 | 0.713c | 0.509 | 0.469 | 0.936
4 | 0.776d | 0.602 | 0.558 | 0.854
5 | 0.808e | 0.652 | 0.603 | 0.810
6 | 0.834f | 0.695 | 0.641 | 0.770
7 | 0.877g | 0.769 | 0.720 | 0.680
8 | 0.894h | 0.799 | 0.749 | 0.644
9 | 0.889i | 0.790 | 0.745 | 0.648
10 | 0.907j | 0.823 | 0.779 | 0.605
11 | 0.921k | 0.848 | 0.804 | 0.569
Note Significant at 1% (a), 5% (b) and 10% (c) levels of significance
Both analyses indicate that the willingness to deal with risks in SSA has a high impact on the dependent variables. All three models constructed with overall EP as a dependent variable include this factor, whereas the models relating to export profitability exclude it from Model 9 onwards (Table 10.12). Apart from this factor, the variables included in the two sets of models differ. Therefore, decision makers wanting to influence EP need to decide whether they are targeting overall EP or export profitability and choose suitable strategies accordingly. These findings tally with suggestions made by, for example, Sousa et al. (2008), Stoian et al. (2011), and Wheeler et al. (2008), that different measurements for EP are necessary for adequate results.
10.6 Conclusion
Sousa et al. (2008) describe EP as one of the most widely researched but least understood areas of international marketing. Our paper, which specifically analyzes the EP of German SMEs targeting SSA, contributes to knowledge in this field and fills a research gap. It comprised a comprehensive literature review, semi-structured interviews, and a questionnaire survey. New questions were identified, such as why German SMEs tend to prefer exporting to countries with direct access to the sea.
The results indicate that SSA has specific requirements for successful exports which differ from those of other regions. This knowledge enables managers and policymakers to improve trade relations and to enhance their businesses.
In order to generalize the findings, as in the cases of Sousa et al. (2014), Stoian et al. (2011), and Styles (2014), we suggest that the scope of work be extended to additional home markets as well as foreign countries/regions. Since our paper
evaluated the whole of SSA without considering country specifics, additional
research focusing on individual target markets within SSA is desirable. Another
shortcoming of this paper lies in the fact that it covers only a specific time frame.
Longitudinal studies about German SMEs targeting SSA would be useful for
gaining further insights into their EP. It would also be useful to research individual
industries instead of multi-industries to find out if particular criteria need to be
considered (Stoian et al. 2011). Although there is no academic limit to the number
of independent and dependent variables for further analysis, two concrete ideas can
be derived from the suggestions made by respondents. They said that ‘area
References
Dess GG, Robinson RB (1984) Measuring organizational performance in the absence of objective
measures: the case of privately-held firm and conglomerate business unit. Strateg Manag J
5:265–273
Dhanaraj C, Beamish PW (2003) A resource-based approach to the study of EP. J Small Bus
Manage 41(3):242–261
Easterby-Smith M, Thorpe R, Jackson P (2015) Management research, 5th edn. Sage Publications
Ltd, London
Felbermayr GJ, Yalcin E (2013) Export credit guarantees and EP: an empirical analysis for
Germany. World Econ 36(8):967–999
Fevolden AM, Herstad SJ, Sandven T (2015) Specialist supplier or systems integrator? The
relationship between competencies and EP in the Norwegian defence industry. Appl Econ Lett
22(2):153–157
Foly C (2013) Chancenkontinent Afrika. Bundesverband der Deutschen Industrie e.V, Berlin.
Available at: http://www.bdi.eu/images_content/GlobalisierungMaerkteUndHandel/BDI-
Umfrage_SSA.pdf (online)
Freeman J, Styles C (2014) Does location matter to EP? Int Mark Rev 31(2):181–208
Fugazza M, McLaren A (2014) Market access, EP and survival: evidence from Peruvian firms. Rev
Int Econ 22(3):599–624
Hardoon H (2013) Global corruption barometer. Transparency International, Berlin
Jordan AC (2014) The impact of trade facilitation factors on South Africa’s exports to a selection
of African countries. Dev Southern Afr 31(4):591–605
Kahiya ET, Dean DL (2014) EP: multiple predictors and multiple measures approach. Asia Pac J
Marketing Logistics 26(3):378–407
Katsikea CS, Leonidas S, Leonidou C, Morgan NA (2000) Firm-level EP assessment: review,
evaluation, and development. J Acad Mark Sci 28(4):493–511
Katsikea E, Theodosiou M, Morgan RE (2007) Managerial, organizational, and external drivers of
sales effectiveness in export market ventures. J Acad Mark Sci 35(2):270–283
Lacka I, Stefko O (2014) Key factors for development of export in Polish food sector. Organizacija
47(2):107–115
Lado N, Martínez-Ros E, Valenzuela A (2004) Identifying successful marketing strategies by
export regional destination. Int Mark Rev 21(6):573–597
Lages LF, Lages CR (2004) The ‘STEP’ scale. A measure of short term EP improvement. J Int
Marketing 12(1):36–56
Lages LF, Montgomery DB (2005) The relationship between export assistance and performance
improvement in Portuguese export ventures: an empirical test of the mediating role of pricing
strategy adaptation. Eur J Mark 39(7/8):755–784
Lee C, Griffith D (2004) The marketing strategy-performance relationship in an export-driven
developing economy: a Korean illustration. Int Mark Rev 21(3):321–334
Leonidou LC (1995) Export barriers: non-exporters’ perceptions. Int Mark Rev 12:4–25
Leonidou LC, Katsikea CS (2010) Integrative assessment of exporting research articles in business
journals during the period 1960–2007. J Bus Res 63(8):879–887
Leonidou LC, Katsikea CS, Samiee S (2002) Marketing strategy determinants of EP: a
meta-analysis. J Bus Res 55:51–67
Ling Yee L (2004) An examination of the foreign market knowledge of exporting firms based in
the People’s Republic of China: its determinants and effect on export intensity. Ind Mark
Manage 33(7):561–572
Matanda MJ, Freeman S (2009) Effect of perceived environmental uncertainty on
exporter-importer inter-organisational relationships and EP improvement. Int Bus Rev 18
(1):89–107
Merkel A (2014) Speech by Federal Chancellor Angela Merkel at the reception for the diplomatic
corps at the federal chancellery. In Federal Chancellor of Germany. The Press and Information
Office of the Federal Government, Berlin, p. 1
MoAE (2015). German Mittelstand: Motor der deutschen Wirtschaft. Bundesministerium für
Wirtschaft und Energie (BMWi) [Federal Ministry for Economic Affairs and Energy],
Ward PT, Duray R (2000) Manufacturing strategy in context: environment, competitive strategy
and manufacturing strategy. J Oper Manag 18(2):123–138
Wheeler C, Ibeh K, Dimitratos P (2008) UK EP research: review and implications. Int Small Bus J
26(2):207–239
Wierts P, Van Kerkhoff H, De Haan J (2014) Composition of exports and EP of Eurozone
countries. J Common Market Stud 52(4):928–941
WTO (World Trade Organization) (2014). International trade statistics 2014. World Trade Organization
International Trade Statistics, pp 178–179
Yeoh PL, Jeong I (1995) Contingency relationships between entrepreneurship, export channel
structure and environment: a proposed conceptual model of EP. Eur J Mark 29:95–115
Zhao H, Zou S (2002) The impact of industry concentration and firm location on export propensity
and intensity: an empirical analysis of Chinese manufacturing firms. J Int Marketing 10(1):52–71
Zou S, Stan S (1998) The determinants of EP: a review of the empirical literature between 1987
and 1997. Int Mark Rev 15:333–356
Zou S, Taylor C, Osland G (1998) The EXPERF scale: a cross-national generalized EP measure.
J Int Marketing 6(3):37–58
Chapter 11
An Assessment of the Contribution
of Mineral Exports to Rwanda’s
Total Exports
Emmanuel Mushimiyimana
11.1 Introduction
Modern mining started in Rwanda in the 1930s even though before colonialism
Rwandans heated tin for the production of traditional hoes, machetes, spears, and
other domestic material. The mining sector in Rwanda was started by Belgians who had gained mining experience in Katanga, in southeastern DRC. Then, two companies
International Council on Mining and Metals was formed in 2001 to catalyze improved
performance and enhance the contribution of mining, minerals, and metals to sustainable
development.
E. Mushimiyimana (&)
Department of Political Science and International Relations, College of Arts and Social
Sciences, University of Rwanda, Butare, Rwanda
e-mail: manemanu12@yahoo.fr
1 Société des Mines d’Etain du Ruanda-Urundi.
2 Société Minière de Muhinga-Kigali.
3 Régie d’Exploitation et de Développement des Mines.
11 An Assessment of the Contribution of Mineral … 239
case of metal resources such as tin and tungsten. ‘The establishment of processing
plants to smelt cassiterite into tin, refining wolframite and tantalite into tungsten and
tantalum respectively is open to private investors’ (RDB 2014: 1). The government is committed to supporting the more than 400 local mining companies and 30 cooperatives that are open to considering partnerships and joint ventures covering financing, capital equipment, technical support, and competitive mineral trade contracts (RDB 2014).
Besides, there is a need to boost the exploitation of gemstones: ‘Rwanda possesses
a variety of gemstones including; beryl (aquamarine), amblygonite, corundum
(ruby and sapphire), tourmalines and different types of quartz and granites. Setting
up cutting and polishing plants of gemstones is also an opportunity’ (RDB 2014).
Trading of minerals is carried out by ‘holders of mining and mineral trading
licenses and owners of smelting and screening companies’ (RDB 2014: 2).
Rwanda’s target is ‘trading in minerals, including cassiterite, wolframite and nio-
bium—tantalite must contain at least 30% value added’ (RDB 2014). There is a
need to develop industrial minerals in order to meet the ‘demand for construction
materials especially tiles, slabs sculptures, paints, bricks and concrete aggregates.
Rwanda possesses a variety of minerals such as good quality silica sands, kaolin,
vermiculite, diatomite, clays, limestone, talcum, gypsum and pozzolan’ (RDB
2014). However, as compared to other countries, Rwanda’s performance in mineral
exports is yet to improve.
Botswana, for instance, used mineral resources as a source of income to finance her expenditure for her independence. ‘Botswana’s success appears so exceptional because the driving force behind Botswana’s economy has been its mineral sector’ (Dougherty 2011: 9). Rwanda, by contrast, treated the mining sector as subsidiary. Its main source of income has been aid, and mineral exploitation has remained weak since independence. However, due to the developmental needs of the country in the twenty-first century, the policy is changing and the mining sector is now considered one of the strategic inputs that will help the country to sustain growth, independence, and self-reliance. The question remains which means and which way forward will bring about positive changes.
In comparison, Botswana’s strategy was to attract foreign direct investment
(FDI) and protect investors from any failures or to compensate them when they
failed. This helped the country to be FDI friendly, and it accumulated more and
more resources from abroad. Botswana’s openness to foreign assistance was also
reflected in its export-oriented productive structure. Initially, Botswana produced
beef and diamonds for export, but over time it diversified into non-traditional export
crops, mostly to South Africa (Dougherty 2011). The government was able to retain a significant portion of the wealth generated by Botswana’s diamond mines through a policy which, rather than retaining a fixed percentage of sales, involved profit-sharing agreements and a portion of equity in mining operations. This policy allowed the government to retain significant shares of profitable ventures and smaller shares of less profitable ventures; it also did not deter new investors (Dougherty 2011).
Further, interest in mining investments needs to be underpinned by an open
market economy. Restricted trade halts competitiveness. However, there should be
a sense of control of the mining sector since it is based on natural resources and has
both embedded advantages and risks. One of the mechanisms of controlling mining
companies is framing proper agreements.
Botswana signed a contract based on production sharing with De Beers, a heavy investor in the country. In this regard, there are four types of contracts: license agreements, production-sharing agreements, joint ventures, and service agreements. License agreements give more rights to the contracting firm, such as the right to a mining concession, production, and exports. Under production-sharing agreements, the state cedes all production and exporting authority to the firm, but this usually involves an equity arrangement and higher returns to the government in the long run. Under these two types of agreements, the government does not shoulder any risks (Dougherty 2011). However, under a license agreement, the government can lose total control of mining concessions. In joint ventures and service agreements, a firm gets a limited right to mineral exploitation and trading, and the government controls the concessions and the trading of the production. The risk is that political elites who control the government use their political power to mismanage production, and the firm that works with the government becomes overly constrained. It is worth knowing which types of contract Rwanda has signed with key mining companies, since improvements in mineral exports depend not only on the type of contract and natural resource endowments but also on the diversification of mineral products for export.
Namibia is a sound example of successful mining of gold and dimension stones such as granite and marble; Rwanda too has potential in these minerals. Some minerals that have been left behind are now important, given that Africa is modernizing in both style and scale. For instance, Namibia exports granite, and Rwanda has a new industry that processes granite, the East African Granite Industry Ltd. Rwanda also has gold at Miyove in the Northern Province. In 2011, Simba Gold Corp. of Canada engaged in soil and rock sampling at its Miyove Gold project. In November, Desert Gold Ventures Inc. of Canada purchased the Byumba concession, which had resources of 5.55 million metric tons at a grade of 1.48 g of gold per metric ton. Desert Gold and Simba planned to drill at Byumba and Miyove Gold, respectively, in 2012 (Desert Gold Ventures Inc. 2012). Since gold is a precious and lucrative metal worldwide, its exports can yield substantial revenue for Rwanda once it is well exploited.
In short, Namibia and Botswana are role models for sub-Saharan African countries as they have enhanced their economic development by strengthening their mining sectors. Unlike some other sub-Saharan countries, Rwanda has not yet extracted diamonds or oil, but she has gold, cassiterite, and tantalum in addition to methane gas, granite, and other types of dimension stones. What is needed is to boost production and attract more foreign direct investment in order to generate more income from mineral exports.
Our research hinges on the hypothesis that the exports of mineral resources can contribute significantly and progressively to Rwanda’s total export revenue, as has happened in other low- and middle-income countries. We use econometric methods to investigate the contribution of Rwanda’s mineral exports to total exports from 1998 to 2014. The literature review discusses recent theories developed by ICMM, which argue that mineral exports increased in value from 2005 to 2010 and that this played a significant role in enhancing sustainable economies and reducing poverty in developing nations. The contribution of our research lies in testing whether this ICMM theory is applicable to Rwanda from 1998 to 2014. It also looks at the different approaches that Botswana and Namibia have used to reach high levels of mineral production and exports, and thus highlights how Rwanda can follow these African role models in the mining sector.
The research outcomes show that if mineral exports increase by 10%, then total exports will increase by 7%. The calculated probability (Pr = 0.00) is below 0.05. Therefore, at the 5% significance level, mineral exports (MINEX) make a significant contribution to total exports (TOTEX). The recommendation is that the Government of Rwanda set up mechanisms to boost mineral exploitation both at her domestic mineral sites and in neighboring countries through private companies or public-private joint ventures. The government should respect the legalization standards set regionally and internationally so that the revenue from mining empowers the state and the region instead of destroying them (Collier and Hoeffler 2002).
Our study concludes that Rwanda did not reach the minimum average level of contribution of mineral exports to total exports, which according to ICMM lies between 30 and 60%. It is also argued that the pace is still too slow for the country to catch up with other low- and middle-income countries because even if Rwanda increases mineral exports by 10%, ceteris paribus, total exports will only increase by 7%. Rwanda would instead need to increase her mineral exports by at least 50% in order to achieve a 35% increase in total exports, or by 100% in order to achieve a 70% increase. Therefore, there is a need to reform the mining sector by referring to role models such as Botswana and Namibia.
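The figures above scale the estimated response of 0.7 linearly (10% gives 7%, 50% gives 35%, 100% gives 70%). The sketch below reproduces that arithmetic and, for comparison, shows what a constant-elasticity reading would give if the 0.7 came from a log-log specification. That specification is an assumption made here for illustration; this passage does not state the functional form, and the function names are hypothetical.

```python
ELASTICITY = 0.7  # implied by the reported 10% -> 7% relationship


def linear_response(pct_change_minex):
    """Linear scaling used in the text: percentage change in total exports."""
    return ELASTICITY * pct_change_minex


def constant_elasticity_response(pct_change_minex):
    """Exact percentage change if TOTEX were proportional to MINEX ** ELASTICITY."""
    growth_factor = 1 + pct_change_minex / 100
    return (growth_factor ** ELASTICITY - 1) * 100


for change in (10, 50, 100):
    print(change,
          round(linear_response(change), 1),
          round(constant_elasticity_response(change), 1))
```

For small changes the two readings nearly coincide. For a doubling of mineral exports the constant-elasticity reading gives a somewhat smaller increase than the linear 70%, so the larger figures are best treated as approximations.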
Our research hinges on the ICMM theory that mineral exports increase rapidly to
become a major share of total exports in low-income agrarian economies even when
they start from a low base. Developing countries’ exports are less than their imports, and this implies that the balance of payments of LDCs4 is always in deficit.
Increasing exports is a good way of boosting the economy. Increasing exports
implies that the government earns more foreign currency to be able to purchase the
commodities that the country needs to import for economic sustainability and the
welfare of its citizens. In a framework of self-reliance, the government of Rwanda is
4 Less developed countries or developing nations with GDP less than US$5000 per capita.
looking at lowering its aid dependency and building an economy based on pro-
duction, accumulation of FDI, and expansion of other sectors such as services and
industries. The key sectors in Rwanda have been mainly agriculture, industry, and
services. According to Minister Gatete, the service sector was the main contributor to the country’s GDP in 2011: ‘The Service sector contributed 45% of GDP compared to 33 and 16% of agriculture and industrial sectors respectively. The Service sector had the highest growth of 12% followed by Industry 7% and agriculture 3%.’ Against the background of the Prebisch–Singer hypothesis that ‘(a country) with high export dependence on primary products5 stands to lose out from a worsening of the terms of trade’ (Riley 2012), ICMM posits that, in low- and middle-income countries, the contribution of mineral resources to the accumulation of FDI and to total exports is high, at 60–90 and 30–60%, respectively, while their contribution to government revenue (2–20%), national income (3–10%), and total employment (1–2%) is limited.
On the one hand, mining FDI often dominates total FDI flows in low-income
economies that have only limited other attractions for international capital; on the
other hand, mineral exports can increase rapidly to become a major share of total
exports (ICMM 2012). These are the domains in which mineral resources have provided considerable outputs in the last two decades. However, without a considerable increase in government revenue, income, and employment, the mining sector’s role in building a more sustainable economy in a developing nation cannot be assured. The mining sector has contributed to the growth of countries such as Botswana and Namibia (Dougherty 2011), but elsewhere it has led to the reverse outcome, namely a resource curse, or has put countries at high risk (Collier and Hoeffler 2002; Global Witness 2010). In sub-Saharan Africa, the countries endangered by mineral resources are Sierra Leone, Zimbabwe, DRC, and Angola. Therefore, the accumulation of FDI and an increase in total exports must go hand in hand with strategies for the government to obtain a considerable share of mining revenue; otherwise minerals will only raise profits for companies rather than for states and societies.
Mineral taxation has become a very significant source of tax revenue in many low-income economies with limited tax-raising capacities (ICMM 2012). However, its contribution of 2–20% is not high, because these countries lack the institutional capacity to tax mineral exploiters and their mining concessions are dominated by informal trade. Moreover, some low- and middle-income nations have corrupt tax systems or manage collected money and other resources inefficiently.
Mineral exports of some developing nations lack value addition since they
export raw materials. The modern mineral process technology is sophisticated and
requires intensive capital (ICMM 2012) and skilled labor to be more effective for
total exports. Wright and Czelusta (2004) argue that it is no coincidence that
countries’ exports of minerals and metals tend to emerge across multiple
5
Goods with low levels of processing, diversification and raw materials.
11 An Assessment of the Contribution of Mineral … 243
commodities in concert. Davis (2009) has argued that many countries have multiple
and various mineral endowments that are there for the taking, and mineral
extraction is a matter of domestic public interest, supported by sufficient
country-specific technological knowledge and in some cases technological advan-
ces that lead to production and exports across a broad range of endowments.
According to Davis (2009), a mining policy is important for potential augmentation
of endowments. For instance, Chile was a major exporter of copper in the 1800s,
which then fell away as its high grade deposits got exhausted and there was no
national consensus for supporting the industry. Production surged again in the
mid-1900s as government support for mining was renewed (Davis 2009: 5). In
actual fact, the main difficulties lay in the link between mineral income profitability
and the welfare of citizens.
Mining employment on its own is usually small relative to the total national
labor force (ICMM 2012) because the mining sector increasingly relies on
machines rather than manpower. This means that for minerals to be profitable for the
people and the economy in general, economic distribution is important. Other
findings also show that countries with mineral endowments can become poorer than
those without mining concessions. Zimbabwe and Nigeria illustrate this.
‘Zimbabwe is a country tremendously blessed with vast and diverse precious stones
ranging from gold, chrome, lithium, asbestos, and cesium, as well as high-quality
emeralds and other minerals and metals’ (Mahonye and Mandishara 2015: 1–2).
Since independence, the mining sector has contributed an average of about 40% to
total exports (Hawkins 2009) with the major share coming from gold and other
minerals such as ferrochrome, nickel, and platinum. Zimbabwe, however, still falls in the
range of low-income countries, with many people under the poverty line. In another
case, Mills (2010) highlights that although Nigeria has earned an estimated
US$400 billion from oil in the past 40 years, the number of Nigerians living
under US$1 per day has increased consistently. Mills (2010: 171b) says: ‘Nigeria
would have been better- by some estimates the economy would have been 25%
bigger- if the Niger delta had no oil.’ Table 11.1 shows that not only have the
countries in the Great Lakes region misused natural resources for their economic
growth but also that the contribution of mineral exports was very poor in the other
countries in the same sub-Saharan region. This implies that mining policies in the
Great Lakes region in general and in Rwanda in particular should be taken
seriously.
Our research uses the ICMM theory that mineral resources can rapidly contribute
to total exports even if the economy of that country is agrarian. Therefore, we rely
on ICMM’s measurable data highlighted earlier besides Davis’ (2009) theory
referred to earlier which argues that the development of mineral exports does not
depend on an abundance of natural reservoirs but mostly on policy choices to
develop an added value for minerals for export and increasing their endowments in
the national economy. The contribution of our research is that it tests the appli-
cability of the existing knowledge to the Rwandan situation and tests the position
and pace of Rwanda as one of the low-income countries in the area.
Table 11.1 Mineral resources and the GDP PPP per capita of GLR countries as compared to advanced countries in the mining sector in the sub-Saharan region

Country                      GDP PPP per capita 2012   GDP PPP per capita 2013   Natural resources

Great Lakes region
Burundi                      $600                      $600                      nickel, uranium, rare earth oxides, peat, cobalt, copper, platinum, vanadium, arable land, hydropower, niobium, tantalum, gold, tin, tungsten, kaolin, limestone
DRC                          $400                      $400                      cobalt, copper, niobium, tantalum, petroleum, industrial and gem diamonds, gold, silver, zinc, manganese, tin, uranium, coal, hydropower, timber
Rwanda                       $1500                     $1500                     gold, cassiterite (tin ore), wolframite (tungsten ore), methane, hydropower, granites, sand, and arable land

Other sub-Saharan countries with mining efficiency
Botswana                     $15,900                   $16,400                   diamonds, copper, nickel, salt, soda ash, potash, coal, iron ore, silver
The Republic of the Congo    $4700                     $4800                     petroleum, timber, potash, lead, zinc, uranium, copper, phosphates, gold, magnesium, natural gas, hydropower
Namibia                      $7900                     $8200                     diamond, copper, uranium, gold, silver, lead, tin, lithium, cadmium, tungsten, zinc, salt, hydropower

Source CIA world fact book (data value in US$ 2013)
E. Mushimiyimana
11.3 Methods
11.4 Data
Our research used secondary data, official documents, and discourses related to
Rwandan exports. It also compared data from known sources such as the CIA
World Fact Book, the National Bank of Rwanda (BNR), and the Rwanda Natural
Resource Authority (RNRA). We visited BNR to gather data.
Table 11.1 gives information about mineral resources and GDP per capita of the
countries in the Great Lakes region (GLR) as compared to advanced countries in the
mining sector in the sub-Saharan region. From the table, it is clear that GLR’s
mineral resources did not contribute to the countries’ GDPs. Though our research
did not measure the rate of contribution of the mining sector to the rest of the
countries highlighted earlier due to the limitation of the scope of the research, it is
clear that countries such as Botswana and Namibia benefitted from good policies in
the mining sector to help them overcome poverty. Besides, Rwanda and GLR in
general have different mineral endowments. Development of Rwanda’s mineral
exports during 1999–2003 is shown in Table 11.2.
Table 11.3 shows Rwanda’s annual export earnings and annual contribution of
mineral exports during 1995–2013. There is a strong and positive trend in both
indicators over time.
Table 11.3 shows that the contribution of mineral exports to total exports, cal-
culated in percentages, increased from 1995 to 2001, and went downward and
upward in a U-shaped curve from 2001 to 2005. It increased again in 2008 to take a
stable position in 2010 and 2013 (see also Table 11.4). However, though there was
a positive increase in general, mineral exports were in a sharp upward move from
1995 to 2001 while positively uneven from 2002 to 2012 (see Fig. 11.1).
Table 11.3 Annual contribution of mineral exports to total export of Rwanda since 1995 (in %)

Year    Export earnings (US$ million)    Contribution of mineral exports (%)
1995    1.5                              3.0
1996    2.3                              3.7
1997    3.8                              4.1
1998    4.7                              7.3
1999    6.9                              11.2
2000    12.6                             18.2
2001    42.6                             45.6
2002    15.9                             23.6
2003    11.1                             17.5
2004    29.3                             29.9
2005    37.3                             29.9
(continued)
Figure 11.1 shows the increase in mineral revenues from 1998 to 2014 (drawn from
Table 11.2). There is a prediction that in 2020, mineral exports will be equal to or
more than US$300 million. A scatter plot of Rwanda’s mineral export data for
1999–2013 was drawn to track progress in generating revenue.
The results, shown in Table 11.4 and Fig. 11.1, are that the revenue accrued
rose from almost US$20 million to US$250 million in 2013. This shows how progressive
mineral income has been for Rwanda’s total revenue in the last 15 years. The linear
shape of the scatter plot shows that Rwanda will continue to get more and more
mineral revenue in the coming years, if other factors remain constant.
Though revenues from mineral exports increased from 2000 to 2013,
Fig. 11.2 shows that there were some dips in 2003, 2009, and 2012 and that the
share of minerals in total revenue decreased little by little in 2003 and 2006,
before settling at almost 30% from 2009 to 2013. The share is still low
even though real income from mineral exports increased sharply, because other
sectors of Rwanda’s economy improved as well; this was mainly the service sector, which
has taken the lead in the last few years. This is also quite similar to the Rwanda
Development Board’s (RDB 2014) position and prediction: ‘In the last three years,
mineral exports recorded USD 96.4M (2010), USD 15.4M (2011) and USD
136.1M (2012). The sub-sector’s contribution to GDP is to increase from 1.2 to
5.27% (10% growth rate per each year) up to 2017/2018.’
Fig. 11.2 Contribution of mineral exports to total exports and earnings for Rwanda (in %)
This model refers to the fact that the more the mineral export revenue (LMINEX)
increases, the more it significantly increases Rwanda’s total exports (LTOTEX). If
the total export revenue increases at a high pace, then Rwanda’s balance of pay-
ments will be positive and the country will be able to finance most of its imports
and other public expenditure. Therefore, the econometric model will define the
contribution of mineral exports to total exports:
$$\mathrm{LTOTEX}_t = b_1 + b_2\,\mathrm{LMINEX}_t + e_t \tag{1}$$
From Table 11.4 we get an econometric table, set in logarithmic data in order to
ease an interpretation of percentages (Table 11.5).
The estimated equation is LTOTEX = 6.4318 + 0.71423 * LMINEX. This means that if
mineral exports increase by 10%, total exports increase by about 7%. The calculated
probability is Pr = 0.00 < 0.05. Therefore, mineral revenue (LMINEX) has a significant
effect on total export revenue (LTOTEX) at the 5% level of significance,
but this pace is very slow considering the level of other
Rwanda is far away from Botswana and Namibia, which have average percentage
contribution of mineral exports to total exports of 83.7 and 53.4%, respectively
(ICMM 2012). The results of our econometric model show that if Rwanda wants to
reach the levels of these role models, she has to increase her mining sector’s
performance to 80 or 120%.
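The elasticity reading of the estimated slope can be checked with a short calculation. The sketch below is illustrative only: it takes the reported coefficient 0.71423 as given, and the function names are hypothetical. It compares the exact change implied by a log-log model with the usual first-order approximation (slope times percentage change).

```python
import math

# Slope reported in the text for LTOTEX = 6.4318 + 0.71423 * LMINEX.
ELASTICITY = 0.71423


def implied_change_in_total_exports(mineral_growth, elasticity=ELASTICITY):
    """Exact proportional change in total exports implied by a log-log model.

    mineral_growth is a proportion (0.10 means +10%). The intercept drops
    out because only the change in the logarithm matters.
    """
    return math.exp(elasticity * math.log(1.0 + mineral_growth)) - 1.0


def approximate_change(mineral_growth, elasticity=ELASTICITY):
    """First-order approximation: elasticity times the proportional change."""
    return elasticity * mineral_growth


if __name__ == "__main__":
    for growth in (0.10, 0.50, 1.00):
        exact = implied_change_in_total_exports(growth)
        approx = approximate_change(growth)
        print(f"minerals +{growth:.0%}: exact {exact:.2%}, approximation {approx:.2%}")
```

For a 10% increase the two readings nearly coincide at about 7%; for larger changes the approximation overstates the exact effect, so the exact formula is the safer one to quote.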
Based on the model used, our research recommends that the mining policy of
Rwanda should focus on the following: (1) setting up a main strategy to boost exports of
minerals; (2) structuring and industrializing the mining sector so that the exploitation and
production of minerals grow smoothly instead of fluctuating from year to year,
and adding value, especially by setting up refineries;
(3) determining the types of contracts that the government signs with firms. We
recommend production-sharing agreements instead of license agreements or any
other type of contract. Production-sharing agreements maximize the government’s
revenue while giving all rights of exploitation and exports to private firms; (4) reallocating
mineral incomes to other pro-development policies such as education and
infrastructure, starting from the areas where mining concessions are given, as
compensation for local environmental damage; (5) making the
mining sector go hand in hand with other public reforms such as good
governance and policies that decrease the gap between the rich and the poor. Once
the government has accrued mining revenue, it can also help other sectors such as
manufacturing, agriculture, and industry to develop; (6) introducing
more modern technology and market openness so that the mining sector becomes more
effective and efficient; attracting efficient investors could be an added value; and (7) developing
not only cassiterite or tantalum production but also gold exploitation,
methane gas, and the processing of dimension stones such as granite, as
Namibia did.
Our research concludes that mineral exports have not contributed considerably to
Rwanda’s total export revenues. However, Rwanda had a significant increase in
revenues from mineral resources between 1998 and 2014 but did not reach the
average contribution of mineral exports to total exports of 30–60% as highlighted
by ICMM. Minerals only contributed 29.1% to her total exports, and this implies
that Rwanda still has a lot to do in terms of improving its mining sector. We have
also seen that Botswana and Namibia in Africa took off due to strategic and wise
exploitation of resources. Rwanda can learn from them.
The econometric model indicates that if mineral revenues from exports increase by
10%, then total export revenues will increase by about 7%. Rwanda needs to multiply its
existing efforts by 8–12 times if, like Namibia and Botswana, she wants a more
significant effect of mineral exports on her economy.
The Government of Rwanda can set up mechanisms to boost mineral
exploitation so that this sector contributes significantly to its economy. She can
come up with policy measures to attract foreign companies to invest heavily in the
exploitation of gold, methane gas, and dimension stones such as granite and marble,
as happened in Namibia, rather than focusing only on cassiterite or tantalum. The
contractual frameworks with companies should be based on production-sharing
agreements, as in Botswana, in order to liberalize the mining sector while the state
maximizes its profits.
References
Fentahun Baylie
Abstract This study analyses the long-run relationship between economic growth
and real exchange rate for a group of 15 low- and middle-income countries for the
period 1950–2011. Co-integration between growth and exchange rate is established
by means of an augmented pooled mean group estimation method (which controls
for heterogeneity and cross-sectional dependence). Unlike previous studies,
cross-sectional dependence is accounted for which implies that the productivity
effect of the Balassa term is expected to be estimated consistently and without bias.
Moreover, our results indicate that the effect of the Balassa term depends more on
the income group (level of per capita income) than the rate of economic growth.
In general, the power of the effect is stronger for higher income countries in the long
run. The study clearly indicates that the Balassa hypothesis holds for middle-
income countries, while this is not the case for low-income countries. However,
fiscal policy and exchange rate volatility rather clearly explain the variations in the
real exchange rate.
12.1 Introduction
The Balassa hypothesis tests the impact of productivity growth on the real exchange
rate. It states that for a growing economy, the real exchange rate is expected to
appreciate in the long run. Our study is based on a finding by Baylie (2008). The
real effective exchange rate is an important policy parameter and among the most
determining factors of growth in Ethiopia (Baylie 2008). Though Baylie recom-
mends depreciation of the domestic currency for promoting economic growth in the
short run, the author discovered that it is healthier to allow appreciation in the long
F. Baylie (✉)
Department of Economics, Addis Ababa University, Addis Ababa, Ethiopia
e-mail: fbaylie@yahoo.com
1
Though the idea has been mentioned by several authors (like Ricardo 1911; Harrod 1933; Viner
1937), the contribution of other authors is not as bold as Paul A. Samuelson and Bela Balassa
and hence the name Balassa-Samuelson hypothesis (Tica and Druzic 2006). The term ‘Balassa
hypothesis’ is used in this study.
12 Testing the Balassa Hypothesis in Low- and Middle-Income Countries 257
tendency for levels of per capita income or productivity to equalize over time.
Growth theories2 state that countries with low capital-to-labor ratios (high marginal
productivity of capital) in general and with advantages in elements such as innovation
ability, human capital formation, technical progress, and economies of scale
in particular grow faster than others (Kumo 2011; Orlik 2003; Soukiazis 1995).
According to these growth theories, there is a tendency for developing countries
to grow faster than developed countries if some conditions in particular are satis-
fied. Given that the Balassa hypothesis is related to the impact of economic (pro-
ductivity) growth on the real exchange rate, there should be a greater probability of
finding evidence for the hypothesis in converging economies as compared to
developed ones. The convergence process, thus, may be used as a criterion for
identifying candidate countries for a sample study.
12.3 Methodology
Data for all countries and variables are from Penn World Table for the period
1950–2011. The variables include exchange rate, per capita GDP, and government
expenditure. While the choice of the study period for each country depends on
data availability, countries are selected on the basis of the convergence criterion
which suggests that the fastest growing economies are mainly the developing
economies.
According to IMF’s World Economic Outlook Report (2015), all 15 countries in
our sample are developing countries. However, for comparison purposes, the
sample is divided into two categories on the basis of the size of economies (relative
GDP). The first group represents the top five largest economies in the sample—
BRICS (Brazil, Russia, India, China, and South Africa). They are from the (upper)
middle-income countries’ category (except India) which together nearly represent
90% of the US economy. The second group consists of 10 low-income countries
(Angola, Ethiopia, Ghana, Indonesia, Kenya, Nigeria, the Philippines, Rwanda,
Tanzania, and Uganda). Lower middle-income countries (with per capita income
lower than $4125) are included in the second group in our sample.
2
There are three main theoretical approaches to explain the convergence phenomenon: the neo-
classical approach, endogenous growth theory, and demand-orientated approach. While (absolute)
convergence is the inherent nature of diminishing returns to reproducible capital in the first
approach, it is conditional on different factors and elements such as innovation ability, human
capital formation, technical progress, and economies of scale in the second and third approaches
(Soukiazis 1995).
The original Balassa model was designed for a fully employed small open economy;
a 2 × 2 × 2 system (two countries, two commodities, two factors);
inter-sector mobile labor (scarce factor) and inter-nation mobile capital; the law of one
price for factors within a nation and for tradables across nations; a constant returns to
scale production frontier; perfect competition in both markets (goods and factors);
neutral technical progress; and constant terms of trade (Podkaminer 2003).
A derivation of the Balassa–Samuelson model may be considered as a
three-stage process. The first is to derive the relationship between the productivity
differential and relative price. The second is to derive the relationship between
relative price and exchange rate. The third is to derive the relationship between
productivity differential and exchange rate.
STEP 1: The original Balassa–Samuelson model is framed on the basis of the
traditional Ricardian trade model (Asea and Corden 1994a, b). It is a supply-side
model defined by constant return to scale Cobb-Douglas style production functions
in two sectors as (Podkaminer 2003):
where $T$ and $N$ refer to traded and non-traded sectors, and $\alpha$ and $\beta$ represent the
share of labor in each sector, respectively, with $\beta \geq \alpha$.
In a perfectly competitive market, factor prices must equal their respective value
of marginal products at equilibrium for both sectors:
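$$P_T A_T \alpha \left(\frac{K_T}{L_T}\right)^{1-\alpha} = w \tag{12.3}$$
$$P_T A_T (1-\alpha) \left(\frac{K_T}{L_T}\right)^{-\alpha} = r \tag{12.4}$$
$$P_N A_N \beta \left(\frac{K_N}{L_N}\right)^{1-\beta} = w \tag{12.5}$$
$$P_N A_N (1-\beta) \left(\frac{K_N}{L_N}\right)^{-\beta} = r \tag{12.6}$$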
Combining the two factor markets for each sector independently and taking the
logarithm of both sides for each equation yields:
Recalling the assumption that price of tradables (numeraire) and interest rate
(not technology) are the same across boundaries, differentiation of the above with
respect to time yields:
$$\frac{dP_T(s)/ds}{P_T(s)} = 0 = \alpha\,\frac{dw(s)/ds}{w(s)} - \frac{dA_T(s)/ds}{A_T(s)} \tag{12.9}$$
$$\frac{dP_N(s)/ds}{P_N(s)} = \beta\,\frac{dw(s)/ds}{w(s)} - \frac{dA_N(s)/ds}{A_N(s)} \tag{12.10}$$
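Substituting Eq. (12.9) into Eq. (12.10) helps define the relative price of
non-tradables in terms of productivity differentials for the home and foreign country
(a hat, as in $\hat{A}$, represents a growth rate; an asterisk denotes the foreign country):
$$\frac{dP_N(s)/ds}{P_N(s)} = \frac{\beta}{\alpha}\,\frac{dA_T(s)/ds}{A_T(s)} - \frac{dA_N(s)/ds}{A_N(s)} \tag{12.11}$$
$$\hat{p}_N = \frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N \tag{12.12}$$
$$\hat{p}_N^* = \frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^* \tag{12.13}$$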
The difference between Eqs. (12.12) and (12.13) defines price differentials
across countries:
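$$\hat{p}_N - \hat{p}_N^* = \left(\frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N\right) - \left(\frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^*\right) \tag{12.14}$$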
This means that the price differential between sectors and across countries can be
explained by productivity differentials between sectors and across nations.
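The combination step in STEP 1 can be sketched as follows. This is an illustrative reconstruction rather than the chapter's own intermediate equations: it assumes the Cobb-Douglas forms implied by Eqs. (12.3)–(12.6), and the output symbols $Y_T$ and $Y_N$ are introduced here only for illustration.

```latex
% Assumed production functions, consistent with the marginal products in (12.3)-(12.6)
Y_T = A_T L_T^{\alpha} K_T^{1-\alpha}, \qquad Y_N = A_N L_N^{\beta} K_N^{1-\beta}

% Solve (12.4) for the capital-labor ratio of the traded sector
\frac{K_T}{L_T} = \left[\frac{P_T A_T (1-\alpha)}{r}\right]^{1/\alpha}

% Substitute into (12.3) and take logarithms
\ln w = \ln\alpha + \frac{1}{\alpha}\left(\ln P_T + \ln A_T\right)
        + \frac{1-\alpha}{\alpha}\left[\ln(1-\alpha) - \ln r\right]

% Differentiate with respect to time, holding r constant
\hat{w} = \frac{1}{\alpha}\left(\hat{p}_T + \hat{A}_T\right)
\quad\Longrightarrow\quad \hat{p}_T = \alpha\hat{w} - \hat{A}_T

% The same steps for the non-traded sector give
\hat{p}_N = \beta\hat{w} - \hat{A}_N

% With tradables as numeraire, \hat{p}_T = 0, so \hat{w} = \hat{A}_T / \alpha and
\hat{p}_N = \frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N
```

The last line reproduces Eq. (12.12); the foreign-country counterpart follows by the same argument with starred variables.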
STEP 2: We follow Ahn (2009) to link the exchange rate and productivity differ-
ential through the price index. The real exchange rate is defined in a log-linear form
as (increase shows appreciation):
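$$Q = \frac{P}{eP^*}, \qquad q = p - e - p^* \tag{12.15}$$
with the price indices
$$P = P_N^{\delta} P_T^{1-\delta} \quad \text{and} \quad P^* = P_N^{*\,\theta} P_T^{*\,(1-\theta)}$$
In log-linear form:
$$p = \delta p_N + (1-\delta) p_T \tag{12.16}$$
$$p^* = \theta p_N^* + (1-\theta) p_T^* \tag{12.17}$$
$\delta$ and $\theta$ represent the share of non-tradables in the consumer basket at home and
abroad, respectively. Substituting Eqs. (12.16) and (12.17) into Eq. (12.15) helps
define the real exchange rate as a function of price differential:
$$q = \delta(p_N - p_T) - \theta(p_N^* - p_T^*) + p_T - e - p_T^* \tag{12.18}$$
Since $p_T = e + p_T^*$ (law of one price for tradables), Eq. (12.18) will be:
$$q = \delta(p_N - p_T) - \theta(p_N^* - p_T^*) \tag{12.19}$$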
STEP 3: Eq. (12.19) defines the real exchange rate as a function of the relative price
differential between countries. Substituting Eq. (12.14) into Eq. (12.19) helps define
the exchange rate as a function of the productivity differential. We assume that the
share of non-tradables in the foreign consumer basket ($\theta$) is the same as at home ($\delta$).
Hence:
$$\hat{q} = \delta\left[\left(\frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N\right) - \left(\frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^*\right)\right] \tag{12.20}$$
If the home market grows faster than the foreign one, then the domestic currency
appreciates and vice versa.
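A small numerical illustration of Eq. (12.20) may help fix the sign convention. The sketch below is hypothetical throughout: the function name, the parameter values, and the growth rates are invented for illustration and are not taken from the study's data.

```python
def real_exchange_rate_growth(delta, alpha, beta,
                              a_t_home, a_n_home,
                              a_t_foreign, a_n_foreign):
    """Growth rate of the real exchange rate as in Eq. (12.20).

    delta: share of non-tradables in the consumer basket (same at home and abroad)
    alpha, beta: labor shares in the traded and non-traded sectors
    a_*: productivity growth rates (proportions) by sector and country
    A positive result means real appreciation of the home currency.
    """
    home = (beta / alpha) * a_t_home - a_n_home
    foreign = (beta / alpha) * a_t_foreign - a_n_foreign
    return delta * (home - foreign)


if __name__ == "__main__":
    # Hypothetical case: home traded-sector productivity grows 4% a year,
    # foreign 2%, with non-traded productivity growing 1% in both countries.
    q_hat = real_exchange_rate_growth(
        delta=0.5, alpha=0.6, beta=0.6,
        a_t_home=0.04, a_n_home=0.01,
        a_t_foreign=0.02, a_n_foreign=0.01,
    )
    print(f"implied real appreciation: {q_hat:.2%} per year")
```

Swapping the home and foreign growth rates flips the sign, which is the depreciation case described in the text.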
In order to avoid the assumption of neutral technical progress, we introduced an
intercept in the econometric model (Kohler 1998). We also introduced demand-side
factors as the Balassa model is not complete by itself (De Gregorio and Wolf 1994).
Therefore, the econometric model used in our study is derived from Eq. (12.20) (see
Annexure 1 for derivation). It includes two more factors (demand and supply sides):
$$(\ln Q)_{it} = \alpha_i + \beta_{1i}\ln(Y/Y^*)_{it} + \beta_{2i}\ln(G/G^*)_{it} + \beta_{3i}\,\mathrm{vol}(E)_{it} + \varepsilon_{it} \tag{12.21}$$
$(\ln Q)_{it}$ is the log of the real exchange rate of each country measured against the US
dollar. An increase implies appreciation. The subscript $it$ refers to the $i$th country in period $t$. $\ln(Y/Y^*)_{it}$
is the log of real GDP per capita relative to the US economy. It is a proxy for the
productivity growth differential in each country. The Balassa hypothesis declares
that productivity growth has a positive impact on the real exchange rate.
$\ln(G/G^*)_{it}$ is the log of relative real government expenditure. It is a proxy for fiscal
policy. Kohler (1998) argues that government expenditure accounts for demand
shifts toward non-tradables which results in appreciation of the real exchange rate in
the short run. In the long run, it does not have an impact unless financed by
distortionary taxes. Distortionary taxes reduce real wages and relative prices of
non-tradables, and this leads to the depreciation of the real exchange rate in the long
run.
$\mathrm{vol}(E)_{it}$ is exchange rate volatility measured as the absolute value of the percentage
change in the nominal exchange rate. It is a supply-side factor. The impact of
volatility on the real exchange rate may be positive or negative; it depends on the
time horizon and type of regime. Kohler (1998) shows that the impact of volatility
is smaller in the short run and in poor countries due to greater nominal rigidities. In
relatively fixed exchange rate regimes (mainly poor economies), movements in
nominal exchange rate are restricted. In this case, growing economies experience
inflation in both sectors with relative prices of non-tradables falling. This leads to a
depreciation of the real exchange rate. In contrast, there is smaller rigidity in freely
floating exchange rate regimes (mainly rich economies). With productivity growth,
inflation in the non-tradable sector is balanced by deflation in the tradable sector
(as a result of a nominal appreciation). Relative prices of non-tradables increase,
and this leads to the real exchange rate appreciation.
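The volatility measure described above (absolute percentage change in the nominal exchange rate) can be sketched in a few lines. The function name and the sample series are hypothetical; how the study itself handled missing observations is not stated, so marking them as `None` is an assumption made here.

```python
def exchange_rate_volatility(nominal_rates):
    """Absolute percentage change of a nominal exchange rate series.

    Returns one value per period after the first. None marks periods where
    the change cannot be computed (a missing value or a zero previous rate).
    """
    volatility = []
    for previous, current in zip(nominal_rates, nominal_rates[1:]):
        if previous is None or current is None or previous == 0:
            volatility.append(None)
        else:
            volatility.append(abs((current - previous) / previous) * 100.0)
    return volatility


if __name__ == "__main__":
    # Hypothetical annual rates (local currency units per US dollar).
    rates = [10.0, 11.0, 9.9, None, 12.0]
    print(exchange_rate_volatility(rates))
```

Taking the absolute value means a 10% depreciation and a 10% appreciation count as the same amount of volatility, which matches the definition given in the text.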
Six types of panel unit root tests are available: Levin-Lin-Chu (LLC), Harris-Tzavalis (HT),
Breitung, Im-Pesaran-Shin (IPS), Fisher type, and Hadri LM. The
panel data for our study are unbalanced, and N is fixed and small relative to T. We
also assume that the auto-regressive parameter, $\rho$, is panel specific. Hence, the
candidate panel unit root tests that fit these criteria are the IPS and Fisher-type tests.
Another advantage of these tests is that they can be used to test a series which is not
serially independent across cross sections.
(a) The Im–Pesaran–Shin test
The following is a panel unit root test as proposed by Pesaran (2007) which
accounts for cross-sectional dependence. The standard Augmented Dickey–Fuller
(ADF) regressions are further augmented with cross-sectional averages of lagged
levels and first differences of individual series. Let $y_{i,t}$ be the observation on the $i$th
cross-sectional unit at time $t$, and suppose that it is generated according to the
simple dynamic linear heterogeneous panel data model:
The initial value, $y_{i,0}$, has a given density function with a finite mean and
variance, and the error term, $\varepsilon_{it}$, has a single-factor structure. $f_t$ is the unobserved
common effect, and $\varepsilon_{it}$ is an individual-specific (idiosyncratic) error. The unit root
hypothesis of interest is expressed as:
$N_1/N$, the fraction of the individual processes that are stationary, is nonzero and
tends to the fixed value $\delta$ such that $0 < \delta < 1$ as $N \to \infty$. This condition is
necessary for the consistency of unit root tests.
$$P = -2\sum_{i=1}^{N} \ln p_i \tag{12.24}$$
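The statistic in Eq. (12.24) combines the p-values of the individual unit root tests. A minimal sketch follows, using only the standard library. It relies on the standard result that, when the N individual tests are independent, P follows a chi-square distribution with 2N degrees of freedom under the null; the function names and the sample p-values are hypothetical.

```python
import math


def fisher_statistic(p_values):
    """P = -2 * sum(ln p_i), as in Eq. (12.24)."""
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_upper_tail_even_df(x, n_units):
    """Upper-tail probability of a chi-square with 2 * n_units degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{k=0}^{n_units-1} (x/2)^k / k!.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n_units):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_test(p_values):
    """Return the combined statistic and its p-value (independence assumed)."""
    statistic = fisher_statistic(p_values)
    return statistic, chi2_upper_tail_even_df(statistic, len(p_values))


if __name__ == "__main__":
    # Hypothetical p-values from unit root tests on three countries.
    statistic, p_value = fisher_test([0.40, 0.65, 0.20])
    print(f"P = {statistic:.3f}, combined p-value = {p_value:.3f}")
```

A large combined p-value means the null of a unit root in all panels is not rejected. With cross-sectionally dependent series the independence assumption fails, which is why the study first removes common factors.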
There are two possibilities to deal with nonstationary variables in a given model
after the stationarity test. First, to test whether the linear combination of nonsta-
tionary variables is stationary by using the co-integration test. If they are
co-integrated, then we proceed to a long-run analysis with the nonstationary variables.
Otherwise, we difference the nonstationary variables for a short-run analysis.
Engle and Granger (1987) noted that ‘a test for co-integration can be thought as a
pretest to avoid “spurious regression” situations.’ If regression of one nonstationary
variable over another nonstationary variable yields a stationary series, it is known as
a co-integrating regression and the slope parameter in such a regression is known as
a co-integrating parameter.
We employ a residual-based Pedroni co-integration test which is simply a unit
root test applied to the residuals obtained from a co-integrating regression. If
variables are co-integrated, then the residuals should be I(0). If the variables are not
co-integrated, then the residuals are not I(0) (Pedroni 2004). The test allows for
heterogeneous intercepts and trend coefficients across cross sections. It is based on a
residual obtained from a regression:
The choice of estimation method mainly depends on the results of preliminary tests
of data. In our case, we looked for a method that helped an analysis of nonstationary
variables which were co-integrated. We considered a method that provides estimated
coefficients for individual countries. Therefore, we do not
consider traditional estimators such as pooled OLS, fixed effects, and first-difference
OLS models which assume homogeneous technology parameters and factor load-
ings (common slope). Eberhardt et al. (2011) and others have suggested using the
pooled mean group estimation method for analyzing nonstationary variables which
are co-integrated in a long panel setting. This method is helpful for heterogeneous
technology parameters and factor loadings in particular.
The pooled mean group (PMG) estimator involves averaging and pooling. It
restricts long-run coefficients to be homogenous over cross sections, but allows for
heterogeneity in intercepts, short-run coefficients (including the speed of adjust-
ment), and error variances. It is argued that country heterogeneity is particularly
relevant in short-run relationships given that countries may be affected by
over-lending, borrowing constraints and financial crises in short-time horizons.
Homogenous long-run relationships may be assumed for reasons such as budget or
solvency constraints, arbitrage conditions, or common technologies (Cavalcanti
et al. 2011).
The relationship in pooled mean group estimation may be defined by an ARDL
model as:
$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i\left(q_{i,t-1} - \theta x_{i,t-1}\right) + \varepsilon_{it} \tag{12.26}$$
where $q = \ln Q$ and $x = \ln X$. $\beta_i$ are short-run parameters, which, like $\sigma_i^2$, differ across
countries. The error correction term, $\lambda_i$, also differs across $i$; the long-run parameter, $\theta$,
however, is constant across the groups. This estimator is quite appealing when
studying small sets of arguably ‘similar’ countries. In I(1) panels, this estimator
allows for a mix of co-integration ($\lambda_i \neq 0$) and non-co-integration ($\lambda_i = 0$). $x_{i,t}$
represents the set of explanatory variables defined in Eq. (12.21).
To account for cross-sectional dependence which may result from any common
unobserved factor incorporated in the error term, we follow Pesaran’s (2013)
common correlated effect approach. Unlike de-meaning, the approach handles
multiple factors which can be correlated with regressors and serial correlation in
errors and lagged dependent variables (Shin 2014). It does not require prior
knowledge of the number of unobserved common factors and can be applied to
dynamic panels with heterogeneous coefficients and weakly exogenous regressors
(Pesaran 2013). The procedure consists of approximating the linear combinations of
unobserved common factors by cross-sectional averages of the dependent and
explanatory variables and then running standard panel regressions augmented with
these cross-sectional averages.
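The augmentation step described above can be sketched as follows: compute the cross-sectional average of a variable in each period and attach it to every observation as an extra regressor. The data layout (a dictionary of country series) and all names are assumptions made for illustration; the averages are taken over whichever countries are observed in a period, which is one simple way to handle an unbalanced panel.

```python
def cross_sectional_averages(panel):
    """Average a variable across countries, period by period.

    panel maps country -> {period: value}. Missing periods or None values
    are skipped, so each average uses the countries observed in that period.
    """
    sums, counts = {}, {}
    for series in panel.values():
        for period, value in series.items():
            if value is None:
                continue
            sums[period] = sums.get(period, 0.0) + value
            counts[period] = counts.get(period, 0) + 1
    return {period: sums[period] / counts[period] for period in sorted(sums)}


def augment_with_averages(panel, name="x"):
    """Return one row per observation with the period average attached."""
    averages = cross_sectional_averages(panel)
    rows = []
    for country, series in sorted(panel.items()):
        for period, value in sorted(series.items()):
            if value is None:
                continue
            rows.append({
                "country": country,
                "period": period,
                name: value,
                name + "_bar": averages[period],
            })
    return rows


if __name__ == "__main__":
    # Hypothetical log relative income for two countries, unbalanced.
    panel = {"A": {2000: 1.0, 2001: 3.0}, "B": {2000: 3.0}}
    for row in augment_with_averages(panel, name="ln_y"):
        print(row)
```

In a full application the same averages would be computed for the dependent variable and each regressor (and their lags) before running the panel regression.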
The PMG estimator for a cross-sectionally dependent series may be explicitly
defined as:
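$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i\left(q_{i,t-1} - \theta x_{i,t-1}\right) + \gamma_i' f_t + \varepsilon_{it} \tag{12.27}$$
$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i\left(q_{i,t-1} - \theta x_{i,t-1}\right) + \frac{1}{N}\sum_{l=1}^{P_T} \delta z_{w,t-l} + \varepsilon_{it} \tag{12.28}$$
where $z_{w,t}$ represents a set of cross-sectional averages of the dependent and independent
variables and their lagged values which approximate/proxy the unobserved
common factors ($f_t$). The focus of this estimator is on obtaining consistent estimates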
of parameters related to observable variables, while the estimated coefficients on
cross-sectionally averaged variables are not interpretable in a meaningful way:
They are merely present to alter the biasing impact of unobservable common factors
(Eberhardt 2012).
D denotes
the first-difference
operator, eit is a random error term, and
ui;t1 ¼ qi;t1 hxi;t1 is one-period lagged value of error term from a
co-integrating regression.
This ECM equation states that $\Delta q_{it}$ depends on $\Delta x_{it}$ and also on the equilibrium error term. If the error term is nonzero, the model is out of equilibrium. Suppose $\Delta x_{it}$ is zero and $u_{i,t-1}$ is positive (Bhattarai 2011); this means $q_{i,t-1}$ is too high (above its equilibrium value). Since $\lambda_i$ is expected to be negative, the term $\lambda_i u_{i,t-1}$ is negative, and therefore, $\Delta q_{it}$ will be negative to restore equilibrium. That is, if $q_{it}$ is above its equilibrium value, it will start falling in the next period to correct the equilibrium error. Similarly, if $u_{i,t-1}$ is negative (i.e., $q_{it}$ is below its equilibrium value), $\lambda_i u_{i,t-1}$ will be positive, which causes $\Delta q_{it}$ to be positive, leading $q_{it}$ to rise in the next period. The absolute value of $\lambda_i$ determines how quickly the equilibrium is restored
(Engle and Granger 1987).
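The adjustment mechanism just described can be traced numerically. The sketch below is a simplification, not the estimated model: it assumes the regressor is held fixed, the intercept and shocks are zero, and the starting error of 1.0 and the adjustment coefficient of -0.12 are illustrative choices.

```python
def simulate_adjustment(u0, lam, periods):
    """Trace the equilibrium error when x is held fixed and there are no shocks.

    Under these assumptions the change in q equals lam * u_{t-1}; because x
    does not move, the error itself evolves as u_t = (1 + lam) * u_{t-1}.
    """
    path = [u0]
    for _ in range(periods):
        path.append(path[-1] + lam * path[-1])
    return path


# Start above equilibrium (positive error) with a negative adjustment coefficient.
above = simulate_adjustment(u0=1.0, lam=-0.12, periods=10)
# Start below equilibrium (negative error); the error should rise toward zero.
below = simulate_adjustment(u0=-1.0, lam=-0.12, periods=10)
```

With a negative coefficient the positive error shrinks every period and the negative error rises toward zero, which is the self-correcting behavior the paragraph describes; a larger absolute coefficient shortens the path.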
This analysis begins with a series of econometric tests. Since not every panel unit root test is appropriate for every data set, a cross-sectional independence test was performed first to decide which type of panel unit root test to use. Using the Pesaran CD test, and possibly all other tests, the null hypothesis of cross-sectional independence was rejected for the original data. Hence, the series in our data were initially cross-sectionally dependent. However, after the data were augmented with cross-sectional averages to eliminate unobserved common factors, the Pesaran CD test, and possibly two other tests, failed to reject the null hypothesis of cross-sectional independence. The test results are given in Annexure 2.
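For readers unfamiliar with the CD test, a minimal sketch of the statistic for a balanced panel follows. It assumes the textbook form, CD = sqrt(2T / (N(N-1))) times the sum of pairwise correlations, compared against a standard normal; the function names and the toy series are hypothetical, and this is not the routine used to produce the results in Annexure 2.

```python
import math
from itertools import combinations


def correlation(a, b):
    """Pearson correlation between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pesaran_cd(series):
    """CD statistic for N series of common length T (balanced panel)."""
    n = len(series)
    t = len(series[0])
    total = sum(correlation(a, b) for a, b in combinations(series, 2))
    return math.sqrt(2 * t / (n * (n - 1))) * total


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy residual series that all load on the same made-up common movement.
toy = [[1.0, 2.0, 3.0, 4.0, 5.0],
       [1.1, 1.9, 3.2, 3.9, 5.1],
       [0.8, 2.1, 2.9, 4.2, 4.8]]
cd = pesaran_cd(toy)
p = two_sided_p_value(cd)
```

A large absolute CD value (small p-value) points to cross-sectional dependence, which is the outcome reported for the original data; after augmentation the statistic should fall back toward zero.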
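The IPS and Fisher-type panel unit root tests were applied to account for cross-sectional dependence. The results of the tests under different specifications are given in Annexure 2. The real exchange rate, relative productivity, and relative government expenditure are nonstationary in levels and stationary in first differences at the 1% level of significance, whereas exchange rate volatility is reported as stationary in levels (Table 12.8).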
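The idea behind a Fisher-type panel test can be illustrated as follows. The sketch assumes the combination rule of Maddala and Wu (1999), in which the statistic is minus twice the sum of the logged p-values from the individual unit root tests and is compared with a chi-square distribution with 2N degrees of freedom. The p-values below are invented for illustration and the function names are hypothetical.

```python
import math


def fisher_statistic(p_values):
    """Combine N independent p-values: P = -2 * sum(ln p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m the tail has the closed form exp(-x/2) * sum_{k<m} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Invented unit-level p-values from individual unit root tests.
unit_p_values = [0.40, 0.75, 0.22, 0.61]
stat = fisher_statistic(unit_p_values)
panel_p = chi2_sf_even_df(stat, df=2 * len(unit_p_values))
```

With large individual p-values, as in this made-up case, the combined statistic stays small and the unit root null is not rejected, which corresponds to the pattern reported for the variables in levels.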
The next step is to test for co-integration—whether there is a long-run relation
between our nonstationary variables. The test for co-integration is residual based.
We used two Pedroni type tests (ADF and PP tests) and the IPS test. In all the cases,
we strongly reject the null hypothesis of no co-integration for both types of models
(augmented and non-augmented) (see Annexure 2). Augmented models include
cross-sectional averages of dependent and independent variables to account for
cross-sectional dependence.
We compare three augmented models using model selection criteria: models I, II, and III with one, two, and three explanatory variables, respectively. Even though the selection criteria suggest that the model with three variables is our 'best model' in terms of the log-likelihood and the Akaike information criterion (see Annexure 2), we present the results of the other models as well for comparison.
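As a reminder of how such a comparison works, the Akaike information criterion is AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L the maximized log-likelihood; the model with the lowest value is preferred. The log-likelihoods and parameter counts below are invented placeholders, not the values reported in Annexure 2.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs for three models.
candidates = {
    "Model I": (-120.0, 3),
    "Model II": (-112.5, 4),
    "Model III": (-105.0, 5),
}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

In this made-up case the gain in fit outweighs the penalty for the extra parameters, so the largest model is selected, mirroring the ranking described in the text.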
Unlike most previous studies, the results of our study were not uniform across all
developing countries. The impact of productivity growth on the real exchange rate
differed by income group or per capita income. Productivity growth led to an
appreciation in middle-income countries and depreciation in low-income countries
in the long run. Our results substantiate the findings of Drine and Rault (2002,
2004) and Chuah (2012). Drine and Rault (2002, 2004) found evidence for the
hypothesis in a study for middle-income countries (MICs) in 2002 and failed to
arrive at the same conclusion for low-income countries (LICs) in another study in
2004. Our findings also seem to be in implicit confirmation of Chuah’s (2012)
results. He calculated a turning point ($2200) below which change in income
resulted in depreciation of the real exchange rate. Almost all LICs in our study had
a per capita income less than $2200. The conclusions of Chuah’s (2012) study
coincide with our conclusions for LICs such as Indonesia, Kenya, Nigeria,
Tanzania, and Uganda.
Table 12.1 shows the long-run results of the panel co-integration estimation
using the augmented PMG estimator for different groups of countries and models in
the sample. We follow the tradition of presenting estimated coefficients of only
observable variables as cross-sectionally averaged variables are not directly inter-
pretable in a meaningful way. Estimated coefficients of full models (with observable
and unobservable variables) are reported in Annexure 3.
We consider three models in comparing three types of groups: the all countries group (15 countries), country groups by income (middle-income countries (MICs), 5 countries; low-income countries (LICs), 10 countries), and country groups by region (Africa, 9 countries; Asia, 4 countries). For each group in Table 12.1, the first row shows a model with one explanatory variable (productivity); the second row shows a model with two explanatory variables (productivity and government expenditure); and the third row shows a model with three explanatory variables (productivity, $\ln(Y/Y^{*})$; government expenditure, $\ln(G/G^{*})$; and exchange rate volatility, $\mathrm{vol}(E)$). The center of our discussion is Model III (shaded rows) for each group below.
In general, the results in Table 12.1 show that the Balassa hypothesis holds in the long run for all countries in the sample taken as a group; that is, a 1% improvement in productivity leads to an appreciation of domestic currencies in the developing countries in the group by 0.388% on average. We find a different result, however, when the sample is split into groups. When categorized by level of per capita income, the results show that the Balassa hypothesis holds only for middle-income countries (MICs). The same holds when countries are categorized by region; that is, the Balassa hypothesis holds only for Asian countries. This may be related to the fact that in our sample, most middle-income countries are from Asia and poor countries are from Africa. A 1% increase in productivity appreciates the domestic currencies of countries in the MICs and Asia groups by nearly 0.34% and 0.77%, respectively (though only at the 10% level of significance for the latter). For the LICs and Africa groups, a 1% increase in productivity depreciates domestic currencies by nearly 0.287% and 0.247%, respectively.
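Because the variables enter in logs, the long-run coefficients can be read as elasticities. The short sketch below, with a hypothetical helper name, shows that the "1% leads to b%" reading is a first-order approximation that stays close to the exact implied change for small movements and drifts for larger ones. The coefficients are the long-run estimates quoted in the text.

```python
def implied_change(elasticity, pct_change_x):
    """Exact percentage change in Q implied by a log-log coefficient."""
    return ((1 + pct_change_x / 100.0) ** elasticity - 1) * 100.0


all_countries_1pct = implied_change(0.388, 1.0)    # close to 0.388
all_countries_10pct = implied_change(0.388, 10.0)  # below 3.88
lics_1pct = implied_change(-0.287, 1.0)            # depreciation, close to -0.287
```

For a 1% productivity improvement the exact and approximate readings nearly coincide, so the interpretation used in the discussion is harmless at that scale.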
The long-run relationship between government expenditure and the real exchange rate shows that expansionary fiscal policies result in appreciation of domestic currencies in all cases except for the LICs and Asia groups. This may not be surprising, as the major countries with 'big economies' in the two groups largely overlap (Indonesia and the Philippines are members of both groups). The results for these groups are in line with the argument of Kohler (1998), who states that government expenditure does not have an impact in the long run unless financed by distortionary taxes.
Exchange rate volatility has the impact of depreciating the real exchange rate for all countries in all groups except the middle-income group in the long run. This confirms theoretical arguments which associate relatively fixed or highly managed exchange rate systems (mainly in poor countries) with depreciation and flexible regimes with appreciation in the real exchange rate.
Table 12.2 presents the results of short-run dynamics of the same groups of
countries and models as given in Table 12.1. The discussion that follows focuses on
Model III (the shaded rows). Short-run dynamics show that the impact of change in
productivity on change in the real exchange rate is significant but negative; that is, it
has the impact of depreciating the real exchange rate for all countries in all groups
in the short run.
Fiscal policy does not significantly explain the variations in the real exchange
rate. Exchange rate volatility has an impact only in MICs and all countries groups.
It negatively impacts the real exchange rate in the short run. This may be due to
greater rigidity in the short run.
Table 12.2 Short-run dynamics of panel co-integration estimation: the augmented PMG estimator
Dependent variable: Δln Q. Columns 4 to 6 report the short-run coefficients.

Sample              Model      Adjustment     Δln(Y/Y*)      Δln(G/G*)      Δvol(E)
[# of countries]               coefficient
All countries [15]  Model I    −0.109894***   −0.327721***
                               (0.022462)     (0.081918)
                    Model II   −0.142981***   −0.359401***   −0.046910*
                               (0.032498)     (0.091545)     (0.027791)
                    Model III  −0.122432***   −0.397465***   −0.057736*     −0.161279***
                               (0.037203)     (0.082479)     (0.030909)     (0.038693)
MICs (BRICS) [5]    Model I    −0.167326***   −0.300195***
                               (0.068251)     (0.082509)
                    Model II   −0.193645***   −0.408364***   −0.082710
                               (0.102726)     (0.133144)     (0.073367)
                    Model III  −0.12243***    −0.3974***     −0.057736*     −0.161279***
                               (0.037203)     (0.082479)     (0.030909)     (0.038693)
LICs [10]           Model I    −0.111716***   −0.352239***
                               (0.027206)     (0.118025)
                    Model II   −0.141304***   −0.364848***   −0.042973
                               (0.036735)     (0.136587)     (0.032613)
                    Model III  −0.086261***   −0.423972***   −0.016775      −0.020308
                               (0.024822)     (0.099213)     (0.028716)     (0.031332)
Africa [9]          Model I    −0.098028***   −0.427619***
                               (0.031963)     (0.104199)
                    Model II   −0.131601***   −0.427570***   −0.058400**
                               (0.041812)     (0.122564)     (0.031572)
                    Model III  −0.107015***   −0.408247***   −0.032190      −0.034526
                               (0.032358)     (0.100776)     (0.029437)     (0.046803)
Asia [4]            Model I    −0.127473***   −0.213683
                               (0.041693)     (0.193959)
                    Model II   −0.165803*     −0.229174      0.035224
                               (0.092356)     (0.192154)     (0.067468)
                    Model III  −0.052518***   −0.458148***   −0.013347      −0.044258*
                               (0.019297)     (0.162181)     (0.067688)     (0.023795)

Note Δln Q = log of real exchange rate differenced, Δln(Y/Y*) = log of real GDP relative to foreign (US) differenced, Δln(G/G*) = log of real government expenditure relative to foreign (US) differenced, and Δvol(E) = exchange rate volatility differenced
***, **, and * refer to the significance level at 1, 5, and 10%. Standard errors in parentheses
MICs refers to middle-income countries of the BRICS group (Brazil, Russia, India, China, and South Africa)
LICs refers to low-income countries (Angola, Ethiopia, Ghana, Indonesia, Kenya, Nigeria, the Philippines, Rwanda, Tanzania, and Uganda)
Africa refers to African countries (Angola, Ethiopia, Ghana, Kenya, Nigeria, Rwanda, Tanzania, Uganda, and South Africa)
Asia refers to Asian countries (China, India, Indonesia, and the Philippines)
The (negative) signs and statistical significance of the error correcting terms
show that the system is stable. A stable co-integrating relationship adjusts short-run
deviations by the extent of the error correcting term. The rate of adjustment is,
however, higher (12%) in MICs than LICs (8%). This means MICs have a faster rate
of adjustment and achieve equilibrium earlier than LICs. This may be associated
with better conditions to fulfill assumptions of the model in the former group.
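One way to make the difference in adjustment speeds concrete is to convert each coefficient into a half-life, the number of periods needed to close half of a given deviation. The sketch below assumes the simplest case, in which the deviation shrinks by the factor (1 + λ) each period and short-run terms are ignored; the helper name is hypothetical, and the coefficients are the Model III estimates from Tables 12.2 and 12.3.

```python
import math


def half_life(adjustment_coefficient):
    """Periods needed to halve a deviation when u_t = (1 + lam) * u_{t-1}."""
    if not -1.0 < adjustment_coefficient < 0.0:
        raise ValueError("expects a coefficient strictly between -1 and 0")
    return math.log(0.5) / math.log(1.0 + adjustment_coefficient)


mics = half_life(-0.12243)     # middle-income group, Model III
lics = half_life(-0.086261)    # low-income group, Model III
russia = half_life(-0.562507)  # fastest individual adjustment in Table 12.3
```

Under this simplification the MICs close half of a deviation in a little over five periods and the LICs in close to eight, which restates the comparison in the text in units of time.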
Tables 12.3 and 12.4 present the short-run dynamics for individual countries in
two income groups (MICs and LICs), respectively. The results are for Model III.
The short-run dynamics show that the impact of productivity on the real
exchange rate was significant and negative for all countries except Brazil and South
Africa. Productivity did not have an impact on the real exchange rate in these
countries in the short run. Expansionary fiscal policies resulted in depreciation of
the real exchange rate in Brazil, Russia, and China. The role of exchange rate
volatility was significant in all countries. However, the effect was exceptionally
positive in Russia.
The rate of adjustment was the highest in Russia (56.25%) followed by Brazil
(22.76%). This may be associated with the size and features of these economies.
These are the two biggest economies in the group which account for 40 and 20% of
the US economy, respectively. A faster rate of adjustment means that they can
achieve equilibrium earlier than others.
Table 12.3 Short-run dynamics by country: middle-income group (BRICS): Model III
Dependent variable: Δln Q. Columns 3 to 5 report the short-run coefficients.

Cases          Adjustment     Δln(Y/Y*)      Δln(G/G*)      Δvol(E)
               coefficient
All countries  −0.12243***    −0.3974***     −0.057736*     −0.161279***
               (0.037203)     (0.082479)     (0.030909)     (0.038693)
Brazil         −0.227627***   0.228636*      −0.088359***   −0.002061***
               (0.004377)     (0.096920)     (0.006094)     (1.62E−05)
China          −0.064666***   −0.5913***     −0.101536***   −0.354891***
               (0.000587)     (0.012107)     (0.006333)     (0.007472)
India          −0.046854***   −0.4483***     0.015571       −0.301000***
               (0.000797)     (0.026819)     (0.008586)     (0.006721)
Russia         −0.562507***   −0.4439***     −0.451463***   0.006537***
               (0.015174)     (0.045163)     (0.010053)     (3.27E−05)
South Africa   −0.078684***   −0.194967      0.039537       −0.373313***
               (0.001472)     (0.151551)     (0.058783)     (0.009758)

Note Δln Q = log of real exchange rate differenced, Δln(Y/Y*) = log of real GDP relative to foreign (US) differenced, Δln(G/G*) = log of real government expenditure relative to foreign (US) differenced, and Δvol(E) = exchange rate volatility differenced
MICs refers to middle-income countries of the BRICS group (Brazil, Russia, India, China, and South Africa)

Table 12.4 presents the short-run dynamics for LICs. The short-run dynamics show that the impact of productivity on the real exchange rate was significant and negative for all countries except Indonesia and Uganda. Productivity did not impact the real exchange rate in the short run in these countries. The role of fiscal policy
was significant in all countries even though the effect was different. The increase in
government expenditure resulted in a depreciation of the real exchange rate in all
countries except in Ghana, Kenya, Indonesia, and the Philippines. The strongest
impact of the fiscal policy was shown by Uganda (0.15%). The impact of exchange
rate volatility was significant in all countries except Indonesia.
The rate of adjustment was the highest in Kenya (24.89%) followed by the
Philippines (14.58%) and Ethiopia (13.63%). These three countries may achieve
equilibrium earlier than others in the group.
12.5.1 Conclusions
Unlike most previous studies, the results of our study are not uniform across all the
developing countries in our sample. The impact of productivity growth on the real
exchange rate varied by income group or per capita income. Productivity growth
led to an appreciation of the real exchange rate in middle-income countries and
depreciation of the real exchange rate in low-income countries in the long run. In
general, the results of our study confirm that the relationship between the real
exchange rate and productivity does exist and is stronger for higher income
countries in the long run. Real per capita income matters more than the rate of
economic growth in explaining the effects of the Balassa term in our study.
In the short run, however, we find almost uniform results across income groups. Productivity growth (possibly of non-tradables), expansionary fiscal policies, and high exchange rate volatility result in real exchange rate depreciation. More specifically:
• Improvements in productivity and expansionary fiscal policies both have the impact of depreciating the real exchange rate in almost all the countries, in both the middle- and low-income groups.
• The impact of exchange rate volatility is significant only in middle-income
countries. This may be associated with the type of exchange rate policy/regime
adopted. It is mainly fixed (unchanged) in low-income countries in which case it
may not be useful to explain variations in the real exchange rate in the short run.
The anti-Balassa results for low-income countries in our study may reflect a failure to satisfy the basic assumptions of the model. The external version of the hypothesis, which relates the real exchange rate to productivity, presupposes the internal version: a positive relationship between productivity and relative prices, and between relative prices and the real exchange rate. In addition, the law of one price must hold in the tradable sector.
On the basis of our findings, we recommend the following policy options for MICs
and LICs:
• The Balassa hypothesis holds for middle-income countries in our sample.
Economic growth leads to an appreciation in the real exchange rate in these
countries. Hence, countries in this group may promote growth by increasing
productivity in the tradable sector.
• Since the Balassa hypothesis does not hold for low-income countries in our sample, economic growth does not lead to real exchange rate appreciation in these countries. Hence, countries in this group may continue to grow by promoting productivity growth in the non-tradable sector.
• Depreciation of the real exchange rate can be associated with improvements in
the productivity of the non-tradable sector for low-income countries and should
be used accordingly.
• The role of fiscal policy may not last long in low-income countries and so should
be used accordingly.
Acknowledgements I am grateful for all comments and contributions of Professor Scott Hacker,
Professor Par Sjolander, and Dr. Girma Estiphanos for this work. It was a great pleasure to have
their say in my paper.
Annexure 1
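Suppose that the growth rate of the real exchange rate is defined as a function of the productivity differential between the non-tradable and tradable sectors as in:

$$\hat{Q} = \delta\left[\left(\frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N\right) - \left(\frac{\beta}{\alpha}\hat{A}_T^{*} - \hat{A}_N^{*}\right)\right] \tag{1}$$

with $\hat{Q} \equiv \hat{p} - \hat{p}^{*}$.
This is the same as:

$$\hat{Q} = \delta\left[\frac{\beta}{\alpha}\left(\hat{A}_T - \hat{A}_T^{*}\right) - \left(\hat{A}_N - \hat{A}_N^{*}\right)\right] \tag{1.1}$$

If we let $\hat{M}_0 = -\delta\left(\hat{A}_N - \hat{A}_N^{*}\right)$ and $m_1 = \delta\frac{\beta}{\alpha}$, then:

$$\hat{Q} = \hat{M}_0 + m_1\left(\hat{A}_T - \hat{A}_T^{*}\right) \tag{1.2}$$

and in log-levels, it is

$$q = m_0 + m_1 \ln\left(A_T/A_T^{*}\right) \tag{1.4}$$

where $q \equiv \ln Q$ and $m_0 \equiv \ln M_0$.
We proxy $A_T/A_T^{*}$ with $Y/Y^{*}$, where $Y$ is the home real GDP per capita and $Y^{*}$ is the foreign (US) real GDP per capita, so we get Eq. (2.23).

$$q = m_0 + m_1 \ln\left(Y/Y^{*}\right) \tag{1.5}$$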
Annexure 2
Annexure 3
Table 12.8 Panel unit root tests (IPS and Fisher-type tests)

Variables    Specifications      Pesaran statistics  Fisher statistics  Order of integration
ln(Q)        Constant            −1.0567             32.73              I(1)
             Constant and trend  1.6161              21.13
Δln(Q)       Constant            −21.07***           401.23***          I(0)
             Constant and trend  −20.07***           344.39***
ln(Y/Y*)     Constant            −0.3133             34.84              I(1)
             Constant and trend  3.3929              21.69
Δln(Y/Y*)    Constant            −16.22***           296.62***          I(0)
             Constant and trend  −19.43***           327.62***
ln(G/G*)     Constant            −0.9245             32.50              I(1)
             Constant and trend  0.6120              26.57
Δln(G/G*)    Constant            −19.99***           377.62***          I(0)
             Constant and trend  −19.23***           350.83***
vol(E)       Constant            −13.74***           246.72***          I(0)
             Constant and trend  −16.07              230.49

Note *** indicates the rejection of the null hypothesis (unit root) at 1%
References
Ahn M (2009) Looking for the Balassa-Samuelson effect in real exchange rate changes: Andong
National University. J Econ Res 14:219–237
Asea K, Corden M (1994a) The Balassa-Samuelson model: an overview. USA
Asea K, Mendoza G (1994b) The Balassa-Samuelson model: a general equilibrium appraisal.
Review of International Economics, Working Paper #709
Balassa B (1964) The purchasing power parity doctrine: a reappraisal. J Polit Econ 72(6):584–596
Baylie F (2008) The impact of real effective exchange rate on the economic growth of Ethiopia.
Master thesis, Addis Ababa University, Ethiopia
Bhagwati N (1984) Why are services cheaper in the poor countries? The Econ J 94(374)
Bhattarai K (2011) Co-integration and error correction models: econometric analysis. Hull
University, Business School, England
Cavalcanti V, Mohaddes K, Raissi M (2011) Commodity price volatility and the sources of
growth. IMF Working Paper, Middle East and Central Asia Department
Chen M (2013) Panel unit root and co-integration tests. National Chung Hsing University, USA
Chuah P (2012) How real exchange rate move in growing economies: Anti-Balassa evidence in
developing countries. Malaysia
Chuoudhri E, Kahn S (2004) Real exchange rate in developing countries: are Balassa-Samuelson
effect present? IMF Working Papers WP/04/188
De Gregorio J, Wolf H (1994) Terms of trade, productivity and the real exchange rate. NBER
Working Paper No. 4407
Drine I, Rault C (2002) Do panel data permit to rescue the Balassa-Samuelson hypothesis for latin
American countries? An empirical analysis using panel data co-integration tests. William
Davidson Working Paper Number 504
Drine I, Rault C (2004) Does the Balassa-Samuelson hold for Asian countries? An empirical
analysis using panel data co-integration tests. Appl Econ Int Dev 4(4):000
Eberhardt M (2011) Panel time-series modelling: new tools for analyzing xt-data. University of
Nottingham, Case Business School, England
Eberhardt M (2012) Estimating Panel Time Series Models with Heterogeneous Slopes. Stata
Journal 12 (1):61–71
Engle R, Granger C (1987) Co-integration and error correction: representation, estimation, and
testing. Econometrica 55(2):251–276
Feenstra R, Inklaar R, Timmer M (2013) The next generation of the penn world table. Am Econ
Rev 105(10):3150–3182
Guo Q, Hall G (2008) A test of the Balassa-Samuelson effect applied to Chinese regional data.
Rom J Econ Forecast 2:57–78
Hassan F (2011) The Penn-Balassa-Samuelson effect in developing countries: price and income
revisited. London School of Economics and Political Science, London
Herberger C (2003) Economic growth and the real exchange rate: revising the Balassa-Samuelson
effect. University of California, Los Angeles
IMF (2015) World Economic Outlook Report. IMF
Isard P, Symansky S (1996) Long-run movements in real exchange rates. IMF Occasional Paper
No. 145
Jabeen S, Malik S, Haider A (2011) Testing the Harrod-Balassa-Samuelson hypothesis: the case of
Pakistan. Quaid-i-Azam University, Islamabad
Kohler M (1998) The Balassa-Samuelson effect and monetary targets. Centre for Central Banking
Studies, Bank of England
Kravis B, Lipsey E (1983) Towards an explanation of national price levels. Princeton Studies in
International Finance, No. 52, Princeton University, USA
Kumo W (2011) Growth and macroeconomic convergence in Southern Africa. African
Development Bank Group, Working Paper No. 130
Maddala S, Wu S (1999) A comparative study of unit root tests with panel data and a new simple
test. Oxford Bull Econ Stat 61:631–652
Miyajima K (2005) Real exchange rates in growing economies: how strong is the role of the
nontradables sector? IMF Working Paper No. 05/233
Orlik A (2003) Real convergence and its different measures; lessons to be learnt by EMU applicant
countries
Pedroni P (2004) Panel co-integration; asymptotic and finite sample properties of pooled time
series tests with an application to the PPP hypothesis. Econ Theor 28:597–625
Pesaran H (2007) A simple panel unit root test in the presence of cross-section dependence. J Appl
Econ 22(2):265–312
Pesaran H (2013) Large panel data models with cross-sectional dependence: a survey.
Unpublished, Cambridge, UK
Podkaminer L (2003) Analytical notes on the Balassa-Samuelson effect. BNL Q Rev 226
Rogoff K (1996) The purchasing power parity puzzle. J Econ Lit XXXIV: 647–668
Shin Y (2014) Dynamic panel data workshop. University of Melbourne
Soukiazis E (1995) The endogeneity of factor inputs and the importance of balance of payments on
growth: an empirical study for the OECD countries with special reference to Greece and
Portugal. unpublished PhD dissertation, University of Kent, Canterbury
Tica J, Druzic I (2006) The Harrod-Balassa-Samuelson effect: a survey of empirical evidence.
University of Zagreb, Zagreb
Wilson E (2010) European real effective exchange rate and total factor productivity: an empirical
study. Victoria University of Wellington, New Zealand
Part V
Growth, Productivity and Efficiency in
Various Industries
Chapter 13
Agricultural Tax Responsiveness
and Economic Growth in Ethiopia
Abstract Of late, the pattern of tax revenues and its nexus with economic growth in developing countries have become an increasing concern for policymakers and researchers. Since tax revenue is one of the important sources of government
revenue, a tax policy assumes significance as a vehicle for a viable and long-term
source of revenue and economic growth. Similarly, economic growth has aug-
menting effects on the tax revenue of a country. This study investigates tax
responsiveness to the changes in gross domestic product in Ethiopia in the period
1981–2014. It mainly focuses on the components of agricultural tax revenue:
agricultural income tax and land use fee. In addition, it also studies personal income
tax and business profit income. Understanding and analyzing the level of sensitivity
of tax revenue to discretionary policy measures and GDP are essential in formu-
lating fiscal policy. The empirical evidence on Ethiopia suggests that the trends in
agricultural income tax and land use fee collection are highly inconsistent.
Agricultural income tax and land use fee are not buoyant, indicating that the growth
of the agricultural sector has no statistically significant impact on agricultural
income tax buoyancy. However, personal income tax revenue, business profit
revenue, and total direct tax revenue are responsive to changes in non-agricultural
GDP in Ethiopia. In light of these findings, some policy interventions for improving
tax revenue are suggested.
Keywords Tax buoyancy · Tax elasticity · Agricultural tax revenue · Direct tax revenue
H. Azime (✉)
Institute of Tax and Customs Administration, Department of Public Finance
Ethiopian Civil Service University, Addis Ababa, Ethiopia
e-mail: azimeadem@gmail.com
G. Ramakrishna · M. Asfaw
School of Graduate Studies, Ethiopian Civil Service University, Addis Ababa, Ethiopia
e-mail: profgrk@gmail.com
M. Asfaw
e-mail: drmelesse@gmail.com
13.1 Introduction
The second concept, tax buoyancy, is defined as the overall reaction of tax revenue to changes in GDP and to discretionary changes in tax policy over time. It is a measure of how tax revenue varies with changes in GDP. Tax revenue is expected to increase as the economy grows, and buoyancy gauges how strongly it does so. Tax buoyancy measures can be used to assess the efficiency of a given tax system regarding its revenue generation capacity (Jenkins et al. 2000). Knowledge of this measure is important in decision making about the fiscal policy of a country because it allows us to determine the evolution of the tax revenue collected by the government (Bunescu and Comaniciu 2013; Moreno and Maita 2014). Hence, tax buoyancy is a valuable tool for analyzing tax policy and examining the composition of a tax system.
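A buoyancy coefficient is commonly obtained as the slope of a regression of the log of tax revenue on the log of GDP. The sketch below assumes that log-log form; the chapter's own specification belongs to its conceptual model section and is not reproduced here. The function name is hypothetical and the series are synthetic, constructed so that the true slope is 1.2.

```python
import math


def buoyancy(tax_revenue, gdp):
    """OLS slope and intercept of ln(tax revenue) on ln(GDP)."""
    y = [math.log(v) for v in tax_revenue]
    x = [math.log(v) for v in gdp]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic series: GDP grows 10% a period and revenue is built with slope 1.2.
gdp = [100.0, 110.0, 121.0, 133.1]
tax = [10.0 * (g / 100.0) ** 1.2 for g in gdp]
slope, intercept = buoyancy(tax, gdp)
```

A slope above one means revenue grows faster than GDP (a buoyant tax), while a slope below one or statistically indistinguishable from zero indicates a tax that does not keep pace with growth.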
The tax structures in developing countries should be responsive enough so as to
enable the countries to meet their government spending for development. Thus, the
main objective of our study was to examine the responsiveness of agricultural tax
revenue and other tax revenues to the changes in economic growth in Ethiopia.
More specifically, the paper has the following objectives:
• To estimate and analyze the responsiveness of agricultural income tax revenue
and land use fee to changes in the agricultural component of GDP, and
• To estimate the responsiveness of personal income tax and business income tax
revenue to changes in non-agricultural GDP.
The rest of the paper is organized as follows: The next section gives a brief
overview of the tax structure in Ethiopia. In Sect. 13.3, a conceptual model and a
brief review of earlier studies are presented. The data collection methods and the
variables are presented in Sect. 13.4. Section 13.5 gives the data analysis and
empirical findings. The last section gives a summary and conclusion.
This section presents an overview of the tax structure in Ethiopia across its two
economic regimes: the state-led liberalized regime (1991 onward) and the socialist
regime (1974–91) called the Derg regime. Under both the regimes, the Ethiopian
tax system consisted of direct and indirect taxes. Direct taxes include agricultural
income, land use fee, personal income, rental income, business profit, interest
income, and capital gain tax while indirect taxes include value-added tax (VAT),
turnover tax, excises, stamp duties, customs duties, and export taxes.
During the socialist regime, the government controlled all economic spheres
including agriculture. The land reform policy of 1975 nationalized land and took
another step of distributing land equally among peasants. Consequently, the peas-
ants were forced to establish and organize themselves into peasant associations
(Prichard 2015). Smallholder farmers in Ethiopia depend on small plots of land that are owned or rented to generate income. The term 'agricultural taxation' used in our study includes only taxes paid by the farmers. So the smallholder farmers' tax burden comes from agricultural income tax and land use fee.
During the socialist regime, the objective of agricultural tax was transferring a
substantial portion of the agricultural surplus to industry. As a result, the govern-
ment taxed the agricultural sector heavily. In particular, the agricultural income tax
rate was progressive and was as high as 89% in the highest income bracket.
Taxation on exports of the main crop reached as high as 100% of the farm gate
price (Rashid et al. 2007).
Because of the change of government in Ethiopia in 1991, the country witnessed
a shift in the policy regime. Different reforms were initiated in 1992. These included
new legislations for earnings tax, business income tax, rural land, and agricultural
income tax (Alemayehu and Shimeles 2005). During 1992, agricultural taxes were
not collected because of the transition period and difficulties in collecting taxes
from farmers. Since 1992, the IMF and the World Bank have supported Ethiopia in
liberalizing its economy and implementing structural adjustment programs (SAPs)
to address the internal and external imbalances in the economy.
The government has initiated different reforms to liberalize its economy. It
undertook comprehensive tax reforms encompassing most of the principal revenue
sources. Along with the reforms in the tax system, the liberalization policies were
also extended to monetary policy tools, foreign and domestic trade, production, and
distribution (Geda and Shimeles 2005). The major goals of the tax reforms initiated
during this regime included increasing the tax base, improving tax collection, tax
incentives for the private sector, and dealing with equity in taxation.
Agricultural income tax is one of the most sensitive features of income taxation in
general. In most developing countries, governments impose taxes on agricultural
income, but it is hard to determine the income of smallholder farmers and to reach
income earners. These difficulties are due to the large number of small units of income
generation, the absence of accounting procedures suited to income taxation, the
fluctuating nature of agricultural productivity and profits, and low levels of education.
Ethiopia amended its 1978 agricultural income tax rates in 1995 and 1997. Moreover, annual revenue exceeding birr¹ 1200 was subjected to a progressive tax rate. Agricultural income tax rates imposed by the regional states under the provisions of the constitution ranged widely, from 5 to 40%. Agricultural income taxation
was based on the size of the landholding rather than the amount of annual agri-
cultural production. For instance, the Oromia regional state (the largest and most
populous region in Ethiopia) initially adopted a progressive agricultural income tax
system but replaced it with an agricultural income tax system based on the size of the
landholding, rather than the amount of agricultural produce (ONRS 2002, 2005).
¹ Birr is the currency used in Ethiopia. Currently, one (1) USD is equal to about 22.24 birr.
The agricultural income tax rate, exemption limits, and assessment differ slightly
across regions. Each region levying the tax has its statutes with specific provisions
for determining taxable incomes.
In principle, land taxes are less complex than agricultural income tax because assessing a land tax requires only information on the total area of the land, its location, and its land grade; suitability for irrigation; land fertility; and rural transport access to a market. As Newbery (1987) has suggested, this information might not be too
costly to collect. Based on this information, it would be possible to design a simple
presumptive tax structure for land tax (Sarris 1994).
According to the amended proclamation number 77/1997 of income tax for land
use and agricultural activities, smallholder farmers in the regional states are taxed
birr 10 for the first hectare and birr 7.5 for each extra half hectare (Geda and
Shimeles 2005). In some regions, the area of land and the land classification system
that is based on relative soil fertility estimates determine the level of taxation.
During 2004–14, the total rural area cultivated and expanded for agricultural purposes
increased by 2.7% per year and the number of smallholder farmers increased
by 3.8%. The total agricultural output level also increased during this period
(Bachewe et al. 2015; Moller 2015).
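The flat-rate schedule quoted above (birr 10 for the first hectare and birr 7.5 for each extra half hectare) can be illustrated with a minimal sketch. The function name and the treatment of partial steps are assumptions made for illustration: the text does not say how a partly used half hectare, or a holding smaller than one hectare, is charged, so the sketch rounds the extra area up and charges the full first-hectare amount for any positive holding.

```python
import math

def land_use_tax(hectares, first_hectare_rate=10.0, extra_half_hectare_rate=7.5):
    """Land tax in birr under the schedule quoted in the text.

    Assumptions (not stated in the source): a partly used half-hectare
    step is charged as a full step, and any positive holding of up to
    one hectare pays the full first-hectare amount.
    """
    if hectares <= 0:
        return 0.0
    extra_area = max(0.0, hectares - 1.0)
    extra_steps = math.ceil(extra_area / 0.5)  # number of extra half hectares
    return first_hectare_rate + extra_half_hectare_rate * extra_steps

# Hypothetical holding sizes in hectares.
for size in (0.5, 1.0, 1.5, 2.0):
    print(size, land_use_tax(size))
```

Because the charge depends only on area, such a schedule needs no estimate of farm income, which is the administrative advantage of presumptive land taxation noted above.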
Fig. 13.1 Total tax revenue, direct and indirect tax as shares of GDP. Source: Authors’ computations using data from the Ministry of Finance and Economic Development (MOFED)
conducive to the collection of direct taxes. It is also argued that indirect taxes are
less sensitive to these influences; hence, they can be collected with little effort and
are relatively easy to administer (Khan 2001).
As depicted in Fig. 13.2, the agricultural tax revenue series shows a decline in
revenue until 1992. Because of the change in regime during 1991–92, there was no
assessment of agricultural tax revenue. The figure also shows that the tax ratio has
fluctuated persistently over the last two decades in Ethiopia. The tax ratio trend
is not stable, implying inconsistency in tax performance that could be due to
fluctuations in GDP.
Fig. 13.2 Agricultural tax revenue as share of total GDP and agricultural GDP. Panels: agricultural income tax and land use fee as % of total GDP; agricultural income tax and land use fee as % of agricultural GDP. Horizontal axis: year. Source: Authors’ computations using data from MOFED
According to Feger and Asafu-Adjaye (2014), total tax revenue collection
in SSA countries has averaged only about 15% of GDP to date. In Ethiopia,
it is 11.5%, which is below the SSA average.
Moreover, agricultural income tax collection in Ethiopia is not as
broad-based as it should be. The effort, capacity, and efficiency of the tax
administration may have contributed to the slow progress in collecting revenue
from the agricultural income tax base. In 2003–04, the agricultural income tax
revenue was 0.13% of agricultural GDP (0.06% of total GDP). It dropped to 0.07%
of agricultural GDP in 2007–08 (0.03% of total GDP), but it picked up to 0.13% of
agricultural GDP (0.08% of the total GDP) in the 2010–11 fiscal year.
Though agriculture remains the mainstay of the Ethiopian economy when it
comes to employment and its contribution to GDP, its contribution to the total tax
revenue collection is below 1%. Figure 13.3 shows the shares of personal income
tax and business profit tax to GDP from 1981 to 2014. In 1981, personal income
tax’s revenue share was around 0.1% of GDP; its share grew to 2% of GDP in
2014. Business profit income tax also fluctuated but was still slightly higher than
personal income tax until 2005. However, after this period, it increased moderately
and its contribution reached a 3.5% share of GDP.
Fig. 13.3 Personal income tax and business profit income tax revenue as shares of total GDP. Source: Authors’ computations based on data from MOFED
298 H. Azime et al.
Tax buoyancy measures the relative change in total tax revenue, or in a specific
tax revenue item, compared with the change in the tax base.
Thus, buoyancy is based on actual tax revenue and reflects changes in the tax
structure, which may include tax rates, the tax base, and tax administration and
compliance. Therefore, tax buoyancy is a measure of both the soundness of the tax
base and the effectiveness of tax changes in terms of revenue collection.
On the other hand, tax elasticity measures the automatic response of tax revenue to
the evolution of the tax base. Tax elasticity does not include the effects of fiscal
policy changes in the tax structure such as a change in tax rates, coverage,
exemptions, and deductions or administration. Tax elasticity reflects only the
built-in responsiveness of tax revenue to movements in the national income.
Both the tax buoyancy and elasticity concepts help analyze the capacity of the
tax system in mobilizing revenue with and without changes in the tax policy. Tax
buoyancy is a useful concept for measuring the performance of both the tax policy
and tax administration over time whereas tax elasticity is a relevant factor for
forecasting purposes (Jenkins et al. 2000). The tax elasticity coefficient gives an
indication to policymakers on whether tax revenues will increase at the same pace
as the national income.
Different studies have investigated the impact of GDP on the sensitivity of tax
revenues for African countries. Among these, Osoro (1993) concluded that for the
main categories of taxes in Tanzania, elasticities were less than 1.
However, buoyancy, which also reflects discretionary changes, was
higher than the elasticity coefficient. Mawia and Nzomoi (2013) evaluated the tax
buoyancy of different taxes in Kenya and found that tax revenue, except for excise
duty, did not respond to economic changes. Ahmed and Muhammad (2010) analyzed 25
countries for the period 1998–2008 and applied a pooled least squares analysis
method. Their results show that growth in the agricultural sector had little impact on
the efficiency of tax revenue and was also less responsive to revenue mobilization in
the case of developing countries mainly due to difficulties in assessing the incomes
generated and the low incomes that may not be taxed or may be under-taxed.
Other studies show that the agricultural share’s contribution demonstrated a
consistently negative impact on revenue collections, but tax revenue increased with
trade share (Prichard 2015). Leuthold (1991) measured the tax effort of eight African
countries for the period 1973–81 using OLS estimation on panel data.
The author argues that the agricultural share negatively affects the estimated
coefficients of direct and indirect tax revenues. Overall, the reviewed literature offers
little evidence of improving tax buoyancy in agriculture, and no evidence appears
to be available on Ethiopia. Studying the responsive elements of
agricultural taxation in Ethiopia’s current context is expected to provide an effective
agricultural taxation system that enhances domestic revenue mobilization and rural
investments, which can be used for stimulating development.
Public finance policies in developing countries typically change tax parameters and
structures from time to time. This affects ‘revenue buoyancy.’ According to Creedy
and Gemmell (2001), the tax buoyancy coefficient is the ratio of the
observed increase in revenues to the observed increase in incomes. A tax is buoyant
if revenue increases by more than 1% for a 1% increase in GDP or
national income (Creedy and Gemmell 2008; McCluskey and Trinh 2013). A buoyancy
greater than 1 indicates a more than proportionate increase in tax revenues
relative to GDP. Therefore, tax buoyancy that includes discretionary
changes is a measure of the efficiency of the tax base and the soundness of changes
in the tax policy regarding revenue collection and mobilization.
According to Haughton (1998), tax buoyancy (TB) is formulated as the ratio of the
percentage change in tax revenue to the percentage change in the tax base:
$$TB = \frac{\%\,\Delta\,\text{Revenue}}{\%\,\Delta\,\text{Base}}$$
where the base can be GDP, or the relative base can be considered. Revenue could
refer to the different components of the total tax or individual taxes.
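As a minimal sketch of this definition, buoyancy can be computed from two observations of revenue and of the base. The function names and the figures are hypothetical and serve only to show the arithmetic; the chapter itself estimates buoyancy by regression rather than from two data points.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0

def tax_buoyancy(rev_old, rev_new, base_old, base_new):
    """Ratio of the percentage change in revenue to that in the tax base."""
    base_growth = pct_change(base_old, base_new)
    if base_growth == 0:
        raise ValueError("tax base did not change; buoyancy is undefined")
    return pct_change(rev_old, rev_new) / base_growth

# Hypothetical figures: revenue grows by 5% while the base grows by 10%.
tb = tax_buoyancy(100.0, 105.0, 1000.0, 1100.0)
print(tb)
```

A value below 1, as in this hypothetical case, means that revenue grows less than proportionately to the base.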
In our study, the focus is on two types of agricultural taxes: agricultural income
tax (AgIT) and agricultural land tax or land use fee (AgLT):
Tax elasticity measures the extent to which a tax structure generates revenues in
response to increases in taxpayer incomes without a change in statutory tax rates
(Craig and Heins 1980). A tax is elastic if a 1% increase in GDP brings in
a more than 1% increase in revenue from the tax, holding discretionary tax changes
constant.
Singer (1968) and Ehdaie (1990) developed an econometric method for
estimating the tax elasticity coefficient. The model takes into account
the relations between GDP, tax revenue, the structure of the tax system, and the tax
base, using time series data and a specification based on logarithmic functions
(Bunescu and Comaniciu 2013).
Accordingly, we used a dummy variable ($D_i$) to represent the shift in tax policy
during the study period 1981–2014. From Eq. 13.2, the functional tax form is as
follows:
$$\log(\mathrm{AgIT})_t = \log \alpha + \beta \log(\mathrm{AgrGDP})_t + \sum_i \theta_i D_i + \varepsilon_t \qquad (13.4)$$
where
$\alpha$ Constant;
$\beta$ Elasticity coefficient;
$\theta_i$ Impact or coefficient of the discretionary change; and
$D_i$ Dummy variable as a proxy for the ith discretionary tax measure (DTM) taken
during the period under review. The summation sign in Eq. 13.4 creates room
for the possibility of multiple changes in the tax system during the study period.
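The dummy-variable specification in Eq. 13.4 can be sketched as an ordinary least squares regression of log revenue on a constant, log agricultural GDP, and a policy dummy. The sketch below is an illustration under stated assumptions, not the estimation code used in the study: the data are synthetic and noise-free, the parameter values are hypothetical, and a single dummy switched on from 1992 stands in for the discretionary measures.

```python
import math

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Hypothetical "true" parameters used only to generate synthetic data.
alpha, beta, theta = 2.0, 0.8, -0.3
years = list(range(1981, 2015))
agr_gdp = [100.0 * 1.03 ** (y - 1981) * (1.0 + 0.05 * math.sin(y)) for y in years]
dummy = [1.0 if y >= 1992 else 0.0 for y in years]  # policy shift from 1992
log_agit = [math.log(alpha) + beta * math.log(g) + theta * d
            for g, d in zip(agr_gdp, dummy)]

# Regressors: constant, log agricultural GDP, policy dummy.
rows = [[1.0, math.log(g), d] for g, d in zip(agr_gdp, dummy)]
log_alpha_hat, beta_hat, theta_hat = ols(rows, log_agit)
print(round(beta_hat, 6), round(theta_hat, 6))
```

With the dummy included, the coefficient on log GDP is read as the elasticity; dropping the dummy column would let the GDP coefficient absorb the policy shift, which is the sense in which buoyancy includes discretionary changes.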
We introduced a dummy variable to represent a shift in tax policy during the
administrative reforms starting from 1992. The decade of the 1990s differed from
the previous period in the application of a more liberal policy. During the second
half of the 1990s, tax reforms were implemented. Since 1993, the tariff structure has
improved extensively and more proclamations and regulations have been introduced
to streamline the old tax system.
In estimating the coefficient of tax buoyancy, annual time series data was collected
from 1981 to 2013. The data comprises the following variables of interest:
agricultural GDP, non-agricultural GDP, aggregated agricultural income tax,
aggregated land tax, personal income tax, business profit income tax, aggregated
direct tax, and consumer price index. This data is from the Ministry of Finance and
Economic Development (MOFED) and the World Development Indicators’
(WDI) database.
Agricultural income tax revenue, land tax revenue, personal income tax, business
profit income tax, and aggregated direct tax were converted to their real values
by dividing the nominal values by the consumer price index (CPI). The use of
CPI as the deflator helps smooth the data and also avoids biased results that could
have resulted from inflation. CPI is used because it falls on the expenditure side of
the GDP equation. According to Triplett (2001), CPI is preferable as it represents
the cost-of-living index and provides appropriate guidance for measuring consumer
inflation. Hence, it is best used in deflating tax revenues.
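The deflation step, together with the log first difference used later for the GDP series, can be sketched as follows. The revenue and CPI figures are hypothetical, and the CPI is assumed to be an index with a base-year value of 100.

```python
import math

def deflate(nominal, cpi, base=100.0):
    """Convert nominal values to real values: real = nominal / (CPI / base)."""
    return [n / (c / base) for n, c in zip(nominal, cpi)]

def log_diff(series):
    """First difference of the natural log (approximate growth rate)."""
    logs = [math.log(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical nominal revenue (million birr) and CPI (base year = 100).
nominal_tax = [50.0, 60.0, 72.0]
cpi = [100.0, 120.0, 144.0]

real_tax = deflate(nominal_tax, cpi)
print(real_tax)
print(log_diff(real_tax))
```

In this hypothetical case nominal revenue rises by 20% a year but prices rise equally fast, so real revenue is flat and its log difference is approximately zero.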
The variables used in the models are as follows:
D.ln_RealAGDP is the first differenced log of real agricultural GDP;
D1992 is a dummy variable for 1992, when there was a change
in government and no collection of tax revenue;
Dpolicy is a dummy variable to capture policy changes due to the tax
reforms; and
t is a time trend.
The limitation of this approach is its data requirement: it needs tax revenue data
from which the effects of discretionary changes have been separated. Because such data
are lacking, we corrected the dataset for the effects of tax reforms and tax policy
changes using dummies. This technique
assumes that income elasticity is constant over the range of revenues considered.
Initially, agricultural GDP fluctuated steadily, but this was followed by a period
of rapid increase. Since 1992, the new Ethiopian regime has
introduced various changes in the tax system and it is expected that real agricultural
GDP could be non-stationary. As such, to obtain meaningful results, the Dickey-Fuller
test with a constant and a time trend and the
Augmented Dickey-Fuller test were employed to test for the presence of unit roots
in the variables. The Kwiatkowski–Phillips–Schmidt–Shin
(KPSS) and Phillips–Perron (PP) unit root tests were also employed.
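The idea behind the unit root tests can be illustrated with the simplest Dickey-Fuller regression. This is a deliberately reduced sketch: it includes a constant but neither the time trend nor the lagged differences used in the study, and the two series are simulated rather than taken from the data.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on rho in: delta_y[t] = a + rho * y[t-1] + e[t].

    Simplest Dickey-Fuller form (constant, no trend, no lagged differences).
    The statistic has to be compared with Dickey-Fuller critical values,
    not with the usual Student-t table.
    """
    x = series[:-1]
    dy = [b - a for a, b in zip(series, series[1:])]
    n = len(x)
    mx, my = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - my) for xi, di in zip(x, dy))
    rho = sxy / sxx
    a = my - rho * mx
    ssr = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se_rho

random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(200)]

white_noise = shocks            # stationary by construction
random_walk = []                # cumulated shocks: has a unit root
level = 0.0
for e in shocks:
    level += e
    random_walk.append(level)

print(dickey_fuller_t(white_noise))
print(dickey_fuller_t(random_walk))
```

A strongly negative statistic, as for the stationary series, leads to rejection of the unit root hypothesis; a statistic closer to zero, as is typical for a random walk, does not.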
The results indicate that real agricultural GDP exhibited a unit root at the different
critical levels. However, real agricultural GDP was found to be stationary after
differencing once, implying that the variable is integrated of order one. In contrast,
the real agricultural income tax, the real land use fee, and the total agricultural tax
variables were found to be stationary at levels (see Annexure 1). Thus, the real agricultural
income tax and real land use fee, as well as the total agricultural tax series,
are integrated of order zero. Therefore, the first difference of the real agricultural GDP
(D.ln_RealAGDP) was used as an explanatory variable in the model. The other independent
variables in the model include time (t) and a dummy variable, d1992, introduced
for 1992, when real economic activity for assessing agricultural income tax
and land use fee was substantially slower than the historical trend.
The results suggest that agricultural GDP had a significant impact on agricultural
income tax. The estimated revenue buoyancy is −1.13, which is
significant at the 10% level. This implies that a 1% increase in agricultural GDP was
associated with a 1.13% decrease in agricultural income tax in Ethiopia. The findings
also suggest that agricultural GDP had no statistically significant influence on the agricultural
land use fee and total agricultural tax. The $R^2$ value is high, suggesting that
the model is a good fit. Table 13.1 presents the regression results on tax buoyancy.
Under the category of direct taxes, non-agricultural tax revenue variables, which are
real personal income tax and business profit income tax, as well as the total direct
tax series, were analyzed. As the first step, a more detailed examination of the data
properties and the final model specification was done and the property of the series
was analyzed using the augmented Dickey–Fuller (ADF), KPSS, and Phillips–
Perron (PP) unit root tests (the results are presented in Annexure 1).
Since all series were found to be I(1), this required testing for co-integration to
establish the relationship between personal income tax and business income tax
with non-agricultural GDP. Upon realizing the existence of a unique co-integrating
vector, the structural vector auto-regressive (SVAR) model was used to investigate
and estimate the elasticity and buoyancy in the short run between the variables. AIC
was used to select the optimum lag length of SVAR models. Based on the SVAR
estimation, tax buoyancy and elasticity results are given in Table 13.4.
The results in Table 13.4 suggest that personal income tax had a buoyancy of
0.08. That is, the tax system yielded a 0.08% change in tax revenue as a
consequence of both automatic changes and changes in discretionary fiscal
policy for a 1% change in non-agricultural GDP. In other words, a 1% increase in
non-agricultural GDP led to a 0.08% increase in personal income tax during the
current period. Although some proportion of incremental income was transferred
to the government in the form of taxes, the tax system was less
buoyant.
The results show that the elasticity of Ethiopia’s personal income tax was
0.068, which indicates that the growth in non-agricultural GDP over the
study period spurred a less than proportionate automatic increase in tax revenue.
The implication is that the tax system yielded a 0.068% change in tax revenue
resulting from economic activity for every 1% change in non-agricultural
GDP. Thus, a decreasing proportion of incremental income was collected and
transferred to the government in the form of tax revenue, which shows that the
personal income tax system in Ethiopia was inelastic over the study period.
This also shows that a 1% increase in non-agricultural GDP led to a 0.12%
increase in business profit income tax in the current fiscal year. Thus, a decreasing
amount of incremental business profit income tax was collected and transferred to
the government in the form of taxes, implying that the tax system was less buoyant.
When the policy change is captured as a dummy variable, the tax
elasticity estimates show that a 1% increase in non-agricultural GDP led to a 0.11%
increase in business profit income tax in the current period. Thus, a smaller proportion
of incremental business profit income tax was collected and transferred to
the government in the form of tax revenue. This shows that this tax was also
inelastic over the study period. In general, personal income tax and business profit
income tax are progressive in nature, so their elasticities were expected to be
greater than 1.
Further, a 1% increase in non-agricultural GDP led to a 0.13% increase in total
direct tax in the current period. When a policy change was included as a dummy
variable, a 1% increase in non-agricultural GDP led to about a 0.12% increase in
direct tax in the current period.
The overall elasticity of the tax system clearly shows that the tax system in the
country is inelastic and is therefore not responsive to changes in national income.
The elasticity coefficient was not much lower than buoyancy for all the variables,
implying that the discretionary measures did not significantly impact own revenue.
It can easily be observed that discretionary changes to personal income tax and
business profit income tax made little contribution to the growth in overall direct tax
revenues.
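The comparison of buoyancy and elasticity in this section can be summarized in a few lines. The coefficients below are the short-run values reported in the text; reading the difference between the two as the contribution of discretionary measures is an interpretive assumption that follows the argument above, and the function name is hypothetical.

```python
# Short-run coefficients as reported in the text (Table 13.4).
estimates = {
    "personal income tax": {"buoyancy": 0.08, "elasticity": 0.068},
    "business profit income tax": {"buoyancy": 0.12, "elasticity": 0.11},
    "total direct tax": {"buoyancy": 0.13, "elasticity": 0.12},
}

def discretionary_gap(buoyancy, elasticity):
    """Part of the revenue response attributed to discretionary measures."""
    return buoyancy - elasticity

gaps = {tax: discretionary_gap(**v) for tax, v in estimates.items()}
for tax, gap in gaps.items():
    inelastic = estimates[tax]["elasticity"] < 1.0
    print(f"{tax}: gap = {gap:.3f}, inelastic = {inelastic}")
```

All three gaps are small relative to the coefficients themselves, and every coefficient is far below 1, which is consistent with the conclusion that the direct tax system was neither buoyant nor elastic over the study period.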
Our study analyzed and measured the responsiveness of agricultural tax to economic
growth in Ethiopia. Agricultural tax buoyancy measures growth in agricultural
tax revenue as a ratio of the growth in agricultural GDP. The study concludes
that growth in agricultural GDP had a significant and negative impact on the growth
in agricultural income tax collections in Ethiopia. Agriculture’s share had an
adverse influence on revenue collections consistently but non-agricultural direct
Annexure 1
ADF, KPSS, and PP unit root test results
(CV = critical value; Stat. = test statistic)
                       ADF                                       KPSS                                   PP
Variable               1% CV   5% CV   10% CV  Stat.   Lags      1% CV  5% CV  10% CV  Stat.   Lags     1% CV    5% CV    10% CV   Stat.    Lags
Non-agricultural GDP   −4.334  −3.58   −3.228  −3.632  2         0.216  0.146  0.119   0.041   3        −23.524  −18.508  −15.984  −23.396  3
Personal income tax    −4.325  −3.576  −3.226  −3.952  1         0.216  0.146  0.119   0.104   3        −23.396  −18.432  −15.936  −30.704  3
Business income tax    −4.316  −3.572  −3.223  −4.205  0         0.216  0.146  0.119   0.0922  3        −23.396  −18.432  −15.936  −22.889  3
Direct tax             −4.325  −3.576  −3.226  −4.427  0         0.216  0.146  0.119   0.102   3        −23.268  −18.356  −15.888  −25.549  3
References
Ahmed QM, Muhammad SD (2010) Determinant of tax buoyancy: empirical evidence from
developing countries. Eur J Soc Sci 13(3):408–418
Alemayehu G, Shimeles A (2005) Taxes and tax reform in Ethiopia, 1990–2003. Research Paper,
UNU-WIDER, United Nations University (UNU), No. 65
Bachewe FN, Guush B, Berhane B, Minten M, Taffesse AS (2015) Agricultural growth in Ethiopia
(2004–2014): Evidence and drivers. International Food Policy Research Institute (IFPRI),
Working Paper, Washington, DC, p 81
Besley T, Ghatak M (2006) Public goods and economic development. Oxford University Press,
Oxford
Besley T, Persson T (2014) Why do developing countries tax so little? J Econ Perspect 28(4):
99–120
Bunescu L, Comaniciu C (2013) Tax elasticity analysis in Romania: 2001–2012. Proc Econ
Finance 6:609–614
Craig ED, Heins AJ (1980) The effect of tax elasticity on government spending. Public Choice 35
(3):267–275
Creedy J, Gemmell N (2001) The revenue elasticity of taxes in the UK. Melbourne Institute
Working Paper, No. 11/01
Creedy J, Gemmell N (2008) Corporation tax buoyancy and revenue elasticity in the UK. Econ
Model 25(1):24–37
Ehdaie J (1990) An econometric method for estimating the tax elasticity and the impact on
revenues of discretionary tax measures: applied to Malawi and Mauritius. Country Economics
Department, The World Bank, Working Paper Series, No. 334
Feger T, Asafu-Adjaye J (2014) Tax effort performance in sub-Sahara Africa and the role of
colonialism. Econ Model 38:163–174
Geda A, Shimeles A (2005) Taxes and Tax Reform in Ethiopia, 1990–2003 WIDER Working
Paper Series 065, World Institute for Development Economic Research (UNU-WIDER)
Haughton J (1998) Estimating tax buoyancy, elasticity and stability. United States Agency for
International Development, EAGER/PSGE Excise Project
Howard M, Foucade AL, Scott E (2009) Public Sector Economics for Developing Countries.
University of the West Indies Press
Jenkins GP, Kuo CY, Shukla GP (2000) Tax analysis and revenue forecasting. Harvard Institute
for International Development, Harvard University, Cambridge, Massachusetts
Khan MH (2001) Agricultural taxation in developing countries: a survey of issues and policy.
Agric Econ 24(3):315–328
Leuthold JH (1991) Tax shares in developing economies a panel study. J Dev Econ 35(1):173–185
Mawia M, Nzomoi J (2013) An empirical investigation of tax buoyancy in Kenya. Afr J Bus
Manage 7(40):4233–4246
McCluskey WJ, Trinh HL (2013) Property tax reform in Vietnam: options, direction and
evaluation. Land Use Policy 30(1):276–285
Milwood TAT (2011) Elasticity and Buoyancy of the Jamaican tax system. Bank of Jamaica
Moller LC (2015) Ethiopia’s great run: the growth acceleration and how to pace it. The World
Bank Group, Washington, DC
Moreno MA, Maita M (2014) Tax elasticity in Venezuela a dynamic cointegration approach.
Central Bank of Venezuela
Newbery DM (1987) Taxation and development. In the theory of taxation for developing
countries. Newbery DM, Stern NH (eds), pp 165–204. Published for the World Bank [by]
Oxford University Press
ONRS (2002) Oromia Rural Land Use and Administration Proclamation. No. 56, (ed.), Oromia
National Regional State. Finfinnee, Ethiopia
ONRS (2005) Oromia national regional Government rural land use payment and agricultural
income tax amendment, 2005, Proc. No. 99, Megeleta Oromia, 13th year, No. 13. Oromia
National Regional State
Osoro N (1993) Revenue productivity implications of tax reform in Tanzania. African Economic
Research Consortium, Research Paper No. 20
Prichard W (2015) Taxation, responsiveness and accountability in Sub-Saharan Africa: the
dynamics of tax bargaining. Cambridge University Press
Rashid S, Assefa M, Ayele G (2007) Distortions to agricultural incentives in Ethiopia. Agricultural
Distortions Working Paper 43. The World Bank, Washington, DC
Sanjeev G, Tareq S (2008) Mobilizing revenue. Finance Dev 45(3):44–47
Sarris AH (1994) Agricultural taxation under structural adjustment. Food and Agriculture
Organization of the United Nations
Singer NM (1968) The use of dummy variables in estimating the income-elasticity of state
income-tax revenues. National Tax J 21(2):200–204
Tanzi V, Zee HH (2000) Tax policy for emerging markets: developing countries. National Tax J
53(2):299–322
Triplett JE (2001) Should the cost-of-living index provide the conceptual framework for a
consumer price index? Econ J 111(472):311–334
Chapter 14
Improving Agricultural Productivity
Growth in Sub-Saharan Africa
Keywords Total factor productivity · Agricultural production · Inclusive development · Agro-processing · Export of agricultural raw materials · Panel data · Simulation
14.1 Introduction
farmers in the continent are confronted with policies that lower economic incentives
to invest in agricultural production and modern inputs.
This situation stresses the need for strategies that stimulate more rapid agricultural
growth in sub-Saharan Africa. However, increased exploitation of natural
resources or a spike in commodity terms of trade may only spur limited growth in
the long run. In contrast, policies anchored on key productivity determinants
(Binswanger and Townsend 2000) can help maintain agricultural growth over the
long run. In our paper, we pursue the question of how agro-industrial activities and
exports of agricultural raw materials can be used to generate effective agricultural
productivity growth in SSA. Our study differs from the literature on sources of total
factor productivity (TFP) growth in agriculture in two aspects. First, we circumvent
the simultaneity bias associated with TFP estimations from panel data
by using the hybrid Olley and Pakes (1996) and Levinsohn and Petrin (2003)
procedure. Second, in contrast to the deterministic forecasting approach in most
studies, the simulation approach that we use acknowledges that uncertainties are
associated with the realization of values of some TFP determinants and, by extension,
with the random nature of TFP itself.
The rest of the paper is organized as follows. Section 14.2 presents the conceptual
framework, while Sect. 14.3 gives details of the econometric model
underlying the analysis. It also presents the estimated model and data sources.
Section 14.4 discusses the results and gives a conclusion.
of their operations. For instance, the Africa Post Harvest Loss Index (2014) estimates
losses of 10–40% for roots and tubers, 15–44% for fruits and vegetables,
and 10–40% for fish and seafood. Developing the agro-processing
potential, either through indigenous knowledge (drying, salting, crushing, precooking)
or modern technology-based methods (extraction, canning, bottling,
concentration), has the capacity to reverse these losses. Therefore, agro-industrial
activities also have the potential to contribute toward food security.
However, unplanned agro-industrial development may generate negative externalities
and sustain primary agricultural production in a low-level equilibrium.
For example, there may be significant risks in terms of equity, sustainability, and
inclusiveness when value addition and capture are concentrated in the hands of a
few value chain participants to the detriment of the others (da Silva and Baker
2007). This will be the case in a situation of unbalanced market power in the
agri-food chain. Moreover, sustainability of agro-industrial development depends
on its competitiveness in terms of costs, prices, operational efficiencies, product
offers, and other associated parameters. Establishing and maintaining competitiveness
may constitute a particular challenge for small- and medium-scale
agro-industrial enterprises and small-scale farmers.
The preconditions for developing agro-industries include necessary transportation,
information, and communication technologies and access to reliable supplies of key
utilities, notably electricity and water. Therefore, infrastructural constraints influence
the cost and reliability of the physical movement of raw materials and end products,
the efficiency of processing operations, and responsiveness to customer demands. The
prevailing macroeconomic and business conditions and the level, quality, and reliability
of infrastructure are also critical determinants of competitiveness in the export of
processed agro-food products (Crammer 1999). In a situation of acute infrastructural
constraints, the additional complexities of processing operations may outweigh the
benefits of diversification in the exports of primary commodities toward value addition
(Love 1983). Weak infrastructure may further put agro-processing enterprises at a
competitive disadvantage vis-à-vis their industrialized competitors and distort the
competitiveness of developing countries relative to one another. Unreliable and costly
supplies of utilities may also prevent enterprises from operating at or near full capacity
utilization. Overall, a weak infrastructural environment will lower the rate of transition
of agro-industries from informal to formal operators and steer the structure of the
sector toward a higher level of concentration.
growth. The key premise of this argument is that overall growth in a country can be
generated not only by increasing the amount of labor and capital within the
economy, but also by expanding exports. Accordingly, exports can serve as an
“engine of growth.” An offshoot of this idea is the assumption that developing
countries have comparative advantages in agricultural production, thus only
needing to forward their agricultural produce to international markets (Akande
2012). However, empirical analyses to confirm this proposition have shown mixed
results. While positive for some countries (Krueger 1978; Lussier 1993), they were
negative for others with more than half the empirical investigations published in the
1990s finding no long-run relationship between exports and economic growth,
suggesting that correlations between these variables arise as a result of short-term
fluctuations.
A critical factor that affects the chances of developing countries benefiting from
export trade in agriculture is increasing consumer concerns about food safety.
Specifically, food exports from the developing world are exposed to demanding
food safety standards from organizations such as Codex Alimentarius and by
unilateral requests from individual importers. Also, attitudes and standards in vogue
in the developed world spill over to local markets (Pinstrup-Andersen 2000). A new
form of protectionism often arises in which high quality and safety standards
imposed by importing countries cannot be accommodated rapidly by local production
technologies or guaranteed by local analytical capabilities. The latter may
lead to increased levels of rejection at entry ports. Moreover, even if the problem
regarding the safety of an imported food has been overcome, the credibility of the
exporting country to produce safe food may be at stake, thus affecting the volume of
its food exports. For this reason, developing countries that do not implement
or strengthen their food-borne disease controls and their investigation and surveillance
systems are unlikely to gain in the long run from food and agricultural export
trade.
In summary, the review indicates that depending on the prevailing factors the
correlation between agricultural productivity, agro-processing and raw material
exports can be positive or negative and is also subject to random influence from
market forces. Hence, the focus of this paper is establishing this correlation and
how the equilibrium can be shifted in a way so as to achieve sustainable growth and
inclusive development in SSA.
The simulation approach examines the future evolution of TFP in SSA agricultural
production under the assumption that uncertainties are associated with the evolution
of certain TFP determinants (Davidson and MacKinnon 2004). First, we estimated
the TFP data from the aggregate agricultural production function using the hybrid
Olley and Pakes (1996) and Levinsohn and Petrin (2003) procedure. Second, the
fixed coefficients in the TFP simulation model were estimated from a Tobit
14 Improving Agricultural Productivity Growth in Sub-Saharan Africa 317
$E(\theta_N = \theta)$
As a first step, agricultural TFP was estimated from the hybrid Olley and
Pakes-Levinsohn and Petrin production function:
$$y_{it} = \beta_{0i} + \beta_k k_{it} + \beta_l l_{it} + \beta_{ld}\, ld_{it} + \omega(k_{it}, i_{it}) + u^{q}_{it} \qquad (14.2)$$
where lower case letters represent the log transform of the respective variable; y is
gross domestic product measured in million purchasing power parity dollars
(PPP$); k is the gross capital investment measured in million US dollars; l is
agricultural labor measured in million people employed in agriculture; ld is agricultural
land measured in square kilometers; i is gross agricultural investment
measured in million US dollars; and u is the error term, distributed $N(0, \sigma^2)$.¹
The fixed parameters in the TFP simulation model were estimated from the Tobit
regression:
where $\alpha_{0i}$ are fixed effects parameters on countries; $\alpha_j$ ($j > 0$) are parameters on the
associated variables; agvadd is value addition to agricultural products through
agro-processing measured in current market prices (USD); agrmtexpt is the value of
agricultural raw materials exported measured in current US dollars; agr&d is the
public expenditure on agricultural research and development measured in million
constant 2011 US dollars; agfdi is the value of foreign direct investment in
agriculture measured in current US dollars; agoda is the value of official development
¹ Annexure A gives a derivation of this model.
318 O.R. Akande et al.
assistance to agriculture measured in constant 2012 US dollars; and $\varepsilon_{it}$ is the error term,
distributed $N(0, \sigma^2)$.
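The display for the Tobit regression, Eq. (14.3), is missing from this text. Read together with the variable definitions above and with the deterministic part of Eq. (14.4), it presumably has the following form. This is a reconstruction under that assumption, not the authors' printed equation, and it leaves the censoring rule of the Tobit unspecified because the text does not state it:

```latex
\begin{equation}
tfp_{it} = \alpha_{0i} + \alpha_1\, agvadd_{it} + \alpha_2\, agrmtexpt_{it}
         + \alpha_3\, agr\&d_{it} + \alpha_4\, agfdi_{it} + \alpha_5\, agoda_{it}
         + \varepsilon_{it},
\qquad \varepsilon_{it} \sim N(0, \sigma^2)
\end{equation}
```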
Finally, TFP was simulated from the stochastic model:
$$tfp_{it} = \alpha_{0i} + \alpha_1\, agvadd_{it} + \eta_{1,it} + \alpha_2 (agrmtexpt_{it} + \eta_{2,it}) + \alpha_3\, agr\&d_{it} + \alpha_4\, agfdi_{it} + \alpha_5\, agoda_{it} + \xi_{it} \qquad (14.4)$$
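One way to read the stochastic model in Eq. (14.4) is as a Monte Carlo exercise: hold the estimated coefficients fixed, draw the shocks repeatedly, and inspect the resulting distribution of TFP. The sketch below follows that reading. The coefficients are the mixed effects estimates reported in Table 14.2; everything else (the normal shock distributions, their standard deviations, the country effect, the input values and the function names) is an assumption made for illustration, since the text does not state them.

```python
import numpy as np

# Mixed effects coefficients as reported in Table 14.2 of this chapter.
ALPHA = {"agvadd": 0.09, "agrmtexpt": -0.04, "agrd": -0.15,
         "agfdi": -0.004, "agoda": 0.04}


def deterministic_part(x, alpha0=0.0):
    """Non-random part of Eq. (14.4) for one country-year."""
    return alpha0 + sum(ALPHA[name] * x[name] for name in ALPHA)


def simulate_tfp(x, alpha0=0.0, shock_sd=0.05, noise_sd=0.12,
                 n_draws=1000, seed=0):
    """Draw simulated TFP values with the structure of Eq. (14.4).

    shock_sd is a hypothetical std. dev. for eta_1 and eta_2; noise_sd
    mirrors sigma_e in Table 14.2, which is also an assumption here.
    """
    rng = np.random.default_rng(seed)
    eta1 = rng.normal(0.0, shock_sd, n_draws)   # shock attached to agvadd
    eta2 = rng.normal(0.0, shock_sd, n_draws)   # shock attached to agrmtexpt
    xi = rng.normal(0.0, noise_sd, n_draws)     # residual noise
    return (alpha0
            + ALPHA["agvadd"] * x["agvadd"] + eta1
            + ALPHA["agrmtexpt"] * (x["agrmtexpt"] + eta2)
            + ALPHA["agrd"] * x["agrd"]
            + ALPHA["agfdi"] * x["agfdi"]
            + ALPHA["agoda"] * x["agoda"]
            + xi)


# Hypothetical determinant values, for illustration only.
example = {"agvadd": 2.0, "agrmtexpt": 1.0, "agrd": 0.5,
           "agfdi": 1.5, "agoda": 1.0}
draws = simulate_tfp(example)
print(draws.mean(), np.percentile(draws, [5, 95]))
```

Scenario analysis then amounts to changing the values in `example` (for instance raising `agvadd` and lowering `agrmtexpt` by the same percentage) and comparing the simulated distributions against the baseline.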
The study uses panel data on 13 countries in sub-Saharan Africa covering the
period 1981–2005. Data were collected from the databases of the Food and
Agriculture Organization (FAO) of the United Nations, Agricultural Science and
Technology Indicators (ASTI) (www.asti.cgiar.org), and the World Bank
(www.worldbank.org). Data on agricultural raw materials exported were derived by
multiplying the proportion of agricultural raw materials in total merchandise
exports by total merchandise exports. The value of agro-industrial value addition
was proxied by industrial value added, obtained by multiplying industrial value
added as a proportion of GDP by GDP. Values of official development assistance
in agriculture (agoda) and foreign direct investment in agriculture (agfdi) were
obtained by weighting the aggregates of these variables by the proportion of
agriculture value added in GDP.
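The variable construction described above is simple share arithmetic, and can be sketched as follows. The function and argument names are hypothetical, shares are assumed to be expressed as proportions between 0 and 1, and the example numbers are invented purely to show the calculation.

```python
def derive_variables(gdp, merchandise_exports, raw_material_share,
                     industry_va_share, agri_va_share, total_oda, total_fdi):
    """Build the four derived series from aggregates and shares.

    All *_share arguments are proportions (0 to 1), which is an assumption
    about how the source databases report them.
    """
    return {
        # share of agricultural raw materials in exports x total exports
        "agrmtexpt": raw_material_share * merchandise_exports,
        # industrial value added as a proportion of GDP x GDP
        "agvadd": industry_va_share * gdp,
        # aggregates weighted by agriculture's share of value added in GDP
        "agoda": agri_va_share * total_oda,
        "agfdi": agri_va_share * total_fdi,
    }


# Invented figures, for illustration only.
example = derive_variables(gdp=100.0, merchandise_exports=40.0,
                           raw_material_share=0.25, industry_va_share=0.2,
                           agri_va_share=0.3, total_oda=10.0, total_fdi=5.0)
print(example)
```

Note that weighting total ODA and FDI by agriculture's share of GDP assumes these flows are distributed across sectors in proportion to value added, which is a strong simplification inherited from the proxy itself.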
14.4.1 Results
Annexure B summarizes the data, while Table 14.1 and Table 14.2 give estimates
from the production function and the TFP model. The goodness of fit statistics of the
hybrid Olley and Pakes-Levinsohn and Petrin production function indicate a good
fit of the data to the model. The returns to scale statistics show that agricultural
Table 14.1 Parameter estimates of the hybrid Olley and Pakes-Levinsohn and Petrin regression model of
agricultural production in sub-Saharan Africa

Variable(a)      Coefficient   Std. error   Sig. level
Labor            0.72          0.36         0.05
Land             −0.16         0.46         0.74
Gross capital    1             0.42         0.02
Investment       0.001         0.10         0.99
Wald             0.43(0.43) SS

Source Author’s computation
(a) All variables are in logarithm form
Table 14.2 Parameter estimates of the Tobit regression model of TFP in SSA’s agriculture

Variable                Mixed effects model         Random effects model
                        Coefficient (std. error)    Coefficient (std. error)
agr&d                   −0.15(0.05)**               −0.133(0.032)***
agoda                   0.04(0.02)**                0.027(0.021)
agfdi                   −0.004(0.001)*              −0.004(0.002)**
agvadd                  0.09(0.02)**                0.034(0.024)
agrmtexpt               −0.04(0.01)***              −0.032(0.013)**
Burkina Faso            −1.56(0.05)**
Madagascar              −2.35(0.06)*
Ghana                   −0.37(0.07)*
Mali                    −1.42(0.06)*
Togo                    0.06(0.05)**
Kenya                   −1.47(0.09)*
Nigeria                 −1.20(0.14)
Malawi                  −0.80(0.07)*
sigma_u                 2.68e−19(1.00)              0.79(0.206)***
sigma_e                 0.12(0.01)***               0.12(0.01)***
rho                     4.81e−36(3.69e−19)          0.98(0.01)
Fit stat.:
Log likelihood          90.23                       63.96
AIC                     −150.46                     −113.92
BIC                     −106.01                     −93.18
Wald Chi-square         7163.81***                  29.96***
Likelihood ratio (LR)   52.54***

Source Author’s computation
***, **, * significant at 1, 5 and 10%, respectively
production in SSA exhibits constant returns to scale. The coefficients on labor and
gross capital were significantly different from zero, whereas those on land and
investment were not significant. Specifically, the elasticity coefficient on labor
indicates that a percentage increase in the variable increased aggregate agricultural
14.4.2 Discussion
Evidence from the regression analysis indicates that increases in
agro-processing activities, together with the corresponding decrease in the export of raw agricultural
materials, increase agricultural production in SSA. However, the low elasticity
coefficient on value addition (less than unity) implies that agricultural productivity
in the region responds little to changes in value addition activities, which further
Table 14.3 Scenario analysis of the effect of increases in agro-industrial activities and decreases
in export of agricultural raw materials on TFP in sub-Saharan Africa

Scenario (% increase in agro-processing   Progressive growth in TFP     Progressive growth in TFP
plus corresponding % decrease in          over the baseline,            over the baseline,
agricultural raw materials export)        total (%)                     marginal (%)
Baseline                                  0                             0
1                                         1.33                          1.33
2.5                                       8                             3.2
5                                         13.33                         2.67
7.5                                       20                            2.67
10                                        21.33                         2.13

Source Author’s calculation
Fig. 14.1 Effect of improving agro-industrial activities and decreasing agricultural raw material
exports on progressive growth of TFP in agriculture in SSA
suggests that the growth of agro-industry in SSA faces some challenges. AfDB
(2008), the World Bank, and Information Development/Agribusiness (2013)
identified the challenges confronting agro-industrial development in many parts of
Africa, including lack of infrastructure, storage, finance, competencies,
adequate technologies, and a good policy environment. Specifically, these studies say
that lack of storage capacity in conjunction with poor rural electrification and water
access, insufficient road networks, and difficult access to communication tools
A limitation of our study is that the findings may be
affected by the quality of the data used. Specifically, the nonavailability of data on
many variables and missing data reduced the number of countries used for the
analysis. More precise estimates may be obtained by a study that uses datasets
of improved quality.
This paper investigated the question of how agro-processing and agricultural raw
material exports can be effectively used to improve productivity of agriculture in
SSA. Our findings lead to the conclusion that while intensifying efforts to export
raw agricultural materials leads to decreased productivity growth in agriculture,
increasing agro-processing activities leads to only marginally improved agricultural
productivity growth, suggesting that agro-industrial activities are locked in a
low-level equilibrium.
To overcome the challenges associated with agro-industrialization and improv-
ing the value of agricultural exports thereby improving agricultural productivity
growth, there is a need for a policy, regulatory, and institutional framework across
countries in the region that enables agro-industrial development to become stronger;
creating opportunities for increased private sector engagement including through
the formation of public–private partnerships for developing synergies; providing
access to credit for participants along the agricultural value chain; providing rural
infrastructure that reduces postharvest losses and transport costs and shortens transit
time while increasing overall rural mobility; supporting innovations and technology
for developing competitive value chains; providing access to value-responsive
markets; providing access to timely information to improve bargaining powers;
establishing organizations to reduce transaction costs; and including women, poor,
and/or marginal groups into value chains. This strategy will have optimal results if
it concomitantly and yearly increases agro-industrial activities and decreases agri-
cultural raw material exports by 2.5% from their existing values.
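The 2.5% figure can be cross-checked against Table 14.3. In that table the marginal column appears to equal the total growth in TFP divided by the size of the scenario step; this is an observation from the reported numbers, not a definition given in the text. Under that reading, a short check picks out the scenario with the largest TFP gain per percentage point of change:

```python
# Scenario sizes (%) and total progressive TFP growth (%), from Table 14.3.
scenarios = [1, 2.5, 5, 7.5, 10]
total_growth = [1.33, 8, 13.33, 20, 21.33]

# Assumed relationship: marginal growth = total growth / scenario size.
marginal = [round(total / pct, 2) for pct, total in zip(scenarios, total_growth)]

# Scenario with the highest growth per percentage point of change.
best = scenarios[marginal.index(max(marginal))]

print(marginal)  # [1.33, 3.2, 2.67, 2.67, 2.13]
print(best)      # 2.5
```

The computed values match the marginal column of Table 14.3, and the per-point gain peaks at the 2.5% scenario before declining.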
where Y is the aggregate output, K is the vector of capital input, L is the labor input, and
A is the Hicksian neutral efficiency level.
While Y, K and L are all observed by the econometrician, A is not.
Taking the natural logarithm of Eq. (14.5) yields:
where the lower case letters refer to the natural logarithm of the respective variables and
$\ln(A) = \beta_{0i} + \varepsilon_{it}$, where $\beta_{0i}$ measures productivity that varies over countries and
$\varepsilon_{it}$ is the time-specific deviation from that mean. When $\varepsilon_{it}$ is decomposed into a
predictable and an unpredictable component, Eq. (14.6) becomes:
where $\omega_{it} = \beta_{0i} + v_{it}$ represents sector-specific productivity and $u_{it}$ is an i.i.d. error
term, representing unexpected deviation from the mean due to measurement or
other unexpected circumstances. The task is to estimate Eq. (14.7) and solve for $\omega_{it}$.
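The displays for Eqs. (14.5) to (14.7) are not reproduced in this text. Under the standard Cobb-Douglas setup that the surrounding definitions describe (the same setup reviewed in Van Beveren 2012), the chain would read roughly as follows. The input list and the equation numbering are assumptions inferred from the prose:

```latex
\begin{align}
Y_{it} &= A_{it}\, K_{it}^{\beta_k} L_{it}^{\beta_l}
  && \text{(production function, presumably Eq. 14.5)} \\
y_{it} &= \beta_{0i} + \beta_k k_{it} + \beta_l l_{it} + \varepsilon_{it},
  \qquad \ln(A_{it}) = \beta_{0i} + \varepsilon_{it}
  && \text{(log form, presumably Eq. 14.6)} \\
y_{it} &= \beta_k k_{it} + \beta_l l_{it} + \omega_{it} + u_{it},
  \qquad \varepsilon_{it} = v_{it} + u_{it},\quad \omega_{it} = \beta_{0i} + v_{it}
  && \text{(decomposed error, presumably Eq. 14.7)}
\end{align}
```

Here $v_{it}$ is the predictable part of the deviation and $u_{it}$ the unpredictable part, so that an estimate of productivity in levels is $\exp(\hat{\omega}_{it})$.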
TFP can then be calculated by exponentiating $\omega_{it}$ and then expressing it as a
function of its relevant determinants such as:
exit decision will depend on the firm’s perception about the distribution of the future
market structure given the information currently available. To achieve consistency, a
number of further assumptions are made. First, the productivity of the firm is
assumed to be the only unobserved state variable, evolving as a first-order Markov
process. Second, a monotonicity assumption is imposed on the investment variable to
ensure invertibility of the investment demand function: investment is increasing
in productivity, conditional on the values of all the state variables. Consequently,
only nonnegative values of investment can be used in the analysis. Moreover, if
industry-wide prices are used to deflate the inputs and output measured in value terms
to proxy their respective quantities, it is implicitly assumed that all firms in the
industry face common prices (Ackerberg et al. 2007).
Overall, the investment decision will depend on capital and productivity as:
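The display that belongs here is missing. In Olley and Pakes (1996) the investment rule and its inversion take the general form below; the notation is chosen to match the rest of this annexure, and how these lines map onto the chapter's own equation numbers is not known:

```latex
\begin{align}
i_{it} &= i_t(k_{it}, \omega_{it})
  && \text{(investment demand, increasing in } \omega_{it}\text{)} \\
\omega_{it} &= h_t(k_{it}, i_{it})
  && \text{(inversion, valid under monotonicity)} \\
y_{it} &= \beta_l l_{it} + \varphi_t(k_{it}, i_{it}) + u_{it},
  \qquad \varphi_t(k_{it}, i_{it}) = \beta_0 + \beta_k k_{it} + h_t(k_{it}, i_{it})
  && \text{(estimating equation)}
\end{align}
```

Substituting the inverted investment rule for unobserved productivity is what allows the first stage to control for $\omega_{it}$ using observables only.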
Estimation of Eq. (14.11) proceeds in two stages (Olley and Pakes 1996). In the
first stage, output (value added) is regressed on the log of labor and capital and a
polynomial function of investment and capital (i and k) to obtain a consistent
estimate of the labor elasticity parameter and of $\varphi_t(k_{it}, i_{it})$, the combined effect of
capital and efficiency or productivity level. As a result, the estimated coefficients on
labor and other included free variables are expected to be lower, since this
corrects for the downward bias in capital (Hall and Mairesse 2007; Van Beveren 2012).
The second stage of the estimation process, which recovers the coefficient on
the capital variable, exploits information on firm dynamics. Specifically,
The second stage of the estimation algorithm is then derived by using the law of
motion.
In contrast to Olley and Pakes (1996), who use investment as a proxy for
productivity, Levinsohn and Petrin (2003) rely on intermediate inputs as the proxy.
In addition, their estimator does not correct for selection bias.
In our study, a hybrid Olley and Pakes (1996) and Levinsohn and Petrin (2003)
estimator was implemented. Specifically, the model is similar to the Olley and
Pakes (1996) estimator in employing investment as a proxy for productivity.
It resembles Levinsohn and Petrin (2003) in that it does not correct for selection
bias. The latter is consistent with the aggregate nature of the data used.
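The two-stage logic described in this annexure can be illustrated on synthetic data. The sketch below is not the authors' implementation (the chapter cites Stata routines); it is a minimal NumPy version that uses investment as the proxy and, like the hybrid estimator, applies no selection correction. All parameter values, function names and the data-generating process are hypothetical, and the second stage uses a simple grid search rather than a formal nonlinear least squares routine.

```python
import numpy as np


def simulate_panel(n_units=200, n_periods=10, beta_k=0.4, beta_l=0.6,
                   rho=0.7, seed=0):
    """Synthetic (periods x units) panel in logs; all values are hypothetical."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 0.4, n_units)     # unobserved productivity
    k = rng.normal(3.0, 0.5, n_units)         # log capital
    data = {name: [] for name in ("y", "k", "l", "i")}
    for _ in range(n_periods):
        i = 0.5 * omega + 0.3 * k             # investment, increasing in omega
        l = 1.0 + 0.6 * omega + rng.normal(0.0, 0.5, n_units)  # labor reacts to omega
        y = 1.0 + beta_k * k + beta_l * l + omega + rng.normal(0.0, 0.1, n_units)
        for name, value in zip(("y", "k", "l", "i"), (y, k, l, i)):
            data[name].append(value)
        # Next-period capital is fixed before the new productivity shock arrives.
        k = 0.8 * k + 0.2 * i + rng.normal(0.0, 0.5, n_units)
        omega = rho * omega + rng.normal(0.0, 0.3, n_units)
    return {name: np.array(values) for name, values in data.items()}


def poly_terms(a, b, degree=3):
    """All monomials a**p * b**q with p + q <= degree (includes the constant)."""
    return np.column_stack([a**p * b**q
                            for p in range(degree + 1)
                            for q in range(degree + 1 - p)])


def first_stage(y, k, l, i):
    """OLS of y on l and a polynomial in (k, i); returns beta_l and fitted phi."""
    P = poly_terms(k.ravel(), i.ravel())
    X = np.column_stack([l.ravel(), P])
    coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    return coef[0], (P @ coef[1:]).reshape(y.shape)


def second_stage(y, k, l, phi, beta_l, grid=np.linspace(0.0, 1.0, 101)):
    """Grid search for beta_k using the first-order Markov process for omega."""
    target = (y - beta_l * l)[1:].ravel()
    k_now, k_lag, phi_lag = k[1:].ravel(), k[:-1].ravel(), phi[:-1].ravel()
    best_b, best_ssr = grid[0], np.inf
    for b in grid:
        omega_lag = phi_lag - b * k_lag       # implied lagged productivity (plus constant)
        G = np.column_stack([np.ones_like(omega_lag), omega_lag, omega_lag**2])
        lhs = target - b * k_now
        coef, *_ = np.linalg.lstsq(G, lhs, rcond=None)
        ssr = np.sum((lhs - G @ coef) ** 2)
        if ssr < best_ssr:
            best_b, best_ssr = b, ssr
    return float(best_b)


panel = simulate_panel()
beta_l_hat, phi_hat = first_stage(panel["y"], panel["k"], panel["l"], panel["i"])
beta_k_hat = second_stage(panel["y"], panel["k"], panel["l"], phi_hat, beta_l_hat)
print(f"beta_l: {beta_l_hat:.3f}, beta_k: {beta_k_hat:.3f}")
```

In this design labor responds to productivity, so a plain OLS regression of output on capital and labor would be biased; the polynomial in capital and investment absorbs productivity in the first stage, and the capital coefficient is recovered in the second stage from the assumption that capital is chosen before the current productivity shock.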
References
UNDP (2011) Towards human resilience: sustaining MDG progress in an age of economic
uncertainty. United Nations Development Programme, Bureau for Development Policy, New
York
Van Beveren I (2012) Total factor productivity estimation: a practical review. J Econ Surv
26(1):98–128
Yasar M, Raciborski R, Poi BP (2008) Production function estimation in Stata using the Olley and
Pakes method. Stata J 8:221–231
Chapter 15
Determinants of Service Sector Firms’
Growth in Rwanda
Abstract The service sector is an avenue for economic transformation as not all
countries have a competitive edge in manufacturing. Findings from micro-level
research on the service sector confirm that ICT integration, a firm’s age, the education
of the owner, the boss’ attitude, family business, networks, new processes, major
improvements, market share, on the job training and know-how significantly and
positively increase the probability of a firm’s growth. Even though the growth rate
of services is currently impressive in the Rwandan economy, no investigations have
been done on the determinants of the growth of the firms in the service sector. This
paper studies the development of services over the years in Rwanda’s economy in
detail and empirically estimates its determinants by using an econometric
methodology. The empirical results are based on micro-data collected by the
Rwanda Enterprise Survey (2011) and the 2014 Establishment Census. The survey
has data on 241 firms and establishments. Linear and limited dependent variable
techniques are employed to investigate the factors behind the development of
service firms. Models are specified and estimated to assess the factors contributing
to sales growth, innovations, and turnovers of service firms. The results show that
the key factors driving the development of service firms in Rwanda include access
to credit, application of ICT, availability of skilled labor, employee development
and acquisition of fixed assets. The results suggest that the government should
uphold the use of ICT in all service firms, promote access to finance for new service
firms and promote on the job training in service firms to speed up Rwanda’s shift
from a low-income to a middle-income state.
E. Uwitonze
Ministry of Gender and Family Promotion (MIGEPROF), MIGEPROF,
Single Project Implementation Unit, Kigali, Rwanda
e-mail: uwitonzeric@gmail.com
A. Heshmati (corresponding author)
Jönköping International Business School (JIBS),
Jönköping University, Jönköping, Sweden
e-mail: almas.heshmati@gmail.com
15.1 Introduction
15.1.1 Background
As per the 2014 Rwanda Services Policy Review, the service sector was the largest
and most dynamic sector in the Rwandan economy. The Rwandan service sector is
subdivided into two broad categories: trade and transport services, and other services.
Trade and transport services include maintenance and repair of motor vehicles,
wholesale and retail trade, and transport services. Other services include hotels and
restaurants; information and communication; financial services; real estate activities;
professional, scientific and technical activities; administrative and support services; public
administration and defense; compulsory social security; education services; human
health; social work services; and cultural, domestic and other services.
The service sector spearheaded Rwanda’s strong economic growth: by 2015 it
accounted for 47% of GDP, compared to 33% for the primary sector (agriculture,
forestry and fisheries), while the growth of services was impressive at around 9% by
2014 against 7% for industry and 4% for agriculture. The contribution of trade and
transport services to GDP increased from 159 billion RWF¹ in 1999 to 784 billion RWF
in 2014, of which wholesale and retail trade accounted for 133 billion RWF in 1999
and 615 billion RWF in 2014. The contribution of other services, including hotels and
restaurants, information and communication and financial services, increased from
430 billion RWF in 1999 to 1505 billion RWF in 2014. Overall, the service sector’s
contribution grew from 563 billion RWF in 1999 to 2290 billion RWF in 2014. Loans to
the service sector authorized by the central bank increased from 1.5 billion RWF in
2010 to 12 billion RWF in 2014. All these statistics are at fixed 2011 prices and
suggest increased attention and public support for the service sector’s development.
The Doing Business in Sub-Saharan Africa Report (2013–2014) ranked Rwanda
second after Mauritius, and its service sector received a big share of foreign private
investments: 41.4% of foreign private investments were allocated to ICT and 12.8%
to tourism, while mining received 13.8%, manufacturing 10.8% and other sectors
a significant 21.7%. Meanwhile, as documented in the Rwandan Vision 2020
document, the service sector is believed to be the engine of Rwanda’s economy,
with a growth rate of 13.5% and a contribution of 42% to GDP.
¹ USD 1 = 746 RWF on 9 March 2016.
The distribution of businesses by economic activity shows that the service sector
achieved positive growth in both rural and urban areas. The main sub-sectors in the
service sector that showed more than 30% growth include accommodation and food
services; human health and social work activities; and art, entertainment and
recreation activities. According to Singh and Kaur (2014), rapid urbanization is
a key factor contributing to the growth of services, which leads us to analyze
the growth of the service sector in urban and rural areas in 2011–2014.
Accommodation and food service activities showed greater growth: in rural areas they had
26,190 registered establishments in 2011 and 36,545 registered establishments in
2014, a 40% increase, whereas in urban areas 7095
Fig. 15.1 Economic activities of private and business-oriented mixed establishments according to
urban/rural areas (2011 and 2014) Source NISR’s Establishment Census (2014)
terms of the number of people employed included wholesale and retail trade, repair
of motorcycles and motor vehicles (with 120,482 employees equivalent to 24.4% of
the total employment), followed by education employing 83,569 (16.9% of the total
employment) and accommodation and food service activities having 82,213
employed people (16.7% of the total employment). These sub-sectors supported the
growth of the service sector since they provided more jobs as compared to other
economic sectors.
Men were predominant in almost all the service sub-sectors except human health
and social work activities where they represented 47.7% of the total employed
while female workers reached 52.3%. A general picture of the share of employment
within the service sector shows that gender inequalities persist: women held only
36.8% of total employment in the service sector, while male workers had the
lion’s share at 63.2%.
Considering women’s share in the total population of Rwanda (53% as compared
to 47% for men), there is hope that the service sector will continue to grow if there
is full participation of women in its employment. Figure 15.2 illustrates the way
employment is divided across economic activities.
Fig. 15.2 Distribution of number of workers and gender structure by economic activity (2014).
Source NISR’s Establishment Census (2014)
Fig. 15.3 GDP by sector activity at constant 2011 prices (in billion RWF). Source National
Institute of Statistics of Rwanda (2014)
The service sector’s economic development is the only way of promoting economic
structural adjustment and accelerating the transformation of economic growth
(Zhou 2015). A declining share of agricultural employment is a key feature of
economic development (Alvarez-Cuadrado and Poschke 2011); structural transformation
usually coincides with a growing role of industry and services in the
economy (UNECA 2015). The growing size of the service sector and its impact on
the other parts of the economy make it all the more important to promote efficiency
in the provision of services, thereby boosting economy-wide labor productivity, as
witnessed in OECD member countries. The slowdown in the service sector brought
down labor productivity growth in the entire economy from more than 4% in 1976–1989 to
less than 2% in 1999–2004 (Jones and Yoon 2008).
Acharya and Patel (2015) confirm that the service sector is the fastest growing
sector in India, contributing significantly to GDP, economic growth, trade and
foreign direct investment (FDI) inflows as the total share of this sector to India’s
GDP is around 65%.
Singh and Kaur (2014) state that the main reasons for the growth in services in
India are rapid urbanization, expansion of the public sector and increased demand
for intermediate and final consumer services. Domestic investments and openness
also positively affect the share of the service sector in GDP, and the main service
sectors attracting FDI in India are telecommunications, construction and hotels and
restaurants. Lee and Malin (2013) say that the service sector has become the main
contributor to GDP not only in developed economies such as the US, Japan and
UK, but also in developing economies such as China, Indonesia, Pakistan and India.
Concluding their study on the determinants of innovation capacity with empirical
evidence from service firms, Madeira et al. (2014) affirm that the greater the
financial investments in the acquisition of machinery, equipment and software; in
internal research and development; in acquisition of external knowledge; and in
marketing activities and other procedures, the greater the propensity of firms to
innovate in terms of services.
According to Park and Shin (2012), the general wisdom is that when a country
industrializes, the shares of the industry and service sectors in both GDP and
employment increase whereas the share of agriculture falls, and when a country
de-industrializes and moves into the post-industrial phase, the share of services
increases while the shares of both industry and agriculture fall. They found that
when computing the contribution of agriculture, industry and services to GDP
growth, in general the service sector made the biggest contribution. Further, the
lower the per capita GDP, the greater the scope of labor productivity growth in
the service sector, which implies that there is still a lot of room for growth in the
productivity of services. Thus, Buera and Kaboski (2009) argue that as productivity
grows, individuals consume new services. Eventually, labor productivity increases
enough which makes the absolute cost advantage of market-production smaller and
Sahu (2015) analyzed micro-data on service sector companies to test high growth in
total factor productivity (TFP) assessing if better factor allocation led to TFP
growth. He found that a reduction in the misallocation of resources in the service
sector resulted in an accelerated pace of TFP growth. Therefore, the communication
and community service industries registered the fastest growth in terms of moving
toward efficient TFP levels. Acharya (2016) examines what accounts for the exceptional
TFP growth performance in some ICT-using industries in the US and in the
Organisation for Economic Co-operation and Development (OECD) countries, where
productivity gains in the production of ICT are often given as an answer. Van
der Marel and Shepherd (2013) confirm that ICT capital and legal institutions are
particularly important determinants of a country’s ability to successfully export
services. Further, the tradability indices are strongly correlated with important
factors such as country productivity and size, factor endowments, trade costs and
regulatory measures.
Geishecker and Görg (2013) claim that measuring both service and material
off-shoring is not straightforward and is greatly limited by the lack of
coherent and comparable information on such activities. Thus, trade
economists usually revert to measuring trade in intermediates as a proxy. In addition,
they assessed the impact of off-shoring activities on individual wages in an industry
which are conceptualized as average hourly gross labor earnings including bonus,
premium and other extra payments. The explanatory variables are demographic and
human capital variables including age, age squared; dummies for the presence of
children and being married; job tenure; tenure squared; a high education indicator;
dummies for occupation; and dummies for firm size and regional dummies. Their
results show that workers in industries with increasing levels of off-shoring services
were likely to experience reduction in their wages. They conclude what would have
been considered as a perfect case of spillovers from ICT using conventional
methods—the impact of research and development and other intangible capital.
Madeira et al. (2014) investigated the main determinants of innovation in the
service sector in the area of innovation activities. They found the use of the logit
model to be appropriate for measuring direct and indirect effects of a selected set of
Capital, labor and knowledge-based capital are key inputs in the production of
goods and services. Salehi-Isfahani (2006) claims that urban households are a
source of growth in human capital in the Middle East and North Africa (MENA)
countries. But households in that region have to contend with the state playing a large role in
the economy, which distorts the incentive to invest in education and the labor
market and in social norms regarding gender. As a result, households invest in an
inefficient portfolio of human capital with dire consequences for long-run growth.
The literature also discusses the relevance of knowledge-based capital in a firm.
Yli-Renko et al. (2001) found that knowledge acquisition was positively associated
with knowledge exploitation for competitive advantages through new product
development, technological distinctiveness and sales cost efficiency. Corporate
entrepreneurship was positively associated with knowledge-based capital (Simsek
and Heavey 2011) and business services can have an effect comparable to the
traditional production factor only when it applies to the service sector (Drejer
2002).
A review of contemporary literature suggests that regulatory, policy and insti-
tutional environments, competition in the product market, spillovers and external-
ities and internalization and globalization are constituents of a business
environment affecting a firm’s performance.
Bouazza et al. (2015) confirm that the key factors of a business environment
affecting Algerian firms are unfair competition from the informal sector; cumber-
some and costly bureaucratic procedures; burdensome laws, policies and regula-
tions; an inefficient tax system; lack of access to external financing; and low human
resources capacity. The main internal factors responsible for unstable and limited
growth include entrepreneurial characteristics, low managerial capacity, lack of
market skills and low technological skills. Gale et al. (2015) confirm the existence
of a negative relationship between the rate of firm formation and the top income tax
rate by finding that a cut in top income tax automatically generates or necessitates
growth.
The economic growth of a country in terms of GDP growth is determined by the
real value-added growth of underlying firms. According to Pop et al. (2014), in an
economic crisis it becomes clear that smaller firms are often capable of
responding faster; they are more targeted, more flexible in the face of fluctuations
in the global economy and better able to withstand the recessionary phase.
Khan (2011) tested the important determinants of a firm’s growth. He highlights
that a firm’s age, the education of the owner, the boss’ attitude, family business,
networks, new processes, major improvements, market share, on the job training
and know-how significantly and positively increased the probability of a firm’s
growth. The age of the owner, foreign trade regulations, taxes, other regulations,
political instability, inflation and lack of skilled labor reduced the
probability of a firm’s growth in terms of employment opportunities. Oliveira and
Fortunato (2008) and Lenaerts and Merlevede (2015) claim that a firm’s growth is
mainly explained by the firm’s age and size.
Existing literature states that expenditure on ICT has a positive impact on
exports of producer services (Guerrieri and Meliciani 2004) and sees ICT as the
bedrock of improving business processes, customer relations and efficient delivery
of goods and services to satisfy customer needs (Atom 2013). According to
Bethapudi (2013), ICT integration provides a powerful tool that brings advantages
to promoting and strengthening the tourism industry. Mihalic et al. (2015) mention
that ICT is also becoming an important factor in business and competitiveness
because of, as discussed by Borghoff (2011), its influence on the three
sub-processes of globalization: internationalization, global network building and
global evolutionary dynamics.
As for ICT applicability in the service sector, its role is crucial in facilitating
trade (Gupta 2012). According to Liu and Nath (2013), the trade-enhancing effect of
ICT lies in its use: Internet subscriptions and Internet hosts have a significant positive
effect on both exports and imports. ICT in transport services plays a decisive role in
reducing energy consumption and CO2 emissions in the road transport sector
(Gupta 2012).
According to Agwu and Carter (2014) the use of mobile banking and automatic
teller machines (ATMs) has made financial services easily accessible and has
reduced costs for both customers and financial service providers in Nigeria.
Information technology has enabled banks to understand and serve customers better
than their competitors; they have developed and improved new products for cus-
tomers and further improved processes and relationships with customers and
business partners (Muro et al. 2013).
Arnold et al. (2016) demonstrate the presence of a link between India’s policy
reforms in services and the productivity of manufacturing firms. They find that
banking, telecommunications, insurance and transport reforms have all had
significant effects on productivity in manufacturing firms; these effects tend to be
stronger on foreign owned firms.
El-Said and Kattara (2013) researched the application of information technology
versus human interaction services in an Egyptian hotel. They found that customers
preferred to contact an employee rather than depending on technology-based
self-services in a majority of service encounters. In Uganda, more than 80% of the
households were employed in tourism services. Tourism employment can provide
initial capital for supplementary activities.
Heshmati and Kim (2011) came to the conclusion that the competitiveness in
Korea’s service industry can be driven by an incentive system for skilled workers
and investing more in research and development in order to increase labor pro-
ductivity. In addition, the Korean government should implement an open market
policy to liberalize labor movement and induce low paid labor to move to the
production process to a large extent.
From the macroeconomic point of view, the growing size of the service
sector and its impact on the other parts of the economy make it all the more
important to promote efficiency in the provision of services, thereby boosting
economy-wide labor productivity (Jones and Yoon 2008). The main reasons for the
growth in services are rapid urbanization, domestic investments, openness,
expansion of the public sector and increased demand for intermediate and final
consumer services (Singh and Kaur 2014). The lower the per capita GDP, the
greater the scope for labor productivity growth in the service sector, which implies
that there is still a lot of room for growth in the productivity of services (Park and
Shin 2012).
Our microeconomic literature review suggests that the development of service
firms is mainly supported by knowledge acquisition (Yli-Renko et al. 2001),
knowledge-based capital (Simsek and Heavey 2011), on the job training and
know-how, skilled labor and ICT applicability (Gupta 2012). Firms benefit
immensely from spending on their human capital because this investment adds
value to their companies (Jafaridehkord et al. 2015). The effect of knowledge
networking on firm growth is significantly larger for service firms than for manu-
facturing firms since it positively affects net asset and value-added growth of ser-
vice firms (Schoonjans et al. 2013).
To conclude, the literature claims that the important determinants of a firm’s
growth include the firm’s age, the education of the owner, the boss’ attitude,
family business, networks, new processes, major improvements, market share,
on-the-job training and know-how, all of which significantly and positively
increase the probability of the firm’s growth (Khan 2011).
15.3 Methodology
2 ISIC classifies services into sections G to U, which include wholesale and retail
trade and repair of motor vehicles and motorcycles; transport and storage;
accommodation and food service activities; information and communication; financial
and insurance activities; real estate activities; professional, scientific and technical
activities; administrative and support service activities; public administration and
defense and compulsory social security; education; human health and social work
activities; arts, entertainment and recreation; other services’ activities; activities of
households as employers and undifferentiated goods and services producing activities
of households for own use; and activities of extra-territorial organizations and
bodies (UN 2008).
344 E. Uwitonze and A. Heshmati
and a higher degree of joint decision making among the owners and managers of
small firms (Reuber and Fischer 2002). Sustaining economic growth and improving
living standards requires shifting labor into both the manufacturing and service
sectors (Eichengreen and Gupta 2011).
A firm’s growth is conceived as an increase in the product or service as the main
business, increase in sales, increase in the number of new employed persons and the
size of the establishment in the service sector. Smith and Verner (2006) found that
the proportion of women in top management jobs had a positive effect on a firm’s
performance and that the effect depended on the qualifications of female top
managers in Denmark. Dawkins et al. (2007) argue that both large firms and those
which are highly specialized, enjoy higher profit margins, whereas the more capital
intensive the firm the lower its profitability.
$$\text{Total sales growth} = f\left(\text{employment cost},\ \text{working capital},\ \text{ICT},\ \text{firm innovation criteria},\ \text{acquisition of fixed assets}\right) \tag{15.1}$$
The second hypothesis is that the service sector’s development is reflected in its
innovations that are expressed in the introduction of new products or services. In
the Rwanda Enterprise Survey (2011), firms were asked whether they had introduced
new products or services in the last three years. The variable on the introduction
of new products or services, which is conceived as innovation, is taken as the
dependent variable. Independent variables include internal research and development
(R&D) activities, external or internal acquisition of research and development
(ext. R&D) as time given to employees in a service firm to develop or try out a new
approach or a new idea about products or services, business process, firm
management, marketing, training, access to finance as illustrated by the acquisition of
fixed assets and a firm’s characteristics in terms of size. The null hypothesis suggests
that these factors do not influence service innovation, while the alternative
hypothesis suggests that they have a positive effect on service innovation of new
products and services. The model to investigate the factors affecting service
innovation is structured as:
$$\text{Service innovation} = f\left(\text{R\&D},\ \text{ext. acquisition of R\&D},\ \text{acquisition of training},\ \text{acquisition of fixed assets},\ \text{other firms' criteria}\right) \tag{15.2}$$
The third hypothesis is that the turnover of a service firm is affected by a number
of factors such as the capital used, openness conceived as buying and selling outside
the country, the manager’s gender, paying value-added tax, paying income tax and
the service sub-sector. The turnover of a service firm is defined as the amount of
money that is received in sales. In the Establishment Census (2014), the information
collected on this variable is classified in categories where the first category includes
all firms with turnovers less than 300,000 RWF, the second category includes all
firms with turnovers ranging from 300,000 RWF to 12 million RWF, the third
category has all firms with turnovers ranging from 12 million to 50 million RWF
and the last category includes all firms with turnovers more than 50 million RWF.
The dependent variable is therefore categorical. Categorizing the turnover leads to
a loss of information within each category; it also sheds light on differences in
performance between categories and on the variations in their determinants.
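The four turnover classes described above can be sketched as a small lookup. This is only an illustration: the thresholds come from the text, but the function name and the choice to treat each threshold as the lower bound of the next class are assumptions, since the census wording does not say which class a boundary value belongs to.

```python
# Turnover thresholds in RWF, as listed in the text.
# Assumption: a value exactly on a threshold falls into the higher class.
TURNOVER_BOUNDS_RWF = [300_000, 12_000_000, 50_000_000]


def turnover_category(turnover_rwf):
    """Return the turnover class (1 to 4) for a turnover given in RWF."""
    category = 1
    for bound in TURNOVER_BOUNDS_RWF:
        if turnover_rwf >= bound:
            category += 1
    return category


# Hypothetical firms, one per class.
for value in (150_000, 5_000_000, 20_000_000, 80_000_000):
    print(value, turnover_category(value))
```

Coding the classes as ordered integers keeps the ranking of the categories while discarding the exact turnover within each class, which is the loss of information noted above.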
The first dummy variable on openness contains information on whether a firm
sells or buys goods or services abroad. The second dummy variable ‘gender’
defines whether the manager of a firm is female or male. The third dummy variable
on value-added tax (VAT) contains information on whether or not the firm pays
VAT. The fourth dummy variable has information on whether or not the firm pays
income tax. There is also a factor variable on the service sub-sector where 7 stands
for wholesale and retail trade and repair of motor vehicles and motorcycles, 8 stands
for transportation and storage, 9 stands for accommodation and food service
activities, 10 stands for information and communication, 11 stands for financial and
insurance activities and 12 stands for real estate activities. The other factor variable
‘capital’ contains information classified in categories in such a way that the first
category considers firms using less than 500,000 RWF as capital, the second using
500,000 to 15 million RWF, the third using 15 million to 75 million RWF and the
last category using capital more than 75 million RWF. Thus, this is a categorical
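variable. Factors affecting change in turnover are constructed with the variables
mentioned earlier and are expressed as:
$$\text{Turnover} = f\left(\text{capital},\ \text{openness},\ \text{manager's gender},\ \text{value-added tax},\ \text{income tax},\ \text{service sub-sector}\right) \tag{15.3}$$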
As discussed earlier, sales are used as an indicator of a firm’s growth, and
turnover measures the same quantity. In our study, sales and turnover are modelled
with different specifications because the datasets used are different. Otherwise,
they would share the same model specification since the two terms can be used
interchangeably.
The model on the sales of service sector firms is constructed with the variables
used in the collection of data during the 2011 Rwanda Enterprise Survey by the
National Institute of Statistics of Rwanda in partnership with the World Bank.
Because this database contained missing values, we constructed a model on turnover
with the variables used to collect information in the Establishment Census
(2014) by the National Institute of Statistics of Rwanda. This was done to track the
main factors affecting sales or turnover.
For the innovation model, we used the same database as the sales model because
the 2011 Enterprise Survey attached more interest to the innovation factor in the
performance of firms. Only the predictors of the innovation model can appear in the
sales model in order to prove the contribution of innovation in the growth of sales
of service firms.
Data about the performance of Rwanda’s service sector used in this study was
provided by the National Institute of Statistics of Rwanda. The data came from two
important data collection channels—the 2010–2012 Enterprise Survey in Rwanda
and the 2014 Establishment Census.
The Enterprise Survey focuses on the many factors which shape the business
environment and is useful for both policymakers and researchers. The Enterprise
Survey is conducted by the World Bank and its partners across all geographic
regions and covers small, medium and large companies. The sample is consistently
defined in all countries and includes the entire manufacturing sector, the service
sector and the transport and construction sector. The 2011 Rwanda Enterprise
Survey covered 241 firms including 159 service firms and 82 manufacturing firms.
The cleaned raw database contains 148 firm observations each with 247 variables
describing various aspects of the firms and their activities (WB 2014).
The Rwanda Establishment Census (2014) consists of a complete count of all
establishments practicing specific economic activities in Rwanda except
not-for-sale government services. It covered themes such as economic activity,
legal status, registration of establishment, taxation, capital employed, regular
operation accounts, socioeconomic characteristics of an establishment’s staff,
payment status and sex of employees. The dataset contains 154,236 cases with 91
variables (NISR 2014).
The dependent variable is service firm growth which is measured by several
attributes such as turnover/sales, employment, assets, market shares and profits. The
Rwanda Enterprise Survey (2011) provides data on total sales for three years and
the 2010 fiscal year and data on the introduction of new products or services which
are a measure of innovation output in the previous three years. Factors affecting
total sales, growth of employment and service innovation determine the develop-
ment of the service sector. Literature highlights key measures of a firm’s growth as
sales, employment and innovation. Zhou and Wit (2009) and Isaga (2015) used
sales and employment to measure the growth of a firm since they reflect both
short-term and long-term changes in a firm.
In the model on service innovation, the dependent variable is a binary variable
on the introduction of new products or services in three years from 2010. According
to Neely and Hii (1998), innovation has a direct impact on the competitiveness of a
firm. The values created by innovations are often manifested in new ways of doing
things or new products and processes that contribute to wealth. In their studies,
Arvanistis and Stucki (2012) and Madeira et al. (2014) used a firm’s innovations for
measuring growth because it is argued that innovation start-ups are important dri-
vers of economic growth.
The model on turnover uses a categorical dependent variable where the turnover
of a firm is classified into four categories as described earlier. Ordinal scales
with many categories (five or more), as well as interval and ratio scales, are usually
analyzed using the traditional approaches of statistical tests (Newsom 2013).
Independent variables in new service development are classified into four cat-
egories—firm characteristics, innovation characteristics, managerial characteristics
and business environment. In this study, a firm’s characteristics consider the firm’s
size, gender composition and legal status. Considering firm size, Madeira et al.
(2014) found a positive and increasing effect of firm size on firm innovation.
Medium-sized firms showed greater propensity to innovate than small sized firms.
Innovation characteristics include market conditions, new management prac-
tices, new market methods, spending on research and development activities, a
service firm’s employees’ development, a firm’s access to finance expressed in the
acquisition of fixed assets and degree of competition. Acs and Audretch (1988) and
Prajogo and Sohal (2006) claim that there is a positive relationship between
innovation and research and development activities of firms.
Managerial characteristics are pointed out with the top managers’ levels of
education and the years of their working experience in the service sector. Education
is measured by level of education attained classified as: no education, primary
school, secondary school, vocational training, some university training and graduate
degree. Queiro (2016) found that firms which switch to more educated managers
experience sharp increases in growth relative to comparable firms managed by less
experienced managers. More educated managers increase the use of incentive pay
and are likely to report new products and services and incorporate new technolo-
gies. The correlation matrix of the dependent and independent variables is given in
Appendix 1.
Madeira et al. (2014) have argued that a firm’s capacity to innovate is a complex
phenomenon influenced by a wide range of factors. Thus, the logistic regression
(logit model) helps to study the statistical relationship of the dependent variable in
relation to more than one determinant variable. Stock and Watson (2011) discuss a
regression with a binary dependent variable and conclude that when dependent
variable Y is binary, the population regression function is the probability that Y = 1,
conditional on the regressors. The resulting predicted values are predicted proba-
bilities and the estimated effect of a change in regressor X is the estimated change in
the probability that Y = 1 arising from the change in X. The standard estimation
method is maximum likelihood, and inference on its estimates proceeds in the same
way as it does in linear multiple regressions.
In our study, the dependent variable for service innovation is conceived as the
introduction of new products or services; it is a binary variable that takes the value
0 if a firm did not introduce a new product or service and 1 if it did. The same
coding applies to the binary independent variables.
According to Verbeek (2004), who discusses models with limited dependent
variables, when the dependent variable is zero for a substantial part of the popu-
lation but positive for the rest of the population with many different outcomes, the
logistic regression model is particularly suited for these types of variables. Since a
violation of the distributional assumptions leads to inconsistent maximum likelihood
estimators, tests for misspecification should be conducted and the necessary measures
undertaken.
To estimate the total sales growth in service firms, we used the multivariate
regression analysis since growth is expected to be analyzed in the three years’ total
annual sales of a service firm. We need to track the factors that contributed to the
change in total annual sales in service firms. In this case, using the linear regression
model is helpful.
Empirical models for an analysis of the service sector’s development and its
determinants in Rwanda are expressed on the basis of total annual sales, service
sector innovativeness and service sector turnovers to track the factors influencing
the dependent variables. Starting with the factors affecting sales in service firms
(Model 1), we can construct the multivariate regression model as:
$$\text{Sales}_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10} + \beta_{11} x_{11} + \beta_{12} x_{12} + \varepsilon_i \tag{15.4}$$
In this model, the dependent variable ‘Sales’ stands for the level of total sales
given the values of the x’s, which are the independent or determinant variables. x1
stands for the total annual cost of labor including wages, salaries, bonus and social
security payments as the performance expression in service firms, x2 stands for the
size of the most recent loan or line of credit approved as a source of finance, x3
stands for a dummy variable on the use of Internet expressed by e-mails to communicate
with clients or suppliers as an ICT application, x4 stands for a dummy variable on
employees’ development activities through new ideas or approaches about products
or services, x5 stands for a dummy variable on the spending on formal research and
development activities to create new products or to find more efficient methods of
production, x6 stands for a dummy variable on innovation expressed as the introduction
of new products or services, x7 stands for a dummy variable on engaging in
internal or external training of personnel, x8 stands for a dummy variable on the
acquisition of fixed assets such as machinery, vehicles, equipment, land or buildings,
x9 stands for a dummy variable on new or significantly improved methods
of offering services, x10 stands for a dummy variable on new or significantly
improved logistical or business support processes, x11 stands for a dummy variable on
new or significantly improved marketing methods and x12 stands for a
dummy variable on new or significantly improved organizational structure or
management practices, while εi is the error term.
The coefficients are represented by the symbol β with subscripts from 0 to 12,
corresponding to the constant and the twelve independent variables. The null
hypothesis is H0: βi = 0, that is, β1 = β2 = … = β12 = 0, in which case no independent
variable has any effect on the total annual sales of service firms. The alternative
hypothesis is H1: βi ≠ 0, meaning that the independent variables change the total
annual sales of service firms. A positive coefficient is interpreted as a positive
effect on sales and a negative coefficient as a negative effect. Thus, the main
focus is on the properties of the effects namely the signs of the effects and their
consistency with our expectations, the size of the effects and their statistical sig-
nificance. The model can also be specified in the form of changes in sales between
two years or labor productivity that is sales per employee.
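To illustrate how a specification like Eq. (15.4) is estimated, the following minimal sketch fits ordinary least squares through the normal equations in plain Python. The data are synthetic and noise-free (one continuous regressor and one dummy), and the helper names `solve_linear_system` and `ols` are illustrative assumptions; this is not the authors’ estimation code or data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular (no perfectly collinear regressors).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Return [b0, b1, ...] minimizing the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # prepend the constant
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic example: log sales = 3 + 0.7 * log cost + 1.2 * dummy.
rows = [[1.0, 0], [2.0, 1], [3.0, 0], [4.0, 1], [5.0, 1]]
y = [3.0 + 0.7 * x1 + 1.2 * d for x1, d in rows]
coefficients = ols(rows, y)
print([round(c, 6) for c in coefficients])
```

In applied work a statistics package would normally be used instead, since it also reports the standard errors and t-statistics needed to test H0 against H1.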
The innovation model was also used to assess the determinants of service sector
innovativeness which can influence firms’ growth. The model for service innova-
tion (Model 2) is specified as:
$$\Pr(Y = 1 \mid z) = \varphi_0 + \varphi_1 z_1 + \varphi_2 z_2 + \varphi_3 z_3 + \varphi_4 z_4 + \varphi_5 z_5 + \varphi_6 z_6 + \varphi_7 z_7 + \varphi_8 z_8 + \varphi_9 z_9 + \varphi_{10} z_{10} + \mu_t \tag{15.5}$$
The probability that the service firms introduced new products or services is
portrayed with Y as the binary dependent variable. The symbol z with subscripts
ranging from 1 to 10 stands for different independent variables or determinants of
innovativeness that are thought to have an effect on the extent to which a firm
innovates.
As conceived in Eq. (15.5), z1 stands for new or significantly improved methods
of offering services, z2 stands for a dummy variable on new or significantly
improved logistical or business support processes, z3 stands for a dummy variable on
new or significantly improved marketing methods, z5 stands for a dummy
variable on spending on formal research and development activities to create new
products or to find more efficient methods of production, z6 stands for a dummy
variable on employees’ development activities through new ideas or approaches
about products or services, z7 stands for a dummy variable on engaging in internal
or external training of personnel, z8 stands for a dummy variable on the acquisition
of fixed assets such as machinery, vehicles, equipment, land or buildings, z9 stands
for a dummy variable on having a line of credit or a loan from a financial institution,
z10 stands for a factor variable on the firm size defined as small (5–19 employees),
medium (20–99 employees) and large (100 employees and above) and μt stands for
the random error term.
For this model, the null hypothesis, H0: φi = 0, implies that the independent
variables do not affect or generate the introduction of new products or services and
the alternative hypothesis, H1: φi ≠ 0, suggests that the independent variables have
an effect on the introduction of new products or services. Although maximum
likelihood estimators have the property of being consistent, the likelihood function
has to be correctly specified for this to hold. The most convenient framework for
testing this specification is the Lagrange multiplier framework (Verbeek 2004).
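Equation (15.5) is written in linear form, whereas a logit model, as named in the text, passes the same linear index through the logistic function. The standard textbook form is sketched below for reference. The notation is an assumption made for illustration: the coefficients of Eq. (15.5) are written as φ_j, firms are indexed by i = 1, …, N, and Λ denotes the logistic distribution function.

```latex
\Pr(Y_i = 1 \mid z_i) = \Lambda(v_i),
\qquad v_i = \varphi_0 + \sum_{j=1}^{10} \varphi_j z_{ij},
\qquad \Lambda(v) = \frac{1}{1 + e^{-v}}

\ln L(\varphi) = \sum_{i=1}^{N} \Big[ y_i \ln \Lambda(v_i) + (1 - y_i) \ln\big(1 - \Lambda(v_i)\big) \Big]
```

Maximum likelihood chooses the φ_j that maximize ln L. Under this form, a coefficient φ_j shifts the log-odds of innovating rather than the probability itself.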
Turnover as a measure of growth is used to assess the factors that influence it in
the service sub-sectors. The model on the service firm turnover (Model 3) is
constructed as:
$$\text{Turnover} = \theta_0 + \theta_1 X_1 + \theta_2 X_2 + \theta_3 X_3 + \theta_4 X_4 + \theta_5 X_5 + \theta_6 X_6 + \varepsilon_i \tag{15.6}$$
The level of the turnover of service firms given the predictors Xi is the dependent
variable in this model and the coefficients are symbolized by θ with subscripts 0–6.
The independent variable X1 stands for the gender of the manager, X2 stands for
openness in the service firm as selling and buying goods or services abroad, X3
stands for value-added tax, X4 stands for tax on income, X5 stands for a categorical
variable on the main service sub-sector, X6 stands for a categorical variable
on the capital used by the service firm and εi represents the error term. The null
hypothesis, H0: θ = 0, implies that the independent variables have no effect on the
level of turnover in service firms. The alternative hypothesis, H1: θ ≠ 0, implies
that the independent variables affect the level of turnover in service firms. The sign
of each coefficient is checked for consistency with expectations.
The results of the multivariate linear regression of the service sales model (Model 1)
are presented in Table 15.1. At the 5% significance level, the coefficients on
employment cost, loan size, Internet use and employees’ development are
statistically significant; all have a positive effect on the growth in sales except
employees’ development, whose effect is negative. For these variables, we therefore
reject the null hypothesis. Other coefficients are statistically insignificant, thus we
fail to reject the null hypothesis for them. Innovation, training, acquisition of fixed
assets, new methods, new practices, new marketing and new logistics do not have a
statistically significant effect on total annual sales. The R2 is 0.84, meaning that the
independent variables explain 84% of the variation in the sales of service firms.
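The fit statistics quoted above can be recomputed from the sums of squares in Table 15.1. The short check below is a sketch that uses only the table’s reported values; the variable names are illustrative.

```python
# Values reported in the analysis-of-variance panel of Table 15.1.
model_ss = 152.5216      # explained sum of squares
residual_ss = 28.1634    # residual sum of squares
n_obs = 48               # number of observations
k = 12                   # number of regressors (excluding the constant)

total_ss = model_ss + residual_ss
residual_df = n_obs - k - 1

r_squared = model_ss / total_ss
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df
f_statistic = (model_ss / k) / (residual_ss / residual_df)
root_mse = (residual_ss / residual_df) ** 0.5

print(round(r_squared, 4), round(adj_r_squared, 4))
print(round(f_statistic, 2), round(root_mse, 4))
```

The adjusted R-squared is lower than the R-squared because it penalizes the twelve regressors relative to the 48 observations.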
The results of the logistic regression of the service innovation model (Model 2)
are given in Table 15.2. The results for the innovation model show that the
independent variables on new or improved methods of offering services, engaging
in internal or external training and acquisition of fixed assets are statistically
significant at 5%, that is, they affect the service firms’ innovation. Thus, we reject the
null hypothesis for these variables. The other variables in the model are statistically
insignificant, so we find no evidence that they affect the innovativeness of the
service sector.
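As a worked illustration of how one row of Table 15.2 is read, the sketch below recomputes the z statistic and the 95% confidence interval for the ‘New methods’ coefficient and converts the coefficient to an odds ratio. The odds-ratio reading is the usual interpretation of a logit coefficient and is added here as an assumption; the chapter itself reports coefficients only. The function name is illustrative.

```python
import math

Z_CRIT_95 = 1.959964  # two-sided 5% critical value of the standard normal


def wald_summary(coef, std_err):
    """Return z statistic, 95% confidence bounds and odds ratio."""
    z = coef / std_err
    lower = coef - Z_CRIT_95 * std_err
    upper = coef + Z_CRIT_95 * std_err
    odds_ratio = math.exp(coef)
    return z, lower, upper, odds_ratio


# 'New methods' row of Table 15.2: coefficient 1.0971, standard error 0.4907.
z, lower, upper, odds_ratio = wald_summary(1.0971, 0.4907)
print(round(z, 3), round(lower, 4), round(upper, 4), round(odds_ratio, 3))
```

Because the interval excludes zero, the coefficient is significant at the 5% level, which matches the p-value of 0.0254 in the table.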
Testing the fit of the model, we find that the AIC is lower than the BIC, which
implies that our model fits well (see Table 15.3). The logistic model of innovation
correctly classifies 76.58% of the observations. The likelihood ratio test, evaluated
at a log likelihood of −80.4422, gives Chi2 (1) = 1.63 with Prob > Chi2 = 0.2015,
which is insignificant at the 5% level, implying that the model is fully fitted
(Appendix 2). According to Scott (1997), the LR test assesses constraints by
comparing the log likelihood of the unconstrained model to the log likelihood of
the constrained model. If the constraint significantly
Table 15.1 Linear regression of service sales model (Model 1) and its determinants

Source      SS          df    MS          No. of obs     = 48
Model       152.5216    12    12.7101     F(12, 35)      = 15.80
Residual     28.1634    35     0.8046     Prob > F       = 0.000
Total       180.6851    47     3.8443     R-squared      = 0.8441
                                          Adj. R-squared = 0.7907
                                          Root MSE       = 0.8970

Log total sales    Coef.      Std. Err.   t         p > |t|   [95% conf. interval]
Log employ cost     0.7220    0.1079       6.689    0.0000     0.5029     0.9412
Log loan size       0.2361    0.0852       2.771    0.0089     0.0631     0.4090
Internet use        1.2684    0.5292       2.397    0.0220     0.1940     2.3428
Employee dvt       −1.0810    0.4163      −2.596    0.0137    −1.9262    −0.2358
Research devpt     −0.9456    0.3223      −2.934    0.0059    −1.5999    −0.2914
Innovation          0.1124    0.4208       0.267    0.7910    −0.7419     0.9668
Trainings          −0.1875    0.4509      −0.416    0.6801    −1.1028     0.7278
Fixed asset        −0.2912    0.3970      −0.733    0.4681    −1.1028     0.5147
New methods        −0.6576    0.4593      −1.432    0.1611    −1.5901     0.2749
New practices       0.1796    0.4846       0.371    0.7132    −0.8043     1.1634
New marketing      −0.6149    0.3739      −1.645    0.1697    −1.3740     0.1442
New logistics       0.7042    0.5023       1.402    0.1697    −0.3155     1.7239
_Cons               3.4132    1.5283       2.233    0.0320     0.3105     6.5159
Table 15.2 Logistic regression model of innovation performance (Model 2) and its determinants

Logistic regression                       Number of obs = 158
                                          LR chi2 (12)  = 46.28
                                          Prob > chi2   = 0.0000
Log Likelihood = −81.257932               Pseudo R2     = 0.2217

Innovation       Coef.      Std. Err.   z         P > |z|   [95% conf. interval]
New methods       1.0971    0.4907       2.236    0.0254     0.1354     2.0587
New logistics     0.2143    0.5451       0.393    0.6943    −0.8542     1.2827
New practices    −0.1162    0.5654      −0.205    0.8372    −1.2243     0.9920
New marketing    −0.2969    0.4911      −0.605    0.5454    −1.2595     0.6656
Research dvpt     0.2238    0.4919       0.455    0.6491    −0.7402     1.1878
Employee dvpt     0.8771    0.4861       1.804    0.0712    −0.0757     1.8399
Training          0.9657    0.4720       2.046    0.0408     0.0406     1.8909
Fixed asset      −1.1771    0.4449      −2.646    0.0082    −2.0491    −0.3051
Loan              0.6215    0.4092       1.519    0.1288    −0.1805     1.4234
(continued)
Table 15.3 Summary of post-estimation of Akaike’s and Bayesian information criteria (AIC, BIC)

Model   Obs   ll(null)   ll(model)    df   AIC        BIC
–       158              −80.44222    14   188.8844   231.7608
reduces the likelihood, then the null hypothesis is rejected. The results of an
alternative skewed logistic regression of innovation are presented in Appendix 3.
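The information criteria in Table 15.3 and the likelihood ratio statistic quoted in the text can be reproduced from the reported log likelihoods. The sketch below uses the standard definitions AIC = −2 ln L + 2k and BIC = −2 ln L + k ln N. Pairing −81.257932 (Table 15.2) with −80.44222 (Table 15.3) as the constrained and unconstrained models is an assumption inferred from the reported Chi2 (1) = 1.63.

```python
import math

ll_unconstrained = -80.44222   # log likelihood reported in Table 15.3
ll_constrained = -81.257932    # log likelihood reported in Table 15.2
n_params = 14                  # df reported in Table 15.3
n_obs = 158

aic = -2 * ll_unconstrained + 2 * n_params
bic = -2 * ll_unconstrained + n_params * math.log(n_obs)

# Likelihood ratio statistic with one restriction.
lr_stat = 2 * (ll_unconstrained - ll_constrained)
# Upper-tail probability of a chi-square with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(round(aic, 4), round(bic, 4))
print(round(lr_stat, 2), round(p_value, 4))
```

For any sample with more than about seven observations, ln N exceeds 2, so the BIC penalty is larger than the AIC penalty for the same model.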
In order to estimate the service turnover model, we used ordered logistic regression
because turnover, the dependent variable, is defined as an ordered categorical
variable. If the primary interest is understanding how the explanatory variables affect
the conceptual dimension represented by an ordinal variable, an ordinal model is
appropriate. The results of an ordinal logistic model are read in the same way as
those of a traditional logistic model with the exception that there are cut points
instead of a constant (Powers and Xie 1999).
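To make the role of the cut points concrete, the sketch below computes the category probabilities of an ordered logit for four ordered classes. The cut-point values are hypothetical, since the chapter does not report the estimated ones. The index for the second firm adds the value-added tax and openness coefficients from Table 15.4 purely as an example.

```python
import math


def logistic_cdf(v):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-v))


def ordered_logit_probabilities(index, cut_points):
    """Return P(category = 1..J) for a linear index and J-1 ordered cut points."""
    cumulative = [logistic_cdf(c - index) for c in cut_points] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical cut points separating the four turnover classes.
HYPOTHETICAL_CUT_POINTS = [1.0, 4.0, 6.0]

# Baseline firm (index 0) versus a firm that pays VAT and trades abroad.
probs_low = ordered_logit_probabilities(0.0, HYPOTHETICAL_CUT_POINTS)
probs_high = ordered_logit_probabilities(1.8273 + 0.7192, HYPOTHETICAL_CUT_POINTS)
print([round(p, 3) for p in probs_low])
print([round(p, 3) for p in probs_high])
```

A positive coefficient raises the index and therefore moves probability mass from the lowest turnover class towards the highest one.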
The results presented in Table 15.4 indicate that the coefficients of gender,
openness, value-added tax, income tax, capital used and service sub-sectors 8, 9 and
11 are statistically significant, meaning that they influence the level of turnover of
a service firm. The others are statistically insignificant, which implies that they have
no statistically significant effect on the change in the level of turnover.
This section gives an interpretation and analysis of the results for the three models
specified and estimated earlier. From this, we can gain advanced knowledge about
the constituents of the service sector and the determinants contributing to the
development of this sector. Service sector development is measured by considering
key measures of a firm’s performance and growth such as innovation, sales and
Table 15.4 Ordered logistic regression of service turnover model (Model 3) and its determinants

Logistic regression                       Number of obs = 35575
                                          LR chi2 (12)  = 17,932.95
                                          Prob > chi2   = 0.0000
Log Likelihood = −21409.823               Pseudo R2     = 0.2952

Turnover           Coef.      Std. Err.   z          p > |z|   [95% conf. interval]
Gender manager     −0.0624    0.0280      −2.224     0.0262    −0.1174    −0.0074
Openness            0.7192    0.0891       8.075     0.0000     0.5447     0.8938
Value-added tax     1.8273    0.0816      22.380     0.0000     1.6672     1.9873
Income tax          0.2105    0.0479       4.394     0.0000     0.1166     0.3043
Ssubsectors
  8                 0.7318    0.2213       3.306     0.0009     0.2980     1.1656
  9                −0.3654    0.0277     −13.193     0.0000    −0.4197    −0.3111
  10               −0.0246    0.2351      −0.105     0.9166    −0.4854     0.4361
  11                1.9284    0.1207      15.983     0.0000     1.6920     2.1649
  12               −0.4586    1.1115      −0.413     0.6399    −2.6371     1.7200
Capital
  2                 2.7719    0.0334      82.892     0.0000     2.7063     2.8374
  3                 5.3948    0.1121      48.128     0.0000     5.1751     5.6145
  4                 6.4496    0.1464      44.058     0.0000     6.1626     6.7365
turnover, and these are taken to be dependent variables for forming and estimating
the models. The growth in sales of service firms contributes to the growth of the
service sector’s share in Rwanda’s GDP. Innovations bring in new products or
services which in turn push the growth of the sector. The factors influencing growth
in sales, service innovativeness and turnover are used to find the drivers of service
sector development. These determinants are taken into consideration in shaping and
sustaining the service-led economy path as it is a national strategy for economic
growth.
Estimation results of the linear regression of the sales model indicate that
employment costs, size of the approved loan and use of Internet positively affected
the change in sales of service firms for the period 2008–2010. Growth in
employment is a good indicator of a firm’s performance whereby the cost of
employment for three years is positively reflected in total sales. A 1% change in
costs attributed to employment resulted in a 0.72% change in sales in service firms,
other things holding constant.
UNECA (2015) reported that financial services are the oil of transactions and
provide access to credit for investments for most other businesses. This is proven by
the fact that in our model on sales, the size of the most recent loan or line of credit
approved was positively correlated to the change in total sales of service firms.
Other things holding constant, a 1% change in the size of the loan resulted in a
0.236% change in the total annual sales of a service firm.
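Because both sales and the continuous regressors enter Table 15.1 in logarithms, the coefficients on employment cost and loan size are elasticities. The sketch below, with an illustrative function name, compares the usual ‘1% leads to b%’ reading with the exact implied change, which matters only for larger changes.

```python
def pct_change_in_sales(elasticity, pct_change_x):
    """Exact percentage change in sales implied by a log-log coefficient."""
    return ((1 + pct_change_x / 100) ** elasticity - 1) * 100


# Coefficients from Table 15.1.
EMPLOYMENT_COST_ELASTICITY = 0.7220
LOAN_SIZE_ELASTICITY = 0.2361

cost_1pct = pct_change_in_sales(EMPLOYMENT_COST_ELASTICITY, 1.0)
loan_1pct = pct_change_in_sales(LOAN_SIZE_ELASTICITY, 1.0)
cost_10pct = pct_change_in_sales(EMPLOYMENT_COST_ELASTICITY, 10.0)

print(round(cost_1pct, 3), round(loan_1pct, 3), round(cost_10pct, 2))
```

For a 1% change the exact figures are practically identical to the coefficients, while a 10% rise in employment cost implies slightly less than ten times the 1% effect.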
Liu and Nath (2013) argue that the trade-enhancing effect of ICT infrastructure
or ICT capability depends on its use. Internet subscriptions and Internet hosts have
significant positive effects on both exports and imports. In our model on sales, the
use of e-mails to communicate with clients or suppliers expressed as Internet use
had a positive relationship to total sales as has also been found in previous studies.
Because Internet use is a dummy variable, its coefficient means that, holding other
things constant, firms using e-mail had log total sales 1.268 higher than firms that
did not.
Both employees’ development and research and development activities were
negatively correlated with a change in the sales generated in service firms. Because
both are dummy variables, their coefficients mean that, holding other things
constant, firms engaging in employees’ development had log total sales 1.081 lower
than firms that did not, and firms spending on research and development activities
had log total sales 0.946 lower than firms that did not.
The change in total sales of service firms in Rwanda is attributed to financial
services through access to credit; ICT applications in service provision, principally
via e-mail; employment growth expressed by the costs incurred by a service firm on
employment; employees’ development as a trial of a new approach or new idea about
products or services, business process, firm management or marketing; and, last but
not least, the expenditure incurred on research and development activities. Together,
the regressors explain 84% of the variation in sales as measured by R2, and these
variables are all statistically significant as their t-statistics are greater than 1.96 in
absolute terms with p-values less than 0.05.
The logistic regression of the service innovation model (Model 2) identifies the
factors contributing to innovations in Rwanda’s service firms. The summary of results
in Table 15.2 shows that 158 firms were included in the estimation. The likelihood
ratio test, which indicates whether the predictors in the model together account for
significant variation in the dependent variable, gives a Chi-square statistic of 46.28
with a probability of 0.000. This implies that the independent variables jointly
influence the dependent variable. Variables such as new methods, training and
acquisition of fixed assets were statistically significant at the 5% level since their
p-values are less than 0.05 and their z values in absolute terms are greater than 1.96.
The approximate share of variation accounted for by the independent variables in
this model is expressed by the Pseudo R2, which is 0.22. The log likelihood is −81.2579.
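The Pseudo R2 can be recovered from the other statistics in Table 15.2. The sketch below assumes the reported value is McFadden’s pseudo R-squared, 1 − ln L(model) / ln L(null), which is the usual convention for this kind of output. The null log likelihood is not reported in the chapter and is backed out here from the LR statistic.

```python
# Values reported in Table 15.2.
ll_model = -81.257932
lr_chi2 = 46.28

# LR chi2 = 2 * (ll_model - ll_null), so the null log likelihood follows.
ll_null = ll_model - lr_chi2 / 2

# McFadden's pseudo R-squared (assumed definition).
pseudo_r2 = 1 - ll_model / ll_null

print(round(ll_null, 4), round(pseudo_r2, 4))
```

Unlike the R2 of a linear regression, this measure compares log likelihoods, so it should not be read as a share of explained variance in the strict sense.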
The ultimate goal of this study was to analyze trends in the development of service
firms in Rwanda and to identify the factors driving their performance using survey
data covering various aspects of the service sector. Literature was reviewed to assess
the similarities and dissimilarities in findings all over the world, and a descriptive
analysis of existing data and an empirical analysis of micro-data on service firms
were used to understand the functioning of service firms in Rwanda and in other
parts of the world. The results are interesting and are useful for academics and both
the public and private sectors.
The results on the factors influencing innovation in service firms are useful for the
government because innovation is key to economic growth and development. In
public sector management, innovation is a priority for all nations; today's wealthy
nations have a wide range of innovations in various disciplines. In our study,
innovation as a stand-alone variable did not influence total sales, although some of
the variables characterizing innovation, namely new methods and training, were
statistically significant. The government could therefore use these findings to scale
up innovation activities in service firms and to shape capacity building strategies
and policy. Innovation is a prime
15 Determinants of Service Sector Firms’ Growth in Rwanda 359
The results of the linear regression of the sales model are useful in assessing
the role of economic integration. One of the objectives of economic integration is to
operate in a large market where nationals buy and sell their products and services.
The significance of openness for turnover indicates that economic agents in service
firms should take advantage of this information to increase the returns to their
businesses. The private sector can use it to exploit unused channels and to study
regional markets with a view to expanding, particularly since it has been some time
since the government joined the East African Community (EAC) and other regional
economic integration efforts.
Focus on ICT is found to be another source of better performance in service
firms. Daily use of the Internet as a communication channel should be seen as a
strategy to be widely adopted by competitive managers of service firms. This fits
well with Rwanda’s national commitment of becoming an ICT regional hub and an
ICT connected country.
For academic research purposes, this information is crucial since it opens up the
ground for further empirical studies to assess how the government is benefiting
from regional economic integration in terms of economic growth and development.
Further, it will be interesting to conduct an empirical study on ICT applicability and
economic performance in Rwanda.
All firms aim to increase their turnovers as they are profit-based entities. The results
from the model on service firms’ turnovers give information on interacting with
the foreign market by either buying or selling products or services. Turnover
increases with the capital used, which could inform investors attracted by
service-related economic activities such as transport and storage and accommodation
and food services. These service sub-sectors are found to be among the more
profitable in the overall service sector. The spillover effects of taxes are marked in
the turnovers of service companies. This could be used to underline the importance
of tax compliance by service sector taxpayers. The result for value-added tax (VAT),
which is paid by consumers, suggests that service firms’ development is demand
elastic, because the more VAT consumers pay, the more turnover is generated.
Income tax is normally paid according to the income earned by a firm through the
year, so the correlation between income tax and growth in turnover implies good
performance of service firms. Generally, taxes support the economy at large and it
is important to know how taxes affect service firms’ development in particular.
Access to finance, provided by financial institutions such as banks, is one of the
inputs most needed for the good performance of a service firm. Our investigation
of the determinants of service sector development confirms its importance for
service firms’ performance, as indicated by the acquisition of fixed assets, loan size
and capital used. The government should take note of this in steering monetary
policy and should encourage financial institutions to help investors in service firms
to access funds.
15.5.1 Conclusions
Our study on the development of service firms and their role in Rwanda’s economic
growth provides useful details about service firms over the years and empirically
estimates the determinants of service firms’ development using econometric
methods. The measures of firm growth used include innovation, sales and
turnover. The estimation used micro-data collected by the National Institute of
Statistics of Rwanda, namely the 2011 Rwanda Enterprise Survey and the 2014
Establishment Census.
The literature review on the service sector supports the view that services contribute
substantially to economic growth. Zhou (2015) and Williams (1997) claim that services
accelerate the transformation of economic growth, raise employment and boost economy-wide
labor productivity. The key factors that contribute to the growth of the service
sector include rapid urbanization, expansion of the public sector, increased demand
for intermediate and final consumer services, domestic investments and openness,
education skills, cultural adaptability, financial attractiveness, business environ-
ment, expansion of quality health services, application of information and tech-
nology, increase in consumption expenditure, incentive systems and investing more
in research and development. In Rwanda, the services are dominated by wholesale
and retail trade, motorcycle and motor vehicle repairs, accommodation and food
services and human health and social work sub-sectors.
After estimating models on sales, innovation and turnovers in service firms, the
results show that service firms’ development in Rwanda is driven by access to
15.5.2 Recommendations
As the Government of Rwanda has opted for driving its economic growth through
service development and aims to become a middle-income country, this study
makes recommendations that can help it speed up the shift from a low-income
to a middle-income economy.
Our study shows that both employees’ development and research and devel-
opment activities are negatively correlated with a change in the sales generated in
service firms. According to the literature, research and development has a great
effect on the development of service firms (Madeira et al. 2014). Reducing expenses
on research and development reduces the sales of service firms. Based on the results
of our study, we recommend that policymakers encourage research and
development in service firms to increase and sustain their levels of performance in
the Rwandan economy.
The use of ICT in service firms as expressed by use of improved technology,
equipment and software for production and the use of Internet were relevant factors
in the sales performance of service firms. A review of microeconomic literature on
service firms reaffirmed ICT as a bedrock for improving business processes, cus-
tomer relations and efficient delivery of goods and services to satisfy the needs of
customers (Atom 2013). The study recommends that the government advance and
sustain the use of ICT in all service firms in Rwanda. In addition, the application
of ICT can have multiple effects on the performance of service firms, which is
consistent with the national aspiration of becoming an ICT regional hub to
accelerate economic growth.
Our study shows that the size of the loan approved and capital used by a service
firm influence its sales. New service firms need support if they are to add value to
the performance of the service sector in the economy. For this reason, the study
suggests that the government promote access to finance for new service firms
through, for instance, setting affordable collateral requirements and extending loan
repayment periods by granting a sufficient grace period.
In our study, we found that both employees’ development and the acquisition of
internal or external training have a great impact on the performance of service firms
in Rwanda. Earlier work also found that training develops skills, competencies and
abilities and ultimately improves employee performance and organizational
productivity (Amir and Amen 2013). Hence, the government should promote
on-the-job training in service firms to speed up Rwanda’s shift from a low-income
to a middle-income state.
Because the acquisition of fixed assets such as machinery, vehicles, equipment,
land and buildings has multiple effects on innovation, the government should
facilitate the import of the fixed assets that service firms need. This could take the
form of tax exemptions and incentives depending on the value of the imported
fixed assets. Further, since the acquisition of fixed assets is a proxy indicator of
firms’ access to finance, the government should regulate finance in a way that gives
firms easier access to funds from financial institutions, for example by reviewing
the interest rate charged to a firm when it wants to purchase fixed assets.
The key recommendations from the analysis of service firms’ development and
economic growth in Rwanda can be summarized as:
• Advancing and sustaining the use of ICT in all service firms, specifically the use
of the Internet;
• Promoting new service firms’ access to finance in terms of loans to acquire fixed
assets;
• Promoting on-the-job training in service firms to increase the level of firm
performance;
• Putting in place a services innovation policy complementing existing employ-
ment with emphasis on employees’ development and enhanced training
strategies;
• Expanding ICT applications for service firms to become mobile based for tar-
geting the countryside population; and
• Putting in place a foreign trade policy with emphasis on service exports in forms
that benefit from existing economic integration.
Classified     True D     True ~D     Total
+                  87          25       112
−                  12          34        46
Total              99          59       158

Classified + if predicted Pr(D) >= 0.5
True D defined as innovation != 0

Sensitivity                       Pr(+ | D)     87.88%
Specificity                       Pr(− | ~D)    57.63%
Positive predictive value         Pr(D | +)     77.68%
Negative predictive value         Pr(~D | −)    73.91%
False + rate for true ~D          Pr(+ | ~D)    42.37%
False − rate for true D           Pr(− | D)     12.12%
False + rate for classified +     Pr(~D | +)    22.32%
False − rate for classified −     Pr(D | −)     26.09%
Correctly classified                            76.58%
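The classification rates above follow directly from the four cell counts (87 true positives, 25 false positives, 12 false negatives, 34 true negatives). The sketch below recomputes them; the class and attribute names are illustrative and are not taken from the chapter or from any statistical package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # classified +, true D
    fp: int  # classified +, true ~D
    fn: int  # classified -, true D
    tn: int  # classified -, true ~D

    @property
    def sensitivity(self) -> float:
        """Pr(+ | D): share of true innovators classified as innovators."""
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        """Pr(- | ~D): share of non-innovators classified as non-innovators."""
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        """Pr(D | +): positive predictive value."""
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        """Pr(~D | -): negative predictive value."""
        return self.tn / (self.tn + self.fn)

    @property
    def accuracy(self) -> float:
        """Share of all firms that are correctly classified."""
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


cm = ConfusionMatrix(tp=87, fp=25, fn=12, tn=34)
for name in ("sensitivity", "specificity", "ppv", "npv", "accuracy"):
    print(f"{name}: {getattr(cm, name):.2%}")
```

The false positive and false negative rates in the table are the complements of these quantities, for example 1 − specificity for the false + rate among true ~D.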
(continued)
Innovation        Coef.       Std. Err.    z         P>|z|     [95% conf. interval]
New logistics      0.1210     0.3527       0.343     0.7316    −0.5703      0.8123
New practices     −0.0990     0.3551      −0.279     0.7804    −0.7950      0.5970
New marketing     −0.1649     0.3152      −0.523     0.6009    −0.7827      0.4529
Research dvpt      0.2385     0.2765       0.862     0.3885    −0.3035      0.7804
Employee dvpt      0.6093     0.3223       1.891     0.0587    −0.0224      1.2410
Training           0.5703     0.2875       1.984     0.0473     0.0068      1.1339
Fixed asset       −0.6896     0.2680      −2.573     0.0101    −1.2149     −0.1643
Loan               0.4360     0.2527       1.725     0.0845    −0.0593      0.9313
Firm size
  1               −0.3431     0.6728      −0.510     0.6101    −1.6617      0.9756
  2                0.0838     0.6807       0.123     0.9021    −1.2504      1.4180
  3                0.5803     0.7563       0.767     0.4429    −0.9020      2.0626
_cons            −15.5998  1523.5291      −0.010     0.9918    −3.00e+03 2970.4623
/lnalpha          14.5702  1523.5288       0.010     0.9924    −2.97e+03 3000.6318
alpha 2.13e+06 3.24e+09 0.0000
References
Acharya R (2016) ICT use and total factor productivity growth: intangible capital or productive
externalities? Oxf Econ J 68(1):16–39
Acharya R, Patel R (2015) Contribution of telecom sector to growth of Indian service sector: an
empirical study. Indian J Sci Technol 8(4):101–105
Acs ZJ, Audretsch DB (1988) Innovation in large and small firms: an empirical analysis. Am Econ
Rev 78(4):678–690
Agwu EM, Carter AL (2014) Mobile phone banking in Nigeria: benefits, problems and prospects.
Int J Bus Commer 3(6):50–70
Alvarez-Cuadrado F, Poschke M (2011) Structural change out of agriculture: labor push versus
labor pull. Am J Macroecon 3(3):127–158
Amir E, Amen I (2013) The effect of training on employee performance. Eur J Bus Manag 5
(4):137–147
Arnold JM, Jovorcik B, Lipsomb M, Mattoo A (2016) Services reform and manufacturing
performance: evidence from India. Econ J 129(590):1–39
Arvanitis S, Stucki T (2012) What determines the innovation capability of firm founders? Ind Corp
Change 24(4):1049–1084
Atom B (2013) The impact of information communication technology (ICT) on business. Asian J
Bus Manage Sci 3(2):13–28
Bethapudi A (2013) The role of ICT in tourism industry. J Appl Econ Bus 1(4):67–79
Borghoff T (2011) The role of ICT in the globalization of firms. J Mod Account Audit 7(10):1128–
1149
Bouazza AB, Ardjouman D, Abada O (2015) Establishing the factors affecting the growth of small
and medium-sized enterprises in Algeria. Am Int J Soc Sci 4(2):101–115
Buera FJ, Kaboski JK (2009) The rise of the service economy. Am Econ Rev 102(6):2540–2569
Chude DI, Nkuru P (2015) Impact of company income taxation on the profitability of companies in
Nigeria: a study of Nigerian breweries. Eur J Account Audit Financ Res 3(8):1–11
Dawkins P, Feeny S, Harris MN (2007) Benchmarking firm performance. Benchmark Int J 14
(6):693–710
Drejer I (2002) Business services as a production factor. Econ Syst Res 14(4):389–405
Du J, Temouri Y (2015) High-growth firms and productivity: evidence from the United Kingdom.
Small Bus Econ 44(1):123–143
Eichengreen B, Gupta P (2011) The service sector as India’s road to economic growth. NBER
working paper no. 1675
El-Said OA, Kattara HS (2013) Customers’ preferences for new technology-based self-services
versus human interaction services in hotels. Tour Hosp Res 13(2):67–82
Fuchs VR (1980) Economic growth and the rise of service employment. NBER working paper no.
486
Gale WG, Krupkin A, Rueben K (2015) The relationship between taxes and growth at the state
level: new evidence. Nat Tax J 68(4):919–942
Geishecker I, Görg H (2013) Services offshoring and wages: evidence from micro data. Oxf Econ J
65(1):124–146
Guerrieri P, Meliciani V (2004) International competitiveness in producer services. Available at
SSRN: https://ssrn.com/abstract=521445
Gupta R (2012) The role of ICTs in facilitating India’s external trade. J Decis Making 12(1):11–15
Hagen B, Zucchella A (2014) Born global or born to run? The long-term growth of born global
firms. Manage Int Rev 54(4):497–525
Halpern L, Koren M, Szeidl A (2015) Imported inputs and productivity. Am Econ Rev 105
(2):3660–3703
Heshmati A, Kim H (2011) The R&D and productivity relationship of Korean listed firms. J Prod
Anal 36(2):125–142
Isaga N (2015) Owner-managers’ demographic characteristics and the growth of Tanzanian small
and medium enterprises. Int J Bus Manage 10(5):168–181
Jafaridehkordi H, Rahim RA, Jafaridehkordi P (2015) Intellectual capital and investment
opportunity set in advanced technology companies. Int J Innov Appl Stud 10(3):1022–1027
Jajri I (2008) Determinants of total factor productivity growth in Malaysia. J Econ Coop 28(3):41–
58
Johnsen GJ, McMahon RGP (2005) Owner-manager gender, financial performance and business
growth amongst SMEs from Australia’s business longitudinal survey. Int Small Bus J 23
(2):115–142
Jones RS, Yoon T (2008) Enhancing the globalisation of Korea. Economic department working
papers no. 614. OECD
Khan KS (2011) Determinants of firm growth: an empirical examination of SMEs in Gujranwala,
Gujarat and Sialkot districts. Interdisci J Contemp Res Bus 3(1):1389–1409
King RG, Levine R (1993) Finance and growth: Schumpeter might be right. Q J Econ 108(3):717–
737
Latha CM, Shanmugam V (2014) Growth of service sector in India. IOSR J Humanit Soc Sci 19
(1):8–12
Lee S, Malin BA (2013) Education’s role in China’s structural transformation. J Dev Econ
101:148–166
Lenaerts K, Merlevede B (2015) Firm size and spillover effects from foreign direct investment: the
case of Romania. Small Bus Econ 45(3):595–616
Liu L, Nath HK (2013) Information and communications technology and trade in emerging market
economies. Emerg Markets Financ Trade 49(6):67–87
Madeira MJ, Jorge S, Sousa G, Moreira J, Mainardes EW (2014) Determinants of innovation
capacity: empirical evidence from service firms. Innov Manage Policy Pract 16(3):404–416
Mihalic T, Praniceric DG, Arneric J (2015) The changing role of ICT competitiveness: the case of
the Slovenian hotel sector. Econ Res Ekon Istraživanja 28(1):367–383
Muro MB, Magutu PO, Getembe KN (2013) The strategic benefits and challenges in the use of
customer relationship management systems among commercial banks in Kenya. Eur Sci J 9
(13):327–349
Nassazi A (2013) Effect of training on employee performance with evidence from Uganda.
Business Economics and Tourism, 1–57
Neely A, Hii J (1998) Innovation and business performance. University of Cambridge, London
Newsom J (2013) Levels of measurement and choosing the correct statistical test. USP 634 data
analysis, Spring 2013, New York
NISR (2014) Establishment census. National Institute of Statistics of Rwanda, Kigali
Oliveira B, Fortunato A (2008) The dynamics of the growth of firms: evidence from the services
sector. Empirica 35(3):293–312
Park D, Shin K (2012) The service sector in Asia: is it an engine of growth? ADB economics
working paper series no. 322
Pop ZW, Stümpel HJ, Bordean ON (2014) From strategic decisions to corporate governance in the
SME sector in Germany. Stud Univ Babeș-Bolyai Oecon 59(3):57–67
Powers AD, Xie Y (1999) Statistical methods for categorical data analysis. Academic Press Inc,
Austin
Prajogo DI, Sohal AS (2006) The integration of TQM and technology/R&D management in
determining quality and innovation performance. Omega 34(3):296–312
Queiro F (2016) The effect of managers education on firm growth. Q J Econ 118(4):1169–1208
RES (2011) Rwanda Enterprise Survey. Rwanda National Institute of Statistics. http://www.
statistics.gov.rw/
RES (2016) Rwanda Enterprise Survey. Rwanda National Institute of Statistics. http://www.
statistics.gov.rw/
Reuber AR, Fischer E (2002) Foreign sales and small firm growth: the moderating role of the
management team. Entrep Theory Pract 27(1):29–45
Sahu S (2015) Source of service sector TFP growth in India: evidence from micro-data. South
Asian J Macroeconom Pub Financ 4(1):62–90
Salehi-Isfahani D (2006) Microeconomics of growth in MENA: the role of households. Contrib
Econ Anal 278:159–194
Schoonjans B, Van Cauwenberge P, Vander Bauwhede H (2013) Does formal business
networking contribute to SME growth? An empirical examination. Working paper 2011/708,
Faculty of Economics and Business Administration, Ghent University, Belgium
Schoonjans B, Van Cauwenberge P, Vander Bauwhede H (2013) Formal business networking
and SME growth. Small Bus Econ 41(1):169–181
Scott L (1997) Advanced quantitative techniques in the social science. International Educational
and Professional Publisher, New Delhi
Simsek Z, Heavey C (2011) The mediating role of knowledge-based capital for corporate
entrepreneurship effects on performance: a study of small- to medium-sized firms. Strateg
Entrepreneurship J 5(1):81–100
Singh M, Kaur K (2014) Indian’s service sector and its determinants: an empirical investigation.
J Econ Dev Stud 2(2):385–406
Smith N, Smith V, Verner M (2006) Do women in top management affect firm performance? A
panel study of 2,500 Danish firms. Int J Prod Perform Manage 55(7):569–593
Stock JH, Watson MW (2011) Introduction to econometrics, 3rd edn. Pearson Education Inc,
Boston
Stoilova D, Patonov N (2013) An empirical evidence for the impact of Taxation on economy
growth in the European Union. In: Proceedings TMS international conference 2012: Financial
management, accounting & taxation, vol 2, pp 1031–1039
Tahir M, Azid T (2015) The relationship between international trade openness and economic
growth in the developing economies: Some new dimensions. J Chin Econ Foreign Trade Stud 8
(2):123–139
UN (2008) International standards industrial classification of all economic activities. Revision 4
UNECA (2015) Economic report on Africa 2015: industrializing through trade
Van der Marel E, Shepherd B (2013) International tradability indices for services: policy research
working paper no. 6712
Verbeek M (2004) A guide to modern econometrics, 2nd edn. Wiley, Rotterdam
Watson J (2003) SME performance: does gender matter. A paper for the small enterprise
association of Australia and New Zealand 16th Annual conference. Ballarat
WB (2014) Enterprise surveys: Rwanda country profile 2011. Enterprise surveys country profile.
World Bank Group, Washington, DC
Williams CC (1997) Understanding the role of consumer services in local economic development:
some evidence from the Fens. Environ Plann 28(3):555–571
Wu W, Wu C, Zhou C, Wu J (2012) Political connections, tax benefits and firm performance:
evidence from China. J Account Public Policy 31(3):277–300
Yeboah O, Naanwaab C, Saleem S, Akuffo A (2012) Effects of trade openness on economic
growth: the case of African countries. Agribusiness, Applied Economics and Agriscience
Education—NC A&T, Birmingham
Yli-Renko H, Autio E, Sapienza HJ (2001) Social capital, knowledge acquisition, and knowledge
exploitation in young technology-based firms. Strateg Manag J 22(6–7):587–613
Zhou Z (2015) The development of service economy: a general trend of the changing economy.
Development Research Center of Shanghai, Shanghai
Zhou H, de Wit G (2009) Determinants and dimensions of firm growth. SCALES EIM research
reports (H200903). Groningen
Chapter 16
Labor-Use Efficiency in Kenyan
Manufacturing and Service Industries
Masoomeh Rashidghalam
Abstract This study uses the labor-use requirement model to estimate labor-use
efficiency of Kenyan manufacturing and service sectors. It also studies the deter-
minants of labor-use efficiency. The data are obtained from the World Bank’s
Enterprise Survey (ES). The Cobb–Douglas functional form of labor-use frontier
estimates shows that wages, sales, capital, fuel, and electricity affected the amount
of labor used in Kenya. The determinants of labor-use efficiency were the man-
ager’s experience, female share, labor training, education, and obstacles. The results
show that the estimated firm labor-use efficiency ranged from 0.14 to 0.87 with a
mean labor-use efficiency value of 0.66. According to the results, most of the firms
operated within the labor-use efficiency range of 0.70–0.80, suggesting that there is
room to improve labor use by 20–30% relative to the firms with the best
labor-use practices.
16.1 Introduction
The World Bank’s most recent Kenya Economic Update (KEU) (March 2016)
projected a 5.9% growth in 2016, rising to 6% in 2017. The report attributes this
positive outlook to low oil prices, good agricultural performance, a supportive
monetary policy, and ongoing infrastructure investments. According to the latest
Kenya National Bureau of Statistics’ (KNBS) quarterly report, Kenya’s economy
expanded by 6.2% in the second quarter of 2016 as compared to 5.9% in the same
period in 2015. This growth was mainly supported by agriculture, forestry, and
fishing; transportation and storage; real estate; and wholesale and retail trade.
M. Rashidghalam (corresponding author)
Department of Agricultural Economics, University of Tabriz, Tabriz, Iran
e-mail: maso.azar@gmail.com
Manufacturing, construction, and the financial and insurance sectors slowed down
during this quarter while accommodation and food services, mining and quarrying,
electricity and water supply, and information and communication sectors recorded
improvements.1
Although manufacturing companies in Kenya are small, they are the most
sophisticated in East Africa. Industries in Kenya have been growing since the late
1990s and into the new century. These companies are also relatively diverse. The
transformation of agricultural raw materials, particularly of coffee and tea, remains
the principal industrial activity. Meat and fruit canning, wheat flour and cornmeal
milling, and sugar refining are also important. Production of electronics, vehicle
assembly, publishing, and soda ash processing are all significant. Assembly of
computer components began in 1987. Kenya also manufactures chemicals, textiles,
ceramics, shoes, beer and soft drinks, cigarettes, soap, machinery, metal products,
batteries, plastics, cement, aluminum, steel, glass, rubber, wood, cork, furniture, and
leather goods. It also produces a small number of trucks and automobiles. The most
common manufacturing industries in Kenya include small-scale consumer goods
(plastic, furniture, batteries, textiles, clothing, soaps, cigarettes, and flour), agri-
cultural products, horticulture, oil refining, aluminum industries, steel industries,
lead industries, cement industries, and commercial ship repairs.2
Kenya is also a leading sub-Saharan African (SSA) producer and exporter of
services. According to the World Bank, Kenya has a comparative advantage in
services production. It has the largest service economy in the East African
Community (EAC). It produced $19 billion of services in 2012; this amount
represented almost half of the nation’s GDP and accounted for an estimated 43% of the
EAC’s total services output (Serletis 2014). As East Africa’s distribution hub,
telecommunication axis, and financial center, Kenya has a broad array of
well-developed service industries with an abundance of service suppliers. These
factors make Kenya a promising source of increased exports of services. In addi-
tion, the Government of Kenya is aiming to spur economic growth by promoting
exports of services, including professional services, which are critical for Kenya’s
economic development and also serve as key inputs for East Africa’s growth. In
most years, this sector accounts for the largest share of jobs in Kenya.
In 2006, Kenya’s labor force was estimated at about 12 million workers, of whom
almost 75% worked in agriculture. About 6 million were employed
outside small-scale agriculture and pastoralism. Approximately 15% of the labor
force was officially classified as unemployed in 2004. As Kenya became
increasingly urbanized, the labor force shifted from the countryside to cities (World
Bank 2015). The service sector absorbed a majority of the inflow of labor to urban
areas. Labor force participation rates for both women and men were broadly stable
between 1997 and 2010. In 1997, 65% of women were employed in some type
of labor market activity, while the corresponding number for men was 76%
1 http://www.worldbank.org/en/country/kenya/overview.
2 https://softkenya.com/industry/.
16 Labor-Use Efficiency in Kenyan Manufacturing … 371
(World Bank 2015). Around 60% of women and 70% of men were in the
labor force in 2005. These shares rose slightly by 2010, when 61% of women and
72% of men were part of the labor force.
In this regard, studying labor-use efficiency in these two main economic sectors
of Kenya is important. Therefore, our study investigates labor-use efficiency and its
determinants in manufacturing and service sectors in Kenya. Labor efficiency is a
measure of how efficiently a given workforce accomplishes a task when compared
to the standard in that industry or setting. As labor efficiency goes up, costs go
down. It may also be possible to increase production because more labor hours are
available for producing goods and services. This will be even more important in
periods of increased demand when a company needs more laborers to make more
goods or offer more services. More efficiency can also translate into wider oppor-
tunities for research and development as a company has workers available to put to
these tasks instead of having to focus on meeting the needs of the production line.
One way of looking at labor efficiency is to compare the number of hours
actually required to produce a given product or service with the number usually
spent. If the workforce produces products and services in fewer hours than usual, it
is operating with high efficiency, cutting time off production. This can translate into
significant savings because the company spends less on wages and overheads
when it turns out finished services and products at a more efficient rate.3
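The comparison of actual and usual hours described above can be illustrated with a simple ratio. This is a minimal sketch, assuming efficiency is expressed as usual (standard) hours divided by actual hours, so that values above 1 mean the work took less time than usual; the function name and the numbers are hypothetical and do not come from the survey data.

```python
def labor_efficiency(usual_hours: float, actual_hours: float) -> float:
    """Ratio of the hours usually spent to the hours actually required.

    A value above 1 means the task was completed in fewer hours than
    usual (higher efficiency); a value below 1 means it took longer.
    """
    if usual_hours <= 0 or actual_hours <= 0:
        raise ValueError("hours must be positive")
    return usual_hours / actual_hours


# Hypothetical example: a batch that usually takes 100 labor hours.
print(labor_efficiency(100, 80))   # finished in 80 hours, faster than usual
print(labor_efficiency(100, 125))  # finished in 125 hours, slower than usual
```

Note that this accounting-style ratio differs from the frontier-based efficiency estimated later in the chapter, which is bounded above by 1.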
In particular, we address the following questions: What are the levels of
labor-use efficiency in the manufacturing and service sectors in Kenya, and
which factors determine the efficiency of labor? The results of our study
will provide researchers and employers with information about how labor and firm
characteristics affect labor-use efficiency.
The rest of the paper is organized as follows. Section 2 includes a brief review of
relevant literature. Section 3 outlines the relevant labor-use requirement model and
determinants of efficiency. Data sources along with identification of inputs and
outputs are reported in Sect. 4. Section 5 discusses the findings from the empirical
analysis, and Sect. 6 gives a conclusion.
3 http://www.wisegeek.com/what-is-labor-efficiency.htm.
372 M. Rashidghalam
efficiency and regional disparities and confirmed that the persistently high rate of
unemployment was the result of not only excess labor supply but was also related to
a shortfall between supply and demand (sector, location, and qualification).
Anyiro et al. (2013) examined labor-use efficiency by smallholder yam farmers
in Nigeria. The Cobb–Douglas functional form of labor-use frontier estimates
showed that the quantity of harvested yam, size of the cleared farmland, and the
quantity of fertilizers applied significantly affected the amount of labor used in yam
production.
The socioeconomic determinants of labor-use efficiency were age, education, farm
size, gender, labor wage, and household size; these were statistically significant.
According to their results, labor-use efficiency ranged from 0.20 to 0.97 with a
mean labor-use efficiency value of 0.76. Policies aimed at increasing yam farmers’
scale of operations through improved access to production inputs such as fertilizers,
agrochemicals, and capital are required for increasing labor-use efficiency in the
area.
Das et al. (2009) used a data envelopment analysis to measure labor-use effi-
ciency of individual branches of a large public sector bank in India. They intro-
duced the concept of area or spatial efficiency for each region relative to the nation
as a whole. Their findings suggest that the policies, procedures, and incentives
handed down from the corporate level cannot fully neutralize the detrimental
influence of the local work culture across different regions. Most of the potential
reduction in labor cost appeared to come from possible downsizing in the clerical
and subordinate staff.
16.3 Methods
The labor-use requirement frontier model determines the minimum amount of labor
required to produce a given level of output. This model is expressed as (Akanni and
Dada 2012; Anyiro et al. 2013; Martinez and Burns 1999; Masso and Heshmati
2004):
where
$Labor_i$    labor-use requirement frontier model
$W_i$        real wage
$Output_i$   sale
$Z_i$        vector characterizing the production process
$\beta$      unknown parameters associated with determinants of optimal labor use
where
Ln Labor         natural log of annual employment
Ln Wage          natural log of wage per employee in KES
Ln Capital       natural log of annual investment per employee
Ln Sale          natural log of total sales (in Kenyan shilling, KES)
Ln Electricity   natural log of annual cost of electric energy per labor
Ln Fuel          natural log of annual cost of all fuels per labor (fuel intensity) in KES
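The displayed equation for this specification is not reproduced in the text as it stands. For orientation only, a Cobb–Douglas labor-use frontier built from the variables listed above would typically take the following form in the stochastic frontier literature. The ordering of the coefficients, the composed error term $v_i + u_i$, and the definition of efficiency as $\exp(-u_i)$ are assumptions made for this sketch, not a reconstruction of the author's equation.

```latex
\begin{align}
\ln Labor_i &= \beta_0 + \beta_1 \ln Wage_i + \beta_2 \ln Sale_i
             + \beta_3 \ln Capital_i + \beta_4 \ln Electricity_i
             + \beta_5 \ln Fuel_i + v_i + u_i, \qquad u_i \ge 0, \\
LE_i &= \exp(-u_i), \qquad 0 < LE_i \le 1 .
\end{align}
```

Here $v_i$ is symmetric noise and $u_i$ measures labor use in excess of the frontier minimum, so a firm on the frontier has $u_i = 0$ and $LE_i = 1$.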
To study the determinants of labor-use efficiency (LE), the following model was
formulated:
$$LE = \delta_0 + \delta_1 Z_1 + \delta_2 Z_2 + \cdots + \delta_8 Z_8 \qquad (16.3)$$
where
LE labor-use efficiency
Z1 experience of manager (in years)
Z2 female share of employees
Z3 training programs for employees (yes = 1, No = 0)
Z4 average number of years of education of a typical female production worker
(years)
Z5 percentage of full-time permanent workers who have completed secondary
school (%)
Z6 age of firm (years)
Z7 does the firm face minor and moderate obstacles (Yes = 1, No = 0)
Z8 does the firm face major and severe obstacles (Yes = 1, No = 0).
16.4 Data
The data used in this study are from the World Bank’s Enterprise Survey (ES). As
part of these surveys, the World Bank collects data from key manufacturing and
service sectors in every region of the world. The surveys use standardized survey
instruments and a uniform sampling methodology to minimize measurement errors
and to yield data that is comparable across the world’s economies and as such is
suitable for comparative economic studies. The initial dataset consisted of 670
firm-level observations in Kenya’s manufacturing and service firms in 2013. Data
374 M. Rashidghalam
Table 16.1 provides summary statistics for the input and output variables
and for the labor, firm, and market characteristics used in this study. Sales averaged
1170 million Kenyan shillings (KES), with a standard deviation 6.54 times the mean. The
average firm in the sample employed 98 persons; employment ranged from 1 to
8000, with a dispersion of 4.32 around the mean value. The ratio of the two
variables, sales per employee, which measures labor productivity,
ranged from 3000 KES to 1.72 billion KES, with a mean of 14.4 million KES and a
standard deviation of 90.2 million KES. The value of investment per employee also
varies considerably across the dataset. Mean wage per employee was 1.05 million KES with
Table 16.1 Summary statistics of key variables and labor-use efficiency determinants in the
Kenyan (2013) enterprise data (N = 670)
Variable | Variable definition | Mean | Std. dev. | Minimum | Maximum
A. Key variables
Employment | Annual employment | 98 | 424.80 | 1 | 8000
Sale | Total sales (in Kenyan shilling, KES) | 1,170,000,000 | 7,650,000,000 | 90,000 | 120,000,000,000
LABOR | Sale per employee (labor productivity) in KES | 14,400,000 | 90,200,000 | 3000 | 1,720,000,000
CAPINT | Annual investment per employee (capital intensity) in KES | 7,806,039 | 35,000,000 | 0.16 | 484,000,000
EENINT | Annual cost of electric energy per labor (energy intensity) in KES | 238,817 | 1,610,259 | 0.00 | 36,000,000
FENINT | Annual cost of all fuels per labor (fuel intensity) in KES | 4,821,125 | 10,800,000 | 0.00 | 200,000,000
WAGE | Wage per employee in KES | 1,049,785 | 7,095,087 | 1000 | 170,000,000
SALE | Total sales (in Kenyan shilling, KES) | 1,170,000,000 | 7,650,000,000 | 90,000 | 120,000,000,000
B. Labor-use efficiency determinants
expe | Manager’s experience in years | 18.79 | 10.77 | 2 | 57.00
femsh | Female share of employees | 0.81 | 1.27 | 0 | 9.50
train | Training programs for employees (equals 1 if | 0.45 | 0.50 | 0 | 1.00
(continued)
a large standard deviation of 7.1 million KES. It ranged from 1000 KES to
170 million KES. The energy and fuel intensity variables also showed large variation
among firms.
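The dispersion figures quoted for sales and employment can be checked against Table 16.1. The sketch below assumes that "dispersion" means the coefficient of variation (standard deviation divided by the mean); the function name is illustrative and not taken from the chapter.

```python
def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a multiple of the mean."""
    return std_dev / mean

# Means and standard deviations as printed in Table 16.1.
sales_cv = coefficient_of_variation(1_170_000_000, 7_650_000_000)
employment_cv = coefficient_of_variation(98, 424.80)

print(f"Sales: {sales_cv:.2f} times the mean")
print(f"Employment: {employment_cv:.2f} times the mean")
```

Under this reading the sales ratio reproduces the 6.54 reported in the text. The employment ratio comes out at about 4.33 rather than 4.32, which would be consistent with the printed mean of 98 being rounded.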
The sample average capital intensity was 7.8 million KES per employee, with a
standard deviation of 35 million KES. The highest value in the sample, for a
capital-intensive technology firm, was 484 million KES in capital per employee.
Energy (electricity and fuel) use per employee also varied greatly. The
average manager in our dataset had about 19 years of experience, ranging
from 2 to 57 years. The average age of firms was 25 years with a standard
deviation of 18 years, ranging from 2 to 108 years. On average, the male labor
share was 0.81, and, on average, firms’ CEOs had about 12 years of education.
Around 80% of the permanent workers had completed secondary schooling.
Table 16.3 Estimates of ordinary least squares parameter of frontier model and efficiency
determinant (N = 670)
A. Frontier model
Variable | Coefficient | Std. err.
Intercept | −0.503 | 0.367
WAGE | −0.261a | 0.024
CAPINT | −0.083a | 0.014
SALE | 0.440a | 0.017
ELEINT | 0.031b | 0.013
FEUINT | −0.042a | 0.007
B. Efficiency model
Variable | Coefficient | Std. err.
Intercept | −0.679a | 0.025
expe | 0.001 | 0.000
femsh | −0.025a | 0.004
train | 0.022b | 0.010
feduc | 0.001 | 0.002
psec | −0.001 | 0.000
age | 0.001a | 0.000
Tobst1 | −0.019c | 0.011
Tobst2 | −0.023c | 0.013
sigma_u | 0.448a | 0.086
sigma_v | 0.754a | 0.047
F-value
Note Significant at less than 1% (a), 1–5% (b) and 5–10% (c) levels of significance
Table 16.4 Distribution of labor-use efficiency in Kenyan manufacturing and service sectors
Labor-use efficiency range Frequency Percentage
0.10–0.20 4 0.59
0.20–0.30 9 1.34
0.30–0.40 29 4.32
0.40–0.50 35 5.22
0.50–0.60 88 13.13
0.60–0.70 201 30.00
0.70–0.80 269 40.14
0.80–0.90 35 5.22
Total 670 100.00
Maximum labor-use efficiency 0.87
Minimum labor-use efficiency 0.14
Mean labor-use efficiency 0.65
This paper analyzed labor-use efficiency at the firm level using data from 670 firms
in the manufacturing and service sectors in Kenya. The data were sourced from the
World Bank’s Enterprise Survey (ES). The paper addressed two issues:
first, modeling labor-use requirements, and second, estimating labor-use
efficiency and its determinants. In estimating the labor-use requirement model, we
studied the effects of wages, sales, capital, electricity, and fuel use on labor demand.
References
Abid AB, Drine I (2011) Efficiency frontier and matching process on the labor market: evidence
from Tunisia. Econ Model 28:1131–1139
Akanni AK, Dada AO (2012) Analysis of labor-use patterns among small-holder Cocoa farmers in
South Western Nigeria. J Agric Sci Technol 2:107–113
Anyiro CO, Emerole CO, Osondu CK, Udah SC, Ugorji SE (2013) Labor-use efficiency by
smallholder yam farmers in Abia State Nigeria: a labor-use requirement frontier approach. Int J
Food Agricul Econ 1(1):151–163
Das A, Subhash CR, Nag A (2009) Labor-use efficiency in Indian banking: a branch-level
analysis. Int J Manag Sci 37:411–425
Heshmati A, Su B (2014) Development and sources of labor productivity in Chinese provinces.
China Econ Policy Rev 2(2):1–30
LaFave D, Thomas D (2016) Height and cognition at work: labor market productivity in a low
income setting. Available at doi:10.1016/j.ehb.2016.10.008
Martinez MG, Burns J (1999) Sources of technological development in the Spanish food and drink
industry, a ‘supplier dominated’ industry. Agribusiness 15(4):4431–4448
Masso J, Heshmati A (2004) Optimality and overuse of labor in Estonian manufacturing
enterprises. Econ Transit 12(4):683–720
Nagler P, Naudé W. (2014) Labor productivity in rural African enterprises: empirical evidence
from the LSMS-ISA. IZA Discussion Paper No. 2014: 8524
Ogutu SO, Okello JJ, Otieno DJ (2014) Impact of information and communication
technology-based market information services on smallholder farm input use and productivity:
the case of Kenya. World Dev 64:311–321
Serletis G (2014) Kenya’s services output and exports are among the highest in Sub-Saharan
Africa. USITC Exec Brief Trade 202:205–3315
The World Bank (IBRD-IDA) (2015) Labor force participation rate, female (per cent of female
population ages 15+) (modeled ILO estimate). Available at http://data.worldbank.org/indicator/
SL.TLF.CACT.FE.ZS?
Journal of Statistical and Econometric Methods, vol.5, no.4, 2016, 63-91
ISSN: 1792-6602 (print), 1792-6939 (online)
Scienpress Ltd, 2016
Abstract
Economic analysis suggests that there is a long-run relationship between the variables
under consideration, as stipulated by theory. This implies that the properties of the
long-run relationship are intact: the means and variances are constant and do not
depend on time. However, most empirical research has shown that constancy of the
means and variances is not satisfied when analyzing time series variables. In
attempting to resolve this problem, most cointegration techniques are wrongly
applied, estimated, and interpreted. One of these techniques is the Autoregressive
Distributed Lag (ARDL) cointegration technique, or bound cointegration technique.
Hence, this study reviews the issues surrounding the way cointegration techniques
are applied, estimated, and interpreted within the context of the ARDL cointegration
framework. The study shows that the adoption of the ARDL cointegration technique
does not require pretests for unit roots, unlike other techniques. Consequently, the
ARDL cointegration technique is preferable when dealing with variables that are
integrated of different orders, I(0), I(1), or a combination of both, and it is robust
when there is a single long-run relationship between the underlying variables in a
small sample. The long-run relationship of the underlying variables is detected
through the F-statistic (Wald test). In this approach, a long-run relationship of the
series is said to be established when the F-statistic exceeds the critical value band.
The major advantage of this approach lies in its identification of the cointegrating
vectors where there are multiple cointegrating vectors. However, this technique
breaks down in the presence of an integrated stochastic trend of I(2). To avoid wasted
effort, it may be advisable to test for unit roots, though not as a necessary condition.
For the sake of forecasting and policy, there is a need to explore the conditions that
give rise to the ARDL cointegration technique in order to avoid its wrongful
application, estimation, and interpretation. If these conditions are not followed, the
result may be model misspecification and inconsistent and unrealistic estimates, with
implications for forecasts and policy. This paper cannot claim to have treated the
underlying issues in their greatest detail, but it has endeavoured to provide young
practitioners with sufficient insight into the issues surrounding the ARDL
cointegration technique to enable them to apply, estimate, and interpret it properly,
and to follow discussions of these issues in more advanced texts.
1
Department of Economics, University of Port Harcourt, Port Harcourt, Nigeria.
E-mail: nkoro23@yahoo.co.uk
2
Department of Economics, University of Port Harcourt, Port Harcourt, Nigeria.
1 Introduction
In applied econometrics, the Granger (1981) and Engle and Granger (1987)
technique, the Autoregressive Distributed Lag (ARDL) cointegration technique or
bound test of cointegration (Pesaran and Shin 1999; Pesaran et al. 2001), and the
Johansen and Juselius (1990) cointegration technique have become the standard
solutions for determining the long-run relationship between non-stationary series,
as well as for reparameterizing them into the Error Correction Model (ECM). The
reparameterized result gives the short-run dynamics and the long-run relationship
of the underlying variables. However, despite the versatility of cointegration
techniques in estimating relationships between non-stationary variables and
reconciling short-run dynamics with long-run equilibrium, most researchers still
adopt the conventional way of estimation even when it is clearly necessary to test
for cointegration among the variables under consideration. That is, most
researchers are not conversant with the conditions that necessitate the application
of cointegration tests or with the interpretation of their results, and hence present
misleading inferences.
Against this background, the objective of this paper is to examine the conditions
that necessitate the application of the Autoregressive Distributed Lag (ARDL)
cointegration (bound test) technique and its interpretation.
Accordingly, this paper is divided into five sections. Section one is the
introduction; section two examines the concept of stationarity; section three
focuses on various unit root tests; section four deals with the ARDL cointegration
approach; and section five presents the summary and conclusions.
Although the ARDL cointegration technique does not require pre-testing for unit
roots, we are of the view that a unit root test should be carried out to determine the
number of unit roots in the series under consideration, so as to avoid the breakdown
of the ARDL model in the presence of an integrated stochastic trend of I(2). This is
presented in the next section.
cannot be seen as a means to an end, but restricted. However, this paper focuses on
series with unit root, I(1) (no constant mean and variance) that have no tendency of
returning to the long-run path.
There are various methods of testing for unit roots. They include the Durbin-Watson
(DW) test, the Dickey-Fuller (1979) (DF) test, the Augmented Dickey-Fuller (1981)
(ADF) test, and the Phillips-Perron (1988) (PP) test, among others. Before
pursuing formal tests, it is advisable to plot the time series under consideration to
determine its likely features, and to run the classical regression. If the series trends
upwards, the mean of the series has been changing with time. In the
case of the classical regression, a very low Durbin-Watson statistic combined with a
high R2 (Granger and Newbold, 1974) suggests that the series is not stationary.
This is the initial step towards a more formal test of stationarity. The most popular
strategy for testing the stationarity of a single time series is the Dickey-Fuller or
Augmented Dickey-Fuller test. The choice of the
right test depends on the set-up of the problem of interest to the
practitioner. It is difficult to follow the latest advances or to understand the
trade-offs between the various tests, but this should not be taken as a
reason for not performing other types of unit root tests. Comparing the
results of different test methods is a good way of checking the sensitivity of your
conclusions. Once you understand how these tests work, and their limitations, you
will understand when to use each test and what its meaning and purpose are.
When a test result is
inconclusive, the usual approach is to continue the analysis with a warning note, or
simply to assume one of the alternatives. Thus, the unit root test is basically required
to ascertain the number of times a variable or series has to be differenced to achieve
stationarity. From this comes the definition of integration: a variable Y is said to
be integrated of order d, I(d), if it attains stationarity after being differenced d
times (Engle and Granger, 1987).
Emeka Nkoro and Aham Kelvin Uko 71
The Durbin-Watson test is a simple but unreliable test for a unit root. To understand
how this test works, recall that the DW value is calculated as DW = 2(1 − ρ̂) (Harvey,
1981), where ρ̂ is the estimated first-order autocorrelation. Thus, if Yt is a
random walk, ρ equals unity and the DW value is zero. Under the null that Yt is
a random walk, Yt = Yt−1 + Vt, the estimated first-order autocorrelation of
the series approaches one, so the DW value approaches 0. A DW value
significantly different from zero rejects the
hypothesis that Yt is a random walk and I(1), in favor of the alternative that Yt is
not I(1), and perhaps I(0). The test is limited by the assumption that Yt is a random
walk variable, and it is not suitable for integrated variables in general. The critical
value at the 5% level for the maintained hypothesis of I(1) versus I(0) is 0.17; a
higher value rejects I(1) (Bo Sjö, 2008).
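The behaviour of the DW statistic under a random walk can be illustrated with a short simulation. This is a minimal sketch, not the authors' procedure: the series are simulated, the statistic is computed on deviations from the sample mean (an assumption about how the test is set up), and all function names are illustrative.

```python
import random

def demean(series):
    """Deviations of the series from its sample mean."""
    mean = sum(series) / len(series)
    return [v - mean for v in series]

def durbin_watson(series):
    """Sum of squared first differences over the sum of squares."""
    e = demean(series)
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v * v for v in e)

def first_order_autocorr(series):
    """Estimated first-order autocorrelation (rho-hat)."""
    e = demean(series)
    return sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e)

rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]  # white noise, I(0)

random_walk, level = [], 0.0
for v in shocks:  # Y_t = Y_{t-1} + V_t, I(1)
    level += v
    random_walk.append(level)

dw_walk = durbin_watson(random_walk)
dw_noise = durbin_watson(shocks)
approx_noise = 2 * (1 - first_order_autocorr(shocks))

print(f"DW, random walk: {dw_walk:.3f}")
print(f"DW, white noise: {dw_noise:.3f} (2(1 - rho-hat) = {approx_noise:.3f})")
```

For the random walk the statistic sits close to zero, while for white noise it is close to 2 and nearly coincides with 2(1 − ρ̂).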
72 Autoregressive Distributed Lag (ARDL) cointegration technique
In practice, we test the hypothesis that ρ = 0. If ρ = 0, “α” in equation 3.2 is
equal to 1, meaning that we have a unit root; the series under
consideration is therefore non-stationary. Where ρ < 0, the time series is
stationary with zero mean, and in the case of 3.4 the series Yt is stationary around
a deterministic trend. If ρ > 0 (that is, α > 1), the underlying variable is
explosive.
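The DF logic can be sketched as a regression of ΔYt on Yt−1 without constant or trend, where the coefficient on Yt−1 is ρ = α − 1. The code below is a sketch on simulated AR(1) data. The function names are illustrative, and the value −1.95 is the approximate 5% Dickey-Fuller critical value for the no-constant case, quoted here as an assumption that should be checked against published tables.

```python
import random

def df_regression(y):
    """Regress dY_t on Y_{t-1} (no constant, no trend); return (rho_hat, t_ratio)."""
    lagged = y[:-1]
    diff = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diff)) / sxx
    resid = [d - rho * v for v, d in zip(lagged, diff)]
    s2 = sum(e * e for e in resid) / (len(diff) - 1)
    return rho, rho / (s2 / sxx) ** 0.5

def simulate_ar1(alpha, n, seed):
    """Y_t = alpha * Y_{t-1} + u_t with standard normal u_t."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(alpha * y[-1] + rng.gauss(0.0, 1.0))
    return y

DF_CRITICAL_5PCT = -1.95  # assumed approximate value, no-constant case

rho_stationary, t_stationary = df_regression(simulate_ar1(0.5, 500, seed=1))
rho_walk, t_walk = df_regression(simulate_ar1(1.0, 500, seed=2))

for label, rho, t in [("alpha = 0.5", rho_stationary, t_stationary),
                      ("alpha = 1.0", rho_walk, t_walk)]:
    verdict = "reject unit root" if t < DF_CRITICAL_5PCT else "cannot reject unit root"
    print(f"{label}: rho_hat = {rho:.3f}, t = {t:.2f}, {verdict}")
```

With α = 0.5 the estimate of ρ is close to −0.5 and the t-ratio is far below the critical value; with α = 1 the estimate of ρ is close to zero.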
However, the DF test as in (3.3) or (3.4) assumes that Ut is
uncorrelated. Where the error terms (Ut) are correlated, the Augmented
Dickey-Fuller (ADF) test is used instead, since it adjusts the DF test for
possible autocorrelation in the error terms (Ut) by adding lagged difference
terms of the dependent variable, ∆Yt.
3.3 The Augmented Dickey-Fuller (ADF) (1981) tests for Unit Root
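Restrictive ADF Model:
$$\Delta Y_t = \rho_1 Y_{t-1} + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t \qquad (3.7)$$
Restrictive ADF Model:
$$\Delta Y_t = \rho_1 Y_{t-1} + \alpha_2 T + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t \qquad (3.8)$$
General ADF Model:
$$\Delta Y_t = \alpha_0 + \rho_1 Y_{t-1} + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t \qquad (3.9)$$
General ADF Model:
$$\Delta Y_t = \alpha_0 + \rho_1 Y_{t-1} + \alpha_2 T + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t \qquad (3.10)$$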
ut is a pure white noise error term and ∆Yt-1 = (Yt-1 − Yt-2), ∆Yt-2 = (Yt-2 − Yt-3), etc.
The number of lagged difference terms to be included is often determined
empirically, the aim being to include enough terms that the error terms in (3.5)
and (3.6) are serially uncorrelated. k is the number of lagged values of ∆Y included to
control for higher-order correlation, assuming that the series follows an AR(p)
process. In the ADF test, ρ = 0 is
still tested, and the statistic follows the same asymptotic distribution as the DF
statistic. H0: ρ1 = 0 (Yt ∼ I(1)), against Ha: ρ1 < 0 (Yt ∼ I(0)).
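An ADF regression with a constant and lagged differences, in the spirit of (3.9), can be sketched in plain Python as follows. The data are a simulated stationary AR(2) process and the helper names are illustrative. The value −2.86 is the approximate large-sample 5% critical value for the constant-only case, stated here as an assumption rather than taken from the paper.

```python
import random

def ols(X, y):
    """OLS by the normal equations; returns coefficients and standard errors."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | I], reduced by Gauss-Jordan to [I | inverse of X'X].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    inv = [row[k:] for row in A]
    xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    ssr = sum((yt - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yt in zip(X, y))
    s2 = ssr / (n - k)
    return beta, [(s2 * inv[i][i]) ** 0.5 for i in range(k)]

def adf_t_stat(y, lags):
    """t-ratio on the lagged level in a regression with constant and lagged differences."""
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # dy[t] = y[t+1] - y[t]
    X = [[1.0, y[t]] + [dy[t - i] for i in range(1, lags + 1)]
         for t in range(lags, len(dy))]
    beta, se = ols(X, dy[lags:])
    return beta[1] / se[1]

rng = random.Random(42)
y = [0.0, 0.0]
for _ in range(500):  # stationary AR(2): roots outside the unit circle
    y.append(0.5 * y[-1] + 0.2 * y[-2] + rng.gauss(0.0, 1.0))

ADF_CRITICAL_5PCT = -2.86  # assumed approximate value, constant-only case
t_stat = adf_t_stat(y, lags=1)
print(f"ADF t-ratio = {t_stat:.2f}; reject H0 (unit root): {t_stat < ADF_CRITICAL_5PCT}")
```

One lagged difference is enough here because an AR(2) process in levels corresponds to a single lagged difference in the ADF form.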
In practice, a DF or ADF statistic that is less negative than its critical value
(smaller in absolute value) shows that the underlying series is non-stationary.
Conversely, a DF or ADF statistic that is more negative than its critical value
(greater in absolute value) shows that the underlying series is stationary.
However, a failure to reject the null hypothesis of non-stationarity should not be
treated as conclusive, since the power of the ADF test is low. The decision can be
verified using other related tests, such as the Kwiatkowski-Phillips-Schmidt-Shin
(1992) (KPSS) test or the Phillips-Perron (PP) test. The PP test has the same null
hypothesis as the ADF test, and its asymptotic distribution is the same as that of the
ADF test statistic. In the case of the KPSS test, the null hypothesis is different: it
assumes stationarity of the variable of interest. The results of the KPSS test are also
reported differently from those of the ADF test, as KPSS does not provide a
p-value but gives critical values instead. The test statistic
is compared with the critical value at the desired significance level. If the test
statistic is higher than the critical value, we reject the null hypothesis; if it
is lower than the critical value, we cannot reject the null hypothesis.
When the tests conflict, the choice depends on the researcher's aim and
objective. In general, the null hypothesis of the ADF test is that the series is
non-stationary, while that of the KPSS test is that the series is stationary. As for
serial correlation, the PP test corrects for it non-parametrically, while the
ADF test corrects for it parametrically.
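Because the ADF and KPSS tests have opposite null hypotheses, their outcomes can be combined into a simple decision table. The sketch below is one common reading, not a rule stated by the authors; the function names are illustrative, and the statistics and critical values in the usage lines are hypothetical inputs that a practitioner would take from their own test output and tables.

```python
def adf_rejects_unit_root(statistic, critical_value):
    """ADF is left-tailed: reject H0 (unit root) when the statistic is more negative."""
    return statistic < critical_value

def kpss_rejects_stationarity(statistic, critical_value):
    """KPSS is right-tailed: reject H0 (stationarity) when the statistic is higher."""
    return statistic > critical_value

def combine(adf_reject, kpss_reject):
    """Joint reading of the two tests."""
    if adf_reject and not kpss_reject:
        return "stationary"
    if kpss_reject and not adf_reject:
        return "unit root"
    if adf_reject and kpss_reject:
        return "conflict: both nulls rejected"
    return "inconclusive: neither null rejected"

# Hypothetical statistics and critical values, for illustration only.
verdict_a = combine(adf_rejects_unit_root(-3.50, -2.86),
                    kpss_rejects_stationarity(0.20, 0.46))
verdict_b = combine(adf_rejects_unit_root(-1.20, -2.86),
                    kpss_rejects_stationarity(0.90, 0.46))
print(verdict_a)
print(verdict_b)
```

The two remaining cells of the table are the ones where, as noted above, the decision depends on the researcher's aim and objective.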
The test can also be performed on variables in first differences as a test for I(2).
Under the null, ρ̂1 will be negatively biased in a limited sample, unless yt is
explosive. A significant positive value implies an explosive process, which can be a
very difficult alternative hypothesis to handle. When testing for I(2) or
differencing twice, a trend term is not a plausible alternative; the two interesting
models here are the ones with and without a constant term. Furthermore, the lag
length in the augmentation can also be assumed to be shorter.
4 Cointegration Test
Modeling time series so as to keep their long-run information intact can be
done through cointegration. Granger (1981) and Engle and Granger (1987) were
the first to formalize the idea of cointegration, providing tests and an estimation
procedure to evaluate the existence of a long-run relationship between a set of
variables within a dynamic specification framework. A cointegration test examines
how time series, which may be individually non-stationary and drift
extensively away from equilibrium, can be paired such that the workings of
equilibrium forces ensure they do not drift too far apart. That is, cointegration
involves a stationary linear combination of variables that are individually
non-stationary but integrated of some order, I(d). Cointegration is an econometric
concept that mimics the existence of a long-run equilibrium among underlying
economic time series that converge over time. Thus, cointegration establishes a
stronger statistical and economic basis for the empirical error correction model,
which brings together short- and long-run information in modeling variables. Testing
for cointegration is a necessary step in establishing whether a model empirically
exhibits meaningful long-run relationships. If cointegration among the
underlying variables cannot be established, it becomes imperative to work with
the variables in differences instead; however, the long-run information will then be
missing. There are several tests of cointegration other than the Engle and Granger
(1987) procedure,
becomes the alternative. The next sections expose the requirement for using this
approach and its application.
The ARDL(p, q1, q2, ..., qk) model specification is given as follows:
$$\Phi(L,p)\,y_t = \sum_{i=1}^{k} \beta_i(L,q_i)\,x_{it} + \delta w_t + u_t \qquad (4.1)$$
where
$$\Phi(L,p) = 1 - \Phi_1 L - \Phi_2 L^2 - \dots - \Phi_p L^p$$
$$\beta_i(L,q_i) = 1 - \beta_{i1} L - \beta_{i2} L^2 - \dots - \beta_{iq_i} L^{q_i}, \quad i = 1, 2, \dots, k, \qquad u_t \sim iid(0, \delta^2).$$
L is a lag operator such that L^0 yt = yt and L^1 yt = yt-1, and wt is an s × 1 vector of
deterministic variables such as the intercept term, time trends, and seasonal dummies,
or exogenous variables with fixed lags. With p = 0, 1, 2, ..., m, q = 0, 1, 2, ..., m, and
i = 1, 2, ..., k, there is a total of (m+1)^(k+1) different ARDL models. The maximum
lag order, m, is
chosen by the user. The sample period is t = m+1, m+2, ..., n.
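The count of candidate models follows from letting each of the k + 1 lag orders (p and q1, ..., qk) range over 0, 1, ..., m. A minimal sketch of the enumeration, with an illustrative function name and hypothetical values of m and k:

```python
from itertools import product

def ardl_lag_grid(max_lag, n_regressors):
    """All (p, q1, ..., qk) combinations with every order in 0..max_lag."""
    return list(product(range(max_lag + 1), repeat=n_regressors + 1))

# Hypothetical setting: maximum lag m = 2 and k = 3 regressors.
grid = ardl_lag_grid(max_lag=2, n_regressors=3)
print(len(grid), "candidate ARDL specifications")
print("first:", grid[0], "last:", grid[-1])
```

In applied work each specification in such a grid would be estimated and compared using a model selection criterion chosen by the user.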
OR
The ARDL(p,q) model specification:
$$\Phi(L)\,y_t = \varphi + \theta(L)\,x_t + u_t \qquad (4.2)$$
with
$$\Phi(L) = 1 - \Phi_1 L - \dots - \Phi_p L^p,$$
$$\theta(L) = \beta_0 - \beta_1 L - \dots - \beta_q L^q.$$
To determine whether the above requirements are met or not see section 4.3.
This sub-section explores how one determines whether the above requirements
are met.
Step 1: Determination of the Existence of the Long Run Relationship of the
Variables
At the first stage, the existence of a long-run relationship among the variables
under investigation is tested by computing the bound F-statistic (bound test for
cointegration). The bound F-statistic is computed for each of the variables in turn,
treating it as the endogenous variable while the others are assumed to be exogenous.
In practice, testing the relationship between the forcing variable(s) in the
ARDL model leads to hypothesis testing of the long-run relationship among the
underlying variables. In doing this, current values of the underlying variable(s) are
excluded from the ARDL model approach to cointegration.
This approach is illustrated by using an ARDL(p,q) regression with an I(d)
regressor,
$$y_t = \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \theta_0 x_t + \theta_1 x_{t-1} + \dots + \theta_q x_{t-q} + u_{1t} \qquad (4.4)$$
or
$$x_t = \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + \theta_0 y_t + \theta_1 y_{t-1} + \dots + \theta_q y_{t-q} + u_{2t} \qquad (4.5)$$
For convenience, deterministic regressors such as a constant and a linear time
trend are not included. Here the $\phi_i$ and $\theta_j$ are unknown parameters, and $x_t$ (or $y_t$)
is an I(d) process generated by
$$x_t = x_{t-1} + \varepsilon_t$$
or
$$y_t = y_{t-1} + \varepsilon_t,$$
where $u_t$ and $\varepsilon_t$ are uncorrelated at all lags, so that $x_t$ (or $y_t$) is strictly exogenous with
respect to $u_t$, and $\varepsilon_t$ is a general linear stationary process.
(Cointegration/stability condition) $|\phi| < 1$, so that the model is dynamically stable.
This assumption is similar to the stationarity condition for an AR(1) process and
implies that there exists a stable long-run relationship between $y_t$ ($x_t$) and $x_t$ ($y_t$). If
$\phi = 1$, then there would be no long-run relationship. In practice, this can also be
denoted as follows:
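The ARDL(p, q_1, q_2, ..., q_k) model approach to cointegration testing:
$$\Delta X_t = \delta_{01} + \sum_{i=1}^{k} \alpha_{1i}\,\Delta X_{t-i} + \sum_{i=1}^{k} \alpha_{2i}\,\Delta Y_{t-i} + \delta_1 X_{t-1} + \delta_2 Y_{t-1} + v_{1t} \qquad (4.6)$$
$$\Delta Y_t = \delta_{02} + \sum_{i=1}^{k} \alpha_{1i}\,\Delta Y_{t-i} + \sum_{i=1}^{k} \alpha_{2i}\,\Delta X_{t-i} + \delta_1 Y_{t-1} + \delta_2 X_{t-1} + v_{2t} \qquad (4.7)$$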
k is the ARDL model maximum lag order and chosen by the user. The F-statistic
is carried out on the joint null hypothesis that the coefficients of the lagged
Emeka Nkoro and Aham Kelvin Uko 81
variables ($\delta_1 X_{t-1}$, $\delta_2 Y_{t-1}$ or $\delta_1 Y_{t-1}$, $\delta_2 X_{t-1}$) are zero. The coefficients $\delta_1$ and $\delta_2$
correspond to the long-run relationship, while $\alpha_1$ and $\alpha_2$ represent the short-run
dynamics of the model.
The hypothesis that the coefficients of the lagged level variables are zero is to be
tested.
The null of non-existence of the long-run relationship is defined by:
H0: δ1 = δ2 = 0 (null, i.e. the long-run relationship does not exist)
H1: δ1 ≠ 0 or δ2 ≠ 0 (alternative, i.e. the long-run relationship exists)
This is tested in each of the models as specified by the number of variables.
This can also be denoted as follows:
$F_X(X_1 \mid Y_1, \dots, Y_k)$ (4.8)
$F_Y(Y_1 \mid X_1, \dots, X_k)$ (4.9)
The hypothesis is tested by means of the F-statistic (Wald test) in equations 4.8 and
4.9, respectively. The distribution of this F-statistic is non-standard, irrespective of
whether the variables in the system are I(0) or I(1). The critical values of the
F-statistic for different numbers of variables (k), and for whether the ARDL model
contains an intercept and/or trend, are available in Pesaran and Pesaran (1996a) and
Pesaran et al. (2001). They give two sets of critical values. One set, the lower
critical bound, assumes that all the variables in the ARDL model are I(0) (meaning
that there is no cointegration among the underlying variables); the other, the upper
critical bound, assumes that all the variables are I(1) (meaning that there is
cointegration among the underlying variables). For each application, there is a band
covering all the possible classifications of the variables into I(0) and I(1). However,
according to Narayan (2005), the existing critical values in Pesaran et al. (2001)
cannot be applied to small sample sizes, as they are based on large sample sizes.
Hence, Narayan (2005) provides a set of critical values for small sample sizes,
ranging from 30 to 80 observations. The critical values are 2.496 - 3.346, 2.962 -
3.910, and 4.068 - 5.250 at the 90%, 95%, and 99% levels, respectively.
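The bounds F-statistic described above can be sketched with ordinary least squares: fit the unrestricted equation (with the lagged levels) and the restricted one (without them), then compare residual sums of squares. This is a minimal illustration for two variables under stated assumptions: the function names `_rss` and `bounds_f_stat`, the inclusion of an intercept, the lag order k = 1 and the simulated data are all choices made here for illustration, not the authors' implementation.

```python
import numpy as np


def _rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def bounds_f_stat(y, x, k=1):
    """F-statistic for H0: delta1 = delta2 = 0 in
    dY_t = c + sum_i a1_i dY_{t-i} + sum_i a2_i dX_{t-i}
           + delta1 * Y_{t-1} + delta2 * X_{t-1} + v_t,  i = 1..k.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    dy, dx = np.diff(y), np.diff(x)      # dy[j] = y[j + 1] - y[j]
    t = np.arange(k + 1, len(y))         # time indices with all lags available
    dep = dy[t - 1]                      # dY_t
    short_run = [dy[t - 1 - i] for i in range(1, k + 1)]
    short_run += [dx[t - 1 - i] for i in range(1, k + 1)]
    restricted = np.column_stack([np.ones(len(t))] + short_run)
    unrestricted = np.column_stack([restricted, y[t - 1], x[t - 1]])
    rss_r, rss_u = _rss(dep, restricted), _rss(dep, unrestricted)
    q = 2                                # number of restrictions under H0
    df = len(dep) - unrestricted.shape[1]
    return ((rss_r - rss_u) / q) / (rss_u / df)


# Illustrative data: x is a random walk and y is cointegrated with it by construction.
rng = np.random.default_rng(0)
x = np.cumsum(0.2 * rng.standard_normal(400))
y = x + rng.standard_normal(400)
print(round(bounds_f_stat(y, x, k=1), 2))
```

The resulting value must be compared with the bounds critical values rather than with the standard F table, because the distribution under the null is non-standard.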
If the relevant computed F-statistic for the joint significance of the level
variables, δ1 and δ2, in each of the equations (4.6 and 4.7) falls outside this band, a
conclusive decision can be made without the need to know whether the underlying
variables are I(0) or I(1), or fractionally integrated. That is, when the computed
F-statistic is greater than the upper bound critical value, H0 is rejected (the
variables are cointegrated). If the F-statistic is below the lower bound critical value,
H0 cannot be rejected (there is no cointegration among the variables). If
long-run relationships (or multiple long-run relationships) exist in both equations
(4.8 and 4.9), the ARDL approach cannot be applied; hence, the Johansen and Juselius
(1990) approach becomes the alternative.
If the computed statistic falls within the critical value band (between the lower and
upper bounds), the result of the inference is inconclusive and depends on
whether the underlying variables are I(0) or I(1). It is at this stage in the analysis
that the investigator may have to carry out unit root tests on the variables (Pesaran
and Pesaran, 1996a). Also, if the variables are I(2), the computed F-statistics of the
bounds test are rendered invalid because they are based on the assumption that the
variables are I(0) or I(1) or mutually cointegrated (Chigusiwa et al., 2011).
However, to forestall an exercise in futility, it may be advisable, though not strictly
necessary, to perform unit root tests first in order to ensure that none of the
variables is I(2) or beyond before carrying out the bounds F-test.
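The three-way decision rule just described (reject, fail to reject, inconclusive) can be written as a small helper. This is a sketch only: the function name `bounds_decision` and its return labels are hypothetical, and the example band 2.962 to 3.910 is the 95% pair quoted in the text from Narayan (2005); the appropriate band in an application depends on the sample size, the number of regressors and the deterministic terms.

```python
def bounds_decision(f_stat, lower, upper):
    """Classify a bounds F-statistic against a lower/upper critical band."""
    if f_stat > upper:
        return "cointegrated"        # H0 of no long-run relationship rejected
    if f_stat < lower:
        return "not cointegrated"    # H0 cannot be rejected
    return "inconclusive"            # inside the band: order of integration matters


LOWER_95, UPPER_95 = 2.962, 3.910   # 95% band quoted in the text
for f in (5.1, 3.4, 1.2):
    print(f, bounds_decision(f, LOWER_95, UPPER_95))
```

An inconclusive outcome is the case in which unit root pre-testing of the individual series becomes informative.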
Step 2: Choosing the Appropriate Lag Length for the ARDL Model/
Estimation of the Long Run Estimates of the Selected ARDL Model
If a long-run relationship exists between the underlying variables, while the
hypothesis of no long-run relations between the variables in the other equations
cannot be rejected, then the ARDL approach to cointegration can be applied. Finding
the appropriate lag length for each of the underlying variables in the
ARDL model is very important because we want Gaussian error terms (i.e.
normally distributed error terms that do not suffer from non-normality, autocorrelation,
heteroskedasticity, etc.). In order to select the appropriate model of the long run
The Xs ($X_{1t}, X_{2t}, X_{3t}, \dots, X_{nt}$) are the explanatory or long-run forcing variables, and k
is the optimum lag order.
The best-performing model provides the estimates of the associated Error Correction
Model (ECM).
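$$y_{t-s} = y_{t-1} - \sum_{j=1}^{s-1} \Delta y_{t-j}, \qquad s = 1,2,\dots,p$$
and similarly,
$$w_t = \Delta w_t + w_{t-1}$$
$$x_{it} = \Delta x_{it} + x_{i,t-1}$$
$$x_{i,t-s} = x_{i,t-1} - \sum_{j=1}^{s-1} \Delta x_{i,t-j}, \qquad s = 1,2,\dots,q_i$$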
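As a quick check of the lag identity $y_{t-s} = y_{t-1} - \sum_{j=1}^{s-1} \Delta y_{t-j}$ used in this reparameterization, the case s = 3 can be worked out by hand. The example is an illustration added here, using only the definition $\Delta y_t = y_t - y_{t-1}$:

```latex
\begin{aligned}
y_{t-1} - \sum_{j=1}^{2} \Delta y_{t-j}
  &= y_{t-1} - \Delta y_{t-1} - \Delta y_{t-2} \\
  &= y_{t-1} - (y_{t-1} - y_{t-2}) - (y_{t-2} - y_{t-3}) \\
  &= y_{t-3}.
\end{aligned}
```

The sum telescopes, so every lagged level can be replaced by the first lag minus a sum of lagged differences, which is what turns the ARDL model in levels into an error correction form.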
Recall that $\phi(1,\hat{p}) = 1 - \hat{\phi}_1 - \hat{\phi}_2 - \dots - \hat{\phi}_{\hat{p}}$ measures the quantitative importance
of the error correction term. The remaining coefficients, $\hat{\phi}_j$ and $\beta_{ij}$, relate to the
short-run dynamics of the model’s convergence to equilibrium. $EC_t$ denotes the residuals
obtained from the estimated cointegration model of equations 4.6 and 4.7.
The ARDL model and its associated ECM can be estimated by the OLS
method.
• If the F-statistic (Wald test) establishes that there is a single long-run
relationship and the sample size is small (n ≤ 30) or finite, the ARDL
error correction representation becomes relatively more efficient.
• The ARDL model is reparameterized into ECM when there is one
cointegrating vector among the underlying variables. The reparameterized
result gives the short-run dynamics and long run relationship of the
underlying variables.
• When there are multiple long-run relationships, ARDL approach cannot be
applied. Hence, an alternative approach like Johansen and Juselius (1990)
becomes more appropriate.
Munich Personal RePEc Archive
10 January 2018
Online at https://mpra.ub.uni-muenchen.de/83973/
MPRA Paper No. 83973, posted 19 Jan 2018 02:37 UTC
ARDL model as a remedy for spurious regression: problems,
performance and prospectus
(1) Ghulam Ghouse
Ghouserazaa786@gmail.com
PhD scholar (Department of Econometrics and Statistics)
Pakistan Institute of Development Economics, Islamabad, Pakistan.
econometrics and have developed many tools employed in applied macroeconomics. The
time series. While reviewing the well-established study of Granger and Newbold (1974), we realized
that the experiments in that paper lacked lag dynamics, thus leading to spurious
regression. As a result of that paper, in conventional econometrics unit root and cointegration
analysis have become the only ways to circumvent spurious regression. These procedures are
equally capricious because of specification decisions such as the choice of the deterministic
part, structural breaks, the autoregressive lag length and the innovation process distribution. This
study explores an alternative treatment for spurious regression. We conclude that missing
variables (lag values) are the major cause of spurious regression; an alternative way
to look at the problem therefore takes us back to the missing variable, which in turn
leads to the ARDL model. The study mainly focuses on Monte Carlo simulations. The results
justify using the ARDL model as an alternative tool to avoid the spurious
regression problem.
1. Introduction
The most important feature that led to the development of the new time series econometrics was spurious
regression. Spurious regression is a phenomenon known to econometricians since the time of Yule
(1926). It was attributed to a missing variable until Granger and Newbold (1974)
showed that spurious regression could be found with nonstationary time series even with no
missing variable. Nelson and Plosser (1982) argued that most time series are better
characterized as nonstationary. Spurious regression has played a vital role in the construction
of contemporary time series econometrics and has led to many tools employed in applied
macroeconomics. However, the literature widely treats non-stationarity as the only
reason for spurious regression. To evade the problem of spurious regression caused by the non-
Supposing that spurious regression occurs due to non-stationarity, and that unit root and
cointegration testing are used as the remedy, even then it is very hard to obtain reliable inference.
There is no unit root test with good size and power in small samples. The unit root and
cointegration procedures involve many prior specification decisions, e.g. lag length, trend and
structural stability. If these decisions are made on the basis of the data, a large battery of
tests is involved, each with its own statistical error (type I and type II). The cumulative probability of
error across all the tests leaves the results of unit root testing unreliable. Because of these reasons, the literature
relevant variable is a major cause of spurious regression. It can even be shown that the spurious
regression in the Granger and Newbold (1974) experiment was also due to a missing variable (see
section 5.1).
So, an alternative way to look at the problem of spurious regression takes us back to the missing
variable, which in turn leads us to ARDL. Suppose we have two independent autoregressive
nonstationary series
the construction of both variables. Granger and Newbold (1974) showed that the spurious regression
But we know that the true data generating process (DGP) of Y and X contains lag values, including
which is an ARDL model. It is observed in our study (section 4) that this kind of model
significantly reduces the probability of spurious regression in the case of nonstationary series. This
indicates that spurious regression occurs due to a missing variable and can be avoided by including
The objective of this study is to explore an alternative solution that is expected to perform well for
nonstationary series. The study investigates whether it is possible to use the ARDL model to evade
spurious regression, bypassing the complicated and ambiguous unit root testing,
cointegration analysis and other treatments. We generate autoregressive (nonstationary,
stationary and negative moving average) series and use Monte Carlo simulations to investigate
how the probability of spurious regression increases dramatically in the nonstationary case
when the lag dynamics are ignored.
2. Literature review
A large number of studies on spurious regression are available in the time series
econometric literature. In this section we briefly discuss the theoretical and empirical
methods proposed in the literature for the treatment of spurious regression. The literature review is arranged as
follows
There is a long historical debate on nonsense correlation (spurious regression) in the econometrics
literature, dating back at least to the well-known study of Yule (1926). In his study, he reported
a strong correlation of 0.95 between the mortality rate and the proportion of Church of England
marriages to all marriages during 1866 to 1911. Yule (1926) thought that the spurious
Simon (1954) also supported the idea that a missing variable is a source of spurious correlation.
Simon explained that if we are uncertain whether the perceived correlation is spurious, we have to
significant. In their experiment they generated independent autoregressive series like, 𝑋𝑡 and 𝑌𝑡 .
and 𝑌𝑡 on 𝑋𝑡 .
Finding the relationships between economic variables is the core objective of economic studies.
Spurious regression offers deceptive statistical evidence of a strong relationship even though the
variables are independent. Hendry (1980) demonstrated a spurious correlation between cumulative
rainfall and the price level in the UK. He found that all these time series were stationary in differences
except the unemployment rate. Plosser and Schwert (1978) claimed that a regression run without
differencing nonstationary series will most probably produce invalid or nonsense results. The
reasoning behind this claim is that if we run a regression without differencing difference-stationary
series, the properties of the estimator and the distribution of the test statistics are no longer reliable.
Phillips (1986) examined the asymptotic properties of the spurious least squares regression model and
endorsed the Granger and Newbold (1974) simulation results that the misspecification of level of
Nominal economic variables are mostly correlated even when there is no relationship between them,
because the common presence of the price level in the data series induces correlation between them. It was
also shown that many time series are nonstationary, which is why the probability of spurious
regression is very high. We present here some examples of spurious regression from time
Chaouachi (2013) observed that Dar et al. (2012) reported a spurious strong positive
relationship between nass chewing, hookah smoking and many other habits and the risk of
oesophageal squamous cell carcinoma (ESCC). Dar et al. (2012) conducted a case-control
study in the valley of Kashmir, India. They considered 702 historical cases of ESCC
and 1663 hospital-based controls, matched to the cases for sex,
age and district of residence, using monthly data from September 2008 to
January 2012. They concluded that nass chewing and hookah smoking are strongly positively
associated with ESCC risk, a conclusion that rests on severe misinterpretation. According to Chaouachi
(2013), all the relevant studies showed a feeble or insignificant association of nass
chewing and hookah smoking with ESCC risk. Chaouachi (2013) stated that Dar et al. (2012) came
up with spurious results because they did not incorporate the very significant element which is
Roger and Jupp (2006) described an example of a spurious positive relationship between human
births and stork nesting in spring, which arises because the two variables are correlated
with a third variable. According to Roger and Jupp (2006), a series of Dutch statistics
shows a positive relationship between stork nesting in spring and human
births at that time, because both variables are associated with the state of the weather. The two
variables are independent of each other, but each is related to the weather, so
they are spuriously correlated because of a missing third variable. According to
Hofer et al. (2004), this spurious correlation is due to a lack of statistical information.
Nelson and Plosser (1982) found that most macroeconomic series of the U.S. economy
have a unit root. Their study is generally acknowledged as a significant contribution with
consequences for theory and policy. They applied the Dickey-Fuller test for a unit root
to fourteen historical macroeconomic series for the U.S. economy, including GNP, wages,
employment, prices, stock prices and the interest rate, and found that twelve of the fourteen series
had a unit root. The Nelson and Plosser (1982) study is a noteworthy contribution to the time
series econometric literature, which increased the interest of researchers in unit root tests. That’s
Engle and Granger (1987) introduced the cointegration technique as a solution to spurious
regression due to nonstationary time series. According to Granger, nonstationary time series
are cointegrated if a linear combination of them is a stationary process. To estimate the
parameters of the long-run equilibrium relationship, Engle and Granger presented
an Error Correction Mechanism, in which the residuals of the equilibrium regression are used in the error
correction model. The first drawback of the EG (Engle and Granger) cointegration test is that it
deals with only one cointegrating vector. Second, it depends on a two-step estimator: the first step
produces the series of residuals and the second checks the stationarity of that residual series. Third,
and the major limitation, the distributions of the estimators are non-standard. Phillips and Ouliaris (1990)
proposed residual-based tests under the null hypothesis of no cointegration, in which
the asymptotic distributions of the residual-based tests depend on the number of variables and the
deterministic trend terms. Engle and Yoo (1991) proposed a three-step procedure, an extension
of the EG model, to evade the limitations of the EG model. The Engle and Yoo (EY) procedure
ensures that the distributions of the estimators are normal. It is also only useful
When we have more than two variables, there is the possibility of more than one cointegrating
vector. The EG and EY cointegration procedures do not provide any solution in this situation. To overcome this
problem, Johansen and Juselius (1992) introduced the multivariate cointegration test. The Johansen
and Juselius (JJ) test can detect more than one cointegrating vector, so it is more generally
applicable than the EG and EY cointegration tests. The EG and EY single-equation
procedures ignore short-run dynamics when the relationships are estimated, whereas the JJ procedure
also considers the short-run dynamics. Pesaran et al. (1996) and Pesaran (1997) proposed a single
and EY. The first advantage is that the ARDL cointegration approach provides explicit tests for the
presence of a single cointegrating vector, instead of assuming uniqueness. Pesaran and Shin
(1995) showed that asymptotically valid inference on short-run and long-run parameters can be
made using ordinary least squares estimates of the ARDL model. For this, the ARDL model order
is appropriately augmented to allow for contemporaneous correlation among the stochastic elements of the
Cointegration testing involves many specification decisions, which reduce the reliability of the
results. The existing cointegration testing procedures do not provide any reasonable criteria
for these specification decisions: the choice of the deterministic part, structural breaks,
the autoregressive lag length and the innovation process distribution. For further detail, see section 2.3.2.
It is a common misconception that spurious regression arises only because of a unit root.
A missing relevant variable is also a major cause of spurious regression. Yule (1926)
was the first to suggest that nonsense correlations could arise from a missing variable.
Simon (1954) argued that a missing variable is a cause of spurious correlation: if we are
uncertain whether an observed correlation is
spurious, we should introduce an additional variable that may reveal the true correlation.
Frey (2002) argued that spurious regression could probably be due to a missing variable.
It can even be shown that the spurious regression in the Granger and Newbold (1974) experiment was
also due to a missing variable. In their experiment they generated independent autoregressive series
𝑋𝑡 and 𝑌𝑡, each expressed by its own lag values.
missing in equation (11), and similarly one determinant of 𝑋𝑡, i.e. 𝑋𝑡−1, is missing in equation (12).
Taking these missing variables into account, the equation becomes
Therefore, equation (13) will not produce spurious regression if our supposition of a missing variable
The most familiar procedures for evading spurious regression are unit root and cointegration
testing. These methods are equally capricious because of specification decisions such as the choice
of the deterministic part, structural breaks, the autoregressive lag length and the innovation process
distribution (see section 2.3.1.1). Cointegration analysis, which is employed as a tool to avoid
spurious regression, also suffers from specification decision problems (see section 2.3.2). It
involves unit root testing, which is also unreliable. The tests of unit root are so unreliable that is
Numerous financial and economic series, such as stock prices, exchange rates and Gross Domestic
Product (GDP), exhibit nonstationary or trending behavior. It is unlikely that
accurate results can be obtained from trending series. The most common procedures to avoid spurious regression
are unit root and cointegration testing. These procedures are equally unreliable due to specification
decisions. Cointegration analysis, which is used as a tool to avoid spurious regression, suffers from
numerous problems. It involves unit root testing followed by testing for cointegration. The unit root
tests are so unreliable that it is very hard to conclude anything reasonable. US
GNP is the series used by a large number of researchers as a guinea pig for unit root tests;
nevertheless, nothing definite can be said about a unit root in this series. Rehman and Zaman (2008)
“Trend Stationary: Perron (1989), Zivot and Andrews (1992), Diebold and Senhadji (1996),
Difference stationary: Nelson and Plosser (1982), Murray and Nelson (2002), Kilian and Ohanian
(2002),
So, an important task in econometrics is to determine the most suitable form of trend in
a time series. The two common procedures for removing the trend from data are regression on a
time trend and differencing. The unit root testing procedure indicates which of the two can
be adopted to render the time series stationary. However, the precision and specification of unit root
procedures remain unresolved, even though the literature on unit root testing has
grown rapidly since the mid-eighties.
Rehman and Zaman (2008) found that the two main causes of the inadequate performance of
unit root tests are observational equivalence and model misspecification. They targeted
four specification decisions (the choice of the deterministic part, structural breaks, the autoregressive lag
length and the innovation process distribution) and examined their role in inference from unit root
tests. They found that these specification decisions seriously affect the performance of unit
root tests, and that the existing unit root tests do not provide any set criteria for
these decisions, which is why they produce unreliable results.
DeJong et al. (1992) found that the Choi and Phillips (1991) and Phillips and Perron (1988) unit root
procedures suffer from size distortion and low power in the presence of a moving average
(MA) component, while the Augmented Dickey-Fuller (ADF) test behaved well. Schwert (2002) found that the
Dickey-Fuller (1979, 1981) test is sensitive to the assumption that the data
generating process of the series is purely autoregressive (AR). When a moving average component
is present in the underlying process, the reported Dickey-Fuller distribution and the actual distribution
of the test statistic can be quite different. Many other unit root tests are being proposed, at some extent
Like unit root tests, cointegration testing also involves many specification decisions, which
reduce the reliability of the results. The existing cointegration testing procedures do not provide any
reasonable criteria for these specification decisions, and as a result their results are
unreliable.
For example, lag length specification is a significant practical question in the application of
any econometric analysis. In the case of a unit root test, if the lag length is too short, serial
correlation remains in the errors and the results will be biased; if the lag length is too large, the
power of the test is reduced. In the same way, cointegration tests are very sensitive to lag
length selection. Agunloye et al. (2014) found that the Engle-Granger (EG) cointegration test is
extremely sensitive to lag length. Carrasco et al. (2009) found that lag length
misspecification may significantly affect the cointegration results: under-specification
could undermine the cointegration results and over-specification may
diminish the power of the test. Similarly, trend specification is a very significant issue in the
econometric literature.
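One common data-based way to balance the two risks described above (too few lags leave serial correlation, too many cost power) is to pick the lag order that minimises an information criterion. The sketch below selects an autoregressive lag order by BIC on a common estimation sample. It is an illustration under stated assumptions: the function name `select_ar_lag`, the use of BIC, the maximum lag of 6 and the simulated AR(2) series are choices made here and are not taken from the studies cited.

```python
import numpy as np


def select_ar_lag(y, max_lag):
    """Return the AR lag order in 0..max_lag that minimises BIC."""
    y = np.asarray(y, dtype=float)
    T = len(y) - max_lag                 # same sample for every candidate order
    target = y[max_lag:]
    best_lag, best_bic = 0, np.inf
    for p in range(max_lag + 1):
        lags = [y[max_lag - i: len(y) - i] for i in range(1, p + 1)]
        X = np.column_stack([np.ones(T)] + lags)
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        bic = T * np.log(resid @ resid / T) + X.shape[1] * np.log(T)
        if bic < best_bic:
            best_lag, best_bic = p, bic
    return best_lag


# Illustrative stationary AR(2) series: y_t = 0.5 y_{t-1} - 0.5 y_{t-2} + e_t.
rng = np.random.default_rng(0)
e = rng.standard_normal(500)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.5 * y[t - 1] - 0.5 * y[t - 2] + e[t]
print(select_ar_lag(y, max_lag=6))
```

Holding the estimation sample fixed across candidate orders keeps the criterion values comparable; the same idea carries over to choosing the augmentation lags in a unit root or cointegration regression.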
Ahking (2002) found that when a deterministic linear time trend was included in Johansen’s
cointegration test it provided disproving results, while excluding the deterministic linear time
trend gave robust results. He also suggested that great care must be taken over trend specification
in cointegration analysis. Many studies on this issue are available in the literature, but most
of them reach different results. Leybourne and Newbold (2003) applied three cointegration tests to
independent integrated series, each with a structural break. They found cointegration
among them as long as the structural breaks were not properly treated. Choi et al. (2004) found that
economic models for cointegration often produce erroneous results. The main reason is that the
errors are unit root nonstationary because one of the variables has a nonstationary measurement error.
They stated that “If the money demand function is stable in the long-run, we have a cointegrating
regression when money is measured with a stationary measurement error but have a spurious
In the ARDL model the dependent variable is expressed in terms of the lagged and current values of the
independent variables and its own lagged values. Davidson et al. (1978) (DHSY
hereafter) proposed the ARDL methodology to model the UK consumption function. ARDL modelling normally starts from a reasonably
general and large dynamic model and progressively reduces its size and transforms variables by
imposing linear and non-linear restrictions (Charemza and Deadman, 1997). The autoregressive
distributed lag (ARDL) model is one of the most general dynamic unrestricted models in
The ARDL (1, 1) is the simplest form of ARDL model. Consider an ARDL (1, 1) model
Hendry and Richard (1983), Hendry, Pagan and Sargan (1984) and Charemza and Deadman (1997)
argued that by imposing restrictions we can obtain at least ten economically
interpretable models from the ARDL (1, 1) model. We give here some important cases of
restriction
1. 𝛽2 = 𝛽3 = 0 Static regression,
general specification taking into account the lag structure. Therefore it could give better results.
4. The Methodology
This study mainly relies on Monte Carlo simulations. The data are generated with
pre-decided specifications, and the probability of spurious regression is tested using classical
The data generating process in equation (18) can generate data for a wide range of scenarios.
Suppose 𝜃12 = 𝜃21 = 0 and 𝜌 = 0; the data generating process will then generate two independent
series, and it would be an indication of spurious regression if the regression of 𝑥𝑡 on 𝑦𝑡 turns out to be
series. If A is zero, the series is IID (independently and identically distributed). The value
Gross domestic product of thirty-seven countries (Albania, Antigua and Barbuda, Argentina,
Austria, Bahamas, Bahrain, Barbados, Belgium, Botswana, Brazil, Brunei Darussalam, Cabo
Verde, Canada, Comoros, Congo, Costa Rica, Denmark, Dominica, El Salvador, Fiji, Finland,
France, Gabon, Gambia, Germany, Grenada, Guinea-Bissau, Guyana, Honduras, Hong Kong, Iraq,
Iceland, Ireland, Israel, Italy, Kiribati and Luxembourg) from 1980 to 2014. We applied the
ADF unit root test and found that all the series are stationary in first differences. All the series
are statistically independent of each other. We regressed Antigua and Barbuda, Argentina, Austria,
Bahamas, Bahrain, Barbados, Belgium, Botswana, Brazil, Brunei Darussalam, Cabo Verde,
Canada, Comoros, Congo, Costa Rica, Denmark, Dominica, El Salvador, Fiji, Finland, France,
Gabon, Gambia, Germany, Grenada, Guinea-Bissau, Guyana, Honduras, Hong Kong, Iraq, Iceland,
Ireland, Israel, Italy, Kiribati and Luxembourg on Albania and found that all the regressions
produce significant results, even though all the series are independent of each other. As Table 1,
which reports the linear regression results, shows, all the GDP series have statistically
significant relations. The table reports the coefficient values, with P-values in parentheses.
The P-values indicate that all the relations are highly significant, even at the 1% level of significance.
Table 3 shows the residual analysis of the linear regression model. All the
autocorrelation test results are significant at the 1% level of significance. The LM test for heteroskedasticity
is significant in only 15 cases; that is, the residuals of only 15 of the 36 regressions
exhibit heteroskedasticity. Table 4 presents the residual analysis of the
ARDL model. The autocorrelation tests are insignificant at the 5% level, except for Argentina
and Brunei Darussalam, which are insignificant at the 1% level. The heteroskedasticity test statistics are
These results imply that the ARDL model reduced the probability of spurious regression
from 100% to approximately 5%. They also contradict the common misconception that spurious
regression arises only because of a unit root; a missing relevant variable is a
major cause of spurious regression. As we introduced the lag values the probability of spurious
regression, then the irrelevant variable acts as a proxy for the potential variable. It captures the effect
of the potential variables, and the results then appear significant. If we start with the ARDL model, it
overcomes the problem of the missing variable. It can even be shown that the results in the Granger
and Newbold (1974) experiments were significant only because of missing lag values (see section
5.1).
order 1 by using our data generating process given above, imposing the restrictions 𝜃12 = 𝜃21 = 0
and 𝜌 = 0, where 𝑋𝑡 and 𝑌𝑡 are both expressed by their own lag values and the coefficients of the lag
values are 𝜃1 = 𝜃2 = 1.
We use a sample size of 50 observations. We regress 𝑋𝑡 on 𝑌𝑡 by using simple linear regression
Monte Carlo simulation is used to obtain the results. We simulated the t-statistic of the X
variable 1000 times, and the results are shown in Figure 1. The vertical lines
indicate the asymptotic critical value at the 5% nominal level of significance, which is 1.96. A
large part of the distribution lies in the rejection region. The regression is tested at the 5%
nominal level of significance, but across 1000 simulations of the t-statistic for the coefficient
the probability of spurious regression rises from 5% to 67%. It means that we got 670 times
[Figure 1: distribution of the simulated t-statistics; vertical axis (density) from 0.01 to 0.08]
These spurious results are due to a missing variable: we did not include the lag values of the
variables as independent variables. If we include the lag values as independent variables,
the model becomes ARDL (1, 1). The ARDL (1, 1) model reduces the probability of spurious
regression to roughly the nominal level. The equation is as follows:
Figure 2 shows the distribution of the t-statistics for the coefficient of 𝑋𝑡 under the ARDL (1, 1) model. The
vertical lines indicate the asymptotic critical values at the 5% nominal level of significance,
±1.96. Only a small part of the distribution lies in the rejection region. With the regression
tested at the 5% nominal level of significance, across 1000 simulated t-statistics for the
coefficient the probability of spurious regression is approximately 5%. This suggests
that ARDL can be used as a treatment for spurious regression with nonstationary series. The same
experiments were done in Granger and Newbold (1974), and they did not consider the
[Figure 2: The distribution of t-statistics of coefficient after ARDL model; horizontal axis from -4 to 4, vertical axis (density) from 0.05 to 0.40]
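The two experiments described above (a static regression and an ARDL (1, 1) regression on unrelated order-1 series, with 50 observations and 1000 replications) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes pure random walks with standard normal innovations, regresses y on x (the text does not fix the direction unambiguously), and the names `t_stat` and `rejection_rates` are hypothetical helpers. Exact rejection rates depend on the seed and will not match the reported figures digit for digit.

```python
import numpy as np


def t_stat(y, X, col):
    """Return the OLS t-statistic of the coefficient in column `col` of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return beta[col] / np.sqrt(cov[col, col])


def rejection_rates(n_obs=50, n_rep=1000, crit=1.96, seed=0):
    """Share of replications with |t| > crit for static OLS and ARDL(1, 1)."""
    rng = np.random.default_rng(seed)
    static_hits = 0
    ardl_hits = 0
    for _ in range(n_rep):
        # Assumption: two independent random walks, so x has no true effect on y.
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))

        # Static regression: y_t on a constant and x_t.
        X_static = np.column_stack([np.ones(n_obs), x])
        static_hits += abs(t_stat(y, X_static, 1)) > crit

        # ARDL(1, 1): y_t on a constant, y_{t-1}, x_t and x_{t-1}.
        X_ardl = np.column_stack([np.ones(n_obs - 1), y[:-1], x[1:], x[:-1]])
        ardl_hits += abs(t_stat(y[1:], X_ardl, 2)) > crit
    return static_hits / n_rep, ardl_hits / n_rep


static_rate, ardl_rate = rejection_rates()
print(f"static OLS: {static_rate:.3f}, ARDL(1,1): {ardl_rate:.3f}")
```

In the ARDL (1, 1) regression the coefficient on the current value of x can be read as the coefficient on its first difference once the lagged level is included, which is why its t-statistic behaves close to a standard one here.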
order 2 using the data generating process given above, imposing the restrictions 𝜃12 = 𝜃21 = 0
and 𝜌 = 0, so that 𝑋𝑡 and 𝑌𝑡 are each expressed by their own lag values and the coefficients of lag
There is no third variable involved in the construction of either variable. We regressed 𝑋𝑡 on 𝑌𝑡 and
[Figure: The distribution of t-statistics for coefficient; vertical axis (density) from 0.0025 to 0.0225]
The vertical lines indicate the asymptotic critical values at the 5% nominal level of significance,
±1.96. A large part of the distribution lies in the rejection region. Although the regression
is tested at the 5% nominal level of significance, across 1000 simulated t-statistics for the
coefficient the probability of spurious regression is 92%. It means that the probability of
variables as independent variables. First, we include one lag of X and Y as
independent variables, so that the model becomes ARDL (1, 1). The equation is as follows:
Figure 4: The distribution of t-statistics for coefficient of X (t) after ARDL (1, 1)
[plot omitted; horizontal axis from -12.5 to 15.0, vertical axis (density) from 0.025 to 0.125]
Figure 4 shows the distribution of the t-statistics for the coefficient of X(t) after the ARDL (1, 1) model. The vertical lines
indicate the asymptotic critical values at the 5% nominal level of significance, ±1.96.
A large part of the distribution still lies in the rejection region. Although the regression is tested at the 5%
nominal level of significance, across 1000 simulated t-statistics for the coefficient the
actual level of significance is 50%. It means ARDL (1, 1) reduced the probability of spurious
of variables as independent variables. Next, we also include the second lag values of X and Y as
independent variables, so that the model becomes ARDL (2, 2). The equation is as follows:
Figure 5: The distribution of t-statistics for coefficient of X(t) after ARDL (2, 2)
[plot omitted; horizontal axis from -5 to 3, vertical axis (density) from 0.05 to 0.35]
Only a small part of the distribution lies in the rejection region. With the regression tested
at the 5% nominal level of significance, across 1000 simulated t-statistics for the coefficient
the probability of spurious regression is 7%, a distortion of only 2 percentage points: we
obtained significant results 70 times out of 1000 instead of 50 times out of 1000. This suggests that
ARDL can be used as a treatment for spurious regression in the case of time series of higher
integration order.
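The order-2 experiment can be sketched in the same way. The block below is an illustrative sketch under stated assumptions, not the authors' procedure: each series is built as a double cumulative sum of standard normal noise (one possible reading of the order-2 process), y is regressed on x, and `ardl_design` and `rejection_rate` are hypothetical helper names. Setting `p = 0` gives the static regression, while `p = 1` and `p = 2` give ARDL (1, 1) and ARDL (2, 2).

```python
import numpy as np


def t_stat(y, X, col):
    """Return the OLS t-statistic of the coefficient in column `col` of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return beta[col] / np.sqrt(cov[col, col])


def ardl_design(y, x, p):
    """Build the ARDL(p, p) regression: constant, y lags 1..p, x lags 0..p.

    Returns the dependent variable, the regressor matrix, and the column
    index of the contemporaneous x_t.
    """
    n = len(y)
    cols = [np.ones(n - p)]
    cols += [y[p - i:n - i] for i in range(1, p + 1)]
    cols += [x[p - i:n - i] for i in range(0, p + 1)]
    return y[p:], np.column_stack(cols), p + 1


def rejection_rate(p, n_obs=50, n_rep=1000, crit=1.96, seed=0):
    """Share of replications in which |t| on x_t exceeds `crit`."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_rep):
        # Assumption: independent series integrated of order 2.
        x = np.cumsum(np.cumsum(rng.standard_normal(n_obs)))
        y = np.cumsum(np.cumsum(rng.standard_normal(n_obs)))
        target, X, col = ardl_design(y, x, p)
        hits += abs(t_stat(target, X, col)) > crit
    return hits / n_rep


rates = {p: rejection_rate(p) for p in (0, 1, 2)}
for p, rate in rates.items():
    label = "static OLS" if p == 0 else f"ARDL({p},{p})"
    print(f"{label}: {rate:.3f}")
```

The pattern to look for is the one described in the text: a very high rejection rate for the static regression, a lower but still badly distorted rate with one lag, and a rate near the nominal level once two lags are included.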
6. Conclusion
Unit root and cointegration analysis are the only ways to circumvent the spurious regression
equally unreliable because of specification decisions such as the choice of autoregressive lag length,
the choice of the deterministic part, structural breaks, and the distribution of the innovation process. Having
reviewed an extensive body of the available literature and inferences, we conclude
that missing variables (lag values) are the major cause of spurious regression in all these
cases. An alternative way to look at the problem of spurious regression therefore takes us back to
the missing variable, which in turn leads to the ARDL model. The results are also providing
References
Cointegration Test: A Modified Koyck Mean Lag Approach Based on Partial Correlation.
Charemza, W. W., & Deadman, D. F. (1997). New directions in econometric practice. Books.
Choi, C. Y., Hu, L., & Ogaki, M. (2004). A spurious regression approach to estimating structural
of optimal lag length in cointegrated VAR models with weak form of common cyclical features.
Chaouachi, K. (2013). False positive result in study on hookah smoking and cancer in Kashmir:
measuring risk of poor hygiene is not the same as measuring risk of inhaling water filtered tobacco
smoke all over the world. British journal of cancer, 108(6), 1389.
Davidson, J. E., Hendry, D. F., Srba, F., & Yeo, S. (1978). Econometric modelling of the aggregate
time-series relationship between consumers' expenditure and income in the United Kingdom. The
DeJong, D. N., Nankervis, J. C., Savin, N. E., & Whiteman, C. H. (1992). The power problems of
unit root test in time series with autoregressive errors. Journal of Econometrics, 53(1-3), 323-343.
Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series
with a unit root. Journal of the American statistical association, 74(366a), 427-431.
Dickey, D. A., & Fuller, W. A. (1981). Likelihood ratio statistics for autoregressive time series
Engle, R. F., & Granger, C. W. (1987). Co-integration and error correction: representation,
Engle, R. and Yoo Sam (1991). Forecasting and Testing in Co-integrated Systems, In Engle and
Granger (eds.), Long Run Economic Relationships. Readings in Cointegration, Oxford University
Frey, B. S. (2002). Inspiring economics: Human motivation in political economy. Edward Elgar
Publishing.
Granger, C. W., & Newbold, P. (1974). Spurious regressions in econometrics. Journal of
Hashimzade, N., & Thornton, M. A. (Eds.). (2013). Handbook of research methods and
Hassler, U. (2003). Nonsense regressions due to neglected time-varying means. Statistical Papers,
44(2), 169-182.
Hendry, D. F., & Richard, J. F. (1983). The econometric analysis of economic time series.
Hendry, D. F., Pagan, A. R., & Sargan, J. D. (1984). Dynamic specification. Handbook of
econometrics, 2, 1023-1100.
Höfer, Thomas; Hildegard Przyrembel; Silvia Verleger (2004). New evidence for the Theory of
PPP and the UIP for UK. Journal of econometrics, 53(1-3), 211-244.
Leybourne, S. J., & Newbold, P. (2003). Spurious rejections by cointegration tests induced by
Nelson, C. R., & Plosser, C. R. (1982). Trends and random walks in macroeconomic time series:
Plosser, C. I., & Schwert, G. W. (1978). Money, income, and sunspots: measuring economic
relationships and the effects of differencing. Journal of Monetary Economics, 4(4), 637-660.
Perron, P. (1990). Testing for a unit root in a time series with a changing mean. Journal of Business
Phillips, P. C., & Perron, P. (1988). Testing for a unit root in time series regression. Biometrika,
335-346.
Pesaran, M. H., Shin, Y., & Smith, R. J. (1996). Testing for the' Existence of a Long-run
Pesaran, M. H. (1997). The role of economic theory in modelling the long run. The Economic
Pesaran, M. H., & Smith, R. (1995). Estimating long-run relationships from dynamic
Rehman, A. U., & Malik, M. I. (2014). The modified R a robust measure of association for time
Sapsford, Roger; Jupp, Victor, eds. (2006). Data Collection and Analysis. Sage. ISBN 0-7619-
4362-5.
Sun, Y. (2004). A convergent t-statistic in spurious regressions. Econometric Theory, 20(05), 943-
962.
Su, J. J. (2008). A note on spurious regressions between stationary series. Applied Economics
Schwert, G. W. (2002). Tests for unit roots: A Monte Carlo investigation. Journal of Business &
study in sampling and the nature of time-series. Journal of the royal statistical society, 89(1), 1-
63.
Introduction ARDL model EC representation Bounds testing Postestimation Further topics Summary
S. Kripfganz and D. C. Schneider ardl: Estimating autoregressive distributed lag and equilibrium correction models 1/44
1
Another commonly used abbreviation is ADL.
[Figure: log consumption, log income, and log investment, 1960 to 1980]
Data: National accounts, West Germany, seasonally adjusted, quarterly, billion DM, Lütkepohl (1993, Table E.1).
ARDL model
ARDL(p, q, ..., q) model:
$$y_t = c_0 + c_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=0}^{q} \beta_i' x_{t-i} + u_t,$$
ARDL(4,1,0) regression
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .4568483 .1064085 4.29 0.000 .2450887 .6686079
L2. | .3250994 .1127767 2.88 0.005 .1006666 .5495322
L3. | .1048324 .1092992 0.96 0.340 -.11268 .3223449
L4. | -.1632413 .0853844 -1.91 0.059 -.3331616 .0066791
|
ln_inc |
--. | .4629184 .078421 5.90 0.000 .3068557 .6189812
L1. | -.202756 .0965775 -2.10 0.039 -.3949513 -.0105607
|
ln_inv | .0080284 .0118391 0.68 0.500 -.0155322 .0315889
_cons | .0373585 .0143755 2.60 0.011 .0087504 .0659667
------------------------------------------------------------------------------
lagcombs[12,4]
ln_consump ln_inc ln_inv aic
r1 1 0 0 -585.22447
r2 1 1 0 -585.39189
r3 1 2 0 -583.88179
r4 2 0 0 -590.66282
r5 2 1 0 -592.6904
r6 2 2 0 -591.62792
r7 3 0 0 -588.69069
r8 3 1 0 -590.83183
r9 3 2 0 -589.67101
r10 4 0 0 -590.03466
r11 4 1 0 -592.73282
r12 4 2 0 -592.15636
. estat ic
-----------------------------------------------------------------------------
Model | Obs ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
. | 88 -64.51057 304.3747 8 -592.7495 -572.9308
-----------------------------------------------------------------------------
Note: N=Obs used in calculating BIC; see [R] BIC note.
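The lag selection implied by the `lagcombs` matrix above amounts to picking the row with the smallest AIC. As a small worked check, with the rows copied from that matrix (the variable names are illustrative only):

```python
# Rows of the lagcombs matrix: (lags of ln_consump, ln_inc, ln_inv) and AIC.
lagcombs = [
    ((1, 0, 0), -585.22447),
    ((1, 1, 0), -585.39189),
    ((1, 2, 0), -583.88179),
    ((2, 0, 0), -590.66282),
    ((2, 1, 0), -592.6904),
    ((2, 2, 0), -591.62792),
    ((3, 0, 0), -588.69069),
    ((3, 1, 0), -590.83183),
    ((3, 2, 0), -589.67101),
    ((4, 0, 0), -590.03466),
    ((4, 1, 0), -592.73282),
    ((4, 2, 0), -592.15636),
]

# The preferred specification minimizes the information criterion.
best_lags, best_aic = min(lagcombs, key=lambda row: row[1])
print(best_lags, best_aic)
```

The minimum is in row r11, which corresponds to the ARDL(4,1,0) specification reported in the regression output. Row r5, ARDL(2,1,0), is a close second.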
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 1
. timer list 1
1: 0.01 / 1 = 0.0150
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 2
. timer list 2
2: 0.75 / 1 = 0.7520
ARDL(2,0,4) regression
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .30383 .0942165 3.22 0.002 .1161411 .491519
L2. | .3195318 .0776321 4.12 0.000 .1648808 .4741828
|
ln_inc | .3767587 .0389267 9.68 0.000 .2992128 .4543046
|
ln_inv |
--. | .0581759 .0170736 3.41 0.001 .0241635 .0921884
L1. | -.0185484 .0214624 -0.86 0.390 -.0613036 .0242068
L2. | .01012 .021505 0.47 0.639 -.0327202 .0529602
L3. | -.0146641 .0213098 -0.69 0.493 -.0571154 .0277872
L4. | -.0488136 .0174121 -2.80 0.006 -.0835003 -.0141269
|
_cons | .0416317 .0107782 3.86 0.000 .0201603 .063103
------------------------------------------------------------------------------
EC representation
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
ln_inv | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inv |
D1. | .080464 .0187106 4.30 0.000 .0432214 .1177066
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
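The ADJ and LR entries in this table can be reproduced by hand from the ARDL coefficients of the same model in levels (the table with two lags of ln_consump, ln_inc without lags, and ln_inv with lags 0 to 4). The sketch below assumes the usual mapping: speed of adjustment equal to minus (1 minus the sum of the autoregressive coefficients), and each long-run coefficient equal to the sum of that variable's lag coefficients divided by (1 minus the sum of the autoregressive coefficients). The variable names are illustrative.

```python
# Coefficients copied from the ARDL regression output in levels.
phi = [0.3068554, 0.325385]                      # ln_consump, L1 and L2
beta_inc = [0.3682844]                           # ln_inc, contemporaneous only
beta_inv = [0.0656722, -0.0375288, 0.0228142,    # ln_inv, lags 0 to 4
            -0.0129321, -0.0528173]

alpha = 1.0 - sum(phi)          # 1 minus the sum of autoregressive coefficients
adj = -alpha                    # reported under ADJ
lr_inc = sum(beta_inc) / alpha  # reported under LR, ln_inc
lr_inv = sum(beta_inv) / alpha  # reported under LR, ln_inv

print(f"ADJ       = {adj:.7f}")
print(f"LR ln_inc = {lr_inc:.6f}")
print(f"LR ln_inv = {lr_inv:.7f}")
```

Up to rounding, the three values agree with the table: -.3677596, 1.001427, and -.0402213.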
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc |
L1. | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
|
ln_inv |
L1. | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inc |
D1. | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
D1. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3788728 .0420886 -9.00 0.000 -.4626481 -.2950975
-------------+----------------------------------------------------------------
LR |
ln_inc | .9669152 .0039557 244.44 0.000 .9590416 .9747889
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.346926 .0806726 -4.30 0.000 -.5075007 -.1863512
L2D. | -.1074193 .0790118 -1.36 0.178 -.2646883 .0498497
|
ln_inv |
D1. | .0758713 .0176989 4.29 0.000 .0406425 .1111002
LD. | .0422224 .0191523 2.20 0.030 .0041008 .080344
L2D. | .0678568 .0185208 3.66 0.000 .030992 .1047216
L3D. | .0485441 .0179609 2.70 0.008 .0127938 .0842944
|
_cons | .0504873 .0114518 4.41 0.000 .027693 .0732816
------------------------------------------------------------------------------
EC representation: Interpretation
3 The test is not directly performed on the long-run coefficients $\theta = \sum_{j=0}^{q} \beta_j / \alpha$.
4
The number of short-run coefficients only affects the finite-sample but not the asymptotic critical values
(Cheung and Lai, 1995; Kripfganz and Schneider, 2018). The elements of ω in the ec1 parameterization for
variables that have 0 lags in the ARDL model do not count towards this number.
Test decisions:
Do not reject H0F or H0t , respectively, if the test statistic is
closer to zero than the lower bound of the critical values.
Reject the H0F or H0t , respectively, if the test statistic is more
extreme than the upper bound of the critical values.
The first two steps of the bounds test are implemented in the
ardl postestimation command estat ectest.
By default, finite-sample critical values for the 1%, 5%, and
10% significance levels are provided. Asymptotic critical values
are displayed with option asymptotic. Alternative significance
levels can be specified with option siglevels(numlist ).
The test statistics in step 3 have the usual asymptotic
standard normal (or χ2 ) distributions irrespective of the
integration order of the independent variables.5
5 The OLS estimator for the long-run coefficients θ of I(1) independent variables is “super-consistent” with
convergence rate $T$ instead of $\sqrt{T}$ (Pesaran and Shin, 1998; Hassler and Wolters, 2006).
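The decision rule under "Test decisions" can be written as a small helper. This is a sketch only: `bounds_decision` is a hypothetical function, not part of the `ardl` package. Labelling the region between the two bounds as inconclusive follows the conventional reading of the bounds test and is an assumption here, since the rule above states only the two clear-cut cases. The critical values in the usage lines are the 5% bounds printed in the first `estat ectest` output in this section, and the test statistics are made-up inputs.

```python
def bounds_decision(stat, lower, upper):
    """Classify a bounds-test statistic.

    `lower` is the critical value for I(0) variables and `upper` the one
    for I(1) variables. Absolute values are compared so that the same rule
    covers the F-statistic (positive) and the t-statistic (negative).
    """
    if abs(stat) < abs(lower):
        return "do not reject H0"   # closer to zero than the I(0) bound
    if abs(stat) > abs(upper):
        return "reject H0"          # more extreme than the I(1) bound
    return "inconclusive"           # between the two bounds


# Hypothetical statistics checked against the printed 5% bounds.
print(bounds_decision(10.0, 4.958, 5.843))     # F-type statistic
print(bounds_decision(-3.0, -2.861, -3.225))   # t-type statistic
```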
. estat ectest
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.032 4.831 | 4.958 5.843 | 7.070 8.119 | 0.000 0.000
t | -2.550 -2.899 | -2.861 -3.225 | -3.470 -3.854 | 0.000 0.000
do not reject H0 if
both F and t are closer to zero than critical values for I(0) variables
(if p-values > desired level for I(0) variables)
reject H0 if
both F and t are more extreme than critical values for I(1) variables
(if p-values < desired level for I(1) variables)
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.341178 .0431316 -7.91 0.000 -.4270464 -.2553096
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.14358 .0782318 14.62 0.000 .9878321 1.299327
qtr | -.0036516 .0016171 -2.26 0.027 -.006871 -.0004322
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.4362663 .0851 -5.13 0.000 -.6056874 -.2668452
L2D. | -.1899566 .0825977 -2.30 0.024 -.354396 -.0255172
|
ln_inv |
D1. | .0842961 .0173889 4.85 0.000 .0496775 .1189146
LD. | .0517241 .0188448 2.74 0.008 .0142069 .0892412
L2D. | .0726232 .017972 4.04 0.000 .0368437 .1084027
L3D. | .0482872 .0173383 2.79 0.007 .0137693 .0828051
|
_cons | -.3188651 .1422961 -2.24 0.028 -.602155 -.0355753
------------------------------------------------------------------------------
. estat ectest
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.066 4.582 | 4.784 5.351 | 6.396 7.057 | 0.000 0.000
t | -3.107 -3.384 | -3.412 -3.704 | -4.014 -4.327 | 0.000 0.000
do not reject H0 if
both F and t are closer to zero than critical values for I(0) variables
(if p-values > desired level for I(0) variables)
reject H0 if
both F and t are more extreme than critical values for I(1) variables
(if p-values < desired level for I(1) variables)
Postestimation commands
6
estat dwatson is not valid for ARDL / EC models because the lagged dependent variable is not strictly
exogenous by construction.
. estat hettest
chi2(1) = 0.26
Prob > chi2 = 0.6067
chi2(54) = 52.03
Prob > chi2 = 0.5508
---------------------------------------------------
Source | chi2 df p
---------------------+-----------------------------
Heteroskedasticity | 52.03 54 0.5508
Skewness | 12.24 9 0.2000
Kurtosis | 0.02 1 0.8967
---------------------+-----------------------------
Total | 64.29 64 0.4664
---------------------------------------------------
. sktest resid
. qnorm resid
. pnorm resid
[Figures: normal quantile plot (qnorm) and standardized normal probability plot (pnorm) of the residuals]
Number of obs = 88
Number of obs = 88
Note: This is a test for a structural break in the speed-of-adjustment and long-run coefficients.
Further topics
ARDL(4) regression
------------------------------------------------------------------------------
D.dln_inv | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
dln_inv |
L1. | -.755277 .2295731 -3.29 0.001 -1.211971 -.2985831
-------------+----------------------------------------------------------------
LR |
_cons | .015006 .0060544 2.48 0.015 .0029618 .0270501
-------------+----------------------------------------------------------------
SR |
dln_inv |
LD. | -.4633003 .2005284 -2.31 0.023 -.8622152 -.0643855
L2D. | -.4938993 .1577325 -3.13 0.002 -.8076796 -.180119
L3D. | -.3133117 .1029967 -3.04 0.003 -.5182049 -.1084184
------------------------------------------------------------------------------
Note: The aim is to test whether dln_inv, the first difference of ln_inv, is nonstationary.
. estat ectest
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 3.823 3.812 | 4.677 4.659 | 6.644 6.601 | 0.026 0.025
t | -2.565 -2.569 | -2.869 -2.874 | -3.463 -3.472 | 0.017 0.017
do not reject H0 if
both F and t are closer to zero than critical values for I(0) variables
(if p-values > desired level for I(0) variables)
reject H0 if
both F and t are more extreme than critical values for I(1) variables
(if p-values < desired level for I(1) variables)
Note: The null hypothesis is that dln_inv follows a unit root process (without drift).
------------------------------------------------------------------------------
D.dln_inv | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dln_inv |
L1. | -.755277 .2295731 -3.29 0.001 -1.211971 -.2985831
LD. | -.4633003 .2005284 -2.31 0.023 -.8622152 -.0643855
L2D. | -.4938993 .1577325 -3.13 0.002 -.8076796 -.180119
L3D. | -.3133117 .1029967 -3.04 0.003 -.5182049 -.1084184
|
_cons | .0113337 .0060208 1.88 0.063 -.0006437 .023311
------------------------------------------------------------------------------
1981q1: ...........
1981q2: ...........
1981q3: ...........
1981q4: ...........
1982q1: ...........
1982q2: ..........
1982q3: ..........
1982q4: ...........
[Figure: forecast plot, 1979 to 1982; vertical axis from 7.55 to 7.75]
Note: The forecast period (1981q1 – 1982q4) is excluded from the estimation period (1961q1 – 1980q4).
------------------------------------------------------------------------------
| Newey-West
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .2225557 .0931767 2.39 0.019 .0370552 .4080562
L2. | .2463097 .1003579 2.45 0.016 .0465125 .4461068
L3. | .1899566 .1013927 1.87 0.065 -.0119008 .3918141
|
ln_inc | .3901642 .0400174 9.75 0.000 .3104956 .4698327
|
ln_inv |
D1. | .0842961 .0258047 3.27 0.002 .0329229 .1356693
LD. | .0517241 .0158053 3.27 0.002 .0202582 .08319
L2D. | .0726232 .0156803 4.63 0.000 .0414061 .1038404
L3D. | .0482872 .017342 2.78 0.007 .013762 .0828124
|
qtr | -.0012458 .000383 -3.25 0.002 -.0020083 -.0004833
_cons | -.3188651 .1104624 -2.89 0.005 -.5387789 -.0989513
------------------------------------------------------------------------------
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_nl_1 | 1.14358 .0691576 16.54 0.000 1.008033 1.279126
------------------------------------------------------------------------------
Note: This is the same long-run coefficient as earlier but with Newey-West standard errors.
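As a cross-check, the reported value can be reproduced from the ARDL output above. The sketch below assumes that `_nl_1` is the long-run coefficient of `ln_inc`, computed as its level coefficient divided by one minus the sum of the coefficients on the lags of `ln_consump`; the function name is illustrative and not part of the `ardl` package.

```python
# Point estimates copied from the regression table above.
ar_coefs = [0.2225557, 0.2463097, 0.1899566]  # L1, L2, L3 of ln_consump
beta_inc = 0.3901642                          # coefficient on ln_inc


def long_run_coefficient(beta, ar):
    """Long-run effect in an ARDL model: beta / (1 - sum of AR coefficients)."""
    adjustment = 1.0 - sum(ar)
    if adjustment == 0:
        raise ValueError("AR coefficients sum to one; long-run effect undefined")
    return beta / adjustment


theta_inc = long_run_coefficient(beta_inc, ar_coefs)
print(f"implied long-run coefficient of ln_inc: {theta_inc:.5f}")
```

Only the point estimate is reproduced here; the Newey-West standard error of the ratio requires the delta method applied to the full covariance matrix, which the table does not report.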
help ardl
help ardl postestimation
References
Cheung, Y.-W., and K. S. Lai (1995). Lag order and critical values of the augmented Dickey-Fuller test.
Journal of Business & Economic Statistics 13(3): 277–280.
Engle, R. F., and C. W. J. Granger (1987). Co-integration and error correction: representation, estimation,
and testing. Econometrica 55(2): 251–276.
Hassler, U., and J. Wolters (2006). Autoregressive distributed lag models and cointegration. Allgemeines
Statistisches Archiv 90(1): 59–74.
Kripfganz, S., and D. C. Schneider (2018). Response surface regressions for critical value bounds and
approximate p-values in equilibrium correction models. Manuscript, University of Exeter and Max Planck
Institute for Demographic Research, www.kripfganz.de.
Lütkepohl, H. (1993). Introduction to Multiple Time Series Analysis (2nd edition), Berlin, New York:
Springer.
Narayan, P. K. (2005). The saving and investment nexus for China: evidence from cointegration tests.
Applied Economics 37(17): 1979–1990.
Pesaran, M. H., and Y. Shin (1998). An autoregressive distributed-lag modelling approach to cointegration
analysis. In Econometrics and Economic Theory in the 20th Century. The Ragnar Frisch Centennial
Symposium, ed. S. Strøm, chap. 11, 371–413. Cambridge: Cambridge University Press.
Pesaran, M. H., Y. Shin, and R. Smith (2001). Bounds testing approaches to the analysis of level
relationships. Journal of Applied Econometrics 16(3): 289–326.
This document is the property of the Institut Tunisien de la Compétitivité et des
Études Quantitatives (ITCEQ). Any reproduction or representation of this publication,
in whole or in part and by any means whatsoever, made without the written
authorization of ITCEQ is unlawful and constitutes an infringement.
The results, interpretations and conclusions expressed in this publication are those of
the author(s) and should not be attributed to ITCEQ, its management or its supervisory
authorities.
Abstract
Introduction
Conclusion
Appendices
Introduction
Tunisia, like most developing countries, lacks the natural-resource endowment that
would help it meet its development challenges while striking a balance between
economic objectives and social objectives, the latter having gained in importance in
recent years. It has always been among the countries that rely largely on domestic
resources, tax revenue in particular, to finance public spending. Tax revenue financed
about 60% of the State budget on average over 1986-2017.
Moreover, meeting citizens' growing demand for quality infrastructure and public
services makes improving the yield of tax revenue a rather complicated exercise. When
designing tax policy, the authorities must preserve the tax resources needed to sustain
budgetary balance while respecting the ability to pay of the economy and of taxpayers,
and must establish an equitable tax system. Such a challenge requires knowledge of the
economy's tax potential in order to determine the margin (the feasible range) within
which the tax authorities can rationalize the tax system and avoid the extreme cases of
under-taxation or over-taxation: "too much tax kills tax", Ibn Khaldun (The Muqaddimah).
This study is set in that context. Its objective is to provide elements of an answer to
this question by estimating the maximum tax-collection capacity at the macro level,
which requires knowledge of the structural and non-structural determinants of tax
effort. In other words, it addresses the notion of tax potential: defining it and
studying its determinants in the case of Tunisia.
1. Assessment of the state of public finances in Tunisia (2000-2017)
The economic difficulties Tunisia has faced in recent years have brought to the fore the
debate on the critical state of public finances, particularly the problems affecting
budgetary balance and the sustainability of public debt. Two periods can be distinguished:
A first period, 2000-2010, was relatively sound: the government kept the deficit
below 3.4% of GDP, with an average of 2.61% over the whole period. Results were
particularly good in 2008 and 2010 (1% of GDP), despite growing pressure on public
finances from the fallout of the financial crisis and from the final stage of tariff
dismantling on industrial products under the association agreement with the EU.
A second period, the post-revolution years (2011-2017), was marked by slippage in
the public finance balances. The budget deficit (excluding privatization receipts and
grants) averaged about 5.38% of GDP over these seven years and peaked at 6.9% in
2013. The improvements recorded in the two following years did not last, and the
deficit reached 6.1% of GDP in 2016 and 2017. Social unrest, uncertainty and
political instability at the national level were together the main causes of the
difficult state of public finances.
In addition, the primary budget balance alternated between surpluses and deficits over
2000-2017. From 2011 onward in particular, Tunisia has run a primary deficit that has
worsened from year to year (from 0.61% of GDP in 2011 to 2.44% in 2015 and 2.74% in
2016 and 2017), levels unprecedented in the past twenty years.
Figure 1: Budget balances in Tunisia (% of GDP)
This situation has continued to worsen given the economic recession of the democratic
transition period since 2011. Political instability and uncertainty have weakened
investment and wealth creation and led to a build-up of public debt, both domestic and
external, to finance the budget deficit. As a result, the public debt ratio rose to
69.7% of GDP in 2017, from 44.6% in 2011, against an average of 51.34% over 2000-2017
(Figure 2). This points to difficulties with the sustainability of fiscal policy in
Tunisia and with the country's ability to honor its commitments to its creditors.
Figure 2: Public debt ratio and tax burden in Tunisia (% of GDP)
[Two panels, annual data from 1986 to 2016; vertical axes run from 30% to 80% and from 15% to 30%.]
Tunisia thus faces an increasingly complex dilemma. On the one hand, it needs to raise
domestic financial resources to fund public spending while avoiding excessive debt,
which could jeopardize the sovereignty of the State and undermine its financial
solvency. On the other hand, it must curb the trend increase in the tax burden, which
peaked at 23.1% of GDP in 2014.
2. Definition of concepts and literature review
Tax effort is defined as the extent to which a country exploits its tax potential.¹
Pessino and Fenochietto (2010) present tax potential, or taxable capacity, as the
maximum tax revenue a given country can collect given structural factors of an
economic, social, institutional and demographic nature.
Lotz and Morss (1967) examined the determinants of a country's level of taxation by
modelling the tax burden of 72 developed and developing countries with two independent
variables: gross national income per capita and the degree of openness (imports plus
exports relative to GNP). They concluded that both variables have a positive and
significant effect on the tax burden.
¹ Ministère de l’Economie et des Finances du Sénégal, Bulletin du CEPOD, fourth quarter 2012.
period 1955-1966. The results showed that two factors were retained as determinants of
the tax burden, namely the share of agriculture and the degree of openness.
Tanzi (1992) studied the determinants of the tax burden in 83 developing countries over
1978-1988. He found that the share of imports in GDP, GDP per capita, the share of
agriculture in GDP and the share of external debt in GDP influence the ratio of tax
revenue to GDP.
Stotsky and Wolde Mariam (1997) extended earlier work by examining the determinants of
the tax burden for 43 Sub-Saharan African countries over 1990-1995 using panel data,
and by constructing an index of tax effort. They concluded that the share of
agriculture in GDP and the share of mining in GDP have a negative and significant
effect on the tax burden, whereas the export share and GDP per capita have positive and
significant effects.
Eltony (2002) analyzed the determinants of tax effort in sixteen Arab countries over
1994-2000. He concluded that, for the six Arab countries of the Gulf Cooperation
Council (GCC), the share of mining has a significant negative effect on the tax burden,
while the effect of per capita income is positive. For the non-oil-producing countries,
the results were statistically significant, showing a negative effect of the share of
agriculture and a positive effect of the share of mining, the import and export shares,
and per capita income.
In addition, Gupta (2007), using a panel regression covering 25 years and 105
developing countries, concluded that structural factors (per capita income, the share
of agriculture in GDP, openness measured by the share of imports in GDP, and foreign
aid) significantly determine the public revenue performance (excluding grants) of these
countries.
… political … has a positive effect on the collection of substantial revenue in
low-income countries.
Building on the work of Aigner et al. (1977) and Alfirman (2003), Pessino and
Fenochietto (2010) used a stochastic frontier model to determine the tax effort of 96
countries over 1991-2006. They reached the following conclusions: per capita income,
the degree of openness and public spending on education as a percentage of GDP have a
positive and significant effect on the tax burden. By contrast, inflation measured by
the consumer price index, income concentration measured by the Gini index, the share of
agriculture in GDP and corruption tend to reduce the tax burden significantly. They
also concluded that countries that have not yet reached their taxable capacity and that
seek to raise the tax burden risk creating an environment favorable to corruption, a
phenomenon that substantially reduces the efficiency of tax collection and would
therefore require greater effort to combat.
Karagöz (2013) used time series from 1970 to 2010 to show how the sectoral structure of
the Turkish economy affects the tax burden. He concluded that the Turkish tax burden is
positively and significantly affected by the share of industry in GDP, total external
debt relative to GDP, the degree of monetization of the economy (M2 relative to GDP)
and the urbanization rate. The effect of the share of agriculture in GDP is significant
but negative, and the effect of openness is not significant. The author recommends that
the Turkish authorities raise the tax burden gradually over the medium term, together
with public expenditure reforms, in order to create durable fiscal space for emerging
spending priorities.
Amin et al. (2014) add to the set of determinants of the tax burden the factors that
affect tax collection (direct, indirect and total), applying the Pesaran and Shin
cointegration method to time series from 1980 to 2010 for Pakistan. They found that the
total tax burden is inversely and significantly related to corruption, the political
instability index and real per capita income. The relationship is positive and
significant for the degree of openness and not significant for inflation measured by
the consumer price index.
Urbanization rate
Institutional variables
Corruption index
3. Methodology and empirical validation
The next step is to estimate the behavior of the tax burden with respect to its
determinants. The model takes the following linear form:
$PF_t = f(X_{it})$
OUV: openness of the Tunisian economy, measured as the ratio of imports plus exports to GDP
$l(pf)_t = \beta_0 + \beta_1\, l(agri)_t + \beta_2\, l(M2)_t + \beta_3\, l(OUV)_t + \beta_4\, l(PIBrh)_t + \beta_5\, l(urban)_t + \varepsilon_t \qquad (1)$
$\beta_0$: constant
The variables of the model are expressed in logarithms in order to reduce their
variance.
3.1 Data and choice of estimation technique
The time series for the various variables are taken from the World Bank's World
Development Indicators (WDI) database, which provides longer series than other sources.
The sample contains 34 annual observations over 1983-2016.
Studying the stationarity of the variables is crucial for obtaining a reliable estimate
of tax potential and for avoiding the risk of spurious regressions. A series is
stationary if its mean and variance are constant and finite and if the covariance
between two observations depends only on the lag between them, not on time.²
Inspection of the plots of the various series suggests that they are not stationary in
levels. A more thorough analysis using unit root tests is needed to assess this
stochastic property. We first use the ADF (Augmented Dickey-Fuller) test and the
Phillips-Perron test. We then use the Zivot and Andrews unit root test in order to
account for possible structural breaks in the series.
Variables Conclusion
² Régis Bourbonnais, Econométrie, 9th edition, Dunod, 2015.
The presence of variables with mixed orders of integration, I(0) and I(1), motivates
the choice of the cointegration technique based on the unrestricted error correction
model of Pesaran et al. (2001), or ARDL (AutoRegressive Distributed Lag) model. The
orders of integration of the variables satisfy the necessary condition for applying
this approach, which requires that none of the variables be integrated of order 2
(I(2)) and that the dependent variable, l(pf), be integrated of order 1 (I(1)).
This technique relies on estimating an optimal unrestricted error correction model in
its ARDL representation, whose form is determined on the basis of lag selection
criteria (AIC, SIC or HQ). The model corresponding to equation 1 can thus be written as
follows:
where p is the optimal lag selected and DUM2011 is a dummy variable capturing the shock
of the revolution, equal to 0 before 2011 and 1 from 2011 onward.
Next, in order to establish the existence of a long-run relationship, Pesaran et al.
(1999) recommend the "ARDL bounds test", a bounds-based cointegration test.
Under the ARDL approach, testing for a cointegrating relationship between the variables
amounts to computing the Fisher statistic (F-stat) of the Wald test whose null
hypothesis is the absence of cointegration, that is, that the coefficients on the
lagged explanatory variables of the selected ARDL model are jointly zero. Once
computed, the F statistic is compared with the critical values from the table of
Narayan (2005).³ There are two critical values: a lower bound, which assumes that all
variables are purely I(0), and an upper bound, which assumes that all variables are
purely I(1). The decision rule is as follows: if the F-stat is above the upper bound,
the null hypothesis is rejected in favor of a cointegrating relationship; if the F-stat
is below the lower bound, the null hypothesis cannot be rejected; and if the F-stat
lies between the two bounds, the test is inconclusive.
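The decision rule described above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the critical values in the usage lines are made-up placeholders, not entries from the Narayan (2005) table.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Apply the bounds-test decision rule to an F statistic.

    lower_bound: critical value assuming all regressors are I(0)
    upper_bound: critical value assuming all regressors are I(1)
    """
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "reject H0: evidence of cointegration"
    if f_stat < lower_bound:
        return "do not reject H0: no evidence of cointegration"
    return "inconclusive"


# Hypothetical critical values, for illustration only.
LOWER, UPPER = 3.0, 4.5
for f in (5.2, 3.8, 2.1):
    print(f, "->", bounds_test_decision(f, LOWER, UPPER))
```

In practice the pair of bounds depends on the number of regressors, the deterministic terms, the sample size and the significance level, so the rule has to be applied separately at each level of interest.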
The … are the long-run coefficients and p is the number of lags determined by the bounds test.
This means that the model has the following form: the dependent variable is l(PF), and
the explanatory variables are lagged l(PF), l(agri), l(M2), l(OUV), l(PIBrh) and lagged
l(PIBrh), l(urban), and the dummy variable DUM2011.
³ Narayan (2005) provides a table of critical values for sample sizes between 30 and 80 observations.
⁴ For annual series, Pesaran and Shin (1999) recommend choosing at most two lags.
Table 2: ARDL(1, 0, 0, 0, 1, 0) model
Variable        Coefficient   Std. error   t-stat    Probability
L(PF(-1))          0.392         0.152       2.570    0.017 (**)
L(AGRI)            0.038         0.059       0.651    0.521
L(M2)             -0.241         0.099      -2.429    0.023 (**)
L(OUV)             0.233         0.065       3.560    0.002 (***)
L(PIBrh)          -0.619         0.286      -2.160    0.041 (**)
L(PIBRH(-1))       0.792         0.316       2.507    0.019 (**)
L(URBAN)          -0.611         0.220      -2.772    0.011 (**)
DUM2011            0.065         0.027       2.389    0.025 (**)
C                 -2.750         0.818      -3.362    0.003 (***)
R2 = 0.86
Adjusted R2 = 0.81
Sum of squared residuals = 0.017
These results show that the computed F statistic exceeds the upper-bound critical
values at the 5% and 10% significance levels. We therefore conclude that there is a
cointegrating relationship among the variables of the model, which leads us to estimate
a long-run relationship between them.
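The long-run coefficients implied by the ARDL estimates in Table 2 can be sketched as follows. The sketch assumes the standard ARDL formula, in which each regressor's summed lag coefficients are divided by one minus the coefficient on the lagged dependent variable; the inputs are the rounded values from Table 2, and the function and variable names are illustrative.

```python
# Short-run estimates copied from Table 2 (rounded to three decimals).
ar_coef = 0.392  # coefficient on L(PF(-1))
distributed_lags = {
    "AGRI": [0.038],
    "M2": [-0.241],
    "OUV": [0.233],
    "PIBrh": [-0.619, 0.792],  # current and one-period-lagged terms
    "URBAN": [-0.611],
}


def long_run_coefficients(ar, lags):
    """Divide each regressor's summed lag coefficients by (1 - ar)."""
    adjustment = 1.0 - ar
    if adjustment <= 0:
        raise ValueError("1 - ar must be positive for a stable long-run relation")
    return {name: sum(coefs) / adjustment for name, coefs in lags.items()}


long_run = long_run_coefficients(ar_coef, distributed_lags)
for name, value in long_run.items():
    print(f"{name}: {value:.3f}")
```

With these rounded inputs the openness and urbanization values come out close to the 0.382 and 1.003 (in absolute value) quoted in the discussion of the long-run results; the small gaps are consistent with rounding in the published table.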
Table 3 shows that, in the long run, the regressors L(M2), L(OUV), L(PIBrh) and
L(URBAN) explain the tax burden and that their coefficients are statistically
significant.
According to the estimates, the effect of agriculture on the tax burden is positive but
not significant. This result is unexpected in light of several earlier studies that
found a negative and significant sign (Le et al. (2008) and Botlhole (2010)). It may be
explained by the small share of agriculture in GDP, which has not exceeded 12% since
the late 1990s and averaged 12.63% over 1983-2016. It should nevertheless be noted that
this result is in line with those of Agbeyegbe et al. (2004), Mahdavi (2008) and
Chaudhry and Munir (2010).
The effect of the ratio of M2 to GDP on the tax burden is negative and significant.
This result is at odds with earlier work (Lutfunnahar (2007) and Karagöz (2013)).
Ngakosso (2015) argues that the more monetized an economy is, the more economic
transactions develop and the more taxable income is created. In our case, the degree of
monetization of the Tunisian economy is inversely related to the State's capacity to
mobilize more resources.
This intriguing result for Tunisia deserves closer analysis, given that the average
degree of monetization of the Tunisian economy was about 53.4% over 1983-2016 and the
banking penetration rate was about 66.1% in 2015 (roughly 2 bank accounts per 3
inhabitants).⁵ The negative relationship may be explained by a growing preference for
cash over 2000-2015, as shown by the rise in the ratio of currency in circulation
(Billets et Monnaies en Circulation, BMC) to GDP. This indicator peaked at 10.4% of GDP
in 2012, confirming economic agents' strong preference for cash transactions.
⁵ Rapport Sur la Supervision Bancaire 2015, Banque Centrale de la Tunisie, December 2016, page 41.
As for the openness of the economy, measured by the ratio of imports plus exports to
GDP, its effect is positive and significant at the 5% level. Since the model is
specified in logarithms, a 1% increase in the share of imports and exports in GDP (all
else being equal) raises the tax burden by about 0.382%. This result is expected and
confirms that of AFD (2007), which concludes that income from international trade is a
more easily taxable base than domestic income or consumption.
Real GDP per capita, used as a proxy for the level of economic development, emerges as
a determinant of the tax burden in Tunisia. Its coefficient is positive and significant
at the 5% level. This result is expected and confirms those of Lotz and Morss (1967)
and Pessino and Fenochietto (2010). The development of economic activity creates wealth
and increases the capacity to raise and to pay taxes.
The urbanization rate, used in the model as a proxy for the demand for public services,
has a negative effect on the tax burden in Tunisia that is significant at the 1% level.
A 1% increase in the urbanization rate lowers the tax burden by about 1.003%. This
result is unexpected because, in theory, urbanization increases the demand for public
goods and creates an easily taxable base owing to the concentration of formal
activities in urban areas (Bird 2007). However, the positive effect of urbanization
depends on the ability of the public authorities to provide planning and infrastructure
conducive to the development of formal activities and to deliver quality public
services. In other words, this positive effect has prerequisites: achieving the tax
morale that motivates citizens to meet their tax obligations voluntarily and honestly.
Consequently, poor governance tends to damage trust between taxpayers and government
and appears to lead to various forms of tax resistance, on the grounds that no quality
public services are provided in return. Tunisia is in this second situation: the rise
in the urbanization rate in recent years has been accompanied by a slow pace of
land-use planning and a deterioration in the quality of public services.
Short-run relationship
The existence of an equilibrium relationship between the tax burden and the explanatory
variables of the model implies a long-run relationship between them in at least one
direction. The causality analysis refines this by determining the direction of
causality between the tax burden and the regressors of the model. The Toda-Yamamoto
causality test is chosen because the model contains a mix of I(0) and I(1) variables.
The main results are as follows (see Table 4 in the appendix):
… urbanization improves the supply of factors, which can generate more wealth and hence
development.
To validate the model, a series of econometric tests must be carried out on the
residuals. Table 2 (in the appendices) shows no serial correlation in the residuals, as
confirmed by the Breusch-Godfrey Lagrange multiplier test, and no heteroskedasticity.
The Jarque-Bera test confirms that the residuals are normally distributed. In addition,
the Ramsey RESET test confirmed the linear specification of our model.
The results (Table 5) show that Tunisia is very close to, and at times beyond, its tax
potential. The tax effort index is close to one in normal periods and exceeds one in
periods of economic downturn.
The values of this index over the study period were always very close to one, which
indicates that Tunisia fully exploits its available tax potential and that the risk of
exceeding it is very high. It should be noted that Tunisia being at the frontier of its
tax potential, with a taxpayer base dominated by wage earners and a few oil companies,
raises the issue of tax equity.
Year   Effective tax burden   Tax potential     Tax effort index
       (share of GDP)         (share of GDP)
2015   0.216                  0.219             0.987
2016   0.207                  0.210             0.985
Source: Ministère des Finances, ITCEQ compilation
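The index in the last column is the ratio of the effective tax burden to the estimated tax potential. A minimal sketch of that arithmetic, using the two rows above, is given here; the function name is illustrative.

```python
def tax_effort_index(effective, potential):
    """Tax effort index: effective tax burden divided by tax potential."""
    if potential <= 0:
        raise ValueError("tax potential must be positive")
    return effective / potential


# (effective tax burden, tax potential), both as shares of GDP, from the table.
rows = {2015: (0.216, 0.219), 2016: (0.207, 0.210)}

for year, (effective, potential) in rows.items():
    print(year, round(tax_effort_index(effective, potential), 3))
```

The ratios recomputed from the rounded table entries can differ from the published index in the third decimal, presumably because the published figures were computed from unrounded values. An index below one means the country collects less than its estimated potential; a value above one means it collects more.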
Conclusion
This study is an attempt to analyze Tunisia's tax potential and its determinants. The
ARDL approach was applied over 1983-2016 to time series for the following variables:
the share of agricultural value added in GDP, the M2 measure of the money supply
relative to GDP, the ratio of imports plus exports to GDP, real GDP per capita and the
urbanization rate. The estimates show that an equilibrium relationship exists among
these variables and confirm that openness and economic development have a positive
effect on the tax burden in Tunisia, whereas the urbanization rate and the degree of
monetization of the economy have negative signs, which does not match the results of
earlier studies. Contrary to expectations, the share of agricultural value added in GDP
does not have a significant effect on the tax burden.
The tax effort index, the ratio of the effective tax burden to the estimated tax
potential, indicates that Tunisia faces difficulties in mobilizing more tax revenue
from the same taxpayer base. It therefore needs to direct reform efforts toward two
main areas: broadening the taxpayer base to ensure greater tax equity, and adopting an
awareness and incentive strategy aimed at strengthening tax morale.
Broadening the taxpayer base calls for reforms that eliminate the presumptive
(forfaitaire) tax regime and introduce incentives and procedures that facilitate and
encourage the move from the informal to the formal sector. Finally, it would be
advisable to regulate cash payments further and to ensure enforcement of the legal
rules governing them.
APPENDICES
Tables
Table 1: Unit root tests: Augmented Dickey-Fuller (ADF) and Phillips-Perron
Table 2: Unit root tests with a structural break: Zivot and Andrews
Table 3: Bounds test for cointegration
Number of variables = 5
5% 10%
Independent variables, χ²(2) statistics
Dependent variable   L(pf)          L(agri)   L(M2)          L(OUV)       L(PIBRH)       L(URBAN)
L(OUV)               26.775 (***)   0.598     42.001 (***)   -            19.985 (***)   39.928 (***)
L(PIBRH)             13.912 (***)   1.321     8.151 (**)     6.310 (**)   -              9.892 (***)
Table 5: Model validation tests
Figures
[Figure: model selection criterion for the candidate ARDL specifications, including ARDL(1, 0, 0, 0, 1, 0); vertical axis from -3.72 to -3.52.]
[Figure: CUSUM test with 5% significance bands, 2012-2016.]
[Figure: CUSUM of Squares test with 5% significance bands, 2012-2016.]
References
Ahmad, H. K., Ahmed, S., Mushtaq, M., and Nadeem, M. (2016). Socio Economic Determinants of
Tax Revenue in Pakistan: An Empirical Analysis. Journal of Applied Environmental and
Biological Sciences, 6(2S), 32-42.
Chaudhry, I. S., and Munir, F. (2010). Determinants of Low Tax Revenue in Pakistan.
Pakistan Journal of Social Sciences (PJSS), 30, 439-452.
Eltony, M. N. (2002). The Determinants of Tax Effort in Arab Countries. Arab Planning
Institute.
Lotz, J. R., and Morss, E. R. (1967). Measuring ‘Tax Effort’ in Developing Countries.
International
Lutfunnahar, B. (2007). A Panel Study on Tax Effort and Tax Buoyancy with Special
Reference to Bangladesh. Working Paper 715, Policy Analysis Unit (PAU), Research
Department, Bangladesh Bank.
Le, T. M., Moreno-Dodson, B., and Rojchaichaninthorn, J. (2008). Expanding taxable
capacity and reaching revenue potential: cross-country analysis. Policy Research Working
Paper no. WPS 4559. Washington, DC: World Bank.
Stotsky, J. G., and Wolde Mariam, A. (1997). Tax Effort in Sub-Saharan Africa. Working Paper
107. Washington, DC: International Monetary Fund.
An Autoregressive Distributed Lag Modelling
Approach to Cointegration Analysis*
M. Hashem Pesaran
Trinity College, Cambridge, England
Yongcheol Shin
Department of Applied Economics, University of Cambridge, England
First Version: February, 1995, Revised: January, 1997
Abstract
This paper examines the use of autoregressive distributed lag (ARDL) models
for the analysis of long-run relations when the underlying variables are I(1).
It shows that after appropriate augmentation of the order of the ARDL model,
the OLS estimators of the short-run parameters are $\sqrt{T}$-consistent with the
asymptotically singular covariance matrix, and the ARDL-based estimators of the
long-run coefficients are super-consistent, and valid inferences on the long-run
parameters can be made using standard normal asymptotic theory. The paper also
examines the relationship between the ARDL procedure and the fully modified
OLS approach of Phillips and Hansen to estimation of cointegrating relations, and
compares the small sample performance of these two approaches via Monte Carlo
experiments. These results provide strong evidence in favour of a rehabilitation
of the traditional ARDL approach to time series econometric modelling. The
ARDL approach has the additional advantage of yielding consistent estimates of
the long-run coefficients that are asymptotically normal irrespective of whether
the underlying regressors are I(1) or I(0).
implication that the OLS estimators of the long-run coefficients, defined by the
ratios $\delta = \alpha_1/\phi(1)$ and $\theta = \beta/\phi(1)$, where
$\phi(1) = 1 - \sum_{i=1}^{p}\phi_i$, converge to their true values faster than the
estimators of the short-run parameters $\alpha_1$ and $\beta$. The ARDL-based
estimators of $\delta$ and $\theta$ are $T^{3/2}$-consistent and $T$-consistent,
respectively. These results are not surprising and are familiar from the cointegration
literature. But more importantly, we will show that despite the singularity of the
covariance structure of the OLS estimators of the short-run parameters, valid
inferences on $\delta$ and $\theta$, as well as on individual short-run parameters,
can be made using standard normal asymptotic theory. Therefore, the traditional ARDL
approach justified in the case of trend-stationary regressors, is in fact equally
valid even if the regressors are first-difference stationary.
In the case where $u_t$ and $\varepsilon_t$ are correlated the ARDL specification
needs to be augmented with an adequate number of lagged changes in the regressors
before estimation and inference are carried out. The degree of augmentation required
depends on whether $q > s + 1$ or not. Denoting the contemporaneous correlation
between $u_t$ and $\varepsilon_t$ by the $k \times 1$ vector $d$, the augmented
version of (1.1) can be written as
$$ y_t = \alpha_0 + \alpha_1 t + \sum_{i=1}^{p}\phi_i y_{t-i} + \beta' x_t + \sum_{i=0}^{m-1}\pi_i' \Delta x_{t-i} + \eta_t, \tag{1.3} $$
where $\Delta x_t = e_t$, and $\xi_t = (v_t, e_t')'$ follows a general linear
stationary process, the OLS estimators of $\delta$ and $\theta$ are $T^{3/2}$- and
$T$-consistent, but in general the asymptotic distribution of the OLS estimator of
$\theta$ involves the unit-root distribution as well as the second-order bias in the
presence of the contemporaneous correlations that may exist between $v_t$ and $e_t$.
Therefore, the finite sample performance of the OLS estimator is poor and in addition,
due to the nuisance parameter dependencies, inference on $\theta$ using the usual
t-tests in the OLS regression of (1.4) is invalid. To overcome these problems Phillips
and Hansen (1990) have suggested the fully-modified OLS estimation procedure that
asymptotically takes account of these correlations in a semi-parametric manner, in the
sense that the fully-modified estimators have the Gaussian mixture normal distribution
asymptotically, and inferences on the long-run parameters using the t-test based on
the limiting distribution of the fully-modified estimator are valid.
The ARDL-based approach to estimation and inference, and the fully-modified OLS
procedure are both asymptotically valid when the regressors are I(1), and a choice
between them has to be made on the basis of their small sample properties and
computational convenience. To examine the small sample performance of the two
estimators we have carried out a number of Monte Carlo experiments. Since in practice
the "true" orders of the ARDL($p, m$) model are rarely known a priori, in the Monte
Carlo experiments we also consider a two-step strategy whereby $p$ and $m$ are first
selected (estimated) using either the Akaike Information Criterion (AIC), or the
Schwarz Bayesian Criterion (SC), and then the long-run coefficients and their standard
errors are estimated using the ARDL model selected in the first step. We refer to
these estimators as ARDL-AIC and ARDL-SC. The main findings from these experiments are
as follows:
(i) The ARDL-AIC and the ARDL-SC estimators have very similar small-sample
performances, with the ARDL-SC performing slightly better in the majority of the
experiments. This may reflect the fact that the Schwarz criterion is a consistent
model selection criterion while Akaike is not.
(ii) The ARDL test statistics that are computed using the $\Delta$-method (or
equivalently by means of the so-called Bewley's regression), generally perform much
better in small samples than the test statistics computed using the asymptotic formula
that explicitly takes account of the fact that the regressors are I(1).
(iii) The ARDL-SC procedure when combined with the $\Delta$-method of computing the
standard errors of the long-run parameters generally dominates the Phillips-Hansen
estimator in small samples. This is in particular true of the size-power performance
of the tests on the long-run parameter.
(iv) The Monte Carlo results point strongly in favor of the two-step estimation
procedure, and this strategy seems to work even when the model under consideration has
endogenous regressors, irrespective of whether the regressors are I(1) or I(0).¹
The plan of the paper is as follows: Section 2 examines the asymptotic properties of
the OLS estimators in the context of a simple autoregressive model with a linear
deterministic trend and the $k$-dimensional strictly exogenous I(1) regressors.
Section 3 considers a more general ARDL model, allowing for residual serial
correlations and possible endogeneity of the I(1) regressors, and develops the
resultant asymptotic theory. In Section 4 the ARDL-based approach is compared to the
cointegration-based approach of Phillips and Hansen (1990). Section 5 reports and
discusses the results of Monte Carlo experiments. Some concluding remarks are
presented in Section 6. Mathematical proofs are provided in an Appendix.
where $y_t$ is a scalar, $\phi(L) = 1 - \phi L$, with $L$ being the one period lag
operator, $x_t$ is a $k \times 1$ vector of regressors assumed to be integrated of
order 1:²
$$ x_t = x_{t-1} + e_t, \tag{2.2} $$
$$ \phi(L) y_t = \alpha_0 + (\alpha_1 + \beta'\mu_x)\,t + \beta'\tilde{x}_t + u_t, $$
where $\tilde{x}_t$ follows an I(1) process without a drift.
(A2) The $k$-dimensional vector, $e_t$, in (2.2) has a general linear multivariate
stationary process,
(A3) $u_t$ and $e_t$ are uncorrelated for all leads and lags such that $x_t$ is
strictly exogenous with respect to $u_t$,
(A4) The I(1) regressors, $x_t$, are not cointegrated among themselves, and
(A5) $|\phi| < 1$, so that the model is dynamically stable, and a long-run
relationship between $y_t$ and $x_t$ exists.³
From (2.1) and (2.4) it is clear that $y_t$ and $x_t$ are individually I(1), but must
be cointegrated for (2.1) to be meaningful.⁴ Similarly, we obtain
$$ y_{t-1} = \mu_1 + \delta t + \theta' x_t + \kappa_t, \tag{2.5} $$
model, (2.1). For expositional convenience, we transform (2.1) to the partitioned
regression model in the matrix form as,
$$ y_T = Z_T b + y_{T-1}\phi + u_T, \tag{2.6} $$
where $y_T = (y_1, \ldots, y_T)'$, $y_{T-1} = (y_0, \ldots, y_{T-1})'$,
$\tau_T = (1, \ldots, 1)'$, $t_T = (1, \ldots, T)'$, $X_T = (x_1, \ldots, x_T)'$,
$Z_T = (\tau_T, t_T, X_T)$, $u_T = (u_1, \ldots, u_T)'$, and
$b = (\alpha_0, \alpha_1, \beta')'$. Since our main interest is in the long-run
coefficients on trended regressors, $t$ and $x_t$, we also partition
$$ Z_T = (\tau_T, S_T), \quad S_T = (t_T, X_T), \quad b = \begin{pmatrix} \alpha_0 \\ c \end{pmatrix}, \quad c = \begin{pmatrix} \alpha_1 \\ \beta \end{pmatrix}, $$
Theorem 2.1. Under the assumptions (A1) - (A5), the OLS estimators of $\phi$ and
$c = (\alpha_1, \beta')'$ in (2.6), denoted by $\hat{\phi}_T$ and $\hat{c}_T$,
respectively, are $\sqrt{T}$-consistent, and have the following asymptotic
distributions:
$$ \sqrt{T}(\hat{\phi}_T - \phi) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{\sigma_\kappa^2}\right\}, \tag{2.7} $$
$$ \sqrt{T}(\hat{c}_T - c) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{\sigma_\kappa^2}\,\lambda\lambda'\right\}, \tag{2.8} $$
where $\lambda = (\delta, \theta')'$ is a $(k+1) \times 1$ vector of the long-run
parameters on trended regressors, $t$ and $x_t$, and $\mathrm{rank}(\lambda\lambda') = 1$.
In addition, the OLS estimator of $\alpha_0$ in (2.6), denoted by $\hat{\alpha}_{0T}$,
is also $\sqrt{T}$-consistent, but has the mixture normal distribution. Defining
$h = (b', \phi)'$ and $P_{Z_T} = (Z_T, y_{T-1})$, and denoting the OLS estimator of
$h$ by $\hat{h}_T$, the covariance matrix of $\hat{h}_T$ can be consistently estimated
by
$$ \hat{V}(\hat{h}_T) = \hat{\sigma}_{uT}^2\,(P_{Z_T}' P_{Z_T})^{-1}, $$
where $\hat{\sigma}_{uT}^2 = T^{-1}(y_T - P_{Z_T}\hat{h}_T)'(y_T - P_{Z_T}\hat{h}_T)$,
and $\hat{V}(\hat{h}_T)$ is asymptotically singular with rank equal to 2.
Theorem 2.1 shows that despite the presence of stochastic and deterministic trends in
the ARDL model, the OLS estimators of the short-run parameters are
$\sqrt{T}$-consistent.⁵ The second and more important finding is that the OLS
estimators of the coefficients on the trended regressors, $\alpha_1$ and $\beta$, in
(2.1) are asymptotically perfectly collinear with the OLS estimator of the coefficient
on the lagged dependent variable, $\phi$; namely,
$$ \sqrt{T}\left\{(\hat{c}_T - c) + \lambda(\hat{\phi}_T - \phi)\right\} = o_p(1). \tag{2.9} $$
⁵ Similar results can also be obtained in the case of regressors with higher order
trend terms such as $t^2, t^3, \ldots$, or I(2), I(3), ..., variables.
One interesting implication of this result is that the t-statistics for testing the
significance of individual impact coefficients on the I(1) regressors are
asymptotically equivalent, namely $t_{\hat{\beta}_i} - t_{\hat{\beta}_j} = o_p(1)$ for
$i \neq j$, and $t_{\hat{\beta}_i} - t_{\hat{\alpha}_1} = o_p(1)$.⁶ Furthermore,
$t_{\hat{\beta}_i} - t_{(1-\hat{\phi})} = o_p(1)$. Relation (2.9) in conjunction with
$$ \hat{\lambda}_T - \lambda = \frac{(\hat{c}_T - c) + \lambda(\hat{\phi}_T - \phi)}{1 - \hat{\phi}_T}, \tag{2.10} $$
also yields an important result familiar from the cointegration literature, which
we set out in the following theorem:
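and therefore,
$$ Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1}(\hat{\lambda}_T - \lambda) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{(1-\phi)^2}\, I_{k+1}\right\}, \tag{2.11} $$
where $\hat{\lambda}_T = (\hat{\delta}_T, \hat{\theta}_T')'$,
$Q_{\tilde{S}_T} = D_{S_T} S_T' H_T S_T D_{S_T}$, $S_T = (t_T, X_T)$,
$H_T = I_T - \tau_T(\tau_T'\tau_T)^{-1}\tau_T'$, and
$D_{S_T} = \mathrm{Diag}(T^{-3/2}, T^{-1} I_k)$.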
inferences concerning $\delta$ and $\theta$ are possible. Notice also that the
covariance matrix of the estimator of $\lambda$ simply depends on the inverse of the
(scaled) demeaned data matrix and the spectral density at zero frequency of
$(1 - \phi L)^{-1} u_t$, namely $\sigma_u^2/(1-\phi)^2$. Once again, this finding is in
line with the results already familiar from the cointegration literature. (See
Section 4 for further discussions.)
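For reference, identity (2.10) can be checked directly from the definitions $\hat{\lambda}_T = \hat{c}_T/(1-\hat{\phi}_T)$ and $\lambda = c/(1-\phi)$. The short derivation below is written out here for convenience and is not taken from the original text:

```latex
\begin{align*}
\hat{\lambda}_T - \lambda
  &= \frac{\hat{c}_T - \lambda\,(1-\hat{\phi}_T)}{1-\hat{\phi}_T} \\
  &= \frac{\hat{c}_T - \lambda\,(1-\phi) + \lambda\,(\hat{\phi}_T-\phi)}{1-\hat{\phi}_T} \\
  &= \frac{(\hat{c}_T - c) + \lambda\,(\hat{\phi}_T-\phi)}{1-\hat{\phi}_T},
  \qquad \text{since } c = \lambda\,(1-\phi).
\end{align*}
```

By (2.9) the numerator is of smaller order than $T^{-1/2}$, which is the sense in which the long-run estimators converge faster than the short-run ones.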
⁶ For large enough $T$ we have
$t_{\hat{\beta}_i} \approx (1-\phi)(\sigma_\kappa/\sigma_u)$. This explains the
relatively low t-ratios often obtained for short-run coefficients in ARDL regressions
with I(1) variables, especially when $\phi$ is close to unity.
Hypothesis testing on the general linear restrictions involving the $(k+1)$-dimensional
long-run parameter vector, $\lambda$, can be carried out in the usual manner.
Consider the $g$ linear restrictions on $\lambda$,
$$ R\lambda = r, $$
For completeness the asymptotic results for these models are summarized in Theorems
2.3 and 2.4.
Theorem 2.3. Under the assumptions (A1) and (A5), the OLS estimators of $\alpha_0$,
$\alpha_1$ and $\phi$ in (2.14), denoted by $\hat{\alpha}_{0T}$, $\hat{\alpha}_{1T}$,
and $\hat{\phi}_T$, are all $\sqrt{T}$-consistent, and asymptotically normally
distributed. In addition, $\sqrt{T}(\hat{\alpha}_{1T} - \alpha_1)$ and
$\sqrt{T}(\hat{\phi}_T - \phi)$ are perfectly collinear asymptotically and the
covariance matrix of $(\hat{\alpha}_{0T}, \hat{\alpha}_{1T}, \hat{\phi}_T)$ is
asymptotically singular with rank equal to 2. Furthermore, the estimator of the
long-run parameter $\delta$, computed by $\hat{\alpha}_{1T}/(1 - \hat{\phi}_T)$, has
the following asymptotic distribution:
$$ T^{3/2}(\hat{\delta}_T - \delta) \overset{a}{\sim} N\left\{0, \frac{12\,\sigma_u^2}{(1-\phi)^2}\right\}. \tag{2.16} $$
Theorem 2.4. Under assumptions (A1) - (A5), the OLS estimators of $\alpha_0$, $\beta$
and $\phi$ in (2.15), denoted by $\hat{\alpha}_{0T}$, $\hat{\beta}_T$, and
$\hat{\phi}_T$, are $\sqrt{T}$-consistent, and have the asymptotic (mixture) normal
distributions. In addition, $\sqrt{T}(\hat{\beta}_T - \beta)$ and
$\sqrt{T}(\hat{\phi}_T - \phi)$ are perfectly collinear asymptotically and so the
covariance matrix of $(\hat{\alpha}_{0T}, \hat{\beta}_T, \hat{\phi}_T)$ is
asymptotically singular with rank equal to 2. Furthermore, the estimator of the
long-run parameter $\theta$, given by $\hat{\theta}_T = \hat{\beta}_T/(1 - \hat{\phi}_T)$,
has the mixture normal distribution asymptotically, and
$$ Q_{\tilde{X}_T}^{1/2}\, T(\hat{\theta}_T - \theta) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{(1-\phi)^2}\, I_k\right\}, \tag{2.17} $$
$$ \hat{V}(\hat{\theta}_T) = \frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2}\,\frac{1}{\sum_{t=1}^{T}(x_t - \bar{x})^2}. \tag{2.19} $$
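⁷ In the case where $x_t$ is I(0) we have the same asymptotic result given by (2.18);
that is, since $T^{-1} x_T' H_T x_T = O_p(1)$ and
$\sqrt{T}(\hat{\theta}_T - \theta) = O_p(1)$, hence
$$ \sqrt{T}\,(T^{-1} x_T' H_T x_T)^{1/2}(\hat{\theta}_T - \theta) = \left[\sum_{t=1}^{T}(x_t - \bar{x})^2\right]^{1/2}(\hat{\theta}_T - \theta) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{(1-\phi)^2}\right\}. $$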
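The computation of the variance of $\hat{\theta}_T$ by the $\Delta$-method involves
approximating
$$ \hat{\theta}_T = g(\hat{\Psi}_T) = \frac{\hat{\beta}_T}{1 - \hat{\phi}_T}, $$
$$ \hat{V}_\Delta(\hat{\theta}_T) = \frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2 D_T}\,\begin{bmatrix} 1, & \hat{\theta}_T \end{bmatrix} \begin{bmatrix} \sum (y_{t-1} - \bar{y})^2 & -\sum (y_{t-1} - \bar{y})(x_t - \bar{x}) \\ -\sum (y_{t-1} - \bar{y})(x_t - \bar{x}) & \sum (x_t - \bar{x})^2 \end{bmatrix} \begin{bmatrix} 1 \\ \hat{\theta}_T \end{bmatrix}, \tag{2.20} $$
where the bar over the variable denotes the sample mean, and
$$ D_T = \left[\sum_{t=1}^{T}(x_t - \bar{x})^2\right]\left[\sum_{t=1}^{T}(y_{t-1} - \bar{y})^2\right] - \left[\sum_{t=1}^{T}(y_{t-1} - \bar{y})(x_t - \bar{x})\right]^2. $$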
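Using (2.5), recalling that $\delta = 0$ and defining
$\tilde{y}_{t-1} = y_{t-1} - \bar{y}$, $\tilde{x}_t = x_t - \bar{x}$ and
$\tilde{\kappa}_t = \kappa_t - \bar{\kappa}$, we also have
$$ \tilde{y}_{t-1} = \theta\tilde{x}_t + \tilde{\kappa}_t, \tag{2.21} $$
where $\tilde{\kappa}_t$ follows a general linear stationary process. Substituting
this result in (2.20), we obtain
$$ \hat{V}_\Delta(\hat{\theta}_T) = \frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2}\, \frac{\sum_{t=1}^{T}\tilde{\kappa}_t^2 + (\hat{\theta}_T - \theta)^2\sum_{t=1}^{T}\tilde{x}_t^2 - 2(\hat{\theta}_T - \theta)\sum_{t=1}^{T}\tilde{x}_t\tilde{\kappa}_t}{\left(\sum_{t=1}^{T}\tilde{x}_t^2\right)\left(\sum_{t=1}^{T}\tilde{\kappa}_t^2\right) - \left(\sum_{t=1}^{T}\tilde{x}_t\tilde{\kappa}_t\right)^2}. \tag{2.22} $$
Since $\tilde{\kappa}_t$ is I(0) and $\tilde{x}_t$ is I(1), using the results familiar
in the literature (see, for example, Phillips and Durlauf (1986)), we have
$$ T^{-1}\sum_{t=1}^{T}\tilde{\kappa}_t^2 = O_p(1), \quad T^{-2}\sum_{t=1}^{T}\tilde{x}_t^2 = O_p(1), \quad T^{-1}\sum_{t=1}^{T}\tilde{x}_t\tilde{\kappa}_t = O_p(1). $$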
Also from the result of Theorem 2.4 we know that $T(\hat{\theta}_T - \theta) = O_p(1)$.
Hence, taking probability limits of the right hand side of (2.22) as $T \to \infty$,
we have
$$ T^{2}\,\hat{V}_\Delta(\hat{\theta}_T) = \frac{\sigma_u^2}{(1-\phi)^2}\,\frac{1}{T^{-2}\sum_{t=1}^{T}(x_t - \bar{x})^2} + o_p(1). $$
Therefore, the standard error for the estimator of the long-run parameter, $\theta$,
obtained using the $\Delta$-method is asymptotically the same as that given by (2.19),
which was derived assuming that $x_t$ is I(1). One important advantage of the variance
estimator obtained by the $\Delta$-method over the asymptotic formula (2.19) lies in
the fact that it is asymptotically valid irrespective of whether $x_t$ is I(1) or
I(0), while the latter estimator is valid only if $x_t$ is I(1).
The two variance estimators clearly differ in finite samples. Notice that
$(\sum_{t=1}^{T}\tilde{x}_t\tilde{\kappa}_t)^2$ is asymptotically negligible compared
to other terms in (2.22), but it may not be negligible in finite samples, especially
when $\tilde{x}_t$ and $\tilde{\kappa}_t$ are correlated. For a comparison of the
small sample properties of the two variance estimators see the Monte Carlo results
reported in Section 5.
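To make the comparison concrete, the following is a minimal sketch of how the two variance estimators, (2.19) and (2.20), might be computed for a scalar regressor in the no-trend model. The function names, the parameter values and the simulated data are illustrative assumptions, not part of the paper; the residual variance uses the divisor $T$, as in Theorem 2.1.

```python
import numpy as np

def simulate_ardl10(T, phi=0.8, beta=1.0, alpha0=0.0, seed=0):
    """Simulate y_t = alpha0 + phi*y_{t-1} + beta*x_t + u_t with a random-walk x_t.
    All parameter values are hypothetical choices for illustration."""
    rng = np.random.default_rng(seed)
    x = np.cumsum(rng.standard_normal(T + 1))    # I(1) regressor
    u = rng.standard_normal(T + 1)
    y = np.zeros(T + 1)
    for t in range(1, T + 1):
        y[t] = alpha0 + phi * y[t - 1] + beta * x[t] + u[t]
    return y, x

def long_run_variances(y, x):
    """Return theta_hat, the asymptotic-formula variance (2.19), the Delta-method
    variance (2.20), and the same Delta-method variance via gradient * OLS covariance."""
    y_t, y_lag, x_t = y[1:], y[:-1], x[1:]
    T = y_t.size
    P = np.column_stack([np.ones(T), x_t, y_lag])    # regressors (1, x_t, y_{t-1})
    h = np.linalg.lstsq(P, y_t, rcond=None)[0]       # (alpha0_hat, beta_hat, phi_hat)
    beta_hat, phi_hat = h[1], h[2]
    resid = y_t - P @ h
    s2 = resid @ resid / T                           # sigma_hat^2_{uT}
    theta_hat = beta_hat / (1.0 - phi_hat)

    xd = x_t - x_t.mean()                            # x_t - xbar
    yd = y_lag - y_lag.mean()                        # y_{t-1} - ybar
    Sxx, Syy, Sxy = xd @ xd, yd @ yd, xd @ yd

    v_asy = s2 / (1.0 - phi_hat) ** 2 / Sxx          # equation (2.19)
    D = Sxx * Syy - Sxy ** 2                         # D_T
    quad = Syy - 2.0 * theta_hat * Sxy + theta_hat ** 2 * Sxx
    v_delta = s2 / ((1.0 - phi_hat) ** 2 * D) * quad # equation (2.20)

    # Generic Delta-method: gradient of beta/(1-phi) times the OLS covariance matrix.
    cov_h = s2 * np.linalg.inv(P.T @ P)
    grad = np.array([0.0, 1.0, theta_hat]) / (1.0 - phi_hat)
    v_delta_generic = grad @ cov_h @ grad
    return theta_hat, v_asy, v_delta, v_delta_generic

y, x = simulate_ardl10(T=100, seed=0)
theta_hat, v_asy, v_delta, v_delta_generic = long_run_variances(y, x)
print(theta_hat, np.sqrt(v_asy), np.sqrt(v_delta))
```

In this scalar case a line of algebra gives $\left[\sum\tilde{x}_t^2\right]\times$ (the quadratic form in (2.20)) $- D_T = (\hat{\theta}_T\sum\tilde{x}_t^2 - \sum\tilde{y}_{t-1}\tilde{x}_t)^2 \ge 0$, so the $\Delta$-method variance is never smaller than (2.19) in a given sample.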
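Using the decomposition $\beta(L) = \beta(1) + (1 - L)\beta^{*}(L)$, where
$\beta(1) = \sum_{j=0}^{q}\beta_j$, $\beta^{*}(L) = \sum_{j=0}^{q-1}\beta_j^{*}L^{j}$
and $\beta_j^{*} = -\sum_{i=j+1}^{q}\beta_i$, (3.1) can be rewritten as
$$ \phi(L) y_t = \alpha_0 + \alpha_1 t + \beta' x_t + \sum_{j=0}^{q-1}\beta_j^{*\prime}\Delta x_{t-j} + u_t, \tag{3.2} $$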
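The decomposition used to obtain (3.2) is easy to verify numerically. The sketch below assumes a scalar regressor and hypothetical lag coefficients; `beta_star` and `recompose` are illustrative helper names, not functions from the paper.

```python
import numpy as np

def beta_star(b):
    """Coefficients beta*_j = -sum_{i=j+1}^{q} beta_i for j = 0, ..., q-1 (requires q >= 1)."""
    b = np.asarray(b, dtype=float)
    return -np.array([b[j + 1:].sum() for j in range(b.size - 1)])

def recompose(b):
    """Rebuild the lag coefficients of beta(L) from beta(1) + (1 - L) * beta*(L)."""
    b = np.asarray(b, dtype=float)
    out = np.convolve([1.0, -1.0], beta_star(b))   # coefficients of (1 - L) * beta*(L)
    out[0] += b.sum()                              # add beta(1) to the L^0 coefficient
    return out

b = [0.5, 0.3, -0.2]      # hypothetical beta_0, beta_1, beta_2 (q = 2)
print(beta_star(b))       # coefficients on Delta x_t and Delta x_{t-1} in (3.2)
print(recompose(b))       # reproduces b up to rounding
```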
where $\kappa_{0t} = \sum_{j=q}^{\infty}\theta_j^{*\prime} e_{t-j} + [\phi(L)]^{-1}u_t$.
Similarly,
$$ y_{t-i} = \mu_i + \delta t + \theta' x_t + \sum_{j=0}^{q-1} g_{ij}'\Delta x_{t-j} + \kappa_{it}, \quad i = 1, \ldots, p, \tag{3.7} $$
where $Q_K$ is the $p \times p$ positive definite covariance matrix of
$(\kappa_{1t}, \kappa_{2t}, \ldots, \kappa_{pt})'$ defined by (3.8), and
$$ \sqrt{T}(\hat{c}_T - c) \overset{a}{\sim} N\left\{0, \sigma_u^2\,\tau_p' Q_K^{-1}\tau_p\,\lambda\lambda'\right\}, \tag{3.11} $$
where $\lambda = (\delta, \theta')'$, $\tau_p$ is the $p$-dimensional unit vector, and
$\mathrm{rank}(\lambda\lambda') = 1$. The OLS estimators of $\alpha_0$ and
$\beta^{*}$, denoted by $\hat{\alpha}_{0T}$ and $\hat{\beta}_T^{*}$, are also
$\sqrt{T}$-consistent, and have the mixture normal distributions, asymptotically. The
covariance matrix for all the short-run parameters, $h = (f', \phi)'$, is
asymptotically singular with rank equal to $kq + 2$, and can be consistently estimated
in the usual way by
$$ \hat{V}(\hat{h}_T) = \hat{\sigma}_{uT}^2\,(P_{G_T}' P_{G_T})^{-1}, $$
where $P_{G_T} = (G_T, Y_T)$, and
$\hat{\sigma}_{uT}^2 = T^{-1}(y_T - P_{G_T}\hat{h}_T)'(y_T - P_{G_T}\hat{h}_T)$.
From Theorem 3.1 we also find that $\sqrt{T}(\hat{\alpha}_{1T} - \alpha_1)$ and
$\sqrt{T}(\hat{\beta}_T - \beta)$ are asymptotically perfectly collinear with
$\sqrt{T}(\hat{\phi}_T - \phi)$; that is,
$$ \sqrt{T}\left\{(\hat{c}_T - c) - \lambda[\hat{\phi}_T(1) - \phi(1)]\right\} = o_p(1), \tag{3.12} $$
where $\hat{\phi}_T(1) = 1 - \sum_{i=1}^{p}\hat{\phi}_{iT}$. It is also
straightforward to show that
$$ \hat{\lambda}_T - \lambda = \frac{(\hat{c}_T - c) - \lambda[\hat{\phi}_T(1) - \phi(1)]}{\hat{\phi}_T(1)}. \tag{3.13} $$
Using Theorem 3.1, and results (3.12) and (3.13), we have:
Theorem 3.2. Under the assumptions (A1)$'$ and (A2) - (A5), the OLS estimators of the
long-run parameters,
$\hat{\lambda}_T = (\hat{\delta}_T, \hat{\theta}_T')' = \hat{c}_T/\hat{\phi}_T(1)$ in
(3.9), converge to their true values at faster rates than the estimators of the
associated short-run parameters, and follow the mixture normal distribution
asymptotically. Therefore,
$$ Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1}(\hat{\lambda}_T - \lambda) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{[\phi(1)]^2}\, I_{k+1}\right\}, \tag{3.14} $$
where $Q_{\tilde{S}_T}$ and $D_{S_T}$ are as defined in Theorem 2.2.
Comparing Theorems 2.2 and 3.2, we find that the presence of the I(0) stationary
regressors in (3.9) (i.e., additional lagged changes in $y_t$ and the lagged changes
in $x_t$ which are introduced to deal with the residual serial correlation problem)
does not affect the asymptotic properties of the OLS estimator of the long-run
coefficients, $\delta$ and $\theta$. Therefore, inferences concerning the long-run
parameters can be based on the same standard tests as given by (2.12) and (2.13). In
this more general case, however, the expression for the asymptotic variance of
$\hat{\lambda}_T$ is still given by (2.11), but with $\sigma_u^2/(1-\phi)^2$ replaced
by the more general expression, $\sigma_u^2/[\phi(1)]^2$.
We now relax assumption (A3) and allow for the possibility of endogenous regressors,
but confine our attention to the case where $\Delta x_t$ can be represented by a
finite order vector AR($s$) process,⁹
to be serially uncorrelated, but possibly contemporaneously correlated with $u_t$;
namely, we assume that $\zeta_t = (u_t, \varepsilon_t')'$ follows the multivariate iid
process with mean zero and the covariance matrix,
$$ \Sigma_{\zeta\zeta} = \begin{bmatrix} \sigma_u^2 & \Sigma_{u\varepsilon} \\ \Sigma_{\varepsilon u} & \Sigma_{\varepsilon\varepsilon} \end{bmatrix}. \tag{3.16} $$
We will, however, continue to assume that $\mathrm{Cov}(u_{t-j}, \varepsilon_{t-i}) = 0$
for $i \neq j$. Notice that despite this assumption the model is still general enough
to allow not only for the contemporaneous but also for cross-autocorrelations between
$u_t$ and $\Delta x_t$. With assumption (A3) relaxed, the OLS estimators in (3.1) are
no longer consistent. To correct for the endogeneity of $x_t$, we model the
contemporaneous correlation between $u_t$ and $\varepsilon_t$ by the linear regression
of $u_t$ on $\varepsilon_t$
$$ u_t = d'\varepsilon_t + \eta_t, \tag{3.17} $$
where using (3.16) we have $d = \Sigma_{\varepsilon\varepsilon}^{-1}\Sigma_{u\varepsilon}'$,
and $\varepsilon_t$ is strictly exogenous with respect to $\eta_t$.¹⁰ Substituting
(3.15) in (3.17) we obtain:
$$ u_t = d' P(L)\Delta x_t + \eta_t, \tag{3.18} $$
where $\Delta x_{t-i}$'s, $i = 0, \ldots, s$, are also strictly exogenous with respect
to $\eta_t$. The parametric correction for the endogenous regressors is then
equivalent to extending the ARDL($p, q$) model (3.2) to the more general ARDL($p, m$)
specification,
$$ \phi(L) y_t = \alpha_0 + \alpha_1 t + \beta' x_t + \sum_{j=0}^{m-1}\pi_j'\Delta x_{t-j} + \eta_t, \tag{3.19} $$
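The projection in (3.17) can be illustrated with a few lines of linear algebra. The sketch below takes a covariance matrix partitioned as in (3.16) and returns $d$, the implied variance of $\eta_t$, and the covariance between $\varepsilon_t$ and $\eta_t$, which is zero by construction. The function name and the numerical covariance matrix are hypothetical choices made for illustration.

```python
import numpy as np

def endogeneity_correction(sigma_zeta):
    """Partition Sigma_zeta_zeta of zeta_t = (u_t, eps_t')' as in (3.16) and return
    d = Sigma_ee^{-1} Sigma_eu, Var(eta_t) and Cov(eps_t, eta_t)."""
    S = np.asarray(sigma_zeta, dtype=float)
    s_uu = S[0, 0]                          # sigma_u^2
    S_eu = S[1:, 0]                         # Sigma_{eps u}, a k-vector
    S_ee = S[1:, 1:]                        # Sigma_{eps eps}, k x k
    d = np.linalg.solve(S_ee, S_eu)         # regression coefficients of u_t on eps_t
    var_eta = s_uu - S_eu @ d               # Var(u_t - d' eps_t)
    cov_eps_eta = S_eu - S_ee @ d           # zero vector by construction
    return d, var_eta, cov_eps_eta

# Hypothetical bivariate case (k = 1) with unit variances and correlation 0.5.
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
d, var_eta, cov_eps_eta = endogeneity_correction(sigma)
print(d, var_eta, cov_eps_eta)
```

Because $\mathrm{Var}(\eta_t) = \sigma_u^2 - \Sigma_{u\varepsilon}\Sigma_{\varepsilon\varepsilon}^{-1}\Sigma_{\varepsilon u}$, the error variance of the augmented regression is never larger than $\sigma_u^2$.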
Theorem 3.3. Under the assumptions (A3)$'$, (A4) and (A5), the OLS estimators of the
short-run parameters in (3.19), $\alpha_0$, $\alpha_1$, $\beta$,
$\phi_1, \ldots, \phi_p$, $\pi_0, \ldots, \pi_{m-1}$, are $\sqrt{T}$-consistent, and
asymptotically have the (mixture) normal distributions. Furthermore,
$\sqrt{T}(\hat{c}_T - c)$ is asymptotically perfectly collinear with
$\sqrt{T}[\hat{\phi}_T(1) - \phi(1)]$, where $c = (\alpha_1, \beta')'$ and
$\phi(1) = 1 - \sum_{i=1}^{p}\phi_i$, such that the covariance matrix for the
estimators of the short-run parameters is asymptotically singular with rank equal to
$km + 2$. The asymptotic distribution of the OLS estimators of the long-run
parameters,
$\hat{\lambda}_T = (\hat{\delta}_T, \hat{\theta}_T')' = \hat{c}_T/\hat{\phi}_T(1)$ in
(3.19), is mixture normal and therefore,
$$ Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1}(\hat{\lambda}_T - \lambda) \overset{a}{\sim} N\left\{0, \frac{\sigma_\eta^2}{[\phi(1)]^2}\, I_{k+1}\right\}, \tag{3.20} $$
where $\sigma_\eta^2$ is the variance of $\eta_t$ in (3.19), and $Q_{\tilde{S}_T}$ and
$D_{S_T}$ are as defined in Theorem 2.2.
There are no fundamental differences between the results of Theorems 2.2, 3.2 and 3.3,
as far as the estimators of the long-run parameters are concerned. A comparison of
(2.11), (3.14) and (3.20) shows that the asymptotic distributions of the estimators of
the long-run parameters, $\hat{\lambda}_T$, under various assumptions discussed above
differ only by a scalar coefficient.
In sum, in the context of the ARDL model inference on the long-run parameters,
$\delta$ and $\theta$, is quite simple and requires a priori knowledge or estimation
of the orders of the extended ARDL($p, m$) model. Appropriate modification of the
orders of the ARDL model is sufficient to simultaneously correct for the residual
serial correlation and the problem of endogenous regressors. Variances of the OLS
estimators of the long-run coefficients can then be consistently estimated using
either (3.20), or by means of the $\Delta$-method applied directly to the long-run
estimators. Alternatively, one could compute the estimates of the long-run
coefficients and their associated standard errors using Bewley's (1979) regression
procedure. Bewley's method involves rewriting (3.19) as
$$ y_t = \frac{\alpha_0}{\phi(1)} + \delta t + \theta' x_t + \frac{1}{\phi(1)}\sum_{j=0}^{m-1}\pi_j'\Delta x_{t-j} - \frac{1}{\phi(1)}\sum_{j=0}^{p-1}\phi_j^{*}\Delta y_{t-j} + \frac{\eta_t}{\phi(1)}, \tag{3.21} $$
and then estimating it by the instrumental variable method using
$(1, t, x_t, \Delta x_t, \Delta x_{t-1}, \ldots, \Delta x_{t-m+1}, y_{t-1}, y_{t-2}, \ldots, y_{t-p})$
as instruments. It is easy to show that the IV estimators of $\delta$ and $\theta$
obtained using (3.21) are numerically identical to the OLS estimators of $\delta$ and
$\theta$ based on the ARDL model (3.19), and that the standard errors of the IV
estimators from Bewley's regression are numerically identical to the standard errors
of the OLS estimators of $\delta$ and $\theta$ obtained using the $\Delta$-method.
(See, for example, Bardsen (1989).) The main attraction of Bewley's regression
procedure lies in its possible computational convenience as compared to the direct
OLS estimation of (3.19) and computation of the associated standard errors by the
$\Delta$-method.¹¹
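The numerical equivalence between Bewley's IV regression and the OLS-plus-$\Delta$-method route can be checked on simulated data. The sketch below does so for the simplest case, an ARDL(1,0) model without a trend and with a scalar regressor, where $\phi^{*}_0 = \phi$ and the instruments are $(1, x_t, y_{t-1})$. The data-generating values, the variable names and the degrees-of-freedom divisor $T - 3$ (applied to both routes) are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, alpha0, phi, beta = 200, 0.5, 0.6, 1.0            # hypothetical parameter values
x = np.cumsum(rng.standard_normal(T + 1))            # I(1) regressor
y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = alpha0 + phi * y[t - 1] + beta * x[t] + rng.standard_normal()

y_t, y_lag, x_t = y[1:], y[:-1], x[1:]
ones = np.ones(T)

# Route 1: OLS on the ARDL form, y_t on (1, x_t, y_{t-1}), then the Delta-method.
Z = np.column_stack([ones, x_t, y_lag])
h = np.linalg.solve(Z.T @ Z, Z.T @ y_t)              # (alpha0_hat, beta_hat, phi_hat)
u_hat = y_t - Z @ h
s2_u = u_hat @ u_hat / (T - 3)
theta_ols = h[1] / (1.0 - h[2])
grad = np.array([0.0, 1.0, theta_ols]) / (1.0 - h[2])
se_delta = np.sqrt(grad @ (s2_u * np.linalg.inv(Z.T @ Z)) @ grad)

# Route 2: Bewley form, y_t on (1, x_t, Delta y_t), with Z as instruments.
W = np.column_stack([ones, x_t, y_t - y_lag])
ZW_inv = np.linalg.inv(Z.T @ W)
g = ZW_inv @ (Z.T @ y_t)                             # just-identified IV estimator
e_hat = y_t - W @ g
s2_e = e_hat @ e_hat / (T - 3)
cov_iv = s2_e * ZW_inv @ (Z.T @ Z) @ ZW_inv.T        # usual IV covariance matrix
theta_iv, se_iv = g[1], np.sqrt(cov_iv[1, 1])

print(theta_ols, theta_iv)
print(se_delta, se_iv)
```

The coefficient on $x_t$ in the Bewley regression is the long-run coefficient itself, so its standard error can be read off directly without a separate $\Delta$-method step.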
Finally, we note in passing that the results developed in this section also apply to
the case where the underlying regressors, $x_t$, given by (3.15), are I(0). (See
footnote 7 and the Monte Carlo simulation results in Section 5.)
$$ y_t = \mu + \delta t + \theta' x_t + v_t, \tag{4.1} $$
$$ \Delta x_t = e_t. \tag{4.2} $$
Although the OLS estimator of $\theta$ is shown to be $T$-consistent (see Stock
(1987)), it has also been found that the finite sample behavior of the OLS estimator
is generally very poor (see, for example, Banerjee et al. (1986)). In particular, in
the presence of non-zero correlation between $v_t$ and $e_t$, OLS estimators of
$\theta$ in (4.1) are often heavily biased in finite samples, and inferences based on
them are invalid because of the dependence of the limiting distribution of the OLS
estimators on nuisance parameters. For details see Phillips and Loretan (1991).
Broadly speaking, there are two basic approaches to cointegration analysis:
Johansen's (1991) maximum likelihood approach, and Phillips-Hansen's (1990, PH) fully
modified OLS procedure.¹² The ARDL approach to cointegration analysis advanced in this
paper is directly comparable to the PH procedure, and we shall, therefore, concentrate
on this method. PH assume that $v_t$ and $e_t$ in (4.1) and (4.2) follow the general
correlated linear stationary processes:¹³
where $\zeta_t = (u_t, \varepsilon_t')'$ are serially uncorrelated random variables
with zero means and a constant variance matrix given by (3.16). Assuming $A_1(L)$ and
$A_2(L)$ are invertible, (4.1) can be approximated as an ARDL specification by
truncating the order of the infinite order lag polynomials $[A_1(L)]^{-1}$ and
$[A_2(L)]^{-1}$ such that $\phi(L) \approx [A_1(L)]^{-1}$ and
$P(L) \approx [A_2(L)]^{-1}$, where the orders of the lag polynomials $\phi(L)$ and
$P(L)$ are denoted by $p$ and $s$, respectively. Then we obtain the approximate
finite-dimensional ARDL($p, m$) specification,
$$ \phi(L) y_t = \{\phi(1)\mu + \delta\phi'(1)\} + \delta\phi(1)\,t + \phi(L)\theta' x_t + \Sigma_{u\varepsilon}\Sigma_{\varepsilon\varepsilon}^{-1}P(L)\Delta x_t + \eta_t, \tag{4.4} $$
where $\phi'(1) = -\sum_{i=1}^{p} i\phi_i$, $m = \max(p, s+1)$, and by construction
$x_t$ (and $\Delta x_t$'s) are uncorrelated with $\eta_t$.¹⁴ Notice that (4.4) is of
the same form as (3.19), with the following relations among their parameters:
$\alpha_0 = \phi(1)\mu + \delta\phi'(1)$, $\alpha_1 = \delta\phi(1)$,
$\beta = \phi(1)\theta$,
$\pi'(L) = \phi^{*}(L)\theta' + \Sigma_{u\varepsilon}\Sigma_{\varepsilon\varepsilon}^{-1}P(L)$,
where $\phi^{*}(L)$ is defined by $\phi(L) = \phi(1) + (1 - L)\phi^{*}(L)$. Therefore,
the ARDL specification (4.4) and the static cointegrating formulation, (4.1) and
(4.2), represent alternative ways of modelling the serial correlation in $v_t$'s and
the endogeneity of $x_t$.
Here we examine the PH estimation procedure in the context of the ARDL approximation
for the $y_t$ process given by (4.4). Assuming that $\xi_t = (v_t, e_t')'$ in (4.1)
and (4.2) satisfy the multivariate invariance principle, the long-run variance matrix
of $\xi_t$ is given by¹⁵
$$ \Omega_\xi = \underset{T\to\infty}{\mathrm{Plim}}\left\{ T^{-1}\sum_{t=1}^{T}\xi_t\xi_t' + T^{-1}\sum_{j=1}^{\ell}\left[\sum_{t=j+1}^{T}\xi_t\xi_{t-j}' + \sum_{t=j+1}^{T}\xi_{t-j}\xi_t'\right]\right\}, \tag{4.5} $$
where the lag truncation parameter $\ell$ increases with $T$, such that
$\ell/T \to 0$, as $T \to \infty$. We also define
$$ \Delta_\xi = \underset{T\to\infty}{\mathrm{Plim}}\; T^{-1}\left\{\sum_{t=1}^{T}\xi_t\xi_t' + \sum_{j=1}^{\ell}\sum_{t=j+1}^{T}\xi_t\xi_{t-j}'\right\}, \tag{4.6} $$
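Sample analogues of (4.5) and (4.6) for a given truncation lag can be sketched as follows. The function name is hypothetical, the input is assumed to be a mean-zero $T \times n$ array holding $\xi_t'$ in its rows, and the optional Bartlett weights $1 - j/(\ell + 1)$ are the standard textbook form, which may differ in detail from the window used in the paper's experiments.

```python
import numpy as np

def long_run_matrices(xi, ell, bartlett=False):
    """Finite-sample versions of Omega_xi in (4.5) and Delta_xi in (4.6).
    xi: (T x n) array of mean-zero observations; ell: lag truncation parameter."""
    xi = np.asarray(xi, dtype=float)
    T = xi.shape[0]
    gamma0 = xi.T @ xi / T                          # T^{-1} sum_t xi_t xi_t'
    omega = gamma0.copy()
    delta = gamma0.copy()
    for j in range(1, ell + 1):
        w = 1.0 - j / (ell + 1.0) if bartlett else 1.0
        gamma_j = xi[j:].T @ xi[:-j] / T            # T^{-1} sum_{t=j+1}^T xi_t xi_{t-j}'
        omega += w * (gamma_j + gamma_j.T)          # two-sided sum, as in (4.5)
        delta += w * gamma_j                        # one-sided sum, as in (4.6)
    return omega, delta, gamma0

rng = np.random.default_rng(0)
xi = rng.standard_normal((100, 2))                  # hypothetical (v_t, e_t) draws
omega, delta, gamma0 = long_run_matrices(xi, ell=4)
print(omega)
print(delta)
```

By construction the two matrices satisfy $\Omega = \Delta + \Delta' - \Gamma_0$, where $\Gamma_0$ is the contemporaneous second-moment matrix, which gives a quick consistency check on any implementation.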
problem. To deal with the cross-correlations between $v_t$ and current and lagged
values of $e_t$, PH consider the modified error process, denoted by $v_t^{+}$, which
is obtained from the regression of $v_t$ on $e_t$,
$$ v_t^{+} = v_t - \Omega_{ve}\Omega_{ee}^{-1}e_t, \tag{4.7} $$
and $v_t^{+}$ is not correlated with $e_t$ by construction. Then, the long-run
variance matrix of $\xi_t^{+} = (v_t^{+}, e_t')'$, denoted by $\Omega_\xi^{+}$, is
block diagonal; that is, $\Omega_\xi^{+} = \mathrm{diag}(\omega_{v\cdot e}, \Omega_{ee})$,
where
$$ \omega_{v\cdot e} = \omega_{vv} - \Omega_{ve}\Omega_{ee}^{-1}\Omega_{ev}, \tag{4.8} $$
is the conditional long-run variance of $v_t$ given $e_t$. Combining (4.7) with (4.1)
we have the modified "static" cointegrating relation,
$$ y_t^{+} = \mu + \delta t + \theta' x_t + v_t^{+}, \tag{4.9} $$
where $y_t^{+} = y_t - \Omega_{ve}\Omega_{ee}^{-1}\Delta x_t$. There is still a bias
term remaining in (4.9) because of the correlation between $x_t$ and current and
lagged values of $v_t^{+}$, which is given by
$\Delta_{ev}^{+} = \Delta_{ev} - \Delta_{ee}\Omega_{ee}^{-1}\Omega_{ev}$. Removing
this bias leads to the Phillips-Hansen fully-modified OLS estimators,
$$ \begin{bmatrix} \hat{\mu}_T^{+} \\ \hat{\delta}_T^{+} \\ \hat{\theta}_T^{+} \end{bmatrix} = (Z_T' Z_T)^{-1}\left\{ Z_T'\hat{y}_T^{+} - \begin{bmatrix} 0 \\ 0 \\ \tau_k \end{bmatrix} T\hat{\Delta}_{ev}^{+} \right\}, \tag{4.10} $$
where $Z_T = (\tau_T, t_T, X_T)$, $\tau_k$ is the $k$-dimensional column unit vector,
and $\hat{y}_T^{+}$ and $\hat{\Delta}_{ev}^{+}$ are consistent estimators of $y_t^{+}$
and $\Delta_{ev}^{+}$, respectively.
Theorem 4.1. In the context of the ARDL specification (3.19) or (4.4), the long-run
variance of the Phillips-Hansen modified error process, $v_t^{+}$ in (4.9) (denoted by
$\omega_{v\cdot e}$) is equal to $\sigma_\eta^2/[\phi(1)]^2$, which is the spectral
density at zero frequency of $[\phi(L)]^{-1}\eta_t$ in (3.19).
Before discussing the simulation results, notice that when $\omega_{12} = 0$, the
correct specification is the ARDL(1,0) model, and when $\omega_{12} \neq 0$, it is the
ARDL(1,2) model. (See Section 3.) But since in general the true order of the ARDL
model is not known a priori, we estimated 30 different ARDL models, namely
ARDL($p, m$), $p = 1, 2, \ldots, 5$, $m = 0, 1, 2, \ldots, 5$, and used the Akaike
Information Criterion (AIC), and the Schwarz Criterion (SC) to select the orders of
the ARDL model before estimating the long-run coefficients and carrying out
inferences. The estimates obtained by these two-step procedures will be referred to as
ARDL-AIC, and ARDL-SC, respectively.
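The two-step strategy can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the procedure used for the reported tables: it uses a scalar regressor, omits the trend, estimates all 30 candidate models on a common sample, and uses the textbook criteria $n\log\hat{\sigma}^2 + 2k$ (AIC) and $n\log\hat{\sigma}^2 + k\log n$ (SC), which may be scaled differently from the paper's definitions. The function names and the simulated data are hypothetical.

```python
import numpy as np

def ardl_design(y, x, p, m, max_lag):
    """Regressors (1, y_{t-1..t-p}, x_t, dx_t, ..., dx_{t-m+1}) on the sample t >= max_lag."""
    dx = np.diff(x, prepend=x[0])                    # dx[t] = x[t] - x[t-1]; dx[0] is never used
    rows = np.arange(max_lag, y.size)
    cols = [np.ones(rows.size)]
    cols += [y[rows - i] for i in range(1, p + 1)]   # lagged dependent variable
    cols += [x[rows]]                                # level of the regressor
    cols += [dx[rows - j] for j in range(m)]         # m current and lagged changes
    return y[rows], np.column_stack(cols)

def select_ardl(y, x, p_max=5, m_max=5):
    """Return {criterion: (p, m, coefficients)} and the number of models estimated."""
    max_lag = max(p_max, m_max)
    best, n_models = {}, 0
    for p in range(1, p_max + 1):
        for m in range(0, m_max + 1):
            yy, X = ardl_design(y, x, p, m, max_lag)
            n, k = X.shape
            coef = np.linalg.lstsq(X, yy, rcond=None)[0]
            resid = yy - X @ coef
            log_s2 = np.log(resid @ resid / n)
            crit = {"AIC": n * log_s2 + 2 * k, "SC": n * log_s2 + k * np.log(n)}
            for name, value in crit.items():
                if name not in best or value < best[name][0]:
                    best[name] = (value, p, m, coef)
            n_models += 1
    return {name: v[1:] for name, v in best.items()}, n_models

def long_run_theta(p, coef):
    """Long-run coefficient beta / (1 - sum of phi_i) from an estimated ARDL(p, m)."""
    return coef[p + 1] / (1.0 - coef[1:p + 1].sum())

rng = np.random.default_rng(0)
T = 200
x = np.cumsum(rng.standard_normal(T))                # I(1) regressor
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.8 * y[t - 1] + 0.2 * x[t] + rng.standard_normal()

chosen, n_models = select_ardl(y, x)
for name, (p, m, coef) in chosen.items():
    print(name, (p, m), long_run_theta(p, coef))
```

Estimating every candidate on the same observations keeps the criteria comparable across models; with that convention, and a sample large enough that $\log n > 2$, the model selected by SC never has more parameters than the one selected by AIC.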
The simulation results are summarized in Tables 1a-1f and 2a-2f for Experiments 1 and
2, respectively. Summary statistics included in these tables are:
The nominal size of the tests is set at 5 percent, and the number of replications at
$R = 2{,}500$.¹⁶
Tables 1a-1f summarize the results for the correctly specified ARDL model (namely the
ARDL(1,0) when $\omega_{12} = 0$, and the ARDL(1,2) for $\omega_{12} \neq 0$), the
estimates based on ARDL-AIC and the ARDL-SC procedures, and the Phillips-Hansen fully
modified estimators based on the Bartlett's window for window sizes 0, 5, 10, 20 and
40, which are reported under PH(0), PH(5), etc.
In the case where $\omega_{12} = 0$, the bias of the ARDL estimators is much smaller
than that of the PH estimators. The extent of the bias crucially depends on the value
of $\phi$, and not surprisingly increases as $\phi$ is increased from 0.2 in Table 1a
to 0.8 in Table 1d. Also the RMSE's of the ARDL and the PH estimators are very similar
when $\phi = 0.2$, but diverge considerably for $\phi = 0.8$. As can be seen from
Table 1d, for $T = 50$, the RMSE of the ARDL estimators is about one-third of the RMSE
of the PH estimators. The empirical sizes of the ARDL procedure are much more
satisfactory than the ones obtained using the PH fully modified estimators. When
$\omega_{12} = 0$, the sizes of the tests based on the ARDL estimators are generally
reasonable and much nearer to their nominal size of 5 percent, than the sizes of tests
based on the PH estimators.
¹⁶ In a very small number of replications $\phi(1)$ was estimated to be in excess of
0.99. These cases are not included in the summary results.
Empirical sizes of the tests based on the ARDL estimators computed using the
$\Delta$-method tend to be much closer to their nominal values, than those computed
using the asymptotic formula. This is particularly so when $T$ is small. Therefore, in
what follows, we shall focus on the ARDL test statistics that are computed using the
$\Delta$-method.
Another general feature of the simulation results is the slight superiority of the
ARDL-SC method over the ARDL-AIC procedure; which is in accordance with the fact that
the SC is a consistent model selection criterion, while the AIC is not. (See, for
example, Lütkepohl (1991, Chapter 4).)
Finally, there is a clear tendency for the tests based on the PH method to over-reject
in small samples, and the extent of this over-rejection increases with $\phi$, and
declines only slowly with the sample size, $T$. For example, for $\phi = 0.8$ and
$T = 100$, the empirical sizes of the t-tests based on the PH method exceed 41 percent
for all the five window sizes, and even for $T = 250$ do not fall below 20 percent.
(See the column headed "SIZE" in Table 1d.) By contrast the size of the test based on
the $\Delta$-method in Table 1d is reasonable even for $T = 50$. For the correct
ARDL(1,0) specification, the size of the test based on the $\Delta$-method is 7.2
percent and increases to 12.8 and 8.6 percent for the ARDL-AIC and the ARDL-SC
procedures, respectively.
Similar results are obtained in the case where $\omega_{12} = 0.5$, and hence $x_t$
and $u_t$ are contemporaneously correlated. The ARDL estimators are now substantially
less biased than the PH estimators. (See the column headed "BIAS" in Table 1e.) Once
again the performance of the PH estimators improves with the sample size, but very
slowly. For $T = 250$, the bias of the PH estimators for the most favorable window
size is still around -0.14, but the biases of the ARDL estimators lie between -0.0017
and 0.0024. The size performance of the two test procedures also closely mirrors these
differences in the degree of biases of the estimators. The empirical size of the tests
based on the PH method ranges from 60 to 85 percent for $T = 50$, and falls to around
21 percent for $T = 250$ and a window size of 20. The size of the tests based on the
ARDL procedure, when the $\Delta$-method is used to compute the variances, is at most
13 percent for $T = 50$, and lies in the range 5.2 to 7.7 percent when $T$ is
increased to 250. (See Table 1e.)
Due to the large size distortions of the PH procedure, the results presented in Tables
1a-1f do not allow proper comparisons of the power properties of the two test
procedures. But for $T = 250$ where the size distortion of the PH test is not too
excessive, the ARDL procedure consistently outperforms the PH method. For example, in
the case of $\phi = 0.8$, $\omega_{12} = 0.5$, $\theta = 5$, and $T = 250$, the power
of the ARDL procedure in rejecting the false null hypothesis,
$\theta = 0.95\theta_0$, is consistently above 98 percent while the power of the PH
method is at most 62 percent even though its associated size is 85 percent! There
seems also to be a tendency for the power function of the ARDL procedure in the case
where $\omega_{12} \neq 0$ and $T$ small to be asymmetric around $\theta = \theta_0$,
showing a higher power for the alternatives exceeding $\theta_0$ as compared to the
alternatives falling below $\theta_0$.
The results for Experiment 2 with an I(0) regressor are summarized in Tables 2a-2f.
These results are very similar to those obtained for Experiment 1. The overall
performances of the ARDL-based methods with variances estimated using the
$\Delta$-method are satisfactory for most cases, though slightly worse than those
obtained for Experiment 1. (In particular, the biases are slightly larger and the
tests are less powerful.) But the performance of the PH estimators is still very poor,
especially when $T$ is small.
Overall, the simulation results show that the ARDL-based estimation procedure
based on the $\Delta$-method developed in the paper can be reliably used in small
samples to estimate and test hypotheses on the long-run coefficients, whether
the underlying regressors are I(1) or I(0). This is an important finding since
the ARDL approach can avoid the pretesting problem implicitly involved in the
cointegration analysis of long-run relationships. (See also Cavanaugh et al.
(1995) and Pesaran (1997).)
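To make the $\Delta$-method step concrete, the following is a minimal sketch for the simplest case, an ARDL(1,0) regression $y_t = \alpha + \phi y_{t-1} + \beta x_t + u_t$, in which the long-run coefficient is $\theta = \beta/(1-\phi)$. The function name, the argument order and the numerical inputs are illustrative assumptions, not taken from the paper; the covariance matrix of the short-run estimates is assumed to be supplied by the OLS routine in use.

```python
import numpy as np

def long_run_delta_method(beta, phi, cov):
    """Long-run coefficient theta = beta / (1 - phi) and its delta-method variance.

    cov is the 2x2 covariance matrix of the short-run estimates,
    ordered as (beta, phi). Hypothetical helper for illustration only.
    """
    theta = beta / (1.0 - phi)
    # Gradient of theta with respect to (beta, phi).
    grad = np.array([1.0 / (1.0 - phi), beta / (1.0 - phi) ** 2])
    var_theta = float(grad @ np.asarray(cov, dtype=float) @ grad)
    return theta, var_theta

# Illustrative (made-up) short-run estimates and covariance matrix.
theta_hat, var_hat = long_run_delta_method(
    beta=0.2, phi=0.8, cov=[[0.01, -0.002], [-0.002, 0.004]]
)
```

A t-ratio for a hypothesis such as $\theta = \theta_0$ would then be formed as $(\hat\theta - \theta_0)/\sqrt{\widehat{Var}(\hat\theta)}$; for higher-order ARDL models the same idea applies with a longer gradient vector.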
Before concluding this section, we note that the comparison of the small sample
performance of the ARDL-based and the PH estimators is not comprehensive,
in the sense that the data generating process we have used is biased in favor of the
ARDL procedure (see Inder (1993)). In this regard, it is more appropriate to consider
the relative performance of the ARDL and the PH estimators using more
general DGPs, such as (4.1) and (4.2), that allow for moving average error
processes. In the working paper version of this paper we also considered Monte
Carlo experiments using (4.1) and (4.2) as data generating processes. In one set of
experiments (called DGP 2) we used first-order bivariate vector moving-average
processes to generate the errors, $v_t$ and $e_t$, and in another set of experiments
(called DGP 3) we generated $v_t$ and $e_t$ according to first-order vector autoregressive
processes. Neither of these DGPs allows transformations of the model such
that $x_t$ becomes strictly exogenous with respect to the disturbances of the
augmented ARDL model. We found that the simulation results based on these
DGPs are less clear-cut, but the ARDL-based estimator using the $\Delta$-method
still outperforms the PH estimator in most experiments, especially for small $T$.
Broadly speaking, the relative small sample performance of the two estimators
seems to depend on the signal-to-noise ratio, $Var(e_t)/Var(v_t)$, with the ARDL
approach dominating the PH method when this ratio is low, and vice versa. This
is clearly an area for further research.¹⁷
6. Concluding Remarks
The theoretical analysis and the Monte Carlo results presented in this paper provide
strong evidence in favor of a rehabilitation of the traditional ARDL approach
to time series econometric modelling. The focus of this paper, however, has been
exclusively on single equation estimation techniques, and the important issue of
system estimation is not addressed here. Such an analysis inevitably involves
the problem of identification of short-run and long-run relations and demands
a structural approach to the analysis of econometric models. The problem of
long-run structural modelling in the context of an unrestricted VAR model has
been addressed elsewhere. (See, for example, Johansen (1991), Phillips (1991)
and Pesaran and Shin (1995).) An alternative procedure, which takes us back to
the Cowles Commission approach, would be to extend the ARDL methodology
advanced in this paper to systems of equations subject to short-run and/or long-run
identifying restrictions. (See, for example, Boswijk (1995) and Hsiao (1995).)
We hope to pursue this line of research in the future, thus establishing a closer
link between the recent cointegration analysis and the traditional simultaneous
equations econometric methodology.
¹⁷ We are grateful to Peter Boswijk and an anonymous referee for drawing our attention to
this point.
Appendix: Mathematical Proofs
For notational convenience we use $\overset{p}{\to}$, $\Rightarrow$ and $\overset{a}{\sim}$ to signify convergence
in probability, weak convergence in probability measure, and asymptotic
equality in distribution, respectively. All sums are over $t = 1, 2, \ldots, T$.
In the case where the regressors are stationary, the usual method of deriving
the asymptotic distribution of the OLS estimators of the short-run parameters in,
for example, (2.1), would be to apply Slutsky's theorem to $(P_{Z_T}' P_{Z_T})^{-1}$ and
$P_{Z_T}' u_T$ separately, where $P_{Z_T} = (\tau_T, t_T, X_T, y_{T,-1})$, after appropriately scaling
them by the sample size. (The appropriate scaling of $P_{Z_T}' P_{Z_T}$ in this case is given
by $D_{P_T} P_{Z_T}' P_{Z_T} D_{P_T}$, where $D_{P_T} = \mathrm{Diag}(T^{-1/2}, T^{-3/2}, T^{-1} I_k, T^{-1})$.) This procedure
cannot, however, be applied to dynamic time series models with trended regressors
(irrespective of whether the trends are stochastic or deterministic), because
$P_{Z_T}' P_{Z_T}$ does not converge to a non-singular matrix even if the individual elements
of $P_{Z_T}' P_{Z_T}$ are appropriately scaled by the sample size.
In what follows the asymptotic theory will be developed using partitioned
regression techniques and then writing individual elements of the OLS estimators
of the short-run parameters as ratios of random variables, thus avoiding the need
to apply Slutsky's theorem to $(P_{Z_T}' P_{Z_T})^{-1}$ directly.
Since Theorems 2.1-2.4 are special cases of Theorems 3.1 and 3.2, and can
be proved in a similar manner, we omit their proofs to save space.
Proof of Theorem 3.1.
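Before deriving the asymptotic distributions of the OLS estimators of the short-run
parameters in (3.9) we derive some preliminary results. Define
\[ q_{K_T u_T} = T^{-1/2} K_T' u_T, \qquad Q_{K_T} = T^{-1} K_T' K_T, \]
\[ q_{G_T u_T} = D_{G_T} G_T' u_T = \begin{bmatrix} D_{Z_T} Z_T' u_T \\ T^{-1/2} W_T' u_T \end{bmatrix} = \begin{bmatrix} q_{Z_T u_T} \\ q_{W_T u_T} \end{bmatrix}, \]
\[ q_{G_T K_T} = D_{G_T} G_T' K_T = \begin{bmatrix} D_{Z_T} Z_T' K_T \\ T^{-1/2} W_T' K_T \end{bmatrix} = \begin{bmatrix} q_{Z_T K_T} \\ q_{W_T K_T} \end{bmatrix}, \]
\[ Q_{G_T} = D_{G_T} G_T' G_T D_{G_T} = \begin{bmatrix} D_{Z_T} Z_T' Z_T D_{Z_T} & T^{-1/2} D_{Z_T} Z_T' W_T \\ T^{-1/2} W_T' Z_T D_{Z_T} & T^{-1} W_T' W_T \end{bmatrix} = \begin{bmatrix} Q_{Z_T} & Q_{Z_T W_T} \\ Q_{Z_T W_T}' & Q_{W_T} \end{bmatrix}, \]
where $K_T = (\kappa_{1T}, \kappa_{2T}, \ldots, \kappa_{pT})$ with $\kappa_{iT} = (\kappa_{i1}, \kappa_{i2}, \ldots, \kappa_{iT})'$ for $i = 1, \ldots, p$;
$D_{G_T} = \mathrm{Diag}(T^{-1/2}, T^{-3/2}, T^{-1} I_k, T^{-1/2} I_{kq})$ and $D_{Z_T} = \mathrm{Diag}(T^{-1/2}, T^{-3/2}, T^{-1} I_k)$. Using the
results in Phillips and Durlauf (1986), it is easily seen that as $T \to \infty$,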
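\[ q_{K_T u_T} \overset{p}{\to} q_{Ku}, \qquad Q_{K_T} \overset{p}{\to} Q_K, \tag{A.1} \]
\[ q_{G_T u_T} \Rightarrow q_{Gu} = \begin{bmatrix} q_{Zu} \\ q_{Wu} \end{bmatrix}, \qquad q_{G_T K_T} \Rightarrow q_{GK} = \begin{bmatrix} q_{ZK} \\ q_{WK} \end{bmatrix}, \tag{A.2} \]
\[ Q_{G_T} \Rightarrow Q_G = \begin{bmatrix} Q_Z & 0 \\ 0 & Q_W \end{bmatrix}, \tag{A.3} \]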
where $q_{Ku}$, $q_{Wu}$, $q_{WK}$, $Q_K$ and $Q_W$ are (finite) probability limits of $q_{K_T u_T}$, $q_{W_T u_T}$,
$q_{W_T K_T}$, $Q_{K_T}$ and $Q_{W_T}$, respectively, and $q_{Zu}$, $q_{ZK}$ and $Q_Z$ are functionals of
Brownian motions given by
\[ q_{Zu} = \begin{bmatrix} B_u(1) \\ \int_0^1 r \, dB_u(r) \\ \int_0^1 B_e(r) \, dB_u(r) \end{bmatrix}, \qquad q_{ZK} = \begin{bmatrix} B_K(1)' \\ \int_0^1 r \, dB_K(r)' \\ \int_0^1 B_e(r) \, dB_K(r)' \end{bmatrix}, \]
\[ Q_Z = \begin{bmatrix} 1 & \frac{1}{2} & \int_0^1 B_e(r)' \, dr \\ \frac{1}{2} & \frac{1}{3} & \int_0^1 r B_e(r)' \, dr \\ \int_0^1 B_e(r) \, dr & \int_0^1 r B_e(r) \, dr & \int_0^1 B_e(r) B_e(r)' \, dr \end{bmatrix}. \]
$B_u(r)$ is the scalar Brownian motion process with variance equal to $r$ times $\sigma_u^2$
(since $u_t$ is not serially correlated), $B_e(r)$ is a $k$-dimensional Brownian motion on
$r \in [0, 1]$ with variance equal to $r$ times the long-run variance of $e_t$, and $B_K(r)$
is the $p$-dimensional Brownian motion on $[0, 1]$ with variance equal to $r$ times the
long-run variance of $(\kappa_{1T}, \kappa_{2T}, \ldots, \kappa_{pT})$. The long-run variance of a stochastic
process is given by $2\pi$ multiplied by the spectral density of the process at zero
frequency. Notice that $Q_Z$ (or $Q_G$) is of full column rank by assumption (A4),
and the elements in $Q_Z$ involving $B_e(r)$ are random even asymptotically.
Since $\kappa_{1T}, \kappa_{2T}, \ldots, \kappa_{pT}$, and $1, t, x_t, \Delta x_t, \Delta x_{t-1}, \ldots, \Delta x_{t-q+1}$ are all distributed
independently of $u_t$, such that $B_K(r)$ and $B_e(r)$ are independent of $B_u(r)$, it
follows that
\[ q_{Ku} \overset{a}{\sim} N\left(0, \sigma_u^2 Q_K\right), \qquad q_{Gu} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_G\right), \tag{A.4} \]
where $MN$ denotes the mixture normal distribution. For details concerning the
theory of the mixture normal distribution see, for example, Phillips (1991). However,
this (mixture) normality result does not hold in the case of $q_{GK}$, because $x_t$
and the $\Delta x_{t-i}$ ($i = 0, \ldots, q-1$) are correlated with $\kappa_{it}$, $i = 1, \ldots, p$.
The OLS estimators of $f$ and $\phi$ in (3.9), denoted by $\hat{f}_T$ and $\hat{\phi}_T$, satisfy the
relations
\[ \hat{\phi}_T - \phi = (Y_T' M_{G_T} Y_T)^{-1} (Y_T' M_{G_T} u_T), \tag{A.5} \]
\[ \hat{f}_T - f = (G_T' G_T)^{-1} \left[ G_T' u_T - G_T' Y_T \left( \hat{\phi}_T - \phi \right) \right], \tag{A.6} \]
where $M_{G_T} = I_T - G_T (G_T' G_T)^{-1} G_T'$, with $I_T$ being the $T \times T$ identity matrix.
Using (3.7), $Y_T$ can be expressed as
\[ Y_T = G_T \Gamma + K_T, \tag{A.7} \]
where
\[ \Gamma = \begin{bmatrix} \mu_1 & \mu_2 & \cdots & \mu_p \\ \delta & \delta & \cdots & \delta \\ \theta & \theta & \cdots & \theta \\ g_1 & g_2 & \cdots & g_p \end{bmatrix}, \]
and $g_i = (g_{i0}', g_{i1}', \ldots, g_{i,q-1}')'$ is a $kq \times 1$ vector of parameters. Using (A.7) we have
\[ Y_T' M_{G_T} Y_T = K_T' K_T - K_T' G_T (G_T' G_T)^{-1} G_T' K_T, \]
\[ Y_T' M_{G_T} u_T = K_T' u_T - K_T' G_T (G_T' G_T)^{-1} G_T' u_T, \]
where we used $G_T' M_{G_T} = 0$. Using (A.1)-(A.3), it can be shown that as $T \to \infty$,
\[ T^{-1} (Y_T' M_{G_T} Y_T) = Q_{K_T} + o_p(1) \overset{p}{\to} Q_K, \tag{A.8} \]
\[ T^{-1/2} (Y_T' M_{G_T} u_T) = q_{K_T u_T} + o_p(1) \overset{p}{\to} q_{Ku}. \tag{A.9} \]
Multiplying (A.5) by $\sqrt{T}$, and using (A.8), (A.9) and (A.4), we obtain (3.10).
Next, substituting $Y_T$ from (A.7) in (A.6), we obtain
\[ \hat{f}_T - f = (G_T' G_T)^{-1} G_T' u_T - \Gamma \left( \hat{\phi}_T - \phi \right) - (G_T' G_T)^{-1} G_T' K_T \left( \hat{\phi}_T - \phi \right). \tag{A.10} \]
Define
\[ d_T = \left( \hat{f}_T - f \right) + \Gamma \left( \hat{\phi}_T - \phi \right). \tag{A.11} \]
Multiplying (A.11) by $D_{G_T}^{-1}$, using (A.1)-(A.3) and (A.10), and applying the
continuous mapping theorem (see, for example, Phillips and Durlauf (1986)), it
follows that
\[ D_{G_T}^{-1} d_T = Q_{G_T}^{-1} q_{G_T u_T} + o_p(1) \Rightarrow Q_G^{-1} q_{Gu}. \tag{A.12} \]
Since $q_{Gu}$ is shown to be mixture normal in (A.4), we have
\[ Q_G^{-1} q_{Gu} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_G^{-1}\right), \qquad Q_{G_T}^{1/2} D_{G_T}^{-1} d_T \overset{a}{\sim} N\left(0, \sigma_u^2 I_{k+kq+2}\right). \]
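Next, pre-multiplying (A.12) by the diagonal matrix $\mathrm{Diag}(1, T^{-1}, T^{-1/2} I_k, I_{kq})$,
we have
\[ \sqrt{T} d_T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & T^{-1} & 0 & 0 \\ 0 & 0 & T^{-1/2} I_k & 0 \\ 0 & 0 & 0 & I_{kq} \end{bmatrix} Q_{G_T}^{-1} q_{G_T u_T} + o_p(1) \tag{A.13} \]
\[ \Rightarrow \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{kq} \end{bmatrix} Q_G^{-1} q_{Gu} \overset{a}{\sim} MN\left\{ 0, \sigma_u^2 \begin{bmatrix} Q_Z^{11} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & Q_W^{-1} \end{bmatrix} \right\}, \]
where $Q_Z^{11}$ is the (1,1) element of $Q_Z^{-1}$. The above result can be rewritten separately
for $\hat{\alpha}_{0T}$, $\hat{c}_T$ and $\hat{\beta}^*_T$ as
\[ \sqrt{T} (\hat{\alpha}_{0T} - \alpha_0) + (\mu_1, \mu_2, \ldots, \mu_p) \sqrt{T} \left( \hat{\phi}_T - \phi \right) = d_{Zu,1} + o_p(1), \tag{A.14} \]
\[ \sqrt{T} (\hat{c}_T - c) + \lambda \tau_p' \sqrt{T} \left( \hat{\phi}_T - \phi \right) = o_p(1), \tag{A.15} \]
\[ \sqrt{T} \left( \hat{\beta}^*_T - \beta^* \right) + (g_1, g_2, \ldots, g_p) \sqrt{T} \left( \hat{\phi}_T - \phi \right) = Q_W^{-1} q_{Wu} + o_p(1), \tag{A.16} \]
where $\tau_p$ is a $p \times 1$ vector of ones and $d_{Zu,1}$ is the first element of $Q_Z^{-1} q_{Zu}$. Using
(3.10) in (A.15) we obtain (3.11). It is also clear from the above results that the
OLS estimators of $\alpha_0$ and $\beta^*$ (standardized by $\sqrt{T}$) have (mixture) normal
distributions asymptotically.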
Finally, using (3.10), (3.11), and (A.13)-(A.16), it is easily seen that a consistent
estimator of the variance of $\hat{h}_T$ is given by $\hat{V}(\hat{h}_T) = \hat{\sigma}^2_{uT} (P_{G_T}' P_{G_T})^{-1}$, with
the rank of $\hat{V}(\hat{h}_T)$ being equal to $kq + 2$.
Proof of Theorem 3.2.
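Partition $d_T = (a_T, s_T', w_T')'$ conformably to $G_T = (\tau_T, S_T, W_T)$; then $s_T$ is
given by
\[ s_T = \sqrt{T} (\hat{c}_T - c) + \lambda \tau_p' \sqrt{T} \left( \hat{\phi}_T - \phi \right). \tag{A.17} \]
Let
\[ q_{\tilde{S}_T u_T} = D_{S_T} S_T' H_T u_T, \qquad Q_{\tilde{S}_T} = D_{S_T} S_T' H_T S_T D_{S_T}, \]
where $D_{S_T} = \mathrm{Diag}(T^{-3/2}, T^{-1} I_k)$. Then, it is also easily seen that as $T \to \infty$,
\[ q_{\tilde{S}_T u_T} \Rightarrow q_{\tilde{S}u} = \begin{bmatrix} \int_0^1 (r - \frac{1}{2}) \, dB_u(r) \\ \int_0^1 \tilde{B}_e(r) \, dB_u(r) \end{bmatrix}, \tag{A.19} \]
\[ Q_{\tilde{S}_T} \Rightarrow Q_{\tilde{S}} = \begin{bmatrix} \frac{1}{12} & \int_0^1 (r - \frac{1}{2}) \tilde{B}_e(r)' \, dr \\ \int_0^1 (r - \frac{1}{2}) \tilde{B}_e(r) \, dr & \int_0^1 \tilde{B}_e(r) \tilde{B}_e(r)' \, dr \end{bmatrix}, \tag{A.20} \]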
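where $\tilde{B}_e(r) = B_e(r) - \int_0^1 B_e(r) \, dr$ is a $k$-dimensional demeaned Brownian motion
on $[0, 1]$. Since $\tilde{B}_e(r)$ is also distributed independently of $B_u(r)$, we obtain as in
(A.4),
\[ q_{\tilde{S}u} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_{\tilde{S}}\right). \tag{A.21} \]
Multiplying (A.18) by the diagonal matrix $\mathrm{Diag}(D_{S_T}^{-1}, T^{1/2})$, and using (A.19)-(A.21),
we obtain
\[ D_{S_T}^{-1} s_T \Rightarrow Q_{\tilde{S}}^{-1} q_{\tilde{S}u} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_{\tilde{S}}^{-1}\right), \]
and therefore,
\[ Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1} s_T \overset{a}{\sim} N\left(0, \sigma_u^2 I_{k+1}\right). \tag{A.22} \]
\[ \hat{\lambda}_T - \lambda = \frac{s_T}{\hat{\phi}_T(1)}. \tag{A.23} \]
Multiplying (A.23) by $Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1}$, using (A.22) and noting that $\hat{\phi}_T(1) \overset{p}{\to} \phi(1)$, we
obtain (3.14).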
Proof of Theorem 3.3 can be established in a similar manner and is omitted to
save space.
Proof of Theorem 4.1.
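Consider the dynamic ARDL($p$, $m$) model (3.19) (or (4.4)), and its static counterpart
(4.1). Applying the decomposition $\phi(L) = \phi(1) + (1 - L)\phi^*(L)$ to (3.19)
we have
\[ y_t = \frac{\alpha_0}{\phi(1)} + \delta t + \theta' x_t + \frac{\pi'(L)}{\phi(1)} \Delta x_t + \frac{\eta_t}{\phi(1)} - \frac{\phi^*(L)}{\phi(1)} \Delta y_t. \tag{A.24} \]
Substituting for $\Delta y_t = \delta + \theta' \Delta x_t + \Delta v_t$ from (4.1) in (A.24), we have
\[ y_t = \mu + \delta t + \theta' x_t + \frac{\pi'(L)}{\phi(1)} \Delta x_t + \frac{\eta_t}{\phi(1)} - \frac{\phi^*(L)}{\phi(1)} \left( \theta' \Delta x_t + \Delta v_t \right). \tag{A.25} \]
Defining $k_t = (\eta_t, v_t, \Delta x_t')' = (\eta_t, v_t, e_t')'$, and
\[ \Psi(L) = \left[ \frac{1}{\phi(1)}, \; -\frac{\phi^*(L)(1 - L)}{\phi(1)}, \; \frac{\pi'(L) - \phi^*(L)\theta'}{\phi(1)} \right], \]
then the spectral density of $v_t = \Psi(L) k_t$ is given by
where
\[ Var(k_t) = \begin{bmatrix} \sigma_\eta^2 & \sigma_{\eta v} & 0 \\ \sigma_{\eta v} & \sigma_v^2 & \Sigma_{ve} \\ 0 & \Sigma_{ve}' & \Sigma_{ee} \end{bmatrix}. \]
Hence, the spectral density of $v_t^+$ at zero frequency is given by
\[ 2\pi f_{v^+ v^+}(0) = \Psi^+(0) \, Var(k_t^+) \, \Psi^+(0)' = \frac{\sigma_\eta^2}{[\phi(1)]^2}. \]
\[ 2\pi f_{v^+ v^+}(0) = B \Omega_\xi B' = \omega_{vv} - \Omega_{ve} \Omega_{ee}^{-1} \Omega_{ev} = \frac{\sigma_\eta^2}{[\phi(1)]^2}. \]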
References
[1] Banerjee, A., J. Dolado, D. Hendry and G. Smith (1986), "Exploring Equilibrium
Relationships in Econometrics through Static Models: Some Monte
Carlo Evidence," Oxford Bulletin of Economics and Statistics, 48: 253-277.
[5] Cavanaugh, C.L., G. Elliott and J.H. Stock (1995), "Inference in Models with
Nearly Integrated Regressors," Econometric Theory, 11: 1131-1147.
[6] Engle, R.F. and C.W.J. Granger (1987), "Co-integration and Error Correction:
Representation, Estimation and Testing," Econometrica, 55: 251-276.
[7] Hendry, D., A. Pagan and J. Sargan (1984), "Dynamic Specification," Chapter
18 in Handbook of Econometrics, Vol. II (eds. Z. Griliches and M. Intriligator),
North-Holland.
[13] Pesaran, M.H. (1997), "The Role of Economic Theory in Modelling the Long-Run,"
The Economic Journal, 107: 178-191.
[14] Pesaran, M.H. and B. Pesaran (1997), Microfit 4.0: Interactive Econometric
Analysis, Oxford University Press (forthcoming).
[15] Pesaran, M.H. and Y. Shin (1995), "Long-Run Structural Modelling,"
unpublished manuscript, University of Cambridge.
[16] Pesaran, M.H., Y. Shin and R.J. Smith (1996), "Testing for the Existence of
a Long-Run Relationship," DAE Working Papers Amalgamated Series, No.
9622, University of Cambridge.
[18] Phillips, P.C.B. and S.N. Durlauf (1986), "Multiple Time Series Regression
with Integrated Processes," Review of Economic Studies, 53: 473-496.
[20] Phillips, P.C.B. and M. Loretan (1991), "Estimating Long Run Economic
Equilibria," Review of Economic Studies, 58: 407-436.
[21] Phillips, P.C.B. and V. Solo (1992), "Asymptotics for Linear Processes,"
Annals of Statistics, 20: 971-1001.
[24] Stock, J.H. and M.W. Watson (1993), "A Simple Estimator of Cointegrating
Vectors in Higher Order Integrated Systems," Econometrica, 61: 783-820.
[25] Wickens, M.R. and T.S. Breusch (1988), "Dynamic Specification, the Long-Run
and the Estimation of Transformed Regression Models," The Economic Journal,
98: 189-205.
Distributed Lag Models and ARDL Models
Christophe Hurlin
Abstract
This note gives a brief presentation of distributed lag models in general,
and of autoregressive distributed lag (ARDL) models in particular.
The aim is to understand the specific features and advantages of ARDL models by placing
them in perspective relative to dynamic distributed lag models. The first section
presents unrestricted distributed lag models. The second section is devoted to
restricted models (linear, geometric, etc.), and in particular
to Almon polynomial models. The third section presents models with a lagged
dependent variable: the Koyck, AR-X and ARDL models. The last section describes the
estimation procedures for these models in R and SAS.
Université d'Orléans (LEO, FRE CNRS 2014). This note was written as part of the preparation of
students of the ESA master's programme at the Université d'Orléans for the 2018 DRIM Game challenge (Deloitte - RCI Bank).
1 Introduction
Distributed lag models are dynamic time series models. Their distinguishing
feature is that the dynamics of the dependent variable $y$ are
explained by contemporaneous and lagged values of one or more explanatory variables
$x$. The main advantage of these models is that they allow richer dynamics (compared
with a simple linear model without lags on the explanatory variables) for the
marginal effects of the variables $x$ on the dependent variable. One can thus distinguish
the short-run dynamic marginal effects, which represent the instantaneous impact of the
contemporaneous variable $x_t$ (or of the lagged variable $x_{t-s}$) on $y_t$, from the long-run
cumulative effect of $x$ on the dependent variable $y$.
In general, a distinction is drawn between finite and infinite distributed lag models, according to
whether a finite or an infinite number of lagged values of the explanatory variable is considered.
Obviously, only finite distributed lag models can be estimated
in practice. However, even with a finite and relatively small number of
lags, estimating this type of model by OLS or GLS can be problematic.
The lagged values $x_t, x_{t-1}, \ldots, x_{t-q}$ are often highly correlated,
which induces a multicollinearity problem in the regression model. The OLS
coefficient estimates are then unreliable and may, in particular, take aberrant
values. Moreover, estimating these models requires large samples, given
the potentially large number of parameters to be estimated, which depends on the number of lags $q$
considered for the exogenous variable.
To overcome these problems, two types of solution have been considered in the literature.
The first consists in imposing restrictions on the coefficients associated with the
lagged values $x_t, x_{t-1}, \ldots, x_{t-q}$ of the explanatory variable (Almon, 1965; Smith and Giles,
1976; Madinier and Mouillart, 1983). This yields restricted distributed lag
models. These restrictions can take very different forms, but
they all aim (i) to limit the number of parameters to be estimated, (ii) to limit
potential near-collinearity problems, and (iii) to produce time profiles of
marginal effects that can be justified on economic grounds. On this last point, the main
prior one may hold about the marginal effects is that the instantaneous effect of the variable
$x_{t-s}$ on the level of $y_t$ diminishes over time, though not necessarily uniformly.
Several restricted models have been proposed to meet these three objectives. They
include the model with linearly declining lag parameters and the geometric
distributed lag model. But the most widely used is
undoubtedly the polynomial distributed lag model, or
Almon (1965) model. The idea is to postulate that the parameter associated with the lagged variable $x_{t-s}$
is an (unknown) function of the lag $s$, and that this function can be approximated by a
polynomial of order $p$, generally with $p \ll q$. It then suffices to estimate the parameters of this
polynomial to recover the coefficients associated with the lagged variables $x_{t-s}$. This
reduces the dimension of the problem and limits the risk of near-collinearity.
The second solution consists in introducing lagged values of the dependent variable. This
leads to an AR($p$) representation for $y_t$, augmented with current
and past values of an exogenous variable $x_t$. The simplest example is the Koyck (1954) model.
This very simple linear model explains the level of $y_t$ by a constant, the lagged value
$y_{t-1}$ and the contemporaneous level of an explanatory variable $x_t$. Note that in the
Koyck model no lag is introduced on the explanatory variable $x_t$, which rules out any
collinearity problem. What is the advantage of this model? By inverting the autoregressive polynomial associated
with $y_{t-1}$, one can show that this representation is equivalent to an infinite-order
distributed lag model with geometrically declining weights. Thus the Koyck model
is equivalent to a representation in which $y_t$ is explained by the variables
$x_t, x_{t-1}, x_{t-2}, x_{t-3}, \ldots, x_{-\infty}$, and yet estimating the model (which simply requires
regressing $y_t$ on $y_{t-1}$ and $x_t$) poses no problem related to the correlation between the
lagged values.
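The inversion argument can be written out in one line. The following is a sketch under assumed notation: the symbols $\alpha$, $\phi$, $\beta$ and the stationarity condition $|\phi| < 1$ are illustrative choices, not notation taken from the note.

```latex
% Koyck model (assumed notation), with |\phi| < 1:
y_t = \alpha + \phi\, y_{t-1} + \beta\, x_t + \varepsilon_t
\;\Longleftrightarrow\;
(1 - \phi L)\, y_t = \alpha + \beta\, x_t + \varepsilon_t .
% Inverting (1 - \phi L) with (1 - \phi L)^{-1} = \sum_{s \ge 0} \phi^s L^s :
y_t = \frac{\alpha}{1 - \phi}
      + \beta \sum_{s=0}^{\infty} \phi^{s} x_{t-s}
      + \sum_{s=0}^{\infty} \phi^{s} \varepsilon_{t-s} .
```

Under these assumptions the implied weight on $x_{t-s}$ is $\beta \phi^s$, which declines geometrically, and the sum of the weights is $\beta / (1 - \phi)$.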
In the terminology of Box and Jenkins (1976), the Koyck model resembles an
AR(1)-X model, where the letter X indicates the presence of the exogenous variable $x_t$ in the
conditional expectation equation of $y_t$. Naturally, this model can be extended to an
AR($p$)-X representation, which includes not just one lagged value $y_{t-1}$ but $p$ values $y_{t-1},
y_{t-2}, \ldots, y_{t-p}$. However, the Koyck model and its extension have an important drawback
when more than one exogenous variable is considered. In that case, the decline of the lag
coefficients (short-run marginal effects) with the time lag is identical for all
the explanatory variables. For example, the dynamic impacts on $y_t$ of two explanatory
variables $x_{1,t-s}$ and $x_{2,t-s}$ are assumed to evolve in the same way with the lag $s$. Such an
assumption is problematic because it generally corresponds neither to any theory nor to any
empirical observation. The ARDL (autoregressive distributed lag) model addresses
this criticism. Formally, this model allows lags to be introduced both on
the dependent variable and on the exogenous variable. As a result, the marginal effect of the variable $x_t$
on $y_t$ is determined by the ratio of two lag polynomials (hence the alternative name
rational lag model), the first being specific to the variable $x_t$ and the second to the
dependent variable. Consequently, two exogenous variables, associated with two lag polynomials, do not
necessarily have the same dynamic impact on the endogenous variable.
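The ratio-of-polynomials idea can be illustrated with a short recursion. This is a minimal sketch under assumed notation: the model is taken to be $y_t = \alpha + \sum_{j=1}^{p} \phi_j y_{t-j} + \sum_{s=0}^{q} \beta_s x_{t-s} + \varepsilon_t$, and the function names and numerical values are hypothetical. The dynamic multipliers $d_s$ are the coefficients of $\beta(L)/\phi(L)$ and satisfy $d_s = \beta_s + \sum_j \phi_j d_{s-j}$.

```python
def ardl_multipliers(phi, beta, horizon):
    """Dynamic multipliers d_0, ..., d_horizon implied by beta(L) / phi(L).

    phi  : coefficients (phi_1, ..., phi_p) on lagged y
    beta : coefficients (beta_0, ..., beta_q) on current and lagged x
    """
    d = []
    for s in range(horizon + 1):
        # Direct effect of x_{t-s} (zero beyond lag q).
        value = beta[s] if s < len(beta) else 0.0
        # Indirect effect transmitted through the lagged dependent variable.
        for j, phi_j in enumerate(phi, start=1):
            if s - j >= 0:
                value += phi_j * d[s - j]
        d.append(value)
    return d

def ardl_long_run(phi, beta):
    """Long-run multiplier beta(1) / phi(1), assuming a stable autoregressive part."""
    return sum(beta) / (1.0 - sum(phi))

# Two regressors sharing the same autoregressive part but with different lag polynomials.
m1 = ardl_multipliers(phi=[0.5], beta=[1.0], horizon=3)
m2 = ardl_multipliers(phi=[0.5], beta=[0.0, 2.0], horizon=3)
```

In this illustration the first regressor has its largest effect on impact, while the second has no impact effect and peaks one period later, so the two time profiles differ even though the autoregressive part is common.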
All these models can be estimated fairly easily using various procedures, whether
in SAS, Eviews, Matlab or R. In this note we give the main
functions for SAS and R.
The note is organised as follows. The first section
presents unrestricted distributed lag models. The second section
presents restricted models (linear, geometric, etc.), and in particular Almon polynomial
models. The third section is devoted to models with a lagged dependent
variable: the Koyck, AR-X and ARDL models. The last section presents the
estimation procedures for these models in R and SAS.
Definition 1 A linear distributed lag model is written as
\[ y_t = \alpha + \beta(L) x_t + \varepsilon_t = \alpha + \sum_{s=0}^{q} \beta_s x_{t-s} + \varepsilon_t \tag{1} \]
where $\{\varepsilon_t, t \in \mathbb{Z}\}$ is a weak white noise, $L$ denotes the lag operator, and $\beta(L)$ is a lag polynomial
of order $q$ with $\beta(L) = \sum_{s=0}^{q} \beta_s L^s$ and $\beta_q \neq 0$.
Note that this value is finite provided that the parameters $\beta_s$ satisfy
\[ \sum_{s=0}^{\infty} |\beta_s| < \infty \tag{4} \]
Now suppose that the value of the variable $x$ changes in period $t$. One can then distinguish
its immediate effect on $y_t$ (the impact multiplier, or short-run multiplier) from its
cumulative effect on the equilibrium value of $y$. The impact multiplier measures the immediate effect of a
marginal change in $x_t$ on $y_t$. Formally, this multiplier is defined by:
\[ \text{Short-run dynamic multiplier} = \frac{\partial y_t}{\partial x_{t-s}} = \frac{\partial y_{t+s}}{\partial x_t} = \beta_s \tag{5} \]
² For a detailed discussion of distributed lag models, their specification and their estimation, see
the survey by Dhrymes (1971).
The long-run multiplier is defined by
\[ \text{Long-run multiplier} = \frac{\partial y}{\partial x} = \sum_{s=0}^{\infty} \beta_s \tag{6} \]
For example, consider a finite distributed lag model of order 2 such that
\[ y_t = \alpha + 6 x_t - 2 x_{t-1} + 3 x_{t-2} + \varepsilon_t \tag{7} \]
Suppose that the variable $x$ increases temporarily by one unit at date $t$, then returns
to its initial level at date $t+1$. In this case, $y_t$ increases by 6 units at date $t$, since the
values $x_{t-1}$, $x_{t-2}$ and $\varepsilon_t$ are unchanged and $\partial y_t / \partial x_t = 6$. At date $t+1$, the value of $y_{t+1}$
falls by 2 units, since $\partial y_t / \partial x_{t-1} = -2$. Thus the quantity $\partial y_t / \partial x_{t-s}$ measures the
dynamic impact of a marginal change in $x_t$ on the successive values $y_t, y_{t+1}, y_{t+2}$, etc.
Now suppose that the variable $x_t$ increases permanently by one unit from
date $t$ onwards:
\[ x_s = \begin{cases} 0 & \text{if } s < t \\ 1 & \text{if } s \geq t \end{cases} \tag{8} \]
At date $t$, $y_t$ increases by 6 units, just as in the previous case. But at date $t+1$,
$y_{t+1}$ increases by $\partial y_t / \partial x_t + \partial y_t / \partial x_{t-1} = 6 - 2 = 4$ units. The limit of this cumulative effect is
determined by the sum of the lag coefficients, that is
\[ \frac{\partial y_t}{\partial x_t} + \frac{\partial y_t}{\partial x_{t-1}} + \frac{\partial y_t}{\partial x_{t-2}} = 6 - 2 + 3 = 7 \tag{9} \]
The long-run marginal effect of the variable $x$ on the equilibrium value of $y$ is therefore equal to 7
units.
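The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch in which the lag coefficients 6, -2 and 3 are those implied by the partial derivatives quoted in the example; the variable names are illustrative.

```python
from itertools import accumulate

# Lag coefficients beta_0, beta_1, beta_2 implied by the example.
beta = [6.0, -2.0, 3.0]

# Transitory unit shock at date t: the response of y at date t+s is beta_s.
impulse_response = list(beta)

# Permanent unit increase from date t: the responses accumulate over time.
cumulative_response = list(accumulate(beta))

# Long-run multiplier: the sum of all lag coefficients.
long_run_multiplier = sum(beta)
```

The cumulative responses are 6, 4 and 7 at dates $t$, $t+1$ and $t+2$, and the last value coincides with the long-run multiplier because the lag distribution is finite.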
As in the case of a simple linear model, the parameters $\beta_s$ can be estimated by
ordinary least squares (OLS) or generalized least squares
(GLS), assuming that the variable $x$ is strictly exogenous. The interpretation of the
coefficients $\beta_s$ follows the analysis of marginal effects presented above. The advantage of this
specification is that no restriction is imposed a priori on the parameters
$\beta_s$, and hence on the dynamic effects of $x$ on $y$.
However, estimating the parameters of a distributed lag model raises two main
problems. The first is multicollinearity. Even when the explanatory variable
$x$ is stationary, strong autocorrelations are frequently observed between the values
$x_t$ and $x_{t-s}$, especially at low orders. Strong correlations between the variables $x_t$, $x_{t-1}$,
$x_{t-2}, \ldots, x_{t-q}$ translate, in the regression model of equation (10), into a problem of
near multicollinearity³. The high level of correlation between the regressors can lead to
unreliable coefficient estimates⁴ with very large variances and standard errors.
³ Multicollinearity in the strict sense means that the regressor matrix $X = (x_t : x_{t-1} : \ldots : x_{t-q})$ does
not have full rank $q + 1$, i.e. that some columns can be written as exact linear combinations of the
other columns of the matrix. Consequently, the matrix $X'X$ is not invertible. Under near
multicollinearity, the matrix $X'X$ is invertible but its determinant is very close to 0.
Estimating the distributed lag model raises a second problem when the lag order
$q$ is relatively large compared with the sample size available for estimating the
parameters of the model. If the sample size is $T$, then, once the lags are accounted for,
only $T - q$ observations are available to estimate the $q + 2$ parameters of the model
(including the constant), which leaves $T - 2q - 2$ degrees of freedom. Each time the lag order
$q$ is increased by one, two degrees of freedom are lost: one because an additional
parameter must be estimated, and another because the effective sample size shrinks
by one observation. The estimates may therefore be imprecise if $T$ is relatively
small compared with the maximum number of lags $q$. There is no absolute rule for the
number of degrees of freedom required to guarantee both the convergence of the estimators and the
relevance of the asymptotic normality result used for inference. However,
it seems reasonable to be cautious in interpreting the estimation results when there are
fewer than 50 degrees of freedom. Naturally, this problem is not specific
to the distributed lag model and concerns all dynamic models (AR, MA,
ARIMA, etc.).
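The degrees-of-freedom count described above is easy to tabulate. The sketch below uses a hypothetical helper function and an illustrative sample size of 100; the threshold of 50 is the rule of thumb mentioned in the text.

```python
def degrees_of_freedom(T, q):
    """Degrees of freedom of a finite distributed lag regression with a constant."""
    n_obs = T - q        # usable observations once q lags have been formed
    n_params = q + 2     # the constant plus beta_0, ..., beta_q
    return n_obs - n_params

# Largest lag order that keeps at least 50 degrees of freedom when T = 100.
max_q = max(q for q in range(0, 100) if degrees_of_freedom(100, q) >= 50)
```

With $T = 100$ the count is $98 - 2q$, so each additional lag costs two degrees of freedom and the rule of thumb is met up to $q = 24$.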
In summary, the finite distributed lag model is
appropriate for estimating the dynamic relationships between $x$ and $y$ when (i) the parameters $\beta_s$
decline fairly quickly to zero with the order $s$, (ii) the explanatory variable $x_t$ is only weakly
autocorrelated, and (iii) the sample size $T$ is sufficiently large relative to
the lag order $q$.
3. The polynomial distributed lag model, also known
as the Almon distributed lag model.
⁴ One possible symptom of this near-multicollinearity problem is that the estimated
coefficients $\hat{\beta}_s$ sometimes alternate between positive and negative values that are very large in
absolute value, without any valid economic explanation. This type of behaviour may reflect
near multicollinearity, but this is not an absolute rule. It may simply reflect the fact
that the roots of the lag polynomial $\beta(L)$ are complex.
3.1 Model with linearly declining parameters
The idea is that the parameters $\beta_1, \beta_2, \beta_3, \ldots, \beta_s$ are linearly declining fractions of the
short-run multiplier $\beta_0$. In this case, one sets
\[ \beta_s = \frac{q + 1 - s}{q + 1} \beta_0, \qquad s = 1, \ldots, q \tag{11} \]
For example, with $q = 4$ the parameters $\beta_s$ are given by $\beta_1 = 4\beta_0/5$,
$\beta_2 = 3\beta_0/5$, $\beta_3 = 2\beta_0/5$ and $\beta_4 = \beta_0/5$. The distributed lag model of finite order $q$ is then written
as
\[ y_t = \alpha + \beta_0 \sum_{s=0}^{q} \frac{q + 1 - s}{q + 1} x_{t-s} + \varepsilon_t \tag{12} \]
In this specification, only the parameters $\alpha$ and $\beta_0$ need to be estimated. The
estimation procedure is then extremely simple. For a given order $q$, one constructs the
transformed explanatory variable $z_t$ defined by
\[ z_t = \sum_{s=0}^{q} \frac{q + 1 - s}{q + 1} x_{t-s} = x_t + \frac{q}{q + 1} x_{t-1} + \frac{q - 1}{q + 1} x_{t-2} + \ldots + \frac{1}{q + 1} x_{t-q} \tag{13} \]
Then $y_t$ is regressed on a constant and the variable $z_t$ by OLS or GLS:
\[ y_t = \alpha + \beta_0 z_t + \varepsilon_t \tag{14} \]
In this model, the long-run cumulative effect is equal to
\[ \sum_{s=0}^{q} \frac{q + 1 - s}{q + 1} \beta_0 = \beta_0 \left( 1 + \frac{q}{2} \right) \tag{15} \]
For example, for $q = 4$ the long-run cumulative effect is equal to $\beta_0 + \beta_1 + \beta_2 + \beta_3 + \beta_4 =
3\beta_0$. Note that the model with linear decline can be viewed as a particular case
of the polynomial distributed lag model, or Almon (1965) model (see below).
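The two-step procedure (build $z_t$, then regress $y_t$ on a constant and $z_t$) can be sketched as follows. This is an illustrative implementation using NumPy and plain OLS; the function names are hypothetical, and the observations are assumed to be stored in time order in arrays of equal length.

```python
import numpy as np

def linear_decline_weights(q):
    """Weights (q + 1 - s) / (q + 1) for s = 0, ..., q."""
    s = np.arange(q + 1)
    return (q + 1 - s) / (q + 1)

def transformed_regressor(x, q):
    """z_t = sum_s w_s * x_{t-s} for t = q, ..., T-1 (the first q dates are lost)."""
    x = np.asarray(x, dtype=float)
    w = linear_decline_weights(q)
    # x[t - q:t + 1][::-1] is (x_t, x_{t-1}, ..., x_{t-q}).
    return np.array([w @ x[t - q:t + 1][::-1] for t in range(q, len(x))])

def estimate(y, x, q):
    """OLS of y_t on a constant and z_t; returns alpha, beta_0 and all beta_s."""
    z = transformed_regressor(x, q)
    X = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float)[q:], rcond=None)
    alpha_hat, beta0_hat = coef
    return alpha_hat, beta0_hat, beta0_hat * linear_decline_weights(q)
```

For $q = 4$ the weights are 1, 0.8, 0.6, 0.4 and 0.2, and they sum to 3, which matches the long-run effect $3\beta_0$ given in the text.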
Finally, different variants of this model can be considered. For example, one can
assume that the weights $\beta_s$ increase linearly up to a peak at order $m$, and then decline
linearly to 0. To do so, it suffices to set
\[ \beta_s = \beta_m \left( 1 - \frac{|m - s|}{m + 1} \right), \qquad s = 0, \ldots, 2m \tag{16} \]
For example, for $m = 3$ one obtains $\beta_0 = \beta_3/4$, $\beta_1 = 2\beta_3/4$, $\beta_2 = 3\beta_3/4$, $\beta_3$, $\beta_4 = 3\beta_3/4$,
$\beta_5 = 2\beta_3/4$ and $\beta_6 = \beta_3/4$.
Definition 2 The distributed lag model of finite order $q$ with a geometric distribution of the
lags (geometric lag model) is written as
\[ y_t = \alpha + \beta_0 \sum_{s=0}^{q} (1 - \lambda) \lambda^s x_{t-s} + \varepsilon_t \tag{19} \]
This representation can be justified as the reduced form of an expectations model
in which the value of $y_t$ depends on the expectation of the future value $x_{t+1}$ formed with
the information available at date $t$. Under the adaptive expectations hypothesis, the reduced form
of this model corresponds to equation (19). See Greene (2007) for further details.
\[ \beta_s = g(s), \qquad s = 1, \ldots, q \tag{20} \]
It is then always possible to approximate this function by a polynomial of order $p$:
\[ \beta_s = g(s) \simeq \gamma_0 + \gamma_1 s + \gamma_2 s^2 + \ldots + \gamma_p s^p \tag{21} \]
Definition 3 The Almon polynomial model postulates a restriction on the lag parameters
$\beta_s$ of the form
\[ y_t = \alpha + \sum_{s=0}^{q} \beta_s x_{t-s} + \varepsilon_t \tag{22} \]
\[ \beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2 + \ldots + \gamma_p s^p, \qquad s = 0, 1, \ldots, q \tag{23} \]
where the parameters $\gamma_j$, $j = 0, \ldots, p$, are real constants satisfying $\gamma_p \neq 0$. The
polynomial distributed lag model then becomes
\[ y_t = \alpha + \gamma_0 \sum_{s=0}^{q} x_{t-s} + \gamma_1 \sum_{s=0}^{q} s \, x_{t-s} + \ldots + \gamma_p \sum_{s=0}^{q} s^p x_{t-s} + \varepsilon_t \tag{24} \]
A common specification of Almon lags is the quadratic function, obtained for $p = 2$
and $\beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2$. As the figure below shows, the quadratic function
yields profiles of lag coefficients $\beta_s$ that are varied enough to capture a large number
of configurations for the marginal effects.
[Figure: two panels plotting the lag coefficient $\beta_s$ against the lag $s$ ($s = 0, \ldots, 5$) for quadratic Almon polynomials; the legend of one panel reads $\gamma_0 = 0.2$, $\gamma_1 = 1.2$, $\gamma_2 = -0.2$.]
Estimation. The method for estimating an Almon model is very simple. For a given lag
order $q$ and a given degree $p$ of the Almon polynomial, one constructs the following
transformed explanatory variables:
\[ z_{0,t} = \sum_{s=0}^{q} x_{t-s}, \quad z_{1,t} = \sum_{s=0}^{q} s \, x_{t-s}, \quad z_{2,t} = \sum_{s=0}^{q} s^2 x_{t-s}, \quad \ldots, \quad z_{p,t} = \sum_{s=0}^{q} s^p x_{t-s} \tag{25} \]
The parameters $\alpha, \gamma_0, \gamma_1, \ldots, \gamma_p$ can then be estimated by OLS or GLS. From the
estimated parameters $\hat{\gamma}_0, \hat{\gamma}_1, \ldots, \hat{\gamma}_p$ one can then reconstruct the estimators $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_q$ of the
lag coefficients using the polynomial function
\[ \hat{\beta}_s = \hat{\gamma}_0 + \hat{\gamma}_1 s + \hat{\gamma}_2 s^2 + \ldots + \hat{\gamma}_p s^p, \qquad s = 0, 1, \ldots, q \tag{27} \]
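The estimation steps just described (build the transformed variables, run OLS, map the polynomial coefficients back to lag coefficients) can be sketched as follows. This is an illustrative NumPy implementation with hypothetical function names; it uses plain OLS, assumes observations are stored in time order, and denotes the polynomial coefficients by `gamma`, which is a labelling assumption.

```python
import numpy as np

def almon_design(x, q, p):
    """Columns z_{j,t} = sum_{s=0}^{q} s**j * x_{t-s}, j = 0..p, for t = q..T-1."""
    x = np.asarray(x, dtype=float)
    # Column s of `lags` holds x_{t-s} for t = q, ..., T-1.
    lags = np.column_stack([x[q - s:len(x) - s] for s in range(q + 1)])
    # Entry (s, j) of `powers` is s**j.
    powers = np.vander(np.arange(q + 1), p + 1, increasing=True)
    return lags @ powers

def almon_fit(y, x, q, p):
    """OLS on the transformed variables, then recovery of the lag coefficients."""
    Z = almon_design(x, q, p)
    X = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float)[q:], rcond=None)
    alpha_hat, gamma_hat = coef[0], coef[1:]
    # beta_s = gamma_0 + gamma_1 * s + ... + gamma_p * s**p for s = 0..q.
    beta_hat = np.vander(np.arange(q + 1), p + 1, increasing=True) @ gamma_hat
    return alpha_hat, gamma_hat, beta_hat
```

The regression involves only $p + 2$ parameters instead of $q + 2$, which is the dimension reduction the Almon restriction is designed to deliver; standard errors for the recovered $\hat{\beta}_s$ are not computed in this sketch.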
One constructs the three variables $z_{0,t}$, $z_{1,t}$ and $z_{2,t}$ such that
\[ z_{0,t} = x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4} \tag{30} \]
The fitted value of $y_t$ can then be written either in terms of the transformed variables $z_{s,t}$ or
in terms of the lagged explanatory variables $x_{t-s}$, as follows
The estimated lag distribution can sometimes look counter-intuitive. For example, one may obtain lag coefficients that move away from zero at the far end, or that turn negative in the middle. An implausible estimated lag distribution may be evidence of model misspecification and should not be ignored. If one nevertheless wishes to keep the specification, the coefficients $\beta_s$ can be forced to have certain properties by imposing constraints on the parameters of the polynomial function. For example, consider a quadratic function ($p = 2$) and suppose that we want the weights $\beta_s$ to converge smoothly to zero and to vanish at lag $q + 1$, as was the case for the linear lags mentioned earlier. We therefore wish to impose the constraint
$$\beta_{q+1} = \gamma_0 + \gamma_1 (q+1) + \gamma_2 (q+1)^2 = 0$$
Imposing this constraint on the parameters of the polynomial function during estimation yields estimated lag coefficients $\hat\beta_s$ that decline gradually towards 0 as the lag $s$ approaches the maximum order $q$. For a more detailed discussion of the choice of the polynomial order, its implications for the profile of the lag coefficients $\beta_s$, and the various restrictions that can be imposed on these parameters, see Smith and Giles (1976); see also footnote 5.
The Almon polynomial method is therefore very simple to use. It has one drawback, however: it requires specifying a priori not only the number of lags $q$ but also the degree $p$ of the polynomial. The choice of the latter is particularly delicate, and a misspecified degree can introduce a substantial bias in the estimates of some coefficients.
5. For an application of Almon lags in a context other than distributed lag models, see for example Banulescu, Candelon, Hurlin and Laurent (2016).
4 Models with a lagged dependent variable
The idea behind models with a lagged dependent variable is similar to that of AR and ARIMA models: one or more lagged values of $y$ are used as determinants of the current value $y_t$. The simplest such model is the Koyck model, which relies only on the lagged value $y_{t-1}$ and the current value of the explanatory variable $x_t$. By inverting the autoregressive polynomial, one can show that this model admits an equivalent representation as an infinite distributed lag model with geometrically declining weights.
$$y_t = \alpha + \phi y_{t-1} + \beta_0 x_t + v_t \tag{36}$$
$$(1 - \phi L)\, y_t = \alpha + \beta_0 x_t + v_t \tag{37}$$
$$y_t = \frac{\alpha}{1 - \phi} + \frac{\beta_0}{1 - \phi L}\, x_t + \frac{1}{1 - \phi L}\, v_t \tag{38}$$
Recall that if $|\phi| < 1$, then $(1 - \phi L)^{-1} = 1 + \phi L + \phi^2 L^2 + \phi^3 L^3 + \dots = \sum_{s=0}^{\infty} \phi^s L^s$. The equation can therefore be rewritten as
$$y_t = \frac{\alpha}{1 - \phi} + \beta_0 \sum_{s=0}^{\infty} \phi^s x_{t-s} + \sum_{s=0}^{\infty} \phi^s v_{t-s} \tag{39}$$
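Proposition 1. The Koyck model can be rewritten as a constrained infinite distributed lag model with geometrically declining weights:
$$y_t = \mu + B(L)\, x_t + \varepsilon_t = \mu + \beta_0 \sum_{s=0}^{\infty} \phi^s x_{t-s} + \varepsilon_t \tag{40}$$
with $\mu = \alpha / (1 - \phi)$, $\varepsilon_t = \sum_{s=0}^{\infty} \phi^s v_{t-s}$, $B(L) = \beta_0 (1 - \phi L)^{-1} = \sum_{s=0}^{\infty} \beta_s L^s$, and $\beta_s = \beta_0 \phi^s$.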
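The equivalence stated in Proposition 1 can be checked numerically. The sketch below is a minimal illustration, assuming hypothetical values $\beta_0 = 2$ and $\phi = 0.5$ (they are not from the text): it iterates the Koyck recursion after a one-off unit impulse in $x$ and compares the response with the geometric weights $\beta_s = \beta_0 \phi^s$.

```python
def koyck_impulse_response(beta0, phi, horizon):
    """Response of y at lags 0..horizon to a unit impulse in x at lag 0,
    obtained by iterating y_s = phi * y_{s-1} + beta0 * x_s (constant and errors set to 0)."""
    responses, y_prev = [], 0.0
    for s in range(horizon + 1):
        x_s = 1.0 if s == 0 else 0.0
        y_s = phi * y_prev + beta0 * x_s
        responses.append(y_s)
        y_prev = y_s
    return responses


beta0, phi = 2.0, 0.5                      # hypothetical values with |phi| < 1
irf = koyck_impulse_response(beta0, phi, horizon=10)
geometric = [beta0 * phi ** s for s in range(11)]
long_run = beta0 / (1.0 - phi)             # sum of all the weights, B(1)
```

With $|\phi| < 1$ the cumulated response converges to $\beta_0 / (1 - \phi)$, which is the long-run effect of a permanent unit change in $x$.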
6. The X appended to the acronyms AR, MA, ARMA or ARIMA indicates that one or more explanatory variables, assumed exogenous, are added to the model equation. An ARIMA-X model has no auxiliary equation describing the dynamics of these exogenous X variables, unlike VAR models, which postulate joint (endogenous) dynamics.
A Koyck model therefore corresponds to an infinite distributed lag model whose geometric lag distribution is defined implicitly by inverting the autoregressive lag polynomial $1 - \phi L$. Recall that a model with an (infinite) geometric lag distribution is written as
$$y_t = \mu + \tilde\beta\, (1 - \phi) \sum_{s=0}^{\infty} \phi^s x_{t-s} + \varepsilon_t \tag{41}$$
One of the main limitations of the Koyck specification is its lack of flexibility when more than one explanatory variable is considered. Consider a Koyck model with two explanatory variables $x_{1t}$ and $x_{2t}$, a single autoregressive parameter $\phi$, and impact coefficients written here as $\beta_0$ for $x_{1t}$ and $\delta_0$ for $x_{2t}$.
The dynamic marginal effects of $x_{1t}$ and $x_{2t}$ on $y_t$ are then equal to
$$\frac{\partial y_t}{\partial x_{1,t-s}} = \beta_0 \phi^s, \qquad \frac{\partial y_t}{\partial x_{2,t-s}} = \delta_0 \phi^s \tag{45}$$
We see immediately that the Koyck model forces the rate at which the marginal effects decay with the lag to be exactly the same for $x_{1t}$ and $x_{2t}$. Such a symmetry assumption on the time profile of the dynamic responses of $y$ to the different explanatory variables can be problematic. This is the main justification for ARDL models (see below): introducing a lag polynomial specific to each explanatory variable allows the time profiles of the marginal effects of $x_{1t}$ and $x_{2t}$ to differ.
calls into question the weak exogeneity of the regressor $y_{t-1}$. But we run into a circularity problem here: to test for the absence of autocorrelation in the error term $v_t$ (and hence the weak exogeneity of $y_{t-1}$ and, ultimately, the consistency of the OLS estimator), we need the residuals $\hat v_t$, which are themselves built from OLS estimates that may be inconsistent.
For this reason, the Koyck model and its extensions (ARDL, AR-X) are sometimes estimated by instrumental variables to account for the endogeneity of $y_{t-1}$. This is typically the case in R, with the function koyckDlm of the dLagM package (Demirhan, 2018).
Extension of the Koyck model. A natural extension of the Koyck model is the AR(p)-X model, which includes lagged values of the dependent variable for lags 1 to $p$. The model is written as
$$y_t = \alpha + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \beta_0 x_t + v_t \tag{46}$$
For $p = 1$ we recover the Koyck model. The AR(p)-X model can be written more compactly using a lag polynomial:
$$\Phi(L)\, y_t = \alpha + \beta_0 x_t + v_t \tag{47}$$
where $v_t$ is a weak white noise and the polynomial $\Phi(L)$ satisfies $\Phi(L) = 1 - \sum_{s=1}^{p} \phi_s L^s$, with $\phi_p \in \mathbb{R}^*$. The roots of $\Phi(L)$ are assumed to lie outside the unit circle.
The condition on the roots of $\Phi(L)$ generalises the condition $|\phi| < 1$ of the Koyck model. For example, consider an AR(2)-X model such that
$$y_t = \frac{5}{8}\, y_{t-1} - \frac{1}{16}\, y_{t-2} + x_t + v_t \tag{48}$$
The autoregressive polynomial is $\Phi(L) = 1 - (5/8) L + (1/16) L^2$. Its roots, defined by $\Phi(\lambda_1) = \Phi(\lambda_2) = 0$, are $\lambda_1 = 2$ and $\lambda_2 = 8$. Their modulus (their absolute value, for real roots) is greater than one. Both roots therefore lie outside the unit circle, which guarantees the stability of the model.
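The root condition for an AR(2)-X model can be checked directly. The following sketch, with hypothetical helper names `ar2_roots` and `is_stable`, solves $\Phi(z) = 1 - \phi_1 z - \phi_2 z^2 = 0$ for the example above ($\phi_1 = 5/8$, $\phi_2 = -1/16$) and tests whether both roots lie outside the unit circle.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of Phi(z) = 1 - phi1*z - phi2*z**2 (requires phi2 != 0)."""
    a, b, c = -phi2, -phi1, 1.0
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b - disc) / (2 * a), (-b + disc) / (2 * a))


def is_stable(roots):
    """Stability requires every root to have modulus strictly greater than 1."""
    return all(abs(r) > 1.0 for r in roots)


roots = ar2_roots(5 / 8, -1 / 16)          # example (48) in the text
stable = is_stable(roots)

# A counter-example with a unit root: Phi(z) = 1 - 1.5 z + 0.5 z^2 = (1 - z)(1 - 0.5 z).
unit_root_case = is_stable(ar2_roots(1.5, -0.5))
```

`cmath` is used so that the same function also handles complex conjugate roots, for which the modulus rather than the absolute value is compared with one.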
When this stability condition is not satisfied, a marginal change in $x$ can lead to an explosive change in $y$. In other words, the dynamic response of $y$ to a shock to $x$ is explosive. One solution is then to difference the variable $y$ and to specify a new AR($p-1$)-X model on the change $\Delta y = (1 - L)\, y$ rather than on the level of $y$.
Specification and estimation of ARDL models. An ARDL(p, q) model is written as
$$y_t = \alpha + \sum_{s=1}^{p} \phi_s y_{t-s} + \sum_{s=0}^{q} \theta_s x_{t-s} + v_t \tag{49}$$
For example, an ARDL(2, 1) model is written as
$$y_t = \alpha + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \theta_0 x_t + \theta_1 x_{t-1} + v_t \tag{50}$$
The ARDL(p, q) model can be written more compactly using two lag polynomials: one for the lags of the dependent variable $y$ (the autoregressive polynomial) and one for the lags of the explanatory variable $x$.
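Just as we did for the Koyck model, we can rewrite this model as a constrained distributed lag model by inverting the polynomial $\Phi(L)$. Writing $\Phi(L) = 1 - \sum_{s=1}^{p} \phi_s L^s$ for the autoregressive polynomial and $\Theta(L) = \sum_{s=0}^{q} \theta_s L^s$ for the lag polynomial on $x$,
$$y_t = \frac{\alpha}{\Phi(1)} + \frac{\Theta(L)}{\Phi(L)}\, x_t + \Phi(L)^{-1} v_t = \mu + B(L)\, x_t + \varepsilon_t \tag{52}$$
with $\mu = \alpha / \Phi(1)$ and $\varepsilon_t = \Phi(L)^{-1} v_t$. This formulation explains why the ARDL model is sometimes called the rational lag model (Jorgenson, 1966; see footnote 7). Determining the terms of the polynomial $B(L)$ requires inverting the polynomial $\Phi(L)$. Several methods exist for this (see Appendix A).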
In this model, the cumulative long-run effect of $x$ on $y$ is equal to
$$B(1) = \sum_{s=0}^{\infty} \beta_s = \frac{\Theta(1)}{\Phi(1)} \tag{53}$$
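The coefficients of the rational lag $B(L) = \Theta(L) / \Phi(L)$ and the long-run effect (53) can be computed with a short recursion. The sketch below is illustrative only: the function names are hypothetical, and the notation $\Phi(L) = 1 - \sum_i \phi_i L^i$, $\Theta(L) = \sum_s \theta_s L^s$ is an assumption consistent with the ARDL form above. Matching powers of $L$ in $\Phi(L) B(L) = \Theta(L)$ gives $\beta_k = \theta_k + \sum_{i=1}^{\min(k,p)} \phi_i \beta_{k-i}$, with $\theta_k = 0$ for $k > q$.

```python
def rational_lag_weights(phi, theta, n):
    """First n coefficients beta_0..beta_{n-1} of B(L) = Theta(L) / Phi(L),
    where Phi(L) = 1 - sum_i phi[i-1] * L**i and Theta(L) = sum_s theta[s] * L**s."""
    beta = []
    for k in range(n):
        value = theta[k] if k < len(theta) else 0.0
        for i in range(1, min(k, len(phi)) + 1):
            value += phi[i - 1] * beta[k - i]
        beta.append(value)
    return beta


def long_run_effect(phi, theta):
    """Cumulative long-run effect B(1) = Theta(1) / Phi(1)."""
    return sum(theta) / (1.0 - sum(phi))


# Hypothetical ARDL(2, 1): stable AR part taken from example (48), arbitrary theta.
phi, theta = [5 / 8, -1 / 16], [1.0, 0.5]
weights = rational_lag_weights(phi, theta, 200)
total = long_run_effect(phi, theta)

# Special case p = 1, q = 0: the Koyck model, with weights beta_0 * phi**s.
koyck = rational_lag_weights([0.5], [2.0], 4)
```

When the roots of $\Phi(L)$ lie outside the unit circle, the partial sums of `weights` converge to `total`; otherwise the weights do not die out and the ratio has no long-run interpretation.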
As with the Koyck and AR-X models, the condition that all roots of the polynomial $\Phi(L)$ lie outside the unit circle guarantees that the dynamic effect (cumulative long-run effect) of $x$ on $y$ is not explosive (see footnote 8). This condition should be checked on estimated models; otherwise the dynamic effects obtained may be at odds with economic reality. If the condition does not hold, the variable $y_t$ should be differenced and a new ARDL model fitted to $\Delta y_t = (1 - L)\, y_t$.
As with ARMA models, there are several (non-exclusive) ways of determining the maximum lags $p$ and $q$ of ARDL models:
7. Recall that a rational number is a number that can be expressed as the quotient of two integers. By analogy, the lag polynomial $B(L)$ is written here as the ratio of two polynomials, $B(L) = \Theta(L) / \Phi(L)$.
8. For further details, see Greene (2007), chapter 19, section 19.4.3, on the stability of a dynamic equation.
- By testing the significance of the parameters $\phi_p$ and $\theta_q$. If the null hypothesis $\phi_p = 0$ (respectively $\theta_q = 0$) is not rejected, the order $p$ (respectively $q$) should be reduced.
- By using information criteria such as AIC and BIC. The best specification of the maximum lags $(p, q)$ is the one that minimises the information criteria, i.e. the one that achieves the lowest MSE with as few estimated parameters as possible.
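$$y_t = \mu_t + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + v_t \tag{56}$$
$$\hat\mu_{T+1|T} = \alpha + \theta_0 \underbrace{\hat x_{T+1|T}}_{\text{forecast}} + \theta_1 x_T + \dots + \theta_q x_{T-q+1} \tag{58}$$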
1. The parameters $\alpha$, $\phi_i$, $\theta_j$ are unknown and must be estimated, which creates an estimation error.
2. The future value of the explanatory variable $x_{T+1}$ is unknown. It must be forecast, which induces a forecast error $x_{T+1} - \hat x_{T+1|T}$ that carries over to the forecast error on $y_{T+1}$.
3. By definition, the white-noise error component $v_{T+1}$ cannot be forecast, since $E(v_{T+1} \mid \Omega_T) = 0$, where $\Omega_T$ denotes the information available at date $T$.
In general, the second source of uncertainty is ignored because the form and properties of the forecast error on $x_{T+1}$ are unknown. The asymptotic variance of the forecast $\hat y_{T+1|T}$, and hence of the forecast error on $y_{T+1}$, then depends in the usual way on the variance-covariance matrix of the estimated parameters and on the variance of the error term $v_t$. For further details, see Greene (2007).
The same reasoning applies to any horizon $h \geq 1$. For example, for a horizon $h = 2$, the dynamic forecast of $y_{T+2}$ conditional on the information $\Omega_T$ available at date $T$ relies on
$$\hat\mu_{T+2|T} = \alpha + \theta_0 \underbrace{\hat x_{T+2|T}}_{\text{forecast}} + \theta_1 \underbrace{\hat x_{T+1|T}}_{\text{forecast}} + \theta_2 x_T + \dots + \theta_q x_{T-q+2} \tag{60}$$
In this case, the forecast $\hat y_{T+2|T}$ requires forecasts of the variable $x$ at horizons $h = 1$ and $h = 2$, denoted $\hat x_{T+1|T}$ and $\hat x_{T+2|T}$. The forecasting procedures of R and SAS for ARDL models therefore require the user to supply the forecasts $\hat x_{T+1|T}, \hat x_{T+2|T}, \dots, \hat x_{T+h|T}$ for all the (exogenous) explanatory variables of the model. The user must therefore build auxiliary models to produce these forecasts, or specify scenarios for these future values.
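The recursive logic of these dynamic forecasts can be sketched as follows. This is a minimal illustration, not the procedure used by R or SAS: the function `ardl_forecast`, its arguments and the numerical values are hypothetical, the parameters are treated as known, and the future values of $x$ are supplied by the user as a scenario.

```python
def ardl_forecast(alpha, phi, theta, y_hist, x_hist, x_future):
    """Dynamic forecasts of y at T+1, ..., T+h for an ARDL(p, q), h = len(x_future).

    y_hist and x_hist end at date T (at least p and q observations respectively);
    x_future holds the user-supplied forecasts (or scenario) for x at T+1, ..., T+h.
    """
    y = list(y_hist)
    x = list(x_hist) + list(x_future)
    T = len(x_hist)
    forecasts = []
    for h in range(1, len(x_future) + 1):
        t = T + h - 1                       # position of date T+h in the list x
        mu = alpha + sum(theta[s] * x[t - s] for s in range(len(theta)))
        y_hat = mu + sum(phi[i] * y[-(i + 1)] for i in range(len(phi)))
        forecasts.append(y_hat)
        y.append(y_hat)                     # earlier forecasts replace unknown future y
    return forecasts


# Hypothetical ARDL(1, 1): y_t = 1 + 0.5 y_{t-1} + 2 x_t + 1 x_{t-1} + v_t
path = ardl_forecast(alpha=1.0, phi=[0.5], theta=[2.0, 1.0],
                     y_hist=[3.0, 4.0], x_hist=[0.5, 1.0], x_future=[2.0, 3.0])
```

For $h = 1$ the observed $x_T$ and $y_T$ enter the computation; for $h = 2$ both the forecast of $x$ at $T+1$ and the forecast of $y$ at $T+1$ are used, which is why errors on the $x$ scenario propagate to later horizons.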
5 Applications
We briefly discuss here how these models can be applied in R and in SAS.
- Finite distributed lag models: function dlm
- Polynomial (Almon) distributed lag models: function polyDlm
- Geometric distributed lag models, with or without the Koyck transformation: function koyckDlm. Recall that a model with the Koyck transformation is equivalent to an infinite distributed lag model.
5.2 Implementation in SAS
In SAS, distributed lag models and their extensions can be estimated with PROC PDLREG. For further details, see SAS (2014). This procedure is mainly devoted to the estimation of Almon polynomial distributed lag models. It can be extended to include the lagged dependent variable $y_{t-1}$ through the LAGDEP option. Note that in this case one obtains a Koyck model in which the constraints on the lag polynomial are determined by the Almon polynomials. It is therefore a much more specialised procedure than the dLagM package in R. However, it makes it easy, for example, to impose restrictions when estimating the parameters of the Almon polynomial.
The estimated coefficients $\hat\gamma_i$ and $\hat\beta_s$ are displayed automatically, as shown in the figures below (see footnote 9). In this example, the dependent variable $m$ is regressed on three explanatory variables ($y$, $r$ and $p$) and one lagged value $m_{t-1}$. For the variable $y_t$ we consider $q = 3$ lags, that is, we introduce the regressors $y_t$, $y_{t-1}$, $y_{t-2}$ and $y_{t-3}$. The four associated parameters $\beta_0, \beta_1, \beta_2, \beta_3$ are determined by a polynomial of degree 3 of the form
$$\beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2 + \gamma_3 s^3, \qquad s = 0, \dots, 3$$
The estimated parameters $\hat\gamma_0, \hat\gamma_1, \hat\gamma_2$ and $\hat\gamma_3$ are reported in Figure 2. The estimated parameters $\hat\beta_0, \hat\beta_1, \hat\beta_2, \hat\beta_3$ are reported in Figure 3.
9. The statements for this model are as follows:
proc pdlreg data=a;
   model m = lagm y(5,3) r(2, , ,first) p(3,2) / lagdep=lagm;
run;
Figure 3: Estimated parameters $\hat\beta_s$
References
[1] Almon, S. (1965). The Distributed Lag Between Capital Appropriations and Expenditures. Econometrica, 33 (1), pp. 178-196.
[2] Banulescu, D., Candelon, B., Hurlin, C. and Laurent, S. (2016). Do We Need Ultra-High Frequency Data to Forecast Variances? Annales d'Economie et Statistiques, 123-124, pp. 135-174.
[3] Box, G.E. and Jenkins, G.M. (1976). Time Series Analysis, Forecasting and Control. Wiley.
[5] Dhrymes, P.J. (1971). Distributed Lags: Problems of Estimation and Formulation. Holden-Day, San Francisco.
[6] Greene, W. (2007). Econometric Analysis, sixth edition. Pearson - Prentice Hall.
[7] Koyck, L.M. (1954). Distributed Lags and Investment Analysis. Amsterdam: North-Holland.
[8] Madinier, H. and Mouillart, M. (1983). Les méthodes d'estimation des modèles à retards échelonnés en économie. Revue de statistique appliquée, 31 (4), pp. 53-73.
[10] Smith, R.G. and Giles, D.E.A. (1976). The Almon estimator: Methodology and users' guide. Discussion Paper E76/3, Reserve Bank of New Zealand.
A Appendix: Inverting a polynomial of order p
In time series analysis it is often useful to invert processes. For example, starting from a stationary AR process, one can invert the autoregressive polynomial to obtain the MA($\infty$) form associated with the Wold decomposition. This yields equivalent representations of the same process. To do so, one must invert polynomials defined in the lag operator. We have already seen how to do this for polynomials of degree one. We now generalise the method to polynomials of degree greater than or equal to one.
The problem is as follows. Let $\Phi(z)$ be an invertible polynomial of order $p$ with real coefficients and $\phi_0 = 1$. We seek $\tilde\Phi(z)$, the inverse polynomial of $\Phi(z)$. By definition, for all $z \in \mathbb{C}$,
$$\Phi(z)\, \tilde\Phi(z) = \Phi(z)\, \Phi(z)^{-1} = \sum_{j=0}^{p} \phi_j z^j \sum_{j=0}^{\infty} \tilde\phi_j z^j = 1$$
Several methods exist for determining $\tilde\Phi(z)$. We retain only two of them here.
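$$\begin{cases} \sum_{k=0}^{i} \tilde\phi_{i-k}\, \phi_k = 0 & \forall i \in [1, p] \\ \sum_{k=0}^{p} \tilde\phi_{i-k}\, \phi_k = 0 & \forall i > p \end{cases} \tag{61}$$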
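with $\phi_1 = -0.6$ and $\phi_2 = -0.3$. The polynomial is invertible because its roots have modulus strictly greater than 1: $\lambda_1 \approx 1.08$ and $\lambda_2 \approx -3.08$. Let $\tilde\Phi(z)$ denote the inverse polynomial of $\Phi(z)$, assumed to be of infinite degree. We start from the identification relation:
$$\Phi(z)\, \tilde\Phi(z) = 1$$
Expanding, we obtain:
$$\left(1 + \phi_1 z + \phi_2 z^2\right) \left(\tilde\phi_0 + \tilde\phi_1 z + \tilde\phi_2 z^2 + \tilde\phi_3 z^3 + \dots + \tilde\phi_p z^p + \dots\right) = 1$$
$$\begin{aligned} \Longleftrightarrow \quad & \tilde\phi_0 + \tilde\phi_1 z + \tilde\phi_2 z^2 + \tilde\phi_3 z^3 + \dots + \tilde\phi_p z^p + \dots \\ & + \tilde\phi_0 \phi_1 z + \tilde\phi_1 \phi_1 z^2 + \tilde\phi_2 \phi_1 z^3 + \tilde\phi_3 \phi_1 z^4 + \dots + \tilde\phi_p \phi_1 z^{p+1} + \dots \\ & + \tilde\phi_0 \phi_2 z^2 + \tilde\phi_1 \phi_2 z^3 + \tilde\phi_2 \phi_2 z^4 + \tilde\phi_3 \phi_2 z^5 + \dots + \tilde\phi_p \phi_2 z^{p+2} + \dots = 1 \end{aligned}$$
Identifying the terms of the same degree on both sides of the equality then gives the following system: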
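$$\begin{cases} \tilde\phi_0 = 1 \\ \tilde\phi_1 + \phi_1 = 0 \\ \tilde\phi_2 + \tilde\phi_1 \phi_1 + \phi_2 = 0 \\ \tilde\phi_3 + \tilde\phi_2 \phi_1 + \tilde\phi_1 \phi_2 = 0 \\ \dots \\ \tilde\phi_n + \tilde\phi_{n-1} \phi_1 + \tilde\phi_{n-2} \phi_2 = 0 \quad \forall n > 2 \end{cases}$$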
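Solving this system yields a recurrence that defines the coefficients of the MA($\infty$) representation of the process $x_t$:
$$x_t = \tilde\Phi(L)\, \varepsilon_t = \sum_{j=0}^{\infty} \tilde\phi_j\, \varepsilon_{t-j} \tag{62}$$
$$\tilde\phi_0 = 1 \tag{63}$$
$$\tilde\phi_1 = 0.6 \tag{64}$$
$$\tilde\phi_n = 0.6\, \tilde\phi_{n-1} + 0.3\, \tilde\phi_{n-2} \qquad \forall n \geq 2 \tag{65}$$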
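The identification method generalises to any order $p$ and is easy to code. The sketch below is an illustration under an explicit assumption: the example polynomial is taken to be $\Phi(z) = 1 - 0.6 z - 0.3 z^2$, the sign convention that reproduces recurrence (65). The function names are hypothetical. The second helper multiplies $\Phi(z)$ by the truncated inverse to confirm that the product equals 1 up to the truncation order.

```python
def invert_polynomial(phi, n):
    """First n coefficients of 1 / Phi(z), with phi = [1, phi_1, ..., phi_p].

    Uses tilde_phi_0 = 1 and tilde_phi_i = -sum_{k=1}^{min(i,p)} phi_k * tilde_phi_{i-k}.
    """
    if phi[0] != 1:
        raise ValueError("phi[0] must equal 1")
    p = len(phi) - 1
    inv = [1.0]
    for i in range(1, n):
        inv.append(-sum(phi[k] * inv[i - k] for k in range(1, min(i, p) + 1)))
    return inv


def multiply_truncated(a, b, n):
    """Coefficients of degree 0..n-1 of the product of two polynomials."""
    out = [0.0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                out[i + j] += ai * bj
    return out


phi = [1.0, -0.6, -0.3]                    # assumed signs, see the lead-in
phi_tilde = invert_polynomial(phi, 8)
check = multiply_truncated(phi, phi_tilde, 8)
```

The first coefficients are 1, 0.6, 0.66 and 0.576, and they die out only if the roots of $\Phi(z)$ lie outside the unit circle.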
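$$a_j = \frac{\tilde\lambda_j^{\,p-1}}{\prod_{k=1,\, k \neq j}^{p} \left(\tilde\lambda_j - \tilde\lambda_k\right)} \qquad \forall j \leq p \tag{67}$$
One can then show that:
$$\sum_{j=1}^{p} \frac{a_j}{1 - \tilde\lambda_j z} = \sum_{j=1}^{p} a_j \sum_{k=0}^{\infty} \tilde\lambda_j^{\,k} z^k = \sum_{k=0}^{\infty} \left( \sum_{j=1}^{p} a_j \tilde\lambda_j^{\,k} \right) z^k \tag{68}$$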
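Consider again the example of the AR(2) polynomial defined by $\Phi(z) = 1 + \phi_1 z + \phi_2 z^2$, with $\phi_1 = -0.6$ and $\phi_2 = -0.3$. The two real roots are $\lambda_1 \approx 1.08$ and $\lambda_2 \approx -3.08$. We first look for the parameters $a_i$ such that, for all $z \in \mathbb{C}$,
$$\frac{1}{\left(1 - \tilde\lambda_1 z\right)\left(1 - \tilde\lambda_2 z\right)} = \frac{a_1}{1 - \tilde\lambda_1 z} + \frac{a_2}{1 - \tilde\lambda_2 z}$$
with
$$\tilde\lambda_1 = \frac{1}{\lambda_1} \approx \frac{1}{1.08}, \qquad \tilde\lambda_2 = \frac{1}{\lambda_2} \approx -\frac{1}{3.08}$$
Expanding, we obtain the following equality for all $z \neq \lambda_i$, $i = 1, 2$:
$$a_1 \left(1 - \tilde\lambda_2 z\right) + a_2 \left(1 - \tilde\lambda_1 z\right) = 1 \iff (a_1 + a_2) - \left(a_1 \tilde\lambda_2 + a_2 \tilde\lambda_1\right) z = 1$$
Identifying the terms of the same degree gives the system
$$a_1 + a_2 = 1, \qquad a_1 \tilde\lambda_2 + a_2 \tilde\lambda_1 = 0$$
whose solution is
$$a_1 = \frac{\tilde\lambda_1}{\tilde\lambda_1 - \tilde\lambda_2}, \qquad a_2 = \frac{\tilde\lambda_2}{\tilde\lambda_2 - \tilde\lambda_1}$$
By (68), the coefficients of the inverse polynomial are then $\tilde\phi_k = a_1 \tilde\lambda_1^{\,k} + a_2 \tilde\lambda_2^{\,k}$. One can show that the parameters $\tilde\phi_j$ defined in this way satisfy the recurrence equation (65).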
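The claim that the partial-fraction coefficients satisfy recurrence (65) can be verified numerically. The sketch below assumes, as a labelled assumption, the sign convention $\phi_1 = -0.6$, $\phi_2 = -0.3$ for $\Phi(z) = 1 + \phi_1 z + \phi_2 z^2$; variable names are illustrative. It computes the roots, their inverses, the weights $a_1$ and $a_2$, and compares the closed form $a_1 \tilde\lambda_1^k + a_2 \tilde\lambda_2^k$ with the recursion.

```python
import math

phi1, phi2 = -0.6, -0.3                    # assumed signs for Phi(z) = 1 + phi1*z + phi2*z**2

# Roots of phi2*z**2 + phi1*z + 1 = 0 (real here: the discriminant is positive).
disc = math.sqrt(phi1 ** 2 - 4.0 * phi2)
roots = [(-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2)]
l1, l2 = (1.0 / r for r in roots)          # inverse roots, tilde lambda_1 and tilde lambda_2

# Partial-fraction weights for p = 2.
a1 = l1 / (l1 - l2)
a2 = l2 / (l2 - l1)

# Closed form of the inverse-polynomial coefficients.
closed_form = [a1 * l1 ** k + a2 * l2 ** k for k in range(8)]

# Same coefficients from the recursion tilde_phi_n = -phi1*tilde_phi_{n-1} - phi2*tilde_phi_{n-2}.
recursive = [1.0, -phi1]
for n in range(2, 8):
    recursive.append(-phi1 * recursive[n - 1] - phi2 * recursive[n - 2])
```

Both roots have modulus above one (about 1.08 and 3.08), so the inverse roots are below one in modulus and the coefficients decay geometrically.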
This article was downloaded by: [Michigan State University]
On: 23 September 2013, At: 06:48
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number:
1072954 Registered office: Mortimer House, 37-41 Mortimer Street,
London W1T 3JH, UK
To cite this article: Subrata Ghatak & Jalal U. Siddiki (2001) The use of the
ARDL approach in estimating virtual exchange rates in India, Journal of Applied
Statistics, 28:5, 573-583, DOI: 10.1080/02664760120047906
Journal of Applied Statistics, Vol. 28, No. 5, 2001, 573-583
Abstract. This paper applies the autoregressive distributed lag approach to cointegration analysis in estimating the 'virtual exchange rate' (VER) in India. The VER would have prevailed if the unconstrained import demand were equal to the constraint imposed due to foreign exchange rationing, and the VER is used to approximate the 'price' of rationed foreign exchange reserves. We highlight the shortcomings of the existing literature in approximating equilibrium exchange rates in a less developed country such as India and propose the VER approach for equilibrium rates, which uses information from an estimated structural model. In this relationship, the black market real exchange rate ($E_U$) is the dependent variable, and the real official exchange rate ($E_O$), the ratio ($I$) of the foreign ($r^*$) to the domestic ($r$) interest rate, and official forex reserves ($Q$) are explanatory variables. In our estimation, the VERs are higher than $E_O$ by about 10% in the short run and 16% in the long run.
1 Introduction
The existence of 'dual' rates in the foreign exchange (forex) markets, one official and the other 'unofficial' or black market (BM), is a common phenomenon in less developed countries (LDCs) (Dornbusch, 1983; Phylaktis, 1992; Siddiki, 2000). Dual exchange rates emerge as a result of controls on access to the official market. The chronic and persistent balance of payments problems, the trade controls, and financial repression lead to the emergence of BMs in exchange rates.
These BM rates could render the use of the official exchange rate impotent to control the trade balance and forex reserves. The negative consequences of protectionist trade and financial policies on the Indian economy are enormous (Bhagwati & Desai, 1970; Bhagwati, 1979; Siddiki & Daly, 1999). The major costs
Correspondence: S. Ghatak, School of Economics, Kingston University, Kingston KT1 2EE, UK.
ISSN 0266-4763 print; 1360-0532 online/01/050573-11 2001 Taylor & Francis Ltd
DOI: 10.1080/02664760120047906
574 S. Ghatak & J. U. Siddiki
dominated by BMs could be linked to high inflation. In the asset market also, an exchange rate linkage exists when domestic agents can hold real assets (land) or claims on real assets (stocks). Indeed, to differentiate between the domestic and foreign asset markets, capital controls and dual exchange rates have often been used in many East European countries and LDCs, particularly in Latin America (Charemza & Ghatak, 1990).
The reduction of such costs emanating from the presence of a BM in many LDCs provides a strong motivation for measuring a 'virtual' exchange rate (VER), a rate that would have prevailed if the unconstrained import demand were equal to the constraint imposed due to forex rationing. Such VERs can be regarded as 'just bites' (i.e. prices) of rationed forex, in the sense that the rationed levels coincide with the quantities that would have been chosen by the unrationed agents facing the same prices and income, in the sense of Tobin & Houthakker (1950; see also Neary & Roberts, 1980) or Rothbarth (1940). In this sense, a VER approximates the equilibrium or 'just' price of rationed forex of a developing economy.
It is often argued by the International Monetary Fund/World Bank that getting the real exchange rate 'right' should be one of the important goals for policy makers in LDCs, particularly where official exchange rates are administratively determined. The modelling and estimation of such 'right' rates in LDCs is the prime objective of this paper, since the concept of the 'equilibrium' exchange rate has long been regarded as a chimera and its estimation is hazardous (see Section 3). In addition, none of the available methods considers the relationship between official and BM rates and the impact of financial policies on BM rates. Our paper seeks to fill this gap by exploring the determinants of BM rates, i.e. the causes of distortions in the forex market, in India and by deriving the VER from the information available in both the official and the unofficial exchange rate markets. Our estimation of the VER is based on the important structural factors in the economy that affect the 'equilibrium' exchange rate.
The autoregressive distributed lag (ARDL) approach to cointegration analysis (Pesaran & Shin, 1998) and time series data from 1965-96 are used in estimating our empirical model and the VERs. The major advantage of the ARDL method is that, through an appropriate augmentation, it avoids the problems of serial correlation and endogeneity that other cointegration methods may experience. In addition, this method avoids pretesting of the order of integration, which is associated with other cointegration analyses. Thus, the aims of the paper are as follows:
Estimating virtual exchange rates in India 575
problems since official reserves are a component of the money supply. Additionally, the inclusion of interest rates as a determinant of BM rates captures the impact of money supply on BM rates and causes the coefficient of money supply to be statistically insignificant (Siddiki, 2000).
(P*) to the domestic price level (P) multiplied by the nominal unofficial exchange rate, in rupees per dollar. $E_O$ is the official real exchange rate, defined as the ratio of P* to P multiplied by the nominal official exchange rate. $Q$ is official forex reserves in millions of US dollars. $I$ is the ratio of foreign to domestic interest rates (see Note 1). The error term $u_t$ is normally and identically distributed. All data are in natural logarithms except the ratio of foreign to domestic interest rates. Our sample comprises annual data from 1967 to 1996. The data source for $E_U$ is Pick's/World Currency Yearbook (various years). For the remaining variables, data are gathered from the International Financial Statistics Yearbook (IMF: various years).
According to our model, $E_U$ evolves positively with $E_O$, i.e. $\beta > 0$. An increase in $E_O$ reduces (raises) the flow supply of forex in the BMs (official markets). This decrease in supply requires an increase in $E_U$ to keep the premiums unchanged and push the flow supply up to retain equilibrium in BMs (Agénor, 1990). This prediction is also consistent with the findings of Baghestani & Noer (1993) and Siddiki (2000) in the case of India.
We expect a negative sign for the coefficient of $Q$, i.e. official forex reserves. A decline in $Q$ increases the excess demand for forex in the official markets. This excess demand is met in the BM at a market-determined rate. Note that a low level of official forex reserves in India is associated with more restrictive trade policies (Joshi & Little, 1994; Siddiki, 2000). Thus, a low value of $Q$ signals expected future depreciations in $E_O$, which in turn causes a depreciation in $E_U$. This argument is consistent with the findings of various devaluation episodes in LDCs, which confirm that a low level of official forex reserves is associated with a high level of BM rates (Kamin, 1993).
The sign of the coefficient of $I$ depends on whether the BM in India is a monetary or a portfolio phenomenon. As explained above, the MABM predicts a negative sign, i.e. a fall in $I$ causes a depreciation of BM rates, while the PABM postulates a positive sign, i.e. a fall in $I$ causes an appreciation of BM rates (see Section 3 above).
As described in the Appendix, we follow the two-step procedure of the ARDL method to estimate equation (1): see Pesaran and Pesaran (1997). In the first step, we carried out 'stability tests' for examining the existence of the long-run relationship among $E_U$, $E_O$, $Q$ and $I$. The F-test for examining this relationship from the EC model with $E_U$ as the dependent variable is denoted by $F_{E_U}(E_U \mid E_O, Q, I)$ (see equation (4) and the discussion on it). The calculated $F_{E_U}(\cdot \mid \dots) = 5.03$ is higher than the upper bound critical value 4.378 at a 5% significance level (see Note 2; the number of lags chosen in all EC models is two). Therefore we reject the null of no long-run relationship with $E_U$ as the dependent variable. Similarly, F-tests in EC models with $E_O$, $Q$ and $I$ as dependent variables are denoted by $F_{E_O}(E_O \mid E_U, Q, I)$, $F_Q(Q \mid E_U, E_O, I)$ and $F_I(I \mid E_U, E_O, Q)$, respectively. The calculated values are $F_{E_O}(\cdot \mid \dots) = 2.4058$, $F_Q(\cdot \mid \dots) = 2.8841$ and $F_I(\cdot \mid \dots) = 2.7578$. These F-statistics are lower than the lower bound of the critical value, 3.219, at a 5% significance level. Our results show that only $F_{E_U}(\cdot \mid \dots)$ is significant and the remaining F-statistics are insignificant. Therefore, there exists a unique and stable long-run relationship with $E_U$ as the dependent variable and $E_O$, $Q$ and $I$ as independent variables.
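The decision rule applied to these F-statistics can be written as a small function. This is a sketch only: the function name and the wording of the returned labels are hypothetical, while the two bounds (3.219 and 4.378 at the 5% level) and the four F-statistics are the ones reported in the text.

```python
def bounds_test_decision(f_stat, lower, upper):
    """Classify an ARDL bounds-test F-statistic against the lower and upper critical bounds."""
    if f_stat > upper:
        return "reject the null of no long-run relationship"
    if f_stat < lower:
        return "do not reject the null"
    return "inconclusive"


LOWER, UPPER = 3.219, 4.378                # 5% bounds quoted in the text

f_stats = {"E_U": 5.03, "E_O": 2.4058, "Q": 2.8841, "I": 2.7578}
decisions = {name: bounds_test_decision(f, LOWER, UPPER) for name, f in f_stats.items()}
```

Only the equation with $E_U$ as the dependent variable leads to a rejection, which is the basis for treating the long-run relationship as unique. A statistic falling between the two bounds would require knowledge of the order of integration of the variables.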
Having found a unique relationship, in the next step the following ARDL(2, 0, 2, 1) model, with lag lengths determined by the Akaike Information Criterion (AIC), is estimated (t values in parentheses):
$$E_{U,t} = 0.013 + 0.60743^{**}\, E_{U,t-1} - 0.23862\, E_{U,t-2} + 0.73443^{**}\, E_{O,t} - 0.009\, Q_t \; \dots$$
(0.58) (-4.69)
The coefficient of $I$ is positive and statistically significant. This result implies that an increase in foreign interest rates (returns to foreign money), relative to domestic interest rates (returns to domestic money), boosts the demand for foreign money. This increase in demand causes an increase in BM rates. This finding supports the prediction that the higher the interest rate differential, the greater the expectation that the domestic currency will depreciate in the future. Therefore, the demand for and price of foreign currencies will be higher in the BMs (Dornbusch et al., 1983).
Our estimated long-run coefficient of $Q$ is negative and statistically significant at a 6% level. This result is in accordance with the fact that one of the main reasons for the existence of BMs in India is the excess demand in the official markets caused by the scarcity of official forex reserves. Thus, a low level of official reserves is associated with a high level of excess demand that increases BM rates.
In terms of short-run dynamics, only $\Delta E_O$ is statistically significant (equation (4)). However, the inclusion of the other variables is justified according to the AIC criterion. The statistically significant coefficient of $\Delta E_O$ implies that, in the short run, BM rates respond positively to the official rates.
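$$\Rightarrow E_{U,t} = \alpha_0 + VER_S + \sum_{i=0}^{2} \gamma_i Q_{t-i} + \sum_{i=0}^{1} \delta_i I_{t-i} + u_t \tag{5}$$
$$\Rightarrow VER_S = \pi_S \times E_{O,t}; \qquad \pi_S = \left( \sum_{i=1}^{2} \hat\alpha_i + \sum_{j=0}^{0} \beta_j \right)$$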
The $VER_S$ can be calculated from equation (2):
$$VER_S = \pi_S \times E_{O,t} = (0.60743 - 0.23862 + 0.73443)\, E_{O,t} = 1.10324 \times E_{O,t} \tag{6}$$
The long-run VER ($VER_L$) can be obtained from the estimated long-run equation (3):
$$E_U = \alpha + VER_L\, E_O + \gamma\, Q_{OL} + \delta I \tag{7}$$
where $VER_L = \pi_L \times E_O$ with $\pi_L = \beta$ (see equation (3)), i.e. $VER_L$ is equal to the long-run coefficient of $E_O$ multiplied by $E_O$, which is $1.1633 \times E_O$ in our case.
We can also obtain the short- and long-run VER by taking the weighted average of $E_O$ and $E_U$:
$$VER_i = \pi_i \left( s\, E_{O,t} + l\, E_{U,t} \right) \tag{8}$$
where $i = S$ and $L$, and $s$ and $l$ are the weights given to $E_O$ and $E_U$, such that $s + l = 1$.
Therefore, VERs would be about 10% higher in the short run and 16% higher in the long run than the official rates. More interestingly, the VERs are lower than the BM rates, indicating that risk premiums are associated with the BM rates. The difference between the VER and BM rates is thought to be positively influenced by the risks associated with BMs. The risks include the probability of detection plus legal and moral problems (Sheik, 1976).
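The arithmetic behind the 10% and 16% figures can be reproduced from the reported coefficients. In the sketch below the variable names are illustrative; the short-run coefficients (0.60743, -0.23862, 0.73443) and the long-run coefficient (1.1633) are the values quoted in the text, and the short-run multiplier is formed as the sum of the coefficients on lagged $E_U$ and on $E_O$, following the definition of $\pi_S$.

```python
# Coefficients reported in the text for the estimated ARDL(2, 0, 2, 1) equation.
alpha_hat = [0.60743, -0.23862]   # coefficients on E_U at lags 1 and 2
beta_hat = [0.73443]              # coefficient on E_O at lag 0

pi_short = sum(alpha_hat) + sum(beta_hat)   # short-run multiplier pi_S
pi_long = 1.1633                            # long-run coefficient of E_O quoted in the text

# Percentage by which the VER exceeds the official rate.
premium_short = (pi_short - 1.0) * 100.0
premium_long = (pi_long - 1.0) * 100.0
```

The short-run multiplier equals 1.10324 only when the coefficient 0.73443 enters with a positive sign, which is how it appears in the estimated equation.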
6 Conclusions
In this paper, we applied the ARDL approach to cointegration analysis developed by Pesaran & Shin (1998) for estimating the VERs in India using annual data from 1967 to 1996. We find a multivariate cointegrated relationship in which the real BM exchange rate ($E_U$) is the dependent variable and the real official exchange rate ($E_O$), official forex reserves ($Q$) and the ratio of foreign to domestic interest rates ($I$) are explanatory variables. Results reveal that an increase in $E_O$ causes a depreciation in $E_U$. This result supports the view that an official depreciation generally reduces the BM premiums and the flow supply of forex to the BM by reducing under-invoicing of exports and over-invoicing of imports. This fall in (flow) supply of forex requires a depreciation in the BM rate to maintain the equilibrium.
We also conclude that an increase in I causes a depreciation in EU . Note that a
rise in I, i.e. an increase in foreign interest rates (returns on foreign money) relative
to domestic interest rates (returns on domestic money), boosts the demand for
foreign money. This rise in demand causes an increase in EU . Finally, we found
that an increase in Q causes an appreciation in EU . A reduction in Q raises the
excess demand in the official markets, which in turn causes a depreciation in EU.
Contrary to the other available methods of modelling equilibrium exchange
rates, a structural relationship is considered in estimating the VERs. Our results
show that the VER would be higher than the official exchange rates by about 10%
in the short-run, and 16% in the long-run. As the VER is lower than the BM rates,
distortions in exchange rates are not severe and the government can gradually
adjust the exchange rates without facing serious difficulties. The BM rates
may be higher than the VERs because of the risks associated with the BMs.
Acknowledgements
We are grateful to an anonymous referee for constructive comments on an earlier
version of this paper. This paper has also benefited from the comments of Professor
Kate Phylaktis and the participants of the ESRC conference in Birmingham
University (UK) and the IIDS conference at Central Michigan University (USA).
We are thankful to Stephen Wheatly Price and Chris Stewart for helpful comments.
The usual disclaimer applies.
Notes
1. The foreign interest rate is proxied by the London-based Euro Dollar Rate and the domestic interest
rate is proxied by the Bank Rate, the discount rate given by the central bank to commercial banks.
2. The lower and upper bounds are appropriate for I(0) and I(1) variables, respectively. If the critical
values fall outside both bounds, as is our case, no knowledge is required regarding whether variables
are I(0) or I(1). However, if estimated critical values fall within the band, knowledge on the order
of integration of the variables is needed.
3. AR2-F and AR2-χ²(2) are the F and chi-square statistics, respectively, for joint autocorrelation of the
residuals up to order two. RESET-F and NOR-χ²(2) are the F and chi-square statistics, respectively,
for functional mis-specification. NOR-χ²(2) is the chi-square statistic for testing normality. H-F and
H-χ² are F and the chi-square statistics, respectively, for testing heteroscedasticity. EU-F, EO-F, Q-F,
I-F are F tests for the joint significance of the particular variables (contemporaneous and lagged) in
the model.
REFERENCES
Agénor, P. R. (1990) Stabilization policies in developing countries with a parallel market for foreign
exchange: a formal framework, IMF Staff Papers, 37(3), pp. 560-592.
Baghestani, H. & Noer, J. (1993) Cointegration analysis of the black market and official exchange
rates in India, Journal of Macroeconomics, 15(4), pp. 709-721.
Biswas, B. & Nandi, S. (1986) The black market exchange rate in a developing economy: the case of
India, The Indian Economic Journal, 33(3), pp. 23-34.
Blejer, M. L. (1978) Exchange restrictions and the monetary approach to the exchange rate. In: J. A.
Frankel and H. G. Johnson (Eds) The Economics of Exchange Rates: Selected Studies (Reading, MA).
Bhagwati, J. (1979) The New International Economic Order (Boston, MIT Press).
Bhagwati, J. & Desai, M. (1970) India: Planning for Industrialisation (London, Oxford University Press).
Charemza, W. W. (1990) Parallel markets, excess demand and virtual prices: an empirical approach,
European Economic Review, 34, pp. 331-339.
Charemza, W. W. & Ghatak, S. (1990) Demand for money in dual-currency quantity constrained
economy: Hungary and Poland, 1956-85, The Economic Journal, 100, pp. 1159-1172.
Dornbusch, R., Dantas, D. V., Pechman, C., Rocha, R. R. & Simoes, D. (1983) The black market
for dollars in Brazil, Quarterly Journal of Economics, 98, pp. 25-40.
Engle, R. F. & Granger, C. W. J. (1987) Cointegration and error correction: representation, estimation
and testing, Econometrica, 55, pp. 251-276.
Ghatak, A. & Ghatak, S. (1996) Budgetary deficits and Ricardian equivalence: the case of India,
Journal of Public Economics, 60, pp. 267-282.
Gupta, S. (1980) An application of the monetary approach to black market exchange rates,
Weltwirtschaftliches Archiv, 116, pp. 235-252.
International Monetary Fund (1997) Exchange Arrangements and Exchange Restrictions: Annual
Report 1997 (Washington, DC).
Joshi, V. & Little, I. M. D. (1994) India: Macroeconomics and Political Economy, 1964-1991
(Washington, DC, The World Bank).
Kamin, S. B. (1993) Devaluation, exchange controls, and black markets for foreign exchange for
developing countries, Journal of Development Economics, 40, pp. 151-169.
Khan, M. S. & Ostry, J. D. (1992) Response of Equilibrium Real Exchange Rate to Real Disturbances
in Developing Countries, World Development, 20, pp. 1325-34.
Kiguel, M. A., Lizondo, J. S. & O’Connell, S. A. (1997) Parallel Exchange Rates in Developing
Countries (London, Macmillan Press).
Neary, P. & Roberts (1980) The theory of household behaviour under rationing, European Economic
Review, 13, pp. 25-42.
Pesaran, H. M. & Pesaran, B. (1997) Microfit 4.0 (Oxford University Press).
Pesaran, H. M. & Shin, Y. (1998) An autoregressive distributed lag modelling approach to cointegration
analysis, chapter 11 in S. Strøm (Ed) The Econometrics and Economic Theory in the 20th Century
(Cambridge, Cambridge University Press).
Pesaran, H. M., Shin, Y. & Smith, R. J. (1996) Testing the existence of a long-run relationship. DAE
Working Paper Series, 9622, Cambridge University, Department of Applied Economics.
Phylaktis, K. (1992) The black market for dollars in Chile, Journal of Development Economics, 37,
pp. 155-172.
Rothbarth, E. (1940) The measurement of changes in real income under conditions of rationing,
Review of Economic Studies, 8, pp. 100-107.
Sheik, M. A. (1976) Black market for foreign exchange, capital flows and smuggling, Journal of
Development Economics, 3, pp. 9-26.
Siddiki, J. U. (1999) Economic liberalisation and growth in Bangladesh: 1974-95, PhD Thesis,
Kingston University, UK.
Siddiki, J. U. (2000) Black market exchange rates in India: an empirical analysis, Empirical Economics,
25(2), pp. 297-313.
Siddiki, J. U. & Daly, V. (1999) Trade and financial liberalisation and economic growth in India.
Discussion Paper No. 99/7, Kingston University, UK.
Tobin, J. & Houthakker, H. S. (1950) The effects of rationing on demand elasticities, Review of
Economic Studies, 18, pp. 140-153.
World Bank (1994) Trends in Developing Countries (Washington, DC).
Appendix
ECM: \Delta y_t = \sum_{i=1} a_i \Delta y_{t-i} + \sum_{i=0} b_i' \Delta x_{t-i} + c z_{t-1} (M.2)
criterion and OLS is then applied. Recovery of the coefficients of the long-run
model is a re-parameterization exercise and therefore purely computational.
Pesaran et al. (1996) offer a procedure for identifying the dependent variable in
a system containing a single cointegrating relationship. This procedure involves
computation of standard and hypothesis tests, albeit with non-standard critical
values, applied to an unrestricted version of ECM (UECM):
UECM: \Delta y_t = \sum_{i=1} a_i \Delta y_{t-i} + \sum_{i=0} b_i' \Delta x_{t-i} + \phi y_{t-1} + \delta' x_{t-1} (M.4)
Time Series in Economics and Finance
Tomas Cipra
This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents
1 Introduction
References
Index
Chapter 1
Introduction
Most data in economics and finance are observed in time (sometimes even online in
real time) so that they have the character of time series. This monograph presents
methods currently used for analysis of data in this context. Such methods are
available not only in many monographs, textbooks, or papers but also in various
journals or working papers, case studies, or guides to the corresponding software
systems. This text tries to bring together as many methods as possible to cover the
most recommended instruments for analysis and prediction of dynamic data in
economics and finance.
The objective of this book is practical applicability. Therefore, it centers on
the description of methods used in practice (both simple and complex ones from the
computational point of view). Their derivation is often concise (or omitted, particularly in
more complicated cases), but the text always refers to easily available sources. In any
case, many numerical examples illustrate the theory by means of real data which are
usually chosen to be characteristic for the presented methodology.
Selected parts of the text are suitable for university programs (undergraduate,
graduate, or doctoral) concerning econometrics or computational finance as study,
training, or reference materials. Moreover, due to the complete survey of current
methods and approaches, the book can serve as a reference text in research work. On
the other hand, it can also be recommended for people dealing with analysis of data
in economics and finance (banks, exchanges, energy planning, currency and
commodity markets, insurance, statistical offices, demography, and others).
The presented material requires mostly the application of suitable software.
Fortunately, the corresponding programs are easily available since they can be
found in libraries of common statistical or financial software systems (R Statistical
Software, MATLAB, EViews, and others can be recommended). There are several
reasons supporting ready-made software: (1) calculations (e.g., in Excel) are usually
troublesome (particularly for users with superficial knowledge of programming);
(2) software manuals are usually helpful in various individual situations, and,
moreover, the parameters of programs are preset as default values suitable for the
immediate (routine) application; and (3) when browsing through the offer of
software systems, one discovers other methods or modifications which can be useful
for the solved problem. On the other hand, the potential user should not be only a
software consumer sharing all drawbacks of the given software product. Moreover,
the qualified users should be capable of interpreting the computer outputs in a proper
way since they understand principles of the chosen methods.
The monograph consists of several parts divided into particular chapters:
Part I (Subject of time series, Chap. 2) deals with the subject of time series which
are looked upon as trajectories of random processes.
Part II (Decomposition of economic time series, Chaps. 3–5) is devoted to the
classical approach decomposing economic time series to trend, periodic (seasonal
and cyclical), and residual components. Some of more advanced methods are also
addressed, e.g., tests of periodicity or randomness.
Part III (Autocorrelation methods for univariate time series, Chaps. 6 and 7)
summarizes so-called Box–Jenkins methodology based on (linear) ARMA models
and their modifications (ARIMA, seasonal ARMA, long memory processes) for
univariate time series. Some more recent topics are also mentioned in this context
(e.g., information criteria or tests of unit root). Finally, dynamic regression models
are presented in Part III including distributed lag models.
Part IV (Financial time series, Chaps. 8–11) confines itself to financial time series
which require special (namely nonlinear) models and instruments due to the typical
volatility of financial data. Models nonlinear in mean and in variance are distin-
guished including tests of linearity and duration modeling. Further, Part IV addresses
the modeling of financial assets by means of diffusion processes including Black–
Scholes formula and modeling of the term structure of interest rates. Chapter 11
presents the highly topical subject of risk measures (value at risk and others). Extreme value
theory is also mentioned in this context, namely block maxima and threshold
excesses.
Part V (State space models of time series, Chaps. 12–14) concludes the mono-
graph considering the multivariate time series. At first, the popular vector
autoregression (VAR) model is presented including tests of causality, impulse
response, variance decomposition, cointegration, and EC models. The multivariate
volatility modeling is also described including multivariate EWMA and GARCH
models with a practical application for conditional value at risk. Finally, the (mul-
tivariate) state space models as the background of Kalman filtering are discussed
including the state space model approach to exponential smoothing.
Some parts of this monograph serve as lecture notes for courses of time series
analysis and econometrics at the Faculty of Mathematics and Physics of Charles
University in Prague (it is also the reason why some real data used in practical
examples are taken from the Czech economics and finance).
Acknowledgment The author thanks Dr. Radek Hendrych for various forms of help. The
research work contained in the monograph was supported by the grant 19-28231X provided by
the Grant Agency of the Czech Republic.
Part I
Subject of Time Series
Chapter 2
Random Processes
Data typical for economic and financial practice are time data, i.e., values of an
economic variable (or variables in multivariate case) observed in a time interval with
a given frequency of records (each trading day, in moments of transactions, monthly,
etc.). The frequency of records is understood either as the lengths of intervals
between particular observations (e.g., calendar months) or the regularity of obser-
vations (e.g., each trading day). As to the regularity, financial data are often
irregularly observed (irregularly spaced data), e.g., the stock prices in stock
exchanges are quoted usually in moments of transactions from the opening to closing
time of trading day, the frequency of transactions being usually lower in the morning
after opening, during the lunch time, and later in the afternoon before closing
(a possible approach in such a situation assigns the closing or prevailing price to
this day). The important property of time data is the fact that they are ordered
chronologically in time.
The term time series denotes any sequence of data y1, . . ., yn ordered chronologically
in time. This could justify the simplified view of a time series as a set of
numbers ordered in time (historically this was the case, e.g., for astronomical
observations). However, a very important aspect of time series is not only their dynamics but
also their randomness. In order to be adequate, the analysis of time series should
apply models that are based on stochastic principles (i.e., on probability theory)
and are capable of generating time sequences similar, from the stochastic point of
view, to the trajectory that we actually observe. Such models are denoted as random
processes and can be regarded as specific algorithms based on random number
generators. Knowledge of the algorithm that generated the observed time series
as one output among many possible realizations may be highly useful for examining our
specific data.
Random process (or stochastic process) {Yt, t ∈ T} is a set (or family) of random
variables in the same probability space (Ω, ℑ, P) indexed by means of values t from
T (T ⊂ R), where t is interpreted as time. According to the form of the index set T,
which is a subset of the real line R, one distinguishes:
• Random process in continuous time: T is an interval in the real line, e.g., T = [0, ∞),
i.e., {Yt, t ≥ 0}.
• Random process in discrete time: T is formed by discrete real values, e.g., T = N0,
i.e., {Y0, Y1, Y2, . . .}.
According to the states of random variables Yt (i.e., according to the state space S)
one also distinguishes:
• Random process with discrete states: e.g., counting process Yt ∈ N0 for all t ∈ T
that registers the number of specified events in time.
• Random process with continuous states: e.g., real process Yt ∈ R or nonnegative
process Yt ∈ [0, ∞) for all t ∈ T.
• Multivariate random process: Yt is an m-variate random vector Yt for all t ∈ T.
In any case, one can observe only trajectories (realizations) of random processes.
Such a trajectory arises by a choice of an elementary event ω ∈ Ω and is a
deterministic function of time {Yt(ω), t ∈ T} observable due to this specific choice.
One denotes trajectories simply as {yt, t ∈ T} in discrete time and {y(t), t ∈ T} in
continuous time.
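The distinction between a random process and its trajectories can be sketched in a few lines. This is only an illustration under stated assumptions: the process is taken to be Gaussian white noise in discrete time, and the seed of the pseudo-random generator plays the role of the elementary event ω (the function name `trajectory` is hypothetical).

```python
import random


def trajectory(seed, n):
    """One realization {y_1, ..., y_n} of a discrete-time Gaussian process.

    The seed stands in for the elementary event omega: once it is fixed,
    the trajectory is a deterministic function of time.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


path_a = trajectory(seed=1, n=5)        # trajectory for one "omega"
path_b = trajectory(seed=2, n=5)        # a different "omega", a different path
path_a_again = trajectory(seed=1, n=5)  # same "omega", identical path

print(path_a == path_a_again, path_a == path_b)
```

The same generating algorithm thus produces many different observable series, of which an analyst sees only one.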
Remark 2.1 Unfortunately in the literature (and also in this text), it is common that
the term time series is interpreted sometimes as the trajectory and sometimes as the
random process. The real meaning follows from the context.
⋄
In any case, the data in the form of time series have a lot of specific features. It can
help to analyze such data files, but on the other hand, it can cause complications that
must be overcome by suitable procedures and adjustments. The next section will
present examples of specific problems that are typical for time series analysis.
Economic and financial data observed in time typically feature problems implied just
by their time character:
Time series in discrete time that prevail in economics and finance usually arise by the
following ways:
• They are discrete by nature (e.g., daily interbank LIBOR rates).
• One discretizes time series in continuous time (e.g., closing quotations on stock
exchanges assigned to the ends of particular trading days in the context of
continuous trading).
• One accumulates (aggregates) values over given time intervals (e.g., accumu-
lated sums of insurance benefits paid out in particular quarters); often one pro-
duces averages instead of aggregates.
In some cases, one is not allowed to select the time points of observations oneself.
However, if such a possibility exists, one should pay careful attention to it. It often
means that one must find a trade-off among contradictory requirements: for instance,
on one side, because of numerical complexity, one should not use too high a density of
records of a continuous process (see, e.g., ultra-high-frequency data UHFD in finance), and on
the other side one should not use data so sparse that some characteristic features of the
given process cannot be identified (e.g., if we are interested in seasonal
fluctuations, we need at least several observations during each year). As far as
the distance between neighboring observations is concerned, it is common to
observe data regularly at equidistant time points. In finance, on the contrary,
irregularly spaced data are not unusual due to irregularities in market trading (see
Sect. 2.1).
Nature is responsible for a minor part of the problems caused by the calendar (e.g.,
the number of days in one solar year is not an integer, various geographic zones require
time shifts). However, the major part of calendar problems is due to human
conventions, which give us, e.g.,
• Different lengths of calendar months
• Four or five weekends monthly
• Different numbers of working or trading days monthly
• Moving holidays (e.g., Easter once in the first quarter and next time in the
second one)
• Wintertime or summertime
Such irregularities must be taken into account in an adequate way, e.g., differ-
ences in security trading or in quality of produced cars at the beginning, middle, and
end of particular weeks, different times to maturities quoted on some security
exchanges as the third Friday of particular months, and others. In practice, one
usually applies simple methods eliminating these undesirable phenomena. Several
examples follow:
• Calendar conventions are common in the framework of simple interest and
discount models (e.g., the calendar Euro-30/360 introduces months with
30 days and years with 360 days).
• If comparing monthly productions of some products (cars), the volumes are
adjusted using so-called standard month with 30 days: in such a case, one should
multiply the January production by coefficient 30/31, the February production in
the common year by 30/28, and in the leap year by 30/29, etc. Similarly if
comparing securities traded monthly, one should multiply the January volume
by (21/actual number of January trading days), the February volume by (21/actual
number of February trading days), etc., as the average annual number of trading
days is 252, i.e., 21 monthly.
• Some short-term calendar irregularities can be eliminated by means of accumu-
lation. For instance, if it suffices to analyze data accumulated annually instead of
original quarterly data, then some calendar problems (e.g., seasonal fluctuations,
moving Easter, and others) can be reduced in this natural way.
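The standard-month and trading-day adjustments described in the list above can be sketched as follows. The function names and the production figures are hypothetical; the constants 30 and 21 are the ones given in the text.

```python
import calendar

STANDARD_MONTH_DAYS = 30      # "standard month" from the text
STANDARD_TRADING_DAYS = 21    # 252 trading days per year / 12 months


def adjust_to_standard_month(value, year, month):
    """Rescale a monthly total to a 30-day standard month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return value * STANDARD_MONTH_DAYS / days_in_month


def adjust_to_standard_trading_days(volume, trading_days):
    """Rescale a monthly traded volume to 21 trading days."""
    return volume * STANDARD_TRADING_DAYS / trading_days


# Hypothetical production figures, chosen so that each adjusted value is 3000.
january = adjust_to_standard_month(3100.0, 2023, 1)        # factor 30/31
february = adjust_to_standard_month(2800.0, 2023, 2)       # factor 30/28
february_leap = adjust_to_standard_month(2900.0, 2024, 2)  # factor 30/29
print(january, february, february_leap)
```

After the adjustment, months of different length become directly comparable.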
In addition to calendar problems, one must frequently face irregularities in
time series that are consequences of operational risk (blackouts, website breakdowns,
human failures including fraud, etc.). Irregularities of this type are
classified as outliers, and statistical methods for time series with outliers should be
robustified to become insensitive to such outlying values. Another type of
irregularity is a jump caused by an intervention (e.g., a successful advertising
campaign, a decision of the bank council to decrease key interest rates, new
legislation, and so on).
The length of time series is the number n of observations of the given time series (not
the time range between the beginning and the end of time series). Therefore, e.g., the
monthly time series over 10 years has the length of 120. It is logical that the volume of
information available for analysis increases with the increasing length of time series.
However, the length of time series is not the only measure of the information contained
in the time series (e.g., doubling the length by halving the original time
intervals between neighboring points of observation does not usually double
the information on this time series): one must also consider the inner
structure of the given time series.
As far as the length of time series is concerned, a reasonable trade-off is usually
necessary in practice. On one side, some time series methods require a sufficient
length of series (e.g., the routine application of Box–Jenkins methodology is not
recommended for time series shorter than 50). On the other hand, characteristic
features of long time series usually change in time so that the construction of
adequate model becomes more complex with increasing length of time series.
Similarly, the typical problem in longer time series originates due to the fact that
the measurements in the beginning of the given time series need not be comparable
with the ones in its end, e.g., due to inflation, price growth, technical development,
and the like. In such a case, one should adjust data by means of a suitable index
(in practice, it can be not only the inflation rate but also the salary growth for time
series used in formulas of pay-as-you-go pension systems and the like).
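The index adjustment mentioned above can be sketched as a simple deflation of nominal values. The function name, the base value of 100, and all figures are hypothetical; the choice of index (price index, salary index, and so on) depends on the application.

```python
def deflate(nominal, index, base=100.0):
    """Express nominal values in units of the base period of the index."""
    if len(nominal) != len(index):
        raise ValueError("series and index must have the same length")
    return [value * base / level for value, level in zip(nominal, index)]


nominal = [200.0, 210.0, 231.0]      # hypothetical nominal values
price_index = [100.0, 105.0, 110.0]  # hypothetical index, base period = 100

real = deflate(nominal, price_index)
print(real)
```

In this made-up example the nominal growth between the first two periods disappears entirely after deflation, so the beginning and the end of the series become comparable.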
The choice of suitable time series method depends on various factors, e.g.,
• Objective of analysis, mainly the identification of generating model, the hypothesis
testing, the prediction, the control and optimization (see the introduction to
Sect. 2.2); in this context, it is also relevant how the analysis results will be
exploited in practice, what the costs of the analysis will be, what the volume of
analyzed data is, and the like.
• Type of time series, since some methods are not suitable universally for all time
series (e.g., it makes no sense to apply a Box–Jenkins model to an economic time
series of ten annual observations that show an apparent linear growth).
• Experience of analyst, who is responsible for the analysis, and software, which
will be exploited for the analysis.
The most popular methods and procedures of time series analysis are the
following ones:
Reality shows that time series of economic character can be usually decomposed to
several specific components, namely
(some authors demand for the white noise even stronger assumptions written usually
as εt ~ iid(0, σ²), where εt are independent and identically distributed random
variables with zero mean value and constant variance). The name “white noise” derives
from the spectral analysis and refers to the property of a constant spectrum with
equal magnitude at all frequencies (or wavelengths) similarly as in the white light in
optics. The values εt are also called innovations as they correspond to unpredictable
movements (shocks) in time series.
Obviously, one can look upon the given economic time series as a trend linked
with periodic components (i.e., seasonal and cyclical ones) and white noise. More-
over, the decomposition can be either additive or multiplicative:
Additive decomposition has the form
y_t = Tr_t + C_t + I_t + E_t. (2.2)
In the additive decomposition, all components are measured in the units of the
time series yt, i.e., all components are absolute ones (not relative ones measured, e.g.,
as percent of the trend).
Multiplicative decomposition has the form
y_t = Tr_t C_t I_t E_t. (2.3)
[Fig. 2.1 Scheme of additive decomposition. Panels: Tr_t; Tr_t + C_t; Tr_t + C_t + I_t; Tr_t + C_t + I_t + E_t; components C_t, I_t, E_t; horizontal axis t]
decomposition to the additive one, and vice versa, by means of the exponential
transformation (one must pay attention to the changes of statistical properties of the
transferred residual component in such a case).
If the observations are y_1, . . ., y_n, then
\hat{y}_t = \hat{Tr}_t + \hat{C}_t + \hat{I}_t or \hat{y}_t = \hat{Tr}_t \hat{C}_t \hat{I}_t (2.4)
is the smoothed time series (for t ≤ n), or the prediction of time series (for t > n),
based on the calculated values of systematic components, or on the extrapolated
ones, respectively. Obviously, some systematic components can be missing in the
decomposition of various economic time series, e.g., the series y_t = Tr_t + E_t does not
contain any periodic components at all. Figure 2.1 shows the scheme of additive
decomposition in a graphical way.
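The relation between the multiplicative form (2.3) and the additive form (2.2) can be checked numerically: taking logarithms turns a product of components into a sum, and the exponential transformation goes back. The sketch below is only an illustration; the component values are made up, and the cyclical component is omitted for brevity.

```python
import math

# Hypothetical components of a multiplicative decomposition (no cycle).
trend = [100.0 + 2.0 * t for t in range(8)]
seasonal = [1.10, 0.90] * 4                      # seasonal factors
residual = [1.01, 0.99, 1.00, 1.02, 0.98, 1.00, 1.01, 0.99]

# Multiplicative form: y_t = Tr_t * I_t * E_t.
y = [tr * s * e for tr, s, e in zip(trend, seasonal, residual)]

# Additive form of the logarithms: log y_t = log Tr_t + log I_t + log E_t.
log_components = [
    math.log(tr) + math.log(s) + math.log(e)
    for tr, s, e in zip(trend, seasonal, residual)
]

max_gap = max(abs(math.log(v) - lc) for v, lc in zip(y, log_components))
recovered = [math.exp(lc) for lc in log_components]  # back-transformation
print(max_gap < 1e-12)
```

Note that the residual component changes its statistical properties under the transformation, e.g., a residual with mean one becomes a residual with mean close to zero.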
The decomposition methods are based mainly on the analysis of systematic compo-
nents of time series (i.e., the trend, seasonal, and cyclical components), and they
regard the particular observations as uncorrelated. The typical statistical instrument
y_t = ε_t + 0.7 ε_{t-1}, (2.5)
where yt is the modeled time series and εt is the white noise (2.1). Other types of
models applied in the framework of Box–Jenkins methodology are so-called
autoregressive processes AR [see (6.31)] and processes ARMA [see (6.45)].
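The moving average model (2.5) can be simulated directly from its definition. The sketch below assumes Gaussian white noise with unit variance (an assumption, since the text only requires white noise) and compares the sample autocorrelation at lag 1 with the theoretical value θ/(1 + θ²) of an MA(1) process, which is about 0.47 for θ = 0.7. Function names are hypothetical.

```python
import random


def simulate_ma1(n, theta=0.7, sigma=1.0, seed=0):
    """Simulate y_t = eps_t + theta * eps_{t-1} for t = 1, ..., n."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n + 1)]
    return [eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]


def sample_autocorr(y, lag):
    """Sample autocorrelation of y at the given positive lag."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    ck = sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, n))
    return ck / c0


y = simulate_ma1(20000)
theoretical_rho1 = 0.7 / (1.0 + 0.7 ** 2)
print(round(theoretical_rho1, 4))
print(sample_autocorr(y, 1), sample_autocorr(y, 2))
```

The autocorrelations at lags 2 and higher are theoretically zero, which is the property used to identify the order of a moving average model.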
At first sight it could seem that the attention devoted by this methodology to the
random component is excessive and that one loses the possibility to model
nonstationary time series with evident trend or seasonal character (the so-called
stationarity of a time series means that the behavior of this series is stable in a
specific way; see Sect. 6.1). However, Box–Jenkins methodology is capable of
managing also these cases by means of so-called integrated processes ARIMA and
seasonal processes SARIMA, where the trend or seasonal components are modeled
in a stochastic way (in contrast to the deterministic modeling when using the
classical decomposition approach). For instance, in the very simple model ARIMA(0, 1, 0)
y_t = y_{t-1} + ε_t, (2.6)
the stochastic trend can be characterized in such a way that its increments over
particular observation intervals are random in the form of white noise (which is
why the process (2.6) is called the random walk). Due to this stochastic
approach, Box–Jenkins methodology is very flexible and can model in a satisfactory
way even non-standard time series that are unmanageable by the classical
decomposition approach.
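The random walk (2.6) is easy to simulate, and differencing it recovers the white noise increments. The sketch below assumes Gaussian innovations with unit variance and a zero starting value; these choices and the function name are illustrative only.

```python
import random
import statistics


def simulate_random_walk(n, sigma=1.0, y0=0.0, seed=0):
    """Simulate y_t = y_{t-1} + eps_t, starting from y_0 = y0."""
    rng = random.Random(seed)
    y = [y0]
    for _ in range(n):
        y.append(y[-1] + rng.gauss(0.0, sigma))
    return y


y = simulate_random_walk(5000)

# First differences y_t - y_{t-1} are exactly the innovations eps_t.
increments = [y[t] - y[t - 1] for t in range(1, len(y))]

print(statistics.fmean(increments), statistics.pstdev(increments))
```

The level of the series wanders without a fixed mean, while the increments behave like white noise with mean near zero and standard deviation near one.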
In analysis of multivariate time series, one models several time series simultaneously
including relations and correlations among them (see Chap. 12). Then the causality
relations among various economic variables modeled dynamically in time can be
addressed in this context. Another important phenomenon here is the so-called
cointegration, in which particular (univariate) time series from a multivariate model
have a common stochastic trend that can be eliminated completely by combining
the particular time series in a suitable way (see Sect. 12.5). The popular instrument for
modeling multivariate time series is the process VAR (vector autoregression; see
Sect. 12.2).
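The idea of a common stochastic trend can be illustrated by a small simulation. In the sketch below, two series share one random walk; the loading 2.0, the noise scale 0.5, and the series length are hypothetical choices. The combination y_t − 2 x_t removes the common trend and leaves only stationary noise.

```python
import random
import statistics

rng = random.Random(0)
n = 2000

# Common stochastic trend: a random walk.
common_trend = [0.0]
for _ in range(n - 1):
    common_trend.append(common_trend[-1] + rng.gauss(0.0, 1.0))

# Two series driven by the same trend plus their own stationary noise.
x = [w + rng.gauss(0.0, 0.5) for w in common_trend]
y = [2.0 * w + rng.gauss(0.0, 0.5) for w in common_trend]

# The combination y - 2x cancels the trend exactly.
spread = [b - 2.0 * a for a, b in zip(x, y)]

print(statistics.pstdev(x), statistics.pstdev(spread))
```

In practice the combining coefficient is unknown and has to be estimated; here it is known by construction.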
The three approaches presented above can be summarized as the time series analysis in
time domain. A distinct approach that regards the examined time series as an
(infinite) mixture of sinusoids and cosinusoids with different amplitudes and
frequencies (according to the Wiener–Khinchin theorem for stationary time series) is
the time series analysis in spectral domain, called briefly the spectral analysis of time
series (sometimes one also speaks more generally of Fourier analysis). Applying
special statistical instruments, e.g., periodogram or spectral density, one can obtain
in this context a picture of how the intensities of particular frequencies are
distributed in the examined time series (the so-called spectrum of time series), which
of its frequencies are the most intensive ones, including the estimation of the
corresponding periodic components, etc.
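A periodogram can be computed directly from the discrete Fourier transform. The sketch below uses one common scaling (squared modulus divided by n), which is an assumption since the text does not fix a normalization; the test series with periods 12 and 5 is made up, and the function name is hypothetical.

```python
import cmath
import math


def periodogram(y):
    """Return (frequency, intensity) pairs at Fourier frequencies k/n, k >= 1."""
    n = len(y)
    mean = sum(y) / n
    centered = [v - mean for v in y]
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        result.append((k / n, abs(coeff) ** 2 / n))
    return result


# Hypothetical series: strong period 12 plus a weaker period 5.
n = 120
y = [
    3.0 * math.sin(2 * math.pi * t / 12) + 0.5 * math.cos(2 * math.pi * t / 5)
    for t in range(n)
]

peak_freq, peak_value = max(periodogram(y), key=lambda pair: pair[1])
print(1.0 / peak_freq)   # dominant period in time units
```

The most intensive frequency corresponds to the period of 12 time units, i.e., the dominant periodic component of the series.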
The spectral analysis is important for applications in engineering (vibrograms,
technical diagnostics, seismograms) and biology (electrocardiograms). On the other
hand, it is not usual for economic time series [except for the tests of periodicity (see
Sect. 4.2) or the investigation of cycles in economics (see Hatanaka 1996)]. In any
case, a deeper study of theoretical backgrounds of this approach to time series
demands special references, e.g., monographs Koopmans (1995) or Priestley (2001).
There exist plenty of methods concerning special types or aspects of time series, e.g.:
• Nonlinear models of time series: for instance, threshold models are suitable for
time series that change their character after exceeding particular threshold levels;
asymmetric models are applied for time series whose momentary development is
revised according to their previous development and, moreover, such a revision is
asymmetric in dependence on the previous growth or decline.
• Models of financial time series: these time series have various typical features,
e.g., so-called leptokurtic or heavy-tailed distribution, extreme values appearing
in clusters, high frequency of records, and others; therefore, very specific
nonlinear models are necessary for time series used in finance (e.g., models
ARCH or GARCH with conditional heteroscedasticity whose variance called
usually volatility depends in the given time on the previous behavior of time
series; see Chap. 8).
• Recursive methods in time series: these methods provide results for a new time
step (estimates, smoothed values, predictions, and others) using results from
previous time periods and adjusting them by means of new observations (in
particular, one can make use of the Kalman filter here as a formal recursive
methodology; see Chap. 14).
• Methods for time series with missing or irregular observations: in such series,
some observations are either missing (e.g., they are unobservable, false,
outlying, or confidential) or are observed in irregular time intervals (e.g., due to time
irregularities in trading on markets, one must also model so-called durations
between neighboring values; see Sect. 9.4).
• Robust analysis of time series: here one identifies and eliminates the influence of
outliers that contaminate analyzed records and distort results of classical methods
(a very simple example how to robustify a classical statistical method in order to
be insensitive to outliers is to replace the arithmetic average by the median when
estimating the average level of a time series).
• Intervention analysis of time series: this analysis examines one-off external
impacts that can significantly influence the course of a time series (e.g., an intervention
of the central bank, an effective advertising campaign, and others; see Sect. 7.4).
• Plenty of other special methods.
This classification of predictions holds not only for time series, but it is also
common, e.g., in the econometric regression analysis:
Point prediction is a numerical estimate of a future value of the time series that is
optimal in a certain sense (i.e., an estimate of the time series value at a time point
that has not been observed so far). For instance, the point prediction of the
EUR/USD exchange rate three months ahead, made just now, is 1.0635.
Obviously, the point prediction is always burdened by error, so it must be taken
with discretion.
Interval prediction is the prediction interval which is quite analogous to the
confidence interval used in mathematical statistics; the only difference consists in
the fact that one estimates an unknown (future) value of time series instead of an
unknown parameter in this case. For instance, the 95% prediction interval presents
the lower and upper bounds for the range in which the corresponding future value of
time series will lie with the probability of 0.95. Let us consider again the previous
example with the exchange rate EUR/USD: if the corresponding 95% prediction
interval is (1.0605; 1.0665), then, e.g., a European company can expect with high
confidence that it obtains for each euro at least 1.0605 dollars. From the practical
point of view, the interval predictions seem to be more useful for users than the point
predictions.
Qualitative prediction methods are based usually on opinions of experts (one calls
them sometimes the “expert predictions”), and therefore in practice they have rather
subjective character. Sometimes one is forced to apply these methods when histor-
ical data are missing, e.g., when one introduces a new bank product. Since we avoid
these methods in the following chapters with regard to the character of this text,
some simple examples of qualitative predictions will be given below to have an idea
of this approach to predicting (sometimes the qualitative predictions are even better
than the purely mathematical ones):
• Subjective fitting by curve is a (graphical) method when experts strive to estimate
the future behavior of particular time series using their experience with time series
of similar type. For instance, the graphical plot describing the sale of a new
product (e.g., a new car make) has frequently the form of so-called S-curve shown
in Fig. 2.2: after the starting stage, the sale accrues during the growth stage in
dependence on the intensity of advertising campaign till the stable stage is
achieved (later usually the drop of sale follows only). Therefore, the experts
can suggest the prediction just according to a specific S-curve applying their
subjective opinion on its form (e.g., on the length of particular stages).
• Delphi method is the prediction method based on the enquiry in an expert group
and the gradual mediation of consensus for given prediction problem. This
methodology has been developed by large-scale multinational corporations to
forecast development in science, engineering, production, consumption, and the
like. In particular, its application consists in several stages of anonymous
enquiring where each of addressed experts presents his or her opinion on the
given prediction. In each stage, one adds to the enquiry form the statistical results
of previous stages so that experts can adjust their previous opinions and
converge gradually to the group opinion, which is declared the final prediction.
It should be stressed that the results of particular stages are communicated only in
a global statistical form for the whole group (no individual answers are provided).
In the following Example 2.1, one uses only very simple statistical instruments in
this context, namely the mean values, standard deviations, and lower and upper
quartiles.
$$ \frac{1 \cdot 10 + \ldots + 4 \cdot 40}{50} = 25.4\,\% $$
Further one finds the lower and upper quartiles. The lower (or upper) quartile is the
bound separating one-quarter of the lowest (or highest) observations, respectively. In
our case, one-quarter of the number of observations is 50/4 = 12.5, and therefore, the
lower quartile is 20% and the upper quartile is 30%. In the next stage of this
prediction method, one informs the participants of the expert group of the
statistical results obtained in the first stage (but not of the answers of particular
experts), and so on.
⋄
In-sample prediction is that generated for the same set of data that was used to
estimate the model’s parameters. Obviously, one can expect good prediction results
since one only recalculates selected values of the original sample by means of the
constructed model so that all model assumptions remain valid in the “prediction
horizon.” Nevertheless, this procedure can serve as a very simple test of the in-sample
fit of the model.
Out-of-sample prediction is, on the contrary, the prediction of time series values
that did not participate in the construction of the prediction model at all: either they
were not available at that time (i.e., they were future values from the point of view of
that time), or they were deleted from the sample on purpose (this is common
when one tries to evaluate the prediction ability of a time series model: the data
deleted artificially are then denoted as the hold-out sample). Obviously, predictions
of the hold-out sample represent a better evaluation of the prediction model than
an examination of its in-sample fit (see above).
As a simple example let us consider the time series of 120 monthly observations
in the period 2008M1–2017M12. The objective is to construct a model for this time
series and to assess its quality (in particular, its prediction abilities). Two solutions
are possible in this case: either (1) to construct the model using the whole time series
2008M1–2017M12 (and possibly to generate in-sample predictions) or (2) to construct
the model using only the shorter time series 2008M1–2016M12 and to
generate the out-of-sample predictions for 2017M1–2017M12 (which is the hold-out
sample here; see Fig. 2.3) and to compare them with the real values from the
hold-out sample. The second approach is more correct (of course, a suitable length of
the hold-out sample must be chosen) since here the data information on
2017M1–2017M12 is not used for the construction of the prediction model.
[Fig. 2.3: time axis split into the estimation sample ("Estimation of model") and the hold-out sample ("Out-of-sample predictions")]
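The two solutions can be compared on a small sketch. The series below is synthetic (a linear trend plus Gaussian noise), and the linear trend model, the helper names `fit_line` and `rmse`, and all numerical settings are assumptions chosen only for illustration; they are not taken from the text.

```python
import math
import random


def fit_line(y):
    """OLS fit of y_t = b0 + b1 * t for t = 1, ..., len(y)."""
    n = len(y)
    t_mean = (n + 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y, start=1))
    sxx = sum((t - t_mean) ** 2 for t in range(1, n + 1))
    b1 = sxy / sxx
    return y_mean - b1 * t_mean, b1


def rmse(actual, predicted):
    """Root mean squared error of the predictions."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


rng = random.Random(0)
# Hypothetical data: 120 monthly values standing in for 2008M1-2017M12.
series = [100 + 0.5 * t + rng.gauss(0, 2) for t in range(1, 121)]

# Solution (2): estimate on 2008M1-2016M12, keep 2017M1-2017M12 as hold-out sample.
train, holdout = series[:108], series[108:]
b0, b1 = fit_line(train)

in_sample = [b0 + b1 * t for t in range(1, 109)]
out_of_sample = [b0 + b1 * t for t in range(109, 121)]

rmse_in = rmse(train, in_sample)
rmse_out = rmse(holdout, out_of_sample)
print(f"in-sample RMSE: {rmse_in:.3f}, out-of-sample RMSE: {rmse_out:.3f}")
```

Only `rmse_out` is based on observations that played no role in the estimation, which is why it is the more honest indicator of prediction ability.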
Single-prediction is the prediction constructed for a single time (usually for the next
one), e.g., at time n for time n + 1 denoted as $\hat{y}_{n+1}(n)$, but also, e.g., at time n for time
n + 5 denoted as $\hat{y}_{n+5}(n)$.
Multi-prediction is the prediction made simultaneously for several (future) times (e.g., at
time n for times n + 1, . . ., n + h). Obviously, one obtains a vector of several single
predictions which are constructed at the same time (on the other hand, the sequence
of one-step-ahead predictions constructed for times n + 1, . . ., n + h always after
receiving particular observations $y_n, \ldots, y_{n+h-1}$ cannot be called multi-prediction).
The example in Fig. 2.3 may show some problems connected with multi-predictions,
e.g., with the assessment of their quality. Let an examined prediction
technique applied at the end of 2016 for the hold-out sample 2017 provide a good
result only for the first month 2017M1 (i.e., the short-term prediction result) and bad
results for the remaining months 2017M2–2017M12 (i.e., the long-term prediction
results). The multi-prediction from time 2016M12 for times 2017M1–2017M12 is
not enough to assess the quality of the applied prediction methodology: one should
generate a set of multi-predictions, and it is possible to do so in a systematic way
using so-called prediction windows. Moreover, two types of prediction windows can
be used (see Table 2.2, where only the three closest future values are predicted for simplicity):
• Rolling windows: the samples used for prediction (observable in the rolling
windows) have a fixed length (e.g., 108 in Table 2.2), but their beginning is
shifted.
• Recursive windows: the samples used for prediction (observable in the recursive
windows) have a fixed beginning (e.g., 2008M1 in Table 2.2), but their length
increases.
In both cases, ten multi-predictions are obtained (see Table 2.2) so that conclusions
on the prediction methodology may be reliable.
Table 2.2 Rolling and recursive windows to assess the quality of multi-predictions (see Fig. 2.3)

Multi-prediction          Multi-predictions based on samples provided by
constructed for times     Rolling windows       Recursive windows
2017M1, M2, M3            2008M1–2016M12        2008M1–2016M12
2017M2, M3, M4            2008M2–2017M1         2008M1–2017M1
2017M3, M4, M5            2008M3–2017M2         2008M1–2017M2
2017M4, M5, M6            2008M4–2017M3         2008M1–2017M3
2017M5, M6, M7            2008M5–2017M4         2008M1–2017M4
2017M6, M7, M8            2008M6–2017M5         2008M1–2017M5
2017M7, M8, M9            2008M7–2017M6         2008M1–2017M6
2017M8, M9, M10           2008M8–2017M7         2008M1–2017M7
2017M9, M10, M11          2008M9–2017M8         2008M1–2017M8
2017M10, M11, M12         2008M10–2017M9        2008M1–2017M9
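The window scheme of Table 2.2 can be generated mechanically. The following sketch only enumerates the sample boundaries and target months; the helper names `month_label` and `prediction_windows` are hypothetical, and months are represented by 0-based indices counted from 2008M1.

```python
def month_label(index, start_year=2008):
    """Map a 0-based month index to a label such as '2008M1'."""
    return f"{start_year + index // 12}M{index % 12 + 1}"


def prediction_windows(n_obs, first_length, horizon, rolling):
    """Yield (sample_start, sample_end, targets) as 0-based month indices.

    rolling=True keeps the sample length fixed and shifts its beginning;
    rolling=False keeps the beginning fixed and lets the length increase.
    """
    end = first_length - 1  # last observation of the first estimation sample
    while end + horizon < n_obs:
        start = end - first_length + 1 if rolling else 0
        targets = list(range(end + 1, end + 1 + horizon))
        yield start, end, targets
        end += 1


# 120 monthly observations, first sample of length 108, three-step multi-predictions.
rolling = list(prediction_windows(120, 108, 3, rolling=True))
recursive = list(prediction_windows(120, 108, 3, rolling=False))

for (r_start, r_end, targets), (c_start, c_end, _) in zip(rolling, recursive):
    print(
        ", ".join(month_label(t) for t in targets),
        f"{month_label(r_start)}-{month_label(r_end)}",
        f"{month_label(c_start)}-{month_label(c_end)}",
        sep=" | ",
    )
```

In an actual evaluation, the model would be re-estimated on each sample and its three predictions compared with the observed values of the target months.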
Dynamic prediction does not exploit values of predicted variable lying in the
prediction horizon (even if these values are known), but replaces them by
corresponding predictions. Therefore, in the situation described above, the dynamic
prediction will be
i.e., the value $y_{t+1}$ in (2.7) is replaced by the one-step-ahead prediction $\hat{y}_{t+1}(t)$,
ignoring the possibility that the value $y_{t+1}$ may be known at time t. Obviously, the
dynamic predictions are not as accurate as the static predictions (if the static
predictions are feasible).
The important aspect of prediction consists in the measuring of its accuracy based on
the error of prediction. The error $e_t$ of prediction $\hat{y}_t$ (when predicting value $y_t$) is
defined as
$$ e_t = y_t - \hat{y}_t. \tag{2.9} $$
The error of prediction cannot be calculated until the time when we know the actual
value yt (this value has been unknown at the time of prediction). However in
practice, when assessing the quality of prediction, one sometimes “predicts”
known values of time series to compare these predictions with the known actual
values (see the hold-out sample described above).
The main source of prediction errors consists in the residual component of time
series since it represents unpredictable (unsystematic) fluctuations in data. If the
participation of this component in time series is significant, then the possibility of
construction of reliable predictions is limited. On the other hand, the size of
prediction error depends also on the quality of predictions for systematic compo-
nents of time series. Therefore, significant prediction errors may indicate either an
extraordinary participation of residual component or inappropriateness of prediction
methodology.
In any case, the examination of error of prediction is useful. If the prediction
technique masters predictions of systematic components, then the prediction errors
reflect the influence of residual component only (see Fig. 2.4a). On the contrary,
Fig. 2.4b–d shows the cases when the prediction technique failed due to inappropri-
ate prediction of trend, seasonal, and cyclical components, respectively.
The measures of prediction accuracy assess the development of predictions in time.
We will give below the usual measures of this type for a simple situation when one
assesses in total the accuracy of predictions $\hat{y}_{n+1}, \ldots, \hat{y}_{n+h}$ of values $y_{n+1}, \ldots, y_{n+h}$
(here it does not matter if we assess a multi-stage prediction or a sequence of
one-step-ahead predictions, static or dynamic predictions, or other types of predictions):
1. Sum of squared errors SSE:
$$ \mathrm{SSE} = \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2 = \sum_{t=n+1}^{n+h} e_t^2. \tag{2.10} $$
2. Mean squared error MSE:
$$ \mathrm{MSE} = \frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2 = \frac{1}{h} \sum_{t=n+1}^{n+h} e_t^2. \tag{2.11} $$
[Fig. 2.4: prediction errors $e_t$ plotted against time t (years) in four panels (a)–(d)]
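MSE is a popular quadratic loss function. Some software packages decompose MSE into
three components:
$$ \frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2 = \left( \bar{\hat{y}} - \bar{y} \right)^2 + \left( s_{\hat{y}} - s_y \right)^2 + 2 \left( 1 - r_{\hat{y}y} \right) s_{\hat{y}} s_y \tag{2.12} $$
($\bar{\hat{y}}$, $\bar{y}$, $s_{\hat{y}}$, $s_y$ are the corresponding sample means and (biased) sample standard
deviations of values $\hat{y}$ and $y$; $r_{\hat{y}y}$ is the sample correlation coefficient between $\hat{y}$
and $y$). Usually, one uses relative values of these components, namely
(a) Proportional bias:
$$ \frac{\left( \bar{\hat{y}} - \bar{y} \right)^2}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} \tag{2.13} $$
(b) Proportional variance:
$$ \frac{\left( s_{\hat{y}} - s_y \right)^2}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} \tag{2.14} $$
(c) Proportional covariance:
$$ \frac{2 \left( 1 - r_{\hat{y}y} \right) s_{\hat{y}} s_y}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} \tag{2.15} $$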
These proportional components have obviously the unit sum, in which the
proportional bias indicates the distance of average of predictions from average
of future values, the proportional variance indicates the distance of variance of
predictions from variance of future values, and the proportional covariance covers
the remaining unsystematic part of prediction error (any “good” prediction
technique has proportional bias and proportional variance small so that the
unsystematic component prevails in such a case).
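The unit sum of the three proportional components can be checked numerically. The sketch below follows (2.12)–(2.15) with biased (divide-by-h) standard deviations; the function name `mse_decomposition` and the data are hypothetical and serve only as an illustration.

```python
import math


def mse_decomposition(actual, predicted):
    """Return MSE and its proportional bias, variance and covariance parts."""
    h = len(actual)
    mean_y = sum(actual) / h
    mean_p = sum(predicted) / h
    # Biased standard deviations (divisor h), as required by (2.12).
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in actual) / h)
    s_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / h)
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(actual, predicted)) / h
    r = cov / (s_y * s_p)  # sample correlation between predictions and actual values
    mse = sum((y - p) ** 2 for y, p in zip(actual, predicted)) / h
    bias = (mean_p - mean_y) ** 2 / mse
    variance = (s_p - s_y) ** 2 / mse
    covariance = 2 * (1 - r) * s_p * s_y / mse
    return mse, bias, variance, covariance


# Hypothetical actual values and predictions, for illustration only.
actual = [1.02, 1.05, 1.04, 1.08, 1.07, 1.10]
predicted = [1.03, 1.04, 1.06, 1.06, 1.09, 1.08]

mse, bias, variance, covariance = mse_decomposition(actual, predicted)
print(f"MSE = {mse:.6f}")
print(f"bias = {bias:.3f}, variance = {variance:.3f}, covariance = {covariance:.3f}")
```

The sketch assumes that neither series is constant and that the predictions are not perfect; otherwise the correlation or the proportions are undefined.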
3. Root mean squared error RMSE:
$$ \mathrm{RMSE} = \sqrt{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} = \sqrt{\frac{1}{h} \sum_{t=n+1}^{n+h} e_t^2}. \tag{2.16} $$
RMSE modifies MSE in order to be measured in the same units as the given time
series.
4. Mean absolute error MAE:
$$ \mathrm{MAE} = \frac{1}{h} \sum_{t=n+1}^{n+h} |y_t - \hat{y}_t| = \frac{1}{h} \sum_{t=n+1}^{n+h} |e_t|. \tag{2.17} $$
various prediction techniques in similar time series. Now we will present further
measures which do not depend on the time series scale:
5. Mean absolute percentage error MAPE:
$$ \mathrm{MAPE} = \frac{100}{h} \sum_{t=n+1}^{n+h} \left| \frac{y_t - \hat{y}_t}{y_t} \right|. \tag{2.18} $$
6. Adjusted mean absolute percentage error AMAPE:
$$ \mathrm{AMAPE} = \frac{100}{h} \sum_{t=n+1}^{n+h} \left| \frac{y_t - \hat{y}_t}{(y_t + \hat{y}_t)/2} \right|. \tag{2.19} $$
AMAPE rectifies the asymmetry of the criterion MAPE in (2.18): it
provides the same result even if one swaps the real value and its prediction (e.g.,
real value 0.7 and prediction 0.9 give the same value in (2.19) as real value 0.9
and prediction 0.7).
7. Theil’s U-statistic [see Theil (1966)]:
$$ U = \frac{\sqrt{\sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2}}{\sqrt{\sum_{t=n+1}^{n+h} \hat{y}_t^2} + \sqrt{\sum_{t=n+1}^{n+h} y_t^2}}. \tag{2.20} $$
U always lies between 0 and 1 (e.g., U = 0 means the perfect coincidence of
prediction with reality).
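The measures (2.10), (2.11), and (2.16)–(2.20) are simple enough to be collected in one helper. The following is a minimal sketch; the function name `accuracy_measures` is hypothetical, and the code assumes nonzero actual values (for MAPE) and nonzero sums $y_t + \hat{y}_t$ (for AMAPE).

```python
import math


def accuracy_measures(actual, predicted):
    """Compute SSE, MSE, RMSE, MAE, MAPE, AMAPE and Theil's U for h predictions."""
    h = len(actual)
    errors = [y - p for y, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)
    norm_pred = math.sqrt(sum(p * p for p in predicted))
    norm_actual = math.sqrt(sum(y * y for y in actual))
    return {
        "SSE": sse,
        "MSE": sse / h,
        "RMSE": math.sqrt(sse / h),
        "MAE": sum(abs(e) for e in errors) / h,
        "MAPE": 100 / h * sum(abs(e / y) for e, y in zip(errors, actual)),
        "AMAPE": 100 / h * sum(
            abs(e / ((y + p) / 2)) for e, y, p in zip(errors, actual, predicted)
        ),
        "U": math.sqrt(sse) / (norm_pred + norm_actual),
    }


# The swap example from the text: AMAPE is symmetric, MAPE is not.
first = accuracy_measures([0.7], [0.9])
swapped = accuracy_measures([0.9], [0.7])
print(first["MAPE"], swapped["MAPE"])
print(first["AMAPE"], swapped["AMAPE"])
```

Swapping the real value and its prediction changes MAPE (the denominator switches from 0.7 to 0.9) but leaves AMAPE unchanged, because its denominator is the average of both values.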
A further group of measures of prediction accuracy only indicates whether the
model predicts correct signs of future values (i.e., whether these values will be
positive or negative) or predicts correct direction changes (i.e., whether an
increase changes to a decrease and the like). From the strategic point of view,
such predictions are often more important than numerical predictions:
8. Percentage of correct sign predictions:
$$ \frac{100}{h} \sum_{t=n+1}^{n+h} z_t, \quad \text{where } z_t = \begin{cases} 1 & \text{for } y_t \, \hat{y}_t > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{2.21} $$
9. Percentage of correct direction change predictions:
$$ \frac{100}{h} \sum_{t=n+1}^{n+h} z_t, \quad \text{where } z_t = \begin{cases} 1 & \text{for } (y_t - y_{t-1}) (\hat{y}_t - y_{t-1}) > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{2.22} $$
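Both percentages in (2.21) and (2.22) reduce to counting positive products. A minimal sketch follows; the function names and the data (hypothetical returns in %) are assumptions made for illustration.

```python
def pct_correct_sign(actual, predicted):
    """Percentage of times with y_t * yhat_t > 0, see (2.21)."""
    hits = sum(1 for y, p in zip(actual, predicted) if y * p > 0)
    return 100 * hits / len(actual)


def pct_correct_direction(previous, actual, predicted):
    """Percentage of times with (y_t - y_{t-1}) * (yhat_t - y_{t-1}) > 0, see (2.22).

    previous[k] holds the observed value y_{t-1} belonging to the k-th predicted time t.
    """
    hits = sum(
        1
        for y_prev, y, p in zip(previous, actual, predicted)
        if (y - y_prev) * (p - y_prev) > 0
    )
    return 100 * hits / len(actual)


# Hypothetical data: last value before the prediction horizon, then four periods.
last_observed = 0.4
actual = [0.5, -0.2, 0.1, -0.3]
predicted = [0.3, 0.1, 0.2, -0.1]
previous = [last_observed] + actual[:-1]

sign_pct = pct_correct_sign(actual, predicted)
direction_pct = pct_correct_direction(previous, actual, predicted)
print(sign_pct, direction_pct)
```

In this toy data set, the sign is missed in the second period and the direction of change in the first period, so both percentages equal 75.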
Remark 2.2 One should stress once more that the given measures concern the
statistical accuracy of predictions only. They by no means justify the economic
or financial adequacy of predictions. For instance, small values of MSE do not mean
that we have at our disposal a successful scheme for predicting future market strategies (e.g.,
sometimes it can be desirable from the strategic point of view to underestimate or
overestimate the future development and the like).
⋄
If both the combined predictions are unbiased (i.e., $E(e_1) = 0$ and $E(e_2) = 0$) with
finite mean squared errors denoted as $\sigma_1^2 = E(e_1^2)$ and $\sigma_2^2 = E(e_2^2)$ and covariance
denoted as $\sigma_{12} = \mathrm{cov}(e_1, e_2) = E(e_1 e_2)$, then the corresponding optimal weights are
$$ w = \frac{\sigma_2^2 - \sigma_{12}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}, \quad 1 - w = \frac{\sigma_1^2 - \sigma_{12}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}. \tag{2.25} $$
According to (2.25), greater weights are assigned to more precise models with
smaller $\sigma_1^2$ or $\sigma_2^2$. Moreover, the weights can be negative if $\sigma_{12} > \sigma_2^2$ or $\sigma_{12} > \sigma_1^2$
(the negative weight assigned to a prediction component means that this component
is replaced in the prediction combination by other prediction components
with lower prediction errors). The weakly correlated prediction errors enable us to
rewrite the weights as functions of the relative variance $\sigma_2^2 / \sigma_1^2$:
$$ w \approx \frac{\sigma_2^2 / \sigma_1^2}{1 + \sigma_2^2 / \sigma_1^2}, \quad 1 - w \approx \frac{1}{1 + \sigma_2^2 / \sigma_1^2}. $$
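The weights in (2.25) can be evaluated directly. The sketch below assumes that w multiplies the first prediction (with error variance $\sigma_1^2$), as the formula suggests; the function names and the numerical values are hypothetical.

```python
def optimal_weights(var1, var2, cov12):
    """Weights (w, 1 - w) from (2.25) for predictions with error variances var1, var2."""
    denom = var1 + var2 - 2 * cov12
    w = (var2 - cov12) / denom
    return w, 1 - w


def combined_variance(w, var1, var2, cov12):
    """Error variance of the combination w * prediction1 + (1 - w) * prediction2."""
    return w * w * var1 + (1 - w) ** 2 * var2 + 2 * w * (1 - w) * cov12


# Hypothetical error variances: the second prediction is more precise.
var1, var2, cov12 = 4.0, 1.0, 0.5
w, w_complement = optimal_weights(var1, var2, cov12)
best = combined_variance(w, var1, var2, cov12)
print(w, w_complement, best)

# A covariance above the smaller variance produces a negative weight.
w_neg, _ = optimal_weights(4.0, 1.0, 1.5)
print(w_neg)
```

With these numbers, the more precise second prediction receives the weight 0.875, and the combined error variance 0.9375 lies below both individual variances.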
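The result (2.25) can be generalized easily if one combines m predictions. In the
literature and in software systems (R packages, EViews, and others), many
strategies for combining predictions are suggested, e.g.:
• Equal-weighted predictions:
$$ \hat{y}_t = \frac{1}{m} \sum_{i=1}^{m} \hat{y}_{it}. \tag{2.26} $$
• Median prediction:
$$ \hat{y}_t = \mathrm{median}(\hat{y}_{1t}, \ldots, \hat{y}_{mt}). \tag{2.27} $$
• Trimmed mean prediction:
$$ \hat{y}_t = \frac{1}{m(1 - 2\lambda)} \sum_{i=\lfloor \lambda m + 1 \rfloor}^{\lfloor (1-\lambda) m \rfloor} \hat{y}_{it}. \tag{2.28} $$
• Predictions weighted inversely to MSE:
$$ \hat{y}_t = \frac{1}{\sum_{j=1}^{m} \mathrm{MSE}_j^{-1}} \sum_{i=1}^{m} \mathrm{MSE}_i^{-1} \, \hat{y}_{it}, \tag{2.29} $$
• Predictions weighted inversely to rank:
$$ \hat{y}_t = \frac{1}{\sum_{j=1}^{m} R_j^{-1}} \sum_{i=1}^{m} R_i^{-1} \, \hat{y}_{it}, \tag{2.30} $$
where $R_i$ is the rank of prediction $\hat{y}_{it}$ (e.g., the smallest prediction has the rank 1).
This weighting scheme which weights predictions inversely to their rank seems to
be surprisingly robust [see Timmermann (2006)].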
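The combination strategies listed above can be sketched in a few lines. All names and data below are hypothetical. Two assumptions are made explicit: the trimmed mean is taken over the predictions ordered by size and divides by the number of retained terms (which equals m(1 − 2λ) when λm is an integer), and the ranks follow the convention stated in the text (the smallest prediction has rank 1).

```python
import math
import statistics


def equal_weighted(preds):
    """Simple average of the m predictions."""
    return sum(preds) / len(preds)


def trimmed_mean(preds, lam):
    """Mean of ordered predictions with indices floor(lam*m + 1), ..., floor((1 - lam)*m)."""
    m = len(preds)
    ordered = sorted(preds)
    eps = 1e-9  # guards the floor operations against floating-point round-off
    lower = math.floor(lam * m + 1 + eps)  # 1-based lower index
    upper = math.floor((1 - lam) * m + eps)  # 1-based upper index
    kept = ordered[lower - 1:upper]
    return sum(kept) / len(kept)


def inverse_weighted(preds, scores):
    """Weights proportional to 1/score, where score is an MSE or a rank."""
    inverse = [1.0 / s for s in scores]
    return sum(w * p for w, p in zip(inverse, preds)) / sum(inverse)


def ranks(preds):
    """Rank 1 for the smallest prediction, rank m for the largest one."""
    order = sorted(range(len(preds)), key=lambda i: preds[i])
    result = [0] * len(preds)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


# Hypothetical predictions of five models and their past MSE values.
preds = [1.06, 1.02, 1.10, 1.04, 1.08]
past_mse = [0.0004, 0.0009, 0.0016, 0.0004, 0.0009]

combined = {
    "equal": equal_weighted(preds),
    "median": statistics.median(preds),
    "trimmed": trimmed_mean(preds, 0.2),
    "inverse_mse": inverse_weighted(preds, past_mse),
    "inverse_rank": inverse_weighted(preds, ranks(preds)),
}
for name, value in combined.items():
    print(f"{name}: {value:.4f}")
```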
The majority of methods of time series analysis in this publication concerns time
series that are modeled as random processes with continuous states in discrete time
(see Sect. 2.1). Therefore, the term “time series” means here usually the trajectory of
values y1, . . ., yn from a continuous interval on the real line which are observed in
(regular) discrete moments. For the sake of completeness, the remaining Sects. 2.3–
2.5 of this chapter are devoted to examples of time series with different character
(e.g., to time series which are modeled as random processes with discrete states in
continuous time; see Sect. 2.4) in order to get an idea on further possibilities in this
modeling framework.
Let us start with several examples of random processes with discrete states in
discrete time:
(the definition can be more general with asymmetric probabilities p and q). Its
trajectory may be interpreted as a record of results when tossing an ideal coin
(e.g., {1, 1, 1, 1, 1, 1, 1, . . .}; see Fig. 2.5).
Random walk (RW) on line is the integer-valued random process in discrete time
$$ \{Y_t, \ t = 0, 1, \ldots\}, \text{ where } Y_0 = 0; \quad Y_t = \sum_{i=1}^{t} X_i, \ t = 1, 2, \ldots; \quad X_t \text{ iid}; \quad P(X_t = 1) = p, \ P(X_t = -1) = q \quad (p + q = 1). \tag{2.32} $$
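A trajectory of the random walk (2.32) is obtained by accumulating iid steps of size +1 (with probability p) and −1 (with probability q = 1 − p). The following is a minimal simulation sketch; the function name and the seed are arbitrary choices.

```python
import random


def random_walk(n_steps, p=0.5, seed=None):
    """Simulate Y_0 = 0, Y_t = X_1 + ... + X_t with P(X = 1) = p, P(X = -1) = 1 - p."""
    rng = random.Random(seed)
    path = [0]
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path


path = random_walk(7, p=0.5, seed=1)
steps = [b - a for a, b in zip(path, path[1:])]
print(path)
print(steps)
```

The list `steps` is itself a trajectory of the ±1 steps, i.e., the record of the coin tosses that drives the walk.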
Here $Y_t$ describes the random number of members of the t-th generation: the initial 0th
generation has only one member, and the j-th member of the t-th generation gives rise to a
random number $Z_{tj}$ of members of the (t + 1)-th generation (see Fig. 2.7). For example, if
in a pyramid game each player finds a further three participants, then the corresponding
trajectory is {1, 3, 9, 27, . . .}.
[Fig. 2.7: scheme of the generations from time 0 to time t, with the members of the t-th generation numbered 1, 2, . . ., $Y_t$]
for all $t = 0, 1, \ldots$ and $i, j, i_0, \ldots, i_{t-1} = \ldots, -1, 0, 1, \ldots$ (in other words, the
probability of moving to the next state depends only on the present state and not on
the previous states). The probability on the right-hand side of (2.34) is the so-called
transition probability from state i at time t to state j at time t + 1. The important
special case is the homogenous Markov chain whose transition probabilities do not
depend on time, i.e.,
for all t. For example, the symmetric random walk (see above) is the homogenous
Markov chain with starting value $Y_0 = 0$ and with transition probabilities
$$ p_{ij} = \begin{cases} 1/2 & \text{for } j = i \pm 1, \\ 0 & \text{otherwise} \end{cases} \quad \text{for all } i, j = \ldots, -1, 0, 1, \ldots. \tag{2.36} $$
$$ \hat{p}_{ij} = \frac{n_{ij}}{\sum_{k} n_{ik}}, \tag{2.37} $$
where $n_{ij}$ is the number of transitions from state i to state j during a time unit in a
sample of observed trajectories.
Further one introduces in this context the n-step transition probabilities defined
(for simplicity, we constrain ourselves to homogenous Markov chains) as
which corresponds to a stable limit behavior of Markov chain. The vector π fulfills
$$ \pi' = \pi' P. \tag{2.43} $$
Example 2.2 (Markov chain). A bonus system in motor car (Casco) insurance has
three bonus levels denoted as 0, 1, 2 (representing, e.g., 100 %, 80 %, and 60 % of
basic insurance premiums): if the clients report no claims in the given year, their
bonus improves next year by one level or they remain at the best level 2; if they report
one or more claims, they drop next year by one level or they remain at the
worst level 0 (i.e., no malus level is introduced). The insurance company has a
stable insurance portfolio with 10,000 clients: 5000 are “good” drivers with
estimated probability of a loss-free year about 0.9 and 5000 are “bad” drivers with
estimated probability of a loss-free year about 0.8. The objective is to estimate the
stabilized numbers of clients in particular bonus levels.
The behavior of clients can be described by a homogenous Markov chain with
annual time units and with three possible states (then the trajectory for an individual
client is an annual time series jumping across particular levels 100 %, 80 %, and
60 %; see Fig. 2.8).
[Fig. 2.8: trajectory of an individual client across the bonus levels 100 %, 80 %, and 60 % over t = 0, 1, . . ., 7 (years)]
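Obviously, the transition matrices of good or bad drivers are
$$ \begin{pmatrix} 0.1 & 0.9 & 0 \\ 0.1 & 0 & 0.9 \\ 0 & 0.1 & 0.9 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 0.2 & 0.8 & 0 \\ 0.2 & 0 & 0.8 \\ 0 & 0.2 & 0.8 \end{pmatrix}. \tag{2.44} $$
The stationary distribution of good drivers must fulfill according to (2.43) the
system of linear equations
$$ (\pi_0 \ \ \pi_1 \ \ \pi_2) = (\pi_0 \ \ \pi_1 \ \ \pi_2) \begin{pmatrix} 0.1 & 0.9 & 0 \\ 0.1 & 0 & 0.9 \\ 0 & 0.1 & 0.9 \end{pmatrix}. \tag{2.45} $$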
Its solution is $\pi_0$ = 0.010 989, $\pi_1$ = 0.098 901, and $\pi_2$ = 0.890 109 so that
5000 × 0.010 989 ≈ 55 clients have the bonus level 0; similarly 494.5 clients have the
bonus level 1 and 4450.5 clients have the bonus level 2 among 5000 good drivers in
the portfolio.
Quite analogously we get that 238.1 clients have the bonus level 0, 952.4 clients
have the bonus level 1, and 3809.5 clients have the bonus level 2 among 5000 bad
drivers in the portfolio. Hence in the limit, the majority of good and also bad drivers will
achieve the best bonus level 2 (although this number is significantly lower among
bad drivers than among good drivers). In any case, the given bonus system is very
favorable for the insured.
⋄
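The stationary distributions of Example 2.2 can be cross-checked numerically. The sketch below does not solve the linear system (2.45) directly; it iterates the relation π′ ← π′P from a uniform start, which converges here because both chains are irreducible and aperiodic. The function name and the number of iterations are arbitrary choices.

```python
def stationary_distribution(P, n_iter=1000):
    """Approximate the stationary vector by repeated multiplication pi' <- pi' P."""
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
    return pi


# Transition matrices from (2.44).
good = [[0.1, 0.9, 0.0], [0.1, 0.0, 0.9], [0.0, 0.1, 0.9]]
bad = [[0.2, 0.8, 0.0], [0.2, 0.0, 0.8], [0.0, 0.2, 0.8]]

pi_good = stationary_distribution(good)
pi_bad = stationary_distribution(bad)

print("good drivers:", [round(5000 * p, 1) for p in pi_good])
print("bad drivers: ", [round(5000 * p, 1) for p in pi_bad])
```

The exact solutions are (1/91, 9/91, 81/91) for good drivers and (1/21, 4/21, 16/21) for bad drivers, which agree with the client numbers quoted in the example up to rounding.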
2.4 Random Processes with Discrete States in Continuous Time
$$ P(N_t = i) = e^{-\lambda t} \, \frac{(\lambda t)^i}{i!} \quad \text{for } i = 0, 1, \ldots \tag{2.47} $$
with mean value $\lambda t$. Hence it follows (without any additional assumptions) that the
periods $T_1, T_2, \ldots$ between particular occurrences of events are iid random variables
with exponential distribution and mean value $1/\lambda$ (this conclusion has a logical
interpretation: the mean number of occurrences per time unit is $\lambda \cdot 1$ so that the
mean period between two occurrences must be $1/\lambda$). The efficient estimate of the
intensity $\lambda$ is $\hat{\lambda} = n/T$, where n is the observed number of occurrences during
period T.
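The link between the exponential periods and the estimate $\hat{\lambda} = n/T$ can be illustrated by simulation. The sketch below generates the occurrence times by summing iid exponential gaps with mean 1/λ; the function name, the intensity, the observation period, and the seed are hypothetical choices.

```python
import random


def simulate_poisson_process(lam, horizon, seed=None):
    """Return the occurrence times in (0, horizon] built from iid exponential gaps."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(lam)  # first gap, exponential with mean 1/lam
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return times


lam, horizon = 2.0, 5000.0
events = simulate_poisson_process(lam, horizon, seed=42)

lam_hat = len(events) / horizon  # estimate n/T of the intensity
gaps = [b - a for a, b in zip([0.0] + events, events)]
mean_gap = sum(gaps) / len(gaps)  # should be close to 1/lam
print(f"estimated intensity: {lam_hat:.3f}, mean period: {mean_gap:.3f}")
```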
Fig. 2.9 Numbers of mortgage defaults in a bank credit portfolio from the beginning of year to time
t modeled as Poisson process ($T_1, T_2, \ldots$ are random periods between defaults; the plot marks $E(N_t) = \lambda \cdot t$ and $E(T_1) = E(T_2) = E(T_3) = 1/\lambda$)
Markov process (similarly to the Markov chain in Sect. 2.3) is a general scheme for
random processes with discrete states in continuous time. Let for simplicity the
possible states of this process be again integer numbers $i = \ldots, -1, 0, 1, \ldots$, and the
process can move across these states at any positive times with given transition
probabilities. Again the Markov property must hold
for all times $0 \le t_1 < \ldots < t_n < s \le s + t$ and $i, j, i_1, \ldots, i_n = \ldots, -1, 0, 1, \ldots$. The
probability on the right-hand side of (2.48) is the transition probability from state i at
time s to state j at time s + t. In the case of homogenous Markov process, it depends
only on time t
i.e.,
$$ p_{ii}(h) = 1 + q_{ii} h + o(h); \quad p_{ij}(h) = q_{ij} h + o(h), \ \text{where } i \ne j. \tag{2.52} $$
Markov process can be defined directly by transition intensities: in such a case the
transition probabilities and the probability distribution of Markov process are
obtained by solving so-called Kolmogorov differential equations.
Poisson process with intensity λ > 0 (see above) is the special case of homogenous
Markov process with
$$ p_{ij}(h) = \begin{cases} \lambda h + o(h) & \text{for } j = i + 1; \\ 1 - \lambda h + o(h) & \text{for } j = i; \\ o(h) & \text{for } j > i + 1; \\ 0 & \text{for } j < i, \end{cases} \tag{2.53} $$
i.e., in the interval of small length h the given event occurs just once with probability
λ h + o(h) (which is proportional approximately to the length of this interval) and
more than once with probability o(h).
Analogously one can define continuous Markov process [i.e., the Markov prop-
erty holds for continuous states in continuous time; see, e.g., Malliaris and Brock
(1982)].
For instance, the sinusoid with random amplitude and phase is the random process
with continuous states in continuous time $\{Y_t, \ t \ge 0\}$ defined as
Wiener process (or also Brownian motion) is the random process with continuous
states in continuous time $\{W_t, \ t \ge 0\}$, where
$$ \begin{cases} \text{(i)} & W_0 = 0; \\ \text{(ii)} & \text{particular trajectories are continuous in time;} \\ \text{(iii)} & W_{t_2} - W_{t_1}, \ldots, W_{t_n} - W_{t_{n-1}} \text{ are independent for arbitrary } 0 \le t_1 < \ldots < t_n; \\ \text{(iv)} & W_t - W_s \sim N(0, t - s) \text{ for arbitrary } 0 \le s < t. \end{cases} \tag{2.56} $$
In particular, the increments $W_{t+h} - W_t$ have the normal distribution N(0, h), and the
correlation structure of this process fulfills
Further, more sophisticated properties of Wiener process (valid with probability one)
are, e.g.,
• The particular trajectories are continuous but nowhere differentiable functions of time
(i.e., the derivative does not exist at any time point).
• The particular trajectories attain any real value infinitely many times.
• The particular trajectories have the fractal form (i.e., they “look similar at any
zoom”).
Wiener process is the basic concept of majority of financial models. After
transforming (to achieve a necessary trend, volatility, and the like), one can apply
it to model continuous movements of interest rates or asset prices (when jumps can
occur, one must combine Wiener process with Poisson process from Sect. 2.4).
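Important modifications in practice are the following processes (they are described in
more detail later in Chap. 10):
1. Wiener process with drift μ and volatility (or diffusion coefficient) σ:
$$ \{Y_t = \mu t + \sigma W_t, \ t \ge 0\}, \tag{2.58} $$
2. Exponential Wiener process (geometric Brownian motion) with $X_t = \mu t + \sigma W_t$:
$$ \{Y_t = e^{X_t} = e^{\mu t + \sigma W_t}, \ t \ge 0\}, \tag{2.59} $$
where $E(Y_t) = \exp\{(\mu + \sigma^2/2)\, t\}$ and $\mathrm{var}(Y_t) = \exp[(2\mu + \sigma^2)\, t] \, [\exp(\sigma^2 t) - 1]$.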
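The processes (2.58) and (2.59) can be simulated on a time grid by cumulating independent N(0, Δt) increments, as property (iv) of the Wiener process suggests. The following sketch also compares the Monte Carlo mean of $Y_1$ in (2.59) with the formula for $E(Y_t)$; all function names, parameter values, and the seed are hypothetical choices.

```python
import math
import random


def simulate_wiener(n_steps, dt, rng):
    """Wiener trajectory on the grid 0, dt, 2*dt, ... built from N(0, dt) increments."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


def with_drift(w, mu, sigma, dt):
    """Wiener process with drift mu and volatility sigma, see (2.58)."""
    return [mu * k * dt + sigma * w_k for k, w_k in enumerate(w)]


def exponential(x):
    """Exponential transformation of a trajectory, see (2.59)."""
    return [math.exp(v) for v in x]


rng = random.Random(7)
mu, sigma, dt, n_steps = 0.05, 0.2, 0.25, 4  # the grid covers the interval [0, 1]

# Monte Carlo check of E(Y_1) = exp(mu + sigma^2 / 2) for the process (2.59).
terminal = []
for _ in range(10000):
    w = simulate_wiener(n_steps, dt, rng)
    terminal.append(exponential(with_drift(w, mu, sigma, dt))[-1])

mc_mean = sum(terminal) / len(terminal)
theoretical_mean = math.exp(mu + sigma ** 2 / 2)
print(f"Monte Carlo mean: {mc_mean:.4f}, theoretical mean: {theoretical_mean:.4f}")
```

The discretization is exact at the grid points because the increments of the Wiener process over disjoint intervals are independent and normally distributed.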
Remark 2.3 One can refer to further examples of random processes with discrete
states or in continuous time which are more complex so that a specialized literature
should be consulted, e.g.:
• Binary process originating by clipping a stationary process (with continuous
states) where simply the values of this stationary process higher or equal to
zero are replaced by the value “1” and the values lower than zero are replaced
by “0” (a general threshold can be used instead of zero; see Kedem (1980)).
• Counting process of nonnegative integer random variables usually correlated
over time that modifies the Box–Jenkins methodology for integer-valued
processes:
– DARMA process (i.e., discrete mixed process) models a general stationary
series of counts with a given marginal distribution (binomial, geometric,
Poisson); see, e.g., Jacobs and Lewis (1983), McKenzie (1988). Sometimes
Markov chains present a suitable model scheme for such processes; see
MacDonald and Zucchini (1997).
– INAR process (i.e., integer autoregressive process) generates integer-valued
time series in a manner similar to the autoregressive recursive scheme for
continuous random variables; see, e.g., Al-Osh and Alzaid (1987), Kedem and
Fokianos (2002), Weiss (2018). In this context, one makes use of the so-called
thinning operator
$$ p \circ X = \sum_{i=1}^{X} Y_i, \tag{2.60} $$
where $\{Y_i, \ i = 1, 2, \ldots\}$ are iid Bernoulli (i.e., zero-one) random variables with
the probability of success equal to p.
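The thinning operator (2.60) is easy to sketch in code. The recursion used below, $X_t = p \circ X_{t-1} + \varepsilon_t$ with iid Poisson innovations $\varepsilon_t$, is the usual INAR(1) form from the cited literature and is an assumption here, since the text does not state it explicitly; the function names and parameter values are hypothetical.

```python
import random


def thin(p, x, rng):
    """Binomial thinning p∘x: the sum of x iid Bernoulli(p) variables, see (2.60)."""
    return sum(1 for _ in range(x) if rng.random() < p)


def poisson(lam, rng):
    """Poisson variate: the number of exponential gaps that fit into a unit interval."""
    count = 0
    t = rng.expovariate(lam)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count


def simulate_inar1(p, lam, n, seed=None):
    """Simulate X_t = p∘X_{t-1} + eps_t with Poisson(lam) innovations (assumed form)."""
    rng = random.Random(seed)
    x = [poisson(lam / (1 - p), rng)]  # start from the stationary mean level
    for _ in range(n - 1):
        x.append(thin(p, x[-1], rng) + poisson(lam, rng))
    return x


counts = simulate_inar1(p=0.5, lam=2.0, n=5000, seed=11)
sample_mean = sum(counts) / len(counts)
print(counts[:10], f"sample mean: {sample_mean:.3f}")
```

Under this assumed recursion, the stationary mean is λ/(1 − p), i.e., 4 for the parameters above, and the simulated series consists of nonnegative integers only.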
2.6 Exercises
Exercise 2.1 Carry out in practice (e.g., in a group of students) the Delphi method to
predict some topical economic or financial issue.
Exercise 2.2 Repeat the calculation from Example 2.2 (the bonus system in motor
car insurance), but for five bonus levels (e.g., 100%, 90%, 80%, 70%, and 60% of
basic insurance premiums). Moreover, apply for this bonus system the modified rule:
if they report one or more claims, the clients drop next year by two levels or they
remain at the worst level 0 (with 100% of basic insurance premiums).
Part II
Decomposition of Economic Time Series
Chapter 3
Trend
This chapter and Chaps. 4 and 5 describe various methods of additive and multipli-
cative decomposition which result in the elimination of particular components of
time series. In practice, it can have various motivations:
1. First and foremost, the analysis of separated (eliminated) components of time
series is useful from the practical point of view since one can detect in such a way
various patterns in behavior of time series, identify particular external effects
influencing records, and compare several time series and the like (e.g., using the
trend, securities dealers can compare the growth rate of various stocks, or using
the seasonal component, banks can assess the demand for commercial credits
during particular years).
2. Important objectives of decomposition are also predictions of future development
of particular components (e.g., which will be the growth rate of contracted
mortgages) or predictions of the (non-decomposed) time series constructed by
compounding predictions of particular components (which are relatively simple
and accurate predictions).
3. Sometimes due to the character of solved problems it is convenient to reveal the
behavior of given time series adjusted by removing some components. For
example, the economic and financial time series are frequently seasonally
adjusted (this seasonal adjustment is even demanded for economic time series
reported officially by government statistical offices; see also Sect. 2.2.2).
The methods of elimination of particular components of time series differ by
various levels of objectivity, accuracy, and computational complexity. The choice of
relevant method depends on the motivation for decomposition and on the type of
analyzed time series. The methods based on the regression approach are often very
popular in this context mostly under the assumption that the residual component is
uncorrelated and homoscedastic in time [see the concept of white noise in (2.1)].
Moreover, the normal distribution of the residual component is sometimes assumed
(and justified by the Central Limit Theorem since the residuals are resultants of many
random effects). If the time series is contaminated by outliers, then one should use
robust decomposition methods which are insensitive to outliers (e.g., one should
apply the median instead of the sample average when estimating the constant level of
time series).
This chapter is devoted to methods suggested to eliminate the trend component from
time series and to extrapolate this component to the future. In this context, one
speaks of smoothing of time series since the seasonal (sometimes even periodic) and
random fluctuations of time series are damped down simultaneously. While in Sect.
3.1 we will deal with the classical methods of elimination of trend, in Sects. 3.2 and
3.3 we will present the adaptive approaches that take into account local changes in
the character of trend (e.g., the changes in the slope of linear trend).
irrelevant local turning points which are not characteristic for cycle identification;
see Fig. 3.1).
Section 3.1.2 describes the methods that express the trend analytically by simple
curves used in mathematics (e.g., by the line or logarithmic curve). Such estimated
curves make it possible to calculate their future values in a natural way, i.e.,
to construct predictions of the trend component (under the assumption that its
character will persist in the future).
Using this philosophy, one usually assumes that the analyzed time series can be
modeled as
$$y_t = Tr_t + E_t \tag{3.1}$$
(or that one has transformed the time series to this form by methods described in the
following chapters, e.g., by means of seasonal adjustment). Moreover, the
residual component in (3.1) has the properties of white noise. These assumptions
permit applying suitable regression methods when estimating the parameters of trend
curves, and then taking directly the corresponding regression extrapolations for Trt
as the predictions for yt.
The choice of type of the most appropriate mathematical curve for particular time
series is based on a preliminary analysis, usually by means of graphical records of
time series or by using expected properties of the trend component following, e.g.,
from the economic theory (however, it is obvious that one cannot suppress
completely subjective impacts here). Several reference tests for the choice of the
most appropriate mathematical curves for given trajectory y1, . . ., yn are shown in
Table 3.6. There also exist systematic typologies where the controlled movement
along particular knots of the typological tree offers the most appropriate curve
according to answers to selecting questions (e.g., “the analyzed trajectory is/isn’t
symmetric around the point of inflection ?”).
Now we will survey favorite trend curves including the formulas for estimation of
their parameters and for construction of their (point and interval) predictions in the
time series models of the type (3.1):
$$Tr_t = \beta_0 + \beta_1 t, \quad t = 1, \dots, n. \tag{3.2}$$
The OLS estimates b0 and b1 of parameters β0 and β1 fulfill the system of normal
equations
$$
\begin{aligned}
b_0\, n + b_1 \sum_{t=1}^{n} t &= \sum_{t=1}^{n} y_t,\\
b_0 \sum_{t=1}^{n} t + b_1 \sum_{t=1}^{n} t^2 &= \sum_{t=1}^{n} t\, y_t
\end{aligned}
\tag{3.3}
$$
$$\hat{y}_T = b_0 + b_1 T \tag{3.5}$$
and the (1 − p)100% prediction interval for this value (e.g., the 95% interval if p =
0.05; see Sect. 2.2.3) under the normality assumption (or normality achieved
asymptotically) is
$$\left(b_0 + b_1 T - t_{1-p/2}(n-2)\, s\, f_T,\;\; b_0 + b_1 T + t_{1-p/2}(n-2)\, s\, f_T\right), \tag{3.6}$$
where
$$
s = \sqrt{\frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{n-2}} = \sqrt{\frac{\sum_{t=1}^{n} y_t^2 - \sum_{t=1}^{n} \hat{y}_t^2}{n-2}},
\qquad
f_T = \sqrt{1 + \frac{1}{n} + \frac{\left(T - \frac{n+1}{2}\right)^2}{\frac{n(n^2-1)}{12}}}.
\tag{3.7}
$$
Example 3.1 (linear trend). Table 3.1 and Fig. 3.2 present the elimination of the linear
trend in the time series yt of the Swiss gross national income (at current prices in
billions of CHF) for particular years 1980–2015 (t = 1, ..., 36). The spreadsheet of
EViews 7 in Table 3.2 and the predictions for particular years 2016–2025 in
Table 3.1 coincide with the calculations according to the formulas (3.4) and (3.5)
Table 3.1 Annual data 1980–2015, eliminated linear trend, and predictions for years 2016–2025 in
Example 3.1 (Swiss gross national income in bn CHF)
t Year yt (bn CHF) ŷt (bn CHF) t Year yt (bn CHF) ŷt (bn CHF)
1 1980 203.9 211.7 24 2003 505.9 517.5
2 1981 220.7 225.0 25 2004 520.5 530.8
3 1982 231.4 238.3 26 2005 550.8 544.1
4 1983 239.4 251.6 27 2006 579.2 557.4
5 1984 257.7 264.9 28 2007 577.4 570.7
6 1985 273.2 278.2 29 2008 559.0 584.0
7 1986 284.6 291.5 30 2009 599.1 597.3
8 1987 294.9 304.8 31 2010 642.8 610.6
9 1988 315.3 318.1 32 2011 624.3 623.9
10 1989 339.4 331.4 33 2012 637.6 637.2
11 1990 365.8 344.7 34 2013 649.6 650.5
12 1991 382.4 358.0 35 2014 649.8 663.8
13 1992 389.2 371.3 36 2015 660.3 677.1
14 1993 399.6 384.6 37 2016 690.4
15 1994 406.1 397.9 38 2017 703.7
16 1995 414.5 411.2 39 2018 717.0
17 1996 419.0 424.5 40 2019 730.3
18 1997 435.0 437.8 41 2020 743.6
19 1998 448.9 451.0 42 2021 756.9
20 1999 460.4 464.3 43 2022 770.2
21 2000 489.3 477.6 44 2023 783.5
22 2001 488.9 490.9 45 2024 796.7
23 2002 482.4 504.2 46 2025 810.0
Source: AMECO (European Commission Annual Macro-Economic Database). (https://ec.europa.
eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
$$b_1 = \frac{\sum_{t=1}^{36} t\, y_t - \frac{36+1}{2} \sum_{t=1}^{36} y_t}{\frac{36(36^2-1)}{12}} = \frac{51\,655.23}{3\,885} = 13.296\,07,$$
$$b_0 = \bar{y} - b_1 \frac{36+1}{2} = 444.400\,9 - 18.5 \cdot 13.296\,07 = 198.423\,6,$$
etc. Further according to (3.6) and (3.7), we calculated the 95% prediction intervals,
e.g.:
Fig. 3.2 Annual data 1980–2015, eliminated linear trend, and predictions for years 2016–2025 in
Example 3.1 (Swiss gross national income in bn CHF)
$$s = \sqrt{\frac{\sum_{t=1}^{36} y_t^2 - \sum_{t=1}^{36} \hat{y}_t^2}{34}} = 13.131\,00,$$
$$f_{37} = \sqrt{1 + \frac{1}{36} + \frac{\left(37 - \frac{36+1}{2}\right)^2}{\frac{36(36^2-1)}{12}}} = 1.056,$$
i.e.,
$$(662.2;\ 718.6),$$
etc. (see Fig. 3.2).
⋄
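The calculations of Example 3.1 can be retraced with a short script. The following is only a minimal sketch: the function names are hypothetical, the closed form for b1 and b0 is the one used in the example above, and the t-quantile is passed in as a parameter (the value 2.0322 for 34 degrees of freedom used in the note below is an assumption taken from standard tables, not from the text).

```python
import math


def fit_linear_trend(y):
    """OLS estimates (b0, b1) of Tr_t = b0 + b1*t for t = 1, ..., n."""
    n = len(y)
    t_mean = (n + 1) / 2
    # numerator: sum(t*y_t) - t_mean*sum(y_t); denominator: n(n^2-1)/12
    num = sum(t * yt for t, yt in enumerate(y, start=1)) - t_mean * sum(y)
    den = n * (n * n - 1) / 12
    b1 = num / den
    b0 = sum(y) / n - b1 * t_mean
    return b0, b1


def residual_std(y, b0, b1):
    """Residual standard deviation s from (3.7), with n - 2 degrees of freedom."""
    n = len(y)
    sse = sum((yt - (b0 + b1 * t)) ** 2 for t, yt in enumerate(y, start=1))
    return math.sqrt(sse / (n - 2))


def f_T(T, n):
    """Factor f_T from (3.7) for a prediction at time T."""
    return math.sqrt(1 + 1 / n + (T - (n + 1) / 2) ** 2 / (n * (n * n - 1) / 12))


def prediction_interval(T, n, b0, b1, s, t_quantile):
    """Interval (3.6); t_quantile stands for t_{1-p/2}(n-2), supplied by the caller."""
    center = b0 + b1 * T
    half_width = t_quantile * s * f_T(T, n)
    return center - half_width, center + half_width
```

With the rounded values of Example 3.1 (n = 36, b0 = 198.4236, b1 = 13.29607, s = 13.131) and the assumed quantile 2.0322, the call `prediction_interval(37, 36, 198.4236, 13.29607, 13.131, 2.0322)` gives approximately (662.2; 718.6).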
Remark 3.1 As far as polynomial trends of higher orders are concerned, the quadratic
trend can also be found in economic and financial applications
$$Tr_t = \beta_0 + \beta_1 t + \beta_2 t^2, \quad t = 1, \dots, n. \tag{3.8}$$
$$\hat{y}_T = b_0 + b_1 T + b_2 T^2, \tag{3.9}$$
where
$$s = \sqrt{\frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{n-3}} = \sqrt{\frac{\sum_{t=1}^{n} y_t^2 - \sum_{t=1}^{n} \hat{y}_t^2}{n-3}},$$
$$
f_T = \sqrt{1 + (1, T, T^2)\,(X'X)^{-1}\,(1, T, T^2)'},
\qquad
X = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ \vdots & \vdots & \vdots \\ 1 & n & n^2 \end{pmatrix}.
\tag{3.11}
$$
⋄
3.1.2.2 Exponential Trend
(the parameters are denoted as α and β). This trend has two typical characteristics,
namely that both its coefficient of growth (i.e., the ratio of neighboring values
$Tr_{t+1}/Tr_t$) and the ratio of neighboring differences
$$\frac{Tr_{t+2} - Tr_{t+1}}{Tr_{t+1} - Tr_t} \tag{3.13}$$
are constant in time with value β. If α > 0, then the exponential trend is increasing
for β > 1 and decreasing for 0 < β < 1. Both parameters of the exponential trend can
be estimated by taking its logarithm, which transforms this trend to the linear one
$$\ln Tr_t = \ln \alpha + t \ln \beta. \tag{3.14}$$
Then it is sufficient to take the antilogarithm of the estimated parameters ln α and ln
β (in any case, if one conjectures that an analyzed trend could be exponential, one
should plot the corresponding time series on a logarithmic scale). Moreover, by
taking the antilogarithm of the prediction intervals in the linear model (3.14), one can
also construct prediction intervals in the original exponential model. On the other
hand, practical experience with the exponential trend (3.12) (and also with other
models of nonlinear regression which can be transformed to linear regression
by a suitable transformation) shows that more consistent estimation results can be
obtained using the weighted least squares method (WLS) with weights which are
obtained by a suitable transformation of the original weights, since it is not possible
to assume the multiplicative form and the log-normal distribution of the residual
component in the original model (3.12) before transformation. In particular, for the
exponential trend, the WLS method consists in minimizing the expression
$$\sum_{t=1}^{n} v_t \left(y_t - \alpha \beta^t\right)^2, \tag{3.15}$$
where the weights vt are chosen in advance. However, instead of the expression
(3.15), one minimizes the weighted sum of squares of the form
$$\sum_{t=1}^{n} w_t \left(\ln y_t - \ln \alpha - t \ln \beta\right)^2, \tag{3.16}$$
where the weights wt are constructed from the original weights vt in such
a way that the minimization of (3.15) and (3.16) provides nearly identical estimates of α
and β. It can be shown that in our case of logarithmic transformation one can put
$$w_t = y_t^2 v_t, \quad t = 1, \dots, n. \tag{3.17}$$
Since the most usual choice of original weights is vt = 1, t = 1, ..., n (if there is
a priori no reason to prefer some of the given observations), the transformed weights are
simply $w_t = y_t^2$, t = 1, ..., n. Minimizing the expression (3.16) with the weights
(3.17), one obtains the following system of normal equations:
$$
\begin{aligned}
\Big(\sum y_t^2\Big) \ln \alpha + \Big(\sum t\, y_t^2\Big) \ln \beta &= \sum y_t^2 \ln y_t,\\
\Big(\sum t\, y_t^2\Big) \ln \alpha + \Big(\sum t^2 y_t^2\Big) \ln \beta &= \sum t\, y_t^2 \ln y_t
\end{aligned}
\tag{3.18}
$$
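The normal equations of the weighted fit form a 2 × 2 linear system in ln α and ln β, so they can be solved by Cramer's rule. The following sketch (the function name is hypothetical, and the original weights are assumed to be vt = 1, so that the transformed weights are the squared observations) illustrates one way to do this; it requires all observations to be positive.

```python
import math


def fit_exponential_trend_wls(y):
    """Estimate (alpha, beta) of Tr_t = alpha * beta**t, t = 1, ..., n,
    by weighted least squares on ln(y_t) with weights w_t = y_t**2."""
    s_w = s_tw = s_ttw = s_wl = s_twl = 0.0
    for t, yt in enumerate(y, start=1):
        w = yt * yt                 # transformed weight w_t = y_t^2 (v_t = 1 assumed)
        log_y = math.log(yt)
        s_w += w                    # sum of y_t^2
        s_tw += t * w               # sum of t * y_t^2
        s_ttw += t * t * w          # sum of t^2 * y_t^2
        s_wl += w * log_y           # sum of y_t^2 * ln y_t
        s_twl += t * w * log_y      # sum of t * y_t^2 * ln y_t
    det = s_w * s_ttw - s_tw * s_tw
    ln_alpha = (s_wl * s_ttw - s_tw * s_twl) / det
    ln_beta = (s_w * s_twl - s_tw * s_wl) / det
    return math.exp(ln_alpha), math.exp(ln_beta)
```

For a series that follows the exponential curve exactly, e.g., `[2 * 1.05 ** t for t in range(1, 21)]`, the function recovers α = 2 and β = 1.05 up to rounding error.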
Example 3.2 (exponential trend). Table 3.3 and Fig. 3.3 present the elimination of the
exponential trend in the time series yt of the US gross national income (at current
prices in billions of USD) for particular years 1960–2016 (t = 1, ..., 57). The
auxiliary results for formulas (3.18) given in Table 3.4 make it possible to calculate the
estimated parameters:
so that the eliminated exponential trend (regarded as the smoothed time series in
practice) can be calculated in Table 3.3 as
Figure 3.3 also plots the modified exponential trend (see (3.19) below), which
obviously fits the given time series better than the exponential trend.
⋄
Table 3.3 Annual data 1960–2016 and eliminated exponential trend in Example 3.2 (US gross national income in bn USD)
Year t yt (bn USD) ŷt (bn USD) Year t yt (bn USD) ŷt (bn USD) Year t yt (bn USD) ŷt (bn USD)
1960 1 542.7 1410.5 1979 20 2619.3 3498.8 1998 39 9167.6 8679.1
1961 2 562.0 1479.5 1980 21 2852.8 3670.1 1999 40 9725.3 9104.2
1962 3 604.7 1552.0 1981 22 3207.1 3849.9 2000 41 10,421.2 9550.1
1963 4 638.1 1628.0 1982 23 3374.7 4038.5 2001 42 10,788.6 10,017.8
1964 5 686.4 1707.7 1983 24 3621.0 4236.3 2002 43 11,098.9 10,508.5
1965 6 744.6 1791.4 1984 25 4038.3 4443.7 2003 44 11,591.4 11,023.2
1966 7 815.9 1879.1 1985 26 4320.9 4661.4 2004 45 12,372.6 11,563.0
1967 8 862.4 1971.2 1986 27 4530.4 4889.7 2005 46 13,221.8 12,129.4
1968 9 943.3 2067.7 1987 28 4847.2 5129.2 2006 47 14,140.8 12,723.4
1969 10 1020.7 2169.0 1988 29 5275.8 5380.4 2007 48 14,585.8 13,346.6
1970 11 1076.9 2275.2 1989 30 5618.3 5643.9 2008 49 14,791.2 14,000.3
1971 12 1165.9 2386.6 1990 31 5922.9 5920.3 2009 50 14,494.5 14,686.0
1972 13 1283.9 2503.5 1991 32 6117.2 6210.3 2010 51 15,121.1 15,405.3
1973 14 1435.1 2626.1 1992 33 6459.4 6514.4 2011 52 15,802.9 16,159.8
1974 15 1557.0 2754.8 1993 34 6758.4 6833.5 2012 53 16,596.1 16,951.3
1975 16 1688.6 2889.7 1994 35 7195.8 7168.2 2013 54 17,073.7 17,781.5
1976 17 1874.0 3031.2 1995 36 7602.3 7519.3 2014 55 17,899.1 18,652.4
1977 18 2087.0 3179.7 1996 37 8075.4 7887.6 2015 56 18,496.1 19,565.9
1978 19 2355.0 3335.4 1997 38 8620.4 8273.9 2016 57 19,041.6 20,524.2
Σ 21,944.2 Σ 101,057.6 Σ 266,430.3
Source: AMECO (European Commission Annual Macro-Economic Database) (https://ec.europa.eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
Fig. 3.3 Annual data 1960–2016 and eliminated exponential trend in Example 3.2 (US gross
national income in bn USD)
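$$\sum\nolimits_1 y_t \approx \sum\nolimits_1 Tr_t = m\gamma + \frac{\alpha \beta (\beta^m - 1)}{\beta - 1},$$
$$\sum\nolimits_2 y_t \approx \sum\nolimits_2 Tr_t = m\gamma + \frac{\alpha \beta^{m+1} (\beta^m - 1)}{\beta - 1},$$
$$\sum\nolimits_3 y_t \approx \sum\nolimits_3 Tr_t = m\gamma + \frac{\alpha \beta^{2m+1} (\beta^m - 1)}{\beta - 1},$$
where, e.g., ∑1yt and ∑1Trt denote the sum of observed and trend values from the
first third of the time series, respectively. Solving this system of equations, one can
stepwise obtain the estimates b, a, c of parameters β, α, γ as
$$b = \left(\frac{\sum_3 y_t - \sum_2 y_t}{\sum_2 y_t - \sum_1 y_t}\right)^{1/m}, \tag{3.20}$$
$$a = \frac{b - 1}{b\,(b^m - 1)^2} \left(\sum\nolimits_2 y_t - \sum\nolimits_1 y_t\right), \tag{3.21}$$
$$c = \frac{1}{m} \left(\sum\nolimits_1 y_t - \frac{a\, b\, (b^m - 1)}{b - 1}\right). \tag{3.22}$$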
Another approach is also possible: if the value of parameter β is fixed, the model
(3.19) obviously becomes a linear model in which one simply estimates the
parameters α and γ. One does so for various fixed values of β and finally chooses
the variant minimizing SSE.
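The three-sums estimates (3.20)–(3.22) are easy to evaluate numerically. The sketch below is only illustrative: the function names are hypothetical, the series length is assumed to be exactly 3m, and the cases where the ratio of differences of sums is not positive or where b = 1 are not handled beyond a simple error.

```python
def three_sums_estimates(s1, s2, s3, m):
    """Estimates (a, b, c) of (alpha, beta, gamma) in Tr_t = gamma + alpha * beta**t
    from the sums s1, s2, s3 of the three consecutive thirds of length m."""
    ratio = (s3 - s2) / (s2 - s1)
    if ratio <= 0:
        raise ValueError("the three-sums method needs (s3 - s2) / (s2 - s1) > 0")
    b = ratio ** (1 / m)                                   # (3.20)
    a = (b - 1) / (b * (b ** m - 1) ** 2) * (s2 - s1)      # (3.21)
    c = (s1 - a * b * (b ** m - 1) / (b - 1)) / m          # (3.22)
    return a, b, c


def fit_modified_exponential(y):
    """Split y_1, ..., y_n (n = 3m assumed) into thirds and apply the three-sums method."""
    n = len(y)
    m = n // 3
    if m < 1 or n != 3 * m:
        raise ValueError("this sketch assumes a series length divisible by three")
    return three_sums_estimates(sum(y[:m]), sum(y[m:2 * m]), sum(y[2 * m:]), m)
```

For the three sums of Table 3.5 in Example 3.3, `three_sums_estimates(1580076, 7435239, 9680508, 19)` returns values close to a = −797 015, b = 0.950 804 and c = 583 001.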
Example 3.3 (modified exponential trend). Table 3.5 and Fig. 3.5 present the elimination
of the modified exponential trend in the time series yt of the Japan gross
national income (at current prices in billions of JPY) for particular years 1960–2016
(t = 1, ..., 57).
The data are divided into three groups (m = 19), for which particular sums are
calculated (see Table 3.5). Then the formulas (3.20)–(3.22) provide stepwise the
following results:
Table 3.5 Annual data 1960–2016 and eliminated modified exponential trend in Example 3.3
(Japan gross national income in bn JPY)
Year t yt Year t yt Year t yt
1960 1 16,421 1979 20 227,692 1998 39 519,390
1961 2 19,817 1980 21 246,449 1999 40 511,280
1962 3 22,480 1981 22 264,544 2000 41 516,340
1963 4 25,717 1982 23 278,328 2001 42 513,933
1964 5 30,225 1983 24 289,723 2002 43 507,189
1965 6 33,640 1984 25 308,153 2003 44 507,117
1966 7 39,080 1985 26 331,538 2004 45 513,112
1967 8 45,807 1986 27 346,885 2005 46 515,652
1968 9 54,222 1987 28 361,520 2006 47 521,152
1969 10 63,708 1988 29 388,725 2007 48 530,313
1970 11 75,124 1989 30 419,067 2008 49 518,002
1971 12 82,724 1990 31 452,267 2009 50 484,216
1972 13 94,845 1991 32 479,613 2010 51 495,651
1973 14 115,496 1992 33 492,078 2011 52 486,254
1974 15 137,541 1993 34 495,227 2012 53 490,386
1975 16 152,089 1994 35 499,681 2013 54 496,725
1976 17 170,819 1995 36 505,821 2014 55 506,607
1977 18 190,438 1996 37 517,710 2015 56 522,127
1978 19 209,883 1997 38 530,218 2016 57 525,062
Σ 1,580,076 Σ 7,435,239 Σ 9,680,508
Source: AMECO (European Commission Annual Macro-Economic Database) (https://ec.europa.
eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
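$$b = \left(\frac{\sum_3 y_t - \sum_2 y_t}{\sum_2 y_t - \sum_1 y_t}\right)^{1/m} = \left(\frac{9\,680\,508 - 7\,435\,239}{7\,435\,239 - 1\,580\,076}\right)^{1/19} = 0.950\,804,$$
$$a = \frac{b - 1}{b\,(b^m - 1)^2} \left(\sum\nolimits_2 y_t - \sum\nolimits_1 y_t\right) = \frac{0.950\,804 - 1}{0.950\,804\,(0.950\,804^{19} - 1)^2}\,(7\,435\,239 - 1\,580\,076) = -797\,015,$$
$$c = \frac{1}{m} \left(\sum\nolimits_1 y_t - \frac{a\, b\, (b^m - 1)}{b - 1}\right) = \frac{1}{19} \left(1\,580\,076 - \frac{(-797\,015) \cdot 0.950\,804\,(0.950\,804^{19} - 1)}{0.950\,804 - 1}\right) = 583\,001.$$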
The eliminated modified exponential trend (regarded again as the smoothed time
series) can be calculated as
Fig. 3.5 Annual data 1960–2016 and eliminated modified exponential trend in Example 3.3 (Japan
gross national income in bn JPY)
and it is plotted in Fig. 3.5 (moreover, the saturation “insurmountable” level for the
Japan gross national income according to this model should be approximately
583 000 bn JPY).
⋄
$$Tr_t = \frac{\gamma}{1 + \alpha \beta^t}, \quad t = 1, \dots, n \quad (\beta > 0,\ \gamma > 0). \tag{3.23}$$
$$\frac{dTr_t}{dt} = -\frac{\ln \beta}{\gamma}\, Tr_t\, (\gamma - Tr_t). \tag{3.24}$$
It is another important indicator of the growth of trend curves (in general, the first
derivative of a trend curve is usually called the growth function). According to
(3.24), the velocity of growth of the logistic trend is directly proportional to the achieved
level Trt and to the distance of the achieved level from the saturation level, i.e., γ −
Trt; see Fig. 3.6. Moreover, the first derivative (3.24) is symmetric around the inflection
point −ln α/ln β. Hence the logistic trend can be classified as a so-called S-curve
symmetric around the inflection point (S-curves have been discussed in Sect. 2.2.3,
e.g., as a suitable instrument for modeling sales of new products; see Fig. 2.2).
As far as the estimation of the logistic trend is concerned, its parameters can be estimated
by means of various methods. For example, the logistic trend can be regarded as the
reciprocal value of the modified exponential trend so that one can apply the formulas
(3.20)–(3.22) to the time series with values 1/yt. Another approach consists in the
so-called difference parametric estimation, which is based on the time series of the
first differences $y_{t+1} - y_t$. Here we approximate the trend component Trt in (3.24) by
the real observations yt so that one can write
$$\frac{dy_t}{dt} \approx -\frac{\ln \beta}{\gamma}\, y_t\, (\gamma - y_t). \tag{3.25}$$
If we approximate further
$$\frac{dy_t}{dt} \approx \frac{y_{t+1} - y_t}{(t+1) - t} = y_{t+1} - y_t = d_t, \tag{3.26}$$
where dt denotes the time series of the first differences, then it follows from (3.25)
$$\frac{d_t}{y_t} \approx -\ln \beta + \frac{\ln \beta}{\gamma}\, y_t. \tag{3.27}$$
Using the classical least squares method in the linear regression model
$$\frac{d_t}{y_t} = -\ln \beta + \frac{\ln \beta}{\gamma}\, y_t + \varepsilon_t \tag{3.28}$$
one obtains the OLS estimates of −ln β and ln β/γ and hence the estimates of
parameters β and γ. In order to obtain the estimate of α, we finally approximate Trt
by yt in (3.23)
$$\alpha \beta^t \approx \frac{\gamma}{y_t} - 1. \tag{3.29}$$
After taking the logarithm and making the sum over t = 1, ..., n one gets the so-called
Rhodes formula
$$\ln \alpha = -\frac{(n+1) \ln \beta}{2} + \frac{1}{n} \sum_{t=1}^{n} \ln \left(\frac{\gamma}{y_t} - 1\right), \tag{3.30}$$
$$\hat{y}_t = \frac{518\,158}{1 + 37.143\,7 \cdot 0.845\,748^t}.$$
Obviously the logistic trend fits the given time series better than the modified
exponential trend in Fig. 3.5 (e.g., the estimated values of the modified exponential
trend have turned out negative at the beginning of the given time series). The
saturation level for the Japan gross national income is approximately 518 160 bn JPY
in this case.
⋄
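The difference parametric estimation (3.28) and the Rhodes formula (3.30) can be sketched as follows. The function names are hypothetical. Note that (3.28) rests on the approximations (3.25)–(3.26), so the estimates are only approximate even for data lying exactly on a logistic curve, and that the Rhodes formula is defined only if all observations lie strictly below the value of γ that is used (the sketch raises an error otherwise instead of guessing a remedy).

```python
import math


def difference_estimates(y):
    """OLS in d_t / y_t = -ln(beta) + (ln(beta)/gamma) * y_t + eps_t, see (3.28).
    Returns the estimates (beta, gamma)."""
    x = y[:-1]
    z = [(y[i + 1] - y[i]) / y[i] for i in range(len(y) - 1)]   # d_t / y_t
    n = len(x)
    x_mean = sum(x) / n
    z_mean = sum(z) / n
    slope = (sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z))
             / sum((xi - x_mean) ** 2 for xi in x))
    intercept = z_mean - slope * x_mean
    ln_beta = -intercept            # the intercept estimates -ln(beta)
    gamma = ln_beta / slope         # the slope estimates ln(beta) / gamma
    return math.exp(ln_beta), gamma


def rhodes_alpha(y, beta, gamma):
    """Rhodes formula (3.30) for alpha, given beta and gamma."""
    if any(yt <= 0 or yt >= gamma for yt in y):
        raise ValueError("the formula needs 0 < y_t < gamma for all t")
    n = len(y)
    mean_log = sum(math.log(gamma / yt - 1) for yt in y) / n
    return math.exp(mean_log - (n + 1) * math.log(beta) / 2)
```

For data generated exactly by (3.23), `rhodes_alpha` with the true β and γ recovers α up to rounding error, while `difference_estimates` returns a saturation level somewhat below the true γ because of the discretization in (3.26).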
The trend in the form of this curve arises similarly to the logistic trend by
transforming the modified exponential trend. In this case, one puts
or equivalently
Fig. 3.7 Annual data 1960–2016 and eliminated logistic trend in Example 3.4 (Japan gross
national income in bn JPY)
With the parameter values from Fig. 3.8, the Gompertz trend has its
inflection at the point t = −ln(−α)/ln β and is bounded asymptotically. However,
the first derivative of this curve (i.e., the growth function) is not symmetric around
the inflection point, but it is skewed to the right. Hence the Gompertz trend is classified as
an S-curve asymmetric around the inflection point. The estimation procedure is similar
to that for the modified exponential trend using the time series with values ln yt.
Example 3.5 (Gompertz trend). Fig. 3.9 presents the elimination of Gompertz trend
for the data from Example 3.3 (Japan gross national income at current prices in
billions of JPY) estimated as
The model implies a saturation level for the Japan gross national income of
approximately 540 580 bn JPY.
Examples 3.1–3.5 demonstrate that time series of the same type (in our case the
gross national incomes) can be modeled using different trend curves. The choice of
the appropriate curve may depend on the economic or financial hypotheses: the
national income will not be saturated in the future (then, e.g., the linear or even
exponential trend) or there will be different levels of saturation (in the Japanese case
the lowest level 518 160 bn JPY applying the logistic trend or the highest one
583 000 bn JPY applying the modified exponential trend).
⋄
Fig. 3.9 Annual data 1960–2016 and eliminated Gompertz trend in Example 3.5 (Japan gross
national income in bn JPY)
Remark 3.2 The examples of trend curves given in this section can be parametrized
in other ways, e.g., logistic trend (3.23) as
$$Tr_t = \frac{\beta_0}{1 + \exp(\beta_1 + \beta_2 t)}, \quad t = 1, \dots, n, \tag{3.33}$$
$$Tr_t = \frac{\beta_0}{\exp\left(\beta_1 \beta_2^t\right)}, \quad t = 1, \dots, n. \tag{3.34}$$
$$Tr_t = \beta_0 + \beta_1 \ln t, \quad t = 1, \dots, n, \tag{3.35}$$
which seems to resemble the modified exponential trend in Fig. 3.4 except for the
fact that it grows indefinitely (not to a saturation level), or Johnson trend
$$Tr_t = \frac{\beta_0}{\exp\left(\dfrac{\beta_1}{\beta_2 + t}\right)}, \quad t = 1, \dots, n. \tag{3.36}$$
⋄
3.1.2.6 Splines
Sometimes the trend changes its character in time and cannot be modeled by means
of a single mathematical curve over the whole range of observations (or only in a
complicated way). In such a case, one can use the technique of so-called spline
functions. Here instead of applying sophisticated mathematical functions, one splits
the given time series to several segments and estimates the trends in particular
segments by simpler functions linked to each other. Moreover, the joint curve
must be sufficiently smooth which can be guaranteed, e.g., by means of conditions
for the existence of two-sided derivatives of appropriate orders in joint points.
Splines consist frequently of piecewise polynomials with pieces defined by a
sequence of knots where the pieces join smoothly.
As an example, Fuller (1976) used for the time series of average wheat yields
(in the USA in years 1908–1971) the trend which is compounded from the following
curves (t ¼ 1 corresponds to the year 1908):
$$
\begin{aligned}
Tr_t &= 13.97, & t &= 1, \dots, 25,\\
Tr_t &= 13.97 + 0.0123\,(t - 25)^2, & t &= 25, \dots, 54,\\
Tr_t &= 24.314 + 0.664\,(t - 54), & t &= 54, \dots, 64.
\end{aligned}
$$
In the first joint point t ¼ 25, there exists the two-sided derivative of the first order,
while in the second joint point t ¼ 54, the corresponding one-sided first derivatives
obviously differ from each other.
The simplest case of the spline function is a piecewise linear function that is
linear in all segments but with different slopes. However, such a function is
not flexible enough and, moreover, is not smooth in knots. In practice it is
common to use cubic splines with such cubic polynomials in particular segments
that the two-sided derivatives of the second order exist in the corresponding
knots (higher order polynomials can have erratic behavior at the boundaries of
the domain).
Penalized splines present a different approach to this issue [see, e.g., Eilers and
Marx (2010), Durbin and Koopman (2012)]. Suppose that we wish to approximate a
time series y1, . . ., yT by a relatively smooth function Trt. The penalized spline
method chooses Trt by minimizing
$$\sum_{t=1}^{T} (y_t - Tr_t)^2 + \lambda \sum_{t=3}^{T} \left(\Delta^2 Tr_t\right)^2 \tag{3.37}$$
with respect to Trt for given λ > 0. The penalty is based on the level of variation in
Trt measured by the second difference $\Delta^2 Tr_t = Tr_t - 2Tr_{t-1} + Tr_{t-2}$ [see (3.61)]. If λ
is small, the values of Trt will be close to the values of yt but Trt may not be smooth
enough. If λ is large, the Trt series will be smooth but the values of Trt may not be
close enough to the values of yt.
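Since (3.37) is quadratic in the values Tr1, ..., TrT, its minimizer solves the linear system (I + λD′D)Tr = y, where D is the matrix of second differences. The following sketch illustrates this under stated assumptions: the function names are hypothetical, the system is solved by plain Gaussian elimination on a dense matrix (adequate only for short series), and no particular choice of λ is implied.

```python
def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting for a small dense system A x = b."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot_row] = M[pivot_row], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def penalized_trend(y, lam):
    """Minimize sum (y_t - Tr_t)^2 + lam * sum (second difference of Tr_t)^2."""
    T = len(y)
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    coef = (1.0, -2.0, 1.0)          # Tr_{t-2} - 2 Tr_{t-1} + Tr_t
    for start in range(T - 2):       # add lam * D'D one second difference at a time
        for i, ci in enumerate(coef):
            for j, cj in enumerate(coef):
                A[start + i][start + j] += lam * ci * cj
    return solve_linear_system(A, [float(v) for v in y])


def roughness(x):
    """Sum of squared second differences, i.e., the penalty term without lam."""
    return sum((x[t] - 2 * x[t - 1] + x[t - 2]) ** 2 for t in range(2, len(x)))
```

A series lying exactly on a line has zero second differences and is therefore returned unchanged for any λ, whereas for a zigzag series such as 0, 1, 0, 1, ... the roughness of the smoothed values is strictly smaller than that of the data.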
Remark 3.3 In order to choose the appropriate trend curve for the given time series,
one can make use of simple reference tests based on characteristic features of
particular curves. A survey of such tests is given in Table 3.6.
⋄
$$\beta_0 + \beta_1 \tau, \quad \tau = 1, \dots, n, \tag{3.38}$$
but for short segments with middles at particular times t one can apply local trends
$$\beta_0(t) + \beta_1(t)\,\tau, \quad \tau = \dots, t-1, t, t+1, \dots \tag{3.39}$$
Obviously, the process of trend elimination according to (3.39) adapts itself to the
actual local run of time series, and, moreover, the intensity of this adaptation can be
controlled. Another advantage of adaptive methods is the numerical simplicity and
the construction of predictions which respond flexibly to eventual changes in the
character of time series.
As far as moving averages are concerned, this term denotes linear combinations of
time series values with the unit sum of weights, e.g.,
$$\frac{1}{8} \left(y_{t-2} + 2y_{t-1} + 2y_t + 2y_{t+1} + y_{t+2}\right). \tag{3.40}$$
8
This approach is based on the axiom that each “reasonable” function can be
approximated in an acceptable way by a polynomial. Respecting the previous
discussion, let us first fit a suitable polynomial to the initial time series segment
of length 2m + 1 and take the value of this polynomial at time t = m + 1 (i.e., in the
middle of this segment) as the smoothed value $\hat{y}_{m+1}$ of the given time series at this time.
The parameters of this polynomial can be estimated by means of the least squares
method (i.e., as OLS estimates) minimizing the expression
$$\sum_{\tau=-2}^{2} \left(y_{t+\tau} - \beta_0 - \beta_1 \tau - \beta_2 \tau^2 - \beta_3 \tau^3\right)^2. \tag{3.42}$$
Differentiating with respect to the particular parameters, we obtain the system of four normal
equations for the estimates b0, b1, b2, b3 of parameters β0, β1, β2, β3 written as
$$\sum_{\tau=-2}^{2} y_{t+\tau}\, \tau^j - b_0 \sum_{\tau=-2}^{2} \tau^j - b_1 \sum_{\tau=-2}^{2} \tau^{j+1} - b_2 \sum_{\tau=-2}^{2} \tau^{j+2} - b_3 \sum_{\tau=-2}^{2} \tau^{j+3} = 0, \quad j = 0, 1, 2, 3. \tag{3.43}$$
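$$\sum_{\tau=-2}^{2} \tau^i = 0 \quad \text{for odd } i \tag{3.44}$$
(it is one of the reasons for the choice of time series segments with the odd number
2m + 1 of observations), this system of equations simplifies to the form
$$
\begin{aligned}
5b_0 + 10b_2 &= \sum y_{t+\tau},\\
10b_1 + 34b_3 &= \sum \tau\, y_{t+\tau},\\
10b_0 + 34b_2 &= \sum \tau^2 y_{t+\tau},\\
34b_1 + 130b_3 &= \sum \tau^3 y_{t+\tau}.
\end{aligned}
\tag{3.45}
$$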
However, we are interested only in the estimate b0 since it is the value of the fitting
polynomial b0 + b1τ + b2τ² + b3τ³ at the point τ = 0. Therefore, b0 is taken in our
method as the smoothed value of the time series in the middle of the investigated
segment $y_{t-2}, \dots, y_{t+2}$. Obviously, it is sufficient to use only the first and third equation
of system (3.45) with solution
$$
b_0 = \frac{1}{35} \left(17 \sum y_{t+\tau} - 5 \sum \tau^2 y_{t+\tau}\right)
= \frac{1}{35} \left(-3y_{t-2} + 12y_{t-1} + 17y_t + 12y_{t+1} - 3y_{t+2}\right),
\tag{3.46}
$$
so that the fitted trend component, which presents simultaneously the smoothed
value of the time series at time t, is also equal to
$$\hat{y}_t = \frac{1}{35} \left(-3y_{t-2} + 12y_{t-1} + 17y_t + 12y_{t+1} - 3y_{t+2}\right). \tag{3.47}$$
$$\hat{y}_t = \frac{1}{35}\,(-3, 12, 17, 12, -3)\, y_t = \frac{1}{35}\,(-3, 12, 17, \dots)\, y_t. \tag{3.48}$$
Example 3.6 In this example one applies the formula (3.47) to smooth the time
series given in Table 3.7. The smoothed value at time t = 3 is
$$\hat{y}_3 = \frac{1}{35} \left(-3 \cdot 1 + 12 \cdot 8 + 17 \cdot 27 + 12 \cdot 64 - 3 \cdot 125\right) = 27 = y_3.$$
Analogously
$$\hat{y}_4 = 64 = y_4,$$
etc. This result corresponds to the fact that one smoothes the cubic time series by the
cubic polynomial in this example (one would obtain the same results for any
polynomials with the order higher than three).
⋄
$$\sum_{\tau=-m}^{m} \tau^j y_{t+\tau} \tag{3.49}$$
with even j (j ≤ r), which can be derived by generalizing the system of equations
(3.45). After rearrangement it gives a linear combination of the values $y_{t-m}, \dots, y_{t+m}$ with
fixed coefficients called weights of the moving average. One can easily verify the
following properties of moving averages:
1. The sum of weights of the moving average is equal to one (if one applies the moving
average to any series of constant values, then obviously the smoothed values must
be again the original constant values).
2. The weights are symmetric around the middle value (since for even j the values
$y_{t-\tau}$ and $y_{t+\tau}$ in the expressions of type (3.49) have symmetric coefficients).
3. If r is even, then the moving averages of orders r and r + 1 with the same length
2m + 1 are identical (looking, e.g., at the system of equations (3.45), then
obviously its solution for the unknown b0 does not depend on including or not
including the unknown b3 in this system).
Let us note that the described moving averages produce only the smoothed values
$\hat{y}_{m+1}, \dots, \hat{y}_{n-m}$ (i.e., m values at the beginning and m values at the end remain
unsmoothed).
Another note concerns the case when it is desirable to smooth time series using
segments with an even length 2m; then the positions of smoothed values should be
just in the middle of original unit time intervals which has no reasonable practical
interpretation. We will solve both mentioned problems later (see, e.g., the centered
moving averages in Sect. 3.2.2).
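The weights of a moving average of length 2m + 1 and order r can also be derived numerically: the smoothed value is the intercept b0 of the least squares polynomial, which is a fixed linear combination of the observations in the segment. The sketch below uses exact rational arithmetic; the function names are hypothetical, and the segment is assumed to be long enough (2m + 1 > r) for the normal equations to have a unique solution.

```python
from fractions import Fraction


def solve_exact(A, b):
    """Gauss-Jordan elimination over Fractions for a small nonsingular system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot = M[col][col]
        M[col] = [v / pivot for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def ma_weights(m, r):
    """Weights w_tau (tau = -m, ..., m) with b0 = sum of w_tau * y_{t+tau}."""
    taus = range(-m, m + 1)
    # matrix of the normal equations: entry (i, j) is the sum of tau^(i+j)
    A = [[Fraction(sum(tau ** (i + j) for tau in taus)) for j in range(r + 1)]
         for i in range(r + 1)]
    e0 = [Fraction(1)] + [Fraction(0)] * r
    c = solve_exact(A, e0)           # first row of the inverse normal matrix
    return [sum(cj * Fraction(tau) ** j for j, cj in enumerate(c)) for tau in taus]


def smooth(y, m, r):
    """Smoothed values for the positions m+1, ..., n-m (ends are left out)."""
    w = ma_weights(m, r)
    return [sum(wi * yi for wi, yi in zip(w, y[i - m:i + m + 1]))
            for i in range(m, len(y) - m)]
```

For example, `ma_weights(2, 3)` reproduces the weights (−3, 12, 17, 12, −3)/35 of (3.47), `ma_weights(2, 2)` gives the same weights (property 3), and `smooth([1, 8, 27, 64, 125, 216, 343], 2, 3)` returns the cubic series unchanged at the inner positions, as in Example 3.6.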
Length   Order r = 2, 3                               Order r = 4, 5
9        1/231 (−21, 14, 39, 54, 59, ...)             1/429 (15, −55, 30, 135, 179, ...)
11       1/429 (−36, 9, 44, 69, 84, 89, ...)          1/429 (18, −45, −10, 60, 120, 143, ...)
13       1/143 (−11, 0, 9, 16, 21, 24, 25, ...)       1/2431 (110, −198, −135, 110, 390, 600, 677, ...)
Table 3.8 summarizes the weights of moving averages of various lengths and
orders (r = 2, ..., 5). Since the moving averages are symmetric, one gives only the
first half of the weights (the middle one is bold-faced). The weights for the second and
third order or for the fourth and fifth order are equal (see above). The moving
averages of order zero and one are omitted since they have the form of arithmetic
averages
$$\frac{y_{t-m} + \dots + y_{t+m}}{2m + 1}.$$
However, for the sake of completeness, this table includes, e.g., the moving averages
of length 3 and order 3 in spite of the fact that $\hat{y}_t = y_t$ holds in such a case.
We have stressed above that the application of moving averages of length
2m + 1 delivers neither the smoothed values for the first m and the last
m observations nor any predictions. Let us go back to fitting always five
neighboring observations by the cubic parabola (see above), and let the fitted
segment be the last one with values $y_{n-4}, \dots, y_n$. In contrast to the previous
construction, now we are interested in the values of the cubic parabola that fits the
last segment for τ = 1 and 2 (these values have been ignored before). Therefore, in
addition we also need the estimates of parameters β1, β2, and β3 in the parabola
model (before it has been sufficient to estimate only β0). Solving the system of
equations (3.45), one can easily find these estimates in the form
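$$
\begin{aligned}
b_1 &= \frac{1}{72} \left(65 \sum_{\tau=-2}^{2} \tau\, y_{t+\tau} - 17 \sum_{\tau=-2}^{2} \tau^3 y_{t+\tau}\right),\\
b_2 &= \frac{1}{14} \left(\sum_{\tau=-2}^{2} \tau^2 y_{t+\tau} - 2 \sum_{\tau=-2}^{2} y_{t+\tau}\right),\\
b_3 &= \frac{1}{72} \left(5 \sum_{\tau=-2}^{2} \tau^3 y_{t+\tau} - 17 \sum_{\tau=-2}^{2} \tau\, y_{t+\tau}\right).
\end{aligned}
\tag{3.50}
$$
Together with the value (3.46) of b0 one obtains for the last two observations $y_{n-1}$
and $y_n$ the following smoothed values:
$$
\begin{aligned}
\hat{y}_{n-2+k} &= b_0 + b_1 k + b_2 k^2 + b_3 k^3\\
&= \frac{1}{35}\,(-3, 12, 17, 12, -3)\, y_{n-2} + \frac{k}{12}\,(1, -8, 0, 8, -1)\, y_{n-2}\\
&\quad + \frac{k^2}{14}\,(2, -1, -2, -1, 2)\, y_{n-2} + \frac{k^3}{12}\,(-1, 2, 0, -2, 1)\, y_{n-2}, \quad k = 1, 2.
\end{aligned}
\tag{3.51}
$$
$$\hat{y}_{n-1} = \frac{1}{35}\,(2, -8, 12, 27, 2)\, y_{n-2}, \qquad \hat{y}_n = \frac{1}{70}\,(-1, 4, -6, 4, 69)\, y_{n-2}. \tag{3.52}$$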
Due to the apparent symmetry, we can also immediately rewrite it for the first and
second value at the beginning of the time series
$$\hat{y}_1 = \frac{1}{70}\,(69, 4, -6, 4, -1)\, y_3, \qquad \hat{y}_2 = \frac{1}{35}\,(2, 27, 12, -8, 2)\, y_3. \tag{3.53}$$
Moreover, this approach makes it possible to construct predictions in the given time series:
e.g., the prediction of the value $y_{n+1}$ can be constructed by substituting k = 3 into
(3.51), i.e.,
$$\hat{y}_{n+1}(n) = \frac{1}{5}\,(-4, 11, -4, -14, 16)\, y_{n-2}. \tag{3.54}$$
$$\hat{y}_{n+1}(n) = \frac{1}{3} \left(-2y_{n-2} + y_{n-1} + 4y_n\right).$$
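The end-point and prediction weights in (3.52) and (3.54) follow from evaluating the fitted cubic b0 + b1k + b2k² + b3k³ at k = 1, 2, 3. A minimal sketch under stated assumptions (hypothetical function names, segments of exactly five values, exact rational arithmetic) evaluates the formulas (3.46) and (3.50) directly and reads off the weights by applying them to unit vectors.

```python
from fractions import Fraction

TAUS = (-2, -1, 0, 1, 2)


def cubic_coefficients(segment):
    """Coefficients b0, b1, b2, b3 from (3.46) and (3.50) for a five-point segment."""
    if len(segment) != 5:
        raise ValueError("this sketch works with segments of exactly five values")
    y = [Fraction(v) for v in segment]
    s0 = sum(y)
    s1 = sum(t * v for t, v in zip(TAUS, y))
    s2 = sum(t ** 2 * v for t, v in zip(TAUS, y))
    s3 = sum(t ** 3 * v for t, v in zip(TAUS, y))
    b0 = (17 * s0 - 5 * s2) / 35
    b1 = (65 * s1 - 17 * s3) / 72
    b2 = (s2 - 2 * s0) / 14
    b3 = (5 * s3 - 17 * s1) / 72
    return b0, b1, b2, b3


def value_at(segment, k):
    """Value of the fitted cubic at offset k from the middle of the segment."""
    b0, b1, b2, b3 = cubic_coefficients(segment)
    return b0 + b1 * k + b2 * k ** 2 + b3 * k ** 3


def weights_at(k):
    """Weights of the five observations in the fitted value at offset k."""
    return [value_at([1 if i == j else 0 for i in range(5)], k) for j in range(5)]
```

Here `weights_at(1)` and `weights_at(2)` give the weights of (3.52), `weights_at(3)` gives the prediction weights (−4, 11, −4, −14, 16)/5 of (3.54), and `value_at([1, 8, 27, 64, 125], 3)` returns 216, the next value of the cubic series from Example 3.6.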
Table 3.9 Beginning moving averages of the second and third order
Order r ¼ 2
Length 5 Length 7 Length 9
by1 by2 by3 by1 by2 by3 by4 by1 by2 by3 by4 by5
31 9 3 32 5 1 2 109 126 378 14 21
9 13 12 15 4 3 3 63 92 441 273 14
3 12 17 3 3 4 6 27 63 464 447 39
5 6 12 4 2 4 7 1 39 447 536 54
3 5 3 6 1 3 6 15 20 390 540 59
35 35 35 3 0 1 3 21 6 293 459 54
5 1 2 2 17 3 156 293 39
42 14 14 21 3 7 21 42 14
21 6 238 294 21
165 330 2310 2310 231
Order r ¼ 3
Length 5 Length 7 Length 9
by1 by2 by3 by1 by2 by3 by4 by1 by2 by3 by4 by5
69 2 3 39 8 4 2 85 56 28 56 21
4 27 12 8 19 16 3 28 65 392 84 14
6 12 17 4 16 19 6 2 56 515 144 39
4 8 12 4 6 12 7 12 36 432 145 54
1 2 3 1 4 2 6 9 12 234 108 59
70 35 35 4 7 4 3 0 9 12 54 54
2 4 1 2 8 20 143 4 39
42 42 42 21 8 14 140 21 14
7 16 112 0 21
99 198 1 386 462 231
group of three neighboring observations there are either two upper turning points and
one lower turning point, or vice versa. In the second case (see Fig. 3.10b) just the
opposite situation occurs: the smoothed time series follows the original time series
upward to the upper turning points and downward to the lower turning points.
On the contrary, as far as the choice of the order of moving averages is concerned, one can
decide on it by means of the following objective criterion based on differencing the time
series (see also Remark 3.4). Let the given time series yt fulfill the model (3.1), where
Trt is a polynomial of the rth order (we denote it as $\beta_0 + \beta_1 t + \dots + \beta_r t^r$) and Et is the
7(2, 1, 0, 1, 2, 3, 4)
7 1
143(44, 11, 36, 38, 24, 1, 24, 44, 52, 41, 4,
13 1
66, 176)
yt (a) yt (b)
white noise (we denote it for simplicity as εt and its variance as σ²). The
corresponding criterion, which should find r as the order of the moving averages in
question, consists in gradually differencing the analyzed time series. When
differencing yt, we decrease the order of its polynomial trend by one in each step:
e.g., the order of the polynomial
$$\left(\beta_0 + \beta_1 t + \dots + \beta_r t^r\right) - \left(\beta_0 + \beta_1 (t-1) + \dots + \beta_r (t-1)^r\right)$$
is r − 1, etc. It is important in our context that $\Delta^{r+1} Tr_t = 0$, i.e., the polynomial trend
Trt can be eliminated completely after applying r + 1 gradual differences (only
r differences are not sufficient since they produce a constant which may be nonzero).
As far as differencing the white noise εt is concerned, its kth difference
$$\Delta^k \varepsilon_t = \varepsilon_t - \binom{k}{1} \varepsilon_{t-1} + \binom{k}{2} \varepsilon_{t-2} - \dots + (-1)^k \varepsilon_{t-k} \tag{3.55}$$
If we denote
$$V_k = \frac{\sum_{t=k+1}^{n} \left(\Delta^k y_t\right)^2}{(n-k)\binom{2k}{k}}, \tag{3.57}$$
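The statistic (3.57) is easy to evaluate numerically. The following is a minimal sketch, assuming NumPy is available; the function name `variate_difference_v` is hypothetical and is not taken from the text.

```python
import numpy as np
from math import comb


def variate_difference_v(y, max_k):
    """Return [V_0, ..., V_max_k] computed according to (3.57)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    values = []
    for k in range(max_k + 1):
        d = np.diff(y, n=k)  # k-th differences, n - k values (k = 0 returns y)
        values.append(float(np.sum(d ** 2) / ((n - k) * comb(2 * k, k))))
    return values


# A noise-free quadratic trend: the third differences vanish, so V_3 is zero.
print(variate_difference_v([1, 4, 9, 16, 25], 3))
```

For a series with a polynomial trend of order $r$ plus white noise, one would inspect how the values behave from $k = r + 1$ onward; the sketch only computes the values and leaves that judgment to the reader.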
• the $j$th power of the lag operator $B$ delays a variable defined in time by $j$ time units:
$$B^j y_t = B^{j-1}(B y_t) = B^{j-1} y_{t-1} = \dots = y_{t-j}; \tag{3.59}$$
$$\Delta^d y_t = \Delta^{d-1}(\Delta y_t) = y_t - \binom{d}{1} y_{t-1} + \binom{d}{2} y_{t-2} - \dots + (-1)^d y_{t-d} = (1 - B)^d y_t; \tag{3.62}$$
⋄
Example 3.7 Table 3.13 and Fig. 3.11 present smoothing and predicting by moving
averages in the time series $y_t$ of the US nominal short-term interest rates (in % p.a.)
for particular years 1961–2015 ($t = 1, \dots, 55$).
The values $V_k$ calculated in Table 3.14 according to (3.57) indicate that the upper
limit for the order of the corresponding moving averages is probably equal to 3.
Therefore, we will apply to this time series the moving averages with the order $r = 3$.
Table 3.13 and Fig. 3.11 present the calculated moving averages of this order which
have lengths 5 and 9 (i.e., $m = 2$ and $m = 4$). Figure 3.11 shows that the moving
averages of length 5 follow the original observations very closely, so that the
eliminated trend includes some periodic and random fluctuations that should be
left aside from the trend component. In contrast, the moving averages of length
9 smooth such short-term fluctuations sufficiently and, therefore, they should
be preferred for the trend elimination in our case. For smoothing, the weights
from Table 3.8 have been applied so that, e.g., the moving averages of length 5 give
Table 3.13 Annual data 1961–2015 and smoothing and predicting by moving averages in
Example 3.7 (US nominal short-term interest rates in % p.a.)
Year  yt  ŷt (r = 3, m = 2)  ŷt (r = 3, m = 4)  Year  yt  ŷt (r = 3, m = 2)  ŷt (r = 3, m = 4)
1961 2.37 2.37 2.26 1989 9.28 8.95 7.98
1962 2.77 2.77 2.88 1990 8.28 8.24 7.26
1963 3.17 3.17 3.30 1991 5.98 5.98 6.05
1964 3.57 3.53 3.62 1992 3.83 3.93 5.05
1965 3.97 4.18 3.91 1993 3.30 3.51 4.35
1966 4.86 4.43 4.46 1994 4.75 4.71 4.43
1967 4.30 4.67 5.21 1995 6.04 5.68 5.00
1968 5.35 5.43 5.63 1996 5.51 5.83 5.49
1969 6.74 6.52 5.43 1997 5.74 5.60 6.03
1970 6.28 6.03 5.34 1998 5.56 5.49 6.06
1971 4.32 4.49 5.79 1999 5.41 5.96 5.60
1972 4.18 4.76 5.97 2000 6.53 5.69 4.83
1973 7.19 6.77 5.95 2001 3.77 4.12 3.62
1974 7.89 7.49 5.94 2002 1.79 1.88 2.57
1975 5.77 6.15 5.99 2003 1.22 1.13 2.08
1976 5.00 4.93 6.07 2004 1.62 1.83 2.40
1977 5.33 5.47 6.19 2005 3.56 3.51 3.48
1978 7.37 7.45 7.80 2006 5.20 5.18 4.09
1979 10.11 9.75 9.90 2007 5.30 4.99 3.94
1980 11.56 12.33 11.14 2008 2.91 2.99 3.13
1981 13.97 12.77 11.70 2009 0.69 0.97 1.88
1982 10.60 11.10 11.39 2010 0.34 0.23 0.74
1983 8.67 9.20 10.24 2011 0.34 0.35 0.11
1984 9.54 8.99 8.87 2012 0.43 0.37 0.22
1985 8.38 8.32 7.66 2013 0.27 0.30 0.42
1986 6.83 7.15 7.71 2014 0.23 0.21 0.48
1987 7.19 7.06 8.03 2015 0.32 0.32 0.16
1988 7.98 8.23 8.09 2016 0.83 0.76
Source: AMECO (European Commission Annual Macro-Economic Database). (https://ec.europa.eu/
economy_finance/ameco/user/serie/SelectSerie.cfm)
$$\hat{y}_3 = \frac{1}{35}\,(-3 \cdot 2.37 + 12 \cdot 2.77 + 17 \cdot 3.17 + 12 \cdot 3.57 - 3 \cdot 3.97) = 3.17\,\%.$$
At the beginning and at the end of the time series we have used the beginning and
end moving averages, respectively: e.g., applying again the moving averages of
length 5, one gets according to (3.53)
Fig. 3.11 Annual data 1961–2015 and smoothing and predicting by moving averages in Example
3.7 (US nominal short-term interest rates in % p.a.)
$$\hat{y}_1 = \frac{1}{70}\,(69 \cdot 2.37 + 4 \cdot 2.77 - 6 \cdot 3.17 + 4 \cdot 3.57 - 1 \cdot 3.97) = 2.37\,\%.$$
$$\hat{y}_{56}(55) = \frac{1}{126}\,(-56 \cdot 5.30 + 49 \cdot 2.91 + \dots + 224 \cdot 0.32) = 0.76\,\%.$$
Obviously for short-term predictions, the “short” moving averages should be
preferred: e.g., the moving averages of order 3 and length 5 give a quite different
prediction 0.83% p.a. in this case. In general, the predictions based on moving
averages cannot be regarded as highly credible. ⋄
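The weights used in Example 3.7 can be reproduced numerically. The sketch below assumes NumPy and relies on the fact that the moving averages of Sect. 3.2.1 fit a polynomial of the given order to each segment by least squares; the helper name `local_poly_weights` and its position convention are hypothetical.

```python
import numpy as np


def local_poly_weights(length, order, target):
    """Weights w such that w @ segment is the fitted polynomial value at `target`.

    Positions inside the segment are 0, 1, ..., length - 1, so that
    target = (length - 1) / 2 gives the central smoothing weights,
    target = 0 the first beginning weights and
    target = length the one-step-ahead prediction weights.
    """
    x = np.arange(length, dtype=float)
    design = np.vander(x, order + 1, increasing=True)
    row = np.vander(np.array([float(target)]), order + 1, increasing=True)
    # pinv(design) equals (X'X)^{-1} X' because the design has full column rank
    return (row @ np.linalg.pinv(design)).ravel()


segment = np.array([2.37, 2.77, 3.17, 3.57, 3.97])  # years 1961-1965
central = local_poly_weights(5, 3, 2)
print(np.round(central * 35, 6))         # the multiples of 1/35 used for the third value
print(round(float(central @ segment), 2))
```

The same helper with `local_poly_weights(5, 3, 0)` gives the beginning weights in multiples of 1/70, and `local_poly_weights(9, 3, 9)` gives the prediction weights in multiples of 1/126.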
In practice, simpler moving averages are popular. The simplest ones are the
arithmetic moving averages. For instance, the arithmetic moving averages of length 5 are
$$\bar{y}_t^{(5)} = \frac{1}{5}\,(1, 1, 1, 1, 1)\, y_t = \frac{1}{5}\,(y_{t-2} + y_{t-1} + y_t + y_{t+1} + y_{t+2}), \tag{3.64}$$
and those of a general odd length $2m + 1$ are
$$\bar{y}_t^{(2m+1)} = \frac{1}{2m+1}\,(y_{t-m} + y_{t-m+1} + \dots + y_{t+m}). \tag{3.65}$$
They correspond to the moving averages from Sect. 3.2.1 with order 0 or 1 and the
same length $2m + 1$ (i.e., the time series segments of length $2m + 1$ are fitted using
a constant or linear trend). Therefore, it holds, e.g., for the length 5 and order 0
$$\bar{y}_{n-1}^{(5)} = \bar{y}_n^{(5)} = \hat{y}_{n+\tau}(n) = \frac{1}{5}\,(y_{n-4} + y_{n-3} + \dots + y_n) \tag{3.66}$$
and for the length 5 and order 1
$$\bar{y}_{n-1}^{(5)} = \frac{1}{10}\,(y_{n-3} + 2y_{n-2} + 3y_{n-1} + 4y_n),$$
$$\bar{y}_n^{(5)} = \frac{1}{5}\,(-y_{n-4} + y_{n-2} + 2y_{n-1} + 3y_n),$$
$$\hat{y}_{n+1}(n) = \frac{1}{10}\,(-4y_{n-4} - y_{n-3} + 2y_{n-2} + 5y_{n-1} + 8y_n), \ \dots. \tag{3.67}$$
The centered moving averages modify the arithmetic moving averages in order to be
applicable when one smoothes economic time series over particular seasons with an
even number of observations (usually 4 for the quarterly data or 12 for the monthly
data). In such a situation, the methodological problem appears, namely whereabouts
to allocate the particular averages: e.g., the arithmetic average of the values over
January till December belongs to the midpoint between time points for June and July
values. However, when averaging such two neighboring moving averages (the first
one corresponds to the center of interval “June–July” and the second one to the
center of interval “July–August”), then the result can be undoubtedly allocated to the
time point “July”. In other words, we construct moving averages of the type
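$$\bar{y}_t^{(12)} = \frac{1}{2}\left[\frac{1}{12}\,(y_{t-6} + y_{t-5} + \dots + y_t + \dots + y_{t+5}) + \frac{1}{12}\,(y_{t-5} + y_{t-4} + \dots + y_t + \dots + y_{t+6})\right]$$
$$\phantom{\bar{y}_t^{(12)}} = \frac{1}{24}\,(y_{t-6} + 2y_{t-5} + 2y_{t-4} + \dots + 2y_{t+5} + y_{t+6}) \tag{3.68}$$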
(obviously with length 13). That is, when calculating, e.g., the July value, one
exploits the February till December values of the given year (all with the weights
1/12) and the January values of the present and future year (both with weights 1/24).
In general, one can write
$$\bar{y}_t^{(2m)} = \frac{1}{4m}\,(y_{t-m} + 2y_{t-m+1} + \dots + 2y_{t+m-1} + y_{t+m}). \tag{3.69}$$
These values are denoted as the centered moving averages (quarterly ones $\bar{y}_t^{(4)}$ for
$m = 2$ or monthly ones $\bar{y}_t^{(12)}$ for $m = 6$).
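The centered moving averages (3.69) amount to a weighted sum with the weights 1, 2, ..., 2, 1 divided by $4m$. The following is a minimal sketch, assuming NumPy; the function name `centered_moving_average` is hypothetical, and the first and last $m$ positions, where the average is not calculable, are simply left as missing values.

```python
import numpy as np


def centered_moving_average(y, season):
    """Centered moving averages (3.69) for an even season length 2m."""
    if season % 2 != 0:
        raise ValueError("season length must be even")
    y = np.asarray(y, dtype=float)
    m = season // 2
    weights = np.full(season + 1, 2.0)
    weights[0] = weights[-1] = 1.0
    weights /= 4 * m                      # weights sum to one
    out = np.full(len(y), np.nan)         # ends stay missing
    if len(y) >= season + 1:
        out[m:len(y) - m] = np.convolve(y, weights, mode="valid")
    return out


quarterly = np.tile([1.0, 2.0, 3.0, 4.0], 3)     # a purely seasonal pattern
print(centered_moving_average(quarterly, 4))      # interior values equal the mean 2.5
```

Because the weights are symmetric and cover each season exactly once in total, a regular seasonal pattern is averaged out while a linear trend passes through unchanged.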
This method denotes the moving averages that are capable of restraining or
completely filtering off the outliers (i.e., the outlying observations) in time series.
A simple example is the moving medians of odd length 2m + 1
$$\operatorname{med} y_t^{(2m+1)} = \operatorname{med}\,(y_{t-m}, y_{t-m+1}, \dots, y_{t+m}). \tag{3.70}$$
Figure 3.12 compares the moving medians with the arithmetic moving averages in
the case of one or two outliers (obviously the length 3 of moving averages is
insufficient in the second case with two outliers so that it has been necessary to
prolong it to 5).
[Fig. 3.12: two panels with vertical axis yt; one shows moving medians and arithmetic moving averages of length 3, the other moving medians and arithmetic moving averages of length 5]
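The effect described for Fig. 3.12 can be imitated on artificial data. The sketch below assumes NumPy; the data and the function names are invented for illustration and are not the values plotted in the figure.

```python
import numpy as np


def moving_median(y, m):
    """Moving medians (3.70) of odd length 2m + 1; ends are left missing."""
    y = np.asarray(y, dtype=float)
    out = np.full(len(y), np.nan)
    for t in range(m, len(y) - m):
        out[t] = np.median(y[t - m:t + m + 1])
    return out


def arithmetic_moving_average(y, m):
    """Arithmetic moving averages (3.65) of odd length 2m + 1; ends are left missing."""
    y = np.asarray(y, dtype=float)
    out = np.full(len(y), np.nan)
    for t in range(m, len(y) - m):
        out[t] = np.mean(y[t - m:t + m + 1])
    return out


one_outlier = [100, 100, 100, 1000, 100, 100, 100]
two_outliers = [100, 100, 1000, 1000, 100, 100, 100]

print(moving_median(one_outlier, 1))              # the single outlier is filtered off
print(arithmetic_moving_average(one_outlier, 1))  # the outlier is spread over neighbors
print(moving_median(two_outliers, 1))             # length 3 is not enough here
print(moving_median(two_outliers, 2))             # length 5 removes both outliers
```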
Remark 3.5 In practice, various software systems offer plenty of other moving
averages, frequently presented as filters. For instance, in macroeconomics, especially
in real business cycle theory, the so-called Hodrick–Prescott filter is popular for
removing the cyclical component (see, e.g., EViews). This filter constructs the smoothed
values $\hat{y}_1, \dots, \hat{y}_n$ for a given time series $y_1, \dots, y_n$ by minimizing the expression
$$\sum_{t=1}^{n} (y_t - \hat{y}_t)^2 + \lambda \sum_{t=2}^{n-1} \big( (\hat{y}_{t+1} - \hat{y}_t) - (\hat{y}_t - \hat{y}_{t-1}) \big)^2. \tag{3.71}$$
The positive constant $\lambda$ controls the intensity of smoothing of the given time series
(obviously if $\lambda \to \infty$, then the method eliminates the linear trend). Another example
is the moving averages based on OWA operators (ordered weighted averaging),
which calculate weighted averages of ordered values in particular segments of
the time series. They can be of interest when we want to over- or underestimate the
results; see Merigó and Yager (2013). ⋄
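Since (3.71) is a quadratic function of the smoothed values, its minimum can be found by solving a linear system. The following is a minimal sketch, assuming NumPy and at least three observations; it writes the penalty with a second-difference matrix `D` and solves $(I + \lambda D^\top D)\,\hat{y} = y$. The function name is hypothetical, and this is not necessarily how EViews or any other package computes the filter.

```python
import numpy as np


def hodrick_prescott(y, lam):
    """Smoothed values minimizing (3.71) for a given penalty lam >= 0."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    second_diff = np.zeros((n - 2, n))
    for i in range(n - 2):
        # row i computes yhat[i] - 2 * yhat[i + 1] + yhat[i + 2]
        second_diff[i, i:i + 3] = [1.0, -2.0, 1.0]
    system = np.eye(n) + lam * second_diff.T @ second_diff
    return np.linalg.solve(system, y)


series = np.array([2.0, 3.5, 3.0, 5.0, 4.5, 6.0, 7.5, 7.0])
print(np.round(hodrick_prescott(series, 10.0), 3))
```

A series that is exactly linear has zero second differences, so it is returned unchanged for every value of the penalty; with `lam = 0` the original series is returned.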
The exponential smoothing is another adaptive approach to the trend component that
is frequently used in practice (see also the introduction to Sect. 3.2). It is a special
case of the moving averages, in which the values observed up to the present period
get weights that decrease exponentially with the age of particular observations. Such
moving averages $\hat{y}_t$ are constructed by minimizing expressions of the form (3.72),
where $\beta$ ($0 < \beta < 1$) is a fixed discount constant. The discounting of weights in (3.72)
can be interpreted in a reasonable way: the observations more distant in the past have
lower weights. At first glance this approach may seem complicated, but from the
numerical point of view it is easily realized, in particular if one uses recursive
formulas. In this section, we again assume that the time series have the form (3.1)
(i.e., the time series model consists only of the trend and the additive residual component).
More details can be found, e.g., in Abraham and Ledolter (1983), Bowerman and
O’Connell (1987), Montgomery and Johnson (1976), and others.
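The simple exponential smoothing is recommended for time series in which the
trend can be viewed as locally constant (i.e., constant in short segments of the time
series)
$$Tr_t = \beta_0. \tag{3.73}$$
The estimate of the parameter $\beta_0$ at time $t$ is obtained by minimizing the expression
$$\sum_{j=0}^{\infty} (y_{t-j} - \beta_0)^2 \beta^j, \tag{3.74}$$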
where β (0 < β < 1) is a fixed discount constant. It should be pointed out that the sum
in the minimized expression (3.74) is infinite, although in real situations we always
know only a finite number of values y1, ..., yt. However, the hypothetical extension of
time series to the past simplifies significantly the corresponding formulas due to
simpler limit results. In any case, the numerical calculations based on this abstraction
exploit only the observed values y1, ..., yt of given time series (see below).
If we differentiate (3.74) with respect to $\beta_0$ and put this derivative equal to zero, then
due to the convexity of the minimized function we get the estimate $b_0(t)$ of the parameter
$\beta_0$ at time $t$ as
$$\hat{y}_t = (1 - \beta) \sum_{j=0}^{\infty} \beta^j y_{t-j}. \tag{3.75}$$
Hence one can see that the smoothed value of time series at time t is the weighted
average of values of this time series till time t with weights decreasing exponentially
to the past
Since the formula (3.75) is not comfortable for practical calculations, it is transferred
to the recursive form
In addition to the formulas (3.75) and (3.77), there exists a third form of the smoothing
formula making use of (3.78), namely
$$\hat{y}_t = \hat{y}_{t-1} + \alpha\,(y_t - \hat{y}_{t-1}) = \hat{y}_{t-1} + \alpha\,(y_t - \hat{y}_t(t-1)) = \hat{y}_{t-1} + \alpha\, e_t. \tag{3.80}$$
The form (3.80) is sometimes denoted as the “error” formula: in order to correct the
previous smoothed value $\hat{y}_{t-1}$, one exploits the (reduced) one-step-ahead error $e_t$ of the
prediction $\hat{y}_t(t-1)$ constructed at time $t - 1$ as soon as the value $y_t$ is observed.
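The error formula (3.80) translates directly into a short loop. The sketch below is plain Python; the function name and the start-up rule (the first observation serves as the initial smoothed value unless another one is supplied) are assumptions made for illustration, not rules taken from the text.

```python
def simple_exponential_smoothing(y, alpha, initial=None):
    """Smoothed values computed with the error formula (3.80)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    level = y[0] if initial is None else initial   # assumed start-up value
    smoothed = []
    for value in y:
        error = value - level          # one-step-ahead error e_t
        level = level + alpha * error  # correction of the previous smoothed value
        smoothed.append(level)
    return smoothed


rates = [2.37, 2.77, 3.17, 3.57, 3.97]
print(simple_exponential_smoothing(rates, 0.3))
```

Expanding the correction step gives `alpha * value + (1 - alpha) * level`, so the loop is algebraically the same as a recursion that mixes the new observation with the previous smoothed value. Under the locally constant trend (3.73), the last smoothed value also serves as the prediction for all future horizons.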
$$\alpha = \frac{1}{m+1}, \tag{3.81}$$
$$\sum_{k=0}^{2m} \frac{k}{2m+1}, \tag{3.82}$$
$$\sum_{k=0}^{\infty} k\,\alpha\,(1-\alpha)^k, \tag{3.83}$$
where the mean ages (3.82) and (3.83) must coincide, and hence α expressed
as (3.81) follows); however, this approach is handicapped by the fact that at
first one must decide on an adequate length of moving averages.
(c) The estimate of α : one admits the grid points 0.01; 0.02; ..., 0.30 as possible
values of the smoothing constant and chooses such a value of α from this grid
that gives the best predictions with minimum SSE [see (2.10)] in the given
time series. This approach is included in many software systems (see, e.g.,
EViews).
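For choice (b), the coincidence of the two mean ages can be checked directly. The short derivation below is only a sketch; it uses the standard sum $\sum_{k \ge 0} k q^k = q/(1-q)^2$ with $q = 1 - \alpha$:

$$\sum_{k=0}^{2m} \frac{k}{2m+1} = \frac{1}{2m+1} \cdot \frac{2m\,(2m+1)}{2} = m, \qquad \sum_{k=0}^{\infty} k\,\alpha\,(1-\alpha)^k = \alpha \cdot \frac{1-\alpha}{\alpha^2} = \frac{1-\alpha}{\alpha}.$$

Setting $m = (1-\alpha)/\alpha$ and solving for $\alpha$ gives $\alpha = 1/(m+1)$, which is (3.81).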
Example 3.8 Figure 3.13 presents the simple exponential smoothing in the time
series $y_t$ of annual averages of exchange rates USD/EUR(ECU) for particular years
1960–2016 with $t = 1, \dots, 57$ (the former basket currency ECU of the European
Community was introduced as late as 1979, but it was formally amended for the
period since 1960). The smoothed values for α = 0.01, 0.02, and 0.3, including the
one-step-ahead prediction for year 2017, are plotted in the figure. As far as the
estimation of α is concerned (see above), EViews has found a value close to one
(α = 0.999), which means that the smoothed time series nearly coincides with the
original time series. ⋄
Fig. 3.13 Annual data 1960–2016 and single exponential smoothing in Example 3.8 (annual
averages of exchange rates USD/EUR(ECU)). Source: AMECO (European Commission Annual
Macro-Economic Database) (https://ec.europa.eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
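The grid search over the smoothing constant described in choice (c) can be sketched in a few lines of plain Python. The function names, the start-up rule (the first observation as the initial smoothed value) and the artificial data are assumptions made for illustration; software systems may use different start-up rules and search methods.

```python
def one_step_sse(y, alpha):
    """Sum of squared one-step-ahead errors of simple exponential smoothing."""
    level = y[0]                   # assumed start-up value: the first observation
    sse = 0.0
    for value in y[1:]:
        error = value - level      # error of the prediction made one step earlier
        sse += error ** 2
        level += alpha * error     # error formula (3.80)
    return sse


def best_alpha(y, grid=None):
    """Grid point with the minimum SSE; the default grid is 0.01, 0.02, ..., 0.30."""
    if grid is None:
        grid = [round(0.01 * i, 2) for i in range(1, 31)]
    return min(grid, key=lambda a: one_step_sse(y, a))


trending = [float(t) for t in range(1, 21)]   # artificial, steadily growing series
print(best_alpha(trending))
```

For a steadily trending series the smoothed values lag behind the data, so the search ends at the upper end of the grid; a similar effect may lie behind an estimate close to one when the search is not restricted to small values.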
Remark 3.6 Assuming the normal distribution of the residual components, one can
construct by means of exponential smoothing not only point predictions but also
interval predictions [(3.84) is only approximate without this assumption]. For
example, the $(1 - p) \cdot 100\,\%$ prediction interval (i.e., the 95% interval if $p = 0.05$) is
recommended in the form
$$\big( \hat{y}_{n+\tau}(n) - u_{1-p/2}\, d_\tau\, \mathrm{MAE},\ \ \hat{y}_{n+\tau}(n) + u_{1-p/2}\, d_\tau\, \mathrm{MAE} \big), \tag{3.84}$$
where MAE is the mean absolute deviation [see (2.17)]: $\mathrm{MAE} = (1/n) \sum_{t=1}^{n} |y_t - \hat{y}_t(t-1)|$. ⋄
Remark 3.7 The exponential smoothing algorithm can be controlled by means of
a so-called adaptive control process, which
• indicates that the applied type of exponential smoothing stops being adequate for
the given time series;
• adjusts automatically the values of smoothing constants when it is necessary (then
this methodology can be looked upon as a special case of so-called stochastic
control).
Here the indicators of “default” are, e.g., significantly high values $I_t(\alpha)$ constructed
online as
$$I_t(\alpha) = \frac{|Y_t(\alpha)|}{D_t(\alpha)}, \tag{3.85}$$
where
$$Y_t(\alpha) = \sum_{j=1}^{t} e_j(\alpha), \qquad D_t(\alpha) = \frac{1}{t} \sum_{j=1}^{t} |e_j(\alpha)|. \tag{3.86}$$
The symbol $e_j(\alpha)$ denotes the error of prediction of the value $y_j$ (the prediction is
constructed at time $j - 1$, i.e., one-step-ahead, applying the smoothing constant
equal to α). The indicator $I_t(\alpha)$ is usually calculated online, and its increased values
exceeding a given boundary $K$ indicate that a change of α or even a change of the
type of exponential smoothing is necessary (e.g., one should apply the double
exponential smoothing from Sect. 3.3.2 instead of the simple exponential smoothing).
The boundary $K$ is usually fixed in the range from 4 to 6 (it can be constructed
in a similar way to the critical value of statistical tests with fixed significance levels).
One of the methods to change the smoothing constant α automatically runs three
procedures of exponential smoothing in parallel with three different values of this
constant: one can use, e.g., the values α − 0.05, α, α + 0.05. Here only the procedure
using at time $t$ the “middle” smoothing constant α delivers the output results for
users, and (at each time point) the methodology compares the values
$D_t(\alpha - 0.05)$, $D_t(\alpha)$, $D_t(\alpha + 0.05)$ calculated according to (3.86). If it holds
then the “middle” process transfers α to α + 0.05, and the algorithm goes on using the
triplet of smoothing constants α, α + 0.05, α + 0.10, respectively.
⋄
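The indicator (3.85) can be updated online from the stream of one-step-ahead errors. The sketch below is plain Python; it assumes that $D_t(\alpha)$ in (3.86) is the running mean of the absolute errors, and the function name and the convention of returning zero while all errors are zero are hypothetical.

```python
def tracking_indicator(errors):
    """Values I_1, ..., I_t of the indicator (3.85) for a sequence of errors."""
    signals = []
    total = 0.0        # running sum of errors, Y_t
    total_abs = 0.0    # running sum of absolute errors, t * D_t
    for t, error in enumerate(errors, start=1):
        total += error
        total_abs += abs(error)
        mean_abs = total_abs / t
        signals.append(abs(total) / mean_abs if mean_abs > 0 else 0.0)
    return signals


print(tracking_indicator([1.0, -1.0, 1.0, -1.0]))  # errors cancel: indicator stays small
print(tracking_indicator([1.0, 1.0, 1.0, 1.0]))    # systematic bias: indicator grows with t
```

With systematically one-sided errors the indicator grows roughly linearly in time and eventually crosses any fixed boundary $K$, which is the behavior the control process is meant to detect.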
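The estimates $b_0(t)$ and $b_1(t)$ constructed at time $t$ for the parameters $\beta_0$ and $\beta_1$ are
obtained by minimizing the expression
$$\sum_{j=0}^{\infty} \big( y_{t-j} - (\beta_0 + \beta_1 (-j)) \big)^2 \beta^j, \tag{3.90}$$
where again $\beta$ ($0 < \beta < 1$) is a fixed discount constant. If we put the partial
derivatives of (3.90) with respect to $\beta_0$ and $\beta_1$ both equal to zero, we get the system
of normal equations
$$\sum_{j=0}^{\infty} \beta^j y_{t-j} - \beta_0 \sum_{j=0}^{\infty} \beta^j + \beta_1 \sum_{j=0}^{\infty} j\beta^j = 0, \qquad \sum_{j=0}^{\infty} j\beta^j y_{t-j} - \beta_0 \sum_{j=0}^{\infty} j\beta^j + \beta_1 \sum_{j=0}^{\infty} j^2 \beta^j = 0, \tag{3.91}$$
which can be transferred by means of the formulas
$$\sum_{j=0}^{\infty} \beta^j = \frac{1}{1-\beta}, \qquad \sum_{j=0}^{\infty} j\beta^j = \frac{\beta}{(1-\beta)^2}, \qquad \sum_{j=0}^{\infty} j^2 \beta^j = \frac{\beta\,(1+\beta)}{(1-\beta)^3} \tag{3.92}$$
to the form
$$\beta_0 - \frac{\beta}{1-\beta}\,\beta_1 = (1-\beta) \sum_{j=0}^{\infty} \beta^j y_{t-j}, \qquad \beta\beta_0 - \frac{\beta\,(1+\beta)}{1-\beta}\,\beta_1 = (1-\beta)^2 \sum_{j=0}^{\infty} j\beta^j y_{t-j}. \tag{3.93}$$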
$$S_t = (1-\beta) \sum_{j=0}^{\infty} \beta^j y_{t-j} \tag{3.94}$$
(due to (3.75), this $S_t$ corresponds to the value of the time series smoothed at time $t$ by
the simple exponential smoothing). Therefore, it holds according to (3.77)
$$S_t = \alpha y_t + (1-\alpha)\, S_{t-1}. \tag{3.95}$$
$$S_t^{[2]} = (1-\beta) \sum_{j=0}^{\infty} \beta^j S_{t-j} \tag{3.96}$$
(the relation (3.96) is analogous to (3.94), but the values $y_t$ are replaced by $S_t$). Hence
the analogy to the recursive relation (3.95) gives
$$S_t^{[2]} = \alpha S_t + (1-\alpha)\, S_{t-1}^{[2]}. \tag{3.97}$$
The introduced smoothing statistics enable us to rewrite the system of normal equations
(3.93) in the form
$$\beta_0 - \frac{\beta}{1-\beta}\,\beta_1 = S_t, \qquad \beta\beta_0 - \frac{\beta\,(1+\beta)}{1-\beta}\,\beta_1 = S_t^{[2]} - (1-\beta)\, S_t. \tag{3.98}$$
Then the prediction of the value $y_{t+\tau}$ constructed at time $t$ has the natural form
$$\hat{y}_{t+\tau}(t) = b_0(t) + b_1(t)\,\tau = \left(2 + \frac{\alpha\tau}{1-\alpha}\right) S_t - \left(1 + \frac{\alpha\tau}{1-\alpha}\right) S_t^{[2]}. \tag{3.100}$$
The special case $\tau = 0$ delivers the smoothed value of the time series, i.e.,
$$\hat{y}_t = 2S_t - S_t^{[2]}. \tag{3.101}$$
The statistics $S_t$ and $S_t^{[2]}$ are calculated recursively according to (3.95) and (3.97).
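The recursions for the two smoothing statistics, the smoothed value (3.101) and the prediction (3.100) fit into one short loop. The sketch below is plain Python and assumes that the simple statistic is updated as `alpha * y + (1 - alpha) * previous` and that the initial level `b0` and slope `b1` at time 0 are supplied by the user (e.g., from a regression line fitted to the first observations), with the start-up values taken from (3.102); the function name is hypothetical.

```python
def double_exponential_smoothing(y, alpha, b0, b1, horizon=1):
    """Return (smoothed values, prediction `horizon` steps after the last value)."""
    ratio = (1 - alpha) / alpha
    s1 = b0 - ratio * b1           # initial simple smoothing statistic, see (3.102)
    s2 = b0 - 2 * ratio * b1       # initial double smoothing statistic, see (3.102)
    smoothed = []
    for value in y:
        s1 = alpha * value + (1 - alpha) * s1   # simple smoothing statistic
        s2 = alpha * s1 + (1 - alpha) * s2      # double smoothing statistic (3.97)
        smoothed.append(2 * s1 - s2)            # smoothed value (3.101)
    k = alpha * horizon / (1 - alpha)
    prediction = (2 + k) * s1 - (1 + k) * s2    # prediction (3.100)
    return smoothed, prediction


line = [3 + 2 * t for t in range(1, 11)]        # exact linear trend, t = 1, ..., 10
smoothed, forecast = double_exponential_smoothing(line, 0.2, b0=3, b1=2)
print(round(smoothed[-1], 6), round(forecast, 6))
```

On an exact linear trend with matching start-up values, the smoothed values reproduce the data and the prediction continues the line, which is a convenient check that the recursions and (3.100) are consistent with each other.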
$$S_0 = b_0(0) - \frac{1-\alpha}{\alpha}\, b_1(0), \qquad S_0^{[2]} = b_0(0) - \frac{2\,(1-\alpha)}{\alpha}\, b_1(0). \tag{3.102}$$
2. For the choice of the smoothing constant α in practice, one again recommends the
interval $0 < \alpha \le 0.3$, in which (similarly to the simple exponential smoothing) we
can use:
(a) The fixed choice α = 0.1 or α = 0.2.
(b) The choice
$$\alpha = \sqrt{\frac{1}{m+1}}, \tag{3.103}$$
$$\big( \hat{y}_{n+\tau}(n) - u_{1-p/2}\, d_\tau\, \mathrm{MAE},\ \ \hat{y}_{n+\tau}(n) + u_{1-p/2}\, d_\tau\, \mathrm{MAE} \big), \tag{3.104}$$
$$d_\tau \approx 1.25 \left( \frac{1 + \dfrac{1-\beta}{(1+\beta)^3} \Big[ (1 + 4\beta + 5\beta^2) + 2\,(1-\beta)(1+3\beta)\,\tau + 2\,(1-\beta)^2 \tau^2 \Big]}{1 + \dfrac{1-\beta}{(1+\beta)^3} \Big[ (1 + 4\beta + 5\beta^2) + 2\,(1-\beta)(1+3\beta) + 2\,(1-\beta)^2 \Big]} \right)^{1/2}. \tag{3.105}$$
⋄
Remark 3.9 A natural extension of simple and double exponential smoothing is the
triple exponential smoothing (the local quadratic trend makes it necessary to introduce
the triple smoothing statistics $S_t^{[3]}$ in addition to $S_t$ and $S_t^{[2]}$). Even though
exponential smoothing of a general order $r$ is possible, the order $r = 3$ is the highest
that is used in practice. ⋄
$$\alpha_{\mathrm{Holt}} = \alpha\,(2 - \alpha), \qquad \gamma_{\mathrm{Holt}} = \frac{\alpha}{2 - \alpha}. \tag{3.110}$$
Example 3.9 Table 3.15 and Fig. 3.14 present double exponential smoothing (with
the fixed choice of smoothing constant α = 0.15) and Holt’s method (with the fixed
choice of smoothing constants α = 0.1 and γ = 0.2) in the time series $y_t$ of the US
nominal short-term interest rates (in % p.a.) for particular years 1961–2015 ($t = 1, \dots, 55$).
The smoothed values and the one-step-ahead prediction for year 2016 have been
obtained by EViews and can be compared with the corresponding results by moving
averages in Example 3.7 for the same data (see Fig. 3.11). ⋄
Table 3.15 Annual data 1961–2015 and smoothing and predicting by double exponential
smoothing and Holt’s method in Example 3.9 (US nominal short-term interest rates in % p.a.)
Year  t  yt  Doub. exp. (α = 0.15)  Holt (α = 0.1, γ = 0.2)  Year  t  yt  Doub. exp. (α = 0.15)  Holt (α = 0.1, γ = 0.2)
1961 1 2.37 2.37 2.26 1989 29 9.28 8.95 7.98
1962 2 2.77 2.77 2.88 1990 30 8.28 8.24 7.26
1963 3 3.17 3.17 3.30 1991 31 5.98 5.98 6.05
1964 4 3.57 3.53 3.62 1992 32 3.83 3.93 5.05
1965 5 3.97 4.18 3.91 1993 33 3.30 3.51 4.35
1966 6 4.86 4.43 4.46 1994 34 4.75 4.71 4.43
1967 7 4.30 4.67 5.21 1995 35 6.04 5.68 5.00
1968 8 5.35 5.43 5.63 1996 36 5.51 5.83 5.49
1969 9 6.74 6.52 5.43 1997 37 5.74 5.60 6.03
1970 10 6.28 6.03 5.34 1998 38 5.56 5.49 6.06
1971 11 4.32 4.49 5.79 1999 39 5.41 5.96 5.60
1972 12 4.18 4.76 5.97 2000 40 6.53 5.69 4.83
1973 13 7.19 6.77 5.95 2001 41 3.77 4.12 3.62
1974 14 7.89 7.49 5.94 2002 42 1.79 1.88 2.57
1975 15 5.77 6.15 5.99 2003 43 1.22 1.13 2.08
1976 16 5.00 4.93 6.07 2004 44 1.62 1.83 2.40
1977 17 5.33 5.47 6.19 2005 45 3.56 3.51 3.48
1978 18 7.37 7.45 7.80 2006 46 5.20 5.18 4.09
1979 19 10.11 9.75 9.90 2007 47 5.30 4.99 3.94
1980 20 11.56 12.33 11.14 2008 48 2.91 2.99 3.13
1981 21 13.97 12.77 11.70 2009 49 0.69 0.97 1.88
1982 22 10.60 11.10 11.39 2010 50 0.34 0.23 0.74
1983 23 8.67 9.20 10.24 2011 51 0.34 0.35 0.11
1984 24 9.54 8.99 8.87 2012 52 0.43 0.37 0.22
1985 25 8.38 8.32 7.66 2013 53 0.27 0.30 0.42
1986 26 6.83 7.15 7.71 2014 54 0.23 0.21 0.48
1987 27 7.19 7.06 8.03 2015 55 0.32 0.32 0.16
1988 28 7.98 8.23 8.09 2016 56 0.83 0.76
Source: AMECO (European Commission Annual Macro-Economic Database) (https://ec.europa.
eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
3.4 Exercises
Exercise 3.1 Repeat the analysis from Example 3.1 (the linear trend in the Swiss
gross national income) only for data since 1990 (hint: $343.3481 + 12.56640\,t$, $\hat{y}_{27} = 682.6$,
(652.5; 712.8)).
Exercise 3.2 Repeat the analysis from Example 3.2 (the exponential trend in US
gross national income) only for data since 1970 (hint: $2466.88 \cdot 1.04805^t$).
Fig. 3.14 Annual data 1961–2015 and smoothing and predicting by double exponential smoothing
and Holt’s method in Example 3.9 (US nominal short-term interest rates in % p.a.)
Exercise 3.3 Repeat the analysis from Example 3.3 (the modified exponential trend
in Japan gross national income) only for data since 1970 (hint: $511\,354 - 770\,662 \cdot 0.862404^t$,
saturation at 511 360).
Exercise 3.4 Eliminate numerically the logistic trend in Example 3.4 (the Japan
gross national income).
Exercise 3.5 Eliminate numerically Gompertz trend in Example 3.5 (the Japan
gross national income).
Exercise 3.6 Repeat the analysis from Example 3.7 (the moving averages for the
US nominal short-term interest rates) only for data since 1981.
Exercise 3.7 Derive the formulas (3.66) and (3.67) in the case of arithmetic moving
averages of length 5.
Exercise 3.8 Repeat the analysis from Example 3.8 (the simple exponential
smoothing for the annual averages of exchange rates USD/EUR(ECU)) only for
data since 1980 (these data are not presented numerically in the monograph; hint:
α = 0.999).
Exercise 3.9 Repeat the analysis from Example 3.9 (the double exponential
smoothing and Holt’s method for the US nominal short-term interest rates) only
for data since 1981.
Chapter 4
Seasonality and Periodicity
This deals with the elimination of seasonal component describing periodic changes
in time series which pass off during one calendar year and repeat themselves each
year. Even if the moving averages from Sect. 3.2 are capable of eliminating the
seasonality significantly (e.g., the monthly centered moving averages (3.68) have
such an effect in the case of monthly seasonal observations), an effective seasonal
analysis should moreover deliver so-called seasonal indices I1, I2, . . . , Is (s denotes
the length of season, i.e., s ¼ 12 in the case of monthly observations). These indices
model the seasonality in particular seasons, and, moreover, they can be used not only
to eliminate the seasonal phenomenon but also to construct predictions. However,
they have sense only under the assumption that the seasonality is really regular so
that its modeling by repeating seasonal indices is justified for the given time series.
The seasonal indices have the following properties:
• The units in which the seasonal indices are measured depend on the type of
decomposition. When the decomposition is
– additive (i.e., $y_t = Tr_t + Sz_t + E_t$): the seasonal index is measured in the same units as the
corresponding time series $y_t$ (e.g., the December seasonal index of a retail sale
amounting to EUR 45m means that the seasonality manifests itself by the
December increase of the time series by EUR 45m above the average trend
behavior);
– multiplicative (i.e., $y_t = Tr_t \cdot Sz_t \cdot E_t$): the seasonal index is a relative variable (e.g., the
December seasonal index of a retail sale amounting to 1.38 means that the seasonality
manifests itself by the December increase of the time series by 38% above the
trend).
• It is typical for the multiplicative decomposition that the seasonal fluctuations
increase (decrease) with increasing (decreasing) trend, respectively, even if the
seasonal indices repeat themselves regularly in particular seasons (in the case of
additive decomposition, the seasonal fluctuation does not depend on the trend
monotonicity; see Fig. 4.1).
[Fig. 4.1 panels: additive decomposition; multiplicative decomposition]
• The relation of trend and seasonal component is not determined unambiguously:
one of them can be shifted upward in an arbitrary way, if it is offset by shifting the
second one downward, and vice versa. This ambiguity is removed when the
seasonal indices are normalized. Such a normalization of seasonal indices differs
again according to the type of decomposition. When the decomposition is:
– additive: then the usual normalization rule demands that the sum of seasonal
indices over each season must be equal to zero; e.g., monthly observations
must fulfill for each $i \ge 0$
– multiplicative: then the usual normalization rule demands that the product of
seasonal indices over each season must be equal to one; e.g., monthly
observations must fulfill for each $i \ge 0$
(obviously after taking logarithms, this rule transfers to the form (4.1)), or
occasionally
1. One constructs the centered moving averages $\bar{y}_t^{(12)}$ (in the case of quarterly
observations, one should construct the centered moving averages $\bar{y}_t^{(4)}$). At the
beginning and at the end of the time series, one can repeat the first and the last
calculable centered moving average, respectively (if necessary).
2. The centered moving averages can be looked upon as a raw estimate of the trend
component, which makes it possible to eliminate the trend from the data:
$$y_t^* = y_t - \bar{y}_t^{(12)}. \tag{4.4}$$
3. One constructs the (non-normalized) seasonal indices $I_1, I_2, \dots, I_{12}$, where the
seasonal index $I_j$ for the $j$th month is estimated as the arithmetic average of all
values $y_t^*$ which correspond to the $j$th month over all years included in the time series
($j = 1, \dots, 12$).
4. One normalizes the values $I_1, I_2, \dots, I_{12}$ by subtracting their arithmetic mean:
$$I_j^* = I_j - \bar{I} = I_j - \frac{I_1 + \dots + I_{12}}{12}, \quad j = 1, \dots, 12, \tag{4.5}$$
1. One constructs the centered moving averages $\bar{y}_t^{(12)}$ similarly as in the case of
additive decomposition (see above).
2. One eliminates the trend from the data:
$$y_t^* = \frac{y_t}{\bar{y}_t^{(12)}}. \tag{4.7}$$
3. One constructs the (non-normalized) seasonal indices $I_1, I_2, \dots, I_{12}$, where the
seasonal index $I_j$ for the $j$th month is estimated as the arithmetic average of all
values $y_t^*$ which correspond to the $j$th month over all years included in the time series
($j = 1, \dots, 12$).
4. One normalizes the values $I_1, I_2, \dots, I_{12}$ by dividing by their geometric mean:
$$I_j^* = \frac{I_j}{\hat{I}} = \frac{I_j}{\sqrt[12]{I_1 \cdots I_{12}}}, \quad j = 1, \dots, 12, \tag{4.8}$$
$$\hat{y}_t^{(12)} = \frac{y_t}{I_j^*}, \tag{4.9}$$
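The steps of the simple approach (centered moving averages, removal of the trend, averaging per season, normalization) can be sketched as follows. The sketch assumes NumPy, an even season length, strictly positive data in the multiplicative case and at least two full seasons of data; the function name is hypothetical. As a simplification, the ends of the series where the centered moving average is not calculable are left out of the averages rather than filled by repeating the first and last calculable value.

```python
import numpy as np


def seasonal_indices(y, season, multiplicative=False):
    """Normalized seasonal indices; entry j belongs to positions j, j + season, ..."""
    y = np.asarray(y, dtype=float)
    n, m = len(y), season // 2
    weights = np.full(season + 1, 2.0)
    weights[0] = weights[-1] = 1.0
    weights /= 2 * season                           # centered moving average weights
    trend = np.full(n, np.nan)
    trend[m:n - m] = np.convolve(y, weights, mode="valid")
    detrended = y / trend if multiplicative else y - trend
    raw = np.array([np.nanmean(detrended[j::season]) for j in range(season)])
    if multiplicative:
        return raw / np.exp(np.mean(np.log(raw)))   # divide by the geometric mean
    return raw - raw.mean()                         # subtract the arithmetic mean


t = np.arange(24)
pattern = np.array([3.0, -1.0, -4.0, 2.0])
series = 10 + 0.5 * t + pattern[t % 4]              # artificial quarterly data
print(np.round(seasonal_indices(series, 4), 6))     # recovers the pattern
```

Dividing (or, in the additive case, subtracting) the index that belongs to the season of each observation then gives the seasonally adjusted series.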
Box–Jenkins methodology, and the like. ⋄
Example 4.1 Table 4.1 and Fig. 4.2 present the additive elimination of seasonality
in the time series $y_t$ of the Czech construction production index for particular quarters
2009Q1–2016Q4 ($t = 1, \dots, 32$). Table 4.1 also shows numerically the estimated
seasonal indices $I_1, \dots, I_4$ according to formulas (4.4)–(4.6). ⋄
Example 4.2 Figure 4.3 presents the multiplicative elimination of seasonality in the
time series $y_t$ of the job applicants kept in the Czech labor office register for
particular months 2005M1–2016M12 ($t = 1, \dots, 144$; see also Table 4.4). The
output of EViews in Table 4.2 shows the estimated seasonal indices $I_1, \dots, I_{12}$
according to formulas (4.7)–(4.9). ⋄
Table 4.1 Quarterly data 2009Q1–2016Q4 and the simple approach to additive seasonal
elimination in Example 4.1 (Czech construction production index)
Fig. 4.2 Quarterly data 2009Q1–2016Q4 and the simple approach to the additive seasonal
elimination in Example 4.1 (Czech construction production index)
Fig. 4.3 Monthly data 2005M1–2016M12 and the simple approach to multiplicative seasonal
elimination in Example 4.2 (job applicants kept in the Czech labor office register); see Table 4.4.
Source: Czech Statistical Office
The regression approaches differ from the simple approaches of Sect. 4.1.1 only by
estimating the seasonal indices using more sophisticated regression models.
where the dummies x2, x3, x4 are defined by the following table:
The estimated model with OLS estimates b0, b1, a2, a3, a4 can be used to construct
the point and interval predictions, for which the future values of dummies are
obtained by natural extension of the previous table till the corresponding prediction
horizon. Moreover, if one needs the seasonal indices explicitly, then their normal-
ization (4.3) is possible in the form
where
$$\bar{a} = \frac{a_2 + a_3 + a_4}{4}. \tag{4.12}$$
Table 4.3 Quarterly data 2009Q1–2016Q4 and the regression approach to additive seasonal
elimination in Example 4.3 (Czech construction production index); see Table 4.1
Example 4.3 Table 4.3 and Fig. 4.4 present the additive elimination of seasonality
in the time series $y_t$ of the Czech construction production index for particular quarters
2009Q1–2016Q4 ($t = 1, \dots, 32$) using the regression approach to seasonality (see
also Example 4.1 for the same data but applying the simple approach to additive
seasonality from Sect. 4.1.1). Again Table 4.3 shows numerically the seasonal
indices $I_1, \dots, I_4$ estimated according to formulas (4.10)–(4.12) with the only
difference that the model using a quadratic trend has been applied instead of the linear
trend used in (4.10), i.e.,
Fig. 4.4 Quarterly data 2009Q1–2016Q4 and the regression approach to the additive seasonal
elimination including predictions for data 2017Q1–2017Q4 in Example 4.3 (Czech construction
production index)
Table 4.3 and Fig. 4.4 again present the values $\hat{y}_t^{(4)}$ after seasonal elimination (4.6).
Finally, the predictions for year 2017 are shown in Fig. 4.4, e.g., for 2017Q1
calculated as
⋄
4.1.2.2 Seasonality Modeled by Goniometric Functions
The goniometric functions enable us to model the seasonality with length of season
$s$ explicitly by means of models of the form
$$y_t = \beta_0 + \beta_1 t + \beta_2 \sin\frac{2\pi t}{s} + \beta_3 \cos\frac{2\pi t}{s} + \varepsilon_t \tag{4.13}$$
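The model (4.13) is linear in its parameters, so it can be estimated by ordinary least squares. The following is a minimal sketch, assuming NumPy and the time index $t = 1, \dots, n$; the function name and the artificial data are hypothetical.

```python
import numpy as np


def fit_goniometric_seasonality(y, s):
    """OLS estimates (b0, b1, b2, b3) of the model (4.13) with season length s."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1)
    design = np.column_stack([
        np.ones(len(y)),               # intercept
        t,                             # linear trend
        np.sin(2 * np.pi * t / s),     # sine term
        np.cos(2 * np.pi * t / s),     # cosine term
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


t = np.arange(1, 49)
monthly = 5 + 0.1 * t + 2 * np.sin(2 * np.pi * t / 12) - np.cos(2 * np.pi * t / 12)
print(np.round(fit_goniometric_seasonality(monthly, 12), 6))
```

On the noise-free artificial series the four coefficients used to generate the data are recovered; with real data one would add the residual analysis and, where needed, further harmonics to the design matrix.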
Table 4.4 Monthly data 2005M1–2016M12 and the multiplicative seasonal elimination by
Holt–Winters’ method including predictions for data 2017M1–2017M12 in Example 4.4 (job
applicants kept in the Czech labor office register)
obs  yt  ŷt  obs  yt  ŷt  obs  yt  ŷt
2005M1 561 662 561 401 2009M5 457 561 415 928 2013M9 557 058 560 632
2005M2 555 046 560 999 2009M6 463 555 436 915 2013M10 556 681 557 446
2005M3 540 456 543 236 2009M7 485 319 471 992 2013M11 565 313 566 554
2005M4 512 557 514 050 2009M8 493 751 490 025 2013M12 596 833 606 851
2005M5 494 576 492 728 2009M9 500 812 500 613 2014M1 629 274 636 620
2005M6 489 744 483 847 2009M10 498 760 500 469 2014M2 625 390 639 169
2005M7 500 325 496 289 2009M11 508 909 512 454 2014M3 608 315 619 771
2005M8 505 254 497 258 2009M12 539 136 554 066 2014M4 574 908 585 569
2005M9 503 396 498 453 2010M1 574 226 586 434 2014M5 549 973 560 130
2005M10 491 878 492 879 2010M2 583 135 594 621 2014M6 537 179 546 537
2005M11 490 779 496 296 2010M3 572 824 585 490 2014M7 541 364 554 316
2005M12 510 416 524 791 2010M4 540 128 562 720 2014M8 535 225 549 561
2006M1 531 235 549 598 2010M5 514 779 539 075 2014M9 529 098 545 301
2006M2 528 154 539 345 2010M6 500 500 526 087 2014M10 519 638 533 373
2006M3 514 759 519 664 2010M7 505 284 534 095 2014M11 517 508 531 787
2006M4 486 163 489 954 2010M8 501 494 526 799 2014M12 541 914 556 343
2006M5 463 042 468 218 2010M9 500 481 517 533 2015M1 556 191 579 989
2006M6 451 106 456 462 2010M10 495 161 502 021 2015M2 548 117 567 196
2006M7 458 270 461 290 2010M11 506 640 503 480 2015M3 525 315 540 863
2006M8 458 729 458 063 2010M12 561 551 536 489 2015M4 491 585 502 268
2006M9 454 182 453 279 2011M1 571 853 590 852 2015M5 465 689 473 425
2006M10 439 788 442 873 2011M2 566 896 589 323 2015M6 451 395 456 122
2006M11 432 573 441 332 2011M3 547 762 568 253 2015M7 456 341 456 930
2006M12 448 545 460 103 2011M4 513 842 533 396 2015M8 450 666 450 923
2007M1 465 458 479 713 2011M5 489 956 504 445 2015M9 441 892 446 274
2007M2 454 737 469 777 2011M6 478 775 488 996 2015M10 430 432 435 493
2007M3 430 474 448 495 2011M7 485 584 495 685 2015M11 431 364 431 793
2007M4 402 932 414 079 2011M8 481 535 491 332 2015M12 453 118 451 977
2007M5 382 599 388 422 2011M9 475 115 485 599 2016M1 467 403 469 766
2007M6 370 791 374 060 2011M10 470 618 471 305 2016M2 461 254 463 410
2007M7 376 608 374 935 2011M11 476 404 473 334 2016M3 443 109 444 681
2007M8 372 759 370 801 2011M12 508 451 505 004 2016M4 414 960 415 571
2007M9 364 978 363 714 2012M1 534 089 531 607 2016M5 394 789 393 370
2007M10 348 842 351 060 2012M2 541 685 532 930 2016M6 384 328 381 009
2007M11 341 438 345 060 2012M3 525 180 522 306 2016M7 392 667 384 149
2007M12 354 878 356 754 2012M4 497 322 496 235 2016M8 388 474 381 810
2008M1 364 544 370 824 2012M5 482 099 476 241 2016M9 378 258 379 133
2008M2 355 033 361 174 2012M6 474 586 469 617 2016M10 366 244 370 435
2008M3 336 297 342 850 2012M7 485 597 482 137 2016M11 362 755 367 779
2008M4 316 118 317 381 2012M8 486 693 483 795 2016M12 381 373 382 580
2008M5 302 507 298 817 2012M9 493 185 483 997 2017M1 393 632
(continued)
4.1 Seasonality in Time Series 97
If the form of seasonality is more complex one can add components of the form
$$\beta_4 \sin\frac{4\pi t}{s} + \beta_5 \cos\frac{4\pi t}{s}. \qquad (4.15)$$
This method extends Holt’s method from Sect. 3.3.3 so that not only the local linear trend but also the seasonality is handled adaptively. Therefore, both versions of Holt–Winters’ method (additive and multiplicative) use three smoothing constants: α to smooth the level $L_t$, γ to smooth the slope $T_t$, and δ to smooth the seasonal index $I_t$ of the given time series with length of season s (0 < α, γ, δ < 1); see, e.g., Abraham and Ledolter (1983), Bowerman and O’Connell (1987), Montgomery and Johnson (1976), and others.
Like Holt’s method in Sect. 3.3.3, this method was initially suggested ad hoc, using logical arguments only. For example, the eliminated seasonal index $I_t$ of the given time series at time t is constructed according to (4.18) as a convex combination of two items, namely (1) an estimate of this seasonal index constructed at time t by removing the trend component from the observed value $y_t$ (i.e., $y_t - L_t$) and (2) an estimate of this seasonal index constructed at time t − 1 using the most recent estimated value $I_{t-s}$ from the previous season. One proceeds in a similar way in (4.16) to remove the seasonal component from the observed value $y_t$ (i.e., $y_t - I_{t-s}$), using again the most recent estimated value $I_{t-s}$ from the previous season, and in (4.20) to predict (in the case of predictions, one must distinguish in (4.20) particular future seasons, keeping in mind that forecasts for horizons too remote in the future may be unreliable).
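The verbal description above can be turned into a short sketch. The recursions (4.16)–(4.20) are not reproduced in this passage, so the code assumes the standard additive form that matches the description: the level is smoothed from $y_t - I_{t-s}$, the slope from $L_t - L_{t-1}$, and the seasonal index from $y_t - L_t$. The function names and argument order are illustrative only.

```python
def holt_winters_additive(y, s, alpha, gamma, delta, level0, trend0, seasonal0):
    """Additive Holt-Winters smoothing (sketch, assumed standard recursions).

    seasonal0 holds the s initial seasonal indices I_{-s+1}, ..., I_0.
    Returns the final level, slope, the last s seasonal indices, and the
    one-step-ahead predictions of y_1, ..., y_n.
    """
    if len(seasonal0) != s:
        raise ValueError("seasonal0 must contain exactly s values")
    level, trend = level0, trend0
    seasonal = list(seasonal0)
    fitted = []
    for t, obs in enumerate(y, start=1):
        pos = (t - 1) % s                 # slot that currently stores I_{t-s}
        old_seasonal = seasonal[pos]
        fitted.append(level + trend + old_seasonal)
        # level: deseasonalized observation versus previous level plus slope
        new_level = alpha * (obs - old_seasonal) + (1 - alpha) * (level + trend)
        # slope: latest change of level versus previous slope
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        # seasonal index: detrended observation versus index one season back
        seasonal[pos] = delta * (obs - new_level) + (1 - delta) * old_seasonal
        level = new_level
    return level, trend, seasonal, fitted


def forecast_additive(level, trend, seasonal, n, s, horizon):
    """Predictions for times n+1, ..., n+horizon using the latest indices."""
    return [level + tau * trend + seasonal[(n + tau - 1) % s]
            for tau in range(1, horizon + 1)]
```

With exact initial values and a series that is exactly "linear trend plus fixed seasonal pattern", the recursions reproduce the data and the forecasts without error, which is a convenient sanity check of an implementation.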
To start the recursive formulas of additive Holt–Winters’ method, one must
choose initial values $L_0$, $T_0$, $I_{-s+1}, I_{-s+2}, \ldots, I_0$ and smoothing constants α, γ, δ:
1. Suitable initial values can be found simply if one models the seasonality by
dummies as in Sect. 4.1.2 (the normalization is not here necessary)
with dummy variables x2, . . ., xs, so that using OLS estimates b0, b1, a2, . . ., as
one can put
In comparison with the previous additive Holt–Winters’ method, one only replaces
sums and differences within brackets in (4.16)–(4.20) by products and quotients,
respectively.
To start the recursive formulas of multiplicative Holt–Winters’ method, one must
again choose initial values $L_0$, $T_0$, $I_{-s+1}, I_{-s+2}, \ldots, I_0$ and smoothing constants α, γ, δ:
1. Suitable initial values can be found by means of simple formulas
$$T_0 = \frac{\bar y_m - \bar y_1}{(m-1)s}, \qquad L_0 = \bar y_1 - \frac{s+1}{2}\,T_0,$$
$$I_{j-s} = \frac{1}{m}\sum_{i=0}^{m-1} \frac{y_{j+si}}{\bar y_{i+1} - \left(\frac{s+1}{2} - j\right) T_0}, \quad j = 1, \ldots, s, \qquad (4.28)$$
where $\bar y_i$ is the arithmetic average of observations over the ith season (of length s)
and m is the total number of these seasons.
2. Suitable smoothing constants α, γ, δ can be found in the same way as in the case
of additive Holt–Winters’ method.
Remark 4.2 Using notation similar to Remark 3.6, it is recommended to construct the (1 − p)100% prediction interval in the form
$$\left(\hat y_{n+\tau}(n) - u_{1-p/2}\, d_\tau\, \mathrm{MAE},\ \ \hat y_{n+\tau}(n) + u_{1-p/2}\, d_\tau\, \mathrm{MAE}\right), \qquad (4.29)$$
dτ 1, 25 @ θ
A ð4:30Þ
1 þ ð1þν Þ3
ð1 þ 4ν þ 5ν2 Þ þ 2θð1 þ 3νÞ þ 2θ2
$$\mathrm{MAE} = \frac{1}{n-s}\sum_{t=s+1}^{n}\left|\, y_t - (L_{t-1} + T_{t-1})\, I_{t-s} \right|. \qquad (4.31)$$
⋄
Example 4.4 Table 4.4 and Fig. 4.5 present the multiplicative elimination of
seasonality by Holt–Winters’ method (with the fixed choice of smoothing constants
α = δ = 0.4 and γ = 0.1) in the time series $y_t$ of job applicants kept in the Czech labor
office register for particular months 2005M1–2016M12 (t = 1, …, 144) including
predictions for data 2017M1–2017M12. The smoothed values and predictions have
been obtained by EViews and can be compared with the corresponding results by the
simple approach to multiplicative seasonal elimination in Example 4.2.
Fig. 4.5 Monthly data 2005M1–2016M12 and the multiplicative seasonal elimination by Holt–Winters’ method in Example 4.4 (job applicants kept in the Czech labor office register), see Table 4.4. Source: Czech Statistical Office
$$y = x + z + u \qquad (4.33)$$
(all these vectors are column vectors of type T × 1). The criteria that characterize the
particular decomposition components are recommended in the following form (see
Schlicht (1982)):
1. Criterion minimizing the measure of trend smoothness $f: \mathbb{R}^T \to \mathbb{R}$ (over $x \in \mathbb{R}^T$):
$$f(x) = \alpha \sum_{t=3}^{T} \left(\Delta^2 x_t\right)^2 = \alpha \sum_{t=3}^{T} (x_t - 2x_{t-1} + x_{t-2})^2. \qquad (4.34)$$
$$g(z) = \beta \sum_{t=s+1}^{T} (z_t - z_{t-s})^2 + \gamma \sum_{t=s}^{T} \left(\sum_{\tau=0}^{s-1} z_{t-\tau}\right)^2. \qquad (4.35)$$
$$h(u) = u'u = \sum_{t=1}^{T} u_t^2. \qquad (4.36)$$
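$$H \begin{pmatrix} x \\ z \end{pmatrix} = \begin{pmatrix} y \\ y \end{pmatrix}, \qquad (4.38)$$
where
$$H = \begin{pmatrix} \alpha P'P + I & I \\ I & \beta Q'Q + \gamma R'R + I \end{pmatrix} \qquad (4.39)$$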
(these are band matrices that have zero elements with the exception of main diagonal
and several upper diagonals that are formed by the same elements, e.g., by 2 and 1
in the case of P; here only nonzero bands are shown in P, Q, and R).
In the decomposition model (2.2.2), the cyclical component Ct can sometimes play
an important role. It is the periodic component with periodicity longer than one year,
e.g., the five-year business cycle from Sect. 2.2.2 (the annual periodicity is classified
as seasonality). Sometimes there are even several such periodicities compounded in
the given time series. Their elimination is complex since one must decide on the
number and length of corresponding periodicities (e.g., quarterly one in combination
with annual and five-year periodicities using monthly observations). As objective
instruments in such situations one can apply various tests of periodicity which are
usually based on a so-called periodogram.
The periodogram (as well as the spectral density) is an important instrument of
spectral analysis of time series (see Sect. 2.2.2). Spectral analysis transfers the time
domain (which looks upon the given time series as a sequence of observations in
time) to the spectral domain (which looks upon the given time series as an (infinite)
mixture of periodic components and calculates their intensities in this mixture).
More specifically, the periodogram I(ω) of time series y1, y2, . . ., yn is a function
of the frequency ω (such functions are typical for the spectral domain, while
functions of time are used in the time domain). The frequency ω is usually measured
by radians per time unit (this time unit corresponds to the time interval between
neighboring observations, e.g., one year for an annual time series). Then ω /2π is the
number of cycles per one time unit. For example, the five-year periodicity in an
annual time series, where one-fifth of cycle occurs per one year, has the frequency 2π
/5. It is obvious that by observing a given time series one is capable of recognizing
statistically only the frequencies ranging maximally to π radians per time unit, i.e., to
one-half cycle per time unit (this upper limit is called Nyquist frequency); the
“quicker” frequencies remain hidden from the point of view of the grid of observed
values (e.g., the “quick” frequency 2π radians per time unit of the time series $y_t = \sin(2\pi t)$
observed at times t = 1, …, n with one cycle per each time unit obviously
cannot be identified from the observed zero values $y_t = 0$ for t = 1, …, n).
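Numerically, the periodogram I(ω) is defined as
$$I(\omega) = \frac{1}{4\pi}\left(a^2(\omega) + b^2(\omega)\right), \quad 0 \le \omega \le \pi, \qquad (4.40)$$
where
$$a(\omega) = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \cos(\omega t), \qquad b(\omega) = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \sin(\omega t). \qquad (4.41)$$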
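$$y_t = \mu + \sum_{i=1}^{k} \delta_i \cos(\omega_i t + \varphi_i) + \varepsilon_t = \mu + \sum_{i=1}^{k} \left(\alpha_i \cos(\omega_i t) + \beta_i \sin(\omega_i t)\right) + \varepsilon_t, \quad t = 1, \ldots, n, \qquad (4.42)$$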
where μ denotes the level of this time series, $\omega_1, \ldots, \omega_k$ are the mutually different
(in general unknown) frequencies from the interval (0, π) for the k periodic components
contained in (4.42), $\varphi_1, \ldots, \varphi_k$ are the corresponding phases (i.e., the shifts of
cosinusoids from the origin for particular periodic components), and $\varepsilon_t$ is the residual
component in the form of white noise with variance σ². Then the periodogram of
time series (4.42) fluctuates around the constant σ²/(2π) with the exception of the
frequencies $\omega_1, \ldots, \omega_k$, at which the periodogram rockets to local extremes comparable
with the size of n (i.e., they are of order O(n)). Therefore, the periodogram can
indicate by its “bursts” the positions of the frequencies $\omega_1, \ldots, \omega_k$.
In practice, a graphical search for the local extremes of the periodogram may be
subjective. Moreover, a practical realization of the periodogram typically fluctuates
strongly because it is not a consistent estimator of the spectrum (i.e., its variance need not
decrease with the increasing length of the time series). Therefore, in practice one should
prefer suitable statistical tests. The best-known test of this type is Fisher’s test of
periodicity, which tests the null hypothesis
$$H_0: \ y_t = \mu + \varepsilon_t, \quad t = 1, \ldots, n \qquad (4.43)$$
with normally distributed white noise $\{\varepsilon_t\}$ against the alternative hypothesis
(4.42) with a given significance level α. The test statistic is constructed using the
periodogram values over the grid of frequencies
$$\omega_j = \frac{2\pi j}{n}, \quad j = 1, \ldots, m, \qquad (4.44)$$
where $m = \left[\frac{n-1}{2}\right]$ is the integer part of (n − 1)/2, and has the form
$$W = \max_{j=1,\ldots,m} Y_j = Y_{j^*} = \max_{j=1,\ldots,m} \frac{I(\omega_j)}{I(\omega_1) + \ldots + I(\omega_m)} \qquad (4.45)$$
(i.e., the test statistic equals the maximum standardized value of the periodogram
over the grid (4.44), achieved for the index value denoted as $j^*$).
The critical region of Fisher’s test of periodicity with significance level α is then
$$W \ge g_\alpha, \qquad (4.46)$$
where $g_\alpha$ is the critical value of this test (see the tabulated values in Table 4.5). When
the inequality (4.46) holds, we have simultaneously found the frequency of the
periodic component that causes the rejection of the null hypothesis (4.43): it is the grid
point (4.44) for $j = j^*$. Repeating the test after removing this frequency (i.e., for
m − 1), we can find further frequencies in the analyzed time series. The practical
realization of Fisher’s test of periodicity is described in Example 4.5.
Remark 4.3 The distribution of the test statistic under the null hypothesis in Fisher’s
test is complex. However, for m ≤ 50 a simple approximation holds, namely
There exist various modifications of Fisher’s test, which have been suggested to
improve the power of this test, in particular when the periodicity is compounded,
k > 1 (e.g., Siegel’s test (1980), Bølviken’s test (1983), and others).
⋄
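A possible implementation of Fisher’s test is sketched below. Two points are assumptions rather than statements of the text: the function name and return values are illustrative, and the critical value is computed from the commonly quoted first-term approximation P(W > g) ≈ m(1 − g)^(m−1), which may differ in form from the approximation (4.47) not reproduced here. Tabulated values such as those in Table 4.5 should be preferred when available.

```python
import numpy as np


def fisher_test(y, alpha=0.05):
    """Fisher's test of periodicity over the grid (4.44) (sketch).

    Returns (W, j_star, omega_j_star, g_alpha, reject), where g_alpha is an
    approximate critical value solving m * (1 - g)**(m - 1) = alpha.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    m = (n - 1) // 2                              # integer part of (n - 1) / 2
    t = np.arange(1, n + 1)
    omegas = 2.0 * np.pi * np.arange(1, m + 1) / n
    a = np.sqrt(2.0 / n) * (np.cos(np.outer(omegas, t)) @ y)
    b = np.sqrt(2.0 / n) * (np.sin(np.outer(omegas, t)) @ y)
    I = (a ** 2 + b ** 2) / (4.0 * np.pi)         # periodogram on the grid
    Y = I / I.sum()                               # standardized values Y_j
    j_star = int(np.argmax(Y)) + 1                # index attaining the maximum
    W = float(Y[j_star - 1])
    g_alpha = 1.0 - (alpha / m) ** (1.0 / (m - 1))
    return W, j_star, float(omegas[j_star - 1]), g_alpha, bool(W >= g_alpha)
```

For m = 10 and α = 0.05 the approximate critical value is about 0.445.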
If in the time series $y_1, \ldots, y_n$ the frequencies $\hat\omega_1, \ldots, \hat\omega_k \in \{\omega_1, \ldots, \omega_m\}$ have
been indicated by Fisher’s test, then the OLS estimates in the resulting model
(4.42) are very simple due to the orthogonality of its regressors, namely
$$\hat\mu = \frac{1}{n}\sum_{t=1}^{n} y_t, \qquad \hat\alpha_j = \frac{2}{n}\sum_{t=1}^{n} y_t \cos(\hat\omega_j t), \qquad \hat\beta_j = \frac{2}{n}\sum_{t=1}^{n} y_t \sin(\hat\omega_j t), \quad j = 1, \ldots, k. \qquad (4.48)$$
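The estimates (4.48) need no matrix inversion, as the following sketch illustrates. The helper `amplitude_phase` is an illustrative addition: it converts a pair (α, β) into the amplitude δ and the shift φ for which α cos(ωt) + β sin(ωt) = δ cos(ωt − φ).

```python
import numpy as np


def harmonic_ols(y, omegas):
    """OLS estimates (4.48) for frequencies taken from the grid (4.44)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(1, n + 1)
    mu = float(y.mean())
    alpha = [float(2.0 / n * np.sum(y * np.cos(w * t))) for w in omegas]
    beta = [float(2.0 / n * np.sum(y * np.sin(w * t))) for w in omegas]
    return mu, alpha, beta


def amplitude_phase(alpha, beta):
    """Return (delta, phi) with alpha*cos(x) + beta*sin(x) = delta*cos(x - phi)."""
    return float(np.hypot(alpha, beta)), float(np.arctan2(beta, alpha))
```

For instance, the coefficient pair (1.06, 0.42) corresponds to an amplitude of about 1.14 and a shift of about 0.377 radians.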
$$W = \max_{j=1,\ldots,10} Y_j = Y_3 = \frac{I(\omega_3)}{I(\omega_1) + \ldots + I(\omega_{10})} = \frac{1.0838}{2.1645} = 0.5007. \qquad (4.49)$$
Using the tabulated critical values from Table 4.5, it holds for m = 10
The approximation (4.47) can replace the tabulated critical value (since m = 9 ≤ 50)
so that the presence of a further periodic component with frequency $\omega_7 = 2\pi \cdot 7/21 = 2\pi/3$
cannot be confirmed with significance level of 5% (obviously, this component
would model a 3-day periodicity).
Finally by (4.48), the model (4.42) can be estimated in the form (see also Fig. 4.6)
$$\hat y_t = 3.01 + 1.06 \cos\left(\frac{2\pi}{7}t\right) + 0.42 \sin\left(\frac{2\pi}{7}t\right) = 3.01 + 1.14 \cos\left(\frac{2\pi}{7}t - 0.3772\right). \qquad (4.53)$$
Fig. 4.6 Numbers of defective pieces in daily production (in thousands) and estimated periodicity
by means of (4.53) in Example 4.5
4.3 Transformations of Time Series
In practice, the analyzed time series are sometimes transformed in a suitable way to
simplify the decomposition of the time series after transformation. Moreover, such
transformations may be useful not only in the framework of decomposition. Two
examples will be given here: Box–Cox transformation and the transformation based
on differencing.
• It homogenizes the variance of the given time series (including the seasonal
variance), so that it becomes (approximately) constant in time.
• It makes the skewed distribution of the given time series symmetric (or even normal,
which makes it easy, e.g., to construct prediction intervals).
• It linearizes a given model of the time series (frequently in the framework of Box–
Jenkins methodology; see Chap. 6).
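The usual form of Box–Cox transformation is
$$y_t^{(\lambda)} = \begin{cases} \dfrac{(y_t + c)^\lambda - 1}{\lambda} & \text{for } \lambda \ne 0, \\[2mm] \ln(y_t + c) & \text{for } \lambda = 0. \end{cases} \qquad (4.54)$$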
Here the level parameter c > 0 may be fixed in such a way that $y_t + c > 0$ holds,
while the type parameter $\lambda \in \mathbb{R}$ plays the key role in this transformation, e.g., it is
obviously
$$\lim_{\lambda \to 0} \frac{(y_t + c)^\lambda - 1}{\lambda} = \ln(y_t + c) \qquad (4.55)$$
(therefore the index λ participates in the symbol $y_t^{(\lambda)}$ denoting the transformed time
series). Even though the parameter value λ that homogenizes the variance of the
given time series can be estimated by the maximum likelihood method, practical
applications frequently prefer more subjective approaches based on considerations
of the following type. Since it holds
$$\mathrm{var}\left(y_t^{(\lambda)}\right) \approx \left(\left.\frac{\mathrm{d}y_t^{(\lambda)}}{\mathrm{d}y_t}\right|_{y_t = \bar y}\right)^2 \mathrm{var}(y_t) \qquad (4.56)$$
one obtains from (4.56) and (4.57) the following important relation for the sample
standard deviation $s_y$ of time series $y_t$:
$$s_y \approx k\, \bar y^{\,1-\lambda}. \qquad (4.58)$$
For example, the logarithmic transformation (4.55) with λ = 0 will make the
considered time series homogeneous if the relation $s_y \approx k\,\bar y$ holds approximately
between the sample standard deviation $s_y$ and the sample mean $\bar y$, and similarly for
other values of λ.
Fig. 4.7 (plot of the segment standard deviations $s_y(j)$ against the segment means $\bar y(j)$; curves labeled λ = 0, 0 < λ < 1, and λ = 1)
Therefore, in practice it is recommended to divide the given time series into short
segments of the same length (logically, the length of segments may be 4 for quarterly
time series and 12 for monthly time series). In each segment, the sample mean $\bar y(j)$
and the sample standard deviation $s_y(j)$ are calculated (j denotes the jth segment),
and one plots a point with these coordinates in the plane (i.e., one has a system of
points corresponding to particular segments). Finally, a smooth curve is fitted
subjectively to this system of points in the plane (see Fig. 4.7). According to the
shape of such a curve, one selects a value for the parameter λ and decides in this way
on the corresponding form of Box–Cox transformation for the given time series (see
Table 4.8). The power transformation with λ > 1 gives a hyperbolic shape of the
corresponding curve (this case is not usual in practice and is ignored here).
Fig. 4.8 Monthly data 2014M1–2016M12 in Example 4.6 (job applicants kept in the Czech labor
office register): (a) before transformation; (b) after logarithmic transformation
Fig. 4.9 (plot of the segment standard deviation against the segment average in Example 4.6)
Fig. 4.10 Histogram of job applicants kept in the Czech labor office register in Example 4.6:
(a) before transformation; (b) after logarithmic transformation
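The segment-based choice of λ can be mimicked numerically. The sketch below is not the subjective curve fitting described above: it swaps in a least-squares line through the points (ln ȳ(j), ln s_y(j)), whose slope estimates 1 − λ by relation (4.58). The function names are illustrative, and `box_cox` implements (4.54) with the shift c as an optional argument.

```python
import numpy as np


def box_cox(y, lam, c=0.0):
    """Box-Cox transformation (4.54) of y with type parameter lam and shift c."""
    shifted = np.asarray(y, dtype=float) + c
    if np.any(shifted <= 0):
        raise ValueError("y + c must be positive")
    if lam == 0:
        return np.log(shifted)
    return (shifted ** lam - 1.0) / lam


def suggest_lambda(y, seg_len):
    """Rough lambda from segment means and standard deviations.

    Fits ln s_y(j) = const + (1 - lambda) * ln ybar(j) by least squares;
    at least two complete segments with different means are needed.
    """
    y = np.asarray(y, dtype=float)
    n_seg = y.size // seg_len
    segments = y[: n_seg * seg_len].reshape(n_seg, seg_len)
    means = segments.mean(axis=1)
    sds = segments.std(axis=1, ddof=1)
    slope = np.polyfit(np.log(means), np.log(sds), 1)[0]
    return 1.0 - slope
```

A value near 0 suggests the logarithmic transformation and a value near 1 suggests leaving the series untransformed. With only a few segments the estimate is crude and should be read together with the plot.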
Example 4.6 Let us consider the time series $y_t$ of job applicants kept in the Czech
labor office register for particular months 2014M1–2016M12 (t = 1, …, 36); see
Table 4.4 and Fig. 4.8a. Figure 4.9 for segments of length of 12 monthly observations
indicates that the logarithmic transformation is desirable (i.e., Box–Cox transformation
with type parameter λ = 0; see Fig. 4.8b). In this example, Fig. 4.8a, b
demonstrates the homogenization of variance after this transformation, and histograms
in Fig. 4.10a, b show that the logarithmic transformation really rectified a
skewed distribution to an approximately symmetric distribution.
⋄
Other transformations frequently used for time series consist in a suitable differenc-
ing (see Remark 3.4) that simplifies decomposition components of the original time
series (such transformations can be looked upon as special cases of moving averages
from Sect. 3.2). Usually, a constant trend remains in the transformed time series
only, as it is the case in the following examples with various decomposition
structure:
(a) Linear trend $y_t = \beta_0 + \beta_1 t + \varepsilon_t$:
$$(1 - B)^k y_t = \Delta^k y_t \sim \beta_k. \qquad (4.60)$$
$$(1 - B^s)\, y_t = \Delta_s y_t \sim 0. \qquad (4.61)$$
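A small numerical sketch of both kinds of differencing follows; the function names are illustrative. Without noise, the kth difference of a polynomial trend of order k is exactly constant (for $t^k$ it equals k!), and the seasonal difference with lag s annihilates any component that repeats with period s.

```python
import numpy as np


def difference(y, k=1):
    """k-th ordinary difference (1 - B)^k y_t."""
    return np.diff(np.asarray(y, dtype=float), n=k)


def seasonal_difference(y, s):
    """Seasonal difference (1 - B^s) y_t = y_t - y_{t-s}."""
    y = np.asarray(y, dtype=float)
    return y[s:] - y[:-s]


t = np.arange(1, 21)
linear = 5.0 + 3.0 * t                        # linear trend without noise
seasonal = np.tile([1.0, 4.0, 2.0, -7.0], 5)  # pure seasonality with s = 4
first_diff = difference(linear)               # constant, equal to the slope 3
season_diff = seasonal_difference(seasonal, 4)  # identically zero
```

If a linear trend with slope β is added to the seasonal series, the seasonal difference leaves the constant sβ.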
4.4 Exercises
Exercise 4.1 Repeat the analysis from Example 4.1 (the simple approach to additive
seasonal elimination for the Czech construction production index) only for data since
2013 (hint: $I_1$ = −33.76, $I_2$ = −1.00, $I_3$ = 15.01, $I_4$ = 19.75).
Exercise 4.2 Repeat the analysis from Example 4.2 (the multiplicative elimination
of seasonality for the job applicants kept in the Czech labor office register) only for
data since 2013 (hint: $I_1$ = 1.084, $I_2$ = 1.082, $I_3$ = 1.053, $I_4$ = 1.000, $I_5$ = 0.962,
$I_6$ = 0.948, $I_7$ = 0.970, $I_8$ = 0.969, $I_9$ = 0.970, $I_{10}$ = 0.964, $I_{11}$ = 0.976, $I_{12}$ = 1.035).
Exercise 4.3 In Example 4.3 (the additive elimination of seasonality for the Czech
construction production index using the regression approach with dummies) construct
the prediction intervals for the year 2017.
Exercise 4.4 Repeat the analysis from Example 4.4 (the multiplicative Holt–Winters’
method for the job applicants kept in the Czech labor office register) only for
data since 2010 (hint: predictions for 2017: 395068; 390412; 375287; 350837;
332122; 320818; 322058; 315104; 307204; 298323; 296419; 310548).
Exercise 4.5 Apply Fisher’s test of periodicity for the time series in Table 5.1 (hint:
no significant periodicities with significance level of 5%, $\hat\mu$ = 0.167).
Chapter 5
Residual Component
Sometimes a visual inspection suggests that the analyzed time series does
not contain any systematic component, so that it is white noise only
(even if this white noise can be shifted to a nonzero level). For example, the
graphical record of the monthly time series in Table 5.1 plotted for 2015–2017 (t = 1,
…, 36) in Fig. 5.1 seems to be white noise. Moreover, sometimes one must assess
whether the elimination of systematic components from a decomposed time series
has been perfect, i.e., whether some remnants of systematic behavior persist
in the estimated residuals (e.g., patterns of trend, seasonality, and the like).
However, a visual decision can be subjective so that objective statistical tests with
fixed significance levels are desirable to test the null hypothesis
$$H_0: \ y_t \sim \text{iid}. \qquad (5.1)$$
This hypothesis is stronger than the hypothesis of white noise since it requires
independence and identical distribution (iid). On the other hand, (5.1) does not require
a zero level (as is the case for white noise).
In general, the tests of this type are called tests of randomness and they are
mostly nonparametric. We will describe some of them briefly. For all of them it is
recommended, before initiating the test procedure, to arrange the given time series in
such a way that in each group of equal neighboring observations (equal approximately
in the sense of applied rounding) one keeps only one observation (the other
equal observations in the group are deleted). In each of the following tests, let $y_1, \ldots, y_n$
denote the tested time series after this adjustment (i.e., $y_t \ne y_{t+1}$ for all t = 1, …,
n − 1). To be on the safe side, we recall that we deal exclusively with time series
with continuous states.
Fig. 5.1 Time series to be analyzed by tests of randomness (see also Table 5.1)
This test is based on the number of positive first differences of given time series, i.e.,
on the number of points in which this time series grows (so-called points of growth;
see also the growth function in Sect. 3.1.2.4).
Let $V_t$ be random variables defined as
$$V_t = \begin{cases} 1 & \text{for } y_t < y_{t+1}, \\ 0 & \text{for } y_t > y_{t+1} \end{cases} \qquad (5.2)$$
(the case $y_t = y_{t+1}$ is excluded due to the preliminary adjustment). The mean value of
the number of positive first differences k (or equivalently the number of points of
growth) is then obviously under the null hypothesis (5.1) equal to
$$E(k) = E\left(\sum_{t=1}^{n-1} V_t\right) = \sum_{t=1}^{n-1}\left(1 \cdot \frac{1}{2} + 0 \cdot \frac{1}{2}\right) = \frac{n-1}{2}, \qquad (5.3)$$
since both possible orderings of two neighboring values $y_t$ and $y_{t+1}$ have under
the null hypothesis the same probability 1/2. One can derive analogously that the
variance of k under the null hypothesis (5.1) fulfills
$$\mathrm{var}(k) = \frac{n+1}{12}. \qquad (5.4)$$
Even if under (5.1) one can tabulate the (non-asymptotic) distribution of the random
variable k, in practice one prefers the asymptotic version of the test, which is
acceptable for higher n. Its critical region is
$$\frac{|k - (n-1)/2|}{\sqrt{(n+1)/12}} \ge u_{1-\alpha/2}, \qquad (5.5)$$
Let r denote the total number of upper and lower turning points in the tested time
series (see Sect. 3.1.1). Analogously as in the previous test one can derive that under
the null hypothesis (5.1) it holds
$$E(r) = \frac{2(n-2)}{3}, \qquad \mathrm{var}(r) = \frac{16n - 29}{90}. \qquad (5.6)$$
In practice, one again applies the asymptotic version of the corresponding test
with the critical region
$$\frac{|r - 2(n-2)/3|}{\sqrt{(16n-29)/90}} \ge u_{1-\alpha/2}. \qquad (5.7)$$
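The two statistics above are easy to compute; the following sketch returns the count together with the standardized value from (5.5) or (5.7), to be compared with the normal quantile (1.96 for α = 0.05). The function names are illustrative, and the input is assumed to be already adjusted so that neighboring observations differ.

```python
import math


def sign_difference_test(y):
    """Number of points of growth k and the standardized statistic (5.5)."""
    n = len(y)
    k = sum(1 for a, b in zip(y, y[1:]) if a < b)
    z = abs(k - (n - 1) / 2) / math.sqrt((n + 1) / 12)
    return k, z


def turning_point_test(y):
    """Number of turning points r and the standardized statistic (5.7)."""
    n = len(y)
    r = sum(1 for a, b, c in zip(y, y[1:], y[2:])
            if (b > a and b > c) or (b < a and b < c))
    z = abs(r - 2 * (n - 2) / 3) / math.sqrt((16 * n - 29) / 90)
    return r, z
```

For the short series 1, 3, 2, 5, 4, 6 the functions give k = 3 and r = 4; with only six observations the asymptotic critical regions are of course not reliable, the numbers merely illustrate the counting.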
This test makes use of the Kendall rank correlation coefficient τ (or briefly Kendall’s
tau), which was originally suggested as a measure of ordinal association between
two observed quantities. In our context, it has the form
$$\tau = \frac{4v}{n(n-1)} - 1, \qquad (5.8)$$
where v denotes the number of pairs $y_s$ and $y_t$ in the given time series $y_1, \ldots, y_n$
fulfilling $y_s < y_t$ for s < t. The formula (5.8) for τ standardizes v in such a way that
−1 ≤ τ ≤ 1 and under the null hypothesis (5.1) it holds
$$E(\tau) = 0, \qquad \mathrm{var}(\tau) = \frac{2(2n+5)}{9n(n-1)}. \qquad (5.9)$$
In practice, one applies mainly the asymptotic version of the corresponding test with
critical region
$$\frac{|\tau|}{\sqrt{\dfrac{2(2n+5)}{9n(n-1)}}} \ge u_{1-\alpha/2}. \qquad (5.10)$$
Let $q_1, \ldots, q_n$ denote the ranks of values of the given time series. For example, if
$y_1$ = 10, $y_2$ = −6, $y_3$ = 2, $y_4$ = −6, then $q_1$ = 4, $q_2$ = 1, $q_3$ = 3, $q_4$ = 2 (sometimes
one uses fractional ranks with rank averages for equal values, i.e., $q_1$ = 4, $q_2$ = 1.5,
$q_3$ = 3, $q_4$ = 1.5). Then the Spearman rank correlation coefficient ρ (or briefly
Spearman’s rho), suggested similarly as τ to measure statistical dependence between
the rankings of two observed variables, can be calculated as
$$\rho = 1 - \frac{6}{n(n^2-1)} \sum_{i=1}^{n} (i - q_i)^2 \qquad (5.11)$$
(in our context, one of the rankings is obviously the natural one 1, 2, …, n). Even if
the tabulated critical values $r_{1-\alpha/2}$ fulfilling $P(|\rho| \ge r_{1-\alpha/2}) \le \alpha$ under the null
hypothesis (5.1) are easily available nowadays, in practice again the asymptotic
version of the corresponding test is preferred with critical region
$$\sqrt{n-1}\; |\rho| \ge u_{1-\alpha/2}. \qquad (5.12)$$
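Both rank-based statistics can be sketched in a few lines of plain Python. The function names are illustrative. Ties are broken by order of appearance (so that 10, −6, 2, −6 receives the ranks 4, 1, 3, 2); fractional ranks are not implemented, and the quadratic-time pair count is adequate only for moderate n.

```python
import math


def kendall_tau_test(y):
    """Pair count v, Kendall's tau (5.8), and the standardized statistic (5.10)."""
    n = len(y)
    v = sum(1 for s in range(n) for t in range(s + 1, n) if y[s] < y[t])
    tau = 4 * v / (n * (n - 1)) - 1
    z = abs(tau) / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return v, tau, z


def spearman_rho_test(y):
    """Ranks q, Spearman's rho (5.11), and the statistic sqrt(n-1)*|rho| of (5.12)."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])   # stable sort keeps tie order
    q = [0] * n
    for rank, i in enumerate(order, start=1):
        q[i] = rank
    squared = sum((i - qi) ** 2 for i, qi in enumerate(q, start=1))
    rho = 1 - 6 * squared / (n * (n * n - 1))
    z = math.sqrt(n - 1) * abs(rho)
    return q, rho, z
```

A strictly increasing series gives τ = ρ = 1 and a strictly decreasing one gives τ = ρ = −1, which is a quick check of the standardization.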
In this test, one must construct the sample median M of observations in the given time
series (therefore it is sometimes called the median test). Graphically it means that we
look for a line parallel with the time axis such that the numbers of observations above
and below it are the same (see Fig. 5.2).
Sometimes several observations of the given time series must lie on this line. In
other situations (see, e.g., Fig. 5.2), such a line cannot even be constructed: then it is
recommended to shift arbitrary observations from the line to the region (above or
below) with the smaller number of observations so that the line becomes a correct median
(in Fig. 5.2, we have to shift one observation downward). Now we ignore all
observations on the line and pool the others into groups called runs in such a way
that all neighboring observations lying on the same side of the line (above or below) create one particular
run (see Fig. 5.2). Let us denote the number of runs by u and the number of
observations above (or equivalently below) the median line by m.
Even if the critical values of this test are tabulated, in practice one usually makes
use of the asymptotic version of this test with the following critical region:
$$\frac{|u - (m+1)|}{\sqrt{\dfrac{m(m-1)}{2m-1}}} \ge u_{1-\alpha/2}. \qquad (5.13)$$
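A sketch of the median test follows. It covers only the case in which the numbers of observations above and below the sample median coincide after dropping observations equal to the median; the shifting of observations described above is not implemented and the function raises an error instead. The names `runs_z` and `median_runs_test` are illustrative.

```python
import math
import statistics


def runs_z(u, m):
    """Standardized statistic of (5.13) for u runs and m observations per side."""
    return abs(u - (m + 1)) / math.sqrt(m * (m - 1) / (2 * m - 1))


def median_runs_test(y):
    """Median M, m, number of runs u, and the standardized statistic (5.13)."""
    med = statistics.median(y)
    above = [value > med for value in y if value != med]  # drop points on the line
    m = sum(above)
    if m != len(above) - m:
        raise ValueError("unequal counts above and below the median")
    # a new run starts whenever the side of the median changes
    u = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    return med, m, u, runs_z(u, m)
```

For example, `runs_z(23, 17)` evaluates the statistic for u = 23 runs with m = 17 observations on each side of the median.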
Fig. 5.2 (runs above (+) and below (−) the median line)
Example 5.1 The time series from Table 5.1 and Fig. 5.1 has the length n = 36
(obviously the preliminary adjustment recommended in the previous text is not
necessary). With significance level of 5%, the particular tests give:
• Test based on signs of differences: k = 16, which implies according to (5.5)
$$\frac{|16 - (36-1)/2|}{\sqrt{(36+1)/12}} = 0.854 < 1.96;$$
• Test based on turning points: r = 25, which implies according to (5.7)
$$\frac{|25 - 2(36-2)/3|}{\sqrt{(16 \cdot 36 - 29)/90}} = 0.946 < 1.96;$$
• Test based on Kendall’s τ: v = 297, which implies according to (5.8) and (5.10)
$$\tau = \frac{4 \cdot 297}{36 \cdot 35} - 1 = -0.057, \qquad \frac{|-0.057|}{\sqrt{\dfrac{2(2 \cdot 36 + 5)}{9 \cdot 36\,(36-1)}}} = 0.489 < 1.96;$$
• Test based on Spearman’s ρ: |ρ| = 0.051, which implies according to (5.12)
$$\sqrt{36-1} \cdot 0.051 = 0.302 < 1.96;$$
• Median test: the given time series has M = 2 (here it is $y_{10} = y_{34} = 2$, so that there
is no need to shift observations), m = 17 and u = 23, which implies according to
(5.13)
$$\frac{|23 - (17+1)|}{\sqrt{\dfrac{17\,(17-1)}{2 \cdot 17 - 1}}} = 1.742 < 1.96.$$
Obviously the null hypothesis that the observations in Table 5.1 are iid could not
be rejected with significance level of 5% by any of the applied tests of randomness.
⋄
Remark 5.1 The recommended choice of a test may be subjective depending on our
suspicion that a systematic behavior survives in residuals, e.g.:
• If we suspect that a linear trend remains in residuals, then the recommended tests are the
test based on signs of differences, the test based on τ, and the test based on ρ.
• If we suspect that a periodicity remains in residuals, then the recommended tests are the test
based on turning points and the median test (e.g., the test based on signs of differences
is not suitable for residuals with remaining periodicity since in such a case
obviously k ≈ n/2 so that the test statistic (5.5) lies close to zero and the test has
low power).
5.2 Exercises
Exercise 5.1 Derive the formula (5.4) for the variance of the test statistic k in the test of
randomness based on signs of differences. Hint: under $H_0$ it holds $\mathrm{var}(V_t)$ = 1/4 and
$\mathrm{cov}(V_t, V_{t+1})$ = −1/12 in the variance of k:
$$\mathrm{var}(k) = \mathrm{var}\left(\sum_{t=1}^{n-1} V_t\right) = \sum_{t=1}^{n-1} \mathrm{var}(V_t) + 2 \sum_{t=1}^{n-2} \mathrm{cov}(V_t, V_{t+1}).$$
Exercise 5.2 Simulate white noise N(0, 1) with length of 100 and apply five tests of
randomness from Sect. 5.1 to it.
Part III
Autocorrelation Methods for Univariate Time Series
Chapter 6
Box–Jenkins Methodology
(–) The interpretation of constructed models is mostly not easy; typically, laymen
ask how it is possible that their data are modeled by combining random shocks;
numerical outputs (e.g., predictions) may serve as acceptable arguments in such
cases.
References for more comprehensive study are, e.g., Brockwell and Davis (1993,
1996), Hamilton (1994), and others.
6.1.1 Stationarity
Generally speaking, the stationarity of a time series {yt} means that the behavior of
this series is stable in a specific way. One usually distinguishes two cases:
• Strict stationarity means that the probability behavior of the corresponding
stochastic process is invariant to shifts in time, i.e., the probability distribution
of the random vector $(y_{t_1}, \ldots, y_{t_k})$ is the same as the distribution of the vector
$(y_{t_1+h}, \ldots, y_{t_k+h})$ for arbitrary h.
• (Weak) stationarity is not as restrictive as strict stationarity since the invariance
to time shifts is required only for the first and second moments, i.e., it must hold
for each s and t
i.e., particularly
In other words, the level and variance of stationary time series are constant in time. A
trend, seasonality or non-constant variance (volatility) is incompatible with
stationarity and should be removed from time series to make it stationary. Also the
covariance structure of stationary time series must be invariable in time (e.g., the
character of dependence between the first and second quarter of stationary quarterly
series must be the same in all years).
Remark 6.1 If finite second moments of a given process exist, then obviously
strict stationarity implies the weak one. Moreover, if such a process is normal (i.e.,
each finite sample from this process has a joint normal distribution), then both
types of stationarity are equivalent.
⋄
6.1 Autocorrelation Properties of Time Series 125
This text deals only with weak stationarity, which will be referred to simply as
stationarity. When introducing Box–Jenkins methodology, it is suitable to start with
models of stationary time series. The text respects this methodological recommendation,
which will remain valid until it is explicitly lifted. The concepts of autocovariance and
autocorrelation functions will be introduced only for stationary time series as well.
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\gamma_k}{\sigma_y^2}, \quad k = \ldots, -1, 0, 1, \ldots. \qquad (6.5)$$
Remark 6.2 The term “autocorrelation” for $\rho_k$ in (6.5) is correct, since one can
write due to stationarity
$$\rho_k = \frac{\gamma_k}{\sigma_y^2} = \frac{\mathrm{cov}(y_t, y_{t-k})}{\sqrt{\mathrm{var}(y_t)}\,\sqrt{\mathrm{var}(y_{t-k})}} = \mathrm{corr}(y_t, y_{t-k}). \qquad (6.6)$$
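For a given stationary time series, one usually constructs the estimated mean value
$$\bar y = \frac{1}{n}\sum_{t=1}^{n} y_t, \qquad (6.7)$$
the estimated autocovariance function
$$c_k = \frac{1}{n}\sum_{t=k+1}^{n} (y_t - \bar y)(y_{t-k} - \bar y), \quad k = 0, 1, \ldots, n-1, \qquad (6.8)$$
and the estimated autocorrelation function
$$r_k = \frac{c_k}{c_0}, \quad k = 0, 1, \ldots, n-1. \qquad (6.9)$$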
all $\rho_k$ are zero for $k > k_0$ ($k_0$ is then called the truncation point), or to conclude that such
a point $k_0$ does not exist at all. For example, in a model of the form
$$y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} \qquad (6.10)$$
($\varepsilon_t$ is white noise (see (2.1)) and $\theta_1$ is a parameter (see Sect. 6.2)) it holds
$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2}, \qquad \rho_k = 0 \ \text{ for } k > 1, \qquad (6.11)$$
$$r_{11} = r_1, \qquad r_{kk} = \frac{r_k - \sum_{j=1}^{k-1} r_{k-1,j}\, r_{k-j}}{1 - \sum_{j=1}^{k-1} r_{k-1,j}\, r_j} \quad \text{for } k > 1, \qquad (6.14)$$
where
Similarly to the autocorrelation function, there can exist truncation points for the
partial autocorrelation function as well (e.g., for autoregressive processes), so that
$\rho_{kk}$ is again an important identifying instrument. In this case, one can apply the so-called
Quenouille’s approximation: if $\rho_{kk} = 0$ for $k > k_0$, then under specific assumptions
again it holds (asymptotically with growing length n)
$$r_{kk} \sim N\left(0, \frac{1}{n}\right) \quad \text{for } k > k_0. \qquad (6.16)$$
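The estimates (6.9) and the recursion (6.14) can be sketched as follows. The update of the auxiliary coefficients, $r_{k,j} = r_{k-1,j} - r_{kk}\, r_{k-1,k-j}$, is the standard Durbin–Levinson step and is assumed here because its formula is not reproduced in this passage; the function names are illustrative.

```python
import numpy as np


def sample_acf(y, max_lag):
    """Estimated autocorrelations r_0, ..., r_max_lag according to (6.7)-(6.9)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    d = y - y.mean()
    c = np.array([np.sum(d[k:] * d[: n - k]) / n for k in range(max_lag + 1)])
    return c / c[0]


def pacf_from_acf(r):
    """Partial autocorrelations r_11, ..., r_KK from r = [1, r_1, ..., r_K]."""
    K = len(r) - 1
    pacf = []
    prev = np.zeros(K + 1)                     # prev[j] stores r_{k-1,j}
    for k in range(1, K + 1):
        if k == 1:
            rkk = r[1]
        else:
            num = r[k] - sum(prev[j] * r[k - j] for j in range(1, k))
            den = 1.0 - sum(prev[j] * r[j] for j in range(1, k))
            rkk = num / den
        cur = np.zeros(K + 1)
        for j in range(1, k):                  # assumed Durbin-Levinson update
            cur[j] = prev[j] - rkk * prev[k - j]
        cur[k] = rkk
        pacf.append(float(rkk))
        prev = cur
    return pacf
```

In line with (6.16), estimated values $r_{kk}$ beyond a truncation point would be compared with the band ±1.96/√n at the 5% level. Feeding the function with theoretical autocorrelations instead of estimates returns the theoretical partial autocorrelations.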
where $\{\varepsilon_t\}$ is white noise (i.e., a sequence of uncorrelated random variables
with zero mean values and constant (finite) variances σ² > 0; see (2.1)) and B is the lag
operator (see Remark 3.4: the transcription of the models of Box–Jenkins methodology
by means of the operators B and Δ is popular due to its simplicity, e.g., in
(6.17) one constructs a power series ψ(B) applying formally the operator B as if it were
the variable z in the classical power series ψ(z)). Moreover, one assumes that
$$\psi(z) \text{ converges for } |z| \le 1 \text{ (i.e., inside the unit circle in complex plane)} \qquad (6.18)$$
(see Brockwell and Davis (1996)). One can show under this assumption that the
infinite series of random variables (6.17) for particular times t converge in the sense
of convergence in mean square and the limits form a stationary process with zero
mean value ($E(y_t) = 0$). Some authors (see, e.g., Davidson (2000)) assume more
strongly that $\varepsilon_t \sim \text{iid}\,(0, \sigma^2)$ in the linear process.
Another expression of the linear process (6.17), which can be useful especially
when constructing predictions, is possible for so-called invertible process. In this
case, one can rewrite (6.17) in the form
$$\pi(z) \text{ converges for } |z| \le 1 \text{ (i.e., inside the unit circle in complex plane)}. \qquad (6.20)$$
Remark 6.4 There are many reasons why the models based on the principle of the
linear process are suitable to model reality. For instance, let us consider a stationary
process $\{y_t\}$ with zero mean value and let us predict the value $y_t$ on the basis of the last
values $Y_{t-1} = \{y_{t-1}, y_{t-2}, \ldots\}$. Then the optimal prediction (in the sense of minimal
mean squared error MSE in (2.11)) is $E(y_t \mid Y_{t-1})$. The error of this prediction
has properties of white noise. One calls it innovation (this name is logical since the
innovation process $\{e_t\}$ corresponds to unpredictable movements in values $\{y_t\}$).
Moreover, if the process $\{y_t\}$ is normal, then the conditional mean value $E(y_t \mid Y_{t-1})$
has the form of a linear combination of values $y_{t-1}, y_{t-2}, \ldots$, and (6.21) can be
rewritten as
One must warn at the outset that MA models have nothing to do with the method of moving averages for trend elimination (see Sect. 3.2). The moving average process of order q, denoted as MA(q), has the form
where θ1, ..., θq are parameters and θ(B) = 1 + θ1B + ... + θqB^q is the moving average operator (obviously, MA(q) originates by truncating the linear process (6.17) after the lag q).
The process MA(q) is always stationary with zero mean value and variance
$$\sigma_y^2 = \left(1 + \theta_1^2 + \ldots + \theta_q^2\right)\sigma^2 \qquad (6.25)$$
(apparently the autocorrelation function has the truncation point k0 equal to the model order q). The partial autocorrelation function ρkk of the process MA(q) has no truncation point, but it is bounded by a linear combination of geometrically decreasing sequences and sinusoids with geometrically decreasing amplitudes.
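The variance (6.25) and the truncation of the autocorrelation function after lag q can be checked numerically. The sketch below is only an illustration with hypothetical function names; the autocorrelation formula ρ_k = (θ_k + θ_1 θ_{k+1} + ... + θ_{q-k} θ_q) / (1 + θ_1² + ... + θ_q²) for k ≤ q is the standard MA(q) result and is an assumption here, since the corresponding equation is not reproduced above.

```python
def ma_variance(theta, sigma2=1.0):
    """Variance of MA(q) according to (6.25); theta = [theta_1, ..., theta_q]."""
    return (1.0 + sum(t * t for t in theta)) * sigma2

def ma_acf(theta, max_lag):
    """Autocorrelations rho_1, ..., rho_max_lag of MA(q) (standard formula, assumed)."""
    coef = [1.0] + list(theta)          # coefficients of eps_t, eps_{t-1}, ...
    q = len(theta)
    denom = sum(c * c for c in coef)
    acf = []
    for k in range(1, max_lag + 1):
        if k > q:
            acf.append(0.0)             # truncation point k0 = q
        else:
            acf.append(sum(coef[j] * coef[j + k] for j in range(q - k + 1)) / denom)
    return acf

# Usage: an MA(2) process has nonzero autocorrelations only at lags 1 and 2.
print(ma_acf([0.4, 0.3], 4))
```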
The process MA(q) is invertible if all roots z1, ..., zq of polynomial θ(z) lie outside
the unit circle in complex plane (i.e., |z1|, ..., |zq| > 1, since then the assumption (6.20)
is fulfilled).
Remark 6.6 The process MA(1) (see (6.10)) has the autocorrelation function (6.11) with the truncation point k0 = 1. Its partial autocorrelation function (without a truncation point) is
$$\rho_{kk} = \frac{(-1)^{k-1}\,\theta_1^{k}\,(1 - \theta_1^2)}{1 - \theta_1^{2(k+1)}} \quad \text{for } k = 1, 2, \ldots, \qquad (6.27)$$
with truncation point k0 ¼ 2. The invertibility condition (6.20) for MA(2) process
has the form
so that the invertibility region of MA(2) (in the plane with horizontal axis for
values θ1 and vertical axis for values θ2) is the interior of triangle with vertices
(–2, 1), (0, –1), and (2, 1).
⋄
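The invertibility region of MA(2) can be checked in two equivalent ways: by the roots of θ(z) = 1 + θ1 z + θ2 z² and by membership in the triangle with the vertices given above. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the three inequalities describing the triangle (θ1 + θ2 > –1, θ2 – θ1 > –1, –1 < θ2 < 1) are derived from its vertices rather than copied from the text.

```python
import cmath

def ma2_roots(theta1, theta2):
    """Roots of theta(z) = 1 + theta1*z + theta2*z**2 (requires theta2 != 0)."""
    disc = cmath.sqrt(theta1 * theta1 - 4.0 * theta2)
    return ((-theta1 + disc) / (2.0 * theta2),
            (-theta1 - disc) / (2.0 * theta2))

def invertible_by_roots(theta1, theta2):
    """Invertibility: all roots lie outside the unit circle."""
    return all(abs(z) > 1.0 for z in ma2_roots(theta1, theta2))

def invertible_by_triangle(theta1, theta2):
    """Interior of the triangle with vertices (-2, 1), (0, -1), (2, 1) (assumed form)."""
    return (theta1 + theta2 > -1.0
            and theta2 - theta1 > -1.0
            and -1.0 < theta2 < 1.0)

# Usage: both criteria agree on these parameter pairs.
for pair in [(0.4, 0.3), (1.5, 0.4), (0.0, -0.9), (0.5, 1.2)]:
    print(pair, invertible_by_roots(*pair), invertible_by_triangle(*pair))
```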
6.2.3 Autoregressive Process AR
where φ1, ..., φp are parameters and φ(B) = 1 – φ1B – ... – φpB^p is the autoregressive operator (it originates by truncating the inverted linear process (6.19) after the lag p).
The process AR( p) is stationary if all roots z1, ..., zp of the polynomial φ(z) lie outside the unit circle in the complex plane (i.e., |z1|, ..., |zp| > 1, since then the assumption (6.18) is fulfilled). In such a case the process has zero mean value and variance
$$\sigma_y^2 = \frac{\sigma^2}{1 - \varphi_1\rho_1 - \ldots - \varphi_p\rho_p} \qquad (6.32)$$
(to derive (6.33) it suffices to multiply all terms in the equality (6.31) by the value yt–k/σy² and to calculate mean values on both sides; moreover, E(yt–k εt) = 0 for k > 0 since the stationary process AR( p) can be expressed as the linear process). Due to the theory of difference equations (see Brockwell and Davis (1993), Section 3.6) the solution of (6.33) can be expressed in the form
$$\rho_k = \alpha_1 z_1^{-k} + \alpha_2 z_2^{-k} + \ldots + \alpha_p z_p^{-k} \quad \text{for } k \ge 0, \qquad (6.34)$$
where z1, ..., zp are mutually distinct roots of the polynomial φ(z) (|z1|, ..., |zp| > 1; see above) and α1, ..., αp are fixed coefficients. If the roots zi and zj are complex conjugate, then the corresponding two terms can be replaced by a single term of the type α d^k sin(λk + φ) with 0 < d < 1. Similarly, if the roots z1, ..., zp are not mutually distinct, then all terms with a multiple root zi of multiplicity r must be replaced in (6.34) by the more complex term (β0 + β1k + ... + βr–1 k^(r–1)) zi^(–k), whose behavior for higher k is always dominated by the geometric decay of zi^(–k). In any case, the autocorrelation function of the process AR( p) can be approximated by a linear combination of geometrically decreasing sequences and sinusoids with geometrically decreasing amplitudes (see, e.g., Fig. 6.4).
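Remark 6.8 If we write (6.33) only for k = 1, ..., p, then we obtain the so-called system of Yule–Walker equations for the unknown parameters φ1, ..., φp by means of the autocorrelations ρ1, ..., ρp (or vice versa)
$$\begin{aligned} \rho_1 &= \varphi_1 + \varphi_2\rho_1 + \ldots + \varphi_p\rho_{p-1},\\ \rho_2 &= \varphi_1\rho_1 + \varphi_2 + \ldots + \varphi_p\rho_{p-2},\\ &\;\;\vdots\\ \rho_p &= \varphi_1\rho_{p-1} + \varphi_2\rho_{p-2} + \ldots + \varphi_p. \end{aligned} \qquad (6.35)$$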
⋄
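For p = 2 the Yule–Walker equations (6.35) can be solved by hand in both directions, which gives a quick consistency check. The sketch below uses hypothetical function names and assumes a stationary AR(2) process; the closed forms follow directly from the two equations ρ1 = φ1 + φ2 ρ1 and ρ2 = φ1 ρ1 + φ2.

```python
def ar2_autocorrelations(phi1, phi2):
    """rho_1 and rho_2 implied by the Yule-Walker equations (6.35) with p = 2."""
    rho1 = phi1 / (1.0 - phi2)
    rho2 = phi1 * rho1 + phi2
    return rho1, rho2

def ar2_parameters(rho1, rho2):
    """Solve the same two equations for phi_1 and phi_2."""
    det = 1.0 - rho1 * rho1
    phi1 = rho1 * (1.0 - rho2) / det
    phi2 = (rho2 - rho1 * rho1) / det
    return phi1, phi2

# Usage: the round trip recovers the parameters.
rho1, rho2 = ar2_autocorrelations(0.5, 0.3)
print(ar2_parameters(rho1, rho2))
```

Replacing ρ1 and ρ2 by the estimates r1 and r2 turns the second function into a moment estimator of the AR(2) parameters.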
The partial autocorrelation function ρkk of the process AR( p) has the truncation point k0 equal to the model order p (this follows directly from the definition of the partial autocorrelation function, since an autoregressive process of order p fulfills ρkk = 0 for all k > p; see (6.13)). This property makes the partial autocorrelation function an important instrument for the identification of autoregressive processes.
The process AR( p) is always invertible since (6.31) is directly the invertible form
of this model.
Remark 6.9 The process AR(1)
$$y_t = \varphi_1 y_{t-1} + \varepsilon_t \qquad (6.36)$$
is stationary for |φ1| < 1. In such a case, it has zero mean value and variance
$$\sigma_y^2 = \frac{\sigma^2}{1 - \varphi_1^2} \qquad (6.37)$$
Fig. 6.2 (a) Positive correlatedness for yt = 0.8yt–1 + εt (ρ1 > 0) and (b) negative correlatedness for yt = –0.8yt–1 + εt (ρ1 < 0)
$$\rho_1 = \varphi_1, \qquad (6.39)$$
i.e., the first autocorrelation of the process AR(1) equals its autoregressive parameter. Hence the sign of the parameter φ1 plays an important role here: a positive φ1 > 0 (so-called positive correlatedness) induces inertia in the signs of neighboring values of the corresponding time series (see Fig. 6.2(a) with relatively rare crossings of the time axis), while a negative φ1 < 0 (so-called negative correlatedness) induces frequent changes of the signs of neighboring values of the corresponding time series (see Fig. 6.2(b) with relatively dense crossings of the time axis).
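The effect of the sign of φ1 on the crossings of the time axis can be reproduced by a small simulation. This is a hedged sketch: the function names, the seed, the series length, and the use of normally distributed white noise with unit variance are all assumptions made for illustration.

```python
import random

def simulate_ar1(phi1, n, seed=0):
    """Simulate y_t = phi1 * y_{t-1} + eps_t with standard normal white noise."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi1 * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y

def count_sign_changes(y):
    """Number of crossings of the zero level between neighboring values."""
    return sum(1 for a, b in zip(y, y[1:]) if a * b < 0)

# Usage: positive correlatedness gives few crossings, negative gives many.
print(count_sign_changes(simulate_ar1(0.8, 1000)))
print(count_sign_changes(simulate_ar1(-0.8, 1000)))
```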
The partial autocorrelation function of the process AR(1) has the form
is stationary for
so that the stationarity region of AR(2) (in the plane with the horizontal axis for values φ1 and the vertical axis for values φ2) is the interior of the triangle with vertices (–2, –1), (0, 1), and (2, –1). In such a case, the process AR(2) has zero mean value and variance
$$\sigma_y^2 = \frac{\sigma^2}{1 - \varphi_1\rho_1 - \varphi_2\rho_2} \qquad (6.43)$$
where z1 and z2 are mutually distinct roots of the polynomial φ(z) (|z1|, |z2| > 1, in the
case of double root the form of autocorrelation function is analogous); ρk is without
any truncation point and has the form of a linear combination of two geometrically
decreasing sequences or the form of a sinusoid with geometrically decreasing
amplitude.
The partial autocorrelation function of the process AR(2) has the truncation point
k0 ¼ 2.
⋄
$$y_t = \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}, \quad \text{i.e.,} \quad \varphi(B)\,y_t = \theta(B)\,\varepsilon_t, \qquad (6.45)$$
where the operators φ(B) and θ(B) have been defined in the context of the processes AR( p) and MA(q), respectively. The conditions of stationarity and invertibility of the process ARMA( p, q) coincide with the condition of stationarity of AR( p) and the condition of invertibility of MA(q), respectively.
The stationary process ARMA( p, q) has zero mean value, and its autocorrelation
function fulfills the following difference equation:
$$\rho_k = \alpha_1 z_1^{-k} + \alpha_2 z_2^{-k} + \ldots + \alpha_p z_p^{-k} \quad \text{for } k \ge \max(0,\, q - p + 1), \qquad (6.47)$$
where z1, ..., zp are mutually distinct roots of the polynomial φ(z) (|z1|, ..., |zp| > 1). Hence the autocorrelation function of the process ARMA( p, q) has no truncation point and can be approximated by a linear combination of geometrically decreasing sequences and sinusoids of various frequencies with geometrically decreasing amplitudes, except for the initial values ρ0, ρ1, ..., ρq–p (this exception is nonempty only in the case of q ≥ p).
The partial autocorrelation function of the process ARMA( p, q) also has no truncation point, and it is bounded by a linear combination of geometrically decreasing sequences and sinusoids of various frequencies with geometrically decreasing amplitudes, except for the initial values ρ00, ..., ρp–q, p–q (this exception is nonempty only in the case of p ≥ q).
Remark 6.11 The process ARMA(1, 1)
is stationary for |φ1| < 1. In such a case, it has zero mean value and variance
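$$\sigma_y^2 = \frac{1 + \theta_1^2 + 2\varphi_1\theta_1}{1 - \varphi_1^2}\,\sigma^2 \qquad (6.49)$$
$$\rho_1 = \frac{(1 + \varphi_1\theta_1)(\varphi_1 + \theta_1)}{1 + \theta_1^2 + 2\varphi_1\theta_1}, \qquad \rho_k = \varphi_1\rho_{k-1} \quad \text{for } k > 1 \qquad (6.50)$$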
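Formulas (6.49) and (6.50) can be cross-checked against the linear-process representation of ARMA(1, 1). The sketch below uses hypothetical function names; the weights ψ0 = 1 and ψj = (φ1 + θ1) φ1^(j–1) for j ≥ 1 are the standard expansion of ARMA(1, 1) and are an assumption here, since the text does not list them.

```python
def arma11_variance(phi1, theta1, sigma2=1.0):
    """Variance of stationary ARMA(1, 1) according to (6.49)."""
    return (1.0 + theta1 ** 2 + 2.0 * phi1 * theta1) / (1.0 - phi1 ** 2) * sigma2

def arma11_acf(phi1, theta1, max_lag):
    """Autocorrelations rho_1, ..., rho_max_lag (max_lag >= 1) according to (6.50)."""
    rho1 = ((1.0 + phi1 * theta1) * (phi1 + theta1)
            / (1.0 + theta1 ** 2 + 2.0 * phi1 * theta1))
    acf = [rho1]
    for _ in range(2, max_lag + 1):
        acf.append(phi1 * acf[-1])      # rho_k = phi1 * rho_{k-1} for k > 1
    return acf

def arma11_variance_from_psi(phi1, theta1, sigma2=1.0, terms=2000):
    """Same variance as sigma2 * sum of squared psi-weights (assumed expansion)."""
    total = 1.0                         # psi_0 = 1
    psi = phi1 + theta1                 # psi_1
    for _ in range(1, terms):
        total += psi * psi
        psi *= phi1                     # psi_{j+1} = phi1 * psi_j
    return total * sigma2

# Usage: both routes give the same variance for phi1 = 0.5, theta1 = 0.3.
print(arma11_variance(0.5, 0.3), arma11_variance_from_psi(0.5, 0.3))
```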
Remark 6.12 The stationary processes introduced in this section have zero mean value. It is natural to generalize them to the case of a nonzero mean value (constant in time). For example, the process MA(q) with the mean value μ has the form
More generally, the process ARMA( p, q) with the mean value μ has the form
$$y_t - \mu = \varphi_1 (y_{t-1} - \mu) + \ldots + \varphi_p (y_{t-p} - \mu) + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}, \qquad (6.52)$$
or equivalently
Example 6.1 Table 6.2 presents the values yt of the 3-month interbank interest rate (in % p.a.) in Germany (Dreimonatsgeld; see Deutsche Bundesbank) for the years 1960–1999 (t = 1, ..., 40). Since the corresponding graph in Fig. 6.3 can be regarded as stationary in this time period (see also Example 6.4 in Sect. 6.3.3), one has estimated the corresponding correlogram and partial correlogram (see Table 6.3 and Fig. 6.4).
Applying the characteristics from Table 6.1, the most suitable model for this time
series seems to be the process AR(4): the correlogram rk corresponds to a sinusoid
Table 6.1 Form of autocorrelation and partial autocorrelation function of stationary and invertible processes AR( p), MA(q), and ARMA( p, q) (U denotes the curve in the form of a linear combination of geometrically decreasing sequences and sinusoids with geometrically decreasing amplitudes)
     | AR( p) | MA(q) | ARMA( p, q)
ρk   | Non-existent k0; ρk in form of curve U | k0 = q | Non-existent k0; ρk in form of curve U excepting values ρ0, ρ1, ..., ρq–p
ρkk  | k0 = p | Non-existent k0; ρkk bounded by curve U | Non-existent k0; ρkk bounded by curve U excepting values ρ00, ..., ρp–q, p–q
138 6 Box–Jenkins Methodology
Table 6.2 Annual data 1960–1999 in Example 6.1 (3-month interbank interest rate in Germany in
% p.a.—Dreimonatsgeld)
t Year yt t Year yt t Year yt t Year yt
1 1960 5.10 11 1970 9.41 21 1980 9.54 31 1990 8.43
2 1961 3.59 12 1971 7.15 22 1981 12.11 32 1991 9.18
3 1962 3.42 13 1972 5.61 23 1982 8.88 33 1992 9.46
4 1963 3.98 14 1973 12.14 24 1983 5.78 34 1993 7.24
5 1964 4.09 15 1974 9.90 25 1984 5.99 35 1994 5.31
6 1965 5.14 16 1975 4.96 26 1985 5.44 36 1995 4.48
7 1966 6.63 17 1976 4.25 27 1986 4.60 37 1996 3.27
8 1967 4.27 18 1977 4.37 28 1987 3.99 38 1997 3.30
9 1968 3.81 19 1978 3.70 29 1988 4.28 39 1998 3.52
10 1969 5.79 20 1979 6.69 30 1989 7.07 40 1999 2.94
Source: OECD (https://data.oecd.org/interest/short-term-interest-rates.htm#indicator-chart)
Fig. 6.3 Annual data 1960–1999 in Example 6.1 (3-month interbank interest rate in Germany in % p.a., Dreimonatsgeld)
with geometrically decreasing amplitude, and the partial correlogram rkk evidently has the truncation point k0 = 4; the statistical test (6.55) indeed gives
$$|r_{kk}| < 2\sqrt{\frac{1}{n}} = 2\sqrt{\frac{1}{40}} = 0.316 \quad \text{for } k > 4$$
(and it is not true for k = 4). One could try as an alternative the process MA(1) with the truncation point k0 = 1 in the correlogram rk, since it holds according to (6.54)
6.3 Construction of Models by Box–Jenkins Methodology 139
Fig. 6.4 Correlogram and partial correlogram in Example 6.1 (Dreimonatsgeld) estimated by means of EViews
$$|r_k| < 2\sqrt{\frac{1}{n}\left(1 + 2 r_1^2\right)} = 2\sqrt{\frac{1}{40}\left(1 + 2\cdot 0.612^2\right)} = 0.418 \quad \text{for } k > 1,$$
but the value |r5| = 0.400 is relatively close to this bound (moreover, it would evidently be difficult to find a curve U bounding the partial correlogram rkk).
⋄
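The two bounds used in Example 6.1 can be recomputed directly, and the correlogram of the data from Table 6.2 can be estimated for comparison with the reported values. The sketch below is illustrative only: the function names are hypothetical, and the general form of Bartlett's bound (with the sum over the first k0 squared autocorrelations) is assumed from the special case 1 + 2 r1² shown above.

```python
import math

# Annual values y_1, ..., y_40 (1960-1999) copied from Table 6.2.
Y = [5.10, 3.59, 3.42, 3.98, 4.09, 5.14, 6.63, 4.27, 3.81, 5.79,
     9.41, 7.15, 5.61, 12.14, 9.90, 4.96, 4.25, 4.37, 3.70, 6.69,
     9.54, 12.11, 8.88, 5.78, 5.99, 5.44, 4.60, 3.99, 4.28, 7.07,
     8.43, 9.18, 9.46, 7.24, 5.31, 4.48, 3.27, 3.30, 3.52, 2.94]

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def white_noise_bound(n):
    """Bound 2 * sqrt(1/n) used for r_kk beyond the truncation point."""
    return 2.0 * math.sqrt(1.0 / n)

def bartlett_bound(n, leading_r):
    """Bound 2 * sqrt((1 + 2 * sum r_j^2) / n); leading_r = [r_1, ..., r_k0]."""
    return 2.0 * math.sqrt((1.0 + 2.0 * sum(r * r for r in leading_r)) / n)

# Usage: the bounds from the example and the first autocorrelations of the data.
print(round(white_noise_bound(40), 3), round(bartlett_bound(40, [0.612]), 3))
print([round(r, 3) for r in sample_acf(Y, 5)])
```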
To check the correctness of identified model, one sometimes makes use of the
inequalities for estimated autocorrelations rk that should hold theoretically under the
assumption of stationarity and invertibility of process (see Table 6.5: e.g., according
to Remark 6.6 it holds |ρ1| < 1/2 in the process MA(1)).
This advanced approach to the model identification enables (at least theoretically) a
fully automatic identification excluding any subjective interference of analysts. The
problem of identification of process ARMA( p, q) for a given time series is addressed
here as the problem of estimation of unknown parameters p and q by means of
optimization
$$(\hat p, \hat q) = \arg\min_{(k,\,l)} A(k, l), \qquad (6.56)$$
$$AIC(k, l) = \ln \hat\sigma^2_{k,l} + \frac{2(k + l + 1)}{n}. \qquad (6.57)$$
$$BIC(k, l) = \ln \hat\sigma^2_{k,l} + \frac{(k + l + 1)\ln n}{n}. \qquad (6.58)$$
The value σ̂²k,l in (6.57) and (6.58) denotes the estimated variance of white noise in the process ARMA(k, l) (more correctly, one should use the maximal value of the logarithmic likelihood multiplied by the coefficient (–2/n) instead of the first term in (6.57) and (6.58); see, e.g., EViews). The numerator of the second term contains the number of estimated parameters (including the level parameter μ) and penalizes unnecessarily high orders k and l, and n is the length of the given time series. The criterion BIC produces a strongly consistent estimator of the model order (i.e., this estimator converges to the true order with probability one for increasing n), but it may have a high variance (i.e., it lacks efficiency). The properties of the criterion AIC are just the opposite: the corresponding estimator of the model order is not consistent, but it is efficient.
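The criteria (6.57) and (6.58) and the optimization (6.56) translate into a few lines of code. In the sketch below the function names are hypothetical and the white-noise variances in the usage example are invented purely for illustration (they are not the values of Table 6.4); they only show how the stronger penalty of BIC can select a lower order than AIC.

```python
import math

def aic(sigma2_hat, k, l, n):
    """AIC(k, l) according to (6.57)."""
    return math.log(sigma2_hat) + 2.0 * (k + l + 1) / n

def bic(sigma2_hat, k, l, n):
    """BIC(k, l) according to (6.58)."""
    return math.log(sigma2_hat) + (k + l + 1) * math.log(n) / n

def select_order(candidates, n, criterion):
    """Return the pair (k, l) minimizing the criterion, see (6.56).

    candidates maps (k, l) to the estimated white-noise variance of ARMA(k, l).
    """
    return min(candidates,
               key=lambda kl: criterion(candidates[kl], kl[0], kl[1], n))

# Usage with made-up variances for three autoregressions and n = 40.
hypothetical = {(1, 0): 4.0, (2, 0): 3.4, (4, 0): 2.9}
print(select_order(hypothetical, 40, aic), select_order(hypothetical, 40, bic))
```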
Example 6.2 The values of information criteria AIC and BIC calculated by means
of EViews for the time series from Example 6.1 (Dreimonatsgeld) are shown in
Table 6.4, where one examines autoregressions up to the order six. The identified
process is AR(4), since the process AR(2) according to BIC is nested into the process
AR(4) according to AIC.
⋄
Remark 6.13 If two models are acceptable in the identification step and one of
them is nested into the second one (e.g., AR(2) is nested into AR(4); see above), then
one can decide on proper identification by means of statistical tests (F-test or
Lagrange Multiplier (LM) test), which test whether the parameters distinguishing
the both models are zero (see, e.g., EViews).
⋄
6.3.2 Estimation of Model
Simple models of the Box–Jenkins methodology (up to order two) can be estimated by means of moment estimates that make use of relations between the parameters of the identified model and its autocorrelations (see Table 6.5: e.g., according to (6.39) it holds φ1 = ρ1 in the process AR(1)). However, such estimates are usually regarded as preliminary and are used in practice as initial values for more complex (iterative) procedures.
The estimation procedures that produce the final estimates (not only the initial ones) in particular models are, in practice, left to software. For the process AR( p)
one can use the classical OLS estimation (including the classical estimation of its variance matrix). Under the stationarity assumption, this estimate is consistent due to the orthogonality of regressors to residuals in (6.59) (the orthogonality cov(yt–1, εt) = ... = cov(yt–p, εt) = 0 can be shown by expressing the process AR( p) in the form of the linear process (6.17)).
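Table 6.5 Moment estimates of simple stationary and invertible models of the Box–Jenkins methodology and check inequalities for estimated autocorrelations
Model | Moment estimates | Check inequalities for r_k
AR(1) | $\hat\varphi_1 = r_1$, $\hat\sigma^2 = \hat\sigma_y^2 (1 - \hat\varphi_1 r_1)$ | $|r_1| < 1$
AR(2) | $\hat\varphi_1 = \frac{r_1 (1 - r_2)}{1 - r_1^2}$, $\hat\varphi_2 = \frac{r_2 - r_1^2}{1 - r_1^2}$, $\hat\sigma^2 = \hat\sigma_y^2 (1 - \hat\varphi_1 r_1 - \hat\varphi_2 r_2)$ | $|r_2| < 1$, $r_1^2 < \frac{1 + r_2}{2}$
MA(1) | $\hat\theta_1 = \frac{1 - \sqrt{1 - 4 r_1^2}}{2 r_1}$, $\hat\sigma^2 = \frac{\hat\sigma_y^2}{1 + \hat\theta_1^2}$ | $|r_1| < 1/2$
MA(2) | $\hat\theta_1 = \hat\theta_2 \approx 0.1$, $\hat\sigma^2 = \frac{\hat\sigma_y^2}{1 + \hat\theta_1^2 + \hat\theta_2^2}$ | $r_1 + r_2 > -1/2$, $r_2 - r_1 > -1/2$, $r_1^2 < 4 r_2 (1 - 2 r_2)$
ARMA(1, 1) | $\hat\varphi_1 = \frac{r_2}{r_1}$, $\hat\theta_1 = \frac{\hat b \pm \sqrt{\hat b^2 - 4}}{2}$ with $|\hat\theta_1| < 1$, $\hat b = \frac{1 - 2 r_2 + \hat\varphi_1^2}{r_1 - \hat\varphi_1}$, $\hat\sigma^2 = \frac{\hat\sigma_{\tilde y}^2}{1 + \hat\theta_1^2}$, where $\hat\sigma_{\tilde y}^2 = \frac{1}{n}\sum_{t=1}^{n} (\tilde y_t - \bar{\tilde y})^2$ and $\tilde y_t = y_t - \hat\varphi_1 y_{t-1}$ | $2 r_1^2 - |r_1| < r_2 < |r_1|$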
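Two rows of Table 6.5 can be sketched as code. This is a hedged illustration with hypothetical function names: the MA(1) estimate takes the invertible solution of ρ1 = θ1/(1 + θ1²), and the ARMA(1, 1) estimate first sets φ1 = r2/r1 and then picks the root of θ² – bθ + 1 = 0 with absolute value below one. The error handling is an assumption of this sketch, not part of the table.

```python
import math

def ma1_moment_estimate(r1):
    """Moment estimate of theta_1 in MA(1); requires 0 < |r1| < 1/2."""
    if not 0.0 < abs(r1) < 0.5:
        raise ValueError("the check inequality |r1| < 1/2 (with r1 != 0) is violated")
    return (1.0 - math.sqrt(1.0 - 4.0 * r1 * r1)) / (2.0 * r1)

def arma11_moment_estimates(r1, r2):
    """Moment estimates (phi_1, theta_1) in ARMA(1, 1) from r1 and r2."""
    phi1 = r2 / r1
    b = (1.0 - 2.0 * r2 + phi1 * phi1) / (r1 - phi1)
    if abs(b) <= 2.0:
        raise ValueError("no real invertible solution for theta_1")
    root = math.sqrt(b * b - 4.0)
    # The two roots multiply to one; keep the one inside (-1, 1).
    theta1 = (b - root) / 2.0 if b > 0 else (b + root) / 2.0
    return phi1, theta1

# Usage: autocorrelations implied by phi_1 = 0.5, theta_1 = 0.3 via (6.50).
rho1 = (1 + 0.5 * 0.3) * (0.5 + 0.3) / (1 + 0.3 ** 2 + 2 * 0.5 * 0.3)
print(arma11_moment_estimates(rho1, 0.5 * rho1))
```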
In the case of the stationary and invertible process ARMA( p, q) (with zero mean
value for simplicity)
one usually uses the NLS (nonlinear least squares) estimates, which are computed by means of iterative algorithms (Gauss–Newton and others; see, e.g., EViews). The corresponding NLS procedures consist mostly in minimizing the sum of squares (which is nonlinear in the parameters φ1, ..., θq)
$$\min_{\varphi_1, \ldots, \theta_q} \sum_{t=p+1}^{n} e_t(\varphi_1, \ldots, \theta_q)^2, \qquad (6.61)$$
where the residuals et(φ1, ..., θq) are constructed recursively by means of the relation
$$e_t(\varphi_1, \ldots, \theta_q) = e_t = y_t - \varphi_1 y_{t-1} - \ldots - \varphi_p y_{t-p} - \theta_1 e_{t-1} - \ldots - \theta_q e_{t-q} \quad \text{for } t = p+1, \ldots, n, \qquad (6.62)$$
with suitable initial values ep–q+1, ..., ep. Finally, the estimate of the variance σ² of white noise is obtained by dividing the minimal value of (6.61) by the length n of the time series. Under the normality assumption and for higher n, these estimates are very
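Table 6.6 Approximate standard deviations of estimated parameters of simple stationary and invertible models of the Box–Jenkins methodology
Model | Standard deviations of estimated parameters
AR(1) | $\hat\sigma(\hat\varphi_1) \approx \left(\frac{1 - \hat\varphi_1^2}{n}\right)^{1/2}$
AR(2) | $\hat\sigma(\hat\varphi_1) \approx \hat\sigma(\hat\varphi_2) \approx \left(\frac{1 - \hat\varphi_2^2}{n}\right)^{1/2}$
MA(1) | $\hat\sigma(\hat\theta_1) \approx \left(\frac{1 - \hat\theta_1^2}{n}\right)^{1/2}$
MA(2) | $\hat\sigma(\hat\theta_1) \approx \hat\sigma(\hat\theta_2) \approx \left(\frac{1 - \hat\theta_2^2}{n}\right)^{1/2}$
ARMA(1, 1) | $\hat\sigma(\hat\varphi_1) \approx \left(\frac{(1 - \hat\varphi_1^2)(1 + \hat\varphi_1\hat\theta_1)^2}{n\,(\hat\varphi_1 + \hat\theta_1)^2}\right)^{1/2}$, $\hat\sigma(\hat\theta_1) \approx \left(\frac{(1 - \hat\theta_1^2)(1 + \hat\varphi_1\hat\theta_1)^2}{n\,(\hat\varphi_1 + \hat\theta_1)^2}\right)^{1/2}$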
$$L(\varphi_1, \ldots, \theta_q, \sigma^2) = -\frac{n-p}{2}\ln(2\pi) - \frac{n-p}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=p+1}^{n} e_t^2(\varphi_1, \ldots, \theta_q). \qquad (6.63)$$
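The residual recursion (6.62), the sum of squares (6.61), and the logarithmic likelihood (6.63) can be sketched as follows. The function names are hypothetical, and setting the initial residuals to zero is an assumption of this sketch (the text only asks for "suitable" initial values). An actual NLS or ML estimate would pass these functions to an iterative optimizer, which is not shown here.

```python
import math

def css_residuals(y, phi, theta):
    """Residuals e_{p+1}, ..., e_n from the recursion (6.62).

    y holds y_1, ..., y_n; phi = [phi_1, ..., phi_p]; theta = [theta_1, ..., theta_q].
    Initial residuals are set to zero (an assumption of this sketch).
    """
    p, q, n = len(phi), len(theta), len(y)
    e = [0.0] * n
    for t in range(p, n):               # 0-based index t corresponds to time t + 1
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p))
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        e[t] = y[t] - ar - ma
    return e[p:]

def sum_of_squares(y, phi, theta):
    """Objective function of (6.61)."""
    return sum(v * v for v in css_residuals(y, phi, theta))

def log_likelihood(y, phi, theta, sigma2):
    """Logarithmic likelihood (6.63)."""
    m = len(y) - len(phi)
    return (-0.5 * m * math.log(2.0 * math.pi) - 0.5 * m * math.log(sigma2)
            - sum_of_squares(y, phi, theta) / (2.0 * sigma2))

# Usage: build an ARMA(1, 1) path from known innovations and recover them.
eps = [0.0, 0.5, -1.0, 0.3, 0.8, -0.2]
y = [1.0]
for t in range(1, len(eps)):
    y.append(0.5 * y[t - 1] + eps[t] + 0.3 * eps[t - 1])
print(css_residuals(y, [0.5], [0.3]))
```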
Remark 6.14 Table 6.6 enables us to evaluate the errors of estimated parameters by means of their approximate standard deviations (the applicability and derivations can be found in Box and Jenkins (1970, Chapter 7)). For instance, in the process AR(1), one can evaluate the corresponding error as
$$\hat\sigma(\hat\varphi_1) = \left(\frac{1 - \hat\varphi_1^2}{n}\right)^{1/2}. \qquad (6.64)$$
⋄
Example 6.3 In Table 6.7, the time series from Example 6.1 (Dreimonatsgeld)
identified as the process AR(4) (see also Example 6.2) is estimated by means of
EViews as
⋄
Table 6.7 Estimation of the process AR(4) from Example 6.3 (Dreimonatsgeld) calculated by means of EViews
Variable | Coefficient | Std. Error | t-Statistic | Prob.
C        |  6.203253 | 0.427508 | 14.51027  | 0.0000
AR(1)    |  0.950188 | 0.163212 |  5.821793 | 0.0000
AR(2)    | –0.725744 | 0.216765 | –3.348060 | 0.0021
AR(3)    |  0.541991 | 0.215576 |  2.514154 | 0.0173
AR(4)    | –0.451544 | 0.165381 | –2.730320 | 0.0103
R-squared          | 0.567557  | Mean dependent var    | 6.186667
Adjusted R-squared | 0.511758  | S.D. dependent var    | 2.513598
S.E. of regression | 1.756359  | Akaike info criterion | 4.092609
Sum squared resid  | 95.62875  | Schwarz criterion     | 4.312543
Log likelihood     | –68.66697 | F-statistic           | 10.17145
Durbin–Watson stat | 2.057756  | Prob(F-statistic)     | 0.000022
Inverted AR Roots  | 0.70–0.49i, 0.70+0.49i, –0.22+0.76i, –0.22–0.76i
Source: Calculated by EViews
In this case, one checks whether the estimated model fulfills the condition of
stationarity, i.e., whether the roots of estimated autoregressive polynomial lie outside
the unit circle in complex plane (or equivalently, whether their inverted values,
which are the roots of autoregressive polynomial written with the opposite order of
powers z p – φ1 z p – 1 – ... – φp, lie inside this circle; see Example 6.4). In particular,
this check of stationarity is important in the cases in which the estimation method is
strongly based on the stationarity assumption (e.g., for the estimates based on the
Yule–Walker equations; see Remark 6.8). It is also possible to separate several
segments in the given time series and to test the coincidence of estimated levels,
variances, and autocorrelations (or higher moments such as skewness and others)
among particular segments.
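The stationarity check by means of inverted roots can be illustrated with the values reported in Table 6.7. The sketch below makes two assumptions that should be stated loudly: the minus signs of the coefficients AR(2), AR(4) and of the real parts –0.22 are restored from context (they are not legible in the extracted table), and the function names are hypothetical. The code verifies that all inverted roots lie inside the unit circle and that the polynomial z⁴ – φ1 z³ – φ2 z² – φ3 z – φ4 rebuilt from the rounded roots roughly reproduces the estimated coefficients.

```python
# Inverted AR roots from Table 6.7 (signs restored from context: an assumption).
INVERTED_ROOTS = [complex(0.70, -0.49), complex(0.70, 0.49),
                  complex(-0.22, 0.76), complex(-0.22, -0.76)]
# Estimated AR(4) coefficients from Table 6.7 (signs restored likewise).
PHI_HAT = [0.950188, -0.725744, 0.541991, -0.451544]

def poly_from_roots(roots):
    """Coefficients (highest power first) of the product of (z - root)."""
    coeffs = [1.0 + 0j]
    for r in roots:
        new = coeffs + [0j]
        for i in range(len(coeffs)):
            new[i + 1] -= r * coeffs[i]
        coeffs = new
    return coeffs

def implied_ar_coefficients(roots):
    """phi_1, ..., phi_p implied by z^p - phi_1 z^(p-1) - ... - phi_p."""
    return [-c.real for c in poly_from_roots(roots)[1:]]

def is_stationary(inverted_roots):
    """Stationarity: all inverted roots lie inside the unit circle."""
    return all(abs(z) < 1.0 for z in inverted_roots)

# Usage: moduli of the inverted roots and the coefficients they imply.
print([round(abs(z), 3) for z in INVERTED_ROOTS])
print([round(c, 3) for c in implied_ar_coefficients(INVERTED_ROOTS)])
```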
Another approach (so-called impulse response) consists in analyzing the response to an impulse that occurs in the estimated model either at a single time moment or repeatedly from this moment on, and that influences the consecutive values of the process (such an impulse is mostly standardized to the size of the standard deviation of the corresponding white noise or to a multiple of this standard deviation). For example, the estimated ARMA structure is transferred to the form of the linear process (6.17), into which one substitutes (from a given time moment) an “artificial” innovation process {εt} either with a single nonzero value at this time or with fixed nonzero values from this moment on. If the given time series is stationary, then with increasing time distance from the initial moment of the impulse (1) the response to a single impulse should gradually fade away to the zero level and (2) the response to repeated impulses should stabilize at an appropriate (nonzero) level (see Example 6.4).
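Both kinds of response can be computed from the weights of the linear process. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the AR(4) coefficients are those of Table 6.7 with minus signs restored from context, and the impulse size of one estimated standard deviation of white noise (the S.E. of regression in Table 6.7) is chosen for this sketch only. For an AR( p) model the weights follow the recursion ψ0 = 1, ψj = φ1 ψj–1 + ... + φp ψj–p.

```python
PHI = [0.950188, -0.725744, 0.541991, -0.451544]  # Table 6.7, signs assumed
SIGMA = 1.756359                                  # S.E. of regression, Table 6.7

def impulse_response(phi, horizon, size=1.0):
    """Response to a single impulse of the given size at time 0."""
    psi = []
    for j in range(horizon):
        if j == 0:
            psi.append(1.0)
        else:
            psi.append(sum(phi[i] * psi[j - 1 - i]
                           for i in range(min(len(phi), j))))
    return [size * v for v in psi]

def accumulated_response(phi, horizon, size=1.0):
    """Response to impulses of the given size repeated at every time from 0 on."""
    total, out = 0.0, []
    for v in impulse_response(phi, horizon, size):
        total += v
        out.append(total)
    return out

# Usage: the single-impulse response dies out, the repeated one levels off.
single = impulse_response(PHI, 200, SIGMA)
repeated = accumulated_response(PHI, 200, SIGMA)
print(single[-1], repeated[-1], SIGMA / (1.0 - sum(PHI)))
```

In a stationary model the accumulated response converges to size / (1 – φ1 – ... – φp), which is the last value printed above.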
This check means first of all the coincidence of the correlation structure estimated
from the data (i.e., the functions rk and rkk) with the correlation structure derived
from the estimated model that is to be verified (see Example 6.4). Another check of
structure of model is based on testing the uncorrelatedness (e.g., by means of Q-tests;
see below) in the white noise that has been estimated using the tested model.
An important diagnostic instrument is the estimated white noise {ε̂t} constructed from the estimated model of the given time series (similarly to the residuals calculated using an estimated regression model). The graphical record of this estimated white noise (and its estimated correlogram, histogram, etc.) can indicate possible flaws of the model (in standard situations the estimated white noise is usually expected to show zero mean value, constant variance, uncorrelatedness, and normality; see Example 6.4).
The uncorrelatedness of the estimated white noise (see above) can be tested under the normality assumption directly by means of the test based on Bartlett's approximation (6.54), where we use the estimated autocorrelations rk(ε̂t) of the estimated white noise. The null hypothesis of uncorrelatedness has the critical region (applying the significance level of 5 %)
$$|r_k(\hat\varepsilon_t)| \ge 2\sqrt{\frac{1}{n}} \quad \text{for } k = 1, 2, \ldots. \qquad (6.65)$$
However, the so-called Q-tests (or equivalently portmanteau tests) are also frequently used; they test cumulatively the significance of the first K autocorrelations of the estimated white noise (the integer K must be chosen in advance, with the recommended size K ≈ √n, where n is the length of the given time series). In this way, one simultaneously verifies the structure ARMA( p, q) used, since the corresponding Q-statistic of this test has the asymptotic distribution χ²(K – p – q) (under the null hypothesis that the original time series can be modeled by ARMA( p, q)). As far as the Q-statistics are concerned, in practice one uses mainly the Box–Pierce statistic with the critical region (applying the significance level α) of the form
$$Q = n \sum_{k=1}^{K} \left(r_k(\hat\varepsilon_t)\right)^2 \ge \chi^2_{1-\alpha}(K - p - q) \qquad (6.66)$$
or the statistically more powerful Ljung–Box statistic with the critical region (applying again the significance level α) of the form
$$Q = n(n + 2) \sum_{k=1}^{K} \frac{1}{n - k}\left(r_k(\hat\varepsilon_t)\right)^2 \ge \chi^2_{1-\alpha}(K - p - q). \qquad (6.67)$$
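Both Q-statistics are simple functions of the residual autocorrelations. The sketch below uses hypothetical function names and computes only the statistics (6.66) and (6.67) together with the degrees of freedom; the comparison with the quantile χ²(K – p – q) is left to statistical tables or software, since the Python standard library offers no chi-square quantile function.

```python
def box_pierce(r, n):
    """Box-Pierce statistic (6.66); r = [r_1, ..., r_K] of the estimated white noise."""
    return n * sum(v * v for v in r)

def ljung_box(r, n):
    """Ljung-Box statistic (6.67) for the same autocorrelations."""
    return n * (n + 2) * sum(v * v / (n - k) for k, v in enumerate(r, start=1))

def q_test_degrees_of_freedom(K, p, q):
    """Degrees of freedom of the asymptotic chi-square distribution."""
    return K - p - q

# Usage with two illustrative residual autocorrelations and n = 100.
print(box_pierce([0.1, -0.2], 100), ljung_box([0.1, -0.2], 100))
```

The Ljung–Box value always exceeds the Box–Pierce value for the same input, because each squared autocorrelation is weighted by (n + 2)/(n – k) > 1.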
Example 6.4 The stationarity of model AR(4) estimated in Example 6.3 (Drei-
monatsgeld) was checked at first in the framework of verification: in Table 6.7 and
Fig. 6.5, one can see the inverted roots of estimated autoregressive polynomial
which lie distinctly inside the unit circle in complex plane. Further in Fig. 6.6 one
shows the response corresponding to impulse (standardized to the double size of
estimated standard deviation of white noise) and to repeated impulses. In the first
case, the response fades away gradually reaching the zero level finally and in the
second case it stabilizes to the level of 2.56. Obviously, none of performed checks
reject stationarity.
Fig. 6.6 Response to impulse standardized to the double size of (estimated) standard deviation of white noise in the case of (a) single impulse (see upper graph) and (b) repeated impulses (see lower graph) from Example 6.4 (Dreimonatsgeld) calculated by means of EViews
In Fig. 6.7, one compares the correlation structure estimated from data (i.e., the
functions rk and rkk) with the correlation structure derived from estimated model (i.e.,
the functions ρk and ρkk corresponding to the estimated model). The achieved
coincidence testifies to the adequacy of constructed model AR(4).
Finally, Table 6.8 shows the estimated autocorrelations of the estimated white noise and the results of a Q-test. According to (6.65) one gets
$$|r_k(\hat\varepsilon_t)| \le 2\sqrt{\frac{1}{40}} = 0.316 \quad \text{for } k = 1, 2, \ldots. \qquad (6.68)$$
In addition using the Q-test based on Ljung–Box statistics (6.67) for various K ¼ 5, 6,
... , one cannot reject (applying the significance level 5 %) the null hypothesis on
uncorrelatedness of white noise (i.e., null hypothesis on adequacy of the constructed
model AR(4)).
⋄
Fig. 6.7 Coincidence of the correlation structure estimated from data (i.e., the functions rk and rkk) with the correlation structure derived from the estimated model (i.e., the functions ρk and ρkk) in Example 6.4 (Dreimonatsgeld) calculated by means of EViews; upper panel: ACF, lower panel: PACF
$$y_t = \alpha + \beta t + \varepsilon_t, \qquad (6.69)$$
where εt is a white noise (see, e.g., Fig. 3.2). It is an example of the so-called deterministic nonstationarity caused, for instance, by a deterministic trend (in our case by a straight line); when the trend is eliminated, the time series becomes stationary (e.g., white noise in our case).
2. The second type of nonstationarity is represented, e.g., by the model
$$y_t = \alpha + y_{t-1} + \varepsilon_t, \qquad (6.70)$$
where εt is again the white noise with variance σ² (see, e.g., Fig. 6.8(a)), though here one usually assumes in addition that εt ~ iid. It is an example of the stochastic nonstationarity, which can be modeled in some specific situations by
Fig. 6.8 (a) Index PX in the year 2016 (values for 251 trading days) from Example 6.5. (b) First differences of the time series from (a)
special (stochastic) models and then also made stationary by exploiting these models in a suitable way. More specifically, the model (6.70) is the so-called random walk with drift, and in this case, the corresponding time series can be “stationarized” simply by transferring it to the time series of first differences Δyt, since from the model (6.70) one easily obtains
$$\Delta y_t = \alpha + \varepsilon_t. \qquad (6.71)$$
The right-hand side of (6.71), i.e., the white noise shifted to the level α, is trivially a stationary time series. The principle of stochastic nonstationarity in the case of the model (6.70) can be presented better if one rewrites it in the form
$$y_t = y_1 + \alpha (t - 1) + \sum_{\tau=2}^{t} \varepsilon_\tau. \qquad (6.72)$$
The time series obviously has not only the deterministic trend (namely, the linear trend with the slope α), but also a stochastic trend consisting in the progressive cumulation of white noise. From the interpretation point of view, the conditional values are also interesting (under the assumption of mutual independence of εt)
According to (6.73), this time series tends not to revert to the original level; on the contrary, it tends to higher (or lower) values for α > 0 (or α < 0), respectively, since the growth rate of the mean value is O(t), while for the standard deviation it is only O(√t). Even in the case of zero slope (α = 0), the random walk without drift (in contrast to the white noise) intersects the horizontal axis (i.e., the zero level) only rarely. Moreover, the relations (6.74) imply that the mean level and variance (volatility) are unbounded, while the autocorrelation function has values near to one and decreases to zero at a rate slower than linear.
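The equivalence of the recursive form (6.70), the differenced form (6.71), and the cumulated form (6.72) can be verified on a simulated path. The sketch below is illustrative; the function name, the seed, and the normal distribution of the white noise are assumptions.

```python
import random

def random_walk_with_drift(alpha, n, y1=0.0, sigma=1.0, seed=0):
    """Simulate y_t = alpha + y_{t-1} + eps_t for t = 2, ..., n, see (6.70).

    Returns the path [y_1, ..., y_n] and the innovations [eps_2, ..., eps_n].
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n - 1)]
    y = [y1]
    for e in eps:
        y.append(alpha + y[-1] + e)
    return y, eps

# Usage: first differences equal alpha + eps_t, as in (6.71).
y, eps = random_walk_with_drift(alpha=0.2, n=251, y1=100.0)
diffs = [b - a for a, b in zip(y, y[1:])]
print(max(abs(d - (0.2 + e)) for d, e in zip(diffs, eps)))
```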
Remark 6.15 Let us rewrite the relation (6.70) in a more general form
$$y_t = \alpha + \varphi_1 y_{t-1} + \varepsilon_t \qquad (6.75)$$
(the relation (6.70) is the special case for φ1 = 1). If |φ1| < 1, then (6.75) is obviously the stationary process AR(1) with nonzero mean value μ = α/(1 – φ1)
$$y_t - \mu = \varphi_1 (y_{t-1} - \mu) + \varepsilon_t \qquad (6.76)$$
(see Remark 6.12), which can be rewritten as Δyt = (φ1 – 1)(yt–1 – μ) + εt. Then the conditional mean value (6.73) of such a stationary process AR(1) obviously fulfills
i.e., now in contrast to the random walk with drift, the process {yt} does not drift, but it reverts to the previous level (so-called mean reverting). Finally, the remaining case of |φ1| > 1 is a very special one since then {yt} is an explosive process comparable with the powers of φ1 (e.g., the process yt = 2yt–1 + εt behaves for later times t as the deterministic sequence 2^t regardless of the size of the white noise εt).
Remark 6.16 Once more let us stress the distinction in stationarization for the
model described above:
• The stationarity of the model (6.69) with deterministic trend can be achieved
simply by regression methods eliminating trend. The stationarization based on
differences should not be used since it may lead to models with residuals in the
form of a (noninvertible) MA process
with both deterministic and stochastic trend, then the effort to eliminate the trend
by means of regression methods would face the problem that, e.g., t-ratio may not
have (not even asymptotically) t-distribution. The model (6.79) can be also
rewritten as
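$$y_t - \beta_0 - \beta_1 t = \varphi_1 \left(y_{t-1} - \beta_0 - \beta_1 (t - 1)\right) + \varepsilon_t, \quad \text{where} \quad \beta_1 = \frac{\beta}{1 - \varphi_1}, \quad \beta_0 = \frac{\alpha - \varphi_1\beta_1}{1 - \varphi_1}, \qquad (6.80)$$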
i.e., if |φ1| < 1, then (6.80) is in fact the stationary process AR(1) with linear
trend.
The possibility to stationarize the analyzed time series by means of differencing can
be considered as the evidence of existence of (nearly) unit roots of the
autoregressive operator for given model (e.g., the autoregressive operator of the
model (6.70) has obviously the single root equal to one). In accord with the
previous discussion, the decision on existence of such a unit root (or multiple
unit roots) is often the key point of the corresponding analysis, even if the form
of estimated correlogram can indicate the presence of such a root (namely, a very
slow decline starting from unit to zero, since the particular estimated autocorrela-
tions converge to 1 with the increasing length of nonstationary time series).
However, a subjective checking of estimated correlograms cannot usually distin-
guish nonstationary models of the type yt ¼ yt–1 + εt from stationary ones with
nearly unit root of the type yt ¼ 0.95yt–1 + εt so that an application of a statistical
test with a prescribed significance level is recommended here.
6.4 Stochastic Modeling of Trend 153
The DF test (see Dickey and Fuller (1979, 1981)) was the pioneering one among the unit root tests. In particular, Dickey and Fuller suggested three variants, all denoted as τ-tests:
(1) τ-test: H0: yt = yt–1 + εt versus H1: yt = φ1yt–1 + εt for φ1 < 1, i.e., the one-tailed test of random walk versus stationary AR(1) process (the possible nonstationarity caused by φ1 ≤ –1 is not important in practice);
(2) τμ-test: H0: yt = yt–1 + εt versus H1: yt = α + φ1yt–1 + εt for φ1 < 1, i.e., the one-tailed test of random walk versus stationary AR(1) process with (nonzero) level;
(3) ττ-test: H0: yt = yt–1 + εt versus H1: yt = α + β t + φ1yt–1 + εt for φ1 < 1, i.e., the one-tailed test of random walk versus stationary AR(1) process with linear trend.
The null hypothesis in each of all three cases can be written simply as
where $\psi = \varphi_1 - 1$ and (1) $\alpha = \beta = 0$ and (2) $\beta = 0$. One should stress that all
alternatives consist in the inequality $\psi < 0$ only, and the equalities $\alpha = \beta = 0$ in
alternative (1) or $\beta = 0$ in alternative (2) are not investigated at all (nor are the
numerical values of the intercept α or the slope β, whose correctness is not guaranteed
under the nonstationarity caused by $\psi = 0$ anyhow).
The test statistic in each of the three variants of the DF test is the classical t-ratio
(we simply test the significance of the regression parameter ψ in the model (6.81))
$$DF = \frac{\hat\psi}{\hat\sigma(\hat\psi)} \tag{6.83}$$
using the estimates constructed by means of the methodology from Sect. 6.3.2 and
with the critical region
$$DF \le t_\alpha(n). \tag{6.84}$$
However, under the null hypothesis $\psi = 0$, the test statistic DF does not have a
t-distribution (not even asymptotically or under the assumption of εt ~ iid), as is the
case for the classical t-ratio, but a nonstandard distribution, for which one must
calculate the critical value in (6.84) by means of simulations separately for the
particular tests (1), (2), and (3) and for particular lengths n of the time series (see
selected critical values for the asymptotic case $n \to \infty$ in Table 6.9). It holds
generally that this distribution has heavier tails than the corresponding t-distribution,
so that its critical values are much higher in absolute value than for the
t-distribution (e.g., the critical value –3.41 for 5 % and $n \to \infty$ is in absolute value
about twice the critical value –1.645 of the classical t-test, i.e., one needs a more
significant value of the t-ratio to reject the null hypothesis $\psi = 0$). The reason is that
the regressor is nonstationary (see also the introduction of Sect. 6.4). Although the
critical values were originally calculated by Dickey and Fuller, nowadays software
systems use more sophisticated algorithms that deliver the corresponding p-values
directly; see, e.g., MacKinnon (1996).
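The nonstandard distribution of the DF statistic can be illustrated by a small simulation. The following is only a minimal sketch for the τ-test (variant (1), no intercept and no trend); the function names, the series length n = 100, the number of replications, and the seed are illustrative assumptions and are not taken from the text.

```python
import math
import random


def df_tau_statistic(y):
    """t-ratio of psi in the regression  dy_t = psi * y_{t-1} + e_t  (no intercept)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(x * x for x in lagged)
    psi_hat = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    residuals = [d - psi_hat * x for x, d in zip(lagged, diffs)]
    s2 = sum(e * e for e in residuals) / (len(diffs) - 1)  # one estimated parameter
    return psi_hat / math.sqrt(s2 / sxx)


def simulate_df_quantile(n=100, replications=2000, level=0.05, seed=1):
    """Empirical lower quantile of the DF statistic under H0 (pure random walk)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(replications):
        y = [0.0]
        for _ in range(n):
            y.append(y[-1] + rng.gauss(0.0, 1.0))
        stats.append(df_tau_statistic(y))
    stats.sort()
    return stats[int(level * replications)]


if __name__ == "__main__":
    q = simulate_df_quantile()
    print(f"simulated 5% critical value of the tau-test: {q:.2f}")
```

The simulated 5 % quantile should lie clearly below the value –1.645 of the classical one-sided t-test, which is the point made above; exact figures depend on n, the number of replications, and the seed.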
The previous DF test is applicable only if the residual component εt has the form of
independent white noise. However, if the model (6.81) explaining the dependent
variable $\Delta y_t$ leaves some autocorrelation unmodeled, then the type I error of the DF
test (i.e., the probability of rejecting a valid H0) is higher than the declared α.
Mainly for this case, the so-called augmented DF test (ADF test) has been suggested,
which has the null hypothesis of the form
$$H_0: \Delta y_t = \psi y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta y_{t-i} + \varepsilon_t \quad \text{for } \psi = 0 \tag{6.85}$$
instead of (6.81). The test statistic and the critical values for the particular variants
(1), (2), and (3) (i.e., for the τ-test, $\tau_\mu$-test, and $\tau_\tau$-test) are the same as before the
augmentation (the test again concerns the parameter ψ only). The added autoregressive
terms in (6.85) absorb the dynamic structure explaining the dependent variable. To
identify the order p of the added autoregressive terms, one recommends applying the
information criteria (see Sect. 6.3.1.2).
The PP test (see Phillips and Perron (1988)) is similar to the ADF test, except that it
models the possible autocorrelation of residuals not by adding autoregressive terms as
in (6.85), but directly by correcting the estimated standard deviation in the
denominator of the original DF statistic (6.83). Essentially, it is a HAC approach.
The KPSS test (see Kwiatkowski et al. (1992)) compensates for the low power that the
DF test sometimes has. For instance, one should reject the null hypothesis of a unit
root for the theoretical model $y_t = 0.95\,y_{t-1} + \varepsilon_t$. If the test fails to do so, then
either the model is really nonstationary or we do not have sufficient information to
reject the null hypothesis (e.g., only a short segment of the time series
$y_t = 0.95\,y_{t-1} + \varepsilon_t$ is observed). Therefore, the KPSS test was constructed so that
the hypotheses H0 and H1 are just the opposite of those in the ADF test (i.e., the null
hypothesis H0 represents stationarity versus nonstationarity in the alternative H1).
Moreover, one recommends always carrying out the ADF test and the KPSS test together,
with the following conclusions: (a) if $H_0^{ADF}$ is rejected and $H_0^{KPSS}$ cannot be
rejected, then stationarity is confirmed; (b) if $H_0^{ADF}$ cannot be rejected and
$H_0^{KPSS}$ is rejected, then nonstationarity is confirmed; and (c) both remaining
combinations are regarded as inconclusive. To summarize the topic, the previous tests
(and others) can be found in modern software systems recommended for time series
analysis.
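The joint reading of the ADF and KPSS results described in (a) to (c) can be written down as a small decision rule. This is only a sketch; the function name and the returned labels are illustrative choices, and the two Boolean inputs are assumed to come from tests carried out elsewhere at a chosen significance level.

```python
def combine_adf_kpss(adf_rejects_unit_root, kpss_rejects_stationarity):
    """Combine the outcomes of the ADF test (H0: unit root) and KPSS test (H0: stationarity)."""
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"        # case (a)
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "nonstationary"     # case (b)
    return "inconclusive"          # case (c): both reject or neither rejects


if __name__ == "__main__":
    print(combine_adf_kpss(adf_rejects_unit_root=False, kpss_rejects_stationarity=True))
```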
Nowadays the topic of the unit roots testing is very complex so that other
references should be also addressed for a deeper understanding (see, e.g., Brockwell
and Davis 1996, Section 6.3 or Heij et al. 2004, Section 7.3.3).
Example 6.5 Figure 6.8(a) and Table 6.10 show the values of the index PX of the Prague
Exchange (i.e., the time series {PXt}) for 251 trading days of the year 2016. This
time series seems to be the random walk $PX_t = PX_{t-1} + \varepsilon_t$ (see (6.70) for α = 0).
Table 6.11 shows the results of the DF test of type τ (i.e., with $H_1: y_t = \varphi_1 y_{t-1} + \varepsilon_t$ for
$\varphi_1 < 1$): the null hypothesis of nonstationarity with one unit root is not rejected even
at the significance level of 10 %. This test is performed simultaneously with the ADF
test, since according to Table 6.11 EViews automatically chooses the order p of the
autoregressive terms in (6.85) by means of the information criterion SIC (the
so-called Schwarz information criterion, sometimes also denoted as BIC; see
Sect. 6.3.1.2).
After passing to the first differences $\Delta PX_t$, the previous DF test of type τ
rejects the null hypothesis of nonstationarity (i.e., the existence of a second unit
root in the original time series PXt) at the significance level of 1 % (see
Table 6.12), so that taking first differences is sufficient to make the series PXt
stationary.
Finally, Table 6.13 presents the results of KPSS test by means of EViews that
rejects significantly the null hypothesis of stationarity of the (nondifferenced) time
series PXt even when applying the significance level of 1 %. This result together with
the previous ADF test confirms unambiguously the nonstationarity of PXt.
⋄
Table 6.10 Index PX in year 2016 (values for 251 trading days written in columns) from Example 6.5 (see also Fig. 6.8(a))
1 2 3 4 5 6 7 8 9 10
1 938.23 872.53 913.94 916.94 890.28 808.21 849.79 879.83 915.33 886.31
2 941.07 852.97 909.99 918.60 891.30 816.91 862.37 870.08 922.50 888.14
3 936.17 863.74 910.20 919.60 890.47 824.43 856.79 868.29 921.35 885.13
4 916.09 847.23 898.32 912.35 892.68 826.29 859.18 864.48 928.28 879.33
5 924.04 845.92 914.85 914.89 893.76 814.58 861.20 860.81 935.36 881.22
6 918.64 875.81 908.44 913.71 888.21 811.26 863.87 861.83 934.05 885.05
7 919.62 861.37 901.14 909.01 888.41 820.31 861.30 865.53 919.18 887.20
8 914.73 877.55 890.47 916.04 879.51 827.31 856.30 864.77 925.46 886.66
9 898.62 878.51 888.93 909.34 892.27 826.17 850.80 875.87 921.78 894.86
10 881.12 871.22 893.17 896.67 895.20 844.90 850.98 874.05 908.80 894.24
11 867.85 886.64 900.82 886.94 874.05 863.55 847.60 869.10 902.89 899.57
12 873.98 879.15 899.91 886.83 867.79 870.26 846.29 866.34 909.66 900.71
13 855.92 855.43 892.92 867.79 840.05 876.21 850.72 874.57 893.82 905.43
14 859.87 865.67 896.85 864.20 818.38 882.07 858.06 863.58 899.00 911.09
15 886.77 865.37 889.53 869.46 808.20 887.50 855.79 868.59 897.95 903.02
16 883.85 857.61 884.39 866.93 817.58 891.37 853.19 875.13 897.76 911.98
17 892.81 871.89 884.30 871.43 815.86 892.44 852.94 881.09 901.69 917.59
18 902.56 879.32 899.33 873.87 831.21 889.42 858.83 889.38 900.99 912.46
19 909.43 883.06 893.56 882.66 838.94 893.42 859.14 885.72 905.11 917.48
20 921.07 889.69 887.01 869.31 842.31 887.26 866.37 891.00 904.68 917.55
21 914.71 888.01 895.62 873.38 852.05 881.74 875.71 894.57 889.62 917.53
22 902.20 892.56 895.62 873.17 855.26 880.08 880.84 890.74 884.41 916.75
23 886.72 886.71 906.63 875.19 819.58 876.28 882.42 886.64 884.00 920.35
24 897.48 896.15 904.65 874.83 790.09 857.86 881.41 899.16 892.29 923.54
25 904.79 907.56 915.23 876.22 806.43 856.10 884.80 906.97 888.72 919.58
26 921.61
Table 6.11 DF test of time series of index PX from Example 6.5 by means of EViews
Null Hypothesis: PX2016 has a unit root
Exogenous: None
Lag Length: 0 (Automatic - based on SIC, maxlag=15)
                                              t-Statistic    Prob. (a)
Augmented Dickey–Fuller test statistic         –0.202174     0.6126
Test critical values:     1% level             –2.574245
                          5% level             –1.942099
                          10% level            –1.615852
(a) MacKinnon (1996) one-sided p-values
Source: calculated by EViews
Table 6.12 DF test of first differences of index PX from Example 6.5 by means of EViews
Null Hypothesis: D(PX2016) has a unit root
Exogenous: None
Lag Length: 0 (Automatic - based on SIC, maxlag=15)
                                              t-Statistic    Prob. (a)
Augmented Dickey–Fuller test statistic         –14.49788     0.0000
Test critical values:     1% level             –2.574282
                          5% level             –1.942104
                          10% level            –1.615849
(a) MacKinnon (1996) one-sided p-values
Source: calculated by EViews
Table 6.13 KPSS test of time series of index PX from Example 6.5 by means of EViews
Null Hypothesis: PX2016 is stationary
Exogenous: Constant
Bandwidth: 11 (Newey–West automatic) using Bartlett kernel
Kwiatkowski–Phillips–Schmidt–Shin test statistic 0.284476
Asymptotic critical values: 1% level 0.739000
5% level 0.463000
10% level 0.347000
Source: calculated by EViews
Time series with a stochastic trend of type (6.70), which can be made stationary by
differencing, are modeled as ARIMA processes in the Box–Jenkins methodology. The
integrated mixed process of order p, d, q, denoted as ARIMA(p, d, q), has the form
$$\varphi(B)\, w_t = \alpha + \theta(B)\, \varepsilon_t, \tag{6.86}$$
where
$$w_t = \Delta^d y_t \tag{6.87}$$
is the dth difference of the original time series yt (see also (3.62)), which is modeled
as a stationary (and invertible) process ARMA(p, q) in (6.86). In other words, the
principle of ARIMA processes is as follows: (1) first the modeled time series is made
stationary by suitable differencing, and (2) then the corresponding stationary time
series (denoted as wt in (6.86) and (6.87)) is modeled by means of a mixed ARMA
process. Usually one writes it summarily as
$$\varphi(B)\, \Delta^d y_t = \alpha + \theta(B)\, \varepsilon_t. \tag{6.88}$$
An important special case is the integrated process I(d), presented mostly in the simple
form
$$\Delta^d y_t = \varepsilon_t, \tag{6.89}$$
which can be constructed by “integrating” the white noise, e.g., for d = 1 it holds
$$y_t = y_1 + \sum_{\tau=2}^{t} \varepsilon_\tau. \tag{6.90}$$
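The relation between differencing and “integrating” in (6.87), (6.89), and (6.90) can be checked on a toy series. The sketch below is only illustrative; the helper names `difference` and `integrate_once` are hypothetical and not part of any library.

```python
from itertools import accumulate


def difference(y, d=1):
    """Return the d-th difference of the series y (each pass shortens it by one value)."""
    w = list(y)
    for _ in range(d):
        w = [b - a for a, b in zip(w[:-1], w[1:])]
    return w


def integrate_once(first_value, shocks):
    """Build y_1, y_2, ... with y_t = y_1 + sum of shocks up to time t, as in (6.90)."""
    return list(accumulate(shocks, initial=first_value))


if __name__ == "__main__":
    shocks = [1, -2, 3, 0, -1]
    y = integrate_once(5, shocks)   # an I(1) series built from the shocks
    print(y)
    print(difference(y))            # differencing recovers the shocks
```

Differencing the integrated series returns the original shocks, which is exactly the statement that an I(1) process becomes white noise after one difference.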
Remark 6.17 The drift parameter α serves to model a possible nonzero level of the
process wt, i.e., a deterministic trend in the form of polynomial of the dth order for
original time series {yt}. If d > 0, then the model ARIMA of time series yt is
invariant when shifting the time series by an arbitrary constant. Therefore, obviously
it makes no sense to center such series by subtracting the sample means before their
analysis.
⋄
Remark 6.18 The operator $\varphi(B)\Delta^d$ on the left-hand side of the model (6.88) is
called the generalized autoregressive operator. It is characterized by the property
that the corresponding polynomial $\varphi(z)(1-z)^d$ has p roots lying outside the unit
circle in the complex plane and, in addition, a unit root of multiplicity d. More general
types are the ARUMA processes, which have at least one root different from one but
lying on the unit circle, and the explosive processes (see also Remark 6.15), which
have at least one root inside the unit circle.
⋄
Fig. 6.9 Estimated correlogram (ACF) and partial correlogram (PACF), plotted for lags up to 24,
from Example 6.6 (index PX in the year 2016)
• The comparison of sample standard deviations (volatilities) of the time series $y_t$, $\Delta y_t$,
$\Delta^2 y_t$, ... (one chooses the order of differencing that shows the lowest
volatility; on the other hand, one must beware of overdifferencing, since
the volatilities can increase again for higher d).
• The application of information criteria (see Sect. 6.3.1.2), which can be modified
for ARIMA models.
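The volatility rule from the first item above can be sketched as follows. This is a simple illustration on a simulated random walk, not a recommended automatic procedure; the function names, the maximum order `max_d`, and the simulated data are assumptions made for the example.

```python
import random
import statistics


def volatility_by_order(y, max_d=3):
    """Sample standard deviation of y, its first difference, second difference, ..."""
    vols = {}
    w = list(y)
    for d in range(max_d + 1):
        vols[d] = statistics.stdev(w)
        w = [b - a for a, b in zip(w[:-1], w[1:])]
    return vols


def suggest_d(y, max_d=3):
    """Order of differencing with the lowest sample volatility."""
    vols = volatility_by_order(y, max_d)
    return min(vols, key=vols.get)


if __name__ == "__main__":
    rng = random.Random(0)
    walk = [0.0]
    for _ in range(500):
        walk.append(walk[-1] + rng.gauss(0.0, 1.0))
    print(volatility_by_order(walk))
    print("suggested d:", suggest_d(walk))
```

For a random walk the volatility drops sharply after the first difference and rises again after the second, which is the overdifferencing effect mentioned above.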
Example 6.6 For the index PX in the year 2016, Example 6.5 identified the random walk
$PX_t = PX_{t-1} + \varepsilon_t$ by means of unit root tests (the linear decrease of the
estimated correlogram of {PXt} in Fig. 6.9 also indicates the need to transform this
time series by differencing). In Table 6.14, {PXt} is estimated by means of EViews in
the form
$$\Delta PX_t = \varepsilon_t, \quad \hat\sigma = 9.26$$
(the intercept (or drift parameter) α in the model $\Delta PX_t = \alpha + \varepsilon_t$ is highly
insignificant; see Table 6.14).
⋄
Remark 6.19 One should correctly understand the meaning of constants in time
series models. In stationary models, the constant is related to the mean value (i.e.,
the level) of the corresponding process. For example, for the MA(1) process
$y_t = \alpha + \varepsilon_t + \theta_1 \varepsilon_{t-1}$ it is directly $\mu = E(y_t) = \alpha$. For the stationary AR(1) process
$y_t = \alpha + \varphi_1 y_{t-1} + \varepsilon_t$ it holds $\mu = \alpha/(1 - \varphi_1)$. Finally, for the random walk with
drift $y_t = \alpha + y_{t-1} + \varepsilon_t$, the constant α represents the slope of the process (even if
this trend is significantly overlaid by the integrated random walk).
⋄
Table 6.14 Estimation of random walk from Example 6.6 (index PX in the year 2016) calculated
by means of EViews
Dependent Variable: DPX2016
Method: Least Squares
DPX2016=C(1)
Coefficient Std. Error t-Statistic Prob.
C(1) –0.066480 0.585607 –0.113523 0.9097
R-squared 0.000000 Mean dependent var –0.066480
Adjusted R-squared 0.000000 S.D. dependent var 9.259263
S.E. of regression 9.259263 Akaike info criterion 7.293118
Sum squared resid 21,347.75 Schwarz criterion 7.307204
Log likelihood –910.6397 Hannan–Quinn criter. 7.298787
Durbin–Watson stat 1.834323
Source: Calculated by EViews
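The entries of Table 6.14 are mutually consistent, which can be verified by elementary arithmetic. The sketch below assumes that the regression uses n = 250 first differences (251 trading days minus one); this number is inferred from Example 6.5 and is not printed in the table.

```python
import math

# Values copied from Table 6.14; n = 250 is an assumption (251 observations, one lost by differencing)
n = 250
mean_dpx = -0.066480        # estimate of C(1), i.e., the mean of the differences
sd_dpx = 9.259263           # S.D. dependent var


def standard_error_of_mean(sd, n_obs):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n_obs)


def t_ratio(estimate, std_error):
    """Classical t-ratio of an estimate."""
    return estimate / std_error


if __name__ == "__main__":
    se = standard_error_of_mean(sd_dpx, n)
    print(f"Std. Error        = {se:.6f}")                    # compare with 0.585607
    print(f"t-Statistic       = {t_ratio(mean_dpx, se):.6f}")  # compare with -0.113523
    print(f"Sum squared resid = {(n - 1) * sd_dpx ** 2:.2f}")  # compare with 21,347.75
```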
6.5 Stochastic Modeling of Seasonality 161
Remark 6.20 In financial practice, one frequently models the time series of logarithmic
rates of return rt (so-called log returns)
$$r_t = \ln\frac{P_t}{P_{t-1}} = \ln P_t - \ln P_{t-1} = p_t - p_{t-1} \tag{6.91}$$
for various financial assets (e.g., stocks or commodities) or price indices. These time
series usually have a constant mean value of small positive size with added white
noise
$$r_t = \mu + \varepsilon_t. \tag{6.92}$$
Substituting (6.91), the logarithmic prices $p_t = \ln P_t$ then follow the model
$$p_t = \mu + p_{t-1} + \varepsilon_t, \tag{6.93}$$
which can be looked upon as a random walk with drift increasing approximately
as μt.
⋄
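The link between prices, log returns (6.91), and the random walk of log prices (6.93) can be sketched numerically. The price values below are made up for illustration, and the helper names are hypothetical.

```python
import math


def log_returns(prices):
    """Log returns r_t = ln(P_t / P_{t-1}) as in (6.91)."""
    return [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]


def rebuild_log_prices(first_log_price, returns):
    """Cumulate returns: p_t = p_{t-1} + r_t, i.e., log prices behave like a random walk."""
    log_prices = [first_log_price]
    for r in returns:
        log_prices.append(log_prices[-1] + r)
    return log_prices


if __name__ == "__main__":
    prices = [100.0, 110.0, 121.0, 115.0]   # illustrative values only
    r = log_returns(prices)
    p = rebuild_log_prices(math.log(prices[0]), r)
    print(r)
    print([math.exp(v) for v in p])          # recovers the original prices up to rounding
```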
where the time index skips across the January periods. The symbols used in the
formula (6.94) are following:
$$\Phi(B^{12}) = 1 - \Phi_1 B^{12} - \Phi_2 B^{24} - \ldots - \Phi_P B^{12P} \tag{6.95}$$
(see also (3.63)). The model (6.94) can be looked upon as the process ARIMA
describing development of January observations. Similar models are constructed for
the time series that skips across the February observations only, and so on. Let us
suppose now that the models for particular months are approximately the same.
However, the random components ηt in these models should be correlated mutually
in time since there can exist, e.g., a relation between January and February values.
Therefore, let us assume that also the time series {ηt} is described by a model
ARIMA of the form
where εt is finally the white noise and the time index runs in the classical way.
Obviously, the models (6.94) and (6.99) can be linked together to a single model of
the form
$$\varphi(B)\,\Phi(B^{12})\,\Delta^d \Delta_{12}^D\, y_t = \theta(B)\,\Theta(B^{12})\,\varepsilon_t. \tag{6.100}$$
$(P, D, Q)_{12}$ and is usually denoted by the acronym SARIMA (the adjective “multiplicative”
expresses the fact that the operators of the models (6.94) and (6.99) are multiplied
together). For example, the process SARIMA $(0, 1, 1) \times (0, 1, 1)_{12}$ has the form
$$(1 - B)(1 - B^{12})\, y_t = (1 + \theta_1 B)(1 + \Theta_1 B^{12})\, \varepsilon_t \tag{6.101}$$
or equivalently
Clearly, the number twelve is replaced by four in the case of quarterly seasonality.
There are also additive seasonal processes, but they are applied only rarely in
practice: compared with (6.104), a simple example of an additive seasonal process
is
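$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2}, \quad \rho_{12} = \frac{\Theta_1}{1 + \Theta_1^2}, \quad \rho_{11} = \rho_{13} = \frac{\theta_1 \Theta_1}{(1 + \theta_1^2)(1 + \Theta_1^2)}, \quad \rho_k = 0 \text{ for } k \ne 1, 11, 12, 13. \tag{6.106}$$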
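The autocorrelations in (6.106) agree with those of the moving average $(1 + \theta_1 B)(1 + \Theta_1 B^{12})\varepsilon_t$ on the right-hand side of (6.101), which can be checked numerically. The sketch below computes the autocorrelations of a finite moving average directly from its coefficients; the function names and the parameter values 0.4 and 0.6 are illustrative assumptions.

```python
def ma_autocorrelations(coeffs, max_lag):
    """Autocorrelations of w_t = sum_j coeffs[j] * eps_{t-j} for lags 1..max_lag."""
    gamma0 = sum(c * c for c in coeffs)
    rho = {}
    for k in range(1, max_lag + 1):
        gamma_k = sum(coeffs[j] * coeffs[j + k] for j in range(len(coeffs) - k))
        rho[k] = gamma_k / gamma0
    return rho


def seasonal_ma_coefficients(theta1, Theta1, s=12):
    """Coefficients of (1 + theta1*B)(1 + Theta1*B^s), assuming s >= 3."""
    coeffs = [0.0] * (s + 2)
    coeffs[0] = 1.0
    coeffs[1] = theta1
    coeffs[s] = Theta1
    coeffs[s + 1] = theta1 * Theta1
    return coeffs


if __name__ == "__main__":
    theta1, Theta1 = 0.4, 0.6   # illustrative values
    rho = ma_autocorrelations(seasonal_ma_coefficients(theta1, Theta1), 15)
    print(rho[1], theta1 / (1 + theta1 ** 2))
    print(rho[12], Theta1 / (1 + Theta1 ** 2))
    print(rho[11], rho[13], theta1 * Theta1 / ((1 + theta1 ** 2) * (1 + Theta1 ** 2)))
```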
(P, 0, 0)12 ). In practice, one usually chooses among several alternatives offered by
a suitable software (see, e.g., EViews).
Example 6.7 Figure 6.10 and Table 4.4 show the time series yt (t ¼ 1, ..., 96) of the
job applicants kept in the Czech labor office register for particular months 2009M1-
2016M12 (the same data with different time range were analyzed in Sect. 4.1
applying decomposition methods; see Examples 4.2 and 4.4).
By means of EViews (see Table 6.15) one has constructed the multiplicative
seasonal model SARIMA $(1, 0, 0) \times (0, 1, 0)_{12}$ (with deterministic trend) of the form
$$(1 - 0.93B)(1 - B^{12})\, y_t = 56\,506.21 + 51.11\,t + \varepsilon_t, \quad \hat\sigma = 9\,658.47$$
or equivalently
⋄
Remark 6.21 Seasonal time series in financial practice are frequently modeled
using the so-called airline model, which is the process SARIMA $(0, 1, 1) \times (0, 1, 1)_s$
⋄
Fig. 6.10 Monthly data 2009M1-2016M12 (job applicants) and the values estimated by the model
SARIMA in Example 6.7 (job applicants kept in the Czech labor office register); see Table 4.4.
Source: Czech Statistical Office
Similarly as in the previous chapters, the symbol $\hat y_{t+k}(t)$ will denote the prediction of
the value $y_{t+k}$ constructed at time t, i.e., the prediction for time t + k made at time t
(k-step-ahead prediction).
For simplicity, we shall construct the linear prediction, i.e., the prediction
which is a linear function of the values $y_t, y_{t-1}, \ldots$ or equivalently a linear function of
$\varepsilon_t, \varepsilon_{t-1}, \ldots$ (since we assume stationarity and invertibility). In addition, the
mean square error of the constructed prediction
$$MSE = E\bigl(y_{t+k} - \hat y_{t+k}(t)\bigr)^2 \tag{6.109}$$
should be minimal over all linear predictions. If one takes such a prediction in the
form
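(see (6.17)), then obviously one should look for such coefficients $\psi_k^*, \psi_{k+1}^*, \ldots$
which minimize the expression
$$\Bigl(1 + \psi_1^2 + \ldots + \psi_{k-1}^2 + \sum_{j=k}^{\infty} (\psi_j - \psi_j^*)^2\Bigr)\sigma^2. \tag{6.112}$$
The minimum is attained for
$$\psi_j^* = \psi_j, \quad j = k, k+1, \ldots. \tag{6.113}$$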
Particularly, it holds
$$e_t(t-1) = y_t - \hat y_t(t-1) = \varepsilon_t, \tag{6.118}$$
i.e., the white noise can be looked upon as the one-step-ahead prediction errors (this
fact also justifies why one sometimes denotes the white noise as innovation; see also
Remark 6.4).
So far we have dealt with theoretical features of predictions. Now we will show
how to construct predictions according to Box–Jenkins methodology in reality. As it
holds
The relation (6.120) is basic one for real calculations of predictions since one can
substitute
$$\hat\varepsilon_{t+j}(t) = \begin{cases} 0 & \text{for } j > 0, \\ \varepsilon_{t+j} = y_{t+j} - \hat y_{t+j}(t+j-1) & \text{for } j \le 0. \end{cases} \tag{6.122}$$
etc., until we reach the prediction horizon and the prediction time (mostly the end
t = n of the observed time series) that correspond to our actual prediction
problem.
(b) To realize (a), one makes use of the formula (6.120) with estimated parameters,
substituting the relations (6.121) and (6.122) (in order to start the recursive
calculations, one must choose initial values, e.g., in the model MA(q) one can
start with $\varepsilon_1 = \varepsilon_2 = \ldots = \varepsilon_q = 0$).
(c) One can also construct interval predictions. For example, assuming
normality, the 95 % prediction interval can be approximated by means of
(6.117) as
$$\left(\hat y_{t+k}(t) - 2\hat\sigma\Bigl(1 + \sum_{j=1}^{k-1}\hat\psi_j^2\Bigr)^{1/2},\; \hat y_{t+k}(t) + 2\hat\sigma\Bigl(1 + \sum_{j=1}^{k-1}\hat\psi_j^2\Bigr)^{1/2}\right). \tag{6.123}$$
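The approximate 95 % prediction interval (6.123) needs only the point prediction, the estimated standard deviation of the white noise, and the first k – 1 estimated ψ-weights. The following sketch shows the computation; the function name and the AR(1) example with $\psi_j = \varphi_1^j$ and $\varphi_1 = 0.5$ are assumptions chosen for illustration.

```python
import math


def prediction_interval(point_forecast, sigma_hat, psi_weights, k):
    """Approximate 95 % interval (6.123); psi_weights[j - 1] holds the estimate of psi_j."""
    if k < 1 or len(psi_weights) < k - 1:
        raise ValueError("need k >= 1 and at least k - 1 psi-weights")
    half_width = 2.0 * sigma_hat * math.sqrt(1.0 + sum(p * p for p in psi_weights[:k - 1]))
    return point_forecast - half_width, point_forecast + half_width


if __name__ == "__main__":
    phi = 0.5                                   # illustrative AR(1) parameter
    psi = [phi ** j for j in range(1, 10)]      # psi_j = phi^j for an AR(1) process
    for k in (1, 2, 5):
        print(k, prediction_interval(10.0, 1.0, psi, k))
```

The interval widens with the horizon k because each additional step adds one more squared ψ-weight under the square root.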
Remark 6.22 Let us stress once more that in practice one substitutes to the
prediction formulas the estimated parameter (see, e.g., (6.123)). Fortunately in
routine situations, the predictions remain after such a substitution acceptable (par-
ticularly for longer time series).
⋄
Example 6.8 This example demonstrates how to construct predictions in three
estimated models of different types:
1. Stationary AR(1) process with deterministic linear trend:
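$$y_t = y_{t-1} + \varepsilon_t + 0.39\,\varepsilon_{t-1};$$
$$\hat y_6(5) = 0$$
$$\hat y_7(6) = 0.4\,\bigl(y_6 - \hat y_6(5)\bigr) = 0.4\,y_6$$
$$\hat y_8(7) = 0.4\,\bigl(y_7 - \hat y_7(6)\bigr) = 0.4\,y_7 - 0.16\,y_6$$
⋮
$$\hat y_7(5) = \hat y_8(6) = \hat y_9(7) = 0$$
$$\hat y_{10}(8) = 0.5\,\bigl(y_6 - \hat y_6(5)\bigr) = 0.5\,y_6$$
⋮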
As the interval predictions are concerned, e.g., in the second example (i.e., for
ARIMA) one can write
so that
1=2
σ ðetþk ðt ÞÞ ¼ 1 þ ðk 1Þð1 þ 0:39Þ2 þ 0:392 b
b σ,
⋄
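The recursion behind predictions such as $\hat y_7(6) = 0.4\,y_6$ and $\hat y_8(7) = 0.4\,y_7 - 0.16\,y_6$ can be sketched in a few lines. The code assumes an MA(1) model $y_t = \varepsilon_t + 0.4\,\varepsilon_{t-1}$ started with a zero first prediction; this model is inferred from the coefficients of the recursions and is an assumption, as are the function name and the sample values.

```python
def ma1_one_step_predictions(y, theta, first_prediction=0.0):
    """One-step-ahead predictions for y_t = eps_t + theta * eps_{t-1}.

    The innovation is estimated as the last prediction error, as in (6.122),
    and the next prediction is theta times that error.
    """
    predictions = [first_prediction]
    for value in y:
        innovation = value - predictions[-1]
        predictions.append(theta * innovation)
    return predictions


if __name__ == "__main__":
    y6, y7 = 1.0, 2.0                           # illustrative observations
    preds = ma1_one_step_predictions([y6, y7], theta=0.4)
    print(preds)                                # predictions for times 6, 7, 8
    print(0.4 * y7 - 0.16 * y6)                 # closed form of the last prediction
```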
6.6 Predictions in Box–Jenkins Methodology 169
Fig. 6.11 Point and 95 % interval predictions from Example 6.9 for (a) 3-month interbank interest
rate (Dreimonatsgeld in % p.a.) in Germany for years 2000–2004 (see also Example 6.1); (b) job
applicants kept in the Czech labor office register for particular months 2017M1-2017M12 (see also
Example 6.7) calculated by means of EViews
Remark 6.23 The behavior of predictions differs for stationary and nonstationary
processes. If the prediction horizon increases in a stationary process, then the
prediction converges to the mean value of the process (mean reversion) and the variance
of the prediction error converges to the variance of the process: $\hat y_{t+k}(t) \to E(y_t) = \mu$
in the sense of convergence in mean square (it follows from (6.114) for $k \to \infty$, where
μ = 0), and $\mathrm{var}(e_{t+k}(t)) \to \mathrm{var}(y_t) = \sigma_y^2$ in a nondecreasing way (it follows from
(6.116), again for $k \to \infty$). On the contrary, in nonstationary processes (e.g., ARIMA),
if the prediction horizon increases, then the width of the prediction interval grows to
infinity: $\mathrm{var}(e_{t+k}(t)) \to \infty$. Therefore, the applicability of these predictions is more
and more dubious for $k \to \infty$, and also the (unconditional) variance of such processes
is infinite (i.e., yt can attain any real value for sufficiently large t). In any case, the
prediction band composed of the particular prediction intervals has a “funnel” shape
(see, e.g., Fig. 6.11).
⋄
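The different behavior of the prediction error variance can be illustrated by comparing a stationary AR(1) process with a random walk, using $\mathrm{var}(e_{t+k}(t)) = \sigma^2 (1 + \psi_1^2 + \ldots + \psi_{k-1}^2)$. The sketch assumes $\psi_j = \varphi_1^j$ for the AR(1) process and $\psi_j = 1$ for the random walk; the function names and the value $\varphi_1 = 0.9$ are illustrative.

```python
def ar1_error_variance(phi, sigma2, k):
    """Variance of the k-step-ahead prediction error of a stationary AR(1) process."""
    return sigma2 * sum(phi ** (2 * j) for j in range(k))


def random_walk_error_variance(sigma2, k):
    """Variance of the k-step-ahead prediction error of a random walk (all psi_j = 1)."""
    return sigma2 * k


if __name__ == "__main__":
    phi, sigma2 = 0.9, 1.0
    for k in (1, 5, 20, 100):
        print(k, ar1_error_variance(phi, sigma2, k), random_walk_error_variance(sigma2, k))
    print("variance of the AR(1) process:", sigma2 / (1 - phi ** 2))
```

The AR(1) error variance levels off at the process variance $\sigma^2/(1 - \varphi_1^2)$, whereas the random walk error variance grows linearly without bound, which produces the ever-widening funnel.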
Remark 6.24 If we write an estimated model ARIMA(0,1,1) as
$$\Delta y_t = \varepsilon_t - \hat\theta_1 \varepsilon_{t-1}, \tag{6.124}$$
then the one-step-ahead predictions have the form
$$\hat y_{t+1}(t) = (1 - \hat\theta_1) \sum_{j=0}^{\infty} \hat\theta_1^{\,j}\, y_{t-j}, \tag{6.125}$$
i.e., they coincide with the predictions according to simple exponential smoothing if one
chooses the discount constant $\beta = \hat\theta_1$ (i.e., the smoothing constant $\alpha = 1 - \hat\theta_1$; see
(3.75)).
⋄
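The coincidence with simple exponential smoothing can be checked numerically: the recursive smoothing formula with smoothing constant $1 - \hat\theta_1$ gives the same value as the weighted sum (6.125) truncated to the observed values. The sketch assumes a zero pre-sample prediction so that both formulas use exactly the same information; the function names and data are illustrative.

```python
def ses_recursive(y, theta, start=0.0):
    """Simple exponential smoothing: new = (1 - theta) * y_t + theta * old."""
    forecast = start
    for value in y:
        forecast = (1.0 - theta) * value + theta * forecast
    return forecast


def ses_weighted_sum(y, theta):
    """Weighted sum (1 - theta) * sum_j theta^j * y_{t-j} over the observed values only."""
    return (1.0 - theta) * sum(theta ** j * value for j, value in enumerate(reversed(y)))


if __name__ == "__main__":
    data = [3.0, 4.5, 4.0, 5.2, 6.1]   # illustrative values
    print(ses_recursive(data, theta=0.6))
    print(ses_weighted_sum(data, theta=0.6))
```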
Example 6.9 Figure 6.11(a) and (b) plot the point and 95 % interval predictions for
(a) 3-month interbank interest rate (Dreimonatsgeld in % p.a.) in Germany for years
2000–2004 (see Example 6.1); (b) job applicants kept in the Czech labor office
register for particular months 2017M1-2017M12 (see Example 6.7).
For example, one can see in Fig. 6.11(a) that the predictions of 3-month interbank
interest rate stabilize with increasing prediction horizon to the level 8% of uncondi-
tional mean value of the given process.
⋄
6.7 Long Memory Process
Some financial (but also economic or hydrologic) data remain autocorrelated even
over very long time distances. More precisely, their estimated correlogram and
partial correlogram decrease hyperbolically (at a polynomial rate; see (6.132)). Such
a rate lies between the very slow linear decrease for ARIMA processes with some
characteristic roots nearly on the unit circle and the very fast exponential
decrease for stationary ARMA processes. Time series of this type are usually
called long-memory processes (or persistent processes). A successful way to
model them consists in so-called fractional differencing (see, e.g., Hurst (1951) in
hydrology and Granger (1980) in economics).
The simplest example of a long-memory process is the fractionally integrated
process of order d (d is a non-integer), denoted by the acronym FI(d), of the form
$$(1 - B)^d y_t = \varepsilon_t \quad \text{or} \quad \Delta^d y_t = \varepsilon_t. \tag{6.126}$$
$$y_t = \varepsilon_t + \sum_{i=1}^{\infty} \psi_i \varepsilon_{t-i}, \tag{6.127}$$
where
$$\psi_i = \frac{d(d+1)\cdots(d+i-1)}{i!} \tag{6.128}$$
(it follows from the expansion of $(1-z)^{-d}$ into a power series). The process is
nonstationary for $d \ge 0.5$.
2. d > –0.5: The process is invertible with a representation in the form of an (infinite)
autoregressive process
$$y_t = -\sum_{i=1}^{\infty} \pi_i y_{t-i} + \varepsilon_t, \tag{6.129}$$
where
$$\pi_i = (-1)^i\, \frac{d(d-1)\cdots(d-i+1)}{i!} \tag{6.130}$$
(it follows from the expansion of $(1-z)^d$ into a power series). The process is
noninvertible for $d \le -0.5$.
3. –0.5 < d < 0.5: The process is stationary and invertible with the autocorrelation and
partial autocorrelation functions of the form
$$\rho_k = \frac{d(d+1)\cdots(d+k-1)}{(1-d)(2-d)\cdots(k-d)}, \quad \rho_{kk} = \frac{d}{k-d}, \quad k = 1, 2, \ldots. \tag{6.131}$$
$$\sum_{k=1}^{\infty} \rho_k = \infty. \tag{6.133}$$
Particularly in this case one uses explicitly the attribution long-memory process
(or persistent process). The partial sums y1 + ... + yt grow with a quicker rate than
the linear one.
$$\sum_{k=1}^{\infty} |\rho_k| < \infty. \tag{6.134}$$
(i.e., a stationary and invertible model ARMA( p, q) for wt), where wt arises from the
original process yt as a fractionally integrated process of order d
$$w_t = (1 - B)^d y_t. \tag{6.136}$$
$$y_t = y_1 + \sum_{\tau=2}^{t} \varepsilon_\tau \tag{6.137}$$
(see (6.90)) are called strong-memory processes since they “remember” all last
shocks εt, εt – 1, ... .
⋄
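The weights (6.128) and the autocorrelations (6.131) of the process FI(d) are products that can be computed recursively, which also makes the slow hyperbolic decay visible. The sketch below is illustrative only: the function names and the value d = 0.25 are assumptions, and the comparison with the rate $k^{2d-1}$ uses the known asymptotic behavior of the autocorrelations of fractionally integrated processes.

```python
def fi_ma_weights(d, n):
    """psi_1, ..., psi_n from (6.128), using psi_i = psi_{i-1} * (d + i - 1) / i."""
    psi, weights = 1.0, []
    for i in range(1, n + 1):
        psi *= (d + i - 1) / i
        weights.append(psi)
    return weights


def fi_autocorrelations(d, n):
    """rho_1, ..., rho_n from (6.131), using rho_k = rho_{k-1} * (d + k - 1) / (k - d)."""
    rho, values = 1.0, []
    for k in range(1, n + 1):
        rho *= (d + k - 1) / (k - d)
        values.append(rho)
    return values


if __name__ == "__main__":
    d = 0.25                                   # illustrative value with 0 < d < 0.5
    rho = fi_autocorrelations(d, 200)
    print(rho[0], rho[9], rho[99], rho[199])   # lags 1, 10, 100, 200
    print(rho[199] / rho[99], 2 ** (2 * d - 1))  # doubling the lag scales rho roughly by 2^(2d-1)
```

For comparison, the autocorrelations of a stationary AR(1) process with parameter 0.9 are already below 0.0001 at lag 100, while here they remain clearly visible.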
Remark 6.26 A sudden structural break (i.e., an abrupt change of the model) within
an observed time series may lead, when one fits a model, to spurious long-memory
behavior. Even a simple change in the mean of this type already mimics long-memory
behavior in sample (partial) correlograms. This is typical in particular for some
financial time series that are modeled as long-memory processes only because a
structural break has not been taken into account.
⋄
6.8 Exercises
Exercise 6.1 Repeat the analysis from Examples 6.1–6.4 and 6.9(a) (the stationary
process of “Dreimonatsgeld”), but only for data since 1965 (hint:
$y_t = 6.466 + 0.924\,y_{t-1} - 0.735\,y_{t-2} + 0.508\,y_{t-3} - 0.510\,y_{t-4} + \varepsilon_t$, $\hat\sigma = 1.785$).
Exercise 6.2 Repeat the analysis from Examples 6.5 and 6.6 (the nonstationary
index PX), but only for the last 150 observations (hint: $\hat\sigma = 8.33$).
Exercise 6.3 Repeat the analysis from Examples 6.7 and 6.9(b) (the seasonal
process of job applicants), but only for data since 2010 (hint:
$y_t - 0.98\,y_{t-1} - y_{t-12} + 0.98\,y_{t-13} = 193\,593.5 - 2\,684.2\,t + \varepsilon_t$, $\hat\sigma = 7\,267.96$).
Chapter 7
Autocorrelation Methods in Regression
Models
Formally, one can write the (linear) dynamic regression model for explained vari-
able yt as
where all explanatory variables $x_{ti}$ are orthogonal to the residual $u_t$ at the same time
t, i.e., $\mathrm{cov}(x_{ti}, u_t) = 0$ (so-called simultaneous uncorrelatedness). Such a condition of
orthogonality guarantees some useful statistical properties of the model, e.g., the OLS
estimates of the regression parameters in (7.2) are consistent under this condition.
Dynamic regression models are broadly used in econometric modeling, where the
orthogonal explanatory variables $x_{ti}$ may be either (strictly) exogenous (i.e.,
originating outside the model (7.2)) or predetermined (i.e., originating within the
model, but at a past time viewed from the perspective of the present time t, e.g.,
originating at time t – 1).
More specifically, one usually assumes that the residual component ut is modeled
as an ARMA process
where εt is white noise with variance $\sigma^2$ and $\beta_1, \ldots, \beta_k, \varphi_1, \ldots, \varphi_p, \theta_1, \ldots, \theta_q, \sigma^2$ are
(unknown) parameters.
One should stress that the typical feature of dynamic regression models is the
exploitation of time lagged (delayed) variables. It has practical reasons, e.g.:
• Decelerated responses to changes: Some economic and financial variables
change slowly, so that a response to changes of this type (e.g., changes in
the structure of financial markets, government policy, or bank strategy) is
measurable only with a substantial time delay and not within the same time period.
In the economic and financial context, the reasons for such a delay can be manifold:
– Psychological: e.g., market participants at first disbelieve new information or
underestimate its consequences.
– Technological: e.g., the speed of transactions depends on the technical facilities of
financial exchanges.
– Due to liquidity: e.g., new investment positions cannot be opened until the old
ones are closed (or sold), or until the necessary capital is available.
Moreover, the speed and intensity of responses depend on the character of
changes, e.g., whether the changes are permanent or transient. In any case, a
complex dynamic structure complicates the model interpretation.
• Overreaction to changes: Sometimes pessimistic economic prognoses cause
immediate decreases of prices which are stabilized later (i.e., with a time gap)
as soon as the real results are announced.
• Modeling autocorrelated residuals: The application of dynamic models instead
of static ones can sometimes remove the problem of autocorrelated residuals (see
Sect. 7.2).
In this chapter, we deal with several special cases of dynamic regression models
which are important for economic and financial time series, namely:
• Linear regression model with autocorrelated residuals: does not contain any
lagged variables (neither explanatory nor explained), but the delay is comprised
in the residual component (see Sect. 7.2).
• Distributed lag model: contains the lagged explanatory variables but no lagged
explained variable (see Sect. 7.3).
• Autoregressive distributed lag model: contains the lagged explained variable
(and possibly also lagged explanatory variables (see Sect. 7.4)).
with the ARMA structure (7.3) of residuals ut, but in contrast to (7.2) the explanatory
variables xti are looked upon as deterministic regressors. It is a popular generaliza-
tion of linear regression where the residual component has the form of uncorrelated
white noise to the case with correlated observations. Such a correlatedness must be
taken into account since it is usual in practice (e.g., the delayed values of some
variables, which should be included among regressors of (7.4), are present only in
the residuals ut causing correlatedness in time).
The simplest type of correlatedness covering majority of routine situations
consists in modeling the residual component ut by means of the stationary
autoregressive model of the first order (see the process AR(1) in Remark 6.9)
written as
$$u_t = \rho u_{t-1} + \varepsilon_t, \tag{7.5}$$
where the autoregressive parameter ρ (–1 < ρ < 1) equals the first autocorrelation of
the process ut (it is denoted ρ instead of ρ1 for simplicity) and εt is white noise. The
sign of ρ plays an important role here: the positive ρ > 0 (so-called positive
correlatedness plotted for a trajectory of residuals ut in the scatterplot on the left-
hand side of Fig. 7.1) induces the inertia for the signs of neighboring values ut (see
the right-hand side of Fig. 7.1 with a relatively rare crossing of time axis), while on
the contrary the negative ρ < 0 (so-called negative correlatedness plotted in the
scatterplot on the left-hand side of Fig. 7.2) induces frequent changes of the signs of
neighboring values ut (see the right-hand side of Fig. 7.2 with a relatively dense
crossing of time axis).
[Figs. 7.1 and 7.2: scatterplots of $u_t$ against $u_{t-1}$ (left) and plots of $u_t$ against time t (right)]
$$DW \approx 2(1 - \hat\rho), \tag{7.7}$$
where
$$\hat\rho = \frac{\sum_{t=2}^{n} \hat u_{t-1} \hat u_t}{\sum_{t=1}^{n} \hat u_t^2} \tag{7.8}$$
is the estimate of the first autocorrelation ρ (see $r_1$ according to (6.9) with $\bar u = 0$). The
relation (7.7) implies:
• If $\hat\rho \approx 0$ (i.e., the neighboring residuals are uncorrelated), then DW ≈ 2.
• If $\hat\rho \approx 1$ (i.e., the neighboring residuals are extremely positively correlated), then
DW ≈ 0.
• If $\hat\rho \approx -1$ (i.e., the neighboring residuals are extremely negatively correlated),
then DW ≈ 4.
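The approximation (7.7) can be checked on simulated residuals. The sketch below uses the usual definition of the DW statistic as the sum of squared differences of neighboring residuals divided by the sum of squared residuals (the statistic (7.6) itself is stated elsewhere in the chapter); the function names, the AR(1) parameter 0.7, and the sample size are illustrative assumptions.

```python
import random


def durbin_watson(residuals):
    """DW = sum_{t>=2} (u_t - u_{t-1})^2 / sum_t u_t^2."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals[:-1], residuals[1:]))
    return numerator / sum(u * u for u in residuals)


def first_autocorrelation(residuals):
    """Estimate (7.8) of the first autocorrelation of the residuals."""
    numerator = sum(a * b for a, b in zip(residuals[:-1], residuals[1:]))
    return numerator / sum(u * u for u in residuals)


def simulate_ar1(rho, n, seed=0):
    """Simulated residuals u_t = rho * u_{t-1} + eps_t with standard normal eps_t."""
    rng = random.Random(seed)
    u, path = 0.0, []
    for _ in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)
        path.append(u)
    return path


if __name__ == "__main__":
    u = simulate_ar1(rho=0.7, n=2000)
    print("DW         :", durbin_watson(u))
    print("2(1 - rho^):", 2 * (1 - first_autocorrelation(u)))
```

The two numbers differ only by the end-point term $(\hat u_1^2 + \hat u_n^2)/\sum \hat u_t^2$, which is negligible for long series.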
7.2 Linear Regression Model with Autocorrelated Residuals 179
[Fig. 7.3: axis of the DW statistic with the break points 0, $d_L$, $d_U$, 2, $4 - d_U$, $4 - d_L$, 4]
The statistic DW does not have any standard probability distribution. However,
if one assumes the normality of the white noise εt, then DW has two critical values $d_L$
(lower) and $d_U$ (upper), which depend only on the number of observations n and
regressors k (but not on the values of these regressors). Nowadays, the DW test is mainly
used as an informal instrument indicating the possible existence of autocorrelated
residuals. Its conclusions on the null hypothesis $H_0: \rho = 0$ are summarized in Fig. 7.3
(however, the test is inconclusive for some values of the statistic DW, so that no
conclusions are possible in such a case).
The critical values dL and dU can be found in statistical tables or they are
calculated by means of simulations directly in software systems (in the form of p-
values). Moreover, in practice one can apply simplified rules (“rules of thumb”): e.g.,
if one has more observations n (more than fifty) and the number of regressors k is not
too high, then the value of DW lower than 1.5 usually implies the positive
autocorrelatedness (see Example 7.1).
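The decision regions of Fig. 7.3 can be sketched as a small function. This is only an illustration: the function name is hypothetical, and the critical values passed in the usage line are placeholders that would have to be looked up in DW tables for the given n and k.

```python
def dw_decision(dw, d_lower, d_upper):
    """Classify a DW value into the regions 0, dL, dU, 4 - dU, 4 - dL, 4."""
    if not 0.0 <= dw <= 4.0:
        raise ValueError("DW must lie between 0 and 4")
    if dw < d_lower:
        return "reject H0: positive autocorrelation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4.0 - d_upper:
        return "do not reject H0"
    if dw <= 4.0 - d_lower:
        return "inconclusive"
    return "reject H0: negative autocorrelation"


# Placeholder critical values (assumed for illustration only)
print(dw_decision(0.778, d_lower=1.20, d_upper=1.41))
```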
Later, more general tests were suggested that make it possible to detect autocorrelations of higher order than the order one in the DW test (of course, one could try sequentially various differences of non-neighboring OLS residuals in the numerator of the DW statistic (7.6), but such an approach is too tedious). Another alternative is to apply procedures suggested originally for the verification of Box–Jenkins models, mainly Q-tests (e.g., the Box–Pierce test or the Ljung–Box test; see (6.66) and (6.67)).
Nowadays, in the context of regression models of the type (7.4), econometric software systems offer in particular the Breusch–Godfrey test of autocorrelated residuals. It was suggested for the alternative hypothesis that the residual component ut is an autoregressive model AR(p) of order p ≥ 1. The BG test proceeds in the following way:
1. One calculates the OLS residuals $\hat{u}_t$ in the model (7.4) (i.e., by the classical method of least squares in the same way as for the DW test).
2. One estimates an auxiliary model
$$\hat{u}_t = \gamma_1 + \gamma_2 x_{t2} + \ldots + \gamma_k x_{tk} + \varphi_1 \hat{u}_{t-1} + \varphi_2 \hat{u}_{t-2} + \ldots + \varphi_p \hat{u}_{t-p} + \varepsilon_t. \qquad (7.9)$$
3. One tests the hypothesis
$$H_0: \varphi_1 = \varphi_2 = \ldots = \varphi_p = 0 \quad \text{against} \quad H_1: \varphi_1 \neq 0 \text{ or } \varphi_2 \neq 0 \text{ or } \ldots \text{ or } \varphi_p \neq 0 \qquad (7.10)$$
applying the classical F-test of (7.10) in the auxiliary model (7.9). In particular, the critical value of this test with significance level α is the quantile $F_{1-\alpha}(p, n-k-p)$ of the F-distribution. Other tests instead of the F-test, e.g., the LM test (Lagrange multiplier), are also possible (see Example 7.1).
The problematic point of the BG test is how to choose the autoregressive order p. A simple recommendation is the choice corresponding to the frequency of data (e.g., p = 4 for quarterly observations, p = 12 for monthly observations, etc.), since the residual component is usually correlated mainly with the residual component for the same seasonal period of the previous year. On the other hand, if the model is adequate from the statistical point of view, then no autocorrelations should be significant for any choice of p.
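(obviously, εt = ut − ρ ut−1; see (7.5)). Should the value of ρ be known, one would obtain by means of the Koyck transformation the classical linear regression
$$y_t^* = \beta_1^* + \beta_2 x_{t2}^* + \ldots + \beta_k x_{tk}^* + \varepsilon_t, \qquad (7.12)$$
where $y_t^* = y_t - \rho y_{t-1}$, $\beta_1^* = (1-\rho)\beta_1$, $x_{t2}^* = x_{t2} - \rho x_{t-1,2}$, ..., $x_{tk}^* = x_{tk} - \rho x_{t-1,k}$.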
This fact is the principle of the Cochrane–Orcutt method, which proceeds in the following steps:
1. One calculates the OLS residuals $\hat{u}_t$ in the model (7.4).
2. One constructs the estimate $\hat{\rho}$ of the parameter ρ according to (7.8).
3. One constructs the OLS estimate in the model (7.12), replacing the parameter ρ by the estimate $\hat{\rho}$.
4. The procedure goes on iteratively by repeating steps 1 to 3, where in step 1 one uses the residuals calculated by means of the OLS estimate from the previous step 3. The procedure stops according to a suitable stopping rule (e.g., if the change in the estimated value of ρ between neighboring iteration cycles drops below a limit fixed in advance).
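The four steps above can be sketched for the simplest case of a single regressor. This is a minimal illustration under stated assumptions, not the EViews implementation: the helper names are hypothetical, the data are simulated (AR(1) residuals with ρ = 0.6 around the line 1 + 2x), and the stopping rule is the change in the estimate of ρ.

```python
import random


def ols_simple(x, y):
    """OLS intercept and slope for a regression with a single regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    b1, b2 = ols_simple(x, y)  # initial OLS estimate in the original model
    rho = 0.0
    for _ in range(max_iter):
        # Step 1: residuals of the original model with the current estimates
        u = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
        # Step 2: estimate of rho according to (7.8)
        rho_new = sum(u[t - 1] * u[t] for t in range(1, len(u))) / sum(v * v for v in u)
        # Step 3: OLS in the transformed model with rho replaced by its estimate
        ys = [y[t] - rho_new * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - rho_new * x[t - 1] for t in range(1, len(x))]
        b1_star, b2 = ols_simple(xs, ys)
        b1 = b1_star / (1.0 - rho_new)  # back to the original intercept
        # Step 4: stop when rho changes by less than the tolerance
        if abs(rho_new - rho) < tol:
            return b1, b2, rho_new
        rho = rho_new
    return b1, b2, rho


# Simulated (hypothetical) data: y = 1 + 2x + u with AR(1) residuals u
rng = random.Random(0)
n = 500
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [0.0] * n
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.gauss(0.0, 1.0)
y = [1.0 + 2.0 * x[t] + u[t] for t in range(n)]
b1, b2, rho = cochrane_orcutt(x, y)
print(b1, b2, rho)
```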
The disadvantage of the Cochrane–Orcutt method consists mainly in the fact that it delivers estimated parameters of the transformed model (7.12). Some software systems (e.g., EViews in Example 7.1) make it possible to return to the estimated parameters of the original model (7.4) (i.e., before the Koyck transformation) by testing the parametric constraints following from this transformation.
If we consider the linear regression model (7.4) with a general ARMA structure (7.3) of residuals ut (and not specifically AR(1)), then there are various sophisticated methods for its construction (e.g., two-stage estimation procedures using the concept of instrumental variables or other methods; see EViews).
On the other hand, the opposite approach, which ignores the residual correlations, is also possible. Namely, if we apply the classical OLS methodology directly to the model (7.4) (i.e., ignoring the fact that ut need not be white noise), then the corresponding OLS estimates of parameters β1, ..., βk are not the best linear unbiased estimates, but they remain consistent (i.e., for large n they are close to the true theoretical values of these parameters with a high probability). The only weak point of OLS estimates used in models with autocorrelated residuals is that their standard errors are underestimated (which can, e.g., impair t-tests of parameter significance). The remedy is to apply the Newey–West estimate of the error matrix of the OLS estimates of parameters β1, ..., βk, denoted in software as HAC (heteroscedasticity and autocorrelation consistent covariances; see, e.g., EViews in Example 7.1).
Example 7.1 Table 7.1 presents the values AAAt and TBILLt of (average) yields to maturity (YTM in % p.a.) of corporate bonds of the highest quality AAA and of 3-month T-bills according to S&P in the USA for particular quarters 1990–1994 (t = 1, ..., 20). These are two alternative ways of risk-free investing, so one should expect that the yields AAAt depend positively on the short-term interest rates TBILLt over time. This expectation is confirmed in Table 7.2 for the model
AAAt = β1 + β2 TBILLt + εt (7.13)
(the estimated model is highly significant according to t-ratios and the F-test for the coefficient of determination R2 with significantly positive estimate b2 = 0.426 of the
Table 7.2 Estimation of the model (7.13) from Example 7.1 (yields to maturity of corporate bonds
AAA) calculated by means of EViews
Dependent Variable: AAA
Method: Least Squares
Sample: 1 20
Included observations: 20
Variable Coefficient Std. Error t-Statistic Prob.
C 6.242529 0.230299 27.10622 0.0000
TBILL 0.425939 0.045876 9.284506 0.0000
R-squared 0.827259 F-statistic 86.20205
S.E. of regression 0.343246 Prob (F-statistic) 0.000000
Durbin–Watson stat 0.778482
Source: Calculated by EViews
parameter β2; see Table 7.2). The strong positive autocorrelation between εt–1 and εt is also demonstrated by the scatterplot in Fig. 7.4 (the corresponding correlation coefficient is 0.602).
[Fig. 7.4: scatterplot of OLS residuals, OLS-RESID against OLS-RESID(-1)]
As far as statistical tests are concerned, the rule of thumb for the Durbin–Watson test confirms the positive autocorrelation, since DW = 0.778 is much lower than 1.5 (see Table 7.2). The results of the Breusch–Godfrey test with the residual
autoregressive model of order p = 1 in Table 7.3 (both in the form of the F-test and in the form of the LM test) lead to the same conclusion.
Table 7.3 Breusch–Godfrey test of the model (7.13) with p = 1 from Example 7.1 (yields to maturity of corporate bonds AAA) calculated in EViews
Breusch–Godfrey Serial Correlation F- and LM Test:
F-statistic 9.759497 Prob. F(1,17) 0.006178
Obs*R-squared 7.294230 Prob. Chi-Square(1) 0.006918
Source: Calculated by EViews
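The two statistics reported for the Breusch–Godfrey test are linked through the coefficient of determination R² of the auxiliary regression. The sketch below assumes the LM form n·R² and the F form (R²/p)/((1 − R²)/(n − k − p)), where the latter presumes that the restricted auxiliary regression (without lagged residuals) has R² = 0; it further assumes k = 2 regressors (constant and TBILL), which agrees with the degrees of freedom F(1, 17) printed in Table 7.3. The function name is hypothetical.

```python
def bg_statistics(r_squared, n, k, p):
    """LM and F forms of the Breusch-Godfrey test from the auxiliary R^2."""
    lm = n * r_squared
    f = (r_squared / p) / ((1.0 - r_squared) / (n - k - p))
    return lm, f


n, k, p = 20, 2, 1
obs_r_squared = 7.294230           # Obs*R-squared from Table 7.3
r_squared = obs_r_squared / n      # implied R^2 of the auxiliary regression
lm, f = bg_statistics(r_squared, n, k, p)
print(lm, f)                       # f can be compared with the F-statistic in Table 7.3
```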
The final estimate of the identified model
Table 7.4 Estimation of the model (7.14) from Example 7.1 by means of Cochrane–Orcutt method
(yields to maturity of corporate bonds AAA) calculated in EViews
Method: Least Squares
Sample (adjusted): 2 20
Included observations: 19 after adjustments
Convergence achieved after 10 iterations
Variable Coefficient Std. Error t-Statistic Prob.
C 6.246693 0.487125 12.82358 0.0000
TBILL 0.429785 0.105027 4.092135 0.0009
AR(1) 0.601729 0.199094 3.022333 0.0081
R-squared 0.880105 F-statistic 58.72528
S.E. of regression 0.286794 Prob (F-statistic) 0.000000
Durbin–Watson stat 1.722505
Source: Calculated by EViews
Table 7.5 Estimation of the model (7.14) from Example 7.1 including Newey–West estimate of
the error matrix of estimated parameters (yields to maturity of corporate bonds AAA) calculated by
means of EViews
Dependent Variable: AAA
Method: Least Squares
Sample: 1 20
Included observations: 20
Newey-West HAC Standard Errors and Covariance (lag truncation = 2)
Variable Coefficient Std. Error t-Statistic Prob.
C 6.242529 0.376704 16.57143 0.0000
TBILL 0.425939 0.061072 6.974403 0.0000
R-squared 0.827259 F-statistic 86.20205
S.E. of regression 0.343246 Prob (F-statistic) 0.000000
Durbin–Watson stat 0.778482
Source: Calculated by EViews
Distributed lag model (or DL model) contains lagged explanatory variables but no lagged explained variable (obviously, it thus fulfills the condition that the explanatory variables are orthogonal to the residual at the same time, the lagged regressors being predetermined at previous times). For simplicity, we confine ourselves to the case with a single explanatory variable x and the residual component in the form of white noise (the generalization to more lagged explanatory variables does not cause any serious complication):
$$y_t = \alpha + \sum_{i=0}^{\infty} \beta_i x_{t-i} + \varepsilon_t. \qquad (7.15)$$
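Moreover, one usually also calculates other characteristics of lagged effects, e.g.:
$$\text{median lag} = \text{smallest } q \text{ such that } \frac{\sum_{i=0}^{q} \beta_i}{\sum_{i=0}^{\infty} \beta_i} \geq 0.5 \qquad (7.18)$$
and
$$\text{mean lag} = \frac{\sum_{i=0}^{\infty} i\,\beta_i}{\sum_{i=0}^{\infty} \beta_i}. \qquad (7.19)$$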
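The characteristics (7.18) and (7.19) are easy to evaluate for a given sequence of lag coefficients. The sketch below is an illustration only: it uses hypothetical geometric weights (1 − λ)λ^i with λ = 0.6, truncated at 200 terms, and it presumes that all βi have the same sign so that the cumulative shares are monotone.

```python
def median_lag(betas):
    """Smallest q whose cumulative share of the total effect reaches 0.5."""
    total = sum(betas)
    cumulative = 0.0
    for q, beta in enumerate(betas):
        cumulative += beta
        if cumulative / total >= 0.5:
            return q
    raise ValueError("cumulative share never reaches 0.5")


def mean_lag(betas):
    """Average lag weighted by the lag coefficients."""
    return sum(i * beta for i, beta in enumerate(betas)) / sum(betas)


lam = 0.6  # hypothetical decay parameter
betas = [(1.0 - lam) * lam ** i for i in range(200)]
print(median_lag(betas), mean_lag(betas))
```

For geometric weights the mean lag equals λ/(1 − λ), which gives a quick plausibility check of the output.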
The DL model of the form (7.15) is too general to be applied in practice. Therefore, various modifications have been suggested. The geometric distributed lag model (or GDL model) is a very pragmatic solution since it uses only a finite number of parameters α, β, and λ:
$$y_t = \alpha + \beta \sum_{i=0}^{\infty} (1-\lambda)\lambda^i x_{t-i} + \varepsilon_t, \quad 0 < \lambda < 1. \qquad (7.20)$$
In this case, the long-run effect (7.17) equals directly the parameter β since
$$\beta \sum_{i=0}^{\infty} (1-\lambda)\lambda^i = \beta. \qquad (7.21)$$
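Applying the Koyck transformation, i.e., subtracting from the regression equation (7.20) at time t the same equation at time t – 1 multiplied by the constant λ (compare with (7.11)), one obtains
$$y_t = (1-\lambda)\alpha + (1-\lambda)\beta x_t + \lambda y_{t-1} + \varepsilon_t - \lambda \varepsilon_{t-1}. \qquad (7.22)$$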
Polynomial distributed lag model (or PDL model) was suggested by Almon (1965) as a special case of the DL model in which the coefficients are expressed in a simplified way by means of polynomials. This approach can substantially reduce the number of parameters that must be estimated when constructing the "trimmed" distributed lag model
$$y_t = \alpha + \sum_{i=0}^{k} \beta_i x_{t-i} + \varepsilon_t \qquad (7.23)$$
(obviously, one must choose a priori an adequate trimming length k). The PDL models assume the approximation
$$\beta_i = \alpha_0 + \alpha_1 i + \alpha_2 i^2 + \ldots + \alpha_r i^r, \quad i = 0, 1, \ldots, k, \qquad (7.24)$$
where the order r of the approximating polynomial is much lower than the maximum lag k (r << k). After substituting (7.24) into (7.23), one obtains
$$y_t = \alpha + \alpha_0 \sum_{i=0}^{k} x_{t-i} + \alpha_1 \sum_{i=0}^{k} i\, x_{t-i} + \ldots + \alpha_r \sum_{i=0}^{k} i^r x_{t-i} + \varepsilon_t = \alpha + \alpha_0 z_{0t} + \alpha_1 z_{1t} + \ldots + \alpha_r z_{rt} + \varepsilon_t, \qquad (7.25)$$
where $z_{jt} = \sum_{i=0}^{k} i^j x_{t-i}$, i.e., each zjt is a linear combination of the current value and k lagged values xt, xt–1, ..., xt–k. In the model (7.25), one can usually apply the classical OLS methodology without problems and then find the corresponding estimates of the original parameters βi according to (7.24). Moreover, one can also include the constraint
$$\beta_{-1} = \alpha_0 - \alpha_1 + \alpha_2 - \ldots + (-1)^r \alpha_r = 0 \qquad (7.26)$$
ensuring the null influence of xt+1 on yt (i.e., the null influence from the future time), or the constraint
$$\beta_{k+1} = \alpha_0 + \alpha_1 (k+1) + \ldots + \alpha_r (k+1)^r = 0 \qquad (7.27)$$
ensuring the null influence of xt–k–1 on yt (i.e., the null influence from the past time beyond the used trimming).
Example 7.2 Table 7.6 presents the values of money supply M1t and gross domestic product GDPt (in billions of USD) in the USA for particular quarters 1950–2000 (t = 1, ..., 204). For these data, we shall estimate the DL model explaining the gross domestic product by means of lagged money supplies (since there is usually an inertia in the effect of M1):
$$\ln GDP_t = \alpha + \sum_{i=0}^{4} \beta_i \ln M1_{t-i} + \varepsilon_t, \quad t = 5, \ldots, 204. \qquad (7.28)$$
The model (7.28) trimmed beyond the lag of four quarters is estimated in Table 7.7 (the coefficient of determination R2 is relatively high, but the statistic DW near zero indicates strong positive autocorrelation). Hence the long-run effect of money supply on gross domestic product is
$$\beta_{DL} = \sum_{i=0}^{4} \beta_i = 1.314 + \ldots + (-0.021) = 0.579.$$
is estimated in Table 7.8. In this case, the long-run effect of M1 on GDP works out
Finally, Table 7.9 presents the results of applying the PDL approach (7.25) to the model (7.28) with a higher trimming lag k = 12 and a lower order r = 3 of the approximating polynomial. In the first part of Table 7.9, one estimates the model (7.25) (e.g., PDL01 is the regressor z0t, etc.). In the second part of Table 7.9, one calculates according to (7.24) the estimates of the original parameters βi (up to the lag of 12) including their graphical plot.
Table 7.6 Quarterly data 1950–2000 in Example 7.2 (money supply M1 and gross domestic product GDP in the USA in billions of USD)
3 35 1958 2177.5 136.64 103 1975 4115.4 286.00 171 1992 6899.7 988.70
4 36 1958 2226.5 138.48 104 1975 4167.2 286.80 172 1992 6990.6 1024.00
1 37 1959 2273.0 139.70 105 1976 4266.1 292.40 173 1993 6988.7 1038.10
2 38 1959 2332.4 141.20 106 1976 4301.5 296.40 174 1993 7031.2 1075.30
3 39 1959 2331.4 141.00 107 1976 4321.9 300.00 175 1993 7062.0 1105.20
4 40 1959 2339.1 140.00 108 1976 4357.4 305.90 176 1993 7168.7 1129.20
1 41 1960 2391.0 139.80 109 1977 4410.5 313.60 177 1994 7229.4 1140.00
2 42 1960 2379.2 139.60 110 1977 4489.8 319.00 178 1994 7330.2 1145.60
3 43 1960 2383.6 141.20 111 1977 4570.6 324.90 179 1994 7370.2 1152.10
4 44 1960 2352.9 140.70 112 1977 4576.1 330.50 180 1994 7461.1 1149.80
1 45 1961 2366.5 141.90 113 1978 4588.9 336.60 181 1995 7488.7 1146.50
2 46 1961 2410.8 142.90 114 1978 4765.7 347.10 182 1995 7503.3 1144.10
3 47 1961 2450.4 143.80 115 1978 4811.7 352.70 183 1995 7561.4 1141.90
4 48 1961 2500.4 145.20 116 1978 4876.0 356.90 184 1995 7621.9 1126.20
1 49 1962 2544.0 146.00 117 1979 4888.3 362.10 185 1996 7676.4 1122.00
2 50 1962 2571.5 146.60 118 1979 4891.4 373.60 186 1996 7802.9 1115.00
3 51 1962 2596.8 146.30 119 1979 4926.2 379.70 187 1996 7841.9 1095.80
4 52 1962 2603.3 147.80 120 1979 4942.6 381.40 188 1996 7931.3 1080.50
Table 7.7 Estimation of distributed lag model (7.28) from Example 7.2 (gross domestic product
GDP explained by lagged money supply M1) calculated by means of EViews
Dependent Variable: LOG(GDP)
Method: Least Squares
Sample (adjusted): 5 204
Included observations: 200 after adjustments
Variable Coefficient Std. Error t-Statistic Prob.
C 4.946307 0.056813 87.06268 0.0000
LOG(M1) 1.313582 0.867809 1.513677 0.1317
LOG(M1(-1)) –0.406028 1.489386 –0.272614 0.7854
LOG(M1(-2)) –0.055179 1.493922 –0.036936 0.9706
LOG(M1(-3)) –0.252407 1.490153 –0.169383 0.8657
LOG(M1(-4)) –0.021173 0.878340 –0.024106 0.9808
R-squared 0.949165 F-statistic 724.4522
S.E. of regression 0.108728 Prob (F-statistic) 0.000000
Durbin–Watson stat 0.018483
Source: Calculated by EViews
Table 7.8 Estimation of geometric distributed lag model (7.29) after Koyck transformation (7.30)
from Example 7.2 (gross domestic product GDP explained by lagged money supply M1) calculated
by means of EViews
Dependent Variable: LOG(GDP)
Method: Least Squares
Sample (adjusted): 2 204
Included observations: 203 after adjustments
Variable Coefficient Std. Error t-Statistic Prob.
C 0.060444 0.030959 1.952384 0.0523
LOG(M1) 0.004530 0.003765 1.203104 0.2304
LOG(GDP(-1)) 0.990601 0.006235 158.8895 0.0000
R-squared 0.999584 F-statistic 240302.5
S.E. of regression 0.009932 Prob (F-statistic) 0.000000
Durbin–Watson stat 1.286477
Source: Calculated by EViews
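The parameters of the geometric distributed lag model can be recovered from a Koyck-transformed regression such as the one in Table 7.8. The following sketch assumes that the estimated equation has the form yt = (1 − λ)α + (1 − λ)β xt + λ yt−1 + error, which follows from subtracting λ times the lagged equation (7.20); the function name is hypothetical, and the inputs are simply the coefficients printed in Table 7.8.

```python
def gdl_from_koyck(const, coef_x, coef_y_lag):
    """Map Koyck regression coefficients back to (alpha, beta, lambda)."""
    lam = coef_y_lag
    if not 0.0 < lam < 1.0:
        raise ValueError("the coefficient of the lagged y must lie in (0, 1)")
    alpha = const / (1.0 - lam)
    beta = coef_x / (1.0 - lam)   # long-run effect in the GDL model
    return alpha, beta, lam


# Coefficients of C, LOG(M1), and LOG(GDP(-1)) as printed in Table 7.8
alpha, beta, lam = gdl_from_koyck(0.060444, 0.004530, 0.990601)
print(alpha, beta, lam)
```

Since λ is estimated very close to one here, the implied β is sensitive to small changes in the coefficient of the lagged GDP, so the result should be read with caution.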
The sum of estimated parameters corresponding to the current value and lagged values of lnM1 in Table 7.9 (i.e., the long-run effect of money supply on gross domestic product) is βPDL = 0.565.
⋄
Table 7.9 Estimation of polynomial distributed lag model from Example 7.2 (gross domestic
product GDP explained by lagged money supply M1) calculated by means of EViews
Dependent Variable: LOG(GDP)
Method: Least Squares
Sample (adjusted): 13 204
Included observations: 192 after adjustments
Variable Coefficient Std. Error t-Statistic Prob.
C 5.041623 0.054881 91.86544 0.0000
PDL01 –0.193079 0.073166 –2.638932 0.0090
PDL02 0.058884 0.090832 0.648271 0.5176
PDL03 0.016896 0.005232 3.229455 0.0015
PDL04 –0.003285 0.003699 –0.887938 0.3757
R-squared 0.952468 F-statistic 936.7900
Durbin–Watson stat 0.016460 Prob(F-statistic) 0.000000
Lag Distribution of LOG(M1) i Coefficient Std. Error t-Statistic
| 0 0.77136 0.27859 2.76880
| 1 0.34548 0.06241 5.53531
| 2 0.05194 0.12823 0.40505
| 3 –0.12898 0.17439 –0.73960
| 4 –0.21699 0.15996 –1.35647
| 5 –0.23178 0.10951 –2.11655
| 6 –0.19308 0.07317 –2.63893
| 7 –0.12058 0.11146 –1.08182
| 8 –0.03400 0.16172 –0.21026
| 9 0.04695 0.17525 0.26793
| 10 0.10258 0.12791 0.80200
| 11 0.11317 0.06468 1.74959
| 12 0.05902 0.28403 0.20778
Sum of Lags 0.56508 0.00954 59.2126
Source: Calculated by EViews
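The second part of Table 7.9 can be reproduced from the first part by evaluating the polynomial at each lag. The sketch below is an illustration under an assumption inferred from the printed numbers rather than stated in the text: the coefficients PDL01 to PDL04 appear to belong to a polynomial centered at lag 6, i.e., βi = γ0 + γ1(i − 6) + γ2(i − 6)² + γ3(i − 6)³. With center = 0 the same function evaluates (7.24) directly. The function name is hypothetical.

```python
def pdl_lag_coefficients(gammas, k, center=0.0):
    """Evaluate beta_i = sum_j gammas[j] * (i - center)**j for i = 0, ..., k."""
    return [
        sum(g * (i - center) ** j for j, g in enumerate(gammas))
        for i in range(k + 1)
    ]


# PDL01, ..., PDL04 as printed in the first part of Table 7.9
gammas = [-0.193079, 0.058884, 0.016896, -0.003285]
betas = pdl_lag_coefficients(gammas, k=12, center=6.0)
long_run = sum(betas)  # comparable with "Sum of Lags" in Table 7.9
for i, beta in enumerate(betas):
    print(i, round(beta, 5))
print("sum", round(long_run, 5))
```

Small differences from the printed lag distribution in the last digits are to be expected, because the inputs are rounded to six decimals.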
Autoregressive distributed lag model (or ADL model) contains both lagged explanatory variables and the lagged explained variable. It can be regarded as a special (linear) filtering scheme (therefore, it is sometimes also called a transfer function model, since transfer functions are typical concepts in filtering theory). This model can be formally written by means of the operators used in the Box–Jenkins methodology (see Sect. 6.2) as
where φ(B) = 1 – φ1B – ... – φpB^p is the autoregressive operator, β(B) = β0 + β1B + ... + βkB^k is the operator of distributed lags of the explanatory variable x, and εt is the residual in the form of a stationary process ARMA(r, s). In addition, more explanatory variables can be included, but then the model (7.31) must be extended, e.g., for two explanatory variables x1 and x2 to the form
If the autoregressive operator φ(B) is stationary (i.e., its roots lie outside the unit circle in the complex plane), then (7.31) can be rewritten as
$$y_t = \mu + \frac{\beta(B)}{\varphi(B)} x_t + \eta_t, \qquad (7.33)$$
⋄
The construction of ADL model is analogous to the classical procedures in Box–
Jenkins methodology and supposes the application of a suitable software (see, e.g.,
EViews). Moreover, the models have usually specific forms since they are
constructed for specific situations. We describe here two specific cases, for which
the ADL models seem to be useful:
with the moment of intervention t0 (jumps St and pulses Pt are obviously special examples of so-called dummy variables (or dummies), which are popular in econometric modeling). Choosing a jump or a pulse and a suitable form of the ADL scheme (7.31), one can model various ways in which the intervention fades away (such an analysis is sometimes also called the impulse response). For example, an immediate dynamic change in a given time series can be modeled using (7.33) in the form
$$y_t = \mu + \frac{\beta_0}{1 - \varphi_1 B} S_t + \eta_t = \mu + \beta_0 \left( S_t + \varphi_1 S_{t-1} + \varphi_1^2 S_{t-2} + \ldots \right) + \eta_t. \qquad (7.35)$$
The response to such an intervention mode corresponds to shifting the time series by the value β0(1 + φ1 + ... + φ1^h) at each time t0 + h (h = 0, 1, ...). If |φ1| < 1, then this shift approaches asymptotically the value β0/(1 – φ1). If φ1 = 1, then the level of the time series changes linearly, increasing by β0 during each time unit. Models for other modes of intervention changes including practical applications are described, e.g., in Box and Tiao (1975).
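The shift β0(1 + φ1 + ... + φ1^h) produced by a jump intervention can be tabulated with a few lines of code. The following sketch uses hypothetical parameter values (β0 = 2, φ1 = 0.5) and a hypothetical function name; it only illustrates the two cases discussed above.

```python
def step_response(beta0, phi1, horizon):
    """Shifts beta0 * (1 + phi1 + ... + phi1**h) for h = 0, ..., horizon."""
    shifts = []
    level = 0.0
    weight = 1.0
    for _ in range(horizon + 1):
        level += beta0 * weight
        shifts.append(level)
        weight *= phi1
    return shifts


shifts = step_response(beta0=2.0, phi1=0.5, horizon=50)
limit = 2.0 / (1.0 - 0.5)                  # asymptotic shift for |phi1| < 1
linear = step_response(2.0, 1.0, 3)        # phi1 = 1: level grows by beta0 per step
print(shifts[:4], limit, linear)
```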
7.4.2 Outliers
Another possible application of ADL schemes consists in modeling outliers (on the other hand, outliers can also be handled by other approaches, e.g., by robustifying statistical methods so that they are insensitive to outlying values; see Sect. 2.2.1.2). The approach based on ADL modeling may be convenient especially if the aim is to predict the time series. In general, two types of outliers should be distinguished:
1. Additive outlier (abbreviated as AO) is linked additively to the basic (e.g., stationary) process at time t0, i.e.:
$$y_t = z_t + \delta P_t, \qquad (7.36)$$
where zt is a stationary process, Pt is the pulse according to (7.34), and δ is the size of the modeled outlier. In particular, in the case of a stationary autoregressive process of the form φ(B)zt = α + εt, it holds for the observed (contaminated) time series yt (simply substituting zt = yt – δ Pt into this autoregressive model)
$$y_t = \alpha + \sum_{j=1}^{p} \varphi_j y_{t-j} + \delta P_t - \sum_{j=1}^{p} \delta \varphi_j P_{t-j} + \varepsilon_t, \qquad (7.37)$$
where the dummy variable Pt–j equals one at time t0 + j and zero otherwise. Neglecting the outlier (i.e., applying the classical autoregressive model directly to yt) is incorrect and can cause substantial estimation and prediction errors. In addition, if an outlier is suspected at time t0, one can test the significance of the parameter δ in (7.37) by means of the classical t-test.
2. Innovation outlier (abbreviated as IO) is generated in the innovation process so that, e.g., in the case of a stationary autoregressive structure of the observed process yt one should write
$$y_t = \sum_{j=1}^{p} \varphi_j y_{t-j} + \delta P_t + \varepsilon_t. \qquad (7.38)$$
Such an innovation irregularity has its main impact only at time t0, and then its influence decays, so that ignoring it is not as dangerous for estimation or prediction as in the case of AO. The test for IO is analogous to that for AO, i.e., by means of the classical t-test of significance of the parameter δ in (7.38). However, if the observed process {yt} has a nonstationary ARIMA structure, then the influence of the innovation outlier persists over long time horizons.
7.5 Exercises
Exercise 7.1 Repeat the analysis from Example 7.1 (the yields to maturity of corporate bonds of the highest quality AAA), but only for data since 1991 (hint: AAAt = 5.84 + 0.535TBILLt + εt, DW = 0.770).
Exercise 7.2 Repeat the analysis from Example 7.2 (the gross domestic product GDP in the USA), but only for data since 1980 (hint: βDL = 0.469, βGDL = 0.907, βPDL = 0.421).
Part IV
Financial Time Series
Chapter 8
Volatility of Financial Time Series
(see also (6.91)), where pt = ln Pt are logarithmic prices at time t. Sometimes relative price variations, or simply returns, are also used (even though the term "return" sometimes denotes the log return (8.1))
$$\text{return}_t = \frac{P_t - P_{t-1}}{P_{t-1}}. \qquad (8.2)$$
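The two notions of return can be compared on a short price series. The sketch below uses hypothetical prices and takes the log return as the difference of logarithmic prices, rt = ln Pt − ln Pt−1; the function names are illustrative only.

```python
import math


def log_returns(prices):
    """Log returns r_t = ln(P_t) - ln(P_{t-1})."""
    return [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]


def simple_returns(prices):
    """Relative price variations according to (8.2)."""
    return [(prices[t] - prices[t - 1]) / prices[t - 1] for t in range(1, len(prices))]


prices = [100.0, 102.0, 99.0, 101.5]  # hypothetical prices
r_log = log_returns(prices)
r_simple = simple_returns(prices)
print(r_log, r_simple)
```

For small price changes both values nearly coincide, since ln(1 + x) ≈ x; in addition, log returns add up over time to the log return of the whole period.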
Fig. 8.1 Daily log returns of index PX in 2016 (251 trading days)
Fig. 8.2 Estimated autocorrelations (ACF, lags 1 to 20) of (a) log returns and (b) log returns squared of index PX in 2016 (251 trading days)
Fig. 8.3 Probability density of N(0, 1) (dotted line) versus probability density of a leptokurtic
distribution with zero mean value, unit variance, and kurtosis γ 2 > 0 (solid line)
Histogram of the series "log returns of PX (in 2016)" with summary statistics: Observations 250; Mean -7.15e-05; Median 0.000228; Maximum 0.034724; Minimum -0.042614; Std. Dev. 0.010647; Skewness -0.650275; Kurtosis 4.997044; Jarque-Bera 59.16266; Probability 0.000000
Fig. 8.4 Leptokurtic distribution (particularly with higher kurtosis coefficient compared with the corresponding normal distribution) for daily log returns of index PX in 2016 from Example 8.1. Source: calculated by EViews
• Volatility clustering: Large absolute log returns | rt | tend to appear in clusters, i.e.,
turbulent (high-volatility) subperiods are followed by quiet (low-volatility)
periods, since high (low) deviations of returns can be expected after high (low)
previous deviations, respectively (see Fig. 6.8(b) for first differences and Fig. 8.1
for log returns of daily PX in 2016). The subperiods of volatility bursts are
recurrent, but they do not appear periodically.
• Leverage effect: This effect involves an asymmetry of the impact of past positive and negative log returns on the current volatility (obviously, positive log returns correspond to increases of prices, while negative log returns correspond to decreases of prices). More exactly, previous negative returns (i.e., price decreases) tend to increase volatility by a larger amount than positive returns (i.e., price increases) of the same magnitude. Empirically, a positive correlation is often detected between rt+ = max(rt, 0) and |rt+h| for positive h, but this correlation is generally lower than that between rt− = max(−rt, 0) and |rt+h| (e.g., for log returns of daily PX in 2016 one has corr(rt+, |rt+1|) = 0.377, while corr(rt−, |rt+1|) = 0.527).
• Calendar effects: Various calendar effects should be also mentioned in the
context of financial time series: the day of week, the proximity of holidays,
seasonality, and other factors may have significant effects on returns. Following
a period of market closure, volatility tends to increase, reflecting the information
cumulated during this break. Similar effects appear in intraday series as well.
One can see that the concept of volatility is very important for financial analysis.
In general, volatility can be looked upon as the spread of all possible outcomes of an
uncertain variable. Volatility is related to, but not exactly the same as, risk. Risk is
associated with undesirable outcome, whereas volatility (as the measure strictly for
uncertainty) can occur due to a positive outcome. In any case, the volatility is an
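$$\hat{\sigma}^2 = \frac{1}{T-1} \sum_{t=1}^{T} (r_t - \hat{\mu})^2, \qquad (8.4)$$
$$\hat{\sigma}_t^2 = \frac{1}{k-1} \sum_{\tau=t-k}^{t-1} (r_\tau - \hat{\mu}_t)^2, \quad \text{where } \hat{\mu}_t = \frac{1}{k} \sum_{\tau=t-k}^{t-1} r_\tau \qquad (8.5)$$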
(it is constructed conditionally using information relevant for time t, e.g., data over
several days for risk management, over several months for option pricing, or over
several years for investment analysis).
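The conditional estimate (8.5) is simply a sample variance over the k observations preceding time t. A minimal sketch, with a hypothetical function name, hypothetical returns, and zero-based list indices (so that the window covers positions t − k, ..., t − 1):

```python
import statistics


def historical_variance(returns, t, k):
    """Sample variance of the k returns preceding position t, as in (8.5)."""
    if k < 2 or t < k or t > len(returns):
        raise ValueError("need 2 <= k <= t <= len(returns)")
    window = returns[t - k:t]
    mu = sum(window) / k
    return sum((r - mu) ** 2 for r in window) / (k - 1)


returns = [0.003, -0.005, 0.022, -0.009, 0.006, 0.001]  # hypothetical log returns
sigma2 = historical_variance(returns, t=5, k=4)
print(sigma2, sigma2 ** 0.5)
```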
The volatility is obviously a latent (i.e., non-observable) quantity. In Sect. 8.3, several models are given that make it possible to estimate it. However, besides model approaches to volatility, one can also use so-called proxy approaches, which are based on replacing the non-observable volatility by an observable proxy (see, e.g., Poon (2005)):
• The most usual proxy for the volatility at time t is the square of the log return at this time, i.e., rt2 (surprisingly, taking deviations around zero instead of centering them by means of the sample mean typically increases the accuracy of volatility prediction; see Poon (2005)).
• Another approach consists in applying the H-L measure (so-called high-low measure by Parkinson (1980))
$$\hat{\sigma}_t^2 = \frac{(\ln H_t - \ln L_t)^2}{4 \ln 2}, \qquad (8.6)$$
where Ht and Lt denote, respectively, the highest and the lowest prices on day t (the H-L proxy assumes that the price process follows a geometric Brownian motion).
• If intraday data at short intervals such as 5 or 15 min (so-called tick data) are available, then the realized volatility can be constructed by summing squared log returns, i.e.:
$$RV_{t+1} = \sum_{j=1}^{m} r^2_{m,t+j/m} \qquad (8.7)$$
(there are m log returns in one unit of time). If the log returns are serially uncorrelated, then one can show (see, e.g., Karatzas and Shreve (1988)) that
$$\operatorname*{plim}_{m \to \infty} \left( \int_{t}^{t+1} \sigma_s^2 \, ds - \sum_{j=1}^{m} r^2_{m,t+j/m} \right) = 0, \qquad (8.8)$$
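The proxies listed above are straightforward to compute. The following sketch implements the high-low measure (8.6) and the realized volatility (8.7) for hypothetical inputs; the function names are illustrative only.

```python
import math


def parkinson_variance(high, low):
    """High-low proxy (8.6) from the highest and lowest price of the day."""
    return (math.log(high) - math.log(low)) ** 2 / (4.0 * math.log(2.0))


def realized_variance(intraday_returns):
    """Realized volatility (8.7): sum of squared intraday log returns."""
    return sum(r * r for r in intraday_returns)


hl = parkinson_variance(high=101.2, low=99.4)      # hypothetical daily high and low
rv = realized_variance([0.01, -0.02, 0.005])       # hypothetical intraday log returns
print(hl, rv)
```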
$$y_t = \varepsilon_t + \sum_{i=1}^{\infty} \psi_i \varepsilon_{t-i} \qquad (8.10)$$
(the linear process is a general scheme of linear models, which in comparison to (8.9) uses explicitly the white noise values εt as the corresponding generators).
The nonlinear process in the form (8.9) is too general to be applied in practice. Therefore, one prefers a more specific form of it written by means of the first and second conditional moments. This should not be surprising, since, e.g., the simple stationary process AR(1) introduced in (6.36) can be written by means of the conditional mean value as
Generally, one can condition at time t on the entire information Ωt–1 known up to time t – 1. More specifically, we can imagine that the past information is generated by all past values {yt–1, yt–2, ...} and {et–1, et–2, ...} using a suitable function of these values (one usually uses the term σ-algebra in such a situation). Due to the restriction to the first and second moments only, one usually models the conditional mean value μt and the conditional variance σt2 by means of simple (nonlinear) functions of the information in Ωt–1
$$\mu_t = E(y_t \mid \Omega_{t-1}) = g(\Omega_{t-1}), \qquad \sigma_t^2 = h_t = \operatorname{var}(y_t \mid \Omega_{t-1}) = h(\Omega_{t-1}), \qquad (8.12)$$
where g and h are suitable functions (h(·) > 0). Although the time index should distinguish the conditional moments from the unconditional ones, it would be more correct to write, e.g., $\mu_{t|t-1}$ and $\sigma^2_{t|t-1}$ instead of the simplified symbols μt and σt2 in (8.12) (in fact, these are one-step-ahead predictions of the mean value and variance of the given process). Then one can write
$$y_t = \mu_t + e_t, \qquad (8.13)$$
where et are the prediction errors or, equivalently, the deviations from the conditional mean value (if yt = rt, then one also uses the term mean-corrected returns for et). Moreover, as it holds
one refers to σt2 as the volatility of the given time series at time t (see also Sect. 8.1). The final form of the nonlinear process, which is applied most often in this context, is then
$$y_t = \mu_t + \sigma_t \varepsilon_t = \mu_t + \sqrt{h_t}\, \varepsilon_t = g(\Omega_{t-1}) + \sqrt{h(\Omega_{t-1})}\, \varepsilon_t, \qquad (8.15)$$
where εt are iid random variables with zero mean value and unit variance. Obviously, it holds
$$e_t = \sigma_t \varepsilon_t. \qquad (8.16)$$
It is worth noting that the random variables et are uncorrelated, but in contrast to εt they need not be independent in general.
Overall, the considered model is given by two equations in (8.12): the first one is the mean equation and the second one is the volatility equation. According to the type of these equations, the nonlinear processes can be classified into
for a suitable length of sample period k (see also (8.5)). Moreover, the value (8.17) is often used in practice as the prediction constructed at time t for short prediction horizons. Even though the historical volatility used to be applied broadly in practice (e.g., in order to estimate the volatility of underlying assets when calculating option premiums according to the Black–Scholes formula), nowadays its role is reduced to providing benchmarks when assessing the effectiveness of more complex models of volatility.
A pragmatic extension of the historical volatility approach are EWMA models. The most frequent model EWMA (exponentially weighted moving average) is an analogy of simple exponential smoothing (see Sect. 3.3.1) for volatility. In contrast to the historical volatility calculation, in EWMA models the averaged squares in (8.17) are weighted with weights which decrease exponentially into the past. Such a modification is advantageous in practice:
• In practice, volatility is usually influenced more by recent values, which therefore receive higher weights than values farther in the past.
• Moreover, in EWMA models the influence of high deviations persists over longer time periods than in (8.17) with smaller k, where high deviations leaving the sample range can even cause jumps in the estimated volatility.
Due to the analogy of EWMA models with simple exponential smoothing (see (3.75) and (3.77)), the volatility can be estimated using EWMA as
$$\hat{\sigma}_t^2 = (1-\lambda) \sum_{j=0}^{\infty} \lambda^j (y_{t-1-j} - \bar{y})^2 = (1-\lambda)(y_{t-1} - \bar{y})^2 + \lambda \hat{\sigma}_{t-1}^2, \qquad (8.18)$$
where the estimated volatility $\hat{\sigma}_t^2$ represents the volatility prediction from time t – 1, $\bar{y}$ is an average level of the given time series, and λ (0 < λ < 1) is a discount constant chosen in advance. If one calculates the volatility of a time series of log returns rt (see (8.1)), then the average return is often nearly zero (particularly for higher frequencies of observations, e.g., for daily returns), and (8.18) takes the form
$$\hat{\sigma}_t^2 = (1-\lambda) \sum_{j=0}^{\infty} \lambda^j r_{t-1-j}^2 = (1-\lambda) r_{t-1}^2 + \lambda \hat{\sigma}_{t-1}^2. \qquad (8.19)$$
In financial practice (see, e.g., RiskMetrics (1996)), based on broad experience with volatility estimation, the value 0.94 is routinely recommended for the constant λ.
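The recursion (8.19) can be sketched in a few lines. The function name, the hypothetical returns, and the choice of a zero initial value are assumptions made for this illustration; entry t of the output is the prediction of volatility made from the returns before position t, and the last entry is the prediction for the next, not yet observed period.

```python
def ewma_variance(returns, lam=0.94, initial=0.0):
    """One-step-ahead EWMA volatility predictions via the recursion (8.19)."""
    sigma2 = [initial]
    for r in returns:
        sigma2.append((1.0 - lam) * r * r + lam * sigma2[-1])
    return sigma2


returns = [0.01, -0.02, 0.015]  # hypothetical daily log returns
sigma2 = ewma_variance(returns)
print(sigma2)
```

With a zero initial value the first predictions are biased downward; the effect fades at the rate λ^t, so it matters mainly at the start of the series.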
Example 8.1 Table 8.1 and Fig. 8.1 show daily log returns rt of index PX in 2016 (250 values for 251 trading days calculated as differences of logarithmic index values rt = lnPXt – lnPXt–1).
This time series {rt} shows typical features of financial time series (see Sect. 8.1):
• Volatility clustering (see the volatility clusters at the beginning and in the middle of the time series in Fig. 8.1)
• Leptokurtic distribution (see the histogram, the kurtosis coefficient 4.997 – 3 = 1.997 > 0, and the Jarque–Bera test of normality in Fig. 8.4).
By means of the recursive formula (8.19) of the EWMA model with zero initial value, the corresponding volatility has been estimated (see the graphical plot in Fig. 8.5). The EWMA estimation confirms the previous subjective conclusion (namely the occurrence of increased volatility at the beginning and in the middle of {rt}).
⋄
Table 8.1 Daily log returns of index PX in 2016 (250 values for 251 trading days written in columns) from Example 8.1 (see also Fig. 8.1 and Table 6.10)
1 2 3 4 5 6 7 8 9 10
1 0.0030 0.0227 0.0043 0.0018 0.0011 0.0107 0.0147 0.0111 0.0078 0.0021
2 0.0052 0.0125 0.0002 0.0011 0.0009 0.0092 0.0065 0.0021 0.0012 0.0034
3 0.0217 0.0193 0.0131 0.0079 0.0025 0.0023 0.0028 0.0044 0.0075 0.0066
4 0.0086 0.0015 0.0182 0.0028 0.0012 0.0143 0.0023 0.0043 0.0076 0.0021
5 0.0059 0.0347 0.0070 0.0013 0.0062 0.0041 0.0031 0.0012 0.0014 0.0043
6 0.0011 0.0166 0.0081 0.0052 0.0002 0.0111 0.0030 0.0043 0.0160 0.0024
7 0.0053 0.0186 0.0119 0.0077 0.0101 0.0085 0.0058 0.0009 0.0068 0.0006
8 0.0178 0.0011 0.0017 0.0073 0.0144 0.0014 0.0064 0.0128 0.0040 0.0092
9 0.0197 0.0083 0.0048 0.0140 0.0033 0.0224 0.0002 0.0021 0.0142 0.0007
10 0.0152 0.0175 0.0085 0.0109 0.0239 0.0218 0.0040 0.0057 0.0065 0.0059
11 0.0070 0.0085 0.0010 0.0001 0.0072 0.0077 0.0015 0.0032 0.0075 0.0013
12 0.0209 0.0274 0.0078 0.0217 0.0325 0.0068 0.0052 0.0095 0.0176 0.0052
13 0.0046 0.0119 0.0044 0.0041 0.0261 0.0067 0.0086 0.0126 0.0058 0.0062
14 0.0308 0.0003 0.0082 0.0061 0.0125 0.0061 0.0026 0.0058 0.0012 0.0089
15 0.0033 0.0090 0.0058 0.0029 0.0115 0.0044 0.0030 0.0075 0.0002 0.0099
16 0.0101 0.0165 0.0001 0.0052 0.0021 0.0012 0.0003 0.0068 0.0044 0.0061
17 0.0109 0.0085 0.0169 0.0028 0.0186 0.0034 0.0069 0.0094 0.0008 0.0056
18 0.0076 0.0042 0.0064 0.0100 0.0093 0.0045 0.0004 0.0041 0.0046 0.0055
19 0.0127 0.0075 0.0074 0.0152 0.0040 0.0069 0.0084 0.0059 0.0005 0.0001
20 0.0069 0.0019 0.0097 0.0047 0.0115 0.0062 0.0107 0.0040 0.0168 0.0000
21 0.0138 0.0051 0.0000 0.0002 0.0038 0.0019 0.0058 0.0043 0.0059 0.0009
22 0.0173 0.0066 0.0122 0.0023 0.0426 0.0043 0.0018 0.0046 0.0005 0.0039
23 0.0121 0.0106 0.0022 0.0004 0.0366 0.0212 0.0011 0.0140 0.0093 0.0035
24 0.0081 0.0127 0.0116 0.0016 0.0205 0.0021 0.0038 0.0086 0.0040 0.0043
25 0.0363 0.0070 0.0019 0.0159 0.0022 0.0074 0.0056 0.0092 0.0027 0.0022
8 Volatility of Financial Time Series
Fig. 8.5 Volatility of daily log returns of index PX in 2016 (250 values for 251 trading days)
estimated by means of EWMA model in Example 8.1
In finance, a number of relations use volatility as one of the explanatory
factors. The best known in this context is the Black–Scholes formula mentioned in
Sect. 8.3.1, which expresses analytically the call or put option premium as a function
of five factors: $S_t$ (spot price of the underlying asset, e.g., a stock), X (exercise price of
the option), T – t (time to maturity of the option), σ (volatility of the underlying asset), and
i (risk-free interest rate in the given capital environment). For example, the premium
of a European call option $C_t$ at time t is
$$C_t = S_t\,\Phi(d_1) - X\,e^{-i(T-t)}\,\Phi(d_2), \tag{8.20}$$
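As a numerical illustration of (8.20), the following Python sketch evaluates the call premium. The definitions of $d_1$ and $d_2$ are not shown in this passage, so the code assumes the standard textbook expressions $d_1 = [\ln(S_t/X) + (i + \sigma^2/2)(T-t)]/(\sigma\sqrt{T-t})$ and $d_2 = d_1 - \sigma\sqrt{T-t}$; the function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x) via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, time_to_maturity, sigma, rate):
    """European call premium as in (8.20).

    d1 and d2 use the standard textbook definitions (an assumption here,
    since they are not given in the surrounding text).
    """
    sqrt_tau = math.sqrt(time_to_maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * sigma ** 2) * time_to_maturity) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    discount = math.exp(-rate * time_to_maturity)
    return spot * normal_cdf(d1) - strike * discount * normal_cdf(d2)
```

The premium increases with σ, which is why a volatility estimate is the critical input of the formula.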
$$\sigma_t^2 = \beta_0 + \sum_{j=1}^{s}\beta_j\,\sigma_{t-j}^2 + \varepsilon_t. \tag{8.21}$$
The classical autoregressive scheme AR(s) in (8.21) (with the classical white
noise {$\varepsilon_t$}) is used to predict volatility if we replace {$\sigma_t^2$} by a suitable proxy
(usually by $r_t^2$ or by (8.6); see Sect. 8.1). Nowadays, this method is not
recommended in practice since it has several handicaps (e.g., the nonnegativity of
the right-hand side of (8.21) is not guaranteed, although the logarithmic transformation
can solve this problem; see also (8.79)).
A significant breakthrough in the systematic modeling of volatility was the
model ARCH (autoregressive conditional heteroscedasticity) applied by Engle
(1982) to model inflation in the UK. Models of this type (and particularly
their generalization to the GARCH models; see Sect. 8.3.5) are apparently among the
most successful instruments for modeling financial time series (so far without significant
competitors). Their principle is based on two premises, namely
• The models of financial time series are heteroscedastic, i.e., their volatility
changes in time.
• The volatility is a simple quadratic function of past prediction errors $e_t$ (deviations
from the conditional mean value).
Only the second predicate needs an explanation (the first one is sufficiently
supported by financial empirical experience): Due to the phenomenon of volatility
clustering according to which high (low) deviations of returns can be expected rather
after higher (lower) previous deviations, respectively, one can assume that the
particular volatilities are positively correlated and make use of the autoregressive
model as the simplest scheme to model them. Moreover, according to (8.14) it holds
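$$\sigma_t^2 = \operatorname{var}(e_t \mid \Omega_{t-1}) = E(e_t^2 \mid \Omega_{t-1}) \approx e_t^2 \tag{8.22}$$
(obviously $E(e_t) = 0$) so that the squared errors $e_t^2$ can be used as natural approximations
of volatilities $\sigma_t^2$. Therefore, if we express the volatility as the following
quadratic function of delayed values $e_t^2$:
$$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \dots + \alpha_r e_{t-r}^2, \tag{8.23}$$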
then we obtain a realistic model (when the order r is chosen in a suitable size). It is
worth noting that (8.23) is a “nonstochastic” relation, i.e., without a random residual
component.
Due to the previous discussion and respecting the general form of nonlinear
model (8.15), one formulates the model ARCH(r) of order r as
where εt are iid random variables with zero mean value and unit variance (moreover,
they are frequently assumed to have the normal distribution, i.e., εt ~ N(0, 1), or the t-
distribution which is standardized to have also zero mean value and unit variance). In
any case, increased past values of volatility imply the increased present volatility in
the model (8.24) which can be also rewritten as
$$y_t = e_t, \quad e_t = \sqrt{h_t}\,\varepsilon_t, \quad h_t = \alpha_0 + \alpha_1 e_{t-1}^2 + \dots + \alpha_r e_{t-r}^2. \tag{8.25}$$
or equivalently
(i.e., with conditional mean value μt corresponding to the process AR( p)).
Moreover, the parameters of the model ARCH(r) must fulfill the following
constraints:
$$\alpha_0 > 0, \quad \alpha_1 \ge 0, \ \dots, \ \alpha_r \ge 0 \tag{8.30}$$
and
$$\alpha_1 + \dots + \alpha_r < 1. \tag{8.31}$$
The first constraint (8.30) guarantees that the volatility $\sigma_t^2$ in (8.24) is
positive: it is a sufficient (but not necessary) condition for this natural property of
the model. The second constraint (8.31) is less obvious but equally important: it
guarantees that the model ARCH(r) has a constant (finite) unconditional variance (see
its derivation for ARCH(1) in (8.36)).
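To see the constraints (8.30) and (8.31) at work, one can simulate an ARCH(r) path with standard normal $\varepsilon_t$ and zero conditional mean. The following Python sketch is only illustrative: the function name `simulate_arch`, the burn-in length, and the start-up values are assumptions made here.

```python
import math
import random


def simulate_arch(alpha0, alphas, n, seed=0, burn_in=500):
    """Simulate n values e_t = sigma_t * eps_t of an ARCH(r) process, eps_t ~ N(0, 1).

    alphas = [alpha_1, ..., alpha_r]; the constraints (8.30) and (8.31)
    are checked before simulating (hypothetical helper for illustration).
    """
    if alpha0 <= 0 or any(a < 0 for a in alphas):
        raise ValueError("constraint (8.30) is violated")
    if sum(alphas) >= 1:
        raise ValueError("constraint (8.31) is violated")
    rng = random.Random(seed)
    r = len(alphas)
    unconditional_var = alpha0 / (1.0 - sum(alphas))
    # Start-up values set to the unconditional standard deviation (an assumption).
    e = [math.sqrt(unconditional_var)] * r
    for _ in range(n + burn_in):
        # alphas[i] multiplies the squared deviation lagged by i + 1 periods.
        sigma2 = alpha0 + sum(a * e[-i - 1] ** 2 for i, a in enumerate(alphas))
        e.append(math.sqrt(sigma2) * rng.gauss(0.0, 1.0))
    return e[r + burn_in:]  # drop start-up and burn-in values
```

For a long simulated path, the sample variance should be close to $\alpha_0/(1-\alpha_1-\dots-\alpha_r)$, although the conditional variance changes at every step.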
Remark 8.1 One should stress once more that in general the random variables $e_t$ are
only uncorrelated (see (8.35)), while the $\varepsilon_t$ are independent. The graphical plots of the
correlogram and partial correlogram of the model ARCH correspond to this fact: e.g.,
the estimated correlogram of the time series $y_t$ in the model (8.27) should have all values
insignificant, as for a white noise, while the estimated partial correlogram of the squared time
series $y_t^2$ should have the truncation point equal to r.
⋄
Remark 8.2 The model (8.24) can be generalized by means of matrix calculus:
where the matrix A of unknown parameters must be positive semidefinite and α0 > 0.
The original model (8.24) is a special parsimonious version of (8.32), where the matrix
A is diagonal with nonnegative diagonal elements.
⋄
The main properties of the ARCH models will be derived only for the process
ARCH(1), i.e., for the model
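$$\operatorname{cov}(e_t, e_{t-k}) = E(e_t e_{t-k}) = E\big(E(\sigma_t \varepsilon_t e_{t-k} \mid \Omega_{t-1})\big) = E\big(\sigma_t e_{t-k}\,E(\varepsilon_t \mid \Omega_{t-1})\big) = 0. \tag{8.35}$$
$$\operatorname{var}(e_t) = E(e_t^2) = E\big(E(e_t^2 \mid \Omega_{t-1})\big) = E(\sigma_t^2) = E(\alpha_0 + \alpha_1 e_{t-1}^2) = \alpha_0 + \alpha_1 \operatorname{var}(e_{t-1}). \tag{8.36}$$
$$\operatorname{var}(e_t) = \frac{\alpha_0}{1-\alpha_1} \tag{8.37}$$
$$0 \le \alpha_1 < 1 \tag{8.38}$$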
(the properties 1–3 mean that the time series {et } is a white noise (in particular,
weakly stationary) under the sufficient condition (8.38)). Interestingly, despite the
changing conditional variance, i.e., the changing volatility, the unconditional
variance of {et } remains constant over time.
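4. Constant nonnegative kurtosis of $e_t$: If $\varepsilon_t \sim N(0, 1)$, then it holds
$$E(e_t^4) = E\big(E(e_t^4 \mid \Omega_{t-1})\big) = 3E\big(\alpha_0 + \alpha_1 e_{t-1}^2\big)^2 = 3\big(\alpha_0^2 + 2\alpha_0\alpha_1 \operatorname{var}(e_{t-1}) + \alpha_1^2 E(e_{t-1}^4)\big). \tag{8.39}$$
$$E(e_t^4) = \frac{3\alpha_0^2(1+\alpha_1)}{(1-\alpha_1)(1-3\alpha_1^2)} \tag{8.40}$$
$$\gamma_2 = \frac{E(e_t^4)}{(\operatorname{var}(e_t))^2} - 3 = 3\,\frac{1-\alpha_1^2}{1-3\alpha_1^2} - 3 = \frac{6\alpha_1^2}{1-3\alpha_1^2} \ge 0, \tag{8.42}$$
$$\operatorname{var}(e_t) = \frac{\alpha_0}{1-\alpha_1-\dots-\alpha_r}. \tag{8.43}$$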
⋄
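The closed-form moments (8.37) and (8.42) of ARCH(1) are easy to evaluate numerically. The Python sketch below uses hypothetical function names; the requirement $3\alpha_1^2 < 1$ is read off from the denominators of (8.40) and (8.42) and is treated here as the condition for a finite fourth moment under normal $\varepsilon_t$.

```python
def arch1_variance(alpha0, alpha1):
    """Unconditional variance (8.37) of ARCH(1) under condition (8.38)."""
    if alpha0 <= 0 or not 0 <= alpha1 < 1:
        raise ValueError("need alpha0 > 0 and 0 <= alpha1 < 1")
    return alpha0 / (1.0 - alpha1)


def arch1_excess_kurtosis(alpha1):
    """Excess kurtosis (8.42) of ARCH(1) with normal eps_t.

    Requires 3 * alpha1**2 < 1 so that the denominator stays positive.
    """
    if not 0 <= 3.0 * alpha1 ** 2 < 1:
        raise ValueError("need 0 <= 3 * alpha1**2 < 1")
    return 6.0 * alpha1 ** 2 / (1.0 - 3.0 * alpha1 ** 2)
```

For instance, α1 = 0.5 gives an excess kurtosis of 6, so even a conditionally normal ARCH(1) model produces markedly heavier tails than the normal distribution.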
After the theoretical description of ARCH models, we can deal briefly with their
practical construction (since the construction of GARCH models is analogous, we
will skip this technical topic in Sect. 8.3.5 devoted to these models). Here, for
simplicity, we confine ourselves to the model ARCH(r) in the form (8.26), i.e.,
with zero conditional mean value $\mu_t$, where the deviations $e_t$ are directly observable
(this assumption is fulfilled in practice for the financial time series of log returns $r_t$).
In the opposite case, one first extracts the deviations $e_t$, e.g., in the model
(8.29) as
The order r can be identified as the truncation point of estimated partial correlogram
in model
where $u_t$ is the classical white noise (i.e., in the same way as for the classical AR
model in the framework of the Box–Jenkins methodology; see Sect. 6.3.1). If the order
r is too high, then the nonnegativity of a large number of parameters can be a problem
(see the constraint (8.30)). In his seminal work, Engle (1982) suggested applying
a parsimonious model with only two parameters (but with r = 4) instead of (8.23),
namely
$$\sigma_t^2 = \delta_0 + \delta_1\big(0.4\,e_{t-1}^2 + 0.3\,e_{t-2}^2 + 0.2\,e_{t-3}^2 + 0.1\,e_{t-4}^2\big). \tag{8.46}$$
Due to various reasons, the estimation methods based on the principle of least
squares are not suitable for models with conditional heteroscedasticity. Therefore,
one recommends for these models the method of maximum likelihood. The
corresponding probability density fulfills obviously the relation
Therefore assuming εt ~ N(0, 1), one can write the (conditional) log likelihood
function as
$$l(\alpha_0, \dots, \alpha_r) = \sum_{t=r+1}^{n}\left(-\frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\sigma_t^2 - \frac{1}{2}\,\frac{e_t^2}{\sigma_t^2}\right) \tag{8.48}$$
(we have omitted the last factor in (8.47) since we condition by initial values e1, ..., er
in (8.48)). The values et necessary for the construction of (8.48) are calculated
recursively for each choice of arguments α0, . . ., αr including the volatilities
If the normal distribution of εt does not fit heavy tails of modeled financial data
properly, then one can apply other distributions. For instance, if the standardized t-
distribution is a better choice for εt (with the unit variance and the degrees of freedom
v (v > 2)), then (8.48) must be replaced by
$$l(\alpha_0, \dots, \alpha_r) = -\sum_{t=r+1}^{n}\left(\frac{v+1}{2}\,\ln\!\left(1 + \frac{e_t^2}{(v-2)\,\sigma_t^2}\right) + \frac{1}{2}\ln\sigma_t^2\right) \tag{8.50}$$
estimates the entire model including its mean equation (e.g., including the parameters
$\varphi_1, \dots, \varphi_p$ in the expression (8.44)) and including the variance matrix of estimated
parameters.
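The Gaussian conditional log likelihood (8.48) can be evaluated directly once the volatilities are computed from the ARCH(r) volatility equation. The following Python sketch (with the hypothetical name `arch_gaussian_loglik`) only evaluates the function for given parameters; maximizing it numerically, as statistical software does, is not shown.

```python
import math


def arch_gaussian_loglik(params, e):
    """Conditional Gaussian log likelihood (8.48) of ARCH(r).

    params = (alpha_0, alpha_1, ..., alpha_r); e holds the deviations
    e_1, ..., e_n, and the sum conditions on the first r of them.
    """
    alpha0, *alphas = params
    r = len(alphas)
    total = 0.0
    for t in range(r, len(e)):
        # Volatility from the ARCH(r) equation; alphas[i] multiplies lag i + 1.
        sigma2 = alpha0 + sum(a * e[t - i - 1] ** 2 for i, a in enumerate(alphas))
        total += (-0.5 * math.log(2.0 * math.pi)
                  - 0.5 * math.log(sigma2)
                  - 0.5 * e[t] ** 2 / sigma2)
    return total
```

In practice one would pass the negative of this function to a numerical optimizer under the constraints (8.30) and (8.31).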
Remark 8.4 If the assumption of conditional normal distribution (i.e., $\varepsilon_t \sim N(0, 1)$) is
used improperly in a (correctly identified) model ARCH, then the corresponding ML
estimates of its parameters remain consistent, but their estimated variance matrix
should be corrected. For such a case, Bollerslev and Wooldridge (1992) suggested a
possible approach denoted as heteroscedasticity consistent covariances, which is
robust against non-normal distributions (it is based on the QML estimation, i.e.,
quasi-maximum likelihood; see Example 8.2). Nowadays, nonparametric estimation of
conditional heteroscedasticity is also recommended (see, e.g., Fan and Yao (2005)).
⋄
8.3.4.3 Verification of Model ARCH
Most estimation procedures for ARCH models provide “by-products” which
can subsequently be used to verify the constructed model, namely:
• The estimated deviation $\hat{e}_t$ (the one-step-ahead prediction error in the given time
series for time t): e.g., in the model (8.29) it can be estimated as
$$\hat{e}_t = y_t - \hat{\varphi}_1 y_{t-1} - \dots - \hat{\varphi}_p y_{t-p}. \tag{8.51}$$
$$\tilde{e}_t = \frac{\hat{e}_t}{\hat{\sigma}_t}. \tag{8.52}$$
series $\tilde{e}_t^2$ or special tests for the time series {$\tilde{e}_t$} (e.g., LM tests based on Lagrange
multipliers) testing for a potential remaining ARCH structure in {$\tilde{e}_t$}.
• Verification of normality of the conditional ARCH model: Jarque–Bera test (or only the
numerical value of the kurtosis coefficient) for the time series {$\tilde{e}_t$}.
Volatility can be predicted by means of the relation (8.23) in the same way as we
construct predictions in the linear models of Box–Jenkins methodology (see Sect.
6.6), i.e.:
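$$\hat{\sigma}_t^2 = \hat{\sigma}_t^2(t-1) = \hat{\alpha}_0 + \hat{\alpha}_1\hat{e}_{t-1}^2 + \hat{\alpha}_2\hat{e}_{t-2}^2 + \dots + \hat{\alpha}_r\hat{e}_{t-r}^2, \tag{8.53}$$
$$\hat{\sigma}_{t+1}^2(t-1) = \hat{\alpha}_0 + \hat{\alpha}_1\hat{\sigma}_t^2(t-1) + \hat{\alpha}_2\hat{e}_{t-1}^2 + \dots + \hat{\alpha}_r\hat{e}_{t+1-r}^2 \tag{8.54}$$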
etc.
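The prediction scheme (8.53) and (8.54) replaces each not yet observed squared deviation by its own volatility prediction. A minimal Python sketch of this recursion follows; the function name `arch_forecast` and its argument layout are assumptions made for illustration.

```python
def arch_forecast(alpha0, alphas, squared_residuals, horizon):
    """Multi-step ARCH(r) volatility predictions in the spirit of (8.53)-(8.54).

    squared_residuals holds past squared deviations, most recent last;
    future squared deviations are replaced by their predictions.
    """
    r = len(alphas)
    if len(squared_residuals) < r:
        raise ValueError("need at least r past squared residuals")
    history = list(squared_residuals[-r:]) if r else []
    forecasts = []
    for _ in range(horizon):
        # alphas[i] multiplies the value lagged by i + 1 periods.
        sigma2 = alpha0 + sum(a * history[-i - 1] for i, a in enumerate(alphas))
        forecasts.append(sigma2)
        history.append(sigma2)  # the prediction stands in for the unknown e^2
    return forecasts
```

For ARCH(1) with α0 = 1, α1 = 0.5 and a last squared deviation of 4, the predictions 3, 2.5, 2.25 decay toward the unconditional variance 2 from (8.37).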
The model ARCH(r) from the previous section has some drawbacks, e.g.:
• One must often use a high order r to describe the volatility of given time series in
an adequate way.
• If r is high, then it is necessary to estimate a large number of parameters under the
condition of nonnegativeness (8.30) and stationarity (8.31).
• ARCH models cover the volatility clustering but not the leverage effect (i.e.,
asymmetry of the impact of past positive and negative deviations et on the current
volatility).
These drawbacks can be reduced by applying the model GARCH (generalized
ARCH) suggested by Bollerslev (1986). In this model and in its various modifications
(see Sect. 8.3.6), the volatility (i.e., the conditional variance) may also depend
on its previous (lagged) values. In particular, the model GARCH(1,1), which is the
simplest representative of this class of models, is nowadays a very popular tool for
modeling financial time series: it is capable of handling very general volatility
structures using only three parameters (GARCH models of higher orders
are rarely used in routine practice).
The model GARCH(r, s) has the form
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r}\alpha_i e_{t-i}^2 + \sum_{j=1}^{s}\beta_j\sigma_{t-j}^2, \tag{8.55}$$
where εt are iid random variables with zero mean value and unit variance (again they
are mostly assumed to have the normal or t-distribution) and the parameters of model
fulfill
$$\alpha_0 > 0, \quad \alpha_i \ge 0, \quad \beta_j \ge 0, \quad \sum_{i=1}^{\max\{r,s\}}(\alpha_i + \beta_i) < 1 \tag{8.56}$$
(one puts $\alpha_i = 0$ for i > r and $\beta_j = 0$ for j > s; if s = 0, then we go back to the model
ARCH(r)). The last inequality in (8.56) is the sufficient condition for the existence of
variance
$$\operatorname{var}(e_t) = \frac{\alpha_0}{1 - \sum_{i=1}^{\max\{r,s\}}(\alpha_i + \beta_i)}. \tag{8.57}$$
Remark 8.5 If we put $u_t = e_t^2 - \sigma_t^2$ in (8.55), then $u_t$ has the property of white noise,
and it holds
$$e_t^2 = \alpha_0 + \sum_{i=1}^{\max\{r,s\}}(\alpha_i + \beta_i)\,e_{t-i}^2 + u_t - \sum_{j=1}^{s}\beta_j u_{t-j}. \tag{8.58}$$
Hence the volatility equation of the model GARCH can be looked upon as the model
ARMA for the time series of squared deviations {$e_t^2$}. As far as the (non-squared) process
{$e_t$} is concerned, it is weakly (second-order moments) stationary under assumptions
such as (8.31) or (8.56). Strict (distribution) stationarity demands another type
of assumption than (8.56), e.g., $E\{\ln(\alpha_1\varepsilon_t^2 + \beta_1)\} < 0$ for GARCH(1,1) in (8.59);
see Francq and Zakoian (2010).
⋄
In particular, the model GARCH(1,1) has a simpler form
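$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1\sigma_{t-1}^2 \quad (\alpha_0 > 0,\ \alpha_1, \beta_1 \ge 0,\ \alpha_1 + \beta_1 < 1). \tag{8.59}$$
$$\hat{\sigma}_t^2(t-1) = \hat{\sigma}_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1\sigma_{t-1}^2 \tag{8.62}$$
$$\hat{\sigma}_{t+1}^2(t-1) = \alpha_0 + (\alpha_1 + \beta_1)\,\hat{\sigma}_t^2(t-1) \tag{8.64}$$
and generally
$$\hat{\sigma}_{t+\tau}^2(t) = \alpha_0 + (\alpha_1 + \beta_1)\,\hat{\sigma}_{t+\tau-1}^2(t), \quad \tau > 1. \tag{8.65}$$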
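The prediction recursions (8.62) to (8.65) for GARCH(1,1) translate into a short loop. In the Python sketch below the function names are hypothetical; the long-run level is the unconditional variance (8.57) specialized to r = s = 1.

```python
def garch11_forecast(alpha0, alpha1, beta1, last_e2, last_sigma2, horizon):
    """GARCH(1,1) volatility predictions: (8.62) for the first step,
    then the recursion (8.65) for each further step."""
    forecasts = [alpha0 + alpha1 * last_e2 + beta1 * last_sigma2]
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + (alpha1 + beta1) * forecasts[-1])
    return forecasts


def garch11_long_run_variance(alpha0, alpha1, beta1):
    """Unconditional variance (8.57) for r = s = 1; needs alpha1 + beta1 < 1."""
    if alpha1 + beta1 >= 1:
        raise ValueError("alpha1 + beta1 must be smaller than 1")
    return alpha0 / (1.0 - alpha1 - beta1)
```

With the estimates reported in Table 8.2 (α0 = 2.48E-06, α1 = 0.125878, β1 = 0.852148), `garch11_long_run_variance` gives roughly 1.13E-04, the level to which multi-step predictions converge at the geometric rate α1 + β1 ≈ 0.978.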
(the method by Bollerslev and Wooldridge from Remark 8.4 has been applied to
estimate the variance matrix of estimated parameters to be robust against non-normal
distributions). The results of the verification procedures are not presented here, but
Q tests for the estimated standardized deviation (8.52) and for its square (see Sect.
Table 8.2 Estimation of the process GARCH(1, 1) from Example 8.2 (index PX in year 2016)
Dependent Variable: log returns of PX (in 2016)
Included observations: 250 after adjustments
Convergence achieved after 12 iterations
Bollerslev–Wooldridge robust standard errors and covariance
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*GARCH(-1)

                 Coefficient   Std. Error   z-Statistic   Prob.
C                0.000286      0.000538     0.531451      0.5951
Variance Equation
C                2.48E-06      2.00E-06     1.244526      0.2133
RESID(-1)^2      0.125878      0.036311     3.466680      0.0005
GARCH(-1)        0.852148      0.037461     22.74740      0.0000

Source: Calculated by EViews
Fig. 8.6 Histogram of estimated standardized deviation $\tilde{e}_t$ (see (8.52)) for daily log returns of index
PX in 2016 from Example 8.2. Summary statistics shown in the figure (series: standardized residuals,
250 observations): mean -0.038325, median -0.007027, maximum 2.375025, minimum -3.470062,
std. dev. 1.006280, skewness -0.532023, kurtosis 3.667134, Jarque–Bera 16.42982 (probability 0.000271).
Source: calculated by EViews
8.3.4.3) verify statistically the constructed model (the LM test mentioned in
Sect. 8.3.4.3 also does not find any remaining ARCH structure in these estimated
deviations). On the other hand, the histogram of the estimated {$\tilde{e}_t$} and the Jarque–Bera
test shown in Fig. 8.6 indicate non-normality with higher kurtosis, so that one
should apply the t or GED distribution when constructing this GARCH model (see
Sect. 8.3.4.2).
Finally, Fig. 8.7 plots the volatility which is constructed by means of the
estimated model GARCH(1, 1) (one can compare it with its EWMA estimate from
Example 8.1 in Fig. 8.5).
⋄
Fig. 8.7 Volatility of daily log returns of index PX in 2016 (250 values for 251 trading days)
estimated by means of model GARCH(1, 1) in Example 8.2 (compare with its EWMA estimate
from Example 8.1 in Fig. 8.5)
The analysis of financial (or nonlinear) time series is a rapidly developing field. The range
of available models is enormous, including a flood of unsystematic acronyms of
the type FIEGARCH with tens of references in various sources each year (therefore,
it makes no sense to survey the bibliography in this section). Typical examples are
various modifications of GARCH models motivated mostly by an effort to repair
various drawbacks of the classical GARCH models from Sect. 8.3.5: some of them
are briefly described in this section (bearing in mind that practical calculations
mostly require specialized software).
8.3.6.1 IGARCH
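$$\sum_{i=1}^{\max\{r,s\}}(\alpha_i + \beta_i) = 1, \tag{8.67}$$
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + (1-\beta_1)\,e_{t-1}^2 + \beta_1\sigma_{t-1}^2 \quad (\alpha_0 > 0,\ 0 \le \beta_1 \le 1). \tag{8.68}$$
$$\hat{\sigma}_{t+\tau}^2(t) = \alpha_0 + \hat{\sigma}_{t+\tau-1}^2(t), \quad \tau > 1, \tag{8.69}$$
so that
$$\hat{\sigma}_{t+\tau}^2(t) = (\tau-1)\,\alpha_0 + \hat{\sigma}_{t+1}^2(t) = (\tau-1)\,\alpha_0 + \hat{\sigma}_{t+1}^2, \quad \tau > 1. \tag{8.70}$$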
One can see that the influence of current volatilities on predictions of future
volatilities really persists and that these predictions follow a line with the slope α0.
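The step from (8.69) to (8.70) is a simple back-substitution, which can be written out as follows (a worked sketch using only the symbols of (8.68) to (8.70)):

```latex
\begin{aligned}
\hat{\sigma}_{t+\tau}^2(t)
  &= \alpha_0 + \hat{\sigma}_{t+\tau-1}^2(t) \\
  &= 2\alpha_0 + \hat{\sigma}_{t+\tau-2}^2(t) \\
  &\;\;\vdots \\
  &= (\tau-1)\,\alpha_0 + \hat{\sigma}_{t+1}^2(t), \qquad \tau > 1 .
\end{aligned}
```

Each step adds one $\alpha_0$, and after $\tau - 1$ steps the recursion reaches the one-step prediction $\hat{\sigma}_{t+1}^2(t)$. Note also that for $\alpha_0 = 0$ the volatility equation in (8.68) has the same form as the EWMA recursion (8.19) with $\lambda = \beta_1$.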
The classical GARCH model is not capable of modeling the leverage effect, i.e., the
asymmetry in the impact of past positive and negative deviations et on the current
volatility (the volatility is prone to increase more after price drops than after price
growths of the same size; see also Sect. 8.1). Glosten et al. (1993) suggested a
successful modification of GARCH model correcting this drawback (see also
Zakoian (1994)), which is usually denoted as GJR GARCH according to its authors
(sometimes the denotation threshold GARCH or acronym TARCH also appears):
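$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t,$$
$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{r}\alpha_i e_{t-i}^2 + \sum_{j=1}^{s}\beta_j\sigma_{t-j}^2 + \sum_{k=1}^{n}\gamma_k e_{t-k}^2 I_{t-k}^{-}, \qquad I_t^{-} = \begin{cases} 1 & \text{for } e_t < 0, \\ 0 & \text{for } e_t \ge 0. \end{cases} \tag{8.71}$$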
This model can be interpreted in such a way that the impact of “good news” ($e_{t-i} \ge 0$)
modeled by means of $\alpha_i$ differs from the impact of “bad news” ($e_{t-i} < 0$) modeled by
means of $\alpha_i + \gamma_i$. If $\gamma_i > 0$, then bad news induces a growth of volatility, so that
the leverage effect works with delay i. In any case, the model behaves asymmetrically
for $\gamma_i \ne 0$.
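The asymmetry can be made concrete with a one-step volatility update for the first-order case r = s = n = 1 of (8.71). The Python sketch below uses a hypothetical function name; the numerical values in the usage note are the estimates reported in Table 8.4.

```python
def gjr_garch11_variance(alpha0, alpha1, beta1, gamma1, prev_e, prev_sigma2):
    """One-step volatility of GJR GARCH with r = s = n = 1 in (8.71).

    Bad news (prev_e < 0) enters with weight alpha1 + gamma1,
    good news (prev_e >= 0) with weight alpha1 only.
    """
    indicator = 1.0 if prev_e < 0 else 0.0
    return (alpha0
            + (alpha1 + gamma1 * indicator) * prev_e ** 2
            + beta1 * prev_sigma2)
```

With α0 = 4.26E-05, α1 = –0.029062, γ1 = 0.272706, and β1 = 0.762199, a deviation of –0.02 raises the next volatility by γ1 · 0.02² ≈ 1.09E-04 more than a deviation of +0.02 of the same size.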
The most frequent form of the model GJR GARCH in practice (see also Example
8.3) is simply
8.3.6.3 EGARCH
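$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \ln\sigma_t^2 = \alpha_0 + \sum_{i=1}^{r}\alpha_i\,\frac{|e_{t-i}|}{\sigma_{t-i}} + \sum_{j=1}^{s}\beta_j\ln\sigma_{t-j}^2 + \sum_{k=1}^{n}\gamma_k\,\frac{e_{t-k}}{\sigma_{t-k}}. \tag{8.73}$$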
Moreover, before constructing asymmetric models of the type GJR GARCH and
EGARCH, the asymmetry should be tested statistically (see, e.g., Engle and Ng
(1993)). One usually uses the residuals $\hat{e}_t$ obtained from the estimated (symmetric)
GARCH model (see, e.g., (8.51)) and tests by means of classical t, F, or LM tests the
significance of parameters in the linear model of the type
$$\hat{e}_t^2 = \delta_0 + \delta_1 S_{t-1}^{-} + u_t, \qquad S_{t-1}^{-} = \begin{cases} 1 & \text{for } \hat{e}_{t-1} < 0, \\ 0 & \text{for } \hat{e}_{t-1} \ge 0; \end{cases} \tag{8.75}$$
$$\hat{e}_t^2 = \delta_0 + \delta_1 S_{t-1}^{-}\hat{e}_{t-1} + u_t; \tag{8.76}$$
$$\hat{e}_t^2 = \delta_0 + \delta_1 S_{t-1}^{-} + \delta_2 S_{t-1}^{-}\hat{e}_{t-1} + \delta_3 S_{t-1}^{+}\hat{e}_{t-1} + u_t, \qquad S_{t-1}^{+} = 1 - S_{t-1}^{-}, \tag{8.77}$$
where this time the significantly negative estimate –0.1661 of the parameter $\gamma_1$ again
confirms the occurrence of the leverage effect.
Finally, Figs. 8.10 and 8.11 plot the volatilities constructed by means of
the estimated models GJR GARCH(1, 1) and EGARCH(1, 1) (they can be compared
with each other).
⋄
Table 8.3 Daily log returns of stocks KB in 2005 (252 values for 253 trading days written in columns) from Example 8.3 (see also Fig. 8.9)
1 2 3 4 5 6 7 8 9 10 11
1 – 0.0215 –0.0281 0.0041 –0.0268 0.0068 –0.0057 0.0081 –0.0417 –0.0039 –0.0055
2 0.0255 –0.0014 0.0116 –0.0154 –0.0181 –0.0023 –0.0048 0.0526 0.0121 0.0039 0.0128
3 –0.0062 –0.0043 –0.0794 –0.0083 –0.0052 0.0100 –0.0015 –0.0029 0.0000 0.0104 –0.0087
4 –0.0194 0.0043 –0.0073 0.0146 0.0172 0.0096 0.0150 0.0060 –0.0124 –0.0021
5 0.0238 0.0239 0.0285 –0.0098 0.0196 0.0182 –0.0060 0.0082 –0.0129 0.0208
6 –0.0074 0.0179 –0.0303 –0.0013 0.0149 –0.0182 –0.0030 0.0145 0.0029 –0.0138
7 0.0118 –0.0014 0.0422 –0.0341 0.0131 0.0032 –0.0060 0.0179 0.0303 0.0088
8 0.0003 0.0216 –0.0413 –0.0353 –0.0078 –0.0095 –0.0091 0.0173 0.0230 –0.0070
9 –0.0003 –0.0244 0.0294 0.0220 –0.0020 0.0127 0.0119 –0.0110 0.0125 0.0218
10 0.0058 –0.0110 0.0051 0.0277 0.0114 –0.0111 0.0093 –0.0027 0.0210 0.0029
11 0.0176 0.0308 –0.0265 –0.0250 0.0132 0.0079 –0.0060 0.0027 0.0290 –0.0218
12 –0.0060 –0.0016 –0.0225 –0.0384 0.0156 –0.0032 –0.0106 –0.0219 0.0060 0.0088
13 –0.0087 –0.0097 0.0289 0.0290 0.0072 –0.0160 0.0076 0.0227 –0.0086 0.0072
14 –0.0205 –0.0071 –0.0015 0.0101 –0.0041 0.0032 –0.0122 –0.0103 0.0158 0.0066
15 0.0059 0.0257 –0.0107 0.0040 0.0016 –0.0129 0.0003 –0.0055 0.0014 –0.0034
16 –0.0068 –0.0048 0.0227 –0.0178 –0.0238 0.0058 –0.0058 –0.0125 0.0045 0.0012
17 0.0286 0.0075 0.0015 –0.0150 0.0159 0.0290 –0.0006 0.0193 –0.0217 0.0086
18 0.0242 –0.0180 0.0006 –0.0314 0.0281 0.0094 –0.0127 0.0082 –0.0043 –0.0071
19 –0.0343 –0.0145 0.0318 –0.0316 0.0000 –0.0091 0.0019 0.0014 –0.0090 0.0043
20 0.0093 –0.0041 –0.0087 –0.0248 0.0015 0.0141 0.0276 –0.0571 –0.0023 –0.0086
21 –0.0136 0.0296 0.0035 0.0259 –0.0287 0.0068 0.0105 –0.0276 –0.0106 0.0029
22 0.0058 –0.0011 0.0020 0.0568 0.0144 0.0147 0.0228 0.0132 –0.0149 0.0000
23 –0.0087 –0.0081 –0.0218 0.0336 –0.0281 0.0166 –0.0035 0.0116 0.0149 –0.0116
24 0.0186 –0.0081 –0.0256 –0.0064 –0.0178 0.0089 –0.0032 –0.0253 0.0089 0.0029
25 –0.0098 –0.0165 –0.0291 0.0166 0.0033 –0.0119 –0.0272 –0.0415 –0.0172 –0.0017
Fig. 8.8 Daily log returns of stocks KB in 2005 (252 values for 253 trading days) from Example
8.3 (see also Table 8.3). Source: calculated by EViews
Fig. 8.9 Daily closing prices of stocks KB in year 2005 (values in CZK for 253 trading days) from
Example 8.3. Source: kurzy.cz (https://akcie-cz.kurzy.cz/akcie/komercni-banka-590/graf_2005)
Table 8.4 Estimation of the process GJR GARCH(1, 1) from Example 8.3 (daily log returns of
stocks KB in year 2005)
Included observations: 252
Convergence achieved after 26 iterations
Bollerslev–Wooldridge robust standard errors and covariance
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*RESID(-1)^2*(RESID(-1)<0) + C(5)*GARCH(-1)

                            Coefficient   Std. Error   z-Statistic   Prob.
C                           0.000117      0.001024     0.114515      0.9088
Variance Equation
C                           4.26E-05      1.93E-05     2.210756      0.0271
RESID(-1)^2                 –0.029062     0.039569     –0.734468     0.4627
RESID(-1)^2*(RESID(-1)<0)   0.272706      0.125334     2.175834      0.0296
GARCH(-1)                   0.762199      0.083962     9.077869      0.0000

Source: Calculated by EViews
Table 8.5 Estimation of the process EGARCH(1, 1) from Example 8.3 (daily log returns of stocks
KB in year 2005)
Included observations: 252
Convergence achieved after 20 iterations
Bollerslev–Wooldridge robust standard errors and covariance
LOG(GARCH) = C(2) + C(3)*ABS(RESID(-1)/@SQRT(GARCH(-1))) + C(4)*RESID(-1)/
@SQRT(GARCH(-1)) + C(5)*LOG(GARCH(-1))

          Coefficient   Std. Error   z-Statistic   Prob.
C         –1.34E-05     0.001040     –0.012899     0.9897
Variance Equation
C(2)      –0.968342     0.415948     –2.328036     0.0199
C(3)      0.153073      0.069119     2.214631      0.0268
C(4)      –0.166100     0.059934     –2.771389     0.0056
C(5)      0.895763      0.048687     18.39837      0.0000

Source: Calculated by EViews
⋄
Fig. 8.10 Volatility of daily log returns of stocks KB in 2005 (252 values for 253 trading days)
estimated by means of model GJR GARCH(1, 1) in Example 8.3
Fig. 8.11 Volatility of daily log returns of stocks KB in 2005 (252 values for 253 trading days)
estimated by means of model EGARCH(1, 1) in Example 8.3 (compare with volatility estimated by
means of GJR GARCH(1,1) in Fig. 8.10)
8.3.6.4 GARCH-M
The return of financial asset often depends on its volatility (e.g., investors are
compensated for higher risk by higher return). Therefore, Engle et al. (1987)
suggested a modification of ARCH models (and later GARCH models), where the
volatility or its square root enters the mean equation (so-called ARCH-M models).
For instance, the model GARCH(1,1)-M (i.e., GARCH-in-mean) has the form
$$y_t = \mu_t + \gamma_1\sigma_t^2 + e_t \ \ (\text{or } y_t = \mu_t + \gamma_1\sigma_t + e_t), \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1\sigma_{t-1}^2. \tag{8.78}$$
If the parameter $\gamma_1$ is significantly positive, then increased risk, which manifests itself by
increased volatility, raises the level of the time series (i.e., its conditional mean).
The volatility equation of GARCH model is obviously fully deterministic (in the
sense of conditioning by past information). The denotation stochastic volatility is used
in this context only in such a case, when the volatility equation contains additional
error term which remains random even if one conditions by the past information.
Although simple examples of such models are the autoregressive models of volatility
from Sect. 8.3.3, the general SV model (see, e.g., Taylor (1994)) is presented as
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \ln\sigma_t^2 = \alpha_0 + \sum_{j=1}^{s}\beta_j\ln\sigma_{t-j}^2 + u_t, \tag{8.79}$$
where {$u_t$} is another white noise (mostly iid with normal distribution) that is
independent of {$\varepsilon_t$} (the formulation by means of logarithmic volatility makes it possible to
ignore the condition of nonnegativity, similarly as in the EGARCH model). The
models of the type SV have proved useful, e.g., in the context of option pricing,
where the volatility of the underlying asset enters the famous Black–Scholes formula
(see Sect. 8.3.2). On the other hand, difficult estimation is one of the drawbacks of
these models.
Remark 8.7 We have mentioned in the beginning of this section that there are many
modifications of GARCH (and many new ones probably will appear in future), e.g.:
• FIGARCH ( fractionally IGARCH) models are FI models (i.e., the long-memory processes
from Sect. 6.7) applied to volatility; the model FIEGARCH and others have an
analogous character.
• QGARCH (quadratic GARCH) models (see Sentana (1995)) reflect asymmetry
in such a way that the delayed deviations $e_{t-i}$ appear directly on the right-hand side
of the volatility equation (in addition to the squared delayed deviations $e_{t-i}^2$).
• APARCH (asymmetric power ARCH) models (see Ding et al. (1993)) induce the
long-memory property by means of a suitable power transformation of volatilities
and are capable of expressing well the fat tails, excess kurtosis, and leverage
effects.
⋄
8.4 Exercises
Exercise 8.1 Repeat the analysis from Examples 8.1 and 8.2 (daily log returns of
index PX in 2016), but only for the last 100 values of the time series {$r_t$} (hint: $r_t = \sigma_t\varepsilon_t$,
$\sigma_t^2 = 0.0168\,r_{t-1}^2 + 0.5419\,\sigma_{t-1}^2$).
Exercise 8.2 Repeat the analysis from Example 8.3 (daily log returns of stocks
KB in 2005), but only for the last 203 values of the time series {$r_t$} (hint: $r_t = e_t$, $e_t = \sigma_t\varepsilon_t$,
$\sigma_t^2 = 0.0641\,e_{t-1}^2 + 0.8062\,\sigma_{t-1}^2 + 0.2629\,e_{t-1}^2 I_{t-1}^{-}$, where $I_t^{-} = 1$ for $e_t < 0$ and $I_t^{-} = 0$ for $e_t \ge 0$;
$r_t = e_t$, $e_t = \sigma_t\varepsilon_t$, $\ln\sigma_t^2 = -0.9576 + 0.0559\,\dfrac{|e_{t-1}|}{\sigma_{t-1}} + 0.8862\,\ln\sigma_{t-1}^2 - 0.1811\,\dfrac{e_{t-1}}{\sigma_{t-1}}$).
Chapter 9
Other Methods for Financial Time Series
where $\varepsilon_t = e_t/\sigma_t$ are the standardized shocks $e_t$ (the $\varepsilon_t$ are usually iid, in contrast to the
uncorrelated $e_t$, which may be dependent). In this framework, we have so far
(see Sect. 8.3) dealt only with the volatility equation $\sigma_t^2 = h(\Omega_{t-1})$ (an exception was
the model GARCH-M). Now, by contrast, we focus on various nonlinear models
for the (conditional) mean equation $\mu_t = g(\Omega_{t-1})$, although we shall present only
the most important ones from the point of view of applications in finance (see also the
monographs by Priestley (1988), Tong (1990), and others). These models are mostly
specific cases (acceptable from the computational point of view) of the
general model
(see also (8.9)), where $e_t$ is an (uncorrelated) white noise with the variance $\sigma_e^2$.
More specifically, the models in this section have been motivated by some
nonlinear characteristics of data from practice (not necessarily from financial prac-
tice only). Examples are the asymmetry between increase and decrease of time
series, the limit cycle (i.e., limit form of the process in regular cycles if one excludes
all random elements), and the dependence of frequency on amplitude in periodic
behavior of some processes (e.g., the frequency increases with decreasing amplitude
and decreases with increasing amplitude, or on the contrary).
While the linear process (6.17) can be looked upon as originating from the
Taylor expansion of the function $f(\cdot)$ in (9.2) to the first order, in the case of
bilinear models it is the expansion to the second order
$$y_t = \alpha + \sum_{i=1}^{p}\varphi_i y_{t-i} + \sum_{j=1}^{q}\theta_j e_{t-j} + \sum_{n=1}^{P}\sum_{m=1}^{Q}\beta_{mn}\,y_{t-n}e_{t-m} + e_t. \tag{9.3}$$
Some special cases of (9.3) belong to the class of models with conditional
heteroscedasticity, usually assuming $e_t \sim$ iid $(0, \sigma_e^2)$: e.g., if we consider the model
$$y_t = \mu + \sum_{m=1}^{Q}\beta_m e_t e_{t-m} + e_t \tag{9.4}$$
then it holds
$$\mu_t = E(y_t \mid \Omega_{t-1}) = \mu, \qquad \sigma_t^2 = \operatorname{var}(y_t \mid \Omega_{t-1}) = \left(1 + \sum_{m=1}^{Q}\beta_m e_{t-m}\right)^{2}\sigma_e^2. \tag{9.5}$$
The most frequent model of this type is the so-called completely bilinear model of
the form
$$y_t = \sum_{n=1}^{P}\sum_{m=1}^{Q}\beta_{mn}\,y_{t-n}e_{t-m} + e_t \tag{9.6}$$
(here again the white noise values $e_t$ are usually assumed to be independent). When
dealing with models of the type (9.6), the form of the matrix ($\beta_{mn}$) is essential.
One distinguishes so-called superdiagonal, diagonal, or subdiagonal
models depending on whether the matrix ($\beta_{mn}$) has nonzero elements only above the
main diagonal, only on the main diagonal, or only under the main diagonal,
respectively.
The detailed theoretical analysis of some special cases of superdiagonal, diago-
nal, and subdiagonal models (including conditions of stationarity for these models)
has shown some paradoxical results: e.g., the correlation structures of some bilinear
models correspond to the correlation structures of simple linear processes ARMA
(or even to that of white noise). The practical consequence for the identification of bilinear models is that the correlogram alone does not distinguish them from ARMA models. Moreover, the practical calculation of partial autocorrelation
function of bilinear models is so complex that it cannot be recommended for
9.1 Models Nonlinear in Mean Value 233
where the sufficient and necessary condition of stationarity has the form $\lambda^2 < 1$ for $\lambda = \beta\sigma_e$. Then the corresponding time series $\{y_t\}$ has the zero mean value, the variance $\sigma_e^2/(1 - \lambda^2)$, and the autocorrelation function
$$\rho_k = 0 \quad \text{for } k \neq 0. \tag{9.8}$$
Moreover, the autocorrelation function $\rho_k^{(2)}$ of the squared time series $\{y_t^2\}$ fulfills
$$\rho_k^{(2)} = \lambda^{n} \rho_{k-n}^{(2)} \quad \text{for } k > m, \tag{9.9}$$
so that these autocorrelation functions identify the time series $\{y_t\}$ as white noise, while the time series $\{y_t^2\}$ as the process ARMA($n$, $m$) (see (6.46)).
2. Example of diagonal model:
where the sufficient and necessary condition of stationarity has again the form $\lambda^2 < 1$. The corresponding time series $\{y_t\}$ has the mean value $\beta\sigma_e^2$, the variance $\sigma_e^2(1 + \lambda^2 + \lambda^4)/(1 - \lambda^2)$, and the autocorrelation function
$$\rho_k = \frac{\lambda^2(1 - \lambda^2)}{1 + \lambda^2 + \lambda^4} \ \text{ for } k = 1, \qquad \rho_k = 0 \ \text{ for } k > 1 \tag{9.11}$$
(under stronger assumption $e_t \sim \text{iid } N(0, \sigma_e^2)$). The autocorrelation function $\rho_k^{(2)}$ of $\{y_t^2\}$ fulfills
$$\rho_k^{(2)} = \lambda^2 \rho_{k-1}^{(2)} \quad \text{for } k > 1, \tag{9.12}$$
so that these autocorrelation functions identify the time series $\{y_t\}$ as the process MA(1), while the time series $\{y_t^2\}$ as the process ARMA(1, 1).
234 9 Other Methods for Financial Time Series
where the sufficient and necessary condition of stationarity has again the form $\lambda^2 < 1$. The corresponding time series $\{y_t\}$ has the zero mean value, the variance $\sigma_e^2/(1 - \lambda^2)$, and the autocorrelation function
$$\rho_k = 0 \quad \text{for } k \neq 0. \tag{9.14}$$
The autocorrelation function $\rho_k^{(2)}$ of $\{y_t^2\}$ fulfills
$$\rho_k^{(2)} = \lambda^2 \rho_{k-2}^{(2)} \quad \text{for } k > 3, \tag{9.15}$$
so that these autocorrelation functions identify the time series $\{y_t\}$ as white noise, while the time series $\{y_t^2\}$ as the process ARMA(2, 3). Anyway, the analysis of
theoretical properties of subdiagonal models is usually much more complex than for
the superdiagonal and diagonal models.
The bilinear models can be estimated similarly as the linear models applying a
recursive calculation of values et in dependence on the model parameters (see Sect.
6.3.2). Also the predictions can be constructed analogously as in the linear case. For
example in the diagonal model (9.10), it is possible to derive a necessary condition of invertibility in the form (see Granger and Newbold (1986))
$$\frac{\lambda^2(2\lambda^2 + 1)}{1 - \lambda^2} < 1 \tag{9.16}$$
(i.e., $|\lambda| < 0.605$). The prediction in this model can be constructed as
$$\hat{y}_{t+1}(t) = \hat{\beta}\, y_t \hat{e}_t, \qquad \hat{y}_{t+\tau}(t) = E(y_{t+\tau}) = \hat{\beta}\hat{\sigma}_e^2 \quad \text{for } \tau > 1. \tag{9.17}$$
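The numerical bound quoted with condition (9.16) can be checked directly. The following is only a small sketch: for $\lambda^2 < 1$ the inequality is equivalent to $2x^2 + 2x - 1 < 0$ with $x = \lambda^2$, so the bound is the square root of the positive root of this quadratic. The function names `invertibility_bound` and `lhs` are illustrative and do not come from the text.

```python
import math


def invertibility_bound() -> float:
    """Largest |lambda| satisfying lambda^2 (2 lambda^2 + 1) / (1 - lambda^2) < 1."""
    # With x = lambda^2 < 1 the condition reads 2x^2 + 2x - 1 < 0.
    x = (-2.0 + math.sqrt(4.0 + 8.0)) / 4.0  # positive root of 2x^2 + 2x - 1 = 0
    return math.sqrt(x)


def lhs(lam: float) -> float:
    """Left-hand side of condition (9.16)."""
    return lam**2 * (2.0 * lam**2 + 1.0) / (1.0 - lam**2)


bound = invertibility_bound()
print(round(bound, 3))
print(lhs(0.60) < 1.0, lhs(0.61) < 1.0)
```

Evaluating `lhs` slightly below and slightly above the bound shows the condition switching from satisfied to violated.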
Remark 9.1 Granger and Andersen (1978) present the following financial applica-
tion of bilinear models. A time series {yt} of stock prices of a big corporation
with length of 169 observations was originally estimated by means of a linear
model $y_t = e_t + 0.26e_{t-1}$ with the variance of white noise estimated as 24.8. However, the time series $\{e_t\}$ originally looked on as a (linear) white noise was identified and estimated as the bilinear model of the form $e_t = 0.02e_{t-1}u_{t-1} + u_t$, where $u_t$ denotes a white noise with estimated variance 23.5. Even though the reduction of white noise variance seems insignificant (from 24.8 to 23.5), the mean squared error MSE of the one-step-ahead prediction (see (2.11)) calculated for the last fifteen out-of-sample observations (i.e., $h = 15$) decreased by 11% when
applying the bilinear scheme.
⋄
Threshold models SETAR replace linear relations by a piecewise linear function f() in (9.2), the changes of this function being controlled not from the time space but from the state space of function values. More specifically, these models are constructed by applying some critical limits (thresholds) and change regime when the observed time series exceeds these thresholds (a similar principle has been applied for GJR GARCH processes from Sect. 8.3.6, but for their conditional variance (volatility), and not for the conditional mean as in the models SETAR). A more general framework is given by switching-regime models, where the particular regimes can be controlled by fixed (deterministic) thresholds (as in the models SETAR) or in a stochastic way (see, e.g., MSW models later in this section).
Let us consider a very simple model SETAR of the form
$$y_t = \begin{cases} -1.8\,y_{t-1} + e_t & \text{for } y_{t-1} < 0, \\ 0.5\,y_{t-1} + e_t & \text{for } y_{t-1} \ge 0, \end{cases} \tag{9.18}$$
where $e_t$ are iid $N(0, 1)$ (obviously, this model has a single threshold in zero, where the past value $y_{t-1}$ with time delay $d = 1$ controls the current value $y_t$). Figure 9.1
plots one of simulations of this process with length 200 and zero starting value
(trajectories of other simulations are very similar). At first glance, one can see some
interesting properties of this process:
• The process is stationary (even though the first autoregressive polynomial has a
root lying significantly inside the unit circle in complex plane).
• The process is (geometrically) ergodic, i.e., its sample mean converges (in
a specific way) to the theoretical mean.
Fig. 9.1 Simulation of threshold model (9.18) with one zero threshold
• The given realization shows an asymmetry between its upward and downward jumps: if $y_{t-1} < 0$, then the process tends to turn over immediately to a positive (i.e., opposite) value due to the significantly negative value of the autoregressive parameter –1.8, while if $y_{t-1} > 0$, then the turnover to negative (i.e., opposite) values usually takes more time units. It implies directly that the process attains more values above the zero threshold than below it and that it shows immediate significant jumps upward to the positive values, as soon as it becomes negative.
• The sample mean of the given realization is 0.75 with standard deviation 0.08 of
this sample estimate so that it lies significantly above the zero threshold (the
theoretical mean of the given process is the weighted average of its conditional
mean values for both threshold regions with weights corresponding to the prob-
abilities of both regions from the point of view of stationary distribution of the
process).
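The properties listed above can be explored by simulating model (9.18) directly. The sketch below is one possible way to do so with the Python standard library; the function name `simulate_setar` and the seed are arbitrary choices, so the resulting numbers will differ from the realization plotted in Fig. 9.1.

```python
import random
import statistics


def simulate_setar(n: int, seed: int = 0) -> list[float]:
    """Simulate model (9.18): slope -1.8 below the zero threshold, 0.5 otherwise."""
    rng = random.Random(seed)
    y_prev = 0.0  # zero starting value, as in the text
    path = []
    for _ in range(n):
        slope = -1.8 if y_prev < 0 else 0.5
        y_prev = slope * y_prev + rng.gauss(0.0, 1.0)  # e_t ~ iid N(0, 1)
        path.append(y_prev)
    return path


path = simulate_setar(200)
share_nonnegative = sum(v >= 0 for v in path) / len(path)
print(f"sample mean: {statistics.fmean(path):.2f}")
print(f"share of values at or above the threshold: {share_nonnegative:.2f}")
```

Repeating the run with other seeds or a longer length gives an impression of how stable the positive sample mean and the asymmetry between the two regimes are.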
In general, the process $\{y_t\}$, usually denoted as the threshold autoregressive model with $r$ autoregressive regimes of orders $p_j$ and controlling delay $d$ (the acronym SETAR comes from self-exciting threshold AR to stress the self-regulation of the process), has the form
$$y_t = \alpha^{(j)} + \varphi_1^{(j)} y_{t-1} + \ldots + \varphi_{p_j}^{(j)} y_{t-p_j} + e_t^{(j)} \quad \text{for } P_{j-1} \le y_{t-d} < P_j, \quad j = 1, \ldots, r, \tag{9.19}$$
where $d$ and $r$ are given natural numbers, thresholds $P_j$ are real numbers fulfilling inequalities $-\infty = P_0 < P_1 < \ldots < P_r = \infty$, and $\{e_t^{(j)}\}$ are mutually independent white noises usually of the type iid with variances $\sigma_j^2$.
The identification and (simultaneous) estimation of the models SETAR is mostly
realized by applying information criteria of the type AIC from Sect. 6.3.1 (see Tong
(1983, 1990)).
Remark 9.2 Chappell et al. (1996) estimated the model SETAR with one threshold
for the time series {Et} of log returns of daily exchange rate French franc / German
mark (FRF/DEM) in the period from May 1, 1990, to March 30, 1992 (i.e., 450
observations)
$$E_t = \begin{cases} 0.0222 + 0.9962\,E_{t-1} + e_t^{(1)} & \text{for } E_{t-1} < 5.8306, \\ 0.3486 + 0.4394\,E_{t-1} + 0.3057\,E_{t-2} + 0.1951\,E_{t-3} + e_t^{(2)} & \text{for } E_{t-1} \ge 5.8306. \end{cases}$$
The threshold value was estimated a few percent below the upper limit prescribed by
the Exchange Rate Mechanism (ERM) in the Economic and Monetary Union (EMU)
which was in force just at this time (it corresponds to reality, since the central banks
of particular states usually intervened some time before the exchange rate achieved
the permitted limit).
⋄
Remark 9.3 As the conditional mean value of models SETAR is not continuous (indeed, the thresholds are points of discontinuity of $\mu_t$), the models STAR have been suggested (smooth transition AR model; see Chan and Tong (1986) and others). For instance, in the case of a model with two regimes it can be
$$y_t = \alpha^{(1)} + \sum_{i=1}^{p} \varphi_i^{(1)} y_{t-i} + F\!\left(\frac{y_{t-d} - \Delta}{s}\right)\left(\alpha^{(2)} + \sum_{i=1}^{p} \varphi_i^{(2)} y_{t-i}\right) + e_t, \tag{9.20}$$
where parameters Δ and s and a transition function F() determine the way of
transition between both regimes (a usual choice of F in practice is the distribution
function of logistic or exponential distribution). Even though the corresponding
conditional mean value μt is assumed to be differentiable in continuous time, the
consistency of parameter estimation is usually problematic (particularly for the
location Δ and scale s).
⋄
$$y_t = e_t^{+} + \theta_1^{+} e_{t-1}^{+} + \ldots + \theta_q^{+} e_{t-q}^{+} + e_t^{-} + \theta_1^{-} e_{t-1}^{-} + \ldots + \theta_q^{-} e_{t-q}^{-}, \tag{9.21}$$
where $e_t$ is a normal white noise with variance $\sigma_e^2$, $e_t^{+} = \max(0, e_t)$, $e_t^{-} = \min(0, e_t)$, and $\theta_1^{+}, \ldots, \theta_q^{-}$ are parameters. If $\theta_j^{+} = \theta_j^{-}$ ($j = 1, \ldots, q$), then (9.21) is the classical “symmetric” moving average process MA($q$) (see (6.24)).
In contrast to the symmetric moving average models, the asymmetric ones may
not have zero mean value, even though their correlation structure is similar to the
symmetric case with truncation point in q. For example, the asymmetric process
MA(1) fulfills
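$$\mu = E(y_t) = \frac{(\theta_1^{+} - \theta_1^{-})\,\sigma_e}{\sqrt{2\pi}}, \tag{9.22}$$
$$\gamma_0 = \mathrm{var}(y_t) = \frac{\left(1 + (\theta_1^{+})^2\right)\sigma_e^2}{2} + \frac{\left(1 + (\theta_1^{-})^2\right)\sigma_e^2}{2} - \mu^2,$$
$$\rho_1 = \frac{(\theta_1^{+} + \theta_1^{-})\,\sigma_e^2}{2\gamma_0}, \qquad \rho_k = 0 \ \text{ for } k > 1. \tag{9.23}$$
If $\theta_1^{+} = -\theta_1^{-}$, then such a process has obviously the same correlation structure as a white noise.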
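$$y_t = \alpha + \sum_{i=1}^{p} (\varphi_i + \delta_{it})\, y_{t-i} + e_t, \tag{9.24}$$
where $\{\delta_t\} = \{(\delta_{1t}, \ldots, \delta_{pt})'\}$ is a sequence of independent random vectors with zero (vector) mean and variance matrix $\Sigma_{\delta\delta}$ independent of the white noise $\{e_t\}$. The conditional mean and variance of (9.24) fulfill
$$\mu_t = E(y_t \mid \Omega_{t-1}) = \alpha + \sum_{i=1}^{p} \varphi_i y_{t-i},$$
$$\sigma_t^2 = \mathrm{var}(y_t \mid \Omega_{t-1}) = \sigma_e^2 + (y_{t-1}, \ldots, y_{t-p})\, \Sigma_{\delta\delta}\, (y_{t-1}, \ldots, y_{t-p})'. \tag{9.25}$$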
Double stochastic models extend the principle of RCA by modeling the parameters
of an ARMA process (or another classical linear process) by means of other random
processes (see, e.g., Chen and Tsay (1993); Tjøstheim (1986)). Special cases are
functional-coefficient autoregressive processes FAR of the form
where functions f1, ..., fp should be differentiable to the second order, e.g., the
exponential autoregressive model (see Haggan and Ozaki (1981))
$$y_t = \left(\varphi_1 + \pi_1 \exp(-\gamma y_{t-1}^2)\right) y_{t-1} + \ldots + \left(\varphi_p + \pi_p \exp(-\gamma y_{t-1}^2)\right) y_{t-p} + e_t, \tag{9.27}$$
In contrast to the models, the regimes of which are controlled by observable vari-
ables (e.g., by the location of a past value of the given time series between thresholds
of SETAR), the models denoted as MSW change particular regimes in an
unobservable (latent) way, namely by means of a Markov mechanism (MSW is
the acronym for Markov switching).
Particularly, the simplest case of so-called process MSA (Markov-switching
autoregressive) with two regimes has the form
$$y_t = \begin{cases} \alpha^{(1)} + \sum_{i=1}^{p_1} \phi_i^{(1)} y_{t-i} + e_t^{(1)} & \text{for } s_t = 1, \\ \alpha^{(2)} + \sum_{i=1}^{p_2} \phi_i^{(2)} y_{t-i} + e_t^{(2)} & \text{for } s_t = 2, \end{cases} \tag{9.28}$$
and $\{e_t^{(1)}\}$ and $\{e_t^{(2)}\}$ are (mutually independent) iid white noises. Obviously, a small value of the transit probability $w_i$ means that the process remains a longer time in the state $i$ (the reciprocal value $1/w_i$ is equal to the mean period of stay (the mean holding time) in this state). One can see that the process MSA makes use of the Markov probability mechanism to control the transits among particular conditional mean values.
Due to the stochastic (i.e., latent) control of regime switching, the construction of
models MSW is not simple (see, e.g., Hamilton (1989, 1994)). One can make use of
some estimation techniques based on simulations, e.g., MCMC method (Markov
Chain Monte Carlo). The construction of prediction is more complex as well,
combining linearly the predictions constructed for particular regimes (in contrast
to predicting in a threshold model, where the observed past value $y_{t-d}$ unambiguously determines in which regime the prediction will be constructed).
Remark 9.4 In financial practice, the models MSW are popular for modeling time
series of gross domestic products GDPt. For example, Tsay (2002) constructed the
following model for the quarterly (seasonally adjusted) time series of GDP growth
(in %) in the USA in years 1947–1990
$$y_t = \begin{cases} 0.909 + 0.265\,y_{t-1} + 0.029\,y_{t-2} - 0.126\,y_{t-3} - 0.110\,y_{t-4} + e_t^{(1)} & \text{for } s_t = 1, \\ -0.420 + 0.216\,y_{t-1} + 0.628\,y_{t-2} - 0.073\,y_{t-3} - 0.097\,y_{t-4} + e_t^{(2)} & \text{for } s_t = 2, \end{cases}$$
where the standard deviations $\sigma_e^{(1)}$ and $\sigma_e^{(2)}$ of white noises were estimated as 0.816 and 1.017. Hence the mean values of the process $\{y_t\}$ for the first and second state can be evaluated as 0.965 and –1.288 (evidently, the first state corresponds to the quarters of economic growth, while the second state to the quarters of economic decline). Finally, the transit probabilities $w_1$ and $w_2$ were estimated as 0.118 and 0.286 which can be interpreted as follows: to get out of recession is more probable than to enter it. More specifically, the mean length of recession period is $1/0.286 = 3.50$ quarters, i.e., less than 1 year, while the mean length of boom period is $1/0.118 = 8.47$ quarters, i.e., more than 2 years.
⋄
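The figures quoted in Remark 9.4 can be reproduced by simple arithmetic. The sketch below assumes that each regime mean is the usual stationary AR mean, i.e., the intercept divided by one minus the sum of the autoregressive coefficients, and that the mean holding time is the reciprocal of the transit probability; the helper names are illustrative only.

```python
def regime_mean(intercept: float, ar_coefs: list[float]) -> float:
    """Stationary mean of an AR regime: intercept / (1 - sum of AR coefficients)."""
    return intercept / (1.0 - sum(ar_coefs))


def mean_holding_time(transit_prob: float) -> float:
    """Mean number of periods spent in a state left with probability transit_prob."""
    return 1.0 / transit_prob


growth_mean = regime_mean(0.909, [0.265, 0.029, -0.126, -0.110])
decline_mean = regime_mean(-0.420, [0.216, 0.628, -0.073, -0.097])
print(f"{growth_mean:.3f} {decline_mean:.3f}")
print(f"{mean_holding_time(0.118):.2f} {mean_holding_time(0.286):.2f}")
```

The negative intercept of the second regime is what produces the negative mean reported for the quarters of economic decline.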
9.2 Further Models for Financial Time Series
There are further approaches to nonlinear models of financial time series that cannot
be classified analogously as in Sect. 9.1 (due to their philosophy, due to the type of
analysis, etc.). Two examples of such approaches will be given here as illustrations.
Let in the simplest case two financial variables yt and xt be linked by the relation
$$y_t = m(x_t) + e_t, \tag{9.30}$$
where m() is an unknown (nonlinear, but smooth) function and $\{e_t\}$ is a white noise. A natural task is to estimate the function m() as accurately as possible by means of observed data $y_1, \ldots, y_T$ and $x_1, \ldots, x_T$.
As the arithmetic average of values $e_1, \ldots, e_T$ converges to zero with increasing $T$ (according to the law of large numbers), it seems that the natural estimate of m() is the arithmetic average of values $y_1, \ldots, y_T$
$$\frac{1}{T} \sum_{t=1}^{T} y_t. \tag{9.31}$$
However, there is a problem consisting in the fact that one should estimate the
function m(x) for a given value of argument x, but the observed values x1, ..., xT may
differ from x significantly. Therefore, it is reasonable to replace (9.31) by the
weighted average
$$\hat{m}(x) = \frac{1}{T} \sum_{t=1}^{T} w_t(x)\, y_t, \tag{9.32}$$
where the weights wt(x) are large (or small) for such indices t, for which the observed
values xt lie close to x (or far from x), respectively.
In other words, one weighs the values yt by means of locally weighted averages
using weights of described properties. A unifying principle in this context consists in
application of so-called kernel (see, e.g., Härdle (1990)), which is a suitable function
K() with properties of a probability density
$$K(x) \ge 0, \qquad \int_{-\infty}^{\infty} K(z)\, dz = 1. \tag{9.33}$$
$$\hat{m}(x) = \frac{\sum_{t=1}^{T} K_h(x - x_t)\, y_t}{\sum_{t=1}^{T} K_h(x - x_t)}, \tag{9.34}$$
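A kernel-weighted average of the form (9.34) can be sketched in a few lines. The example assumes a Gaussian kernel and the common scaling $K_h(u) = K(u/h)/h$ with bandwidth $h$; neither choice is fixed by the text at this point, and the names `gaussian_kernel` and `kernel_estimate` are illustrative.

```python
import math


def gaussian_kernel(z: float) -> float:
    """Standard normal density: nonnegative and integrating to one, as in (9.33)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kernel_estimate(x: float, xs: list[float], ys: list[float], h: float) -> float:
    """Kernel-weighted average (9.34) with the assumed scaling K_h(u) = K(u / h) / h."""
    weights = [gaussian_kernel((x - xt) / h) / h for xt in xs]
    return sum(w * yt for w, yt in zip(weights, ys)) / sum(weights)


# Noise-free illustration: y = x^2 observed on a grid over [-2, 2].
xs = [i / 50.0 for i in range(-100, 101)]
ys = [xt * xt for xt in xs]
print(kernel_estimate(1.0, xs, ys, h=0.1))
```

Because the factor $1/h$ appears in both the numerator and the denominator, it cancels in the ratio; a smaller bandwidth reduces the smoothing bias at the cost of a noisier estimate when the data contain noise.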
$$f_j(z) = \frac{\exp(z)}{1 + \exp(z)}, \tag{9.36}$$
$x_i$ is the value of the $i$th input node, $o_j$ is the value of the $j$th output node, $\alpha_{0j}$ is called bias, $w_{ij}$ are weights, and the summation $i \to j$ means summing over all input nodes feeding to $j$. If a node has an activation function of the form
$$f_j(z) = \begin{cases} 1 & \text{for } z > 0, \\ 0 & \text{for } z \le 0, \end{cases} \tag{9.37}$$
then one calls it a threshold node, with “1” denoting that the node fires (revitalizes)
its message. The final connection from inputs to outputs can be more complex, if
there are hidden layers (due to compounding gradually the activation functions),
e.g.:
$$o = f\!\left(\alpha_0 + \sum_{i \to o} w_i x_i + \sum_{k \to o} w_{ko}\, f_k\!\left(\alpha_{0k} + \sum_{i \to k} w_{ik} x_i\right)\right), \tag{9.38}$$
where not only a direct connection from the input to the output layer is possible, but
also an indirect one by means of a hidden layer (with summing index k in (9.38)).
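The composition in (9.38) may be easier to follow as a short forward pass. The sketch below assumes logistic hidden nodes as in (9.36) and, by default, an identity output activation; all weights are hypothetical values chosen only for illustration, and the function names are not taken from the text.

```python
import math
from typing import Callable, Sequence


def logistic(z: float) -> float:
    """Logistic activation (9.36), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def network_output(
    x: Sequence[float],
    bias_out: float,
    direct_weights: Sequence[float],
    hidden_biases: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    hidden_to_out: Sequence[float],
    f_out: Callable[[float], float] = lambda z: z,
) -> float:
    """Evaluate (9.38): direct input-output links plus one hidden layer."""
    hidden = [
        logistic(b_k + sum(w * xi for w, xi in zip(w_k, x)))
        for b_k, w_k in zip(hidden_biases, hidden_weights)
    ]
    z = bias_out
    z += sum(w * xi for w, xi in zip(direct_weights, x))
    z += sum(w * h_k for w, h_k in zip(hidden_to_out, hidden))
    return f_out(z)


# Hypothetical weights for a network with 3 inputs, 2 hidden nodes, and 1 output.
out = network_output(
    x=[0.5, -1.0, 2.0],
    bias_out=0.1,
    direct_weights=[0.2, 0.0, -0.1],
    hidden_biases=[0.0, 1.0],
    hidden_weights=[[1.0, 1.0, 0.0], [0.0, 0.0, -0.5]],
    hidden_to_out=[2.0, -1.0],
)
print(out)
```

Passing a different `f_out` (for instance `logistic`) changes only the last step, which mirrors how the outer function in (9.38) wraps the direct and the hidden-layer contributions.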
A typical application of neural networks for financial time series is the following
one. One observes data xt and yt (t ¼ 1, ..., T ), where xt is a vector of input values at
time t and yt is an observation of given time series at time t. In addition to it, we have
also model output values ot expressed analytically for particular times t by means of
relations of the type (9.38). Then by minimizing a simple criterion, e.g., the sum of
squares
$$\sum_{t=1}^{T} (y_t - o_t)^2, \tag{9.39}$$
one estimates the biases α and weights w in (9.38). The neural network calibrated in
this way can be used, e.g., for construction of predictions in the given time series.
Moreover, the hold-out sample approach (see Sect. 2.2.3.4) enables us to evaluate
the prediction qualities of this model.
Remark 9.5 Tsay (2002) applied the model (9.38) with three nodes in the input
layer, two nodes in the hidden layer, and one node in the output layer for 864 daily
log returns $r_t$ of IBM stocks. Choosing the vector of input values as $x_t = (r_{t-1}, r_{t-2}, r_{t-3})$, one estimated the neural network as
$$\hat{r}_t = 3.22 - 1.81 f_1(x_t) - 2.28 f_2(x_t) - 0.09\,r_{t-1} - 0.05\,r_{t-2} - 0.12\,r_{t-3},$$
where
⋄
9.3 Tests of Linearity
The null hypothesis of such tests of linearity is mostly the acceptability of a linear
model for the analyzed time series. In particular, one often verifies the null hypoth-
esis that the values {et} are independent or even iid: any violation of independence of
the calculated residuals usually indicates an inadequacy of constructed model
including the assumption of linearity. In practical analysis of (financial) time series,
one recommends particularly the following tests of linearity (see, e.g., Tsay (2002)):
• Q tests using, e.g., Ljung–Box statistics with the critical region of the form
(applying the significance level α)
$$Q = n(n + 2) \sum_{k=1}^{K} \frac{1}{n - k} \left(r_k(e_t)\right)^2 \ge \chi^2_{1-\alpha}(K - p - q), \tag{9.40}$$
where $\{e_t\}$ are residuals constructed by estimating a model ARMA($p$, $q$) for the given time series (see (6.67) and the verification of models ARCH in Sect. 8.3.4.3).
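The statistic in (9.40) is straightforward to compute from the sample autocorrelations. The following sketch uses only the Python standard library and therefore leaves the comparison with the quantile $\chi^2_{1-\alpha}(K - p - q)$ to a table or a statistical package; the function names and the simulated inputs are illustrative assumptions.

```python
import random


def autocorrelation(series: list[float], k: int) -> float:
    """Sample autocorrelation r_k of the series at lag k."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals: list[float], max_lag: int) -> float:
    """Ljung-Box statistic Q from (9.40) for lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


rng = random.Random(42)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]
# Strongly autocorrelated comparison series: AR(1) with coefficient 0.8.
ar1 = [white[0]]
for e in white[1:]:
    ar1.append(0.8 * ar1[-1] + e)

print(ljung_box_q(white, 10), ljung_box_q(ar1, 10))
```

For the independent series the statistic stays moderate, while the autocorrelated series produces a value far beyond any usual critical value.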
• RESET tests were suggested for various regression problems in statistics (Regres-
sion Equation Specification Error Tests, see Ramsey (1969)). When applying
them to test the linearity in time series, one tests, e.g., the null hypothesis of
the form
$$H_0: \beta_1 = 0, \ldots, \beta_s = 0 \tag{9.41}$$
where the values $\hat{y}_t$ are calculated in the original model AR($p$) for the original time series $\{y_t\}$ (i.e., under constraints (9.41) in (9.42)).
• BDS test is a widely used (nonparametric) test of independence H0: et ~ iid (it is
called according to its authors Brock, Dechert, and Scheinkman). This test is also
recommended as an effective test of linearity of {et} in the framework of
modeling financial time series. Numerous applications show that the test has a
high power when detecting various violations of independence (these violations
may have different forms of linear or nonlinear dependence of all types, deterministic chaos, etc.). Let us describe this test in more detail:
The BDS test starts by choosing a distance $\varepsilon > 0$. If really $e_t \sim$ iid, then the probability that the distance $|e_s - e_t|$ does not exceed $\varepsilon$ for a pair $e_s$ and $e_t$ is the same for an arbitrary choice of such pairs. Let us denote this probability as $c_1(\varepsilon)$. The index 1 in this symbol is used due to the fact that we shall consider more generally also $m$ such pairs ordered in time as $(e_s, e_t), (e_{s+1}, e_{t+1}), \ldots, (e_{s+m-1}, e_{t+m-1})$, and in such a case, the symbol $c_m(\varepsilon)$ will denote the probability that for all pairs in this group the corresponding distances do not exceed $\varepsilon$. If the null hypothesis on independence holds, then it must be
$$c_m(\varepsilon) = (c_1(\varepsilon))^m. \tag{9.43}$$
To perform the test practically, one must dispose of sample versions (estimates) of
these values (they are called correlation integrals in the theory of chaos)
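$$c_{m,T}(\varepsilon) = \frac{2}{(T - m + 1)(T - m)} \sum_{s=1}^{T-m+1} \sum_{t=s+1}^{T-m+1} \prod_{j=0}^{m-1} I_\varepsilon(e_{s+j}, e_{t+j}), \tag{9.44}$$
where
$$I_\varepsilon(x, y) = \begin{cases} 1 & \text{for } |x - y| \le \varepsilon, \\ 0 & \text{otherwise}. \end{cases} \tag{9.45}$$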
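The sample correlation integral (9.44) translates almost literally into a double loop. The sketch below is a direct, unoptimized implementation intended only to make the indices concrete; the function name, the seed, and the choice $\varepsilon = 1$ are assumptions made for the illustration.

```python
import random


def correlation_integral(e: list[float], m: int, eps: float) -> float:
    """Sample correlation integral c_{m,T}(eps) from (9.44)."""
    n_points = len(e) - m + 1  # number of m-histories, T - m + 1
    close = 0
    for s in range(n_points):
        for t in range(s + 1, n_points):
            # Product of indicators (9.45): all m paired distances must be <= eps.
            if all(abs(e[s + j] - e[t + j]) <= eps for j in range(m)):
                close += 1
    return 2.0 * close / (n_points * (n_points - 1))


rng = random.Random(7)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
c1 = correlation_integral(noise, 1, eps=1.0)
c2 = correlation_integral(noise, 2, eps=1.0)
print(c1**2, c2)  # under independence (9.43) these two values should be close
```

For independent data the two printed values should differ only by sampling error, which is the deviation that the BDS statistic standardizes.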
Then the test of hypothesis H0: et ~ iid consists in testing whether the deviation
where
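$$\sigma_{T,m}^2(\varepsilon) = 4\left( (k_T(\varepsilon))^m + 2 \sum_{j=1}^{m-1} (k_T(\varepsilon))^{m-j} (c_{1,T}(\varepsilon))^{2j} + (m - 1)^2 (c_{1,T}(\varepsilon))^{2m} - m^2\, k_T(\varepsilon)\, (c_{1,T}(\varepsilon))^{2m-2} \right),$$
$$k_T(\varepsilon) = \frac{2}{T(T - 1)(T - 2)} \sum_{t=1}^{T} \sum_{s=t+1}^{T} \sum_{r=s+1}^{T} \left( I_\varepsilon(e_t, e_s) I_\varepsilon(e_s, e_r) + I_\varepsilon(e_t, e_r) I_\varepsilon(e_r, e_s) + I_\varepsilon(e_s, e_t) I_\varepsilon(e_t, e_r) \right). \tag{9.48}$$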
This result is then used in the (asymptotic) BDS test with a given significance level.
Table 9.1 presents the application of BDS test for 100 simulated values of the type iid
$N(0, 1)$. The test was performed for particular values $m = 2, \ldots, 6$ and for the distance limit $\varepsilon = 1.378$ which is set up optimally by software (moreover, if the sample size
Table 9.1 BDS test for 100 simulated values of type iid N(0, 1)
BDS Test for Y
Included observations: 100
Dimension BDS Statistic Std. Error z-Statistic Normal Prob. Bootstrap Prob.
2 0.001753 0.006313 0.277743 0.7812 0.7352
3 0.002676 0.010054 0.266189 0.7901 0.7128
4 0.003480 0.011994 0.290146 0.7717 0.6772
5 0.009605 0.012523 0.767018 0.4431 0.4236
6 0.013696 0.012097 1.132115 0.2576 0.2848
Raw epsilon 1.378340
Source: Calculated by EViews
The test was performed again for particular values $m = 2, \ldots, 6$ and for the distance limit $\varepsilon = 0.013$ set up optimally by software. Due to high p-values in Table 9.3, the
null hypothesis on independence of residuals estimated by means of this model
cannot be rejected (more specifically, the test confirms that no unexplained nonlinear
structure is remaining in these residuals).
⋄
9.4 Duration Modeling
Typical data in finance are transactions data (usually prices or volumes of traded
financial assets) with values observed at times of particular transactions (i.e., at time
ti for the ith transaction). Financial time series originating in this way have some
special features:
• As the non-aggregated information has often the form of high-frequency data in
this context, the timescale must reflect this fact (minutes or even seconds for big
stock and derivative exchanges or multinational foreign exchange markets).
Moreover, the technical support of trading becomes very important including
Table 9.2 Daily log returns of index PX50 in 2004 (249 values for 250 trading days written in columns) from Example 9.1 (see also Fig. 9.2 and Table 10.1)
1 2 3 4 5 6 7 8 9 10
1 – 0.0090 –0.0049 0.0114 –0.0045 0.0047 0.0074 –0.0001 –0.0064 0.0016
2 0.0123 0.0084 –0.0047 0.0017 0.0137 –0.0034 –0.0131 0.0054 –0.0107 0.0208
3 0.0057 0.0126 0.0129 –0.0139 0.0076 0.0088 –0.0052 0.0057 0.0250 0.0010
4 –0.0098 0.0171 0.0076 0.0022 0.0111 –0.0051 0.0089 0.0025 0.0088 –0.0108
5 0.0036 0.0022 0.0087 0.0067 0.0067
Fig. 9.2 Daily log returns of index PX50 in 2004 (249 values for 250 trading days)
Table 9.3 BDS test applied to residuals estimated by means of model GARCH(1,1) from Example
9.1 (daily log returns of index PX50 in year 2004)
BDS Test for RESID
Included observations: 249
Dimension BDS Statistic Std. Error z-Statistic Prob.
2 –0.002106 0.005460 –0.385828 0.6996
3 0.003350 0.008682 0.385813 0.6996
4 0.012462 0.010345 1.204570 0.2284
5 0.017166 0.010790 1.590984 0.1116
6 0.017386 0.010413 1.669745 0.0950
Raw epsilon 0.013311
Source: Calculated by EViews
• The price is mostly a discrete-valued variable in transactions data, since the price
change from one transaction to the next occurs only in multiples of tick size (see
above). For example, the NYSE traded successively in eighths, sixteenths, and decimals of a dollar.
• In periods of heavy trading, more transactions may occur (even with different
prices) within a single second or another very small time unit of transaction
recording (so-called multiple transactions).
In this section, we focus on modeling durations between particular transactions.
To be more specific, let $t_i$ denote the (calendar) time measured in seconds since midnight till the moment of the $i$th transaction (obviously, the index $i$ describes the order of transactions in time, not the calendar time). The corresponding duration between the $(i-1)$th and $i$th transaction will be then denoted as $\Delta t_i = t_i - t_{i-1}$ (and sometimes for simplicity even in the abbreviated form as $z_i = \Delta t_i$).
One of the most successful approaches to the duration modeling copies the
philosophy of GARCH models, but for time durations zi (and not for values of the
given time series). The corresponding models based on this analogy are called
autoregressive conditional duration processes ACD(r, s) (see Engle and Russell
(1998)):
$$z_i = \tau_i \varepsilon_i, \qquad \tau_i = \alpha_0 + \sum_{j=1}^{r} \alpha_j \tau_{i-j} + \sum_{k=1}^{s} \beta_k z_{i-k}, \tag{9.49}$$
where $\varepsilon_i$ are iid nonnegative random variables generally with unit mean value and specifically with exponential distribution in the model EACD($r$, $s$), or Weibull distribution in the model WACD($r$, $s$), or gamma distribution in the model GACD($r$, $s$).
Analogously as in the model GARCH, if the following sufficient condition holds
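$$\alpha_0 > 0, \quad \alpha_j \ge 0, \quad \beta_k \ge 0, \quad \sum_{j=1}^{\max\{r,s\}} (\alpha_j + \beta_j) < 1, \tag{9.50}$$
then the model ACD is stationary with mean value of the form
$$E(z_i) = \frac{\alpha_0}{1 - \sum_{j=1}^{\max\{r,s\}} (\alpha_j + \beta_j)}. \tag{9.51}$$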
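$$\mathrm{var}(z_i) = \left(\frac{\alpha_0}{1 - \alpha_1 - \beta_1}\right)^2 \frac{1 - \alpha_1^2 - 2\alpha_1\beta_1}{1 - \alpha_1^2 - 2\beta_1^2 - 2\alpha_1\beta_1}. \tag{9.53}$$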
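The moment formulas can be compared with a simulated duration series. The sketch below assumes that (9.53) refers to the case $r = s = 1$ with exponential innovations, i.e., the model EACD(1, 1); the parameter values and function names are hypothetical and serve only as an illustration of the recursion (9.49).

```python
import random
import statistics


def acd_mean(alpha0: float, alpha1: float, beta1: float) -> float:
    """Stationary mean (9.51) for r = s = 1."""
    return alpha0 / (1.0 - alpha1 - beta1)


def eacd_variance(alpha0: float, alpha1: float, beta1: float) -> float:
    """Variance (9.53), assumed here to describe the model EACD(1, 1)."""
    num = 1.0 - alpha1**2 - 2.0 * alpha1 * beta1
    den = 1.0 - alpha1**2 - 2.0 * beta1**2 - 2.0 * alpha1 * beta1
    return acd_mean(alpha0, alpha1, beta1) ** 2 * num / den


def simulate_eacd(n: int, alpha0: float, alpha1: float, beta1: float, seed: int = 0) -> list[float]:
    """Simulate z_i = tau_i * eps_i, tau_i = alpha0 + alpha1 * tau_{i-1} + beta1 * z_{i-1}."""
    rng = random.Random(seed)
    tau = acd_mean(alpha0, alpha1, beta1)  # start at the stationary mean
    z = tau
    durations = []
    for _ in range(n):
        tau = alpha0 + alpha1 * tau + beta1 * z
        z = tau * rng.expovariate(1.0)  # exponential innovation with unit mean
        durations.append(z)
    return durations


# Hypothetical parameters satisfying the stationarity condition (9.50).
a0, a1, b1 = 0.5, 0.6, 0.2
sample = simulate_eacd(100_000, a0, a1, b1)
print(statistics.fmean(sample), acd_mean(a0, a1, b1))
print(statistics.variance(sample), eacd_variance(a0, a1, b1))
```

With a long simulated series the sample mean should settle near the value given by (9.51); the sample variance converges more slowly because the durations are positively autocorrelated.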
The models of the type EACD, WACD, and GACD can be estimated by the
method of maximum likelihood due to specified distributions of εi (see, e.g., Tsay
(2002)).
Remark 9.6 Tsay (2002) constructed the following model WACD(1,1) for dura-
tions in trading the stocks IBM during five trading days (in total, 3534 durations after
eliminating the diurnal pattern of daily periodicity were used in this construction)
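where Weibull distribution of $\varepsilon_i$ (standardized by unit mean value) has the probability density
$$f(x \mid \lambda) = \begin{cases} \lambda \left[\Gamma\!\left(1 + \frac{1}{\lambda}\right)\right]^{\lambda} x^{\lambda - 1} \exp\left\{-\left[\Gamma\!\left(1 + \frac{1}{\lambda}\right) x\right]^{\lambda}\right\} & \text{for } x \ge 0, \\ 0 & \text{otherwise} \end{cases} \tag{9.54}$$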
with the estimated parameter λ of size 0.879. Then the estimated mean duration
(9.51) after elimination of daily periodicity is 3.31 seconds, which coincides with the
duration estimated directly as the sample mean of the time series of periodically
adjusted durations.
⋄
9.5 Exercises
Exercise 9.1 Apply the BDS test from Sect. 9.3 for (a) 100 simulated values of the
type iid N(0, 1); (b) the daily log returns of index PX in 2016 estimated as GARCH
(1, 1) in Example 8.2.
Chapter 10
Models of Development of Financial Assets
In Sect. 2.1, we have defined the random (or stochastic) process $\{Y_t, t \in T\}$ in continuous time as a set of random variables in the same probability space $(\Omega, \Im, P)$ indexed by means of values $t$ from the set $T = \langle 0, \infty)$ interpreted as time. The
continuous time is the necessary assumption for various financial schemes that
model (in a practically acceptable way) the price changes of financial assets, even
though in reality these prices are observed in discrete time moments only (however,
an awareness of the fact that the analyzed prices exist continually and change in time
unceasingly may serve as a motivation for their analysis).
The models of Box–Jenkins methodology in discrete time from Chap. 6 (e.g., the
linear process (6.17)) are based on unpredictable discrete increments (innovations,
shocks) in the form of white noise. The analogy for models in continuous time can be
based on increments of Wiener process $\{W_t, t \ge 0\}$. The properties (2.56) of Wiener process can be rewritten by means of its increments $\Delta W_t = W_{t+\Delta t} - W_t$ as
$$\begin{cases} \text{(i)} & W_0 = 0; \\ \text{(ii)} & \text{the particular trajectories are continuous in time;} \\ \text{(iii)} & \Delta W_t = \varepsilon \sqrt{\Delta t} \ \text{ with a random variable } \varepsilon \sim N(0, 1); \\ \text{(iv)} & \Delta W_t \ \text{is independent of } W_s \text{ for arbitrary } 0 \le s < t. \end{cases} \tag{10.1}$$
The property (i) can be formulated more generally as $P(W_0 = 0) = 1$. If we delete the property (ii), then such a process is called standard Brownian motion (however, it is possible to show that every Brownian motion has a modification with continuous trajectories). The property (iii) can be also written as
$$\Delta W_t \sim N(0, \Delta t). \tag{10.2}$$
In particular, the standard deviation of the process increment is equal to the square
root of the corresponding time increment. Finally, the assumption (iv) is so-called
Markov property (see also (2.48)), i.e., at time $t$, any future value $W_{t+h}$ ($h > 0$) depends only on the present value $W_t$, and not on previous values $W_s$ ($s < t$). It means
consequently that the increments of the process are mutually independent for non-
overlapping time intervals (see Sect. 2.5.2). From the financial interpretation point of
view, this Markov property corresponds to so-called weakly efficient markets. There
are other specific properties of Wiener process suitable for financial modeling, e.g., it
holds (if we consider a “long” time increment from zero to t)
so that the variance of Wiener process accrues linearly with increasing time.
As the trajectories of Wiener process are not differentiable at any point of time (i.e., they nowhere have derivatives, even though they are continuous), one cannot
integrate them in the classical way and has to make use of so-called stochastic (Ito’s)
calculus.
10.1 Financial Modeling in Continuous Time 253
The general scheme of financial models in continuous time is the diffusion process
(Ito’s process). If we denote small changes of a variable $x$ as $dx$, then the usual form of this process $\{Y_t, t \ge 0\}$ is
where Wt is Wiener process. This model has the drift component μ(Yt, t)dt for
modeling the trend and the diffusion component σ(Yt, t)dWt for modeling the vola-
tility. Since the drift coefficient μ(Yt, t) and the diffusion coefficient σ(Yt, t) may
change in time (they depend on the time and even on the value of the process),
one has to use integration when solving the differential equation (10.4), i.e.,
$$Y_t = Y_0 + \int_0^t \mu(Y_s, s)\, ds + \int_0^t \sigma(Y_s, s)\, dW_s \quad \text{for } t \ge 0, \tag{10.5}$$
where the second integral is stochastic (i.e., one integrates with respect to random
processes; see also Sect. 10.1.2) assuming so-called previsibility of process σ(Yt, t)
(i.e., independence on the future in terms of measurability of this process with
respect to the current and past information).
An important special case of Ito’s process is Wiener process with drift μ and
volatility σ (generalized Wiener process, arithmetic Wiener process) of the form
$$dY_t = \mu\, dt + \sigma\, dW_t \quad \text{for } t \ge 0 \tag{10.6}$$
($\sigma \ge 0$), which has the constant drift and diffusion coefficient. It means that it holds (assuming $Y_0 = 0$)
$$Y_t = \mu t + \sigma W_t \quad \text{for } t \ge 0 \tag{10.7}$$
$$Y_{k\Delta t} = \mu k \Delta t + \sigma \sqrt{\Delta t} \sum_{j=1}^{k} \varepsilon_j \quad \text{for } k = 1, 2, \ldots. \tag{10.9}$$
⋄
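The discretized form (10.9) can be used directly to generate a trajectory on a time grid. The sketch below assumes $Y_0 = 0$ and independent $\varepsilon_j \sim N(0, 1)$; the parameter values (drift 0.05, volatility 0.2, 250 steps per unit of time) are hypothetical.

```python
import math
import random


def simulate_wiener_with_drift(
    mu: float, sigma: float, dt: float, n_steps: int, seed: int = 0
) -> list[float]:
    """Grid values Y_{k dt}, k = 0..n_steps, built from (10.9) with Y_0 = 0."""
    rng = random.Random(seed)
    path = [0.0]
    shock_sum = 0.0
    for k in range(1, n_steps + 1):
        shock_sum += rng.gauss(0.0, 1.0)  # running sum of eps_j ~ N(0, 1)
        path.append(mu * k * dt + sigma * math.sqrt(dt) * shock_sum)
    return path


path = simulate_wiener_with_drift(mu=0.05, sigma=0.2, dt=1 / 250, n_steps=250)
print(path[-1])
```

Setting `sigma=0.0` reduces the trajectory to the deterministic line $\mu t$, which is a convenient sanity check of the drift term.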
Another special case of Ito’s process is the exponential Wiener process (some-
times called geometric Brownian motion; see Sects. 2.5.2 and 10.1.3).
The stochastic calculus demands to modify the classic derivatives and integrals to
their stochastic variants, which will be briefly described below (the theoretical
backgrounds of this complex discipline including a technical construction of sto-
chastic integral can be found in various sources, see, e.g., Baxter and Rennie (1996),
Dupačová et al. (2002), Karatzas and Shreve (1988), Kwok (1998), Malliaris and
Brock (1982), Musiela and Rutkowski (2004), Neftci (2000), Wilmott (2000), and
others):
The basic principle of random differential (or equivalently stochastic differentiation) is the well-known Ito’s lemma. Let us consider a diffusion process $\{Y_t, t \ge 0\}$ according to (10.4). Further let $f(y, t)$ be a continuous (nonrandom) function of variable $y$ and time $t$ with continuous partial derivatives $f_y = \partial f/\partial y$, $f_{yy} = \partial^2 f/\partial y^2$, and $f_t = \partial f/\partial t$. Then the transformed process $f(Y_t, t)$ fulfills (so-called Ito’s lemma):
$$df(Y_t, t) = \left( f_y\, \mu(Y_t, t) + f_t + \frac{1}{2} f_{yy}\, \sigma^2(Y_t, t) \right) dt + f_y\, \sigma(Y_t, t)\, dW_t \quad \text{for } t \ge 0. \tag{10.10}$$
If, e.g., $f(W_t, t) = W_t^2$, then $f_y = 2W_t$, $f_{yy} = 2$ and $f_t = 0$, so that the differential of squared Wiener process fulfills
$$dW_t^2 = dt + 2W_t\, dW_t \quad \text{for } t \ge 0, \tag{10.11}$$
$$\int_0^t dY_s = Y_t - Y_0. \tag{10.12}$$
$$\int_0^t dW_s = W_t. \tag{10.13}$$
$$W_t^2 = t + 2 \int_0^t W_s\, dW_s, \tag{10.14}$$
$$\int_0^t W_s\, dW_s = \frac{1}{2}\left(W_t^2 - t\right) \tag{10.15}$$
(again it is different significantly from the classical (nonrandom) integral $\int_0^t x\, dx = t^2/2$).
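The difference between the stochastic integral (10.15) and its classical counterpart can be seen numerically. The sketch below approximates $\int_0^t W_s\, dW_s$ by a sum in which the integrand is evaluated at the left endpoint of each small time step, which is the convention underlying Ito’s integral; the step count, horizon, and seed are arbitrary choices.

```python
import math
import random


def ito_sum(n_steps: int, horizon: float = 1.0, seed: int = 0) -> tuple[float, float]:
    """Left-endpoint sum of W_s dW_s on [0, horizon]; returns (sum, W at horizon)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    w = 0.0
    total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # increment as in (10.1)(iii)
        total += w * dw  # integrand taken at the left endpoint
        w += dw
    return total, w


approx, w_end = ito_sum(100_000)
print(approx, 0.5 * (w_end**2 - 1.0))  # the two numbers should nearly agree
```

The sum tracks $\frac{1}{2}(W_t^2 - t)$ rather than $\frac{1}{2}W_t^2$, because the squared increments add up to approximately $t$ instead of vanishing as they would for a smooth path.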
One of the most utilized processes for modeling prices $\{P_t, t \ge 0\}$ of financial assets
(e.g., stocks) in continuous time is exponential Wiener process (geometric Brownian
motion) defined as Ito’s process of the form
ΔPt
¼ μ Δt þ σ ΔW t ð10:17Þ
Pt
indicates that one models, as a matter of fact, returns of given asset by means of
(deterministic) drift component μΔt and (random) diffusion component σ ΔWt ~ N
(0, σ 2Δt).
However, in practice we often model the logarithmic price
$$p_t = \ln P_t, \tag{10.18}$$
since discrete differencing then directly yields the log return $r_t = p_t - p_{t-1}$ (see (8.1)). Obviously, it holds
256 10 Models of Development of Financial Assets
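$$\frac{\partial \ln P_t}{\partial P_t} = \frac{1}{P_t}, \qquad \frac{\partial^2 \ln P_t}{\partial P_t^2} = -\frac{1}{P_t^2}, \qquad \frac{\partial \ln P_t}{\partial t} = 0,$$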
so that, according to Ito’s lemma, the logarithmic transformation of the process $P_t$ from (10.16) fulfills
$$dp_t = d\ln P_t = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\,dW_t \quad \text{for } t \ge 0. \tag{10.19}$$
Therefore, the logarithmic price has the drift coefficient $\mu - \sigma^2/2$ and the diffusion coefficient (or volatility) $\sigma$. The differential equation (10.19) may be solved by integrating both of its sides (see Sect. 10.1.2)
$$\int_0^t dp_s = \int_0^t \left(\mu - \frac{\sigma^2}{2}\right) ds + \int_0^t \sigma\,dW_s, \tag{10.20}$$
i.e.,
$$p_t = p_0 + \left(\mu - \frac{\sigma^2}{2}\right) t + \sigma\,W_t \quad \text{for } t \ge 0 \tag{10.21}$$
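or equivalently for the original price $P_t$ (i.e., after removing the logarithms)
$$P_t = \exp\left\{ p_0 + \left(\mu - \frac{\sigma^2}{2}\right) t + \sigma\,W_t \right\} = P_0 \exp\left\{ \left(\mu - \frac{\sigma^2}{2}\right) t + \sigma\,W_t \right\} \quad \text{for } t \ge 0. \tag{10.22}$$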
The exponential Wiener process is usually presented just in this exponential form,
which justifies its name (the form (10.22) can be also compared with the
reparameterized version (2.59) of this process).
The formulas (10.21) and (10.22) imply that (conditionally on the price $p_0 = \ln P_0$ at time $t = 0$)
$$p_t \sim N\left( p_0 + \left(\mu - \frac{\sigma^2}{2}\right) t,\ \sigma^2 t \right), \qquad P_t \sim LN\left( p_0 + \left(\mu - \frac{\sigma^2}{2}\right) t,\ \sigma^2 t \right), \tag{10.23}$$
where $LN(\mu, \sigma^2)$ denotes a random variable with lognormal distribution, which has the form of the exponential function $\exp(X)$ of a random variable $X \sim N(\mu, \sigma^2)$. Then one can show (again conditionally on the price $p_0 = \ln P_0$) that, e.g.:
10.1 Financial Modeling in Continuous Time 257
$$E(P_t) = P_0 \exp(\mu t), \qquad \mathrm{var}(P_t) = P_0^2 \exp(2\mu t)\left( \exp(\sigma^2 t) - 1 \right). \tag{10.24}$$
In particular, the drift coefficient $\mu$ represents the average annual log return due to the price changes of the given asset (see (8.1) rewritten to the form $P_t = P_{t-1}\exp(r_t)$ for log returns $\{r_t\}$).
Remark 10.2 The relations (10.21) and (10.22) can be used when simulating the development of prices $p_t$ or $P_t$ (see also Remark 10.1). Note also that, according to (10.21), if the exponential Wiener process is applied for modeling the price of a given financial asset, then the log return $r_t = p_t - p_{t-1}$ is white noise with the probability distribution
$$r_t \sim N\left( \mu - \frac{\sigma^2}{2},\ \sigma^2 \right). \tag{10.25}$$
⋄
In financial practice, both unknown parameters $\mu$ and $\sigma^2$ of the exponential Wiener process can be statistically estimated from observed data (so-called calibration of the model in a given financial environment):
Let, e.g., $r_1, \ldots, r_n$ be the corresponding log returns measured over regular time intervals of length $\Delta$, which is a given fraction of a year (mostly daily log returns are used, i.e., $\Delta \approx 1/250$ for 250 trading days in 1 year). Then, similarly to (10.25), it holds
$$E(r_t) = \left(\mu - \frac{\sigma^2}{2}\right)\Delta, \qquad \mathrm{var}(r_t) = \sigma^2 \Delta. \tag{10.26}$$
The given data set usually enables one to estimate the sample mean and the sample variance as
$$\bar{r} = \frac{1}{n}\sum_{t=1}^{n} r_t, \qquad s_r^2 = \frac{1}{n-1}\sum_{t=1}^{n} (r_t - \bar{r})^2. \tag{10.27}$$
Comparing the theoretical values (10.26) and the sample values (10.27) for higher $n$ (the sample values can obviously be used as consistent estimates of the theoretical values), one finally obtains the estimates of $\mu$ and $\sigma$ as
$$\hat{\sigma} = \frac{s_r}{\sqrt{\Delta}}, \qquad \hat{\mu} = \frac{\bar{r}}{\Delta} + \frac{\hat{\sigma}^2}{2} = \frac{\bar{r}}{\Delta} + \frac{s_r^2}{2\Delta} \tag{10.28}$$
(moreover, e.g., the standard deviation of the estimate $\hat{\sigma}$, which serves as an error of this estimate, can be estimated by $\hat{\sigma}/\sqrt{2n}$).
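The calibration steps (10.26) to (10.28) can be sketched in a few lines. The example below generates artificial log returns from (10.25) scaled to the interval length $\Delta$ and then recovers the parameters; the function names and the "true" values 0.10 and 0.20 are assumptions made for illustration, not data from the text.

```python
import math
import random


def simulate_log_returns(mu, sigma, delta, n, seed=0):
    """Draw n log returns with the moments given in (10.26)."""
    rng = random.Random(seed)
    mean = (mu - 0.5 * sigma ** 2) * delta
    sd = sigma * math.sqrt(delta)
    return [rng.gauss(mean, sd) for _ in range(n)]


def calibrate_gbm(returns, delta):
    """Estimate (mu, sigma) by (10.27)-(10.28); also return the error of sigma_hat."""
    n = len(returns)
    r_bar = sum(returns) / n                                  # sample mean
    s2 = sum((r - r_bar) ** 2 for r in returns) / (n - 1)     # sample variance
    sigma_hat = math.sqrt(s2 / delta)
    mu_hat = r_bar / delta + 0.5 * sigma_hat ** 2
    se_sigma = sigma_hat / math.sqrt(2 * n)
    return mu_hat, sigma_hat, se_sigma


# Hypothetical parameters; 5000 daily observations with delta = 1/250.
returns = simulate_log_returns(mu=0.10, sigma=0.20, delta=1 / 250, n=5000)
mu_hat, sigma_hat, se_sigma = calibrate_gbm(returns, delta=1 / 250)
```

In such experiments the volatility is typically recovered far more precisely than the drift, whose estimate depends mainly on the total length of the observed period rather than on the number of observations.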
Example 10.1 In this example, we estimate the exponential Wiener process for
index PX50 (250 trading days in year 2004; see Table 10.1 and Fig. 10.1 on the left).
By means of the formulas (10.27), one estimates easily for data sample in
Table 10.1
(in particular, the error of volatility estimate is 0.7%), so that the average annual log
return of the index PX50 was 45.8%.
Using the estimated parameters μ and σ 2 in the relation (10.22), simulations of the
index PX50 are possible similarly to that in Remark 10.1. One of such simulations is
plotted in Fig. 10.1 on the right (including the true observations on the left to
compare both plots). It is apparent that in such simulations the high drift coefficient
45.8% prevails over the volatility 15.6% so that the simulated trajectories are often
very distinctly increasing.
⋄
10.2 Black–Scholes Formula
1 2 3 4 5 6 7 8 9 10
1 662.10 726.00 789.70 846.00 759.70 796.20 800.90 837.60 883.80 999.90
2 670.30 732.10 786.00 847.40 770.20 793.50 790.50 842.10 874.40 1020.90
3 674.10 741.40 796.20 835.70 776.10 800.50 786.40 846.90 896.50 1021.90
4 667.50 754.20 802.30 837.50 784.80 796.40 793.40 849.00 904.40 1010.90
5 665.10 748.10 805.20 834.90 786.50 789.90 785.40 856.40 910.50 1017.70
6 663.20 748.60 797.80 844.20 778.70 786.40 782.50 857.50 905.50 1029.70
7 674.90 745.40 804.20 850.40 781.80 784.10 784.60 860.70 907.50 1030.10
8 676.90 748.40 813.20 842.50 781.60 787.00 790.50 850.50 906.20 1021.00
9 678.80 749.60 817.50 818.00 784.70 793.40 783.80 855.40 892.90 1024.20
10 684.40 751.00 813.50 816.70 795.30 787.50 782.20 860.20 911.90 1012.60
11 687.80 751.60 812.70 799.60 793.10 779.70 785.10 849.80 916.80 981.00
12 688.50 746.70 823.50 794.40 784.20 784.80 785.00 847.60 936.60 992.10
13 688.20 758.00 823.80 789.10 782.10 786.50 791.60 847.40 935.70 994.00
14 693.40 760.20 830.20 772.50 779.00 787.10 797.20 875.40 933.00 1015.20
15 692.40 774.30 837.00 741.50 779.00 780.70 796.20 875.40 940.90 1019.50
16 687.80 775.80 845.70 763.70 783.40 770.80 801.20 872.10 933.40 1020.10
17 687.30 775.50 842.00 763.40 787.50 779.20 804.20 864.10 938.80 1013.00
18 689.90 776.40 848.70 749.20 804.40 784.30 807.30 872.00 949.40 1009.60
19 691.70 784.40 853.60 750.80 807.30 785.60 816.00 878.50 957.10 1002.00
20 691.90 792.00 850.20 739.00 797.90 786.10 817.60 891.60 954.60 1007.10
21 698.20 811.10 854.30 744.90 794.70 794.10 817.00 884.50 958.10 1011.00
22 703.20 809.60 835.10 762.30 805.10 790.20 819.10 902.30 960.60 1015.70
23 709.60 801.70 838.10 760.70 799.90 787.30 831.40 899.10 994.60 1020.50
24 714.30 790.70 831.50 759.00 798.80 794.30 828.90 900.50 1001.30 1030.90
25 719.50 793.60 836.40 763.10 792.50 795.00 837.70 889.50 998.30 1032.00
Fig. 10.1 Index PX50 in the year 2004 (on the left) and its simulation by means of estimated
exponential Wiener process from Example 10.1 (on the right)
• F is a function of time t and the price Pt of underlying asset at time t (e.g., stock,
currency, crude oil, stock index). One will write for simplicity Ft ¼ F(Pt, t).
• The value FT is determined by a boundary condition at maturity time T (t < T )
according to the type of financial derivative.
For example, let us consider a (European) call option (simply call) which gives its
buyer (holder in long position) the right to buy at maturity time T an underlying asset
for a preset price X (exercise price or strike price) even though the market price of
the given asset at time T is PT. The call option can be purchased at time t for a price
Callt (so-called call premium), where one knows only the asset price Pt at time t, but
not the future price PT at time T. The seller of this call (underwriter in short position)
must sell the underlying asset at maturity time according to the holder’s decision. In
this case, the boundary condition for the function $F_t$ ($= Call_t$) obviously has the form
$$Call_T = \max(P_T - X,\ 0) \tag{10.29}$$
(European options can be exercised only at the maturity date, while American
options at any time up to the maturity date). Similarly, a (European) put option
(simply put) gives its buyer (holder in long position) the right to sell at maturity time
T an underlying asset for a preset price X. The put option can be purchased at time
t for a price Putt (put premium). The seller of this put (underwriter in short position)
must buy the underlying asset at maturity time according to the holder’s decision. In
this case, the boundary condition for the function $F_t$ ($= Put_t$) has the form
$$Put_T = \max(X - P_T,\ 0). \tag{10.30}$$
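$$dF_t = \left( \frac{\partial F_t}{\partial P_t}\,\mu P_t + \frac{\partial F_t}{\partial t} + \frac{1}{2}\frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) dt + \frac{\partial F_t}{\partial P_t}\,\sigma P_t\,dW_t. \tag{10.31}$$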
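One can rewrite the previous relations in the following discrete form:
$$\Delta P_t = \mu P_t\,\Delta t + \sigma P_t\,\Delta W_t, \tag{10.32}$$
$$\Delta F_t = \left( \frac{\partial F_t}{\partial P_t}\,\mu P_t + \frac{\partial F_t}{\partial t} + \frac{1}{2}\frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) \Delta t + \frac{\partial F_t}{\partial P_t}\,\sigma P_t\,\Delta W_t. \tag{10.33}$$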
$$V_t = -F_t + \frac{\partial F_t}{\partial P_t}\,P_t. \tag{10.34}$$
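In the framework of this portfolio, one owns the underlying asset of size $\partial F_t/\partial P_t$ (with price $P_t$ per unit) and simultaneously owes the considered financial derivative of unit size (with price $F_t$ per unit). It holds
$$\Delta V_t = -\Delta F_t + \frac{\partial F_t}{\partial P_t}\,\Delta P_t = \left( -\frac{\partial F_t}{\partial t} - \frac{1}{2}\frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) \Delta t, \tag{10.35}$$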
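i.e., the changes of this portfolio do not include the random component $\Delta W_t$. Therefore, the portfolio $V_t$ is riskless over small time changes $\Delta t$ and must earn during such time changes the same as other riskless investments with the riskless interest rate $r_f$ (free of risk), i.e.:
$$\Delta V_t = r_f\,V_t\,\Delta t \tag{10.36}$$
(otherwise one could break the no arbitrage rule; see above). Substituting (10.34) and (10.35) into (10.36), one obtains
$$\left( \frac{\partial F_t}{\partial t} + \frac{1}{2}\frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) \Delta t = r_f \left( F_t - \frac{\partial F_t}{\partial P_t}\,P_t \right) \Delta t. \tag{10.37}$$
Dividing by $\Delta t$ and rearranging gives the partial differential equation
$$\frac{\partial F_t}{\partial t} + r_f P_t\,\frac{\partial F_t}{\partial P_t} + \frac{1}{2}\sigma^2 P_t^2\,\frac{\partial^2 F_t}{\partial P_t^2} = r_f F_t. \tag{10.38}$$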
The solution of this equation under the boundary condition (10.29) is the well-known Black–Scholes formula for the (European) call premium (see also (8.20)):
$$Call_t = P_t\,\Phi(d_+) - X e^{-r_f (T-t)}\,\Phi(d_-), \tag{10.39}$$
where
$$d_+ = \frac{\ln(P_t/X) + \left(r_f + \sigma^2/2\right)(T-t)}{\sigma\sqrt{T-t}}, \qquad d_- = \frac{\ln(P_t/X) + \left(r_f - \sigma^2/2\right)(T-t)}{\sigma\sqrt{T-t}} = d_+ - \sigma\sqrt{T-t},$$
$\Phi(\cdot)$ is the distribution function of $N(0, 1)$, and the remaining symbols are described in the previous text. Similarly, the Black–Scholes formula for the (European) put premium is
$$Put_t = X e^{-r_f (T-t)}\,\Phi(-d_-) - P_t\,\Phi(-d_+). \tag{10.40}$$
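The Black–Scholes premiums can be evaluated with the Python standard library alone. The following is a minimal sketch that uses the standard textbook form of the call and put formulas together with the quantities $d_+$ and $d_-$ defined above; the function name and argument names are illustrative, and the numerical inputs are those of a call with strike 50, spot 53, three months to maturity, volatility 0.50, and riskless rate 5% p.a.

```python
import math
from statistics import NormalDist


def black_scholes(price, strike, tau, sigma, r_f):
    """Return (call, put, d_plus, d_minus) for time to maturity tau = T - t."""
    phi = NormalDist().cdf     # distribution function of N(0, 1)
    d_plus = (math.log(price / strike) + (r_f + 0.5 * sigma ** 2) * tau) / (
        sigma * math.sqrt(tau)
    )
    d_minus = d_plus - sigma * math.sqrt(tau)
    discounted_strike = strike * math.exp(-r_f * tau)
    call = price * phi(d_plus) - discounted_strike * phi(d_minus)
    put = discounted_strike * phi(-d_minus) - price * phi(-d_plus)
    return call, put, d_plus, d_minus


call, put, d_plus, d_minus = black_scholes(
    price=53.0, strike=50.0, tau=0.25, sigma=0.50, r_f=0.05
)
```

With these inputs the call premium comes out at about 7.10, and the two premiums satisfy put-call parity, `call - put == price - strike * exp(-r_f * tau)`, up to rounding.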
Example 10.2 Let us consider a call option to buy a stock with exercise price 50 EUR. Three months before the maturity of this European option, the price of the stock is 53 EUR. What is the call premium according to the Black–Scholes formula (the price volatility of the stock has been estimated as 0.50 EUR and the corresponding riskless interest rate is 5% p.a.)?
According to (10.39) for $P_t = 53$, $X = 50$, $T - t = 3/12 = 0.25$, $\sigma = 0.50$, and $r_f = 0.05$, one obtains
$$d_+ = \frac{\ln(53/50) + \left(0.05 + 0.50^2/2\right) \cdot 0.25}{0.50\sqrt{0.25}} = 0.4081, \qquad \Phi(0.4081) = 0.6584,$$
$$d_- = 0.4081 - 0.50\sqrt{0.25} = 0.1581, \qquad \Phi(0.1581) = 0.5628,$$
so that
$$R(t, T) = -\frac{\ln P(t, T)}{T - t}, \tag{10.42}$$
which follows from the formula of continuous discounting (see, e.g., Cipra (2010)):
$$P(t, T) = \exp\{-R(t, T)\,(T - t)\}. \tag{10.43}$$
$$r_t = R(t, t) = -\left.\frac{\partial \ln P(t, T)}{\partial T}\right|_{T=t} = -\frac{\partial \ln P(t, t)}{\partial T} \tag{10.44}$$
is called the instantaneous interest rate at time $t$ (since $r_t$ is the limit value of $R(t, T)$ for $T \to t$). The adjective “instantaneous” expresses the fact that if this interest rate is applied to a capital $K$ at time $t$, then the capital accrual during a short time interval $\Delta t$ is approximately $\Delta K = K r_t \Delta t$, even if more correctly one should write the differential relation $dK = K r_t\,dt$. Obviously, the instantaneous rate $r_t$ presents the interest intensity at time $t$ independently of the time of maturity of the corresponding loan (investment).
Most models of the term structure of interest rates assume that $r_t$ is a diffusion process of the form (see (10.4))
$$dr_t = a(r_t, t)\,dt + b(r_t, t)\,dW_t. \tag{10.45}$$
The models used in practice have specified forms of drift coefficient a(rt, t) and
diffusion coefficient b(rt, t). Their main outputs are explicit formulas for instanta-
neous interest rate rt and bond price P(rt, t, T ) by means of these coefficients and
consequently an explicit formula for yield curve R(rt, t, T ) according to (10.42). If
we succeed in estimating the chosen model (10.45) for observed financial data
(at time t one observes the bond prices P(t, Tk) with various times of maturity Tk;
see also Remark 10.4), then we can:
• Estimate continuous yield curves using only limited volume of data.
• Perform various simulations (for yield curves, instantaneous interest rates, and
the like).
In practice, one constructs (see, e.g., Baxter and Rennie (1996), Cipra (2010),
Hull (1993)):
• Single-factor interest rate models that include only one interest rate factor rt (e.g.,
models by Vasicek, Cox–Ingersoll–Ross, Hull–White, Ho-Lee, Black–Derman–
Toy, Black–Karasinski, and others).
• Binomial tree models (e.g., models by Rendleman–Bartter, Jarrow-Rudd, and
others).
• Multi-factor interest rate models that include several interest rate factors (e.g.,
models by Brennan–Schwartz, Fong–Vasicek, Longstaff–Schwartz, and others).
1. Vasicek Model
Vasicek model (also Ornstein–Uhlenbeck process or mean-reverting model) is based
on the diffusion equation (10.45) in the form
10.3 Modeling of Term Structure of Interest Rates 265
$$dr_t = \alpha\,(\gamma - r_t)\,dt + b\,dW_t \tag{10.46}$$
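$$R(r_t, t, T) = \gamma + \frac{b q}{\alpha} - \frac{b^2}{2\alpha^2} - \frac{1}{\alpha T}\left[ \left(1 - e^{-\alpha T}\right) \left( \gamma + \frac{b q}{\alpha} - \frac{b^2}{2\alpha^2} - r_t \right) \right] + \frac{b^2}{4\alpha^3 T}\left(1 - e^{-\alpha T}\right)^2, \tag{10.47}$$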
where $q = q(r_t, t)$ is the so-called market price of risk (if no arbitrage opportunities exist, then $q$ does not depend on the time of maturity $T$). The interpretation of $q$ can be shown symbolically (not writing arguments of variables for simplicity). If one writes the stochastic differential equation for $P = P(r_t, t, T)$ symbolically as $dP = \mu P\,dt + \sigma P\,dW$ (applying Ito’s lemma to $P(r_t, t, T)$ and (10.46)), then the market price of risk is defined as $q = (\mu - r)/\sigma$. The equality $\mu - r = q\,\sigma$ can be interpreted in such a way that the expected yield $\mu$ exceeding $r$ compensates the risk $q\,\sigma$.
Moreover, $R(r_t, t, T)$ increases, or reverses from increase to decrease, or decreases if $r_t \le \gamma + b q/\alpha - 3b^2/(4\alpha^2)$, or $\gamma + b q/\alpha - 3b^2/(4\alpha^2) < r_t < \gamma + b q/\alpha$, or $r_t \ge \gamma + b q/\alpha$, respectively.
⋄
2. Model Cox–Ingersoll–Ross
This model denoted briefly as CIR model is based on the diffusion equation (10.45)
in the form
$$dr_t = \alpha\,(\gamma - r_t)\,dt + b\sqrt{r_t}\,dW_t \tag{10.48}$$
[Fig. 10.2: simulated trajectories of the instantaneous interest rate over 250 time steps for the Vasicek and CIR models]
⋄
Example 10.3 Figure 10.2 plots one trajectory of simulations of the instantaneous interest rate $r_t$ using 250 regular time intervals of length $\Delta = 1/250$ (see Remark 10.2) by means of Vasicek model
Both trajectories fluctuate around the level 5%, but the trajectory of model CIR is
much more stable (the trajectory of Vasicek model might sink even to negative rates
in longer simulations).
⋄
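Simulations of this kind can be sketched with a simple Euler discretization of (10.46) and (10.48). The coefficient values below are hypothetical (the ones used for Fig. 10.2 are not reproduced in this passage), and the truncation `max(r, 0)` inside the square root is an implementation choice that merely keeps the CIR diffusion term defined in discrete time.

```python
import math
import random


def simulate_short_rate(model, r0, alpha, gamma, b, delta, n_steps, seed=0):
    """Euler scheme for dr = alpha (gamma - r) dt + diffusion(r) dW."""
    rng = random.Random(seed)
    rates = [r0]
    for _ in range(n_steps):
        r = rates[-1]
        dw = rng.gauss(0.0, math.sqrt(delta))        # Wiener increment ~ N(0, delta)
        if model == "vasicek":
            diffusion = b                            # constant diffusion, (10.46)
        elif model == "cir":
            diffusion = b * math.sqrt(max(r, 0.0))   # square-root diffusion, (10.48)
        else:
            raise ValueError("model must be 'vasicek' or 'cir'")
        rates.append(r + alpha * (gamma - r) * delta + diffusion * dw)
    return rates


# Hypothetical coefficients: mean-reversion level 5%, 250 steps of length 1/250.
params = dict(r0=0.05, alpha=0.5, gamma=0.05, b=0.02, delta=1 / 250, n_steps=250)
vasicek_path = simulate_short_rate("vasicek", **params)
cir_path = simulate_short_rate("cir", **params)
```

With the same `b`, the CIR diffusion term is scaled by $\sqrt{r_t}$, which is small for rates of a few percent; this is one way to see why a CIR trajectory tends to look more stable than a Vasicek trajectory.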
10.4 Exercises
Exercise 10.1 Repeat the simulations analysis from Example 10.3 using different
values of coefficients for Vasicek and CIR model and compare their graphs.
Chapter 11
Value at Risk
The VaR (value at risk) methodology and its modifications are standard measures of risk in practice (e.g., it is one of the most widely used approaches to setting capital requirements when regulating capital adequacy in so-called internal models of banks). More generally, VaR is the key instrument for financial risk management, e.g., by means of commercial systems of the type RiskMetrics. This topic is included in the presented text since some methods of VaR construction make use of the analysis of financial time series.
In general, the financial risk concerns potential price changes of financial assets,
where the corresponding price change (expressed mainly as the rate of return; see
Remark 6.20) is looked upon as a random event. If the financial risk is measured as
the variance or standard deviation of (log) returns in the form of random process,
then it is usually called (conditional) volatility (see Chap. 8).
Moreover, the financial risk can be classified into several categories, mainly:
1. Market risk is the risk of loss due to changes (variations) of market prices
(of securities, commodities, and others) or market rates (interest rates, rates of
exchange, and others). Accordingly, it can be sorted to more specific risk sub-
categories, e.g.:
• Interest risk
• Currency risk
• Stock risk
• Commodity risk
• Credit spread risk (i.e., the risk of loss due to changes in differences between
the yields of various debt instruments)
• Correlation risk (i.e., the risk of loss due to changes in traditional correlations
between considered risk categories, e.g., between stocks and bonds) and
others.
2. Credit risk is the risk that the creditor (lender) may not receive promised
repayments on outstanding investments (such as loans, credits, bonds) because
Risk can be measured and quantified by means of various ways. In some cases (e.g.,
in various regulatory systems for banks), one prefers deterministic instruments for
this purpose, e.g., stress tests constructed in accordance with prescribed instructions
without any portion of stochasticity.
In this text, we deal only with stochastic risk measures that respect the random
character of potential losses (or profits). Let random variable X represent loss (if X is
positive) or profit (if X is negative) accumulated during the given holding period
(moreover, X is usually observed in time, i.e., in the form of a time series; see Sect.
11.2). Then a risk measure ρ can be defined as a mapping that assigns real values
ρ(X) to the random variables X. In particular, a risk measure is called coherent if it
possesses the following properties (for bounded random variables X and Y denoting
losses in the same financial environment):
(i) Subadditivity: $\rho(X + Y) \le \rho(X) + \rho(Y)$
(ii) Monotony: if $X \le Y$, then $\rho(X) \le \rho(Y)$
(iii) Positive homogeneity: $\rho(\lambda X) = \lambda \rho(X)$ for arbitrary constant $\lambda > 0$
(iv) Translation invariancy: $\rho(X + a) = \rho(X) + a$ for arbitrary constant $a > 0$.
11.1.1 VaR
The methodology VaR (value at risk) is based on an estimate of the worst loss that
can occur with a given probability (confidence) in a given future period (alternatively
one can say that with a prescribed confidence α, e.g., 95%, there cannot occur a loss
that is higher than VaR). For example in the context of capital requirements or capital
adequacy of banks, VaR represents the smallest capital amount that guarantees the
bank solvency with a given confidence. VaR is specified by the following factors:
• Holding period is the period in which a potential loss can occur. Accordingly, the
used terms may be the daily VaR (over one business day, e.g., in RiskMetrics) or
the 10 days VaR (over two calendar weeks with 10 business days, e.g., according
11.1 Financial Risk Measures 269
[Fig. 11.1: probability density of the daily loss X (million EUR), with E(X) = −0.88, VaR95% = 2.75, and ES = 4.26 marked on the horizontal axis]
where the random variable $X$ denotes the loss (of course, a negative loss means profit) accumulated during the given holding period (e.g., during one trading day), $\alpha$ is the corresponding confidence level (e.g., 95% for $\alpha = 0.95$), and $F_X(x) = P(X \le x)$ is the probability distribution function of $X$. Expressed in statistical terms, $VaR_\alpha(X)$ is the $\alpha$-quantile $q_\alpha$ of the random variable $X$. Moreover, if the probability distribution function $F_X(\cdot)$ is increasing and continuous, then it holds simply
$$VaR_\alpha(X) = F_X^{-1}(\alpha) = q_\alpha, \quad \text{i.e.,} \quad P(X \le VaR_\alpha(X)) = \alpha. \tag{11.2}$$
Figure 11.1 plots $VaR_{95\%}$ for a daily loss $X$ with a given probability density. This random variable $X$ has the mean value −0.88 million euros (i.e., one can expect a profit on average) and is skewed to the right (i.e., potential losses are not negligible). At the 95% confidence level, the daily value at risk reaches the relatively high level of 2.75 million euros (i.e., ceteris paribus one may expect a loss of at least 2.75 million euros on one trading day in twenty). The drawback of this risk measure is that it gives no information on possible losses higher than $VaR_{95\%}$ (in contrast to the expected shortfall ES = 4.26 million euros; see Fig. 11.1 and Sect. 11.1.2).
In addition to the “absolute” VaR, one sometimes also applies the relative value at risk, which is related to the mean value $E(X)$, namely
$$VaR_\alpha^{rel}(X) = VaR_\alpha(X) - E(X). \tag{11.3}$$
For example, the relative $VaR^{rel}$ in Fig. 11.1 is 2.75 − (−0.88) = 3.63 million euros as the “distance of the absolute VaR from the mean loss.”
Remark 11.1 In the class of basic parametric distributions, it is possible to write
analytic formulas for VaR directly as the corresponding quantiles qα, e.g.:
1. For normal distribution X ~ N(μ, σ 2):
(in this case, the probability density of X with a suitable configuration of param-
eters looks similarly as in Fig. 11.1).
3. For exponential distribution $X \sim Exp(\lambda)$, i.e., $F_X(x) = 1 - \exp(-\lambda x)$ for $x \ge 0$:
(this distribution is applicable, e.g., in the case when no profit with negative X is
possible).
⋄
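For parametric distributions, VaR is simply the $\alpha$-quantile obtained by inverting the distribution function as in (11.2). The sketch below does this for the normal and the exponential case; the closed forms $\mu + \sigma\,\Phi^{-1}(\alpha)$ and $-\ln(1-\alpha)/\lambda$ used here are the standard quantile expressions for these distributions, stated as an assumption because the displayed formulas are not reproduced in this passage.

```python
import math
from statistics import NormalDist


def var_normal(mu, sigma, alpha):
    """alpha-quantile of N(mu, sigma^2), i.e. mu + sigma * Phi^{-1}(alpha)."""
    return NormalDist(mu, sigma).inv_cdf(alpha)


def var_exponential(lam, alpha):
    """alpha-quantile of Exp(lam): solve 1 - exp(-lam * x) = alpha for x."""
    return -math.log(1.0 - alpha) / lam


# Illustrative parameter values only.
var_n = var_normal(mu=0.0, sigma=1.0, alpha=0.95)
var_e = var_exponential(lam=2.0, alpha=0.95)
```

For the standard normal loss the 95% quantile is the familiar value of roughly 1.645.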
Here we shall give a brief survey of other types of risk measures that are applied in
financial practice:
1. Deviation Risk Measures
Deviation risk measures regard the risk as fluctuations around a given value (usually
around the mean value E(X) which is interpreted as the average loss). The main
representatives (used, e.g., in risk management) are:
• Standard deviation:
$$\sigma(X) = \sqrt{E\left(X - E(X)\right)^2} = \sqrt{E\left(X^2\right) - \left(E(X)\right)^2} \tag{11.7}$$
(σ(X) used by Markowitz in his theory of portfolio is a very popular risk measure due
to its simplicity; on the other hand, it has some drawbacks, namely (i) it is applicable
only when the second moment of loss X exists and (ii) it does not distinguish positive
and negative deviations around E(X) so that it cannot be recommended for asym-
metric and skewed loss distributions).
• Variance:
$$\mathrm{var}(X) = \sigma^2(X) = E\left(X^2\right) - \left(E(X)\right)^2 \tag{11.8}$$
(var(X) is used as the measure of volatility in models of the type ARCH for financial
time series in Sect. 8.3).
• One-sided standard deviations:
$$\sigma_+(X) = \sqrt{E\left(\max\{X - E(X),\ 0\}\right)^2}, \qquad \sigma_-(X) = \sqrt{E\left(\min\{X - E(X),\ 0\}\right)^2} \tag{11.9}$$
(in contrast to the two-sided standard deviation, $\sigma_+(X)$ and $\sigma_-(X)$ measure only positive or negative deviations from the mean value, respectively).
• Variance coefficient:
$$v(X) = \frac{\sigma(X)}{|E(X)|} \cdot 100\%. \tag{11.10}$$
$$ES_\alpha = \frac{1}{1-\alpha}\int_\alpha^1 VaR_u\,du, \tag{11.13}$$
Obviously, instead of fixing a particular confidence level $\alpha$, we average $VaR_u$ over all levels $u \ge \alpha$ and thus look further into the tail of the loss distribution (it always holds that $ES_\alpha \ge VaR_\alpha$; see Fig. 11.1). In any case, the expected shortfall is a coherent risk measure.
Remark 11.2 For a continuous loss distribution, an even more intuitive expression for $ES_\alpha$ in (11.13) is possible, namely
$$ES_\alpha = E(X \mid X \ge VaR_\alpha), \tag{11.14}$$
which shows that $ES_\alpha$ can also be interpreted as the expected loss that is incurred in the case that $VaR_\alpha$ is exceeded (see, e.g., McNeil et al. (2005)). In general (i.e., including discrete loss distributions), one defines
$$CVaR_\alpha = E(X \mid X \ge VaR_\alpha) = \frac{1}{1-\alpha}\,E\left(X \cdot I_{[X \ge VaR_\alpha]}\right) \tag{11.15}$$
as the conditional value at risk CVaR (or sometimes also tail conditional expectation TVaR).
One can see that the difference between (11.13) and (11.14) consists in the lower bound for averaging the worst losses:
• In (11.13) one averages over the worst scenarios that occur with probability $1 - \alpha$.
• In (11.14) one averages over the worst losses which are not lower than $VaR_\alpha$.
Table 11.1 Probability distribution of losses expected in investment portfolios A and B during next year

Portfolio A                                 Portfolio B
Loss (million euros)   Probability (%)      Loss (million euros)   Probability (%)
  20                      4                   30                      2
  10                      3                   10                     98
−100                     93                    –                      –
⋄
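Remark 11.3 Similarly as in Remark 11.1, one can derive analytic formulas for $ES = ES_\alpha$ under some parametric loss distributions, e.g.:
1. For normal distribution $X \sim N(\mu, \sigma^2)$:
$$ES = \mu + \sigma\,\frac{\varphi\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}, \qquad ES^{rel} = \sigma\,\frac{\varphi\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}, \tag{11.16}$$
where $\varphi$ and $\Phi$ are the probability density and the distribution function of the standard normal distribution $N(0, 1)$, respectively.
2. For log-normal distribution $X \sim LN(\mu, \sigma^2)$:
$$ES = \exp\left(\mu + \sigma^2/2\right)\frac{\Phi\left(\sigma - \Phi^{-1}(\alpha)\right)}{1-\alpha}, \qquad ES^{rel} = \exp\left(\mu + \sigma^2/2\right)\frac{\alpha - \Phi\left(\Phi^{-1}(\alpha) - \sigma\right)}{1-\alpha}. \tag{11.17}$$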
⋄
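The normal formula in (11.16) can be cross-checked against the definition (11.13) by averaging the quantiles $VaR_u$ numerically over $u \in [\alpha, 1]$. The sketch below does both; the midpoint rule and the number of grid points are arbitrary illustrative choices.

```python
from statistics import NormalDist


def es_normal(mu, sigma, alpha):
    """Closed form (11.16): mu + sigma * phi(Phi^{-1}(alpha)) / (1 - alpha)."""
    std = NormalDist()
    z = std.inv_cdf(alpha)
    return mu + sigma * std.pdf(z) / (1.0 - alpha)


def es_by_averaging(mu, sigma, alpha, n=20_000):
    """Definition (11.13): average of VaR_u over u in [alpha, 1] (midpoint rule)."""
    dist = NormalDist(mu, sigma)
    width = (1.0 - alpha) / n
    total = sum(dist.inv_cdf(alpha + (i + 0.5) * width) for i in range(n))
    return total / n


es_closed = es_normal(mu=0.0, sigma=1.0, alpha=0.95)
es_numeric = es_by_averaging(mu=0.0, sigma=1.0, alpha=0.95)
```

For the standard normal loss at the 95% level both values are close to 2.06, noticeably above the corresponding quantile of about 1.645, in line with $ES_\alpha \ge VaR_\alpha$.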
3. Distorted Risk Measures
Distorted risk measures originate by artificially “distorting” the distribution function
of loss: the expected value of loss after this adjustment is the risk measure result. The
motivation of distortion consists in the fact that in specific situations the risk
measures of the type VaR and ES do not distinguish the risk in an acceptable way.
For example, let us have choice between two investment portfolios whose stochastic
behavior is described in Table 11.1.
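Then it holds
• in the portfolio A:
$$VaR_{0.95} = 10 \text{ million euros}, \qquad ES_{0.95} = \frac{4}{5}\cdot 20 + \frac{1}{5}\cdot 10 = 18 \text{ million euros};$$
• in the portfolio B:
$$VaR_{0.95} = 10 \text{ million euros}, \qquad ES_{0.95} = \frac{2}{5}\cdot 30 + \frac{3}{5}\cdot 10 = 18 \text{ million euros}.$$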
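These values can be reproduced mechanically from Table 11.1. The sketch below takes VaR as the smallest loss whose cumulative probability reaches $\alpha$ and ES as the average over the worst scenarios with total probability $1 - \alpha$, as in (11.13); probabilities are kept in whole percent to avoid rounding issues, and the function name is illustrative.

```python
def var_es_discrete(outcomes, alpha_pct):
    """outcomes: list of (loss, probability in percent); alpha_pct: level in percent."""
    # VaR: smallest loss x with F(x) >= alpha.
    cumulative = 0
    var = None
    for loss, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= alpha_pct:
            var = loss
            break
    # ES: average of the worst scenarios carrying total probability 100 - alpha.
    tail = 100 - alpha_pct
    remaining = tail
    weighted_sum = 0.0
    for loss, prob in sorted(outcomes, reverse=True):
        weight = min(prob, remaining)
        weighted_sum += loss * weight
        remaining -= weight
        if remaining == 0:
            break
    return var, weighted_sum / tail


portfolio_a = [(20, 4), (10, 3), (-100, 93)]   # losses in million euros, Table 11.1
portfolio_b = [(30, 2), (10, 98)]
result_a = var_es_discrete(portfolio_a, 95)    # (10, 18.0)
result_b = var_es_discrete(portfolio_b, 95)    # (10, 18.0)
```

Both portfolios yield the same pair of values, which is exactly the situation that motivates the distortion of risk measures.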
Even though both portfolios have the same risk values $VaR_{0.95}$ and $ES_{0.95}$, each investor in practice would prefer to invest in the portfolio A (while the portfolio B is always in loss with a maximum possible loss of 30 million euros, the portfolio A is in loss of 10 or 20 million euros only with relatively small probabilities and otherwise it is highly profitable). This reasonable decision can be confirmed by means of risk measures only if we distort them in a suitable way.
In general, let $X$ be a loss with the distribution function $F(x) = P(X \le x)$ and the finite mean value $E(X)$. Then the distorted risk measure $E_g(X)$ of the loss $X$ is defined as
$$E_g(X) = -\int_{-\infty}^{0} F_g(x)\,dx + \int_{0}^{\infty} \left(1 - F_g(x)\right) dx, \tag{11.18}$$
where
$$F_g(x) = g(F(x)) \tag{11.19}$$
⋄
Table 11.2 Values of Wang distorted risk measure for portfolios A and B from Table 11.1

        Wang distorted risk measure (million euros) for
  λ     Portfolio A     Portfolio B
 −3        11.93           24.84
 −2       −17.02           14.36
 −1       −62.85            4.38
  0       −91.90            0.60
  1       −99.24            0.03
  2       −99.97            0.00
  3      −100.00            0.00
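For a discrete loss distribution, the integral definition (11.18) reduces to the mean of the distribution with the distorted distribution function $F_g$. The sketch below evaluates it for portfolio A of Table 11.1, assuming that the distortion is the Wang transform in the form $g(u) = \Phi(\Phi^{-1}(u) + \lambda)$ applied to $F$ as in (11.19); this functional form and its sign convention are assumptions, since the defining formula is not reproduced in this passage.

```python
from statistics import NormalDist


def wang_distorted_mean(outcomes, lam):
    """Mean of a discrete loss under F_g = g(F), g(u) = Phi(Phi^{-1}(u) + lam).

    outcomes: list of (loss, probability); g is an assumed Wang transform.
    """
    std = NormalDist()
    ordered = sorted(outcomes)
    total = 0.0
    cumulative = 0.0     # original distribution function F at the current loss
    previous_fg = 0.0    # distorted distribution function at the previous loss
    for i, (loss, prob) in enumerate(ordered):
        cumulative += prob
        if i == len(ordered) - 1:
            fg = 1.0     # g(1) = 1 at the largest loss
        else:
            fg = std.cdf(std.inv_cdf(cumulative) + lam)
        total += loss * (fg - previous_fg)   # distorted probability of this loss
        previous_fg = fg
    return total


portfolio_a = [(-100, 0.93), (10, 0.03), (20, 0.04)]   # losses in million euros
undistorted = wang_distorted_mean(portfolio_a, lam=0.0)
risk_loaded = wang_distorted_mean(portfolio_a, lam=-3.0)
```

With $\lambda = 0$ no distortion takes place and the result is the plain mean loss of portfolio A, −91.90 million euros; under the assumed sign convention, negative $\lambda$ shifts probability toward the larger losses.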
$$M_\psi = \int_0^1 \psi(u)\,VaR_u\,du, \tag{11.21}$$
$$\int_0^1 \psi(u)\,du = 1. \tag{11.22}$$
One of the main drawbacks of VaR is the fact that different methods of its numerical
calculation (or estimation or prediction) may deliver in practice (substantially)
different results. In this section, we shall describe several methods for the calculation
of VaR used in practice which are based on data in the form of time series. Some of
them can be classified as parametric methods and others as nonparametric or
combined ones.
1. Variance-Covariance Method
This method is frequent in practice among various parametric approaches calculating
VaR. It offers a direct analytical solution of given problem, but it is based on
simplifying assumptions which need not be fulfilled in practice (even in routine
situations) and must be taken as drawbacks of this method, namely
(i) The loss $X_t$ at time $t$ originates as an aggregate of component losses from $m$ risk sources.
(ii) The probability distribution of this aggregate loss $X_t$ can be approximated as
$$X_t \sim N\left( \mu_1 + \cdots + \mu_m,\ \sigma_1^2 + \cdots + \sigma_m^2 + 2\sigma_{12} + \cdots + 2\sigma_{m-1,m} \right), \tag{11.23}$$
where $\mu_1, \ldots, \mu_m$ are mean values, $\sigma_1^2, \ldots, \sigma_m^2$ are variances, and $\sigma_{12}, \sigma_{13}, \ldots, \sigma_{m-1,m}$ are covariances of the component losses (more generally, these moments can vary in time, but must be estimable from data).
The most usual practical situation for application of this method is the prediction
of VaR in a portfolio composed of various investment or credit instruments, for
which the risk of possible losses must be evaluated or even controlled by management (see Example 11.1). If one has the data $x_t, x_{t-1}, x_{t-2}, \ldots$ up to time
11.2 Calculation of VaR 277
t (i.e., component losses in the form of m-variate time series observed till time t),
then, e.g., one can predict VaR for next time t + k (e.g., for the next trading day),
which may be prescribed by regulators of various financial institution (for banks by
Basel III or for insurance companies by Solvency II).
Remark 11.6 In practice, the financial time series $x_t$ can often be modeled using methods of multivariate volatility modeling and predicting (see Chap. 13, or (8.62) for the univariate case). Moreover, one can model (log) returns $r_t$ instead of absolute losses $X_t$ (negative values of $r_t$ can be interpreted as relative losses), namely $r_t = \sum_{i=1}^{m} c_{ti}\,r_{ti}$ with moments of the form
$$E(r_t) = \sum_{i=1}^{m} c_{ti}\,E(r_{ti}), \qquad \mathrm{var}(r_t) = \sum_{i=1}^{m}\sum_{j=1}^{m} c_{ti}\,c_{tj}\,\mathrm{cov}\left(r_{ti}, r_{tj}\right), \tag{11.24}$$
where $r_{ti}$ and $c_{ti}$ denote (log) returns and portfolio weights of the $i$th risk component at time $t$, respectively. Then the formulas of the type $VaR^{rel}_{95\%} = 1.645\,\hat{\sigma}_{t+1}(t)$ can be generalized to the multivariate case. The variances and covariances in (11.23) and (11.24) explain the name of this method (sometimes they are not estimated from the analyzed data but taken from various published databases).
⋄
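The moment formulas (11.24) and the resulting normal-approximation VaR can be sketched as follows. The weights, means, and covariance matrix in the usage lines are made-up illustrative numbers, and the multiplier 1.645 is the 95% standard normal quantile quoted in the remark above.

```python
import math


def portfolio_moments(weights, means, cov):
    """Mean and variance of the weighted sum of components, following (11.24)."""
    m = len(weights)
    mean = sum(weights[i] * means[i] for i in range(m))
    variance = sum(
        weights[i] * weights[j] * cov[i][j] for i in range(m) for j in range(m)
    )
    return mean, variance


def relative_var_95(weights, cov):
    """Relative 95% VaR under normality: 1.645 times the portfolio std deviation."""
    _, variance = portfolio_moments(weights, [0.0] * len(weights), cov)
    return 1.645 * math.sqrt(variance)


# Hypothetical two-component portfolio.
weights = [0.5, 0.5]
means = [0.01, 0.02]
cov = [[0.04, 0.01],
       [0.01, 0.09]]
mean, variance = portfolio_moments(weights, means, cov)
var_rel = relative_var_95(weights, cov)
```

The double sum makes explicit that every covariance enters twice, which is the origin of the terms $2\sigma_{ij}$ in (11.23).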
Example 11.1 (Calculation of VaR by variance-covariance method). The calcula-
tion of VaR by various methods is demonstrated using a real investment portfolio
composed of three investment instruments (the Czech Republic in 2013):
• 1000 pieces of the Czech government bonds 3.40/15 (i.e., the face value 10,000
CZK, the annual coupons 340 CZK paid on September 1, 2013, on September
1, 2014, and finally on the maturity date of September 1, 2015; see Fig. 11.4).
• 1 million pieces of the stocks of electricity operator ČEZ (the dividend 40 CZK
for each stock paid out in 2013 on June 25; see Fig. 11.5).
• 10 million euros (the deposit priced in CZK using the actual exchange rates EUR/
CZK; see Fig. 11.6).
Table 11.3 and Fig. 11.7 present the development of daily portfolio loss in the
year 2013 (negative values mean profits). For example, the loss for the first trading
day January 2, 2013, is calculated as
(the stock dividend and the coupon payment were included in such a way that on June 25, 2013, the price of the portfolio was increased by the dividend income of 1,000,000 × 40 = 40 million CZK and on September 2, 2013, by the coupon income of 1000 × 340 = 0.340 million CZK, but on the next trading day these incomes are transferred to another account and are no longer included in the price of the portfolio).
Fig. 11.4 Price of the Czech government bond 3.40/15 in the year 2013 from Example 11.1.
Source: kurzy.cz (https://akcie-cz.kurzy.cz/emise/dluhopisy/statni-dluhopisy/2010/)
Fig. 11.5 Price of the stock of electricity operator ČEZ in the year 2013 from Example 11.1.
Source: kurzy.cz (https://prague-stock.kurzy.cz/akcie/cez-183/graf_2013)
Histogram of portfolio loss in Fig. 11.8 indicates (at least graphically) that the
assumption of normality is realistic with negative values denoting profits.
Table 11.4 contains the sample means and the sample covariance matrix of
portfolio components (bonds, stocks, euro deposit) which are necessary for the
variance-covariance method. Hence one easily calculates by means of (11.23) that
Fig. 11.6 Exchange rates EUR/CZK in the year 2013 from Example 11.1. Source: EUROSTAT
(https://ec.europa.eu/eurostat/data/database)
$$X_t \sim N\left(0.670;\ 8.595^2\right)$$
These values can be interpreted either as risk characteristics of the given portfolio
during year 2013 or as VaR predictions for the first trading day of year 2014.
⋄
2. Method of Historical Simulation
This method is evidently the most popular nonparametric approach to the calculation
of VaR in practice, since it is very simple. It entirely ignores the problem of the
probability distribution or the correlation structure among the component losses of
the portfolio and simply assumes that the character of losses in previous periods
(e.g., in the trading days of previous years) will persist in a future period. In other
words, this method is based on losses that are simulated by the “history.” In this
context, one usually applies the estimate (11.25), which is typically used to construct
empirical distribution functions.
Table 11.3 Daily portfolio loss in the year 2013 from Example 11.1 (see also Fig. 11.7); negative losses denote profits

| Trading day | Date | Price of bond 3.40/15 (%) | Price of stock ČEZ (CZK) | Exchange rate EUR/CZK (CZK) | Price of portfolio (CZK) | Loss (million CZK) |
|---|---|---|---|---|---|---|
| 0 | 28.12.2012 | 108.16 | 680.00 | 25.140 | 2,013,000,000 | – |
| 1 | 2.1.2013 | 108.16 | 680.20 | 25.225 | 2,014,050,000 | −1.050 |
| 2 | 3.1.2013 | 108.16 | 675.00 | 25.260 | 2,009,200,000 | 4.850 |
| 3 | 4.1.2013 | 108.16 | 680.00 | 25.355 | 2,015,150,000 | −5.950 |
| 4 | 7.1.2013 | 108.16 | 658.50 | 25.535 | 1,995,450,000 | 19.700 |
| 5 | 8.1.2013 | 108.16 | 663.50 | 25.580 | 2,000,900,000 | −5.450 |
| 6 | 9.1.2013 | 108.16 | 673.50 | 25.530 | 2,010,400,000 | −9.500 |
| 7 | 10.1.2013 | 108.16 | 661.50 | 25.630 | 1,999,400,000 | 11.000 |
| 8 | 11.1.2013 | 108.16 | 655.10 | 25.615 | 1,992,850,000 | 6.550 |
| 9 | 14.1.2013 | 108.16 | 644.90 | 25.615 | 1,982,650,000 | 10.200 |
| 10 | 15.1.2013 | 108.16 | 644.00 | 25.610 | 1,981,700,000 | 0.950 |
| 11 | 16.1.2013 | 108.16 | 648.50 | 25.580 | 1,985,900,000 | −4.200 |
| 12 | 17.1.2013 | 107.75 | 651.70 | 25.540 | 1,984,600,000 | 1.300 |
| 13 | 18.1.2013 | 107.75 | 652.90 | 25.630 | 1,986,700,000 | −2.100 |
| 14 | 21.1.2013 | 107.75 | 647.10 | 25.625 | 1,980,850,000 | 5.850 |
| 15 | 22.1.2013 | 107.75 | 648.00 | 25.610 | 1,981,600,000 | −0.750 |
| 16 | 23.1.2013 | 107.75 | 643.00 | 25.600 | 1,976,500,000 | 5.100 |
| 17 | 24.1.2013 | 107.75 | 622.00 | 25.595 | 1,955,450,000 | 21.050 |
| 18 | 25.1.2013 | 107.75 | 617.00 | 25.605 | 1,950,550,000 | 4.900 |
| 19 | 28.1.2013 | 107.75 | 613.00 | 25.700 | 1,947,500,000 | 3.050 |
| 20 | 29.1.2013 | 107.75 | 615.00 | 25.660 | 1,949,100,000 | −1.600 |
| 21 | 30.1.2013 | 107.75 | 615.00 | 25.660 | 1,949,100,000 | 0.000 |
| 22 | 31.1.2013 | 107.42 | 612.10 | 25.620 | 1,942,500,000 | 6.600 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 230 | 26.11.2013 | 105.30 | 540.10 | 27.330 | 1,866,400,000 | 9.600 |
| 231 | 27.11.2013 | 105.30 | 553.00 | 27.340 | 1,879,400,000 | −13.000 |
| 232 | 28.11.2013 | 105.30 | 555.00 | 27.350 | 1,881,500,000 | −2.100 |
| 233 | 29.11.2013 | 105.30 | 559.00 | 27.390 | 1,885,900,000 | −4.400 |
| 234 | 2.12.2013 | 105.30 | 560.00 | 27.405 | 1,887,050,000 | −1.150 |
| 235 | 3.12.2013 | 105.30 | 545.00 | 27.460 | 1,872,600,000 | 14.450 |
| 236 | 4.12.2013 | 105.30 | 540.80 | 27.455 | 1,868,350,000 | 4.250 |
| 237 | 5.12.2013 | 105.30 | 530.00 | 27.450 | 1,857,500,000 | 10.850 |
| 238 | 6.12.2013 | 105.30 | 525.50 | 27.490 | 1,853,400,000 | 4.100 |
| 239 | 9.12.2013 | 105.30 | 533.00 | 27.500 | 1,861,000,000 | −7.600 |
| 240 | 10.12.2013 | 105.30 | 533.70 | 27.450 | 1,861,200,000 | −0.200 |
| 241 | 11.12.2013 | 105.30 | 532.90 | 27.435 | 1,860,250,000 | 0.950 |
| 242 | 12.12.2013 | 105.30 | 520.00 | 27.480 | 1,847,800,000 | 12.450 |
| 243 | 13.12.2013 | 105.30 | 514.40 | 27.535 | 1,842,750,000 | 5.050 |
| 244 | 16.12.2013 | 105.30 | 513.90 | 27.595 | 1,842,850,000 | −0.100 |
| 245 | 17.12.2013 | 105.30 | 514.80 | 27.655 | 1,844,350,000 | −1.500 |
(continued)
Fig. 11.7 Daily portfolio loss in the year 2013 from Example 11.1 (vertical axis: portfolio loss in million CZK, negative values mean profits; horizontal axis: trading days of 2013)
$$P(X > x) = \frac{1}{T} \sum_{t=1}^{T} I_{[x_t > x]}, \tag{11.25}$$
where x1, x2, . . ., xT are the losses observed during a period of length T (e.g.,
T trading days) and I denotes the indicator function.
Fig. 11.8 Histogram of portfolio loss in the year 2013 from Example 11.1 (horizontal axis: portfolio loss in million CZK; vertical axis: loss frequency)

Table 11.4 Variance-covariance method from Example 11.1: sample means and sample covariance matrix of portfolio components (bonds, stocks, and euro deposit)

| | Bonds | Stocks | Euros |
|---|---|---|---|
| Sample mean | 0.113 | 0.647 | −0.091 |
| Sample variance | 0.634 | 75.831 | 0.919 |
| Sample standard deviation | 0.796 | 8.708 | 0.959 |
| Sample covariance matrix: Bonds | 0.634 | −1.358 | −0.054 |
| Sample covariance matrix: Stocks | −1.358 | 75.831 | −0.342 |
| Sample covariance matrix: Euros | −0.054 | −0.342 | 0.919 |

Example 11.2 (Calculation of VaR by the method of historical simulation). Let us
consider the portfolio from Example 11.1 composed of government bonds, stocks of
ČEZ, and a euro deposit. Table 11.5 presents the 15 highest daily losses. The value at
risk of this portfolio with confidence level 95% can be found according to (11.25):
since 12/252 = 4.76% (the twelfth highest daily loss is 13.450 million CZK) and
13/252 = 5.16% (the thirteenth highest daily loss is 13.250 million CZK), VaR95%
follows approximately by interpolation between these two losses (according to
Table 11.5 with the daily losses during 2013 ordered from the highest, 40.700 million
CZK, to the lowest one).
This value at risk is significantly lower than the value 14.8 million CZK from
Example 11.1.
⋄
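The estimate (11.25) and the interpolation used in Example 11.2 can be sketched as follows. This is only an illustration under stated assumptions: Table 11.5 is not reproduced here, so the sample below is a placeholder that merely has the twelfth and thirteenth highest losses equal to 13.450 and 13.250 among 252 values; the function names are hypothetical.

```python
def exceedance_probability(losses, x):
    """Empirical estimate (11.25) of P(X > x): share of observed losses above x."""
    return sum(1 for loss in losses if loss > x) / len(losses)


def historical_var(losses, alpha=0.05):
    """Historical-simulation VaR at confidence level 1 - alpha.

    Interpolates linearly between the i-th and (i+1)-th highest losses,
    where i/T <= alpha < (i+1)/T, as in Example 11.2.
    """
    ordered = sorted(losses, reverse=True)
    T = len(ordered)
    i = int(alpha * T)
    if i == 0:
        return ordered[0]
    if i >= T:
        return ordered[-1]
    weight = alpha * T - i  # relative position of alpha between i/T and (i+1)/T
    return ordered[i - 1] + weight * (ordered[i] - ordered[i - 1])


# Placeholder sample (NOT the data of Table 11.5): only the 12th and 13th
# highest losses are taken from Example 11.2, the rest is filler.
losses = [20.0] * 11 + [13.450, 13.250] + [0.0] * 239

var_95 = historical_var(losses, alpha=0.05)
print(f"VaR95% = {var_95:.2f} million CZK")
print(f"P(X > 13.3) = {exceedance_probability(losses, 13.3):.4f}")
```

With these two order statistics the interpolation gives about 13.33 million CZK, which is indeed below the 14.8 million CZK of the variance-covariance method.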
3. Various Modifications of the Method of Historical Simulation
The method of historical simulation described above in its basic form can be
modified in various ways:
(a) Method simulating previous development:
This method simulates additional data by applying the historical growth rates
between neighboring observations to the current prices (see Example 11.3). In the
first simulation, one obtains
$$105.30 \cdot \frac{108.16}{108.16} = 105.30\,\%, \qquad 517.00 \cdot \frac{680.20}{680.00} = 517.15\ \text{CZK}, \qquad 27.425 \cdot \frac{25.225}{25.140} = 27.518\ \text{CZK},$$
i.e., the prices from the reference row of December 31, 2013, are multiplied by the
growth rates between the neighboring trading days December 28, 2012, and January 2,
2013; in the second simulation, the prices from the reference row of December 31,
2013, are multiplied by the growth rates between the neighboring trading days
January 2, 2013, and January 3, 2013; and so on. Each simulation thus presents one
possible change of the reference row to the neighboring date January 2, 2014 (it is
the first trading day of year 2014, for which VaR is predicted in this example).
In this way, one obtains 252 simulations in Table 11.6 including the corresponding
losses. For example, the loss (i.e., the profit with negative sign) generated in the
first simulation is −1.079 million CZK.
Table 11.6 Method from Example 11.3 simulating previous development: 252 simulated losses (the first row contains values for the reference date); negative losses denote profits

| Order number of simulation | Price of bond 3.40/15 (%) | Price of stock ČEZ (CZK) | Exchange rate EUR/CZK (CZK) | Price of portfolio (CZK) | Loss (million CZK) |
|---|---|---|---|---|---|
| 31.12.2013 | 105.30 | 517.00 | 27.425 | 1,844,250,000 | |
| 1 | 105.30 | 517.15 | 27.518 | 1,845,329,316 | −1.079 |
| 2 | 105.30 | 513.05 | 27.463 | 1,840,678,158 | 3.572 |
| 3 | 105.30 | 520.83 | 27.528 | 1,849,111,053 | −4.861 |
| 4 | 105.30 | 500.65 | 27.620 | 1,829,850,630 | 14.399 |
| 5 | 105.30 | 520.93 | 27.473 | 1,848,658,896 | −4.409 |
| 6 | 105.30 | 524.79 | 27.371 | 1,851,505,949 | −7.256 |
| 7 | 105.30 | 507.79 | 27.532 | 1,836,112,645 | 8.137 |
| 8 | 105.30 | 512.00 | 27.409 | 1,839,087,530 | 5.162 |
| 9 | 105.30 | 508.95 | 27.425 | 1,836,200,237 | 8.050 |
| 10 | 105.30 | 516.28 | 27.420 | 1,843,474,960 | 0.775 |
| 11 | 105.30 | 520.61 | 27.393 | 1,847,541,316 | −3.291 |
| 12 | 104.90 | 519.55 | 27.382 | 1,842,380,681 | 1.869 |
| 13 | 105.30 | 517.95 | 27.522 | 1,846,168,397 | −1.918 |
| 14 | 105.30 | 512.41 | 27.420 | 1,839,603,758 | 4.646 |
| 15 | 105.30 | 517.72 | 27.409 | 1,844,808,518 | −0.559 |
| 16 | 105.30 | 513.01 | 27.414 | 1,840,153,715 | 4.096 |
| 17 | 105.30 | 500.12 | 27.420 | 1,827,311,521 | 16.938 |
| 18 | 105.30 | 512.84 | 27.436 | 1,840,201,201 | 4.049 |
| 19 | 105.30 | 513.65 | 27.527 | 1,841,915,824 | 2.334 |
| 20 | 105.30 | 518.69 | 27.382 | 1,845,509,938 | −1.260 |
| 21 | 105.30 | 517.00 | 27.425 | 1,844,250,000 | 0.000 |
| 22 | 104.98 | 514.56 | 27.382 | 1,838,159,635 | 6.090 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 230 | 105.30 | 507.42 | 27.485 | 1,835,270,637 | 8.979 |
| 231 | 105.30 | 529.35 | 27.435 | 1,856,698,616 | −12.449 |
| 232 | 105.30 | 518.87 | 27.435 | 1,846,220,112 | −1.970 |
| 233 | 105.30 | 520.73 | 27.465 | 1,848,377,223 | −4.127 |
| 234 | 105.30 | 517.92 | 27.440 | 1,845,325,058 | −1.075 |
| 235 | 105.30 | 503.15 | 27.480 | 1,830,952,187 | 13.298 |
| 236 | 105.30 | 513.02 | 27.420 | 1,840,215,844 | 4.034 |
| 237 | 105.30 | 506.68 | 27.420 | 1,833,875,350 | 10.375 |
| 238 | 105.30 | 512.61 | 27.465 | 1,840,260,013 | 3.990 |
| 239 | 105.30 | 524.38 | 27.435 | 1,851,728,451 | −7.478 |
| 240 | 105.30 | 517.68 | 27.375 | 1,844,430,351 | −0.180 |
| 241 | 105.30 | 516.23 | 27.410 | 1,843,325,169 | 0.925 |
| 242 | 105.30 | 504.48 | 27.470 | 1,832,184,730 | 12.065 |
| 243 | 105.30 | 511.43 | 27.480 | 1,839,231,207 | 5.019 |
| 244 | 105.30 | 516.50 | 27.485 | 1,844,345,076 | −0.095 |
| 245 | 105.30 | 517.91 | 27.485 | 1,845,751,733 | −1.502 |
(continued)
In detail, the first simulation gives
$$-(1{,}845{,}329{,}316 - 1{,}844{,}250{,}000) = -1{,}079{,}316\ \text{CZK} = -1.079\ \text{million CZK},$$
and similarly for further simulations. Table 11.7 contains such daily losses for each
of the 252 simulations in descending order. Hence, in the same way as in Table 11.5,
one can find the value at risk approximately by interpolation.
This value at risk is significantly lower than the value 14.8 million CZK from
Example 11.1, and it is nearly the same as in Example 11.2.
⋄
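The construction of the simulated rows in Example 11.3 can be sketched as follows. The holdings (bonds with nominal value of 1 billion CZK, 1 million shares of ČEZ, and 10 million euros) are not stated in this part of the text; they are an assumption inferred from the portfolio prices in Table 11.3, and the function names are hypothetical.

```python
# Hypothetical holdings inferred from the portfolio prices in Table 11.3.
BOND_NOMINAL = 1_000_000_000  # CZK nominal value, bond price quoted in %
SHARES = 1_000_000            # shares of the stock
EUROS = 10_000_000            # euro deposit


def portfolio_value(bond, stock, eur):
    """Portfolio price in CZK for given bond price (%), stock price and exchange rate."""
    return BOND_NOMINAL * bond / 100 + SHARES * stock + EUROS * eur


def simulated_losses(history, reference):
    """Apply each historical one-day growth rate to the reference prices.

    history   : price rows (bond, stock, eur) in time order
    reference : price row of the reference date
    Returns one simulated loss (in CZK) per pair of neighboring rows.
    """
    ref_value = portfolio_value(*reference)
    losses = []
    for prev, curr in zip(history, history[1:]):
        scenario = tuple(r * c / p for r, c, p in zip(reference, curr, prev))
        losses.append(ref_value - portfolio_value(*scenario))
    return losses


# First three rows of Table 11.3 and the reference row of Table 11.6.
history = [
    (108.16, 680.00, 25.140),  # 28.12.2012
    (108.16, 680.20, 25.225),  # 2.1.2013
    (108.16, 675.00, 25.260),  # 3.1.2013
]
reference = (105.30, 517.00, 27.425)  # 31.12.2013

losses = simulated_losses(history, reference)
for number, loss in enumerate(losses, start=1):
    print(f"simulation {number}: loss = {loss / 1e6:.3f} million CZK")
```

Under these assumed holdings, the first two simulated losses agree with the first two rows of Table 11.6 (a profit of about 1.079 million CZK and a loss of about 3.572 million CZK).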
(b) Method of historical simulation based on principle EWMA:
It is the classical method of historical simulation from Example 11.2
supplemented by the principle EWMA (exponentially weighted moving average).
This principle, which weighs time data by means of weights decreasing
exponentially to the past, is frequently applied to financial time series (see Sects.
3.3.1 or 8.3.1). When we constructed the corresponding quantile q0.95 = VaR95% of
losses in Example 11.2, we looked for such a loss among T losses ordered downward
that its order number i is the first one fulfilling the inequality i/T ≥ 0.05. If using the
principle EWMA, we assign to the ith loss in their descending arrangement the
weight
$$\frac{1-\lambda}{1-\lambda^{T}}\,\lambda^{i-1}, \tag{11.26}$$
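so that now we look for such a loss among T losses ordered downward that its order
number i is the first one fulfilling the inequality
$$\sum_{j=1}^{i} \frac{1-\lambda}{1-\lambda^{T}}\,\lambda^{j-1} = \frac{1-\lambda^{i}}{1-\lambda^{T}} \ge 0.05, \tag{11.27}$$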
where T is the number of losses (compare with the original inequality
i/T ≥ 0.05; see above). The coefficient λ must be chosen a priori; it controls the
impact of the time arrangement of losses: the closer this coefficient λ is to 1, the less
important the time allocation of losses is, so that losses more remote in the past may
have an impact on VaR (the weights (11.26) converge to 1/T for λ going to 1, so that the
method converts to the classical calculation of VaR by means of historical simulation,
where the time arrangement does not play any role). For values of λ usual in
practice, Table 11.8 indicates the order number of the loss among losses ordered
downward which determines the corresponding VaR95% (e.g., for λ = 0.99, the
position of the asterisk indicates the loss order i = 5).
Example 11.4 (Method of historical simulation based on principle EWMA). We
shall again demonstrate this method by means of the portfolio from Example 11.1. Let
us choose, e.g., λ = 0.99, so that Table 11.8 indicates the value at risk VaR95% as the
fifth loss in the descending arrangement of losses in Table 11.5.
This value at risk is by far the highest one in comparison with all previous results, so
that the time allocation of losses has a significant impact on the construction of VaR.
Table 11.8 Method of historical simulation based on principle EWMA from Example 11.4: order number of loss in descending arrangement for construction of VaR95% (the entries are cumulative sums of the weights (11.26); asterisks mark the order number used)

| Order number of loss i | λ = 0.9 | λ = 0.95 | λ = 0.99 | λ = 0.995 | λ = 0.999 |
|---|---|---|---|---|---|
| 1 | 0.1000* | 0.0500* | 0.0109 | 0.0070 | 0.0045 |
| 2 | 0.1900 | 0.0975 | 0.0216 | 0.0139 | 0.0090 |
| 3 | 0.2710 | 0.1426 | 0.0323 | 0.0208 | 0.0134 |
| 4 | 0.3439 | 0.1855 | 0.0428 | 0.0277 | 0.0179 |
| 5 | 0.4095 | 0.2262 | 0.0532* | 0.0345 | 0.0224 |
| 6 | 0.4686 | 0.2649 | 0.0636 | 0.0413 | 0.0269 |
| 7 | 0.5217 | 0.3017 | 0.0738 | 0.0481* | 0.0313 |
| 8 | 0.5695 | 0.3366 | 0.0839 | 0.0548 | 0.0358 |
| 9 | 0.6126 | 0.3698 | 0.0939 | 0.0615 | 0.0402 |
| 10 | 0.6513 | 0.4013 | 0.1039 | 0.0682 | 0.0447 |
| 11 | 0.6862 | 0.4312 | 0.1137 | 0.0748 | 0.0491* |
| 12 | 0.7176 | 0.4596 | 0.1234 | 0.0814 | 0.0536 |
| 13 | 0.7458 | 0.4867 | 0.1330 | 0.0880 | 0.0580 |
| 14 | 0.7712 | 0.5123 | 0.1426 | 0.0945 | 0.0624 |
| 15 | 0.7941 | 0.5367 | 0.1520 | 0.1010 | 0.0668 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
⋄
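The entries of Table 11.8 can be recomputed with a few lines of code. The sketch below assumes T = 252 losses (the number of trading days used throughout the example) and reads the table entries as cumulative sums of the weights (11.26); the function names are hypothetical.

```python
def ewma_weight(i, lam, T):
    """Weight (11.26) of the i-th loss in the descending arrangement."""
    return (1 - lam) * lam ** (i - 1) / (1 - lam ** T)


def cumulative_weight(i, lam, T):
    """Sum of the weights (11.26) of the first i losses."""
    return sum(ewma_weight(j, lam, T) for j in range(1, i + 1))


def first_order_reaching(level, lam, T):
    """Smallest order number i whose cumulative weight reaches the given level."""
    total = 0.0
    for i in range(1, T + 1):
        total += ewma_weight(i, lam, T)
        if total >= level:
            return i
    return T


T = 252  # assumed number of losses
for lam in (0.9, 0.95, 0.99, 0.995, 0.999):
    row = [round(cumulative_weight(i, lam, T), 4) for i in range(1, 6)]
    print(lam, row, "first i with cumulative weight >= 0.05:",
          first_order_reaching(0.05, lam, T))
```

For λ = 0.99 this gives i = 5, in agreement with the asterisk in Table 11.8. For λ = 0.995 and λ = 0.999 the asterisks in the table sit one position earlier, at the order number whose cumulative weight is closest to 0.05, so the exact selection rule behind the asterisks should be checked against the original table.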
4. Monte Carlo Simulation Method
This method usually combines parametric and nonparametric approaches:
In the first step, one estimates parametrically the probability distribution of losses.
There are various alternatives: (1) to estimate separately the marginal distributions
of particular loss components and their correlation (or copula) structure,
(2) to estimate directly the multivariate distribution of the loss vector, and (3) to
estimate the loss dynamically as a multivariate time series (e.g., by means of a
multivariate GARCH model; see Sect. 13.3).
In the second step, one performs Monte Carlo simulations based on the calibrated
(estimated) model. This results in a set of mutually independent loss values referring
to the time moment for which VaR is constructed. These loss realizations enable us
to calculate the corresponding value at risk in the same (nonparametric) way as in the
previous methods described in this section. The simulation technique denoted as
bootstrap is preferred in this context (then the resulting VaR is sometimes called
resampled value at risk).
The advantage of this Monte Carlo simulation method consists mainly in the fact
that the volume of simulated losses can be much larger than the volume of observed
losses (e.g., in the case of a credit portfolio, the volume of observed losses is relatively
limited). On the other hand, there are also drawbacks of this method, namely the
computational complexity (particularly for portfolios with financial derivatives, which
must be newly priced for each simulation) and high demands on the quality of
simulation models.
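The two steps can be illustrated by a small sketch. It follows alternative (2) with a trivariate normal distribution calibrated by Table 11.4 (with the euro mean and the covariances taken as negative, an assumption that reproduces the figures of Example 11.1) instead of the bootstrap mentioned above; the number of simulations, the seed, and all function names are illustrative choices.

```python
import random
from math import sqrt

# Step 1: calibrated model (Table 11.4, signs as assumed in the lead-in).
means = [0.113, 0.647, -0.091]  # bonds, stocks, euros
cov = [
    [0.634, -1.358, -0.054],
    [-1.358, 75.831, -0.342],
    [-0.054, -0.342, 0.919],
]


def cholesky(a):
    """Lower triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def simulate_portfolio_losses(n_sims, seed=0):
    """Step 2: draw correlated component losses and sum them per simulation."""
    rng = random.Random(seed)
    L = cholesky(cov)
    losses = []
    for _ in range(n_sims):
        z = [rng.gauss(0.0, 1.0) for _ in means]
        components = [
            means[i] + sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(len(means))
        ]
        losses.append(sum(components))
    return losses


def empirical_var(losses, confidence=0.95):
    """Nonparametric VaR: empirical quantile of the simulated losses."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, int(confidence * len(ordered)))
    return ordered[index]


losses = simulate_portfolio_losses(100_000)
var_95 = empirical_var(losses)
print(f"Monte Carlo VaR95% = {var_95:.2f} million CZK")
```

Because the simulated model is normal, the result should lie close to the variance-covariance value of about 14.8 million CZK; the sketch becomes more informative once the normal model is replaced by a heavier-tailed or dynamic one.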
11.3 Extreme Value Theory
This section presents basic facts on the quantitative approach to extreme values. Even
though the corresponding theory denoted explicitly as EVT (Extreme Value Theory;
see Embrechts et al. (1997), McNeil et al. (2005), and others) comprises very
complex and nontrivial results, its applications are very broad, including time series
data not only in economics (particularly in finance and insurance, e.g., financial losses
or insurance claims) but also in technical and environmental disciplines (e.g., river
flows in hydrology, wind forces in climatology, exhaust concentrations in
environmental control) and others. As the risk measures based on the value at risk
principle have some extreme properties, the theme of EVT is included in this chapter.
The EVT makes use mainly of parametric methods because extreme values are
rare (i.e., with small probabilities that can be quantified only in a parametric way). As
far as the extreme value methodology is concerned, the following two approaches are
preferred in practical data analysis:
• Block maxima (or minima): this approach segments particular data to blocks and
then uses maximum (or minimum) values of particular blocks.
• Threshold excesses: this approach uses data exceeding a given threshold only.
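For block maxima, one investigates the maximum
$$M_n = \max(Y_1, \dots, Y_n) \tag{11.28}$$
of iid random variables Y1, . . ., Yn with distribution function F. Its limit behavior is
described by the generalized extreme value distribution (GEV) with the standardized
distribution function
$$H_\xi(x) = \begin{cases} \exp\left(-(1+\xi x)^{-1/\xi}\right), & \xi \neq 0, \\ \exp\left(-e^{-x}\right), & \xi = 0, \end{cases} \tag{11.30}$$
where ξ is a real parameter (the so-called shape parameter) and 1 + ξx > 0. Generally,
one can add a location parameter μ and a positive scale parameter σ, so that
one has the three-parametric GEV with distribution function Hξ,μ,σ (x) = Hξ ((x − μ)/σ).
Here the so-called Fisher–Tippett Theorem plays the role of the Central Limit Theorem:
If there exist sequences of real constants cn > 0 and dn such that
$$\lim_{n\to\infty} P\left(\frac{M_n - d_n}{c_n} \le x\right) = \lim_{n\to\infty} F^n(c_n x + d_n) = H(x) \tag{11.31}$$
for a nondegenerate distribution function H, then H must be a distribution of the type
GEV. For example, let F be the exponential distribution function F(x) = 1 − exp(−λx)
for x ≥ 0 (λ is a positive parameter). Then for the choice cn = 1/λ and dn = (ln n)/λ it holds
$$P\left(\frac{M_n - d_n}{c_n} \le x\right) = F^n(c_n x + d_n) = \left(1 - \frac{1}{n}\exp(-x)\right)^n, \quad x \ge -\ln n,$$
and hence
$$\lim_{n\to\infty} P\left(\frac{M_n - d_n}{c_n} \le x\right) = \lim_{n\to\infty} F^n(c_n x + d_n) = \exp\left(-e^{-x}\right), \quad x \in R,$$
i.e., the limit distribution is the Gumbel distribution (ξ = 0).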
Fig. 11.9 Probability density (upper figure) and distribution function (bottom figure) of Fréchet
distribution (ξ = 0.5), Gumbel distribution (ξ = 0), and Weibull distribution (ξ = −0.5) (see
(11.30))
Weibull distribution (see (11.30) for ξ < 0 with support (−∞, −1/ξ)) contains in
its maximum domain of attraction the distributions that are mostly uninteresting in
financial applications since their support is bounded from the right-hand side, i.e.,
the right endpoint xF of their support is finite
(see Fig. 11.9, where the support of Weibull distribution with ξ = −0.5 is bounded
from the right-hand side by the point xF = −1/ξ = 2). The maximum domain of
attraction of Weibull distribution contains, e.g., beta and uniform distributions.
2. Block Minima
The previous results can be easily extended to the case of block minima where
instead of (11.28) one investigates the behavior of min(Y1, . . ., Yn) = −max(−Y1,
. . ., −Yn). For instance, the limit relation (11.31) implies
$$\lim_{n\to\infty} P\left(\frac{\min(Y_1, \dots, Y_n) - b_n}{a_n} \le x\right) = 1 - \lim_{n\to\infty} P\left(\frac{\max(-Y_1, \dots, -Y_n) + b_n}{a_n} \le -x\right) = 1 - H(-x) \tag{11.35}$$
Let us consider the process GARCH(r, s) of the form
$$y_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i y_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2, \tag{11.36}$$
where {εt} are iid random variables with zero mean value and unit variance, and the
parameters of the model fulfill the conditions (11.37).
Then Fisher–Tippett Theorem (see above) can be reformulated to a form that involves
the so-called extreme index θ (0 < θ < 1); see, e.g., Table 11.9 for the model ARCH
(1) from (8.33). Its existence is guaranteed for each process GARCH. Then instead
of (11.31) it holds
$$\lim_{n\to\infty} P\left(\frac{M_n - d_n}{c_n} \le x\right) = \lim_{n\to\infty} F^{n\theta}(c_n x + d_n) = H^{\theta}(x) \tag{11.38}$$
(the extreme index should not be confused with the tail index 1/ξ, see above).
Therefore, instead of n dependent observations, it is possible to investigate the
maximum of nθ independent observations with the same distribution function (if the
number n of observations is large): one may imagine that nθ is the number of
mutually independent clusters in the sequence of n dependent observations. From the
practical point of view, it means that for large n one can approximate the distribution
of the maximum of process GARCH by
$$P(M_n \le z) \approx H^{\theta}\left(\frac{z - d_n}{c_n}\right). \tag{11.39}$$
The processes with extreme index θ = 1 (e.g., ARMA processes with normally
distributed white noise) do not show a tendency to cluster high values, and their
extremes behave as in the case of independent random variables.

Table 11.9 Extreme index θ for selected values of parameter α1 in the model ARCH(1)

| α1 | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 |
|---|---|---|---|---|---|
| θ | 0.999 | 0.939 | 0.835 | 0.721 | 0.612 |
4. Statistical Analysis of Block Maxima
The statistical analysis of block maxima requires a data sample of observed maxima.
Therefore, we usually apply the design where data y1, y2, . . . are divided into m blocks
of size n and the maxima in particular blocks are denoted as mn1, . . ., mnm. These are,
e.g., yearly maxima of daily log returns of a stock index. If the data are
generated from the same distribution with a distribution function F and are
mutually independent (or possibly of the type GARCH), then according to the theory
described above it suffices (for large n) to approximate the distribution of block
maxima by the three-parametric distribution Hξ,μ,σ (x) = Hξ ((x − μ)/σ) (see its
standardized form Hξ in (11.30)). If hξ,μ,σ denotes the corresponding probability density,
then the unknown parameters ξ, μ, and σ identifying the distribution of block maxima
can be estimated using the maximum likelihood method by maximizing over these
parameters the log likelihood function of the form
$$l(\xi, \mu, \sigma; m_{n1}, \dots, m_{nm}) = \sum_{i=1}^{m} \ln h_{\xi,\mu,\sigma}(m_{ni}) = -m \ln \sigma - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{m} \ln\left(1 + \xi\,\frac{m_{ni} - \mu}{\sigma}\right) - \sum_{i=1}^{m} \left(1 + \xi\,\frac{m_{ni} - \mu}{\sigma}\right)^{-1/\xi} \tag{11.40}$$
under the conditions σ > 0 and 1 + ξ (mni − μ)/σ > 0 for all i. These estimates have
the convenient properties of maximum likelihood estimates even though the range of
their feasible values may depend on the observed data (in the case of ξ > −0.5).
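The log likelihood (11.40) is easy to evaluate numerically, which is the building block of any maximum likelihood routine. The following is a minimal sketch, not a complete estimation procedure: it only evaluates the function (including the Gumbel case ξ = 0, which is the standard limit of (11.40)) and returns minus infinity outside the feasible region; the function name and the sample values are hypothetical.

```python
from math import exp, inf, log


def gev_log_likelihood(xi, mu, sigma, block_maxima):
    """Log likelihood (11.40) of the three-parametric GEV for given block maxima.

    For xi = 0 the Gumbel limit is used. Parameter values violating
    sigma > 0 or 1 + xi * (m - mu) / sigma > 0 yield -inf.
    """
    if sigma <= 0:
        return -inf
    total = -len(block_maxima) * log(sigma)
    for m in block_maxima:
        z = (m - mu) / sigma
        if xi == 0:
            total -= z + exp(-z)
        else:
            t = 1 + xi * z
            if t <= 0:
                return -inf
            total -= (1 + 1 / xi) * log(t) + t ** (-1 / xi)
    return total


# Hypothetical block maxima (e.g., yearly maxima of daily drops in percent).
sample = [1.8, 2.4, 3.1, 2.0, 6.7, 2.9, 1.6, 4.2]

# Compare a few candidate parameter vectors; a numerical optimizer would
# search over (xi, mu, sigma) for the maximum of this function.
for params in [(0.1, 2.0, 1.0), (0.3, 2.2, 0.9), (0.5, 2.5, 1.2)]:
    print(params, round(gev_log_likelihood(*params, sample), 3))
```

In practice the maximization is left to a general-purpose optimizer started from several initial values, since the feasible region depends on the data.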
Remark 11.8 There exists a conflict of interests between the number and the size of
blocks. It is convenient from the point of view of estimation if the number of blocks
$$\hat{r}_{n,k} = \hat{\mu} + \frac{\hat{\sigma}}{\hat{\xi}} \left( \left( -\ln\left(1 - \frac{1}{k}\right) \right)^{-\hat{\xi}} - 1 \right). \tag{11.42}$$
• Return period relates to the problem of how to find the average frequency of
occurrence of an extreme event over a given level. A more exact formulation is as
follows: if H is the distribution function of block maxima in blocks of size n, then
for a given u the corresponding return period is defined as
$$k_{n,u} = \frac{1}{1 - H(u)}. \tag{11.43}$$
The value kn,u may be interpreted in such a way that in each kn,u-tuple
of blocks of size n we can on average expect the occurrence of just one block in
which the level u is exceeded; e.g., k260,u is the number of years in which
the yearly maximum of daily log returns exceeds the level u on average just once.
After substituting the estimated distribution function H, one obtains the estimated
return period in the form
$$\hat{k}_{n,u} = \frac{1}{1 - H_{\hat{\xi},\hat{\mu},\hat{\sigma}}(u)}. \tag{11.44}$$
Example 11.6 McNeil et al. (2005) apply the theory of block extremes to the time
series of daily drops (in percent) of the stock index S&P 500 (this index is used globally
as a barometer of stock markets) for the period from 1960 to Friday, October 16, 1987,
when the index dropped by 5.25% during one day (as a forerunner of Black
Monday, October 19, 1987, with the catastrophic fall of this index by 20.5%).
They analyzed the yearly and semiannual block maxima of daily drops
(recorded in absolute values), i.e., 28 and 56 observed values of block maxima,
respectively. First, the maximum likelihood estimates according to
(11.40) were constructed:
• For yearly block maxima: very unstable estimates $\hat{\xi} = 0.27$, $\hat{\mu} = 2.04$, and
$\hat{\sigma} = 0.72$ with high standard deviations 0.21, 0.16, and 0.14 (the limit
Fréchet distribution shows a very heavy right tail and an infinite fourth moment,
since 4 > 1/0.27).
• For semiannual block maxima: more stable estimates $\hat{\xi} = 0.36$, $\hat{\mu} = 1.65$, and
$\hat{\sigma} = 0.54$ with more reasonable standard deviations 0.15, 0.09, and 0.08.
Further, the return levels were estimated according to (11.42), namely
• Ten-year return level from yearly block maxima: $\hat{r}_{260,10}$ = 4.3% with estimated
95% confidence interval (3.4%; 7.1%).
• Twenty-semester (i.e., again ten-year) return level from semiannual block maxima:
$\hat{r}_{130,20}$ = 4.5% with estimated 95% confidence interval (3.5%; 7.4%).
The drop by 20.5% during Black Monday, October 19, 1987 (i.e., just at the
beginning of the future period from the point of view of the performed analysis), lay
far outside both confidence intervals.
⋄
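The return levels quoted in Example 11.6 can be checked directly from (11.42) with the reported parameter estimates. The sketch below also evaluates the estimated return period (11.44); the function names are hypothetical, and the GEV distribution function is written in its standard three-parametric form.

```python
from math import exp, log


def gev_cdf(x, xi, mu, sigma):
    """Distribution function of the three-parametric GEV."""
    z = (x - mu) / sigma
    if xi == 0:
        return exp(-exp(-z))
    t = 1 + xi * z
    if t <= 0:
        # Outside the support: below it for xi > 0, above it for xi < 0.
        return 0.0 if xi > 0 else 1.0
    return exp(-t ** (-1 / xi))


def return_level(k, xi, mu, sigma):
    """Estimated return level (11.42) for k blocks (xi != 0)."""
    return mu + sigma / xi * ((-log(1 - 1 / k)) ** (-xi) - 1)


def return_period(u, xi, mu, sigma):
    """Estimated return period (11.44) of the level u, measured in blocks."""
    return 1 / (1 - gev_cdf(u, xi, mu, sigma))


# Estimates reported in Example 11.6.
yearly = dict(xi=0.27, mu=2.04, sigma=0.72)
semiannual = dict(xi=0.36, mu=1.65, sigma=0.54)

r_yearly = return_level(10, **yearly)        # 10 yearly blocks
r_semi = return_level(20, **semiannual)      # 20 semiannual blocks
print(f"yearly model, k = 10: {r_yearly:.2f} %")
print(f"semiannual model, k = 20: {r_semi:.2f} %")
print(f"return period of a 20.5 % drop (yearly model): "
      f"{return_period(20.5, **yearly):.0f} years")
```

Both return levels round to the values 4.3% and 4.5% given in the example, and applying (11.44) to a return level gives back the number of blocks k, which is a convenient consistency check.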
This approach explores the observations exceeding a given level (or threshold). Its
main advantage consists in the fact that it does not “waste” data, unlike the method of
block maxima from Sect. 11.3.1, which exploits only the maxima of (large) blocks and
throws away the remaining information. The data which we handle in this method are
extreme in the sense that they exceed a given (usually high) level, so that they may be
denoted as excesses (e.g., in the framework of reinsurance of commercial insurance
companies one can confine oneself to such parts of losses that lie in the layer
starting at a designated level).
The generalized Pareto distribution (GPD) has the distribution function
$$G_{\xi,\beta}(x) = \begin{cases} 1 - \left(1 + \xi x/\beta\right)^{-1/\xi}, & \xi \neq 0, \\ 1 - \exp(-x/\beta), & \xi = 0, \end{cases} \tag{11.45}$$
where β and ξ are real scale and shape parameters (β > 0). One has x ≥ 0 for ξ ≥ 0,
while 0 ≤ x ≤ −β/ξ for ξ < 0. The mean value of GPD is equal to β/(1 − ξ) for ξ < 1,
and E(Y^k) = ∞ for k ≥ 1/ξ (ξ > 0), so that, e.g., the variance of GPD is infinite for
ξ = 0.5 as a consequence of heavy tails. Similarly as GEV, the generalized Pareto
distribution includes three possible cases in dependence on the sign of the parameter
ξ (see Fig. 11.10):
• For ξ > 0: Pareto distribution.
• For ξ = 0: exponential distribution.
• For ξ < 0: Pareto type II distribution (it has the bounded support (0, −β/ξ)).
2. Distribution of Excesses
Let Y be a random variable with distribution function F. Then the excess Y − u over a
given level (or threshold) u has the so-called excess distribution function Fu(x) of the
form
$$F_u(x) = P(Y - u \le x \mid Y > u) = \frac{F(x + u) - F(u)}{1 - F(u)}. \tag{11.46}$$
Example 11.7 Let us assume in addition that Y has the generalized Pareto
distribution (11.45). Then according to (11.46) it holds
$$F_u(x) = G_{\xi,\beta(u)}(x), \qquad \beta(u) = \beta + \xi u, \tag{11.48}$$
where x ≥ 0 for ξ ≥ 0, while 0 ≤ x ≤ −β/ξ − u for ξ < 0. It means that the excess
distribution remains of the type GPD. In particular, for the exponential distribution
F(x) = G0,β (x), one has Fu(x) = G0,β (x) = F(x), which confirms the characteristic
“loss of memory” of the exponential distribution (i.e., the excess distribution of the
exponential distribution remains the identical exponential distribution regardless
of the size of level u).
Fig. 11.10 Probability density (upper figure) and distribution function (bottom figure) of Pareto
distribution (ξ = 0.5), exponential distribution (ξ = 0), and Pareto type II distribution (ξ = −0.5)
(see (11.45) with β = 1)
Moreover, the mean excess function e(u) = E(Y − u | Y > u) of the GPD is
$$e(u) = \frac{\beta + \xi u}{1 - \xi}, \tag{11.49}$$
where u ≥ 0 for 0 ≤ ξ < 1, while 0 ≤ u ≤ −β/ξ for ξ < 0. Hence the mean excess of
GPD is a linear function of u, which is a useful property in some related statistical
procedures (e.g., for the identification of distribution GPD).
⋄
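The stability property of Example 11.7 can be verified numerically: the excess distribution computed from (11.46) for a GPD coincides with a GPD whose scale parameter is β + ξu. The sketch below uses the standard GPD distribution function and illustrative parameter values; the function names are hypothetical.

```python
from math import exp


def gpd_cdf(x, xi, beta):
    """Distribution function of the generalized Pareto distribution."""
    if x <= 0:
        return 0.0
    if xi == 0:
        return 1 - exp(-x / beta)
    t = 1 + xi * x / beta
    if t <= 0:
        return 1.0  # beyond the right endpoint -beta/xi (case xi < 0)
    return 1 - t ** (-1 / xi)


def excess_cdf(x, u, cdf):
    """Excess distribution function (11.46) for a given distribution function."""
    return (cdf(x + u) - cdf(u)) / (1 - cdf(u))


def gpd_mean_excess(u, xi, beta):
    """Mean excess function (11.49) of the GPD (xi < 1)."""
    return (beta + xi * u) / (1 - xi)


xi, beta, u = 0.5, 1.0, 2.0  # illustrative values
for x in (0.5, 1.5, 4.0):
    direct = excess_cdf(x, u, lambda y: gpd_cdf(y, xi, beta))
    shifted = gpd_cdf(x, xi, beta + xi * u)
    print(f"x = {x}: F_u(x) = {direct:.6f}, G(x; xi, beta + xi*u) = {shifted:.6f}")

print("mean excess over u:", gpd_mean_excess(u, xi, beta))
```

The two columns agree up to rounding error, and the mean excess grows linearly in u with slope ξ/(1 − ξ).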
The generalized Pareto distribution plays an important role for modeling the
excesses not only in the sense of Example 11.7 but also as their limit distribution.
Namely, the following Balkema–de Haan Theorem holds as an analogy to Fisher–
Tippett Theorem on the limiting GEV distribution for block maxima (see Sect. 11.3.1):
if there exist real constants au and bu such that Fu(au x + bu) has a continuous limiting
distribution function for u → xF (xF is the right endpoint of the support of F including
the possibility of xF = ∞; see (11.34)), then
$$\lim_{u \to x_F} \left| F_u(x) - G_{\xi,\beta(u)}(x) \right| = 0 \tag{11.50}$$
for a suitable parameter ξ and a function β(u) (obviously in the situation of Example
11.7 it will be β(u) = β + ξu for another parameter β). In other words, the excess
distribution can be approximated for high levels u by GPD.
3. Statistical Analysis of Excesses
The statistical analysis is usually based on a sample of excesses x1, x2, . . ., xNu
of those observations that exceeded a given level u in the original sample
y1, y2, . . ., yn of iid observations with a distribution function F. If we accept the
approximation by the generalized Pareto distribution (11.45) with probability density
gξ,β, then the unknown parameters ξ and β can be estimated by maximizing over these
parameters the log likelihood function of the form
$$l(\xi, \beta; x_1, \dots, x_{N_u}) = \sum_{j=1}^{N_u} \ln g_{\xi,\beta}(x_j) = -N_u \ln \beta - \left(1 + \frac{1}{\xi}\right) \sum_{j=1}^{N_u} \ln\left(1 + \xi\,\frac{x_j}{\beta}\right) \tag{11.51}$$
under the conditions β > 0 and 1 + ξ xj /β > 0 for all j. Similarly as in the case of block
maxima, it is again possible to generalize this procedure by means of the extreme index
to time series of correlated observations (e.g., for the models GARCH; see
Embrechts et al. (1997)).
Moreover, the model constructed for a given level u can be transformed to models
with excesses over any higher level v ≥ u, since one can easily show that it holds
$$F_v(x) = G_{\xi,\,\beta + \xi(v - u)}(x) \tag{11.52}$$
and similarly
$$e(v) = \frac{\beta + \xi(v - u)}{1 - \xi} = \frac{\xi v}{1 - \xi} + \frac{\beta - \xi u}{1 - \xi}, \tag{11.53}$$
where u ≤ v < ∞ for 0 ≤ ξ < 1, while u ≤ v ≤ u − β/ξ for ξ < 0. The linearity of the
mean excess function (11.53) in the argument v (for fixed u) is helpful when looking for
such a level u that the excesses over u can be modeled by means of GPD (see
Example 11.8).
For loss observations y1, y2, . . ., yn, the following sample mean excess (11.54) can
be used as a statistical estimate of the mean excess over level u:
$$e_n(u) = \frac{\sum_{i=1}^{n} (y_i - u)\, I_{[y_i > u]}}{\sum_{i=1}^{n} I_{[y_i > u]}}. \tag{11.54}$$
The sample mean excesses are often used to identify graphically the GPD of excesses
in real data. Data y1, y2, . . ., yn are ordered by their size in ascending order to the form
y(1) ≤ y(2) ≤ . . . ≤ y(n) (so-called order statistics) and then plotted in a plane graph
as points with coordinates (y(i), en(y(i))) for i = 1, . . ., n − 1 (the largest observation
is omitted since no observation exceeds it), where en(∙) is the sample
mean excess according to (11.54). If these points lie approximately on a line starting
with some order statistic, then according to (11.53) the approximation of the excess
distribution by the generalized Pareto distribution is appropriate starting with the
level corresponding to this order statistic. Moreover, in such a case the slope of the
identified line, which equals ξ/(1 − ξ) by (11.53), determines the parameter ξ of the
given GPD.
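The sample mean excess (11.54) and the points of the mean excess plot can be computed as follows. This is a minimal sketch with hypothetical function names; the short sample consists of the first few losses listed in Table 11.10 and serves only as an illustration, not as a reproduction of Fig. 11.12.

```python
def sample_mean_excess(losses, u):
    """Sample mean excess (11.54): average of y - u over the losses y exceeding u."""
    excesses = [y - u for y in losses if y > u]
    if not excesses:
        raise ValueError("no observation exceeds the level u")
    return sum(excesses) / len(excesses)


def mean_excess_plot_points(losses):
    """Points (y_(i), e_n(y_(i))) for all order statistics below the maximum."""
    ordered = sorted(losses)
    largest = ordered[-1]
    return [(y, sample_mean_excess(ordered, y)) for y in ordered if y < largest]


# First losses from Table 11.10 (million DKK), used here only as an illustration.
losses = [1.683748, 2.093704, 1.732581, 1.779754, 4.612006,
          8.725274, 7.898975, 2.208045, 1.486091, 2.796171]

for level, mean_excess in mean_excess_plot_points(losses):
    print(f"u = {level:9.6f}   e_n(u) = {mean_excess:9.6f}")
```

If the plotted points line up from some level on, the slope of the fitted line can be converted to the shape parameter through ξ = slope/(1 + slope), which follows from the slope ξ/(1 − ξ) in (11.53).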
Example 11.8 In the context of excess modeling, the example of Danish fire
insurance data is well known (see, e.g., McNeil et al. (2005)): Table 11.10 and
Fig. 11.11 present the time series of losses over 1 million DKK (Danish crowns) caused
by fires in Denmark in the period 1980–1990.
Figure 11.12 shows the sample mean excess (11.54) as a function of level u (more
exactly, the graph plots points with coordinates (y(i), en(y(i))); see the discussion
above). If one ignores the points with high levels u, where the sample estimates of
mean excesses (11.54) are unreliable due to the small number of data, then in the graph
starting approximately with the level of 10 million DKK one can identify an
increasing line. Therefore, starting with this level it is possible to approximate the
excess distribution by the generalized Pareto distribution with a positive parameter ξ,
i.e., by the classical (“non-generalized”) Pareto distribution. The maximum
likelihood estimates of parameters maximizing the log likelihood (11.51) are then
$\hat{\beta}$ = 7.0 with standard deviation 1.1 and $\hat{\xi}$ = 0.50 with standard deviation 0.14.
The GPD for various levels u can be obtained by means of the simple transformation
(11.48).
⋄
Table 11.10 Losses over 1 million DKK caused by fires in Denmark in Example 11.8
Date Loss (million DKK) ⋮ Date Loss (million DKK) ⋮ Date Loss (million DKK)
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
01/03/1980 1.683748 ⋮ 12/02/1984 1.256545 ⋮ 11/27/1990 1.134488
01/04/1980 2.093704 ⋮ 12/03/1984 1.103048 ⋮ 11/29/1990 3.407591
01/05/1980 1.732581 ⋮ 12/03/1984 1.204188 ⋮ 11/29/1990 1.072607
01/07/1980 1.779754 ⋮ 12/08/1984 1.151832 ⋮ 11/30/1990 1.167492
01/07/1980 4.612006 ⋮ 12/08/1984 1.884817 ⋮ 11/30/1990 1.072607
01/10/1980 8.725274 ⋮ 12/10/1984 7.539267 ⋮ 12/04/1990 1.270627
01/10/1980 7.898975 ⋮ 12/11/1984 1.099476 ⋮ 12/05/1990 1.472772
01/16/1980 2.208045 ⋮ 12/12/1984 1.570681 ⋮ 12/06/1990 1.036304
01/16/1980 1.486091 ⋮ 12/13/1984 2.670157 ⋮ 12/07/1990 1.650165
01/19/1980 2.796171 ⋮ 12/17/1984 1.151832 ⋮ 12/08/1990 1.678218
01/21/1980 7.320644 ⋮ 12/19/1984 3.874346 ⋮ 12/09/1990 2.640264
01/21/1980 3.367496 ⋮ 12/22/1984 5.026178 ⋮ 12/09/1990 1.601485
01/24/1980 1.464129 ⋮ 12/28/1984 1.780105 ⋮ 12/10/1990 17.739274
01/25/1980 1.722223 ⋮ 12/29/1984 4.764398 ⋮ 12/14/1990 4.372937
01/26/1980 11.374817 ⋮ 12/31/1984 1.151832 ⋮ 12/15/1990 1.361386
01/26/1980 2.482739 ⋮ 01/01/1985 1.500000 ⋮ 12/16/1990 1.183993
01/28/1980 26.214641 ⋮ 01/03/1985 1.251000 ⋮ 12/17/1990 2.970297
02/03/1980 2.002430 ⋮ 01/04/1985 1.030000 ⋮ 12/19/1990 1.023102
02/05/1980 4.530015 ⋮ 01/05/1985 1.050000 ⋮ 12/20/1990 1.130363
02/07/1980 1.841753 ⋮ 01/05/1985 1.900000 ⋮ 12/21/1990 3.011551
02/10/1980 3.806735 ⋮ 01/05/1985 1.100000 ⋮ 12/21/1990 1.402640
02/13/1980 14.122076 ⋮ 01/06/1985 1.881750 ⋮ 12/22/1990 2.322607
02/16/1980 5.424253 ⋮ 01/07/1985 1.007000 ⋮ 12/23/1990 1.115512
02/19/1980 11.713031 ⋮ 01/07/1985 1.630000 ⋮ 12/23/1990 1.691419
02/20/1980 1.515373 ⋮ 01/07/1985 1.025000 ⋮ 12/24/1990 1.237624
02/21/1980 2.538589 ⋮ 01/08/1985 1.007274 ⋮ 12/27/1990 1.114686
02/22/1980 2.049780 ⋮ 01/08/1985 3.500000 ⋮ 12/30/1990 1.402640
02/23/1980 12.465593 ⋮ 01/10/1985 2.900000 ⋮ 12/30/1990 4.867987
02/25/1980 1.735445 ⋮ 01/11/1985 2.463137 ⋮ 12/30/1990 1.072607
02/27/1980 1.683748 ⋮ 01/11/1985 4.625000 ⋮ 12/31/1990 4.125413
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
Source: Copenhagen Reinsurance
Fig. 11.11 Losses over 1 million DKK caused by fires in Denmark in Example 11.8 (horizontal axis: years 1980–1990; vertical axis: loss). Source:
Copenhagen Reinsurance
Fig. 11.12 Sample mean excesses as a function of level u for losses over 1 million DKK caused by
fires in Denmark in Example 11.8 (horizontal axis: level u; vertical axis: sample mean excess e(u)). Source: calculated by EViews
11.4 Exercises
Exercise 11.1 Repeat the calculation of VaR from Examples 11.1–11.4 (calculation
of VaR for daily losses in the given investment portfolio in the year 2013), but only for
December 2013.
Part V
Multivariate Time Series
Chapter 12
Methods for Multivariate Time Series
Most procedures for univariate time series from previous chapters can be generalized
for multivariate time series, where instead of scalar values yt we observe m-variate
vector values yt = (y1t, . . ., ymt)′ in time as realizations of a vector random process
(see Sect. 2.1). The transfer from univariate to multivariate dimension mostly means
only higher formal and numerical complexity of methods described in previous parts
of this text (decomposition methods, methods for linear and nonlinear processes, and
the like), which will be demonstrated briefly in this section by means of examples of
stationary multivariate time series. Later we shall see that such a parallel description
of several scalar processes brings to the analysis further elements that have exclu-
sively the multivariate character (examples are the routine methodology VAR for
multivariate time series, the cointegration among particular univariate components,
and others).
(Weak) stationarity of multivariate time series {yt} means again that the
corresponding process is invariant to time shifts of the first and second moments, i.e.,
For stationary multivariate time series, one can define analogously as in the
univariate case the (matrix) autocovariance function
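i.e., $\gamma_{ij}(k) = \gamma_{ji}(-k)$ and $\rho_{ij}(k) = \rho_{ji}(-k)$. Then the estimated (matrix) autocovariance
function (estimated by means of $y_1, \ldots, y_n$) is simply
$$ C_k = \frac{1}{n} \sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})', \quad k = 0, 1, \ldots, n-1, \tag{12.7} $$
and the estimated (matrix) autocorrelation function is
$$ R_k = \hat{D}^{-1/2} C_k \hat{D}^{-1/2}, \quad k = 0, 1, \ldots, n-1, \tag{12.8} $$
where $\hat{D}$ is the diagonal matrix formed from the diagonal elements of $C_0$ (i.e., from the
estimated variances).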
⋄
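The estimates (12.7) and (12.8) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `sample_autocov` and `sample_autocorr` and the simulated data are assumptions made here, not part of the original text, and $\hat{D}$ is taken as the diagonal of $C_0$.

```python
import numpy as np

def sample_autocov(y, k):
    """C_k = (1/n) * sum_{t=k+1}^{n} (y_t - ybar)(y_{t-k} - ybar)' as in (12.7)."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    d = y - y.mean(axis=0)            # centered observations, one row per time point
    return d[k:].T @ d[:n - k] / n    # rows of d[k:] are y_t, rows of d[:n-k] are y_{t-k}

def sample_autocorr(y, k):
    """R_k = D^{-1/2} C_k D^{-1/2} as in (12.8), with D = diag(C_0) (assumption)."""
    c0 = sample_autocov(y, 0)
    s = 1.0 / np.sqrt(np.diag(c0))    # reciprocal estimated standard deviations
    return sample_autocov(y, k) * np.outer(s, s)

# Hypothetical bivariate data, used only to exercise the functions
rng = np.random.default_rng(0)
y = rng.standard_normal((200, 2))
R0 = sample_autocorr(y, 0)            # ordinary correlation matrix (unit diagonal)
R1 = sample_autocorr(y, 1)            # lag-one auto- and cross-correlations
```

Note that $R_k$ is in general not symmetric for $k \neq 0$: its element $(i, j)$ measures the correlation between $y_{it}$ and $y_{j,t-k}$.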
Example 12.1 Table 12.1 and Fig. 12.1 present the first differences of monthly
yields to maturity YTM for 3-month T-bills (so-called short-term interest rates
12.1 Generalization of Methods for Univariate Time Series 307
Table 12.1 Monthly data in Example 12.1 (the first differences of monthly yields to maturity for
3-month T-bills and corporate bonds AAA in USA in % p.a.); see also Table 12.16
Month DTB3 DAAA Month DTB3 DAAA Month DTB3 DAAA
1985M01 0.40 0.05 1988M05 0.35 0.23 1991M09 0.14 0.14
1985M02 0.46 0.05 1988M06 0.23 0.04 1991M10 0.22 0.06
1985M03 0.35 0.43 1988M07 0.23 0.10 1991M11 0.43 0.07
1985M04 0.57 0.33 1988M08 0.29 0.15 1991M12 0.48 0.17
1985M05 0.44 0.51 1988M09 0.21 0.29 1992M01 0.28 0.11
1985M06 0.55 0.78 1988M10 0.11 0.31 1992M02 0.00 0.09
1985M07 0.04 0.03 1988M11 0.34 0.06 1992M03 0.21 0.06
1985M08 0.13 0.08 1988M12 0.41 0.12 1992M04 0.24 0.02
1985M09 0.10 0.02 1989M01 0.20 0.05 1992M05 0.15 0.05
1985M10 0.09 0.05 1989M02 0.19 0.01 1992M06 0.04 0.06
1985M11 0.03 0.47 1989M03 0.35 0.17 1992M07 0.42 0.15
1985M12 0.13 0.39 1989M04 0.13 0.01 1992M08 0.14 0.12
1986M01 0.03 0.11 1989M05 0.30 0.22 1992M09 0.17 0.03
1986M02 0.01 0.38 1989M06 0.18 0.47 1992M10 0.13 0.07
1986M03 0.44 0.67 1989M07 0.30 0.17 1992M11 0.30 0.11
1986M04 0.53 0.21 1989M08 0.01 0.03 1992M12 0.11 0.12
1986M05 0.06 0.30 1989M09 0.19 0.05 1993M01 0.19 0.07
1986M06 0.09 0.04 1989M10 0.13 0.09 1993M02 0.11 0.20
1986M07 0.37 0.25 1989M11 0.08 0.03 1993M03 0.02 0.13
1986M08 0.27 0.16 1989M12 0.03 0.03 1993M04 0.08 0.12
1986M09 0.38 0.17 1990M01 0.00 0.13 1993M05 0.07 0.03
1986M10 0.01 0.03 1990M02 0.12 0.23 1993M06 0.14 0.10
1986M11 0.17 0.18 1990M03 0.11 0.15 1993M07 0.05 0.16
1986M12 0.14 0.19 1990M04 0.09 0.09 1993M08 0.00 0.32
1987M01 0.04 0.13 1990M05 0.00 0.01 1993M09 0.09 0.19
1987M02 0.14 0.02 1990M06 0.04 0.21 1993M10 0.08 0.01
1987M03 0.03 0.02 1990M07 0.08 0.02 1993M11 0.08 0.26
1987M04 0.20 0.49 1990M08 0.22 0.17 1993M12 0.04 0.00
1987M05 0.01 0.48 1990M09 0.06 0.15 1994M01 0.06 0.01
1987M06 0.06 0.01 1990M10 0.19 0.03 1994M02 0.19 0.16
1987M07 0.09 0.10 1990M11 0.12 0.23 1994M03 0.31 0.40
1987M08 0.22 0.25 1990M12 0.26 0.25 1994M04 0.22 0.40
1987M09 0.32 0.51 1991M01 0.51 0.01 1994M05 0.45 0.11
1987M10 0.08 0.34 1991M02 0.35 0.21 1994M06 0.01 0.02
1987M11 0.59 0.51 1991M03 0.04 0.10 1994M07 0.21 0.14
1987M12 0.01 0.10 1991M04 0.24 0.07 1994M08 0.11 0.04
1988M01 0.10 0.23 1991M05 0.16 0.00 1994M09 0.14 0.27
1988M02 0.21 0.48 1991M06 0.09 0.15 1994M10 0.32 0.23
1988M03 0.00 0.01 1991M07 0.02 0.01 1994M11 0.29 0.11
1988M04 0.23 0.28 1991M08 0.19 0.25 1994M12 0.39 0.22
Source: calculated by EViews
https://fred.stlouisfed.org/graph/?id=TB3MA, https://fred.stlouisfed.org/graph/?id=AAA
Fig. 12.1 Monthly data in Example 12.1 (the first differences of monthly yields to maturity for
3-month T-bills and corporate bonds AAA in the USA in % p.a.)
denoted as DTB3 in % p.a.) and for corporate bonds of the highest rating AAA by
S&P (denoted as DAAA in % p.a.) during the 10-year period 1985–1994 in the USA.
Both time series of length 120 plotted in Fig. 12.1 can be regarded as stationary
(this is precisely why the first differences are analyzed; see the non-differenced time
series in Table 12.16 and Fig. 12.10).
Evidently, there is a relatively strong positive correlation between these time
series, which is confirmed by the estimated correlation coefficient of 0.563 and the
scatterplot in Fig. 12.2. Based on the estimated (matrix) autocorrelation function in
Table 12.2 and the partial correlograms (not shown here), one could identify the
models AR(1) (or AR(3)) and AR(2) for the individual time series DTB3 and DAAA,
respectively. The relationship between these time series again has the form of
feedback (in both directions up to lag one; see Remark 12.1).
[Fig. 12.2 Scatterplot of DAAA (%) against DTB3 (%) in Example 12.1]
⋄
Example 12.2 Table 12.3 and Fig. 12.3 present the first differences of logarithms of
annual gross domestic products (i.e., log returns; see (8.1)) during the period
1951–1992 in seven countries (France, Germany, Italy, the UK, Japan, the USA,
and Canada) denoted as RGDP_FRA, RGDP_GER, RGDP_ITA, RGDP_UK,
RGDP_JPN, RGDP_US, RGDP_CAN, respectively. The seven time series of
length 42 plotted in Fig. 12.3 can again be regarded as stationary.
All pairs of these time series are positively correlated (with correlation coefficients
ranging from 0.104 to 0.748), as confirmed by the estimated correlation matrix in
Table 12.4 and the scatterplots in Fig. 12.4. Based on the estimated correlograms and
partial correlograms (not shown here), one could identify the model AR(1) (or white
noise) for each individual time series. There exist unidirectional dependency
relationships in the sense of Remark 12.1 between some pairs of these time series,
e.g., between France and Germany (see Table 12.5).
⋄
Besides the mutual correlation function $\rho_{ij}(k)$, one also applies the partial mutual
correlation function denoted as $\rho_{ij}(k,k)$ and defined as the partial correlation
coefficient between $y_{it}$ and $y_{j,t-k}$ under fixed values $y_{t-k+1}, \ldots, y_{t-1}$. Its estimate
$r_{ij}(k,k)$ can be obtained as the estimated parameter $\hat{\Phi}^{kk}_{ij}$ in the model
where the multivariate white noise $\{\varepsilon_t\}$ is analogous to the univariate white
noise, i.e., the particular components of vectors $\varepsilon_t$ have zero means and are mutually
uncorrelated at different times, but are simultaneously correlated with a constant
positive definite variance matrix $\Sigma$:
$$ E(\varepsilon_t) = 0, \quad E(\varepsilon_s \varepsilon_t') = \delta_{st} \Sigma. \tag{12.10} $$
Table 12.3 Annual data in Example 12.2 (log returns of annual gross domestic products for France, Germany, Italy, the UK, Japan, the USA, and Canada)
Fig. 12.3 Annual data in Example 12.2 (log returns of annual gross domestic products for France,
Germany, Italy, the UK, Japan, the USA, and Canada). Source: OECD (https://data.oecd.org/gdp/gross-domestic-product-gdp.htm)
Table 12.4 Estimated correlation matrix for seven time series from Example 12.2
           RGDP_FRA  RGDP_GER  RGDP_ITA  RGDP_UK  RGDP_JPN  RGDP_US  RGDP_CAN
RGDP_FRA   1.000     0.610     0.591     0.489    0.748     0.409    0.345
RGDP_GER   0.610     1.000     0.510     0.445    0.553     0.400    0.177
RGDP_ITA   0.591     0.510     1.000     0.303    0.591     0.284    0.189
RGDP_UK    0.489     0.445     0.303     1.000    0.468     0.543    0.250
RGDP_JPN   0.748     0.553     0.591     0.468    1.000     0.388    0.104
RGDP_US    0.409     0.400     0.284     0.543    0.388     1.000    0.667
RGDP_CAN   0.345     0.177     0.189     0.250    0.104     0.667    1.000
[Fig. 12.4 Scatterplot matrix of RGDP_FRA, RGDP_GER, RGDP_ITA, RGDP_UK, RGDP_JPN, RGDP_US, and RGDP_CAN in Example 12.2]
In Sect. 12.2, we will present in more detail a special case of VARMA, namely
the vector autoregressive process VAR(p)
$$ y_t = \Phi_1 y_{t-1} + \cdots + \Phi_p y_{t-p} + \varepsilon_t, \tag{12.13} $$
since nowadays this model is broadly applied to dynamic economic data.
The vector autoregression (12.13) (see, e.g., Lütkepohl (2005)) is a natural
extension of the univariate autoregressive process. In econometrics, it represents a useful
instrument in the context of simultaneous equation models SEM (see, e.g., Greene
(2012) or Heij et al. (2004)).
VAR models have several pros (+) and cons (−) in the framework of practical analysis of
economic and financial time series:
+ It is not necessary to distinguish between exogenous variables (they originate
outside the model) and endogenous variables (they originate as outputs of the
given model).
+ VAR models have a richer structure than univariate AR processes since each
variable can depend on further variables (and not only on its own lagged values with
added white noise).
+ The classical OLS estimate usually has acceptable properties in VAR models.
+ Empirical experience shows that predictions by means of VAR are sufficient for
routine situations in practice.
− The application of VAR is sometimes “too technical” without deeper arguments
justifying the given model (in practice, this approach is popular in the context of
data mining).
− The number of parameters which must be estimated can be large (particularly for
higher dimensions m and orders p of VAR). Moreover, one must solve the
problem of an adequate choice of p in practice.
− One must stationarize the modeled data before the VAR is constructed. However,
the necessary adjustments and transformations to achieve stationarity (mainly
differencing) may imply a substantial loss of information contained originally in
the data.
At first let us consider the following model VAR(1) (the description is simpler
than for the general VAR(p), and the results derived for VAR(1) can be extended
easily to the general order; see Remark 12.3)
$$ y_t = \varphi_0 + \Phi y_{t-1} + \varepsilon_t, \tag{12.14} $$
where $\varepsilon_t$ is m-variate white noise (see (12.10)). In comparison with (12.13), the
relation (12.14) contains in addition an m-variate intercept $\varphi_0$. For example, if $m = 2$,
then VAR(1) is formed by two equations which can be written explicitly as
$$ \begin{aligned} y_{1t} &= \varphi_{10} + \varphi_{11} y_{1,t-1} + \varphi_{12} y_{2,t-1} + \varepsilon_{1t}, \\ y_{2t} &= \varphi_{20} + \varphi_{21} y_{1,t-1} + \varphi_{22} y_{2,t-1} + \varepsilon_{2t}. \end{aligned} \tag{12.15} $$
Remark 12.2 The explicit form (12.15) demonstrates how the model parameters
influence relations between series $\{y_{1t}\}$ and $\{y_{2t}\}$ in time (see Remark 12.1). If in
addition the covariance matrix $\Sigma$ of white noise $\{\varepsilon_t\}$ is diagonal (i.e., its components
are mutually uncorrelated), then it holds:
• If $\varphi_{12} = \varphi_{21} = 0$, then $\{y_{1t}\}$ and $\{y_{2t}\}$ are uncoupled.
• If $\varphi_{12} = 0$ and $\varphi_{21} \neq 0$, then there exists a unidirectional dependency relationship
of $\{y_{2t}\}$ on $\{y_{1t}\}$.
• If $\varphi_{12} \neq 0$ and $\varphi_{21} \neq 0$, then there exists a feedback between $\{y_{1t}\}$ and $\{y_{2t}\}$.
⋄
The formula (12.14) is called the reduced form of the model VAR. If we consider the
ith equation in (12.14) (or more generally in (12.13)), then only the variable $y_i$ is
present in the current form (i.e., without lag). Moreover, in the reduced form, the
simultaneous correlation between $\{y_{it}\}$ and $\{y_{jt}\}$ is represented only by means of the
element $\sigma_{ij}$ of the covariance matrix $\Sigma$ of white noise. However, sometimes one
needs to express the simultaneous relation between $\{y_{it}\}$ and $\{y_{jt}\}$ more explicitly. In
such a case, one can use the so-called structural form of the model VAR, namely by
means of the Cholesky decomposition from matrix theory: as the matrix $\Sigma$ is positive
definite, there is a lower triangular matrix $L$ with units on the main diagonal and
a diagonal matrix $D$ such that
$$ \Sigma = L D L', \quad \text{i.e.,} \quad L^{-1} \Sigma (L')^{-1} = D. \tag{12.16} $$
Multiplying (12.14) from the left by $L^{-1}$, one obtains
$$ L^{-1} y_t = \varphi_0^* + \Phi^* y_{t-1} + u_t, \tag{12.17} $$
where
$$ \varphi_0^* = L^{-1} \varphi_0, \quad \Phi^* = L^{-1} \Phi, \quad u_t = L^{-1} \varepsilon_t, \quad E(u_t) = 0, \quad \mathrm{var}(u_t) = L^{-1} \Sigma (L')^{-1} = D. \tag{12.18} $$
Particularly, $\{u_t\}$ is an m-variate white noise with diagonal covariance matrix (i.e.,
components of $\{u_t\}$ are simultaneously uncorrelated). If we denote the last row of
the inverted matrix $L^{-1}$ as $(\lambda_{m1}, \ldots, \lambda_{m,m-1}, 1)$, then the mth equation in (12.17) is
12.2 Vector Autoregression VAR 317
$$ y_{mt} + \sum_{i=1}^{m-1} \lambda_{mi} y_{it} = \varphi^*_{m0} + \sum_{i=1}^{m} \varphi^*_{mi} y_{i,t-1} + u_{mt}. \tag{12.19} $$
This structural form presents explicitly the simultaneous (i.e., at time t) linear
dependence of $y_{mt}$ on $y_{it}$ for $i = 1, \ldots, m-1$ (since $u_{mt}$ is uncorrelated with $y_{it}$,
which follows from the facts that $u_{mt}$ is uncorrelated with $u_{it}$ and $L^{-1}$ in (12.17) is a
lower triangular matrix with units on the main diagonal similarly as $L$). As the
components of vector $y_t$ can be rearranged in an arbitrary way, one obtains the same
conclusion as for $y_{mt}$ also for the other components $y_{jt}$ ($j = 1, \ldots, m-1$) of vector $y_t$.
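The decomposition $\Sigma = LDL'$ used above for the structural form can be obtained from the ordinary Cholesky factor. The following NumPy sketch assumes a small hypothetical covariance matrix `Sigma` (the numbers are illustrative only) and checks that the transformed noise $u_t = L^{-1}\varepsilon_t$ has the diagonal covariance matrix $D$.

```python
import numpy as np

# Hypothetical positive definite covariance matrix of the white noise (illustrative values)
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

C = np.linalg.cholesky(Sigma)   # ordinary Cholesky factor: Sigma = C C', C lower triangular
d = np.diag(C)                  # positive diagonal entries of C
L = C / d                       # divide column j by d[j]: unit lower triangular matrix
D = np.diag(d ** 2)             # diagonal matrix in Sigma = L D L'

L_inv = np.linalg.inv(L)
u_cov = L_inv @ Sigma @ L_inv.T # covariance matrix of u_t = L^{-1} eps_t
```

The last row of `L_inv` then contains the coefficients $(\lambda_{m1}, \ldots, \lambda_{m,m-1}, 1)$ of the structural equation for $y_{mt}$. The result depends on the ordering of the components of $y_t$, which is a modeling choice.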
In practice, one prefers the reduced form of the model VAR since
• The estimation of the reduced form is relatively easy.
• The predictions in the structural form are complicated due to the links of predicted
variable with the simultaneous values of further variables (see above).
The conditions of (weak) stationarity (see Sect. 12.1) and the first and second
moments of VAR(1) can be found analogously as for the scalar (univariate)
autoregressive process (see Sect. 6.2.3):
A sufficient condition of stationarity of the model VAR(1) written in the form
(12.14) (this condition also allows one to express VAR in the form (12.22))
demands that all m eigenvalues of the matrix $\Phi$ lie inside the unit circle in the complex
plane (i.e., their absolute values are lower than one). The eigenvalues of matrix $\Phi$
are the roots of the polynomial equation $\det(\lambda I - \Phi) = 0$ (or equivalently the inverted
roots of the polynomial equation $\det(I - \Phi z) = 0$). Therefore, the condition of
stationarity can also be formulated in such a way that all m roots of the autoregressive
(matrix) polynomial $\Phi(z) = I - \Phi z$ lie outside the unit circle in the complex plane
(or equivalently all m inverted roots of this polynomial lie inside the unit circle in the
complex plane).
Under the condition of stationarity (see above), the matrix $I - \Phi$ is regular so that
(12.14) can be rewritten in the form
$$ y_t - \mu = \Phi (y_{t-1} - \mu) + \varepsilon_t, \tag{12.20} $$
where
$$ \mu = (I - \Phi)^{-1} \varphi_0 = (\Phi(1))^{-1} \varphi_0 \tag{12.21} $$
is the mean vector of the stationary process $\{y_t\}$ (obviously, $\Phi(1) = I - \Phi$).
Moreover, this process can then also be written in the form of the m-variate linear
process
$$ y_t = \mu + \varepsilon_t + \Phi \varepsilon_{t-1} + \Phi^2 \varepsilon_{t-2} + \cdots, \tag{12.22} $$
so that its variance matrix and autocovariance function are
$$ \Gamma_0 = \mathrm{var}(y_t) = \Sigma + \Phi \Sigma \Phi' + \Phi^2 \Sigma (\Phi^2)' + \cdots, \tag{12.23} $$
$$ \Gamma_k = \Phi^k \Gamma_0. \tag{12.24} $$
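The stationarity check and the moments of VAR(1) described above can be illustrated numerically. The sketch below is only an illustration under assumed values: the intercept `phi0`, the matrix `Phi`, and the noise covariance `Sigma` are hypothetical, and $\Gamma_0$ is computed by solving the linear equation $\Gamma_0 = \Phi \Gamma_0 \Phi' + \Sigma$ (which the infinite series (12.23) satisfies) rather than by summing the series.

```python
import numpy as np

# Hypothetical VAR(1) parameters (illustrative values only)
phi0 = np.array([1.0, 0.5])
Phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])

# Stationarity: all eigenvalues of Phi must lie inside the unit circle
eigvals = np.linalg.eigvals(Phi)
is_stationary = bool(np.all(np.abs(eigvals) < 1))

m = Phi.shape[0]
# Mean vector mu = (I - Phi)^{-1} phi0
mu = np.linalg.solve(np.eye(m) - Phi, phi0)

# Gamma_0 = Phi Gamma_0 Phi' + Sigma, solved via vec(Gamma_0) = (I - Phi (x) Phi)^{-1} vec(Sigma)
vec_gamma0 = np.linalg.solve(np.eye(m * m) - np.kron(Phi, Phi), Sigma.reshape(-1))
Gamma0 = vec_gamma0.reshape(m, m)

# Autocovariance at lag one, Gamma_1 = Phi Gamma_0 as in (12.24)
Gamma1 = Phi @ Gamma0
```

Solving the linear system avoids choosing a truncation point for the series, but it is only meaningful when `is_stationary` is true.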
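Remark 12.3 All previous formulas can be extended to the model VAR(p)
$$ y_t = \varphi_0 + \Phi_1 y_{t-1} + \cdots + \Phi_p y_{t-p} + \varepsilon_t. \tag{12.25} $$
This model is stationary if all roots of the polynomial equation
$\det(I - \Phi_1 z - \cdots - \Phi_p z^p) = 0$ lie outside the unit circle
(in complex plane). Then one can rewrite (12.25) in the form
$$ y_t - \mu = \Phi_1 (y_{t-1} - \mu) + \cdots + \Phi_p (y_{t-p} - \mu) + \varepsilon_t, \tag{12.27} $$
where
$$ \mu = (I - \Phi_1 - \cdots - \Phi_p)^{-1} \varphi_0 = (\Phi(1))^{-1} \varphi_0 \tag{12.28} $$
is the mean vector of the stationary process $\{y_t\}$. Its autocovariance function (12.4)
fulfills the multivariate version of the system of Yule–Walker equations (6.35)
$$ \Gamma_k = \Phi_1 \Gamma_{k-1} + \cdots + \Phi_p \Gamma_{k-p}, \quad k = 1, 2, \ldots \tag{12.29} $$
and
$$ \Gamma_0 = \Phi_1 \Gamma_{-1} + \cdots + \Phi_p \Gamma_{-p} + \Sigma. \tag{12.30} $$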
⋄
The construction of VAR(p) based on observations $y_1, \ldots, y_n$ is entirely analogous
to the univariate AR(p):
1. Identification of VAR Order
The order p could be identified by generalizing partial correlograms for multivariate
case (see Sect. 6.3.1.1), but such an approach is rather elaborate. Therefore, in
practice one makes use of identification procedures based either on statistical tests
or on information criteria.
The likelihood ratio test (LR test) modified in a sequential way is typically used
for the VAR order determination (see Lütkepohl (2005) or Tables 12.7 and 12.11 by
EViews). This test makes use of critical regions with significance level α of the form
$$ LR = n \left( \ln|\hat{\Sigma}_R| - \ln|\hat{\Sigma}_U| \right) > \chi^2_{1-\alpha}(q m^2). \tag{12.31} $$
In more detail, one tests the null hypothesis that the last q lags of an original VAR
model with a high number of lags (regarded as an upper bound for the VAR order)
have zero parameters (i.e., zero matrices $\Phi_i$ for the last q lags). The symbols $\hat{\Sigma}_R$ and
$\hat{\Sigma}_U$ in (12.31) denote the estimated covariance matrix of estimated residuals in the
restricted model VAR (i.e., under the restrictions of the null hypothesis) and the
unrestricted model VAR (i.e., without such restrictions), respectively.
Table 12.6 Wald test for the identification of model VAR in Example 12.3 (DTB3t and
DAAAt) calculated by means of EViews
VAR lag exclusion Wald tests
Included observations: 117
Chi-squared test statistics for lag exclusion
Numbers in [ ] are p-values
        DTB3          DAAA          Joint
Lag 1   31.18664      31.30155      54.13323
        [1.69e-07]    [1.60e-07]    [4.94e-11]
Lag 2   4.323035      7.686472      8.446665
        [0.115150]    [0.021424]    [0.076520]
Lag 3   6.233328      0.484196      2.614466
        [0.044305]    [0.784979]    [0.624263]
df      2             2             4
Another test recommended in this context is the Wald test (see Lütkepohl (2005) or
Table 12.6 by EViews), which is similar to the classical F-test in linear regression
models but is based on the $\chi^2$-distribution.
As far as the information criteria are concerned, one applies them in the same way as for
the order determination of univariate time series models (see Sect. 6.3.1.2). For
example, the m-variate version of the AIC criterion is
$$ AIC(k) = \ln|\hat{\Sigma}_k| + \frac{2\kappa}{n}, \tag{12.32} $$
where $\hat{\Sigma}_k$ is the estimated covariance matrix of the estimated residuals in the model
VAR(k) and $\kappa = m(km + 1)$ is the number of parameters which must be estimated in
the m-variate model VAR(k) with nonzero mean vector (see Tables 12.7 and 12.11
by EViews).
2. Estimation of Model VAR
The model VAR is usually estimated by means of the ML method (i.e., by maxi-
mizing the (log) likelihood function under the assumption of normal distribution of
white noise) even though the reduced form of model VAR (see (12.25)) may be also
estimated by means of the classical OLS method. Under routine conditions, both
approaches are asymptotically equivalent, and the estimates have asymptotically the
normal distribution.
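The OLS estimation of the reduced form and the order selection by an AIC-type criterion can be sketched as follows. This is a minimal illustration, not the EViews procedure: the helper names `fit_var_ols` and `aic_table`, the simulated VAR(1) data, and the choice to fit all candidate orders on a common sample of $n$ minus `max_lag` observations are assumptions made here. The criterion is computed as $\ln|\hat{\Sigma}_k| + 2m(km+1)/n$, so its values are not directly comparable with those printed by EViews.

```python
import numpy as np

def fit_var_ols(y, p):
    """OLS fit of VAR(p) with intercept.

    Returns (B, Sigma_hat), where B has shape (1 + m*p, m): its first row is
    phi0' and the following blocks are Phi_1', ..., Phi_p'."""
    y = np.asarray(y, dtype=float)
    n, m = y.shape
    Y = y[p:]                                              # left-hand sides y_t
    lags = [y[p - j:n - j] for j in range(1, p + 1)]       # regressors y_{t-j}
    X = np.hstack([np.ones((n - p, 1))] + lags)
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    Sigma_hat = resid.T @ resid / (n - p)                  # ML-type residual covariance
    return B, Sigma_hat

def aic_table(y, max_lag):
    """AIC(k) for k = 0, ..., max_lag, all fitted on the same effective sample."""
    y = np.asarray(y, dtype=float)
    n_eff, m = y.shape[0] - max_lag, y.shape[1]
    table = {}
    for k in range(max_lag + 1):
        _, sigma_hat = fit_var_ols(y[max_lag - k:], k)
        kappa = m * (k * m + 1)                            # number of estimated parameters
        table[k] = np.log(np.linalg.det(sigma_hat)) + 2 * kappa / n_eff
    return table

# Simulated bivariate VAR(1) with hypothetical parameters, used only for illustration
rng = np.random.default_rng(1)
Phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])
n = 500
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = Phi @ y[t - 1] + rng.standard_normal(2)

B, Sigma_hat = fit_var_ols(y, 1)
Phi_hat = B[1:].T                                          # estimate of Phi
crit = aic_table(y, 4)
```

The order with the smallest value in `crit` would be the one selected by this criterion. Since AIC tends to favor larger models, it need not coincide with the true order in every simulated sample.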
Table 12.7 LR test and information criteria AIC, BIC, and HQ for the identification of model VAR
in Example 12.3 (DTB3t and DAAAt) calculated by means of EViews
VAR lag order selection criteria
Included observations: 112
Lag   LogL       LR          AIC          BIC          HQ
0     42.94176   NA          -0.731103    -0.682558    -0.711407
1     72.27577   57.09657    -1.183496    -1.037862a   -1.124408a
2     76.80968   8.663012    -1.193030    -0.950307    -1.094550
3     82.06239   9.848829a   -1.215400a   -0.875588    -1.077527
4     82.58925   0.969044    -1.153380    -0.716478    -0.976115
5     83.27286   1.232937    -1.094158    -0.560167    -0.877501
a Lag order selected by the criterion
LR sequential modified LR test statistic (each test at 5% level), AIC Akaike information criterion,
BIC Schwarz information criterion, HQ Hannan–Quinn information criterion
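$$ Q_m = n^2 \sum_{k=1}^{K} \frac{1}{n-k} \, \mathrm{tr}\left( \hat{\Gamma}_k' \hat{\Gamma}_0^{-1} \hat{\Gamma}_k \hat{\Gamma}_0^{-1} \right) \ge \chi^2_{1-\alpha}\left( m^2 (K - p) \right) \tag{12.33} $$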
• By means of tests of normality of the estimated residuals (see, e.g., the test
Jarque–Bera in Example 8.1 or Table 12.14 by EViews).
4. Predictions in Model VAR
The construction of predictions in the model VAR is analogous to the univariate
model AR. Thus, one again makes recursive use of the relation
$$ \hat{y}_{t+k}(t) = \varphi_0 + \Phi_1 \hat{y}_{t+k-1}(t) + \cdots + \Phi_p \hat{y}_{t+k-p}(t), \tag{12.34} $$
where
$$ \hat{y}_{t+j}(t) = y_{t+j} \quad \text{for } j \le 0. \tag{12.35} $$
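The recursion (12.34) with the starting values (12.35) can be sketched as a short function. The name `var_forecast` and the parameter values are hypothetical and serve only as an illustration; the parameters are treated as known, so estimation uncertainty is ignored.

```python
import numpy as np

def var_forecast(y, phi0, Phis, horizon):
    """Recursive point forecasts yhat_{t+1}(t), ..., yhat_{t+horizon}(t) of a VAR(p).

    y    : array of observations (rows are time points); only the last p rows are used
    phi0 : intercept vector
    Phis : list [Phi_1, ..., Phi_p] of coefficient matrices
    """
    p = len(Phis)
    hist = [np.asarray(v, dtype=float) for v in y[-p:]]   # yhat_{t+j}(t) = y_{t+j} for j <= 0
    out = []
    for _ in range(horizon):
        pred = phi0 + sum(Phi @ hist[-j] for j, Phi in enumerate(Phis, start=1))
        out.append(pred)
        hist.append(pred)                                 # forecasts feed the next step
    return np.array(out)

# Hypothetical VAR(1) used only for illustration
Phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])
last_obs = np.array([[1.0, 0.0]])
f = var_forecast(last_obs, np.zeros(2), [Phi], 2)
```

For a stationary model, the forecasts converge to the mean vector of the process as the horizon grows.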
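Remark 12.4 As far as the models VMA and VARMA are concerned, the application of
the OLS method is not so straightforward, and one prefers the ML method in such
models. For example, in the model VMA(1)
$$ y_t = \vartheta_0 + \varepsilon_t + \Theta_1 \varepsilon_{t-1} \tag{12.36} $$
with normally distributed white noise, the likelihood function has the form
$$ L(\vartheta_0, \Theta_1, \Sigma) = \prod_{t=1}^{n} \frac{1}{(2\pi)^{m/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} \varepsilon_t' \Sigma^{-1} \varepsilon_t \right), \tag{12.37} $$
where the values $\varepsilon_t$ are calculated recursively (putting $\varepsilon_0 = 0$) as
$$ \varepsilon_1 = y_1 - \vartheta_0, \quad \varepsilon_2 = y_2 - \vartheta_0 - \Theta_1 \varepsilon_1, \quad \varepsilon_3 = y_3 - \vartheta_0 - \Theta_1 \varepsilon_2, \quad \ldots \tag{12.38} $$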
⋄
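Remark 12.5 Particular components of models VARMA have the form of univariate
models ARMA: in the case of the m-variate model VARMA(p, q), these
marginal models are of the type ARMA(mp, (m−1)p + q). For example, the bivariate
model VAR(1) in (12.15) with zero mean vector can be written by means of the lag
operator B as
$$ \begin{pmatrix} 1 - \varphi_{11} B & -\varphi_{12} B \\ -\varphi_{21} B & 1 - \varphi_{22} B \end{pmatrix} \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}. \tag{12.39} $$
If we multiply this equation from the left by the matrix
$$ \begin{pmatrix} 1 - \varphi_{22} B & \varphi_{12} B \\ \varphi_{21} B & 1 - \varphi_{11} B \end{pmatrix}, $$
then we can write due to diagonality of the matrix product on the left-hand side of
(12.39)
$$ \left( (1 - \varphi_{11} B)(1 - \varphi_{22} B) - \varphi_{12} \varphi_{21} B^2 \right) \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} 1 - \varphi_{22} B & \varphi_{12} B \\ \varphi_{21} B & 1 - \varphi_{11} B \end{pmatrix} \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}, \tag{12.40} $$
i.e., each component follows a univariate model of the type ARMA(2, 1).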
Fig. 12.6 Estimated matrix autocorrelation function of estimated residuals with plotted critical
bounds for the diagnostics of model VAR(2) in Example 12.3 (DTB3t and DAAAt) calculated by
means of EViews (panel titles include Corr(DAAA,DTB3(-i)) and Corr(DAAA,DAAA(-i)) for lags i = 1, ..., 12)
Table 12.9 Q-test applied to estimated residuals for the diagnostics of model
VAR(2) in Example 12.3 (DTB3t and DAAAt) calculated by means of EViews
VAR residual Portmanteau tests for autocorrelations
H0: no residual autocorrelations up to lag h
Included observations: 118
Lags   Q-Stat     Prob.    Adj Q-Stat   Prob.    df
1      0.759100   NA       0.765588     NA       NA
2      2.509240   NA       2.545903     NA       NA
3      7.221494   0.1246   7.381086     0.1171   4
4      9.651343   0.2904   9.896192     0.2724   8
5      12.22388   0.4279   12.58255     0.4001   12
6      15.53721   0.4857   16.07339     0.4479   16
7      16.56259   0.6812   17.16344     0.6423   20
8      20.01529   0.6959   20.86724     0.6465   24
9      24.81018   0.6381   26.05804     0.5699   28
10     27.85802   0.6764   29.38809     0.5994   32
The test is valid only for lags larger than the VAR lag order
df is degrees of freedom for (approximate) chi-square distribution
Example 12.4 Analogously, we shall construct the model VAR also for the data
from Example 12.2 (the seven-variate time series of length 42 with the log returns
(i.e., the first differences of logarithms) of the annual gross domestic products (GDP)
in seven countries denoted as RGDP_FRAt, RGDP_GERt, RGDP_ITAt,
RGDP_UKt, RGDP_JPNt, RGDP_USt, RGDP_CANt):
1. Identification:
• The LR test and the information criteria in Table 12.11 identify the
given time series as VAR(1).
2. Estimation:
The estimation of the model VAR(1) is realized by means of EViews in
Table 12.12. Analogously as in Example 12.3, one could omit some lagged regressors
in the estimated model by applying the estimated standard deviations or t-ratios
(even most of the lagged regressors, since the given time series has a large dimension
of 7 but a small length of 42, which causes relatively broad confidence intervals).
3. Diagnostics:
• The estimated model VAR(1) is stationary according to Fig. 12.7.
• The Q-test applied to the estimated residuals in Table 12.13 confirms the
uncorrelatedness of the estimated residuals.
• The Jarque–Bera test applied to the estimated residuals in Table 12.14
confirms the normality of these residuals.
⋄
Table 12.11 LR test and information criteria AIC, BIC, and HQ for the identification of model
VAR in Example 12.4 (RGDP_FRAt, . . .) calculated by means of EViews
VAR lag order selection criteria
Included observations: 39
Lag   LogL       LR         AIC         BIC         HQ
0     677.8285   NA         -34.40146   -34.10287   -34.29433
1     744.7419   106.3751   -35.32010   -32.93139   -34.46305
2     770.1225   31.23772   -34.10885   -29.63003   -32.50188
3     817.3099   41.13769   -34.01589   -27.44695   -31.65901
Lag order selected by the criterion
LR sequential modified LR test statistic (each test at 5% level), FPE Final prediction error, AIC
Akaike information criterion, BIC Schwarz information criterion, HQ Hannan–Quinn information
criterion
Table 12.13 Q-test applied to estimated residuals for the diagnostics of model VAR(1)
in Example 12.4 (RGDP_FRAt, . . .) calculated by means of EViews
VAR residual Portmanteau tests for autocorrelations
H0: no residual autocorrelations up to lag h
Included observations: 41
Lags   Q-Stat     Prob.    Adj Q-Stat   Prob.    df
1      11.39627   NA       11.68118     NA       NA
2      37.89006   0.8753   39.53363     0.8308   49
3      83.33112   0.8547   88.56214     0.7418   98
4      121.3687   0.9397   130.7119     0.8285   147
5      164.3181   0.9517   179.6265     0.7931   196
6      197.2777   0.9888   218.2363     0.8898   245
7      231.0652   0.9973   258.9800     0.9303   294
The test is valid only for lags larger than the VAR lag order
df is degrees of freedom for (approximate) chi-square distribution
Then it holds:
• If $\varphi_{12} \neq 0$, then $y_2$ causes $y_1$.
• If $\varphi_{21} \neq 0$, then $y_1$ causes $y_2$.
• If $\varphi_{12} \neq 0$ and $\varphi_{21} = 0$, then there exists a unidirectional relationship from $y_2$
to $y_1$.
• If $\varphi_{12} = 0$ and $\varphi_{21} \neq 0$, then there exists a unidirectional relationship from $y_1$
to $y_2$.
• If $\varphi_{12} \neq 0$ and $\varphi_{21} \neq 0$, then there exists a feedback between $y_1$ and $y_2$.
• If $\varphi_{12} = 0$ and $\varphi_{21} = 0$, then $y_1$ and $y_2$ are G-independent.
Remark 12.6 The presented approach to the problem of causality concerns not only
the causality relations between two scalar variables but also those between blocks of several
variables of the given model VAR. However, it makes sense only if the given
model VAR is unambiguously identified. If this is not the case, then various
transformations of such a model may exist that deliver different causality results.
⋄
Example 12.5 Let us consider the model VAR(1) estimated in Example 12.4 (the
seven-variate time series of length 42 with the log returns of annual gross domestic
products in seven countries). Table 12.15 presents the causality analysis based on the
corresponding p-values delivered by EViews. One can see (applying the significance
level of 5%) that:
• There exists a unidirectional relationship from RGDP_JPN to RGDP_FRA.
• There exists a unidirectional relationship from RGDP_JPN to RGDP_GER.
• There exists a unidirectional relationship from RGDP_JPN to RGDP_ITA.
• There exists a unidirectional relationship from RGDP_ITA to RGDP_US.
• RGDP_FRA is influenced causally by the remaining six variables jointly (see the rows "All").
• RGDP_GER is influenced causally by the remaining six variables jointly.
• RGDP_ITA is influenced causally by the remaining six variables jointly.
The causality analysis in Sect. 12.3 based on F-tests or analogous tests does not answer
the questions of what the sign of a causality relation is or how long the effect of various
one-shot changes will survive. Such information can be obtained by means of
procedures denoted as impulse response and variance decomposition.
12.4 Impulse Response and Variance Decomposition 331
Table 12.15 Causality analysis of model VAR(1) in Example 12.5 (RGDP_FRAt, . . .) calculated
by means of EViews
VAR Granger Causality/Block Exogeneity Wald tests
Included observations: 41
Excluded Chi-sq df Prob.
Dependent variable: RGDP_FRA
RGDP_GER 1.062342 1 0.3027
RGDP_ITA 2.227988 1 0.1355
RGDP_UK 7.37E-05 1 0.9932
RGDP_JPN 11.94738 1 0.0005
RGDP_US 0.643009 1 0.4226
RGDP_CAN 0.062357 1 0.8028
All 20.79029 6 0.0020
Dependent variable: RGDP_GER
RGDP_FRA 1.751010 1 0.1857
RGDP_ITA 1.042991 1 0.3071
RGDP_UK 2.290986 1 0.1301
RGDP_JPN 4.051830 1 0.0441
RGDP_US 0.969813 1 0.3247
RGDP_CAN 0.676629 1 0.4107
All 14.54094 6 0.0241
Dependent variable: RGDP_ITA
RGDP_FRA 0.021672 1 0.8830
RGDP_GER 1.540756 1 0.2145
RGDP_UK 0.886088 1 0.3465
RGDP_JPN 3.937413 1 0.0472
RGDP_US 0.006123 1 0.9376
RGDP_CAN 0.111941 1 0.7379
All 17.32903 6 0.0081
Dependent variable: RGDP_UK
RGDP_FRA 0.008270 1 0.9275
RGDP_GER 0.042991 1 0.8357
RGDP_ITA 1.493219 1 0.2217
RGDP_JPN 0.161412 1 0.6879
RGDP_US 0.108771 1 0.7415
RGDP_CAN 0.055517 1 0.8137
All 1.936258 6 0.9255
Dependent variable: RGDP_JPN
RGDP_FRA 0.009790 1 0.9212
RGDP_GER 0.154727 1 0.6941
RGDP_ITA 0.192703 1 0.6607
RGDP_UK 2.153336 1 0.1423
RGDP_US 1.537884 1 0.2149
RGDP_CAN 1.055934 1 0.3041
All 3.346580 6 0.7643
(continued)
1. Impulse Response
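Impulse response (see also Sect. 6.3.3.1) investigates the reaction of a chosen
dependent variable of the given model VAR to an impulse (innovation shock)
generated in a chosen row of this model. Obviously, a shock generated in the ith
row of the model affects not only the variable $y_i$, but it is also transmitted to other
variables through the dynamic lag structure of VAR. Thus, in the estimated m-variate
model VAR one can investigate in time (starting at the moment of impulse)
altogether $m^2$ responses (namely m responses for each of the m dependent variables
$y_{1t}, \ldots, y_{mt}$ on the left-hand sides of particular rows in time t). Under the assumption of
stationarity of this model, the impacts of impulses in all (i.e., $m^2$) response situations
gradually dampen (even though with different intensities, which is often useful to
investigate).
For example, in the bivariate model VAR(1) with zero mean vector of the form
$$ \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} 0.6 & 0.2 \\ 0 & 0.3 \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} $$
one obtains the following responses to a unit impulse generated at time t = 0 in the first
row (i.e., the innovation shock is $\varepsilon_{1,0} = 1$, while the second one remains at the zero
level):
$$ \begin{pmatrix} y_{1,0} \\ y_{2,0} \end{pmatrix} = \begin{pmatrix} \varepsilon_{1,0} \\ \varepsilon_{2,0} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} y_{1,1} \\ y_{2,1} \end{pmatrix} = \begin{pmatrix} 0.6 & 0.2 \\ 0 & 0.3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0.6 \\ 0 \end{pmatrix}, $$
$$ \begin{pmatrix} y_{1,2} \\ y_{2,2} \end{pmatrix} = \begin{pmatrix} 0.6 & 0.2 \\ 0 & 0.3 \end{pmatrix} \begin{pmatrix} 0.6 \\ 0 \end{pmatrix} = \begin{pmatrix} 0.36 \\ 0 \end{pmatrix}, \quad \ldots $$
and the following responses to a unit impulse generated at time t = 0 in the second
row (i.e., the innovation shock is $\varepsilon_{2,0} = 1$, while the first one remains at the zero
level):
$$ \begin{pmatrix} y_{1,0} \\ y_{2,0} \end{pmatrix} = \begin{pmatrix} \varepsilon_{1,0} \\ \varepsilon_{2,0} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} y_{1,1} \\ y_{2,1} \end{pmatrix} = \begin{pmatrix} 0.6 & 0.2 \\ 0 & 0.3 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0.2 \\ 0.3 \end{pmatrix}, $$
$$ \begin{pmatrix} y_{1,2} \\ y_{2,2} \end{pmatrix} = \begin{pmatrix} 0.6 & 0.2 \\ 0 & 0.3 \end{pmatrix} \begin{pmatrix} 0.2 \\ 0.3 \end{pmatrix} = \begin{pmatrix} 0.18 \\ 0.09 \end{pmatrix}, \quad \ldots $$
(one can see that both responses dampen in time and that the response of $y_2$ to the
impulse generated in the first row is zero at all times since $\varphi_{21} = 0$).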
Remark 12.7 If the model VAR can be written as the linear process of the form
$$ y_t = \mu + \sum_{k=0}^{\infty} \Psi_k \varepsilon_{t-k}, \quad \Psi_0 = I $$
(see also (12.22)), then obviously the elements of the ith column of matrix $\Psi_k$
represent the responses of particular dependent variables to the unit innovation
shock generated in the ith row at time $t - k$ (while the other values of the multivariate
white noise remain at the zero level).
⋄
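For a VAR(1) the matrices of the linear process are simply powers of the autoregressive matrix, $\Psi_k = \Phi^k$, so the responses from the bivariate example above can be reproduced numerically. The function name `impulse_responses` is an assumption of this sketch, and the impulses are non-orthogonalized unit shocks.

```python
import numpy as np

def impulse_responses(Phi, horizon):
    """Return Psi_0, ..., Psi_horizon with Psi_k = Phi^k (valid for a VAR(1))."""
    m = Phi.shape[0]
    psi = [np.eye(m)]
    for _ in range(horizon):
        psi.append(Phi @ psi[-1])
    return np.array(psi)

# Coefficient matrix of the bivariate example in the text
Phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])
Psi = impulse_responses(Phi, 2)

# Column i of Psi_k holds the responses of (y1, y2) k steps after a unit shock in row i
resp_shock1 = Psi[:, :, 0]   # rows: k = 0, 1, 2 -> (1, 0), (0.6, 0), (0.36, 0)
resp_shock2 = Psi[:, :, 1]   # rows: k = 0, 1, 2 -> (0, 1), (0.2, 0.3), (0.18, 0.09)
```

For a general VAR(p), the matrices $\Psi_k$ would have to be computed recursively from $\Phi_1, \ldots, \Phi_p$ instead of by simple matrix powers.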
Remark 12.8 Several technical problems must be solved when the impulse
response analysis is applied (see, e.g., EViews):
• Sometimes it is reasonable to investigate the response to such an impulse that
does not occur in one shot, but repeatedly starting at a given time moment; in this
case, the response in the given stationary model VAR does not dampen, but after
some time stabilizes to a (nonzero) level.
• The impulses are usually generated randomly being set to one standard deviation
of the estimated residuals (or to multiples of these standard deviations). In
particular, such a standardization is reasonable when different variables are
measured in different scales.
• Standard deviations for particular responses can be constructed (analytically or by
means of simulations; see Fig. 12.8).
• The impulses can be also orthogonalized in advance (e.g., using Cholesky
decomposition similarly as in the transformation (12.17)) to guarantee the mutual
Fig. 12.8 Impulse response analysis of model VAR(2) in Example 12.6 (DTB3t and DAAAt)
calculated by means of EViews
Fig. 12.9 Variance decomposition analysis of model VAR(2) in Example 12.6 (DTB3t and DAAAt)
calculated by means of EViews (panel titles include "Percentage of variance of DTB3 by DTB3" and
"Percentage of variance of DTB3 by DAAA")
2. Variance Decomposition
While the response to impulse captures the impact of an impulse generated in a
chosen row of VAR for a chosen dependent variable, the variance decomposition
provides information on such a portion of the variance of prediction error (when
predicting a chosen dependent variable) that is explained by innovations from
particular rows of VAR. In practice, the main portion of this variance for yit is
usually explained by the innovation from the ith row of the model.
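The variance decomposition described above can be sketched for a VAR(1) with Cholesky-orthogonalized innovations. This is an illustration under assumptions: the function name `fevd`, the parameter values, and the use of the ordinary Cholesky factor `P` (with $\Sigma = PP'$) as the orthogonalization are choices made here, and the result depends on the ordering of the variables.

```python
import numpy as np

def fevd(Phi, Sigma, horizon):
    """Forecast error variance decomposition of a VAR(1).

    Returns an (m, m) array whose element (i, j) is the share of the
    horizon-step prediction error variance of variable i explained by
    the orthogonalized innovation j."""
    m = Phi.shape[0]
    P = np.linalg.cholesky(Sigma)      # orthogonalization: Sigma = P P'
    psi = np.eye(m)                    # Psi_0
    acc = np.zeros((m, m))
    for _ in range(horizon):
        theta = psi @ P                # responses to orthogonalized unit shocks
        acc += theta ** 2              # accumulate squared responses
        psi = Phi @ psi                # Psi_{k+1} = Phi Psi_k
    return acc / acc.sum(axis=1, keepdims=True)

# Hypothetical parameters (illustrative values only)
Phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
shares_1 = fevd(Phi, Sigma, 1)
shares_10 = fevd(Phi, Sigma, 10)
```

Each row of the result sums to one. With this ordering, the one-step prediction error of the first variable is attributed entirely to its own innovation.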
Example 12.6 Figures 12.8 and 12.9 present the results of the analysis based on
impulse response and variance decomposition for the model VAR(2) estimated in
Example 12.3 (the bivariate time series of length 120 with the first differences of
monthly yields to maturity for 3-month T-bills DTB3t in the first component and for
corporate bonds DAAAt in the second component):
• The response to (random) impulses set to one standard deviation of the estimated
residuals in Fig. 12.8 is performed after the orthogonalization by Cholesky
decomposition (the standard deviations for estimated responses are also plotted;
see Remark 12.8). Obviously, each of four responses gradually dampens so that
the given model VAR is stable (the model stationarity is also confirmed in this
way). Note the nonnegligible response of DAAA to the impulse generated in the
first equation for DTB3 (see the lower graph on the left).
• The variance decomposition in Fig. 12.9 confirms the well-known fact (see
above) that most of the prediction variance of a given variable is explained by the
innovation from the equation explaining this variable (approximately 99% for
DTB3 and 72% for DAAA). However, the portion of 28% corresponding to the
percentage of variance of DAAA explained by DTB3 is not negligible and is
consistent with the result obtained by means of the impulse response analysis.
$$\sum_{i=1}^{m} \alpha_i y_{it} \sim I\Bigl(\max_{i=1,\dots,m} d_i\Bigr) \tag{12.45}$$
for arbitrary nontrivial linear combination of considered time series. Hence, partic-
ularly, any linear combination of time series with (stochastic) linear trend yit ~ I(1)
usually includes a (stochastic) linear trend as well.
On the other hand, economic and financial time series can be sometimes com-
bined in such a way that the resulting linear combination of nonstationary time series
becomes stationary. Such a phenomenon is denoted as cointegration and can be
interpreted as relationship of a long-run equilibrium among economic variables:
particular time series are nonstationary, but their (“cointegrated”) movement in time
tends (as a consequence of various market forces) to a balanced state of equilibrium
(even though in short-run segments deviations from such a long-run balance persist
in time). Particularly, in finance there exist various examples of cointegration, e.g.:
• Among spot and futures prices of various assets (commodities, securities, and
the like).
• Among price ratios in different countries (i.e., the ratios of prices of the same
goods) and corresponding currency rates.
• Among market prices of stocks and volumes of dividends.
12.5 Cointegration and EC Models 337
In given (and other) examples, the absence of long-run equilibrium would give
birth to arbitrage opportunities. Therefore, the principle of cointegration including
so-called EC models (see below) becomes one of the main econometric topics
nowadays (see, e.g., the seminal work by Engle and Granger (1987)).
Both the theoretical and the practical analyses of cointegration become simpler in
the framework of the models VAR, which is demonstrated by the following Example
12.7.
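Example 12.7 Let us consider two time series $\{y_{1t}\}$ and $\{y_{2t}\}$ that can be modeled
simultaneously as the bivariate model VAR(1) of the form
$$\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix} = \Phi \begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} = \begin{pmatrix} 0.5 & -0.25\\ -1 & 0.5\end{pmatrix}\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}, \tag{12.46}$$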
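where the bivariate white noise $\{\varepsilon_t\}$ has a general covariance matrix $\Sigma > 0$. However,
this model is not stationary, since the eigenvalues of matrix $\Phi$ in (12.46), i.e.,
the roots of the polynomial equation $\det(\lambda I - \Phi) = 0$, are 0 and 1, so that one of them
does not lie inside the unit circle in the complex plane. Each of the marginal time series $\{y_{1t}\}$
and $\{y_{2t}\}$ is nonstationary as well: if the model (12.46), written by means of the lag
operator $B$, is multiplied from the left-hand side as
$$\begin{pmatrix} 1-0.5B & -0.25B\\ -B & 1-0.5B\end{pmatrix}\begin{pmatrix} 1-0.5B & 0.25B\\ B & 1-0.5B\end{pmatrix}\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix} = \begin{pmatrix} 1-0.5B & -0.25B\\ -B & 1-0.5B\end{pmatrix}\begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}, \tag{12.47}$$
then the product of the two matrices on the left-hand side equals $(1-B)I$, so that
$$(1-B)\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix} = \begin{pmatrix} 1-0.5B & -0.25B\\ -B & 1-0.5B\end{pmatrix}\begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}. \tag{12.48}$$
It explicitly demonstrates that both marginal time series $\{y_{1t}\}$ and $\{y_{2t}\}$ can be
modeled as models ARIMA(0,1,1); therefore, these univariate time series include
stochastic linear trends and are nonstationary.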
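Now let us transform the time series $\{y_{1t}\}$ and $\{y_{2t}\}$ to time series $\{z_{1t}\}$ and $\{z_{2t}\}$
and the white noise $\{\varepsilon_t\}$ to another white noise $\{u_t\}$ by means of the transformation
$$\begin{pmatrix} z_{1t}\\ z_{2t}\end{pmatrix} = P\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix} = \begin{pmatrix} 1 & 0.5\\ 2 & -1\end{pmatrix}\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}, \qquad \begin{pmatrix} u_{1t}\\ u_{2t}\end{pmatrix} = P\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} = \begin{pmatrix} 1 & 0.5\\ 2 & -1\end{pmatrix}\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}. \tag{12.49}$$
Multiplying the model (12.46) by the matrix $P$ from the left, one obtains
$$P\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix} = P\Phi P^{-1}\, P\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + P\begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}, \tag{12.50}$$
where
$$P\Phi P^{-1} = \begin{pmatrix} 0 & 0\\ 0 & 1\end{pmatrix}, \tag{12.51}$$
so that
$$z_{1t} = u_{1t}, \qquad z_{2t} = z_{2,t-1} + u_{2t} \tag{12.52}$$
(apparently, the time series $\{z_{1t}\}$ is directly a white noise and the time series $\{z_{2t}\}$ is a
random walk I(1)).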
We arrived at an apparent paradox: both original time series $\{y_{1t}\}$ and $\{y_{2t}\}$ are
nonstationary, with the nonstationarity caused according to (12.48) for each time
series by exactly one unit root (i.e., as if two unit roots seemingly figured in the
system), while after the transformation only one time series $\{z_{2t}\}$ is nonstationary in
the system (it confirms the previous conclusion that the system considered as a
whole possesses only one unit root). This paradox is explained by the
existence of the cointegration relation
$$y_{1t} + 0.5\,y_{2t} \sim I(0) \tag{12.53}$$
(see (12.49) and (12.52)): both univariate time series $\{y_{1t}\}$ and $\{y_{2t}\}$ are
nonstationary, while their linear combination (12.53) is stationary.
⋄
The cointegration can be defined exactly in two equivalent ways (in both cases,
we confine ourselves to a special case that appears in practice most frequently). Let
$\{y_{1t}\}, \dots, \{y_{mt}\}$ be nonstationary time series with the nonstationarity always caused
by just one unit root of the corresponding autoregressive polynomial (particularly it
may be $y_{1t} \sim I(1), \dots, y_{mt} \sim I(1)$). Then the time series $\{y_{1t}\}, \dots, \{y_{mt}\}$ are
cointegrated if
(1) there exists their nontrivial (i.e., nonzero) linear combination that is stationary;
(2) equivalently: the corresponding model VAR of the multivariate time series
$(y_{1t}, \dots, y_{mt})'$ has $m - r$ unit roots, where $r$ ($0 < r < m$) denotes the number of
cointegration relations of the type (12.53).
For instance in Example 12.7: (1) both time series $\{y_{1t}\}$ and $\{y_{2t}\}$ are of the type
I(1) and their stationary linear combination exists (see (12.53)), or equivalently
(2) there exists their bivariate model VAR with one unit root (see (12.46)), i.e.,
$m = 2$, $r = 1$. Therefore, $\{y_{1t}\}$ and $\{y_{2t}\}$ are cointegrated with the single cointegration
relation (12.53) (if we ignore its scalar multiples).
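The statements of Example 12.7 can be checked numerically. The sketch below is only an illustration: it assumes that the matrix of (12.46) has the entries 0.5, −0.25, −1, 0.5 and that the transformation matrix of (12.49) has the rows (1, 0.5) and (2, −1); the innovations are taken as independent standard normal variables, which is a simplification of the general covariance matrix, and all function names are hypothetical.

```python
import math
import random

PHI = [[0.5, -0.25], [-1.0, 0.5]]  # assumed entries (with signs) of (12.46)
P = [[1.0, 0.5], [2.0, -1.0]]      # assumed transformation matrix of (12.49)

def eigenvalues_2x2(a):
    """Eigenvalues of a real 2x2 matrix with real spectrum, smallest first."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = math.sqrt(trace * trace - 4.0 * det)
    return (trace - root) / 2.0, (trace + root) / 2.0

def simulate(n, seed=0):
    """Simulate y_t = PHI y_{t-1} + eps_t with standard normal innovations."""
    rng = random.Random(seed)
    y, path = [0.0, 0.0], []
    for _ in range(n):
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = [PHI[0][0] * y[0] + PHI[0][1] * y[1] + e1,
             PHI[1][0] * y[0] + PHI[1][1] * y[1] + e2]
        path.append(y)
    return path

def variance(v):
    mean = sum(v) / len(v)
    return sum((x - mean) ** 2 for x in v) / len(v)

print(eigenvalues_2x2(PHI))        # one zero root and one unit root
path = simulate(2000)
z1 = [P[0][0] * a + P[0][1] * b for a, b in path]  # cointegrating combination
z2 = [P[1][0] * a + P[1][1] * b for a, b in path]  # common stochastic trend
print(round(variance(z1), 3), round(variance(z2), 3))
```

The first combination stays bounded around zero with a stable variance, while the second one wanders like a random walk, in line with (12.52).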
Remark 12.9 More generally, one can define cointegration of order $(d, b)$ ($b > 0$,
$d > 0$), where $y_{1t} \sim I(d), \dots, y_{mt} \sim I(d)$, and there exists a nontrivial linear
combination of given time series that is of the type $I(d - b)$. Then one writes
$\{y_t\} \sim CI(d, b)$ (i.e., CoIntegrated).
⋄
1. EC Model (ECM)
When analyzing a univariate nonstationary time series, then the usual recommenda-
tion consists in differencing this time series at first (e.g., if {yt} ~ I(1), then one
transfers it to the first differences {Δyt}; see Sect. 6.4.2). However, when we deal
with more nonstationary variables observed in time and are interested in their mutual
link in time, then transferring to differences may be correct statistically, but the
model constructed for differenced variables may not recover relations of long-run
equilibrium among original (non-differenced) variables, which is an important
feature just in the case of cointegration.
For example, let us consider two time series $\{x_t\}$ and $\{y_t\}$, which are both
nonstationary of the type I(1). There is a conjecture that the time series $\{x_t\}$
influences $\{y_t\}$. Since both time series are nonstationary, this conjecture could
possibly be investigated by means of the model in first differences
$$\Delta y_t = \gamma\,\Delta x_t + \varepsilon_t. \tag{12.54}$$
However, we are interested in the relation between variables x and y after it has
settled into a long-run equilibrium, when the increments of the variables within time
units are (nearly) zero. Therefore, the relation (12.54) has no informative value from
this point of view. The situation is different if the time series $\{x_t\}$ and $\{y_t\}$ seem to be
cointegrated in the long-term horizon: then the model (12.54) can be corrected to the
form
$$\Delta y_t = \gamma\,\Delta x_t + \alpha\,(y_{t-1} - \beta x_{t-1}) + \varepsilon_t \tag{12.55}$$
by including a correction term that is based on level (and not on differenced) values
of given variables at the previous time $t - 1$. The model (12.55) describes not only the
short-run relation between the increments $\Delta x_t$ and $\Delta y_t$, but simultaneously it guarantees
corrections in the case when short-run changes of both variables move the levels of
these variables away from their long-run equilibrium state. Let us stress that the correction
of changes of variables x and y from time $t - 1$ to time $t$ is based on the correction
term constructed at time $t - 1$, since its value at time $t$ is not known yet when
corrections for this time are constructed. If the time series $\{x_t\}$ and $\{y_t\}$ are really
cointegrated and the correction term in (12.55) is chosen as the cointegration relation
providing a stationary time series, then all terms in the model (12.55) are stationary.
Thus, the situation when one uses both stationary and nonstationary terms
simultaneously in one model is avoided (it could cause problems when constructing
such a model).
As far as terminology is concerned, the model of the type (12.55) is mostly called
EC model (error correction or equilibrium correction). Sometimes it is also called
VEC model (vector error correction) to stress the VAR context. The terms of type
$y_{t-1} - \beta x_{t-1}$ are called error correction terms. The parameters of type $\beta$ describe
long-run cointegration relations among variables and they are usually arranged into
so-called cointegration vectors of type $(1, -\beta)'$. The parameters of type $\gamma$ describe
short-run relations among variables. Finally the parameters of type $\alpha$
control the rate of adjustment to the equilibrium state. Moreover, there may be
intercepts or linear trends in the model (including the error correction terms), e.g.,
(obviously, the parameter $\gamma_1$ means the intercept from the point of view of differenced
variables, but it means the deterministic linear trend from the point of view of
level variables).
2. EC Model Formulated as VAR
The theory of EC models (but also their practical testing and constructing) is the
most elaborate in the context of vector autoregressive models VAR. For instance, let
us consider the bivariate model VAR(1)
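$$\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix} = \Phi\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}, \tag{12.57}$$
which, after subtracting $(y_{1,t-1}, y_{2,t-1})'$ from both sides, takes the form
$$\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix} = \Pi\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}, \qquad \Pi = \Phi - I. \tag{12.58}$$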
The key role in the classification of (12.58) as an EC model is played by the rank $r = r(\Pi)$
of the matrix $\Pi$ (in our case of the type $2 \times 2$), which is closely related to the form of
the eigenvalues of matrix $\Phi$ or equivalently to the form of the roots of the autoregressive
polynomial $\Phi(z) = I - \Phi z$:
1. $r(\Pi) = 0$: In this case $\Pi = 0$ so that according to (12.58) both time series $\{y_{1t}\}$
and $\{y_{2t}\}$ are nonstationary of the type I(1), and no cointegration relation exists
between them.
2. $r(\Pi) = 2$: In this case, $\Pi$ has the full rank so that both eigenvalues of this matrix
are nonzero, and hence no root of polynomial $\Phi(z)$ is unit. Moreover, if we
assume that both roots of $\Phi(z)$ lie outside the unit circle in the complex plane (i.e.,
both inverted roots of $\Phi(z)$, which are simultaneously the eigenvalues of matrix
$\Phi$, lie inside the unit circle), then the VAR model (12.57) is stationary, and
it makes no sense to transfer it by differencing of the type (12.58) to the EC
model.
3. $r(\Pi) = 1$: From the point of view of cointegration and EC methodology, this case
is the most interesting. Exactly one of the two eigenvalues of $\Pi$ is nonzero, i.e., exactly
one of the two roots of polynomial $\Phi(z)$ is unit. If we again assume that the
remaining root of $\Phi(z)$ lies outside the unit circle, then one can show that both
univariate time series $\{\Delta y_{1t}\}$ and $\{\Delta y_{2t}\}$ are stationary (more specifically, each of
the non-differenced time series $\{y_{1t}\}$ and $\{y_{2t}\}$ is of the type ARIMA(1,1,1)). Moreover,
(12.58) can be rewritten as
$$\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix} = \alpha\beta'\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} = \begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}\bigl(\beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr) + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} \tag{12.59}$$
(the existence of bivariate column vectors $\alpha$ and $\beta$ follows from the unit rank
$r(\Pi) = 1$). The series $\{\Delta y_{1t}\}$ and $\{\Delta y_{2t}\}$ are stationary; hence also the time series
$\{\beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\}$ must be stationary (otherwise, (12.59) would equate stationary
and nonstationary terms), representing the cointegration relation between $\{y_{1t}\}$
and $\{y_{2t}\}$. The construction of EC model (12.59) serves as an example of
application of Granger’s representation theorem (see, e.g., Engle and Granger
(1987)). Generally, it holds that the rank $r$ of matrix $\Pi$ equals the number of
cointegration relations (if we ignore scalar multiples of these relations) in the
corresponding EC model, while $m - r$ is the number of unit roots in the considered
m-variate model VAR.
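The three cases can be sketched for a bivariate VAR(1) as follows. This is a minimal illustration with hypothetical helper names; the rank is decided exactly for a known matrix, whereas for an estimated matrix the rank has to be tested statistically. The factorization normalizes the first component of $\beta$ to one, which is only one of many possible normalizations.

```python
def rank_2x2(a, tol=1e-12):
    """Rank of a 2x2 matrix, decided by its determinant and its entries."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) > tol:
        return 2
    return 1 if any(abs(x) > tol for row in a for x in row) else 0

def classify(phi):
    """Classify a bivariate VAR(1) by the rank of Pi = Phi - I."""
    pi = [[phi[0][0] - 1.0, phi[0][1]], [phi[1][0], phi[1][1] - 1.0]]
    r = rank_2x2(pi)
    label = {0: "no cointegration, model the differences",
             1: "one cointegration relation, use an EC model",
             2: "full rank, no unit root"}[r]
    return pi, r, label

def factor_rank_one(pi, tol=1e-12):
    """Return alpha, beta with pi = alpha beta' and beta normalized to (1, b)."""
    row = pi[0] if abs(pi[0][0]) > tol else pi[1]
    beta = [1.0, row[1] / row[0]]   # assumes a nonzero entry in the first column
    alpha = [pi[0][0], pi[1][0]]    # with beta[0] = 1, alpha is the first column
    return alpha, beta

# hypothetical matrix with one unit root (signs of the entries are an assumption)
pi, r, label = classify([[0.5, -0.25], [-1.0, 0.5]])
print(r, label, factor_rank_one(pi))
```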
Remark 12.10 There is a direct analogy to DF test (Dickey and Fuller (1979); see
(6.81)), where the validity of the null hypothesis
$$H_0\colon \psi = 0 \tag{12.60}$$
means the existence of a unit root in the tested time series $\{y_t\}$ (obviously,
$\psi = \varphi_1 - 1$ corresponds to $\Pi$).
⋄
Example 12.8 Let us consider two time series $\{y_{1t}\}$ and $\{y_{2t}\}$ from Example 12.7
modeled simultaneously as the bivariate model VAR(1) of the form (12.46). This
model can be easily transferred to the form (12.59), namely
$$\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix} = \begin{pmatrix} -0.5 & -0.25\\ -1 & -0.5\end{pmatrix}\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} = \begin{pmatrix} -0.5\\ -1\end{pmatrix}\begin{pmatrix} 1 & 0.5\end{pmatrix}\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} \tag{12.61}$$
(here the matrix $\Pi$ has the rank $r = 1$). The corresponding cointegration vector (just
one if we ignore its scalar multiples) is
$$\beta = (1, 0.5)' \tag{12.62}$$
(see also (12.53)). The bivariate model has really just one unit root (see Example
12.7). The first and second components of vector $\alpha = (-0.5, -1)'$ control the rate of
adjustment to the equilibrium state in the first and second equations, respectively.
The motivation for the model (12.61) can be presented also in another way: The
time series $\{y_{1t}\}$ and $\{y_{2t}\}$ considered individually, ignoring their cointegration
relations, possess altogether two unit roots (one for each of them; see (12.48)), i.e.,
more than the single unit root of the bivariate VAR model (12.46). It implies that by
differencing each component to achieve stationarity we would overdifference the
given system, which has some negative consequences:
the invertibility may be damaged due to lagged terms of the type $\varepsilon_{1,t-1}$, estimation
and prediction problems may appear, and the like. On the other hand, if we correct
the model by subtracting the vector $y_{t-1}$ from both sides of (12.46) (as is the case
in (12.61)), then the MA structure of this model does not change (of course, there
must be a compensation that maintains invertibility, namely the
presence of the level variable $y_{t-1}$ in the model).
⋄
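In general, the m-variate model VAR(p) of the form
$$y_t = \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t \tag{12.63}$$
constructed for components $y_{1t} \sim I(1), \dots, y_{mt} \sim I(1)$ has the following EC
representation:
$$\Delta y_t = \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + \dots + \Gamma_{p-1} \Delta y_{t-p+1} + \varepsilon_t, \tag{12.64}$$
where
$$\Pi = \Phi_1 + \dots + \Phi_p - I = -\Phi(1), \quad \Gamma_1 = -\Phi_2 - \dots - \Phi_p, \ \dots, \ \Gamma_{p-2} = -\Phi_{p-1} - \Phi_p, \quad \Gamma_{p-1} = -\Phi_p. \tag{12.65}$$
If the rank $r$ of matrix $\Pi$ fulfills $0 < r < m$ (the boundary cases $r = 0$ and $r = m$ are
discussed in the commentary below (12.58)), then (again according to Granger’s
theorem) there exist matrices $\alpha$ and $\beta$ (both are rectangular $m \times r$ and have the full
column rank $r$) so that $\Pi = \alpha\beta'$ and, moreover, each component of the vector $\beta' y_t$
can be modeled as I(0). In other words, there exists an EC representation of the form
$$\Delta y_t = \alpha\beta' y_{t-1} + \Gamma_1 \Delta y_{t-1} + \dots + \Gamma_{p-1} \Delta y_{t-p+1} + \varepsilon_t. \tag{12.66}$$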
Remark 12.11 The matrices $\alpha$ and $\beta$ in (12.66) are not determined uniquely:
if $\gamma$ is an arbitrary regular matrix of the type $r \times r$, then the matrices $\alpha\gamma^{-1}$ and $\beta\gamma'$
present another possible decomposition of the matrix $\Pi$. Therefore, various adjustments
are recommended in practice, e.g., normalizations of the parametric matrices $\alpha$
and $\beta$ or application of prescribed a priori constraints.
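It is also usual in practice that various exogenous variables figure on the right-hand
side of the EC model (12.66): intercepts, polynomial trends, dummy variables
(e.g., in seasonal models, the dummy variables should be centered in a suitable way;
see Johansen (1995)) or exogenous variables from other systems. As this has significant
consequences for tests and construction of cointegrated models, the
corresponding software systems usually make it possible to distinguish various types of EC
models:
• The cointegration relations contain intercepts, e.g., instead of (12.59) it holds
$$\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix} = \begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}\bigl(\delta_0 + \beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr) + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}. \tag{12.67}$$
• The intercepts stand outside the cointegration relations, e.g., instead of (12.59) it holds
$$\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix} = \begin{pmatrix}\varphi_{10}\\ \varphi_{20}\end{pmatrix} + \begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}\bigl(\beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr) + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix} \tag{12.68}$$
(as a matter of fact, the parameters $\varphi_{10}$ and $\varphi_{20}$ mean the intercepts from the point
of view of differenced variables, but they mean the deterministic linear trends
from the point of view of level variables $y_1$ and $y_2$).
• The intercepts are included both in the cointegration relations and outside of
them: in such a case, one must distinguish strictly between “inner” and “outer”
intercepts (otherwise, there may be problems with the identification of these
models).
• The cointegration relations contain linear trends, e.g., instead of (12.59) it holds
$$\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix} = \begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}\bigl(\delta_0 + \delta_1 t + \beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr) + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}. \tag{12.69}$$
⋄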
3. Testing of Cointegration
Testing of cointegration should confirm that the tested VAR model contains exactly the
given number $r$ of cointegration relations, which is important information when
constructing a suitable EC model. The cointegration is declared in the case when
$r > 0$ (in particular, the case of a stationary model VAR with $r = m$ can be also
regarded as a special cointegrated state, in which each equation represents directly
one of $m$ cointegration relations).
Engle and Granger (1987) suggested a simple test of cointegration among
variables $y, x_2, \dots, x_k$. Their EG test is based on a simple idea: if the given variables
are cointegrated, then the OLS residuals $\hat\varepsilon_t$ calculated by the least squares method in
the model
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + \varepsilon_t \tag{12.70}$$
should be of the type I(0). Therefore, it is sufficient to modify DF test (Dickey and
Fuller (1979); see (6.81)) and to test the null hypothesis
$$H_0\colon \hat\varepsilon_t \sim I(1). \tag{12.71}$$
The only difference from the classical DF test is due to the fact that in (12.71) one
applies the residuals estimated from a specific model so that the critical values of the
classical DF test cannot be used. The relevant critical values, which are more negative
than the ones for DF test, are tabulated by means of simulations in Engle and Granger
(1987) and Engle and Yoo (1987). On the other hand, this approach has some
drawbacks, namely:
• In the case of nonstationary variables, the OLS estimate of model (12.70) may not
be reliable.
• In the case of several cointegration relations, one cannot decide which of them has
in fact been estimated by means of (12.70) (could one obtain a “more intensive”
cointegration relation after reordering the given variables?).
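The two steps of the EG test can be sketched on simulated data. This is a minimal illustration under stated assumptions: only one regressor is used, the data are generated artificially with a hypothetical long-run slope 0.8, the DF regression is run without augmentation lags, and the function names are not taken from any library. The resulting t-ratio must be compared with the critical values tabulated for the EG test, not with the classical DF ones.

```python
import math
import random

def ols_line(x, y):
    """Intercept and slope of the least squares line y = a + b x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def df_t_statistic(e):
    """t-ratio of psi in the regression diff(e_t) = psi * e_{t-1} + error."""
    lag = e[:-1]
    diff = [e[t] - e[t - 1] for t in range(1, len(e))]
    sxx = sum(v * v for v in lag)
    psi = sum(l * d for l, d in zip(lag, diff)) / sxx
    rss = sum((d - psi * l) ** 2 for l, d in zip(lag, diff))
    s2 = rss / (len(diff) - 1)
    return psi / math.sqrt(s2 / sxx)

def engle_granger(x, y):
    """Step 1: static OLS regression; step 2: DF regression on its residuals."""
    a, b = ols_line(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return b, df_t_statistic(resid)

rng = random.Random(1)
x, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0, 1)          # x is a random walk, i.e., I(1)
    x.append(level)
y = [2.0 + 0.8 * xi + rng.gauss(0, 1) for xi in x]   # cointegrated with x
slope, t_stat = engle_granger(x, y)
print(round(slope, 3), round(t_stat, 2))
```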
Nowadays in practice the cointegration is mostly tested by means of Johansen
tests (see Johansen (1991)). The method is based on the ML estimate of so-called
canonical correlations, which measure the partial correlations among the m-variate
vectors $\Delta y_t$ and $y_{t-1}$ under fixed values of vectors $\Delta y_{t-1}, \dots, \Delta y_{t-p+1}$ in the EC
model (12.64). These canonical correlations are the square roots of eigenvalues
$\lambda_1, \dots, \lambda_m$ of a positive definite matrix that is closely related to the matrix $\Pi$. The ML
estimates $\hat\lambda_1, \dots, \hat\lambda_m$ of these eigenvalues based on $y_1, \dots, y_n$ fulfill
$$1 > \hat\lambda_1 \ge \hat\lambda_2 \ge \dots \ge \hat\lambda_m \ge 0. \tag{12.72}$$
Johansen tests are constructed to test that the eigenvalues $\lambda$ are zero. As a matter
of fact, these are LR tests (i.e., tests based on the concept of likelihood ratio), whose
critical values are not generated by means of $\chi^2$-distribution, but by means of
simulations (see MacKinnon et al. (1999)). Moreover, two types of these tests are
used in practice:
• Johansen test with statistics
$$\lambda_{\mathrm{trace}}(r) = -n \sum_{i=r+1}^{m} \ln\bigl(1 - \hat\lambda_i\bigr) \tag{12.73}$$
Table 12.16 Monthly data in Example 12.9 (the monthly yields to maturity for 3-month T-bills
and corporate bonds AAA in the USA in % p.a.)
Obs TB3 AAA Obs TB3 AAA Obs TB3 AAA
1985M01 7.76 12.08 1988M05 6.27 9.90 1991M09 5.25 8.61
1985M02 8.22 12.13 1988M06 6.50 9.86 1991M10 5.03 8.55
1985M03 8.57 12.56 1988M07 6.73 9.96 1991M11 4.60 8.48
1985M04 8.00 12.23 1988M08 7.02 10.11 1991M12 4.12 8.31
1985M05 7.56 11.72 1988M09 7.23 9.82 1992M01 3.84 8.20
1985M06 7.01 10.94 1988M10 7.34 9.51 1992M02 3.84 8.29
1985M07 7.05 10.97 1988M11 7.68 9.45 1992M03 4.05 8.35
1985M08 7.18 11.05 1988M12 8.09 9.57 1992M04 3.81 8.33
1985M09 7.08 11.07 1989M01 8.29 9.62 1992M05 3.66 8.28
1985M10 7.17 11.02 1989M02 8.48 9.63 1992M06 3.70 8.22
1985M11 7.20 10.55 1989M03 8.83 9.80 1992M07 3.28 8.07
1985M12 7.07 10.16 1989M04 8.70 9.79 1992M08 3.14 7.95
1986M01 7.04 10.05 1989M05 8.40 9.57 1992M09 2.97 7.92
1986M02 7.03 9.67 1989M06 8.22 9.10 1992M10 2.84 7.99
1986M03 6.59 9.00 1989M07 7.92 8.93 1992M11 3.14 8.10
1986M04 6.06 8.79 1989M08 7.91 8.96 1992M12 3.25 7.98
1986M05 6.12 9.09 1989M09 7.72 9.01 1993M01 3.06 7.91
1986M06 6.21 9.13 1989M10 7.59 8.92 1993M02 2.95 7.71
1986M07 5.84 8.88 1989M11 7.67 8.89 1993M03 2.97 7.58
1986M08 5.57 8.72 1989M12 7.64 8.86 1993M04 2.89 7.46
1986M09 5.19 8.89 1990M01 7.64 8.99 1993M05 2.96 7.43
1986M10 5.18 8.86 1990M02 7.76 9.22 1993M06 3.10 7.33
1986M11 5.35 8.68 1990M03 7.87 9.37 1993M07 3.05 7.17
1986M12 5.49 8.49 1990M04 7.78 9.46 1993M08 3.05 6.85
1987M01 5.45 8.36 1990M05 7.78 9.47 1993M09 2.96 6.66
1987M02 5.59 8.38 1990M06 7.74 9.26 1993M10 3.04 6.67
1987M03 5.56 8.36 1990M07 7.66 9.24 1993M11 3.12 6.93
1987M04 5.76 8.85 1990M08 7.44 9.41 1993M12 3.08 6.93
1987M05 5.75 9.33 1990M09 7.38 9.56 1994M01 3.02 6.92
1987M06 5.69 9.32 1990M10 7.19 9.53 1994M02 3.21 7.08
1987M07 5.78 9.42 1990M11 7.07 9.30 1994M03 3.52 7.48
1987M08 6.00 9.67 1990M12 6.81 9.05 1994M04 3.74 7.88
1987M09 6.32 10.18 1991M01 6.30 9.04 1994M05 4.19 7.99
1987M10 6.40 10.52 1991M02 5.95 8.83 1994M06 4.18 7.97
1987M11 5.81 10.01 1991M03 5.91 8.93 1994M07 4.39 8.11
1987M12 5.80 10.11 1991M04 5.67 8.86 1994M08 4.50 8.07
1988M01 5.90 9.88 1991M05 5.51 8.86 1994M09 4.64 8.34
1988M02 5.69 9.40 1991M06 5.60 9.01 1994M10 4.96 8.57
1988M03 5.69 9.39 1991M07 5.58 9.00 1994M11 5.25 8.68
1988M04 5.92 9.67 1991M08 5.39 8.75 1994M12 5.64 8.46
Source: FRED (Federal Reserve Bank of St. Louis)
Table 12.17 Johansen tests of cointegration from Example 12.9 by means of EViews (TB3 and
AAA)
Sample (adjusted): 1985M04 1994M12
Included observations: 117 after adjustments
Trend assumption: Linear deterministic trend
Series: TB3 AAA
Lags interval (in first differences): 1 to 2
Unrestricted cointegration rank test (Trace)
Hypothesized                  Trace       0.05
No. of CE(s)    Eigenvalue    Statistic   Critical value   Prob.**
None*           0.132054      18.61642    15.49471         0.0164
At most 1       0.017337      2.046241    3.841466         0.1526
Trace test indicates 1 cointegrating eqn(s) at the 0.05 level
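The trace statistics of Table 12.17 can be recomputed from the reported eigenvalues according to (12.73). The following short sketch assumes that $n$ in (12.73) is taken as the number of included observations (117); the function name is hypothetical.

```python
import math

def trace_statistic(eigenvalues, n, r):
    """lambda_trace(r) = -n * sum of ln(1 - lambda_i) over i = r+1, ..., m."""
    return -n * sum(math.log(1.0 - lam) for lam in eigenvalues[r:])

eigs = [0.132054, 0.017337]   # eigenvalues reported in Table 12.17
n_obs = 117                   # included observations in Table 12.17
for r in range(len(eigs)):
    print(r, round(trace_statistic(eigs, n_obs, r), 4))
```

Up to rounding of the eigenvalues, the results agree with the values 18.61642 and 2.046241 in the table; the first exceeds its critical value 15.49471 while the second does not exceed 3.841466, which gives one cointegration relation.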
4. Construction of EC Model
The construction of EC models can be described in the following steps (for simplic-
ity, we assume that the m-variate time series y1, . . ., yn is either stationary or
nonstationary of the type I(1), i.e., it can be stationarized by transferring it to the
time series of first differences):
1. One applies the tests of unit root (e.g., DF and ADF tests from Sect. 6.4.1) for
particular univariate time series {y1t}, . . ., {ymt}. If the null hypotheses of unit
roots are rejected, then these time series are stationary (except for possible
deterministic trends), and one constructs for y1, . . ., yn a model VAR (possibly
with deterministic trends as exogenous variables; see Sect. 12.2). Otherwise due
to unit roots, the given time series contain stochastic trends, and one proceeds to
the step 2.
2. One applies Johansen (or other) tests of cointegration (possibly including intercepts,
linear trends, and the like). If the cointegration is rejected ($r = 0$), one
proceeds to the step 3. If it is not the case and the existence of $r$ cointegration
relations is confirmed ($0 < r < m$), then one proceeds to the step 4 (the case of
$r = m$ is excluded due to the step 1).
3. Since the cointegration was rejected in the previous step of the algorithm, one
constructs the corresponding model VAR for the stationary time series Δy1, . . .,
Δyn.
4. Since there exist $r$ cointegration relations ($0 < r < m$), the step 3 is ignored, and
one constructs the corresponding EC model (12.66) for the original time series
$y_1, \dots, y_n$. The estimation procedure can combine ML method and OLS method (see
Johansen (1995)). In particular, the maximal value of the logarithmic likelihood is
$$c - \frac{n-p}{2} \sum_{i=1}^{r} \ln\bigl(1 - \hat\lambda_i\bigr), \tag{12.75}$$
where $c$ is a constant (independent of $r$) and $\hat\lambda_1, \dots, \hat\lambda_r$ are positive numbers as in
(12.72). This estimation procedure can be supplemented by a priori constraints
for the parameters in matrices $\alpha$ and $\beta$ in (12.66).
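The branching of the four steps can be summarized in a few lines. This is only a schematic sketch with a hypothetical function name; the inputs are the outcomes of the unit root tests (step 1) and the number $r$ of cointegration relations indicated by the cointegration test (step 2).

```python
def choose_model(unit_roots_rejected, r, m):
    """Map the outcomes of steps 1 and 2 to the model built in steps 1, 3 or 4."""
    if unit_roots_rejected:
        return "VAR in levels"                      # step 1
    if r == 0:
        return "VAR in first differences"           # step 3
    if 0 < r < m:
        return "EC model with %d cointegration relation(s)" % r   # step 4
    raise ValueError("r = m is excluded once unit roots were found in step 1")

print(choose_model(False, 1, 2))
```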
Example 12.10 In Table 12.18, an EC model for the time series $\{\mathrm{TB3}_t\}$ and $\{\mathrm{AAA}_t\}$ from Examples
12.1 and 12.3 is estimated by means of EViews (the first differences of these time
series are denoted as $\{\mathrm{DTB3}_t\}$ and $\{\mathrm{DAAA}_t\}$). Here the order of the VAR model
(12.66) is $p - 1 = 2$ (the order of the original model before differencing is $p = 3$),
and the intercepts are explicitly estimated both in the model (12.66) and in its
cointegration relation. The estimated EC model from Table 12.18 has the explicit
form
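$$\begin{pmatrix}\mathrm{DTB3}_t\\ \mathrm{DAAA}_t\end{pmatrix} = \begin{pmatrix}0.01\\ 0.03\end{pmatrix} + \begin{pmatrix}0.01\\ 0.03\end{pmatrix}\bigl(23.06 + \mathrm{TB3}_{t-1} - 3.21\,\mathrm{AAA}_{t-1}\bigr) + \begin{pmatrix}0.44 & 0.11\\ 0.10 & 0.53\end{pmatrix}\begin{pmatrix}\mathrm{DTB3}_{t-1}\\ \mathrm{DAAA}_{t-1}\end{pmatrix} + \begin{pmatrix}0.04 & 0.16\\ 0.10 & 0.27\end{pmatrix}\begin{pmatrix}\mathrm{DTB3}_{t-2}\\ \mathrm{DAAA}_{t-2}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}.$$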
⋄
12.6 Exercises
Exercise 12.1
Repeat Johansen tests of cointegration for time series $\{\mathrm{TB3}_t\}$ and $\{\mathrm{AAA}_t\}$ from
Example 12.9, but only for the 5-year period 1990–1994 (hint: both the test with
$\lambda_{\mathrm{trace}}(r)$ and the test with $\lambda_{\max}(r)$ indicate at the significance level of 5% no
cointegration relation ($r = 0$)).
Chapter 13
Multivariate Volatility Modeling
The models of volatility in Chap. 8 are univariate, i.e., they model the volatility quite
independently of other time series. It may be a drawback (particularly in finance)
since
• The effect of volatility spillover among various financial markets or among
various assets within the same financial market is a typical phenomenon in
finance.
• Correlations among particular components play a key role when constructing and
managing (diversified) investment portfolios.
For instance, let us consider so-called dynamic hedging, which is applied frequently when
reducing investment risk. The dynamic hedging is mostly realized in such a way that
the investor simultaneously enters opposite positions on markets with a mutually
inverse behavior, e.g., on spot and futures markets (in this context, the position
characterized as a purchase operation is denoted as long, and similarly, the position
characterized as a sale operation is denoted as short). Pragmatic investors
suppose that potential losses in one market may be balanced by profits from another
market that behaves just inversely. Therefore, they follow the so-called hedging ratio
$h$, which presents the number of units of futures contracts per one unit of (spot) assets
and should be optimal in terms of the risk reduction. For example, in the case of
a so-called short hedge of an investor, which means a long position in assets and
simultaneously a short position in futures, the value of the investor’s total position
(during hedging till the maturity of futures) changes by $\Delta S - h\Delta F$, where $\Delta S$ and
$\Delta F$ are the corresponding changes in spot and futures price, respectively. One can
easily show that the optimal hedging ratio minimizing $\mathrm{var}(\Delta S - h\Delta F)$ is
$$h_t = \rho_t \frac{\sigma_{st}}{\sigma_{ft}},$$
where $\sigma_{st}$ is the risk of the spot market (i.e., the standard deviation of changes of spot prices), $\sigma_{ft}$ is
the risk of the futures market (i.e., the standard deviation of changes of futures prices), and finally $\rho_t$ is
the correlation coefficient between both time series of price changes (the time index
$t$ emphasizes the fact that the described hedging is dynamic with corrections of the
hedging ratio realized at particular times). Obviously, in order to calculate $\{h_t\}$ one
makes use of the time series $\{\rho_t\}$ that records the dynamics of correlation between
both time series.
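The optimality of the hedging ratio can be checked on a small sample. The sketch below is a static illustration only: it estimates one ratio from hypothetical price changes with sample moments, whereas the dynamic version would recompute the moments at each time $t$; the function names are not taken from any library.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (sample variance when both arguments coincide)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def hedge_ratio(d_spot, d_fut):
    """h = rho * sigma_s / sigma_f computed from samples of price changes."""
    s_s, s_f = math.sqrt(cov(d_spot, d_spot)), math.sqrt(cov(d_fut, d_fut))
    rho = cov(d_spot, d_fut) / (s_s * s_f)
    return rho * s_s / s_f

def hedged_variance(d_spot, d_fut, h):
    """Sample variance of the change of the hedged position, dS - h dF."""
    pos = [s - h * f for s, f in zip(d_spot, d_fut)]
    return cov(pos, pos)

d_spot = [0.5, -0.2, 0.1, 0.4, -0.6, 0.3]   # hypothetical spot price changes
d_fut = [0.6, -0.1, 0.0, 0.5, -0.7, 0.2]    # hypothetical futures price changes
h = hedge_ratio(d_spot, d_fut)
print(round(h, 4), round(hedged_variance(d_spot, d_fut, h), 5))
```

Since the variance of the hedged position is a quadratic function of $h$, any other ratio yields a variance that is at least as large.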
In general, the multivariate volatility modeling plays an important role for the risk
control (e.g., for portfolio investment, but also for internal models in the framework
of regulatory methodologies Basel III of capital adequacy in banks or Solvency II in
insurance companies including commercial products of the type RiskMetrics, and
the like).
$$\hat\sigma_{ij,t} = (1-\lambda)\sum_{k=0}^{\infty} \lambda^{k}\bigl(y_{i,t-1-k} - \bar y_i\bigr)\bigl(y_{j,t-1-k} - \bar y_j\bigr) = (1-\lambda)\bigl(y_{i,t-1} - \bar y_i\bigr)\bigl(y_{j,t-1} - \bar y_j\bigr) + \lambda\,\hat\sigma_{ij,t-1}. \tag{13.1}$$
Here the estimated covariance $\hat\sigma_{ij,t}$ presents the mutual (co)volatility prediction from
time $t - 1$, $\bar y_i$ and $\bar y_j$ are mean levels of the given time series $\{y_{it}\}$ and $\{y_{jt}\}$, and $\lambda$
($0 < \lambda < 1$) is a discount constant chosen in advance. In the case of time series with
levels close to zero (which is mostly the case of log returns) one usually applies zero
mean returns in (13.1) so that the problem of their suitable choice is removed. Some
authors (e.g., Fleming et al. (2003)) summarize multivariate EWMA relations to the
matrix form
$$\hat\Sigma_t = \frac{\sum_{\tau=t-k}^{t-1} (y_\tau - \bar y)(y_\tau - \bar y)'}{k-1} \tag{13.3}$$
for a suitable length of sample period $k$. The values (13.3) are also denoted as
multivariate SMA of length $k$ (simple moving average; see, e.g., Chiriac and Voev
(2011)).
⋄
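Both estimators are easy to compute for one pair of components. The following minimal sketch implements the recursive form of (13.1) with zero mean levels and one element of (13.3); the starting value of the recursion, the use of the window mean in the SMA, the discount constant 0.94, and the data are assumptions made only for the illustration.

```python
def ewma_covariance(y_i, y_j, lam, start=0.0):
    """Recursive form of (13.1) with zero mean levels, one value per period."""
    sigma, out = start, []
    for a, b in zip(y_i, y_j):
        sigma = (1.0 - lam) * a * b + lam * sigma
        out.append(sigma)   # prediction for the period following (a, b)
    return out

def sma_covariance(y_i, y_j, k):
    """Element (i, j) of (13.3): sample covariance over the last k observations."""
    a, b = y_i[-k:], y_j[-k:]
    ma, mb = sum(a) / k, sum(b) / k   # window means (an assumption)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (k - 1)

r1 = [0.010, -0.020, 0.015, 0.005, -0.010]   # hypothetical log returns
r2 = [0.008, -0.015, 0.020, -0.002, -0.012]  # hypothetical log returns
print(ewma_covariance(r1, r2, 0.94)[-1])     # hypothetical discount constant
print(sma_covariance(r1, r2, 5))
```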
It is a direct analogy of the univariate implied volatility from Sect. 8.3.2. The mutual
(i.e., multivariate) volatility may be implied, e.g., by means of currency options. For
instance, if we deal with implied mutual volatility between currency rates USD/EUR
and USD/CNY, then it can be calculated (in time) by means of the relation
$$\hat\sigma_{\mathrm{USD/EUR},\,\mathrm{USD/CNY}} = \frac{\hat\sigma^2_{\mathrm{USD/EUR}} + \hat\sigma^2_{\mathrm{USD/CNY}} - \hat\sigma^2_{\mathrm{EUR/CNY}}}{2}, \tag{13.4}$$
where $\hat\sigma^2_{\mathrm{USD/EUR}}$ is the implied volatility of currency rate return USD/EUR and
similarly for $\hat\sigma^2_{\mathrm{USD/CNY}}$ and $\hat\sigma^2_{\mathrm{EUR/CNY}}$ (these implied volatilities are constructed
by means of quoted option premiums for returns of particular currency rates; see
Sect. 8.3.2).
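The relation (13.4) can be evaluated directly. In the sketch below, the three implied volatilities (10%, 8%, and 9% as standard deviations) are hypothetical numbers, not market quotes, and the function name is chosen only for the illustration.

```python
import math

def implied_covariance(var_usd_eur, var_usd_cny, var_eur_cny):
    """Implied covariance of USD/EUR and USD/CNY returns according to (13.4)."""
    return (var_usd_eur + var_usd_cny - var_eur_cny) / 2.0

# hypothetical implied volatilities expressed as standard deviations
s_usd_eur, s_usd_cny, s_eur_cny = 0.10, 0.08, 0.09
covariance = implied_covariance(s_usd_eur ** 2, s_usd_cny ** 2, s_eur_cny ** 2)
correlation = covariance / (s_usd_eur * s_usd_cny)
print(covariance, correlation)
```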
where, e.g., $\sigma_{11,t}$ denotes the volatility in the first component $\{y_{1t}\}$. Apparently, it is
a direct generalization of the volatility equation from the univariate model (8.59).
Moreover, in practice one often combines the volatility equation with the mean
equation (e.g., the mean equation in Example 13.1 is based on the VAR
methodology).
Example 13.1 Tsay (2002) constructed for monthly log returns of stocks IBM (time
series {rt1}) and index S&P 500 (time series {rt2}) during period 1926–1999 the
bivariate model of the form
This model obviously combines the bivariate model VAR(2) from Sect. 12.2
representing the mean equation and the bivariate GARCH(1,1) model (13.5)
representing the volatility equation (insignificant parameters are omitted). One
assumes the constant correlation between {et1} and {et2} estimated as 0.614.
⋄
In general, one can extend the univariate principle of conditional
heteroscedasticity in (8.16) to the m-variate case as
$$y_t = \Sigma_t^{1/2} \varepsilon_t, \tag{13.7}$$
where $\varepsilon_t$ are iid m-variate random vectors with zero mean vector and unit covariance
matrix, i.e., $\{\varepsilon_t\}$ is an iid multivariate white noise with
$$E(\varepsilon_t) = 0, \qquad \mathrm{var}(\varepsilon_t) = E(\varepsilon_t \varepsilon_t') = I \tag{13.8}$$
and $\Sigma_t^{1/2}$ is the square root matrix of the conditional covariance matrix $\Sigma_t$ expressed
in time $t$ as a suitable function of the information known till time $t - 1$. The matrix $\Sigma_t$
(or also $H_t$ by some authors) may be looked upon as a (co)volatility matrix since its
diagonal elements are univariate volatilities of particular univariate components of
$\{y_t\}$ and the elements outside the main diagonal are mutual volatilities
(or covolatilities) of these components (here the logical denotation is $\sigma_{11,t} = \sigma_{1t}^2$,
and the like). Moreover, it is necessary to respect some practical aspects which are
important for construction of such models using real data:
• One must guarantee that the matrix $\Sigma_t$ produced by the model at time $t$ is positive
definite (and therefore also symmetric).
13.3 Multivariate GARCH Models
These models attempt to model the conditional covariance matrix (i.e., volatility
matrix) directly using similar model instruments as the univariate GARCH models:
1. VEC Model
Vector model GARCH denoted simply as VEC (suggested by Bollerslev et al.
(1988)) models the volatility matrix Σt in (13.7) as
\[ \operatorname{vech}(\Sigma_t) = a_0 + \sum_{i=1}^{r} A_i \operatorname{vech}(y_{t-i} y_{t-i}') + \sum_{j=1}^{s} B_j \operatorname{vech}(\Sigma_{t-j}), \tag{13.9} \]
where $a_0$ is a vector and $A_i$ and $B_j$ are square matrices of parameters. The symbol
vech(·) is the so-called vector half operator, which stacks the lower triangular part of an
$m \times m$ matrix as an $m(m+1)/2 \times 1$ vector. For instance, the bivariate model GARCH
(1,1) of this type has
\[ \operatorname{vech}(\Sigma_t) = \begin{pmatrix} \sigma_{11,t} \\ \sigma_{21,t} \\ \sigma_{22,t} \end{pmatrix} = \begin{pmatrix} \sigma_{1t}^2 \\ \sigma_{21,t} \\ \sigma_{2t}^2 \end{pmatrix}, \qquad \operatorname{vech}(y_{t-1} y_{t-1}') = \begin{pmatrix} y_{1,t-1}^2 \\ y_{1,t-1}\, y_{2,t-1} \\ y_{2,t-1}^2 \end{pmatrix} \tag{13.10} \]
(the matrices $A_1$ and $B_1$ must be of the type $3 \times 3$ so that the corresponding
volatility equation contains 21 unknown parameters). In the general VEC model,
each element of $\Sigma_t$ is a linear function of the lagged squared and cross-product
observations and lagged values of the elements of $\Sigma_t$. The total number of unknown
parameters, $[1 + (r+s)m(m+1)/2]\,m(m+1)/2$, increases with growing dimension to
an intolerable level so that for higher $m$ the VEC model cannot be recommended in
routine practice.
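The vech operator and the parameter count quoted above can be checked with a few lines of code. The following is only a minimal sketch; the function names `vech` and `vec_param_count` and the numerical matrix are illustrative choices, not part of the original text.

```python
import numpy as np

def vech(matrix):
    """Stack the lower triangular part (diagonal included) column by column."""
    matrix = np.asarray(matrix)
    m = matrix.shape[0]
    return np.concatenate([matrix[j:, j] for j in range(m)])

def vec_param_count(m, r, s):
    """Number of parameters of the VEC(r, s) model according to the formula in the text."""
    k = m * (m + 1) // 2              # length of vech of an m x m matrix
    return (1 + (r + s) * k) * k

# Illustrative symmetric 2 x 2 matrix (hypothetical numbers).
sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])
vech_sigma = vech(sigma)              # ordered as (sigma_11, sigma_21, sigma_22)

# Growth of the parameter count with the dimension m for r = s = 1.
counts = {m: vec_param_count(m, 1, 1) for m in (2, 3, 5, 10)}
```

For m = 2 and r = s = 1 the count equals the 21 parameters mentioned above, and it grows roughly with the fourth power of m, which illustrates why the unrestricted VEC model is impractical in higher dimensions.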
356 13 Multivariate Volatility Modeling
2. DVEC Model
The diagonal vector model DVEC (see Bollerslev et al. (1988)) is a special case of the VEC
model (13.9) with diagonal parametric matrices $A_i$ and $B_j$. It not only reduces the
number of parameters to be estimated but also allows the model (13.9) to be rewritten in
a more transparent form
\[ \Sigma_t = A_0 + \sum_{i=1}^{r} A_i \circ (y_{t-i} y_{t-i}') + \sum_{j=1}^{s} B_j \circ \Sigma_{t-j}, \tag{13.11} \]
where $\circ$ denotes the Hadamard (elementwise) product of matrices.
The volatilities of both components of $\{y_t\}$ are modeled in the same way as in the
univariate GARCH(1,1) models of Sect. 8.3.5. The (scalar) equation of mutual
volatility has a similar structure, but now with the product of lagged values
$y_{1,t-1} \cdot y_{2,t-1}$. Unfortunately, the natural property that the volatility of a component
is affected by higher absolute values of another component in the past is not
guaranteed here.
The sufficient condition for positive definiteness of the matrix $\Sigma_t$ requires that the
matrix $A_0$ is positive definite and the matrices $A_i$ and $B_j$ are positive semidefinite. It can
be achieved by several alternative parametrizations:
(i) $A_0 = A_0^{1/2}(A_0^{1/2})'$, $A_i = A_i^{1/2}(A_i^{1/2})'$, $B_j = B_j^{1/2}(B_j^{1/2})'$, where the matrices $A_0^{1/2}$,
$A_i^{1/2}$, and $B_j^{1/2}$ are lower triangular matrices of the Cholesky decomposition (see
(12.16)).
(ii) $A_0 = A_0^{1/2}(A_0^{1/2})'$, $A_i = a_i a_i'$, $B_j = b_j b_j'$, where $a_i$ and $b_j$ are m-variate
parametric vectors.
(iii) $A_0 = A_0^{1/2}(A_0^{1/2})'$, $A_i = \alpha_i I_m$, $B_j = \beta_j I_m$, where $\alpha_i$ and $\beta_j$ are positive scalar
parameters and $I_m$ denotes the unit matrix of dimension m (the multivariate model
EWMA (13.1) can be looked upon as a special case of it). In particular, this case can
reduce the number of parameters substantially when the dimension m is higher.
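The effect of parametrization (ii) can be illustrated numerically. The sketch below iterates a bivariate DVEC(1,1) recursion of the form (13.11) and records the smallest eigenvalue of the resulting matrices; all numerical values and the function name `dvec_step` are hypothetical and serve only as an illustration.

```python
import numpy as np

def dvec_step(A0, A1, B1, y_prev, sigma_prev):
    """One step of DVEC(1,1): Sigma_t = A0 + A1 o (y y') + B1 o Sigma_{t-1}."""
    return A0 + A1 * np.outer(y_prev, y_prev) + B1 * sigma_prev

# Parametrization (ii): A0 from a lower triangular factor, A1 = a a', B1 = b b'.
L0 = np.array([[0.3, 0.0],
               [0.1, 0.2]])
A0 = L0 @ L0.T                      # positive definite
a = np.array([0.3, 0.4])
b = np.array([0.9, 0.8])
A1 = np.outer(a, a)                 # positive semidefinite
B1 = np.outer(b, b)                 # positive semidefinite

rng = np.random.default_rng(0)
sigma = np.eye(2)
min_eig = np.inf
for _ in range(200):
    # Draw y_t with conditional covariance sigma, then update the volatility matrix.
    y = np.linalg.cholesky(sigma) @ rng.standard_normal(2)
    sigma = dvec_step(A0, A1, B1, y, sigma)
    min_eig = min(min_eig, np.linalg.eigvalsh(sigma).min())
```

Since the Hadamard product of positive semidefinite matrices is again positive semidefinite, every matrix produced by the recursion dominates $A_0$, so the smallest eigenvalue never falls below that of $A_0$.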
3. BEKK Model
The BEKK model, named after the initials of its authors (Baba, Engle, Kraft, Kroner; see
Engle and Kroner (1995)), guarantees the positive definiteness of the volatility matrix $\Sigma_t$
automatically, in contrast to the previous models. Namely, this matrix is
modeled as
\[ \Sigma_t = A_0 + \sum_{i=1}^{r} A_i' y_{t-i} y_{t-i}' A_i + \sum_{j=1}^{s} B_j' \Sigma_{t-j} B_j, \tag{13.13} \]
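where all parametric matrices are quite general of dimension $m \times m$ and only $A_0$ is
required to be symmetric and positive definite (it can be achieved by a suitable
parametrization of $A_0$ as in the case of model DVEC; see above). For instance, the
volatility matrix (13.13) for the bivariate model GARCH(1,1) of type BEKK can be
written by means of three scalar (co)volatility equations:
\[ \sigma_{11,t} = \alpha_{11,0} + \alpha_{11,1}^2 y_{1,t-1}^2 + 2\alpha_{11,1}\alpha_{21,1} y_{1,t-1} y_{2,t-1} + \alpha_{21,1}^2 y_{2,t-1}^2 + \beta_{11,1}^2 \sigma_{11,t-1} + 2\beta_{11,1}\beta_{21,1} \sigma_{12,t-1} + \beta_{21,1}^2 \sigma_{22,t-1}, \tag{13.14} \]
\[ \sigma_{12,t} = \alpha_{12,0} + \alpha_{11,1}\alpha_{12,1} y_{1,t-1}^2 + (\alpha_{11,1}\alpha_{22,1} + \alpha_{12,1}\alpha_{21,1}) y_{1,t-1} y_{2,t-1} + \alpha_{21,1}\alpha_{22,1} y_{2,t-1}^2 + \beta_{11,1}\beta_{12,1} \sigma_{11,t-1} + (\beta_{11,1}\beta_{22,1} + \beta_{12,1}\beta_{21,1}) \sigma_{12,t-1} + \beta_{21,1}\beta_{22,1} \sigma_{22,t-1}, \tag{13.15} \]
\[ \sigma_{22,t} = \alpha_{22,0} + \alpha_{12,1}^2 y_{1,t-1}^2 + 2\alpha_{12,1}\alpha_{22,1} y_{1,t-1} y_{2,t-1} + \alpha_{22,1}^2 y_{2,t-1}^2 + \beta_{12,1}^2 \sigma_{11,t-1} + 2\beta_{12,1}\beta_{22,1} \sigma_{12,t-1} + \beta_{22,1}^2 \sigma_{22,t-1}. \tag{13.16} \]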
Finally, the BEKK model has the desirable property that the volatility of any
component may be affected by higher absolute values of another component in the
past (e.g., a higher absolute value of $y_{2,t-1}$ in (13.14) raises the volatility $\sigma_{11,t}$). If no
interactions among volatilities occur, then $\alpha_{21,1} = \beta_{21,1} = 0$ in
Eq. (13.14) and $\alpha_{12,1} = \beta_{12,1} = 0$ in Eq. (13.16), so that only the diagonal
parameters $\alpha_{11,1}$, $\alpha_{22,1}$, $\beta_{11,1}$, $\beta_{22,1}$ affect the mutual volatility
in Eq. (13.15) (however, this does not mean that the BEKK model reduces to the
DVEC model in such a case).
Particular parametrizations reducing the total number of parameters are, e.g.:
(i) Diagonal BEKK models with diagonal matrices $A_i$ and $B_j$.
(ii) Scalar BEKK models with matrices $A_i = \alpha_i I_m$ and $B_j = \beta_j I_m$, where $\alpha_i$ and $\beta_j$
are positive scalar parameters.
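The relation between the matrix form (13.13) and the scalar equation (13.14) can be verified numerically. The following sketch evaluates one step of a bivariate BEKK(1,1) recursion in matrix form and compares its (1,1) element with (13.14) written out term by term; the parameter values and the name `bekk_step` are hypothetical.

```python
import numpy as np

def bekk_step(A0, A1, B1, y_prev, sigma_prev):
    """One step of BEKK(1,1): Sigma_t = A0 + A1' y y' A1 + B1' Sigma_{t-1} B1."""
    return A0 + A1.T @ np.outer(y_prev, y_prev) @ A1 + B1.T @ sigma_prev @ B1

# Hypothetical parameter matrices; A1[k, l] plays the role of alpha_{k+1, l+1, 1}.
A0 = np.array([[0.10, 0.02],
               [0.02, 0.08]])
A1 = np.array([[0.30, 0.05],
               [0.10, 0.25]])
B1 = np.array([[0.90, 0.02],
               [0.03, 0.85]])
y_prev = np.array([1.5, -2.0])
sigma_prev = np.array([[1.0, 0.3],
                       [0.3, 2.0]])

sigma_t = bekk_step(A0, A1, B1, y_prev, sigma_prev)

# Scalar equation (13.14) for sigma_{11,t}, written out term by term.
a11, a21 = A1[0, 0], A1[1, 0]
b11, b21 = B1[0, 0], B1[1, 0]
y1, y2 = y_prev
sigma_11 = (A0[0, 0]
            + a11**2 * y1**2 + 2 * a11 * a21 * y1 * y2 + a21**2 * y2**2
            + b11**2 * sigma_prev[0, 0]
            + 2 * b11 * b21 * sigma_prev[0, 1]
            + b21**2 * sigma_prev[1, 1])
```

The two values coincide, and the resulting matrix is positive definite without any restriction on $A_1$ and $B_1$, because both quadratic forms in (13.13) are positive semidefinite by construction.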
The models of this type primarily model the conditional correlation matrix, while the
volatilities of particular scalar components are modeled by means of the univariate
GARCH instruments as in Sects. 8.3.5 and 8.3.6.
1. CCC Model
The CCC model (constant conditional correlations; see Bollerslev (1990)) uses for
(13.7) the conditional covariance matrix $\Sigma_t$ of the form
\[ \Sigma_t = \Delta_t R \Delta_t, \tag{13.17} \]
\[ \sigma_{kk,t} = \alpha_{0k} + \sum_{i=1}^{r_k} \alpha_{ki}\, y_{k,t-i}^2 + \sum_{j=1}^{s_k} \beta_{kj}\, \sigma_{kk,t-j}, \qquad k = 1, \dots, m \tag{13.18} \]
(the modifications from Sect. 8.3.6 are also possible, e.g., EGARCH and others).
According to (13.7) and (13.17) the transformed process
\[ z_t = \Delta_t^{-1} y_t \tag{13.19} \]
\[ \Sigma_t = \Delta_t R_t \Delta_t, \tag{13.20} \]
where $\Delta_t = \operatorname{diag}\{\sqrt{\sigma_{1t}}, \dots, \sqrt{\sigma_{mt}}\}$ is, similarly to (13.17), the diagonal matrix
whose diagonal elements are the square roots of univariate volatilities so that one
can again utilize univariate GARCH models to estimate them. In DCC models, the
matrix $R_t$ can vary in time and one obtains it as
\[ \alpha_i \ge 0, \quad \beta_j \ge 0, \quad \alpha_1 + \dots + \alpha_r + \beta_1 + \dots + \beta_s < 1. \tag{13.23} \]
The model DCC is inspired by the univariate GARCH model (8.55) since such a
model under the assumption of stationarity with a constant (unconditional) variance
σ 2 can be rewritten in the form
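\[ \sigma_t^2 = \left( 1 - \sum_{i=1}^{r} \alpha_i - \sum_{j=1}^{s} \beta_j \right) \sigma^2 + \sum_{i=1}^{r} \alpha_i\, y_{t-i}^2 + \sum_{j=1}^{s} \beta_j\, \sigma_{t-j}^2. \tag{13.24} \]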
According to this analogy, the correlation matrix R in (13.22) can be looked upon as
such a part of the volatility equation that models the systematic correlatedness.
Apparently if all parameters αi and βj are zero, then we return to the CCC model
(13.17). To estimate DCC models, one can use special methods which are again
based on the devolatilized values estimated by means of univariate GARCH models.
In particular, the matrix R for (13.22) can be set to its empirical counterpart (e.g.,
applying the rolling window sample estimation to devolatilized values) or it can be
estimated as a parametric (positive definite) matrix in addition to remaining param-
eters of the type α and β. An alternative form of the dynamic matrix process {Qt} in
(13.22) was suggested by Tse and Tsui (2002).
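The analogy with (13.24) can be made concrete by a small simulation. The sketch below assumes the commonly used DCC(1,1) form, in which an auxiliary matrix $Q_t = (1 - \alpha - \beta) R + \alpha z_{t-1} z_{t-1}' + \beta Q_{t-1}$ is rescaled to a correlation matrix $R_t$ with unit diagonal; this specific form, the function name `dcc_step`, and the values α = 0.05 and β = 0.90 are assumptions made for illustration only.

```python
import numpy as np

def dcc_step(R_bar, alpha, beta, z_prev, Q_prev):
    """One DCC(1,1) update (assumed form): returns the pair (Q_t, R_t)."""
    Q = (1 - alpha - beta) * R_bar + alpha * np.outer(z_prev, z_prev) + beta * Q_prev
    d = 1.0 / np.sqrt(np.diag(Q))
    R = Q * np.outer(d, d)            # rescale Q_t to unit diagonal
    return Q, R

# Systematic correlation matrix R (0.664 is the HUF/PLN value quoted in Example 13.3).
R_bar = np.array([[1.0, 0.664],
                  [0.664, 1.0]])
alpha, beta = 0.05, 0.90              # hypothetical values satisfying (13.23)

rng = np.random.default_rng(1)
chol = np.linalg.cholesky(R_bar)
Q = R_bar.copy()
correlations = []
for _ in range(500):
    z = chol @ rng.standard_normal(2)  # stand-in for devolatilized values z_t
    Q, R = dcc_step(R_bar, alpha, beta, z, Q)
    correlations.append(R[0, 1])

# With alpha = beta = 0 the update returns the constant matrix R (CCC case).
_, R_ccc = dcc_step(R_bar, 0.0, 0.0, z, Q)
```

The simulated conditional correlations fluctuate around the systematic level given by R, while setting all parameters α and β to zero reproduces the constant conditional correlations of the CCC model.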
Lin (1992), Vrontos et al. (2003), and others). Moreover, these factors are frequently
conditionally heteroscedastic and possess the GARCH structure.
The factor approach has an advantage that it reduces the dimensionality when the
number of factors K relative to the dimension m of given multivariate time series {yt}
is small. Engle et al. (1990) defined their factor models as follows. They assumed
that the conditional covariance matrix Σt is generated by K (K < m) factor volatilities
fkk,t corresponding to K underlying (not necessarily uncorrelated) factors, i.e.:
\[ \Sigma_t = \Omega + W F_t W' = \Omega + \sum_{k=1}^{K} w_k w_k' f_{kk,t}, \tag{13.25} \]
where $\omega_k$, $\alpha_k$, and $\beta_k$ are scalar parameters, $v_k$ are $m \times 1$ vectors of weights, and $f_{kk,t}$
are factor volatilities ($k = 1, \dots, K$). In any case, the number of factors K is intended
to be much smaller than the number of assets m, which makes the model feasible even
for a large number of assets.
In the previous model the factors are generally correlated. This may be undesir-
able when it turns out that several of the factors capture very similar characteristics of
the data. On the other hand, if the factors were uncorrelated, they would represent
really different components that drive the data. The uncorrelatedness of factors can
be achieved by means of various orthogonal transformations in O-GARCH (orthog-
onal GARCH) models (see Alexander and Chibumba (1997)) and GO-GARCH
(generalized orthogonal GARCH) models (see van der Weide (2002), Lanne and
Saikkonen (2007) and others).
Another way to obtain a small number of orthogonal factors consists in
the application of the principal component analysis (PCA),
which is based (similarly to orthogonal GARCH models) on the eigenvalues and
eigenvectors of the conditional covariance matrix $\Sigma_t$ of the given multivariate time series
(in more detail, the first few principal components that explain a high percentage of
the variability of the process are identified as common factors; see Example 13.2).
Example 13.2 For the same bivariate time series as in Example 13.1 (the compo-
nent {rt1} of monthly log returns of stocks IBM and {rt2} of monthly log returns of
index S&P 500 during period 1926–1999), Tsay (2002) constructed the bivariate
GARCH model applying the factor approach. At first the single common factor {xt}
as the first principal component was constructed explaining 82.5% of variability of
{rt1} and {rt2}:
\[ x_t = 0.796\, r_{1t} + 0.605\, r_{2t}. \]
Then this factor $\{x_t\}$ was modeled by means of the following univariate GARCH
model:
\[ x_t = 1.317 + 0.096\, x_{t-1} + e_t, \qquad e_t = \sigma_t \varepsilon_t, \]
\[ \sigma_t^2 = 3.834 + 0.110\, e_{t-1}^2 + 0.825\, \sigma_{t-1}^2. \]
Finally, the bivariate model exploiting {σ t2} as the common volatility factor was
constructed:
In the literature and in various software systems, one can find various approaches to
estimating particular types of multivariate GARCH models. The most frequently recommended
methods rely on maximum likelihood, similarly to the univariate
case (see, e.g., (8.48)). This approach usually maximizes the log likelihood function
of the form (up to a constant)
\[ -\frac{1}{2} \sum_{t=1}^{n} \ln |\Sigma_t| + \sum_{t=1}^{n} \ln g\!\left( \Sigma_t^{-1/2} y_t \right), \tag{13.27} \]
\[ -\frac{1}{2} \sum_{t=1}^{n} \ln |\Sigma_t| - \frac{1}{2} \sum_{t=1}^{n} y_t' \Sigma_t^{-1} y_t. \tag{13.28} \]
It should be recalled here that the normality of innovations is often rejected in
financial applications (mainly with daily or weekly data): the kurtosis of most
financial asset returns is larger than three and the tails are often fatter than what is
implied by a conditional normal distribution. Fortunately, Bollerslev and
Wooldridge (1992) have shown that a consistent estimator of unknown parameters
(and under some assumptions even a strongly consistent one; see Gourieroux (1997) or
Jeantheau (1998)) can be obtained by maximizing (13.28) even if the distribution
of the generating process is not normal. Such an estimator is then called the (Gaussian)
quasi-maximum likelihood (QML) or pseudo-maximum likelihood (PML) estimator. Example
13.3 demonstrates the application of this estimation in financial practice.
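The Gaussian quasi log likelihood (13.28) is straightforward to evaluate once the sequence of matrices $\Sigma_t$ is available. The following sketch is not the estimation routine of any particular software system; the function name `gaussian_quasi_loglik` and the simulated heavy-tailed data are illustrative assumptions.

```python
import numpy as np

def gaussian_quasi_loglik(y, sigmas):
    """Evaluate (13.28) for observations y (n x m) and covariance matrices sigmas (n x m x m)."""
    total = 0.0
    for y_t, sigma_t in zip(y, sigmas):
        _, logdet = np.linalg.slogdet(sigma_t)             # ln |Sigma_t|
        quad = y_t @ np.linalg.solve(sigma_t, y_t)          # y_t' Sigma_t^{-1} y_t
        total += -0.5 * logdet - 0.5 * quad
    return total

# Heavy-tailed (non-normal) data: Student t with 5 degrees of freedom.
rng = np.random.default_rng(5)
y = rng.standard_t(df=5, size=(1000, 2))
n = len(y)

# Among constant matrices, (13.28) is maximized by the sample second-moment matrix.
S = y.T @ y / n
ll_sample = gaussian_quasi_loglik(y, np.broadcast_to(S, (n, 2, 2)))
ll_scaled = gaussian_quasi_loglik(y, np.broadcast_to(2.0 * S, (n, 2, 2)))
```

In practice the matrices $\Sigma_t$ would be generated by the chosen model (e.g., BEKK or DCC) as functions of its parameters, and the function above would be passed to a numerical optimizer.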
Example 13.3 Hendrych and Cipra (2016) analyzed the mutual currency risk. By
means of the software system EViews for multivariate GARCH models, one esti-
mated the mutual volatilities (covolatilities) of six European currencies in the period
from January 5, 2007, to April 27, 2012 (i.e., 1362 observations for each currency),
namely for the Czech crown (CZK), the British pound sterling (GBP), the Hungarian
forint (HUF), the Polish zloty (PLN), the Romanian leu (RON), and the Swedish
krona (SEK). In the EU27, 17 member countries used the Euro currency; other three
states (Denmark, Latvia, and Lithuania) were members of the ERM II regime (the
European Exchange Rate Mechanism II), i.e., the national currencies were allowed
to fluctuate around their assigned value with respect to limiting bounds; the
Bulgarian lev was pegged to the euro. Therefore, only six remaining currencies
(see above) were not linked to a currency mechanism and were used in the case
study.
More precisely, one modeled the daily log returns on the bilateral exchange rates
of six currencies with euro as the denominator. Table 13.1 delivers the sample
characteristics of the data collected from the European Central Bank in 2013, e.g.,
the maximum log return of the Czech crown versus euro in the given period was
3.17%.
Table 13.1 Sample characteristics of daily log returns on exchange rates for selected currencies
versus euro in the period from January 5, 2007, to April 27, 2012 (1362 observations for each
currency) from Example 13.3
                 CZK        GBP        HUF        PLN        RON        SEK
Mean             0.00007    0.00014    0.00010    0.00006    0.00019    0.00001
Median           0.00007    0.00012    0.00024    0.00014    0.00000    0.00004
Maximum          0.03165    0.03461    0.05069    0.04164    0.02740    0.02784
Minimum         −0.03274   −0.02657   −0.03389   −0.03680   −0.01992   −0.02260
Std. deviation   0.00478    0.00601    0.00763    0.00721    0.00462    0.00497
Skewness         0.20218    0.30655    0.42056    0.30802    0.54616    0.31526
Kurtosis         8.49754    6.49258    7.80556    8.05110    7.37830    6.05079
Source: Hendrych and Cipra (2016)
The analysis of the corresponding six-variate process must start by modeling its
conditional mean value. For this purpose vector autoregression (VAR) appears
suitable, similarly to Examples 13.1 and 13.2, namely VAR(3) (see
(12.25)). The analysis of the correlation structure is then performed using only the
deviations from the conditional mean (prediction errors) $e_{it}$ that originate in
particular components of the process, applying alternatively the six-variate models
GARCH(1,1) of the type CCC, DCC, or scalar BEKK (denoted as sBEKK):
1. In the case of model CCC one must construct at first particular models for
univariate volatilities. Here the EGARCH(1,1) models (see (8.74)) seem to be
acceptable; e.g., for the deviations from the conditional mean {e1t} in the case of
log returns of the Czech crown versus euro one obtains
\[ \ln \sigma_{11,t} = 0.280 + 0.154\, \frac{|e_{1,t-1}|}{\sqrt{\sigma_{11,t-1}}} + 0.985 \ln \sigma_{11,t-1} + 0.004\, \frac{e_{1,t-1}}{\sqrt{\sigma_{11,t-1}}}. \]
Then it suffices to estimate the constant correlation matrix R (see Table 13.2) by
means of the devolatilization (13.19). Figure 13.1 plots the constant conditional
correlation 0.664 for the pair HUF/EUR and PLN/EUR only.
Table 13.2 CCC and DCC estimation of (constant) correlation matrix from Example 13.3 (daily
log returns on exchange rates for six selected currencies versus euro in period from January 5, 2007,
to April 27, 2012)
        CZK       GBP       HUF       PLN       RON       SEK
CZK     1.00000   0.05632   0.39278   0.41581   0.18822   0.16915
GBP     0.05632   1.00000   0.04448   0.02724   0.01421   0.07426
HUF     0.39278   0.04448   1.00000   0.66421   0.44590   0.32100
PLN     0.41581   0.02724   0.66421   1.00000   0.41214   0.36136
RON     0.18822   0.01421   0.44590   0.41214   1.00000   0.21656
SEK     0.16915   0.07426   0.32100   0.36136   0.21656   1.00000
Source: Hendrych and Cipra (2016)
[Figure 13.1: line plot of the conditional correlation HUF/EUR vs. PLN/EUR (vertical axis from 0.20 to 1.00) over the years 2007–2012 (horizontal axis) for the models CCC, DCC, and sBEKK]
Fig. 13.1 Conditional correlations among daily log returns of exchange rates HUF/EUR and
PLN/EUR estimated by means of models GARCH(1,1) of type CCC, DCC, and scalar BEKK
from Example 13.3 (daily log returns on exchange rates for six selected currencies versus euro in the
period from January 5, 2007, to April 27, 2012). Source: Hendrych and Cipra (2016)
2. In the case of model DCC one starts similarly to that in the previous case of CCC
model with the univariate volatilities EGARCH(1,1) for the deviations from the
conditional mean {eti}. The estimation of the dynamic volatility matrix Σt is
obtained according to the formulas (13.20)–(13.22), where
with the same estimated correlation matrix R as in the previous case of the model
CCC (see Table 13.2) and devolatilized process {zt} according to (13.19).
Figure 13.1 plots the dynamic conditional correlation again for the pair
HUF/EUR and PLN/EUR only.
3. Finally, in the case of the scalar BEKK model (denoted as sBEKK), the estimated
model (13.13) with $A_1 = \alpha_1 I_6$, $B_1 = \beta_1 I_6$ for the dynamic volatility matrix $\Sigma_t$ has
the form
where {et} is the corresponding multivariate process of the deviations from the
conditional mean. Figure 13.1 again plots the dynamic conditional correlation for
the pair HUF/EUR and PLN/EUR only.
From the pragmatic point of view, the estimated models help to draw conclusions,
e.g., on the average level of particular conditional correlations (see the estimated
correlation matrix R in Table 13.2): as far as the currency risk is concerned, the British
pound influences the behavior of the remaining five currencies only to a negligible extent;
the Hungarian forint and the Polish zloty are rather strongly correlated both
mutually (see also Fig. 13.1) and with the Czech crown, etc. The models DCC
and sBEKK add an important dynamic aspect to the analysis.
⋄
In the multivariate case, new aspects of risk measures from Sect. 11.1 appear which
are related to the multivariate volatility modeling (and even to the multivariate
GARCH models if the data have the form of financial asset returns). The risk
measures of the type value at risk (VaR) are typical in this context.
1. CoVaR
Adrian and Brunnermeier (2008) showed empirically that the stress state of some
financial institutions (mainly big banks, but also insurance companies, mortgage
agencies, and others) can raise significantly the value at risk of the global financial
system (even by 50%). Therefore, specific risk measures were introduced:
13.4 Conditional Value at Risk 365
$\mathrm{CoVaR}^{j|i}$ (conditional VaR or contagion VaR) is the value at risk (11.1) of the jth
subject (e.g., a bank) with a possible loss $X_j$ under the condition that the ith subject
(e.g., another bank) with a possible loss $X_i$ finds itself in a crisis situation or
emergency ($i, j = 1, \dots, N$), i.e.
\[ P\left( X_j \le \mathrm{CoVaR}_\alpha^{j|i} \mid X_i = \mathrm{VaR}_\alpha^{i} \right) = \alpha, \tag{13.29} \]
where $\mathrm{VaR}_\alpha^{i}$ is the value at risk of the ith subject at the confidence level α (e.g.,
α = 0.99).
$\mathrm{CoVaR}^{|i}$ is another conditional value at risk which measures the “contagion”
spread caused by the subject with loss $X_i$ to the global system with loss X (e.g., the
impact of a defaulting bank on the whole bank system), i.e.:
\[ P\left( X \le \mathrm{CoVaR}_\alpha^{|i} \mid X_i = \mathrm{VaR}_\alpha^{i} \right) = \alpha. \tag{13.30} \]
where $\mathrm{CoVaR}^{|i}$ is given by (13.30) and $\mathrm{VaR}_\alpha$ is the value at risk corresponding to the
loss X of the global system.
This methodology can be modified for the log returns {rit} of financial assets from
a global system with the log return {rmt} (a global financial market or a security
index; see Brownlees and Engle (2012)). As a special case, one could even consider
a simple portfolio situation
\[ r_{mt} = \sum_{i=1}^{N} w_{it}\, r_{it}, \tag{13.32} \]
where $w_{it}$ is the relative market capitalization (i.e., the weight) of the ith asset at time
t (the scheme (13.32) corresponds, e.g., to the construction of stock indices). In any
case, one can deal with suitable values at risk also in this modified situation:
$\mathrm{CoVaR}_t^{m \mid r_{it} = \mathrm{VaR}_{it}(\alpha)}$ is the conditional value at risk corresponding to the value at
risk of the market return under the condition that the ith asset finds itself in a crisis
situation:
mjr ¼VaRit ðαÞ
P r mt CoVaRt it jr it ¼ VaRit ðαÞ ¼ α, ð13:33Þ
so that (13.33) is the modification of (13.30) in this (portfolio) context. Note the
inequality sign in (13.33) since the loss consists in drops of returns so that typical
values at risk are negative return (losses have the negative sign in this context).
$\Delta\mathrm{CoVaR}_{it}(\alpha)$ is defined as the difference between the value at risk of the global
market conditionally on the ith asset being in financial distress and the value at risk
of the global market conditionally on the asset i being in its median state:
2. MES
Marginal expected shortfall (MES) is based on the concept of the expected shortfall
ES (the ES at level α is the expected return in the worst α% of the cases; see (11.14)).
The expected shortfall is usually preferred among risk measures in today’s financial
practice (due to its coherence and other properties giving to it preferences, e.g., in
comparison with the classical value at risk approach; see Artzner et al. (1999) or
Yamai and Yoshiba (2005)):
$\mathrm{MES}_{it}(C)$ is the conditional version of ES, in which the global returns exceed
a given market drop C which is chosen as a suitable threshold value (C < 0, i.e.,
measures of the type MES, similarly to CoVaR, are again typically negative in the context
of (log) returns):
The symbol $E_{t-1}$ means that the symbol $\mathrm{MES}_{it}(C)$ is understood conditionally
as $\mathrm{MES}_{i,t|t-1}(C)$, i.e., computed at time t given the information available at time
t − 1; see also the commentary to (8.12) concerning the symbols of the type $\sigma^2_{t|t-1}$.
Such a concept seems to be productive in various applications, e.g., when dealing
with so-called systemic risk and systemically important financial institutions (SIFI)
whose distress or disorderly failure, because of their size, complexity, and systemic
interconnectedness, would cause significant disruption to the wider financial system
and economic activity. If the conditional ES of the system is formally defined as
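\[ \mathrm{ES}_{mt}(C) = E_{t-1}(r_{mt} \mid r_{mt} < C) = \sum_{i=1}^{N} w_{it}\, E_{t-1}(r_{it} \mid r_{mt} < C) \tag{13.36} \]
then it holds
\[ \mathrm{MES}_{it}(C) = \frac{\partial\, \mathrm{ES}_{mt}(C)}{\partial w_{it}}. \tag{13.37} \]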
Hence MES measures the increase in the risk of the system (measured by the ES)
induced by a marginal increase in the weight of ith subject of the system (the higher
the subject’s MES, the higher the individual contribution of this subject to the risk of
the financial system).
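The decomposition (13.36) can be illustrated with simulated data, replacing the conditional expectations by sample means over the periods in which the market return falls below the threshold. In the sketch below the one-factor return structure, the weights, and the choice of C as the empirical 5% quantile are hypothetical; the weights are also held fixed in time for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 100_000, 3
weights = np.array([0.5, 0.3, 0.2])              # hypothetical weights w_i (sum to one)

# Hypothetical returns driven by one common shock plus idiosyncratic noise.
common = rng.standard_normal(n)
loadings = np.array([1.0, 0.8, 0.5])
returns = 0.01 * (np.outer(common, loadings) + rng.standard_normal((n, N)))

market = returns @ weights                       # r_mt as in (13.32)
C = np.quantile(market, 0.05)                    # threshold: empirical 5% quantile
tail = market < C                                # periods with a systemic event

mes = returns[tail].mean(axis=0)                 # sample MES_i(C) = mean(r_i | r_m < C)
es_market = market[tail].mean()                  # sample ES_m(C)  = mean(r_m | r_m < C)
```

The weighted sum of the individual MES values reproduces the expected shortfall of the market exactly, so each product $w_i \cdot \mathrm{MES}_i$ can be read as the contribution of the ith subject to the risk of the system.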
Table 13.3 Constituents of Prague Stock Exchange index (PX index) from Example 13.4
Stock name (abbrev.) Stock name Obs. from Obs. to
PX Prague Stock Exchange Index Jan 6, 2000 May 9, 2016
AAA AAA Auto Sep 25, 2007 Jul 3, 2013
VIG Vienna Insurance Group Feb 6, 2008 May 9, 2016
CEZ CEZ Jan 6, 2000 May 9, 2016
CETV Central European Media Enterprises Jun 28, 2005 May 9, 2016
ECM ECM Real Estate Investments Dec 8, 2006 Jul 20, 2011
ERSTE Erste Group Bank Oct 2, 2002 May 9, 2016
KB Komercni banka Jan 6, 2000 May 9, 2016
NWR New World Resources PLC May 7, 2008 May 9, 2016
O2 O2 CR Jan 6, 2000 May 9, 2016
ORCO Orco Property Group SA Feb 2, 2005 Sep 19, 2014
PEGAS Pegas Nonwovens SA Dec 19, 2006 May 9, 2016
PHILMOR Philip Morris CR Oct 9, 2000 May 9, 2016
UNIPETROL Unipetrol Jan 6, 2000 May 9, 2016
ZENTIVA Zentiva Jun 29, 2004 Apr 27, 2009
Source: Cipra and Hendrych (2017)
Example 13.4 The case study by Cipra and Hendrych (2017) examines the sys-
temic risk for the Prague Stock Exchange index (PX index) constituents (see
Table 13.3). In order to calculate MES for each involved firm i (i ¼ 1, . . ., N ), one
implemented GARCH modeling schemes for each bivariate process of the daily firm
and market log returns rit and rmt (see also Brownlees and Engle (2012)):
\[ \begin{aligned} r_{it} &= \sigma_{it} \varepsilon_{it} = \sigma_{it} \rho_{im,t}\, \varepsilon_{mt} + \sigma_{it} \sqrt{1 - \rho_{im,t}^2}\; \zeta_{it}, \\ r_{mt} &= \sigma_{mt} \varepsilon_{mt}, \end{aligned} \tag{13.38} \]
where the shocks $(\zeta_{it}, \varepsilon_{mt})$ are independent and identically distributed in time with
zero mean, unit variance, and zero covariance. A mutual independence of these
shocks is not assumed: on the contrary, there are reasons to believe that extreme
values of $\varepsilon_{mt}$ and $\zeta_{it}$ interact (when the market is in its tail, the firm disturbances may
be even further in the tail if there is serious risk of default). Obviously, the modeling
scheme (13.38) guarantees that the conditional variances and correlation of $r_{it}$ and
$r_{mt}$ are $\sigma_{it}^2\ (= \sigma_{ii,t})$, $\sigma_{mt}^2\ (= \sigma_{mm,t})$ and $\rho_{im,t}$, respectively.
The specification is completed by the description of conditional (co)moments. The
volatilities $\sigma_{it}^2$ and $\sigma_{mt}^2$ were modeled as the univariate GJR GARCH(1,1) models (see
(8.72))
with $I^{-}_{i,t} = 1$ for $r_{it} < 0$ and 0 otherwise, $I^{-}_{m,t} = 1$ for $r_{mt} < 0$ and 0 otherwise (this
threshold GARCH modification covers the leverage effect, i.e., the tendency of
volatility to increase more with bad news (negative log returns) than with
good news (positive log returns)). The time-varying correlations are captured by using
GARCH(1,1) of the type DCC (also with an asymmetric modification; see Engle
(2009)). For example, the conditional covariance matrix (13.20) has the form
\[ \Sigma_t = \Delta_t R_t \Delta_t = \begin{pmatrix} \sigma_{it} & 0 \\ 0 & \sigma_{mt} \end{pmatrix} \begin{pmatrix} 1 & \rho_{im,t} \\ \rho_{im,t} & 1 \end{pmatrix} \begin{pmatrix} \sigma_{it} & 0 \\ 0 & \sigma_{mt} \end{pmatrix}. \tag{13.40} \]
The previous modeling scheme was applied to 14 firms, which have been
included into the PX index basis according to their market capitalization as of the
end of June 2008. One extracted the daily log returns from January 6, 2000, to May
9, 2016 (these data are unbalanced in that sense that not all companies have been
continuously traded during the sample period; see Table 13.3). Selected sample
characteristics of the studied log returns are presented in Table 13.4. One made use
of the estimation methodology for multivariate GARCH models from Sect. 13.3.4.
Figure 13.2 displays conditional volatilities of all investigated firms jointly with
the PX index (market) conditional volatility. Apparently, all graphs are significantly
influenced by the explosion in variability during the financial crisis 2008. Further-
more, one identifies the similar trend over many charts that is in line with the market
volatility trend. On the contrary, several log return time series are dominated by
other effects, which are not common for the whole market or for other returns. For
instance, one can mention the volatility of O2 log returns, which was increased due
to the split of the company in 2015.
Figure 13.3 shows the estimated time-varying correlations ρim,t between returns
of the ith company and the PX index (market). It is evident that the financial returns
of involved firms are significantly positively correlated with the market financial
returns. However, one can identify different behavior of correlations displayed in
particular plots. Some correlations are relatively stable when comparing with others;
see, e.g., CEZ, KB, PHILMOR, or VIG; others demonstrate trends varying in time,
e.g., O2, PEGAS, or UNIPETROL.
Table 13.5 reports the examined stocks listed in ascending order regarding the
one-step ahead MES predicted for October 20, 2008, which was a very critical date
from the point of view of the financial crisis 2008. The threshold C (see (13.35)) was
set as the unconditional VaR of the PX index log returns with the confidence level
99%. Under the distress condition of the global market, the short-run prediction
produced by the model, e.g., for ERSTE indicates a deep drop over 25%.
Table 13.4 Sample characteristics of the log returns from Example 13.4 (systemic risk analysis for constituents of PX index)
Stock   # obs   Mean      Std. dev   Median    Min        Max       Skew      Kurt
PX      4101    0.00014   0.01429    0.00051   −0.16185   0.12364   0.45054   12.21528
AAA     1450    0.00057   0.02885    0.00000   −0.23107   0.34179   1.14937   23.07040
[Figure 13.2: panels of conditional volatilities over time; recoverable panel titles: ECM, ERSTE, KB, NWR, O2, ORCO, VIG, ZENTIVA, PX]
Fig. 13.2 Conditional volatilities of PX index constituents and PX index itself from Example 13.4
(systemic risk analysis for constituents of PX index). Source: Cipra and Hendrych (2017)
[Figure 13.3: panels of conditional correlations (vertical axis from −1.0 to 1.0) over the years 2000–2016; recoverable panel titles: ECM, ERSTE, KB, NWR, O2, ORCO, VIG, ZENTIVA]
Fig. 13.3 Conditional correlations among PX index constituents and PX index from Example 13.4
(systemic risk analysis for constituents of PX index). Source: Cipra and Hendrych (2017)
Finally, Table 13.6 contains the estimated multi-period ahead MES predictions
(h = 125, i.e., half a year ahead) starting from May 9, 2016 (the end of the
examined data set). Here the threshold C was set as minus 5% and minus 20%,
respectively. To be more precise, an investor can identify and anticipate a potential
capital shortfall under the condition that a systemic event occurs half a year after the
investment (i.e., when the global market return is less than the threshold C at that
time). Consequently, the stocks NWR were identified as the most risky
assets assuming C = −0.05 (the multi-period ahead MES of 14.7%) and the stocks
ERSTE as the most risky assets assuming C = −0.20 (the multi-period ahead MES
of 25.5%).
⋄
Table 13.6 Multi-period-ahead MES (h = 125) starting from May 9, 2016, from Example 13.4;
the threshold C was set as minus 5% and minus 20% of PX index log returns (systemic risk analysis
for constituents of PX index)
Stock        MES 9/5/2016, h = 125 (C = −5%)   MES 9/5/2016, h = 125 (C = −20%)
AAA          NA                                NA
VIG          0.12630                           0.21913
CEZ          0.09328                           0.19380
CETV         0.09321                           0.16910
ECM          NA                                NA
ERSTE        0.13347                           0.25510
KB           0.09235                           0.19925
NWR          0.14654                           0.19410
O2           0.09833                           0.16540
ORCO         NA                                NA
PEGAS        0.02161                           0.08087
PHILMOR      0.03092                           0.07247
UNIPETROL    0.02253                           0.09114
ZENTIVA      NA                                NA
Source: Cipra and Hendrych (2017)
13.5 Exercises
Exercise 13.1
Apply the multivariate EWMA methodology for time series {DTB3t} and {DAAAt}
from Table 12.1 (the first differences of monthly yields to maturity for three-
month T-bills and corporate bonds AAA in the USA in % p.a.).
Chapter 14
State Space Models of Time Series
\[ x_{t+1} = F_t x_t + v_t, \qquad t = 1, 2, \dots, \tag{14.1} \]
\[ y_t = G_t x_t + w_t, \qquad t = 1, 2, \dots, \tag{14.2} \]
where (14.1) is the (vector) state equation describing the development of state vector
in time and (14.2) is the (vector) observation equation describing the relationship
between observation vectors and state vectors. The meaning of particular symbols is
the following:
$x_t$ state vector of dimension $d \times 1$ (at time t);
$y_t$ observation vector of dimension $m \times 1$ (at time t);
1. On one hand, by means of the state space modeling one can solve the problem of
recursive estimation of parameters of this process, even by an adaptive way in real
time (i.e., the parameters can change in time). In such a case, the state vector is the
parameter vector
\[ x_t = (\varphi_{1t}, \dots, \varphi_{pt})' \tag{14.5} \]
and the dynamic linear model (14.1) and (14.2) will be formulated as
\[ x_{t+1} = x_t, \tag{14.6} \]
\[ y_t = (y_{t-1}, \dots, y_{t-p})\, x_t + \varepsilon_t \tag{14.7} \]
\[ x_t = (y_{t-p+1}, y_{t-p+2}, \dots, y_t)' \tag{14.8} \]
and the dynamic linear model (14.1) and (14.2) will have the form
14.1 Kalman Filter 375
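\[ x_{t+1} = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ \varphi_p & \varphi_{p-1} & \varphi_{p-2} & \cdots & \varphi_1 \end{pmatrix} x_t + \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} \varepsilon_{t+1}, \tag{14.9} \]
\[ y_t = (0, 0, \dots, 1)\, x_t \tag{14.10} \]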
⋄
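The state space form (14.9) and (14.10) of the process AR(p) can be checked by simulation. The sketch below builds the transition matrix of (14.9) and compares the trajectory generated by the state recursion with the one generated directly by the autoregressive equation; the function name `ar_companion` and the coefficient values are hypothetical.

```python
import numpy as np

def ar_companion(phi):
    """Matrices of (14.9)-(14.10) for AR(p) with coefficients phi = (phi_1, ..., phi_p)."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)        # shift the stored values by one position
    F[-1, :] = phi[::-1]              # last row: phi_p, phi_{p-1}, ..., phi_1
    g = np.zeros(p)
    g[-1] = 1.0                       # observation row (0, ..., 0, 1)
    h = np.zeros(p)
    h[-1] = 1.0                       # the noise enters only the last component
    return F, g, h

phi = np.array([0.5, -0.3, 0.2])      # hypothetical AR(3) coefficients
p = len(phi)
F, g, h = ar_companion(phi)

rng = np.random.default_rng(3)
n = 50
eps = rng.standard_normal(n)

# Direct recursion y_t = phi_1 y_{t-1} + ... + phi_p y_{t-p} + eps_t (zero start).
y_direct = np.zeros(n)
for t in range(p, n):
    y_direct[t] = phi @ y_direct[t - p:t][::-1] + eps[t]

# State space recursion with x_t = (y_{t-p+1}, ..., y_t)'.
x = np.zeros(p)
y_state = np.zeros(n)
for t in range(p, n):
    x = F @ x + h * eps[t]
    y_state[t] = g @ x
```

Both recursions give identical trajectories, which confirms that the state vector (14.8) merely stores the last p observations.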
According to the previous commentaries, the state space representation enables us
to solve recursively (namely in an effective way by means of Kalman recursive
formulas) the problem of filtering, smoothing, and predicting in a given DLM. The
key role in this context is played by the conditional distribution of the state vector $x_t$
conditioned by the information contained in the observations $y_s, y_{s-1}, y_{s-2}, \dots$ till time
s. For practical purposes, we confine ourselves only to the first two moments of
this distribution and denote
\[ \hat{x}_{t|s} = E_s(x_t), \qquad P_{t|s} = E_s\left[ (x_t - \hat{x}_{t|s})(x_t - \hat{x}_{t|s})' \right], \tag{14.11} \]
where the index s in the symbol Es() means that the mean value is conditioned by
information till time s. Important values in this context are the following ones:
• The one-step-ahead prediction of the state vector $x_t$ from time t − 1 and the
corresponding error matrix:
\[ \hat{x}_{t|t-1} = E_{t-1}(x_t), \qquad P_{t|t-1} = E_{t-1}\left[ (x_t - \hat{x}_{t|t-1})(x_t - \hat{x}_{t|t-1})' \right]. \tag{14.12} \]
• The estimated (filtered) value of the state vector $x_t$ at time t and the corresponding
error matrix:
\[ \hat{x}_{t|t} = E_t(x_t), \qquad P_{t|t} = E_t\left[ (x_t - \hat{x}_{t|t})(x_t - \hat{x}_{t|t})' \right]. \tag{14.13} \]
These predictions and estimates are the best ones according to the MSE criterion
(i.e., in the sense of mean squared error; see (2.11)). Moreover, under the given
assumptions (i.e., in the described DLM under the assumption of normality), they
even have the form of linear functions whose argument is always the
corresponding conditioning information (i.e., the corresponding observation vectors).
Simultaneously one can also obtain the prediction of the vector $y_t$ from time t − 1 (i.e.,
one step ahead) and the corresponding error matrix as
\[ \hat{y}_{t|t-1} = E_{t-1}(y_t) = G_t \hat{x}_{t|t-1}, \qquad E_{t-1}\left[ (y_t - \hat{y}_{t|t-1})(y_t - \hat{y}_{t|t-1})' \right] = G_t P_{t|t-1} G_t' + W_t. \tag{14.14} \]
Remark 14.1 Sometimes the matrices $F_t$, $G_t$, $V_t$, $W_t$ (and possibly others) in the DLM
contain unknown parameters which must be estimated. In practice, one usually
applies the so-called EM algorithm (expectation-maximization; see, e.g., Brockwell
and Davis (1996), Dempster et al. (1977), Wu (1983)), which combines the maximum
likelihood method with optimization algorithms and can be used in
situations with incomplete information where some data are missing.
⋄
1. Filtering in State Space Model
Filtering in a given state space model consists in the (recursive) estimation of state
vector xt exploiting information contained in yt, yt-1, yt-2, . . . . The corresponding
Kalman recursive formulas, which are called Kalman (or Kalman-Bucy) filter in
such a case, have the form
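\[ \begin{aligned} \hat{x}_{t|t} &= \hat{x}_{t|t-1} + P_{t|t-1} G_t' \left( G_t P_{t|t-1} G_t' + W_t \right)^{-1} \left( y_t - G_t \hat{x}_{t|t-1} \right), \\ P_{t|t} &= P_{t|t-1} - P_{t|t-1} G_t' \left( G_t P_{t|t-1} G_t' + W_t \right)^{-1} G_t P_{t|t-1}, \end{aligned} \tag{14.15} \]
where
\[ \begin{aligned} \hat{x}_{t|t-1} &= F_{t-1} \hat{x}_{t-1|t-1}, \\ P_{t|t-1} &= F_{t-1} P_{t-1|t-1} F_{t-1}' + V_{t-1}. \end{aligned} \tag{14.16} \]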
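The recursions (14.15) and (14.16) translate almost literally into code. The following sketch applies them to a scalar local level model (random walk observed with noise); the function name `kalman_step`, the model, the variances V = 0.1 and W = 1, and the initial values are assumptions chosen only for illustration.

```python
import numpy as np

def kalman_step(x_filt, P_filt, y, F, G, V, W):
    """One pass of the prediction step (14.16) followed by the filtering step (14.15)."""
    # Prediction (14.16)
    x_pred = F @ x_filt
    P_pred = F @ P_filt @ F.T + V
    # Filtering (14.15)
    S = G @ P_pred @ G.T + W                     # error matrix of the prediction of y_t
    K = np.linalg.solve(S, G @ P_pred).T         # P_pred G' S^{-1} (S, P_pred symmetric)
    x_new = x_pred + K @ (y - G @ x_pred)
    P_new = P_pred - K @ G @ P_pred
    return x_new, P_new

# Hypothetical local level model: x_{t+1} = x_t + v_t, y_t = x_t + w_t.
F = np.eye(1)
G = np.eye(1)
V = np.array([[0.1]])
W = np.array([[1.0]])

rng = np.random.default_rng(6)
n = 500
x_true = np.cumsum(np.sqrt(0.1) * rng.standard_normal(n))
y_obs = x_true + rng.standard_normal(n)

x_f, P_f = np.zeros(1), 10.0 * np.eye(1)         # initial values (assumption)
filtered = np.empty(n)
for t in range(n):
    x_f, P_f = kalman_step(x_f, P_f, y_obs[t:t + 1], F, G, V, W)
    filtered[t] = x_f[0]

# Steady-state filtering variance: positive root of P^2 + V P - V W = 0.
steady_state = (-0.1 + np.sqrt(0.1**2 + 4 * 0.1 * 1.0)) / 2
```

In this time-invariant example the error matrix $P_{t|t}$ does not depend on the data and converges quickly to its steady-state value, while the filtered values track the unobserved state more closely than the raw observations do.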
Example 14.2 The Kalman filter can be used to construct the recursive OLS estimate in
the classical linear regression model, which is rewritten in the form of the dynamic
linear model (14.1) and (14.2) with the state vector $\beta_t$, i.e.,
\[ \beta_{t+1} = \beta_t, \tag{14.17} \]
\[ y_t = x_t \beta_t + \varepsilon_t, \tag{14.18} \]
where $x_t$ is the $t$th row of the regression matrix $X$ and $\varepsilon_t \sim \text{iid } N(0, \sigma_t^2)$. After substituting into
(14.15) and (14.16), one obtains the following recursive formulas for the OLS estimate
(using a simpler notation, namely $b_t$ instead of $b_{t|t}$ and $P_t$ instead of $P_{t|t}/\sigma_t^2$ since,
e.g., $b_{t|t-1} = b_{t-1|t-1}$):
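\[
\begin{aligned}
b_t &= b_{t-1} + \frac{P_{t-1} x_t'}{x_t P_{t-1} x_t' + 1}\left(y_t - x_t b_{t-1}\right),\\
P_t &= P_{t-1} - \frac{P_{t-1} x_t' x_t P_{t-1}}{x_t P_{t-1} x_t' + 1}.
\end{aligned} \tag{14.19}
\]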
⋄
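A minimal sketch of the recursion (14.19) follows. The function name `recursive_ols`, the simulated data, and the diffuse starting value $P_0 = 10^6 I$ are assumptions chosen for illustration; with such a large $P_0$ the recursive estimate should agree closely with the batch OLS estimate.

```python
import numpy as np

def recursive_ols(X, y, b0, P0):
    """Recursive OLS via (14.19); X has one regressor row x_t per observation."""
    b = np.asarray(b0, dtype=float).copy()
    P = np.asarray(P0, dtype=float).copy()
    for x_t, y_t in zip(X, y):
        Px = P @ x_t                      # P_{t-1} x_t'
        denom = x_t @ Px + 1.0            # x_t P_{t-1} x_t' + 1
        b = b + Px * (y_t - x_t @ b) / denom
        P = P - np.outer(Px, Px) / denom  # P symmetric, so P x' x P = (Px)(Px)'
    return b, P

# Hypothetical simulated regression (made-up coefficients and noise level)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + 0.1 * rng.normal(size=200)

b_rec, P_rec = recursive_ols(X, y, b0=np.zeros(3), P0=1e6 * np.eye(3))
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # batch estimate for comparison
```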
Example 14.3 In this example, we show the application of the Kalman filter to the
recursive estimation of linear time series models, namely autoregressive models. For
this purpose, the model AR($p$) can be rewritten in the form of the dynamic linear model
(14.1) and (14.2) with the state vector $\varphi_t$ as
\[ \varphi_{t+1} = \varphi_t, \tag{14.20} \]
\[ y_t = \left(y_{t-1}, y_{t-2}, \dots, y_{t-p}\right)\varphi_t + \varepsilon_t = \mathbf{y}_t \varphi_t + \varepsilon_t, \tag{14.21} \]
where $\mathbf{y}_t = (y_{t-1}, y_{t-2}, \dots, y_{t-p})$ and $\varepsilon_t \sim \text{iid } N(0, \sigma_t^2)$. Again, after substituting into (14.15)
and (14.16), we obtain the following recursive formulas for estimating the parameters
of this heteroscedastic model AR($p$) (using a simpler notation again, namely $\hat{\varphi}_t$
instead of $\hat{\varphi}_{t|t}$ and $P_t$ instead of $P_{t|t}/\sigma_t^2$):
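\[
\begin{aligned}
\hat{\varphi}_t &= \hat{\varphi}_{t-1} + \frac{P_{t-1}\mathbf{y}_t'}{\mathbf{y}_t P_{t-1}\mathbf{y}_t' + 1}\left(y_t - \mathbf{y}_t\hat{\varphi}_{t-1}\right),\\
P_t &= P_{t-1} - \frac{P_{t-1}\mathbf{y}_t'\mathbf{y}_t P_{t-1}}{\mathbf{y}_t P_{t-1}\mathbf{y}_t' + 1},\\
\hat{\sigma}_t^2 &= \frac{1}{t-p}\left((t-p-1)\,\hat{\sigma}_{t-1}^2 + \frac{\left(y_t - \mathbf{y}_t\hat{\varphi}_{t-1}\right)^2}{\mathbf{y}_t P_{t-1}\mathbf{y}_t' + 1}\right),
\end{aligned} \tag{14.22}
\]
where $\mathbf{y}_t = (y_{t-1}, \dots, y_{t-p})$ denotes the row vector of lagged observations.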
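In particular, for the process AR(1), i.e., $y_t = \varphi y_{t-1} + \varepsilon_t$, these recursive formulas
simplify to the form
\[
\begin{aligned}
\hat{\varphi}_t &= \hat{\varphi}_{t-1} + P_t\, y_{t-1}\left(y_t - \hat{\varphi}_{t-1} y_{t-1}\right),\\
P_t &= \frac{P_{t-1}}{P_{t-1} y_{t-1}^2 + 1},\\
\hat{\sigma}_t^2 &= \frac{1}{t-1}\left((t-2)\,\hat{\sigma}_{t-1}^2 + \frac{\left(y_t - \hat{\varphi}_{t-1} y_{t-1}\right)^2}{P_{t-1} y_{t-1}^2 + 1}\right).
\end{aligned} \tag{14.23}
\]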
⋄
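The AR(1) recursion (14.23) can be sketched in plain Python as follows. The function name `recursive_ar1`, the short made-up series, and the convention that the recursion starts at $t = 2$ (the first time point with an available lag) are assumptions of this sketch.

```python
def recursive_ar1(y, phi0=0.0, P0=1.0, sigma2_0=0.0):
    """Recursive estimation of an AR(1) model via (14.23).

    y is the list y_1, ..., y_n; the loop index i equals t - 1.
    """
    phi, P, sigma2 = phi0, P0, sigma2_0
    for i in range(1, len(y)):
        y_lag, y_t = y[i - 1], y[i]
        denom = P * y_lag ** 2 + 1.0          # P_{t-1} y_{t-1}^2 + 1
        err = y_t - phi * y_lag               # one-step-ahead prediction error
        # Variance update: (1/(t-1)) * ((t-2) * sigma2 + err^2 / denom)
        sigma2 = ((i - 1) * sigma2 + err ** 2 / denom) / i
        P = P / denom                         # P_t
        phi = phi + P * y_lag * err           # uses the updated P_t
    return phi, P, sigma2

# Hypothetical short series (made-up numbers)
y = [1.0, 0.8, -0.3, 0.5, 0.1]
phi_hat, P_n, sigma2_hat = recursive_ar1(y)
```

With $\hat{\varphi}_0 = 0$ the final estimate equals $\sum y_{t-1} y_t / (1/P_0 + \sum y_{t-1}^2)$, so a larger $P_0$ brings it closer to the ordinary least squares value.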
Example 14.4 Table 14.1 presents selected values obtained by means of recursive
estimation (14.23) based on a simulated trajectory of the process AR(1) modeled as
(see Kalman filter in Example 14.3). The initial values were chosen as $\hat{\varphi}_0 = 0$,
$P_0 = 1$, and $\hat{\sigma}_0^2 = 0$.
⋄
2. Predicting in State Space Model
Predicting in state space model (also predictor) consists in (recursive) estimation of
the state vector xt+h for particular t using information contained in yt, yt-1, yt-2, . . .
(h is fixed). One constructs recursively the predictions of the type
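\[
\hat{x}_{t+h|t} = E_t(x_{t+h}), \qquad P_{t+h|t} = E_t\left[\left(x_{t+h} - \hat{x}_{t+h|t}\right)\left(x_{t+h} - \hat{x}_{t+h|t}\right)'\right]. \tag{14.24}
\]
\[
\begin{aligned}
\hat{x}_{t+h|t} &= F_{t+h-1}F_{t+h-2}\cdots F_{t+1}\,\hat{x}_{t+1|t},\\
P_{t+h|t} &= F_{t+h-1}P_{t+h-1|t}F_{t+h-1}' + V_{t+h-1}.
\end{aligned} \tag{14.26}
\]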
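Hence one can also calculate the prediction of $y_{t+h}$ and the corresponding error
matrix as
\[
\hat{y}_{t+h|t} = E_t\left(y_{t+h}\right) = G_{t+h}\hat{x}_{t+h|t}, \qquad E_t\left[\left(y_{t+h} - \hat{y}_{t+h|t}\right)\left(y_{t+h} - \hat{y}_{t+h|t}\right)'\right] = G_{t+h}P_{t+h|t}G_{t+h}' + W_{t+h}.
\]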
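The multi-step prediction in (14.26) amounts to iterating the prediction step without any new observation. The sketch below assumes, purely for brevity, time-invariant matrices `F`, `V`, `G`, `W`; the function name `predict_h_steps` and the local linear trend numbers are hypothetical. The error matrix of the predicted observation is computed as $G P G' + W$, by analogy with the smoothing formula (14.30).

```python
import numpy as np

def predict_h_steps(x_next, P_next, F, V, G, W, h):
    """Iterate (14.26) from x_{t+1|t}, P_{t+1|t} for horizons 1, ..., h.

    Returns a list of tuples (x_pred, P_pred, y_pred, S_pred), one per horizon.
    """
    x, P = np.asarray(x_next, float), np.asarray(P_next, float)
    out = []
    for k in range(1, h + 1):
        if k > 1:
            x = F @ x                 # propagate the state prediction
            P = F @ P @ F.T + V       # propagate its error matrix
        y_pred = G @ x
        S_pred = G @ P @ G.T + W      # error matrix of the predicted observation
        out.append((x, P, y_pred, S_pred))
    return out

# Hypothetical noise-free local linear trend: level 1, slope 2 at time t+1
F = np.array([[1.0, 1.0], [0.0, 1.0]]); V = np.zeros((2, 2))
G = np.array([[1.0, 0.0]]);             W = np.array([[1.0]])
preds = predict_h_steps([1.0, 2.0], np.zeros((2, 2)), F, V, G, W, h=3)
```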
Hence one can also smooth the observed time series $\{y_t\}$ as
\[
\hat{y}_{t|n} = G_t\hat{x}_{t|n}, \qquad E_n\left[\left(y_t - \hat{y}_{t|n}\right)\left(y_t - \hat{y}_{t|n}\right)'\right] = G_t P_{t|n} G_t' + W_t. \tag{14.30}
\]
It is necessary to stress once more that the state space methodology is the
theoretical concept for construction of various recursive procedures in time series
analysis. Section 14.1.1 shows a possible application for recursive estimation of
(multivariate) GARCH models of financial time series.
\[ r_t = H_t^{1/2}\varepsilon_t, \tag{14.31} \]
\[ \varepsilon_t \sim N(0, I) \tag{14.32} \]
and $H_t^{1/2}$ is the square root matrix of the conditional covariance matrix $H_t$ expressed at
time $t$ as a suitable function of the information $\Omega_{t-1}$ known till time $t-1$.
In particular, $H_t$ is a positive definite $\Omega_{t-1}$-measurable matrix and $H_t = H_t^{1/2}\left(H_t^{1/2}\right)'$.
As the corresponding conditional moments are
Hence the conditional ML estimator of the true parameter vector $\theta$ for modeling
$H_t(\theta)$ can be found by minimizing
\[
\min_{\theta}\ \sum_{t=1}^{T}\left[\ln\left|H_t(\theta)\right| + r_t' H_t(\theta)^{-1} r_t\right]. \tag{14.35}
\]
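For a given sequence of conditional covariance matrices, the criterion (14.35) is easy to evaluate. The sketch below assumes the matrices $H_t(\theta)$ have already been computed for a candidate $\theta$; the function name `gaussian_objective` and the toy inputs are hypothetical.

```python
import numpy as np

def gaussian_objective(returns, covariances):
    """Evaluate sum_t [ ln|H_t| + r_t' H_t^{-1} r_t ] from (14.35)."""
    total = 0.0
    for r_t, H_t in zip(returns, covariances):
        sign, logdet = np.linalg.slogdet(H_t)     # numerically stable ln|H_t|
        if sign <= 0:
            raise ValueError("each H_t must be positive definite")
        total += logdet + r_t @ np.linalg.solve(H_t, r_t)
    return total

# Toy inputs (made up): two bivariate observations
returns = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
covariances = [np.eye(2), 2.0 * np.eye(2)]
value = gaussian_objective(returns, covariances)
```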
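\[ R_t = R_{t-1} + \eta_t\left[\tilde{F}_t''\left(\hat{\theta}_{t-1}\right) - R_{t-1}\right], \tag{14.37} \]
\[ \eta_t = \frac{1}{1 + \xi_t/\eta_{t-1}} \quad \text{for a forgetting factor } \{\xi_t\}, \tag{14.38} \]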
where
(the approximation based on the conditional mean value in (14.40) makes simpler
the calculation of Hessian matrix). Finally, the forgetting factor {ξt} in (14.38)
substantially improves convergence and statistical properties of the given recursive
estimation. The usual choice in practice is either a constant forgetting factor ξ (e.g.,
ξ ¼ 0.95) or an increasing forgetting factor, e.g.,
\[ \xi_t = \tilde{\xi}\,\xi_{t-1} + \left(1 - \tilde{\xi}\right), \qquad \xi_0, \tilde{\xi} \in (0, 1). \tag{14.41} \]
Besides the choice of the forgetting factor, further technicalities must be solved before
applying the estimation in practice, e.g., the initialization of the estimation algorithm.
The special case of recursive estimation of univariate GARCH models is
shown in Hendrych and Cipra (2018).
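To see how the forgetting factor drives the gain sequence, the two recursions (14.41) and (14.38) can be tabulated directly. The function names and the starting value $\eta_0 = 1$ are assumptions of this sketch; the text does not prescribe an initial gain.

```python
def forgetting_factors(n, xi0=0.95, xi_tilde=0.99):
    """Increasing forgetting factor xi_t = xi_tilde * xi_{t-1} + (1 - xi_tilde), see (14.41)."""
    xi, out = xi0, []
    for _ in range(n):
        xi = xi_tilde * xi + (1.0 - xi_tilde)
        out.append(xi)
    return out

def gains(xis, eta0=1.0):
    """Gain sequence eta_t = 1 / (1 + xi_t / eta_{t-1}), see (14.38); eta0 is an assumed start."""
    eta, out = eta0, []
    for xi in xis:
        eta = 1.0 / (1.0 + xi / eta)
        out.append(eta)
    return out

eta_no_forgetting = gains([1.0] * 4)      # xi_t = 1 gives eta_t = 1/(t+1)
eta_constant = gains([0.95] * 1000)       # constant xi: eta_t tends to 1 - xi
xi_increasing = forgetting_factors(500)   # xi_t increases towards 1
```

With $\xi_t \equiv 1$ the gain decays like $1/(t+1)$, i.e., all observations are weighted equally, while a constant $\xi < 1$ keeps the gain bounded away from zero at about $1 - \xi$.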
Example 14.5 Let us consider the recursive estimation of the (single) parameter λ in
the multivariate EWMA (or MEWMA or scalar VEC-IGARCH(1,1)) model (13.1)
which can be rewritten as
where the discount constant λ (0 < λ < 1) is the only parameter in this very simple
multivariate GARCH model. Then, after tedious matrix manipulations, one can
rewrite the recursive pseudo-linear regression (14.36)–(14.38) in the form
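\[
\hat{\lambda}_t = \hat{\lambda}_{t-1} - \eta_t R_t^{-1}\left[\operatorname{tr}\left(H_t^{-1}\left(\hat{\lambda}_{t-1}\right)\frac{\partial H_t\left(\hat{\lambda}_{t-1}\right)}{\partial\lambda}\right) - r_t' H_t^{-1}\left(\hat{\lambda}_{t-1}\right)\frac{\partial H_t\left(\hat{\lambda}_{t-1}\right)}{\partial\lambda} H_t^{-1}\left(\hat{\lambda}_{t-1}\right) r_t\right], \tag{14.43}
\]
\[
R_t = R_{t-1} + \eta_t\left[\operatorname{tr}\left(H_t^{-1}\left(\hat{\lambda}_{t-1}\right)\frac{\partial H_t\left(\hat{\lambda}_{t-1}\right)}{\partial\lambda} H_t^{-1}\left(\hat{\lambda}_{t-1}\right)\frac{\partial H_t\left(\hat{\lambda}_{t-1}\right)}{\partial\lambda}\right) - R_{t-1}\right], \tag{14.44}
\]
\[
H_{t+1}\left(\hat{\lambda}_t\right) = \left(1 - \hat{\lambda}_t\right) r_t r_t' + \hat{\lambda}_t H_t\left(\hat{\lambda}_{t-1}\right), \tag{14.45}
\]
\[
\frac{\partial H_{t+1}\left(\hat{\lambda}_t\right)}{\partial\lambda} = -r_t r_t' + H_t\left(\hat{\lambda}_{t-1}\right) + \hat{\lambda}_t\frac{\partial H_t\left(\hat{\lambda}_{t-1}\right)}{\partial\lambda}, \tag{14.46}
\]
\[
\eta_t = \frac{1}{1 + \xi_t/\eta_{t-1}} \quad \text{for a forgetting factor } \{\xi_t\} \tag{14.47}
\]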
(the symbol tr(A) denotes the trace of matrix A). Note that all calculations (including
the calculations of matrix derivatives in (14.46)) are recursive.
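The recursions (14.43)–(14.47) can be sketched as a single loop. Several choices below are assumptions of this sketch rather than part of the text: the function name `recursive_mewma`, the initial values (`lam0`, `eta0`, `R0`, a user-supplied `H1`, and a zero initial derivative matrix), the constant forgetting factor, and the clipping of the estimate to an interval inside (0, 1), which is added only as a numerical safeguard.

```python
import numpy as np

def recursive_mewma(returns, H1, lam0=0.94, xi=0.995, eta0=1.0, R0=1.0,
                    bounds=(0.01, 0.999)):
    """Recursive estimate of lambda following (14.43)-(14.47)."""
    returns = np.asarray(returns, dtype=float)
    n, k = returns.shape
    lam, eta, R = lam0, eta0, R0
    H = np.array(H1, dtype=float)        # H_t(lambda_{t-1})
    D = np.zeros((k, k))                 # dH_t/dlambda, assumed zero at the start
    path = np.empty(n)
    for t, r in enumerate(returns):
        eta = 1.0 / (1.0 + xi / eta)                         # (14.47)
        Hinv_D = np.linalg.solve(H, D)
        Hinv_r = np.linalg.solve(H, r)
        grad = np.trace(Hinv_D) - Hinv_r @ D @ Hinv_r        # bracket in (14.43)
        R = R + eta * (np.trace(Hinv_D @ Hinv_D) - R)        # (14.44)
        lam_new = lam - eta * grad / R                       # (14.43)
        lam_new = min(max(lam_new, bounds[0]), bounds[1])    # safeguard (assumption)
        rr = np.outer(r, r)
        D = -rr + H + lam_new * D                            # (14.46), uses H_t
        H = (1.0 - lam_new) * rr + lam_new * H               # (14.45)
        lam = lam_new
        path[t] = lam
    return path

# Hypothetical bivariate data with unit variances and correlation 0.8
rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
r = rng.multivariate_normal(np.zeros(2), cov, size=500)
lam_path = recursive_mewma(r, H1=cov)
```

Each pass touches only the current return vector and the stored quantities `H`, `D`, `R`, `eta`, and `lam`, which is what makes the scheme suitable for on-line use.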
Figure 14.1 shows the simulation results in the bivariate case, where the process
$\{r_t\}$ in (14.42) was generated as an iid normal white noise with zero mean values, unit
variances, and correlation coefficient 0.8. Four alternatives with true values of the
parameter λ (namely λ = 0.91, 0.94, 0.97, 0.99) were considered (1000 simulations
were run for each of them).
This recursive estimate was applied to 647 pairs of daily log returns of
40 currency rates versus EUR (i.e., (EUR, CURR1) and (EUR, CURR2)) from
January 1999 to December 2017 according to the European Central Bank. For
estimating the parameter λ, three approaches were used (see Cipra and Hendrych
(2019)):
• Fixed $\hat{\lambda}_t = 0.94$.
• Recursive MEWMA method with fixed forgetting factor $\xi_t = 0.995$.
• Recursive MEWMA method with increasing forgetting factor $\xi_t = \tilde{\xi}\,\xi_{t-1} + (1 - \tilde{\xi})$, where $\xi_0 = 0.95$, $\tilde{\xi} = 0.99$.
For example, Fig. 14.2 presents the parameter estimators for the couple EUR/USD
and EUR/JPY. Moreover, the corresponding estimated conditional correlation and
volatilities are shown using the results of recursive MEWMA method with increas-
ing forgetting factor.
⋄
Fig. 14.1 Recursive estimation of parameter λ in bivariate EWMA model (14.42) (boxplots are
based on 1000 simulations for four alternatives with true values λ = 0.91, 0.94, 0.97, 0.99; each of the four panels shows the estimates at T = 250, 500, 750, and 1000 on a vertical scale from 0.80 to 0.95). Source:
Cipra and Hendrych (2019)
14.2 State Space Model Approach to Exponential Smoothing
Exponential smoothing from Sects. 3.3 and 4.1.3 including Holt’s and Holt–Win-
ters’ method can be formulated as filtering and predicting based on state space
modeling (see the monograph by Hyndman et al. (2008)). One can even systemat-
ically classify particular models according to the type of trend, seasonal, and residual
(or error) components (see Sect. 2.2.2) and the type of decomposition of time series
(additive or multiplicative).
Fig. 14.2 MEWMA method for log returns of currency rates for the couple EUR/USD and
EUR/JPY from January 1999 to December 2017 in Example 14.5: the parameter estimators (the
smooth non-constant line plots the recursive MEWMA estimate with increasing forgetting factor)
and the corresponding model estimates of conditional correlation coefficient and volatilities (for
recursively estimated λ with increasing forgetting factor). Source: Cipra and Hendrych (2019)
For instance, let us consider the following DLM (14.1)–(14.2) for a (univariate)
time series $\{y_t\}$:
\[
x_t = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix} x_{t-1} + \begin{pmatrix} \alpha\\ \gamma \end{pmatrix}\varepsilon_t, \tag{14.48}
\]
\[
y_t = (1\ \ 1)\, x_{t-1} + \varepsilon_t \tag{14.49}
\]
with the state vector $x_t = (L_t, T_t)'$, where the symbols $L_t$ and $T_t$ denote the level and
slope of the given time series (see Sect. 3.1.1), respectively, and $\{\varepsilon_t\}$ is a white noise
(note that the residuals in the state and observation equations at time $t$ are mutually
correlated). Then one obtains gradually
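\[
x_t = \begin{pmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0\\
0 & 1 & 0 & 0 & \cdots & 0 & 0\\
0 & 0 & 0 & 0 & \cdots & 0 & 1\\
0 & 0 & 1 & 0 & \cdots & 0 & 0\\
0 & 0 & 0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix} x_{t-1} + \begin{pmatrix} \alpha\\ \gamma\\ \delta\\ 0\\ 0\\ \vdots\\ 0 \end{pmatrix}\varepsilon_t, \tag{14.50}
\]
\[
y_t = (1\ \ 1\ \ 0\ \ 0\ \ \dots\ \ 0\ \ 1)\, x_{t-1} + \varepsilon_t \tag{14.51}
\]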
with the state vector $x_t = (L_t, T_t, I_t, I_{t-1}, \dots, I_{t-s+1})'$, where the symbols $L_t$, $T_t$, and $I_t$
denote the level, slope, and seasonal index of the given time series at time $t$,
respectively, and $\{\varepsilon_t\}$ is again a white noise. Hence it follows gradually
\[ \hat{y}_t = L_t + I_t, \tag{14.55} \]
In this context, a broad class of state space models can be considered providing
various types of exponential smoothing alternatives.
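Table 14.3 Recursive relations of exponential smoothing for state space models ETS(A, ·, N)
($\phi_\tau = \phi + \phi^2 + \dots + \phi^\tau$)
Trend                         Recursive relations for ETS(A, ·, N)
None (N)                      $L_t = \alpha y_t + (1-\alpha)L_{t-1}$
                              $\hat{y}_{t+\tau}(t) = L_t$
Additive (A)                  $L_t = \alpha y_t + (1-\alpha)(L_{t-1} + T_{t-1})$
                              $T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}$
                              $\hat{y}_{t+\tau}(t) = L_t + T_t\tau$
Additive damped (Ad)          $L_t = \alpha y_t + (1-\alpha)(L_{t-1} + \phi T_{t-1})$
                              $T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}\phi$
                              $\hat{y}_{t+\tau}(t) = L_t + T_t\phi_\tau$
Multiplicative (M)            $L_t = \alpha y_t + (1-\alpha)L_{t-1}T_{t-1}$
                              $T_t = \gamma(L_t/L_{t-1}) + (1-\gamma)T_{t-1}$
                              $\hat{y}_{t+\tau}(t) = L_t T_t^{\tau}$
Multiplicative damped (Md)    $L_t = \alpha y_t + (1-\alpha)L_{t-1}T_{t-1}^{\phi}$
                              $T_t = \gamma(L_t/L_{t-1}) + (1-\gamma)T_{t-1}^{\phi}$
                              $\hat{y}_{t+\tau}(t) = L_t T_t^{\phi_\tau}$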
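The additive and additive damped rows of Table 14.3 can be sketched in plain Python as one routine, since the additive case is the damped case with $\phi = 1$. The function names, the starting values, and the toy series are assumptions of this sketch.

```python
def ets_additive_trend(y, alpha, gamma, level0, trend0, phi=1.0):
    """Run the level/slope recursions of Table 14.3 (rows A and Ad) over the series y."""
    L, T = level0, trend0
    for obs in y:
        L_new = alpha * obs + (1 - alpha) * (L + phi * T)
        T = gamma * (L_new - L) + (1 - gamma) * phi * T
        L = L_new
    return L, T

def forecast_additive_trend(L, T, tau, phi=1.0):
    """Forecast L_t + T_t * phi_tau with phi_tau = phi + phi^2 + ... + phi^tau."""
    phi_tau = sum(phi ** j for j in range(1, tau + 1))   # equals tau when phi = 1
    return L + T * phi_tau

# Toy series lying exactly on the line 10 + 2t (made-up numbers)
y = [12.0, 14.0, 16.0, 18.0, 20.0]
L, T = ets_additive_trend(y, alpha=0.3, gamma=0.2, level0=10.0, trend0=2.0)
y_hat_3 = forecast_additive_trend(L, T, tau=3)
y_hat_damped = forecast_additive_trend(1.0, 1.0, tau=2, phi=0.5)
```

On an exactly linear series with matching starting values the recursions reproduce the line, so the three-step forecast continues it; with damping the slope contribution is discounted geometrically.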
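Table 14.4 Recursive relations of exponential smoothing for state space models ETS(A, ·, A)
($\phi_\tau = \phi + \phi^2 + \dots + \phi^\tau$, $\tau_s^+ = [(\tau - 1) \bmod s] + 1$)
Trend                         Recursive relations for ETS(A, ·, A)
None (N)                      $L_t = \alpha(y_t - I_{t-s}) + (1-\alpha)L_{t-1}$
                              $I_t = \delta(y_t - L_{t-1}) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t + I_{t-s+\tau_s^+}$
Additive (A)                  $L_t = \alpha(y_t - I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1})$
                              $T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}$
                              $I_t = \delta(y_t - L_{t-1} - T_{t-1}) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t + T_t\tau + I_{t-s+\tau_s^+}$
Additive damped (Ad)          $L_t = \alpha(y_t - I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1}\phi)$
                              $T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}\phi$
                              $I_t = \delta(y_t - L_{t-1} - T_{t-1}\phi) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t + T_t\phi_\tau + I_{t-s+\tau_s^+}$
Multiplicative (M)            $L_t = \alpha(y_t - I_{t-s}) + (1-\alpha)L_{t-1}T_{t-1}$
                              $T_t = \gamma(L_t/L_{t-1}) + (1-\gamma)T_{t-1}$
                              $I_t = \delta(y_t - L_{t-1}T_{t-1}) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t T_t^{\tau} + I_{t-s+\tau_s^+}$
Multiplicative damped (Md)    $L_t = \alpha(y_t - I_{t-s}) + (1-\alpha)L_{t-1}T_{t-1}^{\phi}$
                              $T_t = \gamma(L_t/L_{t-1}) + (1-\gamma)T_{t-1}^{\phi}$
                              $I_t = \delta(y_t - L_{t-1}T_{t-1}^{\phi}) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t T_t^{\phi_\tau} + I_{t-s+\tau_s^+}$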
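Table 14.5 Recursive relations of exponential smoothing for state space models ETS(A, ·, M)
($\phi_\tau = \phi + \phi^2 + \dots + \phi^\tau$, $\tau_s^+ = [(\tau - 1) \bmod s] + 1$)
Trend                         Recursive relations for ETS(A, ·, M)
None (N)                      $L_t = \alpha(y_t/I_{t-s}) + (1-\alpha)L_{t-1}$
                              $I_t = \delta(y_t/L_{t-1}) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t I_{t-s+\tau_s^+}$
Additive (A)                  $L_t = \alpha(y_t/I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1})$
                              $T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}$
                              $I_t = \delta(y_t/(L_{t-1} + T_{t-1})) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = (L_t + T_t\tau)I_{t-s+\tau_s^+}$
Additive damped (Ad)          $L_t = \alpha(y_t/I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1}\phi)$
                              $T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}\phi$
                              $I_t = \delta(y_t/(L_{t-1} + T_{t-1}\phi)) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = (L_t + T_t\phi_\tau)I_{t-s+\tau_s^+}$
Multiplicative (M)            $L_t = \alpha(y_t/I_{t-s}) + (1-\alpha)L_{t-1}T_{t-1}$
                              $T_t = \gamma(L_t/L_{t-1}) + (1-\gamma)T_{t-1}$
                              $I_t = \delta(y_t/(L_{t-1}T_{t-1})) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t T_t^{\tau} I_{t-s+\tau_s^+}$
Multiplicative damped (Md)    $L_t = \alpha(y_t/I_{t-s}) + (1-\alpha)L_{t-1}T_{t-1}^{\phi}$
                              $T_t = \gamma(L_t/L_{t-1}) + (1-\gamma)T_{t-1}^{\phi}$
                              $I_t = \delta(y_t/(L_{t-1}T_{t-1}^{\phi})) + (1-\delta)I_{t-s}$
                              $\hat{y}_{t+\tau}(t) = L_t T_t^{\phi_\tau} I_{t-s+\tau_s^+}$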
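As an illustration of the seasonal tables, the additive trend row of Table 14.4 (additive seasonality) can be sketched as follows. The function names, the storage of seasonal indices in a list of length $s$ indexed by position modulo $s$, and the toy series are assumptions of this sketch.

```python
def hw_additive_filter(y, alpha, gamma, delta, level0, trend0, seasonal0):
    """Recursions of Table 14.4, row Additive (A); seasonal0 holds s starting indices."""
    L, T = level0, trend0
    I = list(seasonal0)          # I[j] is the latest seasonal index of slot j
    s = len(I)
    for t, obs in enumerate(y):
        j = t % s
        I_old = I[j]             # plays the role of I_{t-s}
        L_new = alpha * (obs - I_old) + (1 - alpha) * (L + T)
        T_new = gamma * (L_new - L) + (1 - gamma) * T
        I[j] = delta * (obs - L - T) + (1 - delta) * I_old
        L, T = L_new, T_new
    return L, T, I

def hw_additive_forecast(L, T, I, n, tau):
    """Forecast L_t + T_t * tau + (latest seasonal index of the target slot) after n observations."""
    return L + T * tau + I[(n + tau - 1) % len(I)]

# Toy series: level 10, slope 1, seasonal pattern (+1, -1), no noise (made-up numbers)
y = [12.0, 11.0, 14.0, 13.0]
L, T, I = hw_additive_filter(y, alpha=0.3, gamma=0.2, delta=0.4,
                             level0=10.0, trend0=1.0, seasonal0=[1.0, -1.0])
f1 = hw_additive_forecast(L, T, I, n=len(y), tau=1)
f2 = hw_additive_forecast(L, T, I, n=len(y), tau=2)
```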
\[ y_t = (1\ \ 1)\, x_{t-1}\,(1 + \varepsilon_t) \tag{14.58} \]
Here the recursive relations are not presented due to their complexity. To derive
them, one should first express the relative error as
\[
\varepsilon_t = \frac{y_t - E(y_t \mid x_{t-1})}{E(y_t \mid x_{t-1})} \tag{14.62}
\]
(from the observation relation $y_t = E(y_t \mid x_{t-1})(1 + \varepsilon_t)$ of the corresponding DLM).
⋄
2. Construction of Exponential Smoothing Models
There are many technical issues to be resolved when constructing the exponential
smoothing models described in the previous text (selection, estimation, initialization,
assessing forecast accuracy; see Hyndman et al. (2008) and also Sects. 3.3 and
4.1.3). Here we deal briefly with the problem of model estimation only.
For this purpose, we apply the following general form of the
corresponding DLM:
with the state vector $x_t = (L_t, T_t, I_t, I_{t-1}, \dots, I_{t-s+1})'$ serving as an argument of scalar
and vector (linear) functions. Further, one assumes that $\{\varepsilon_t\}$ is a normal white
noise with variance $\sigma^2$. The models with additive errors have $r(x_{t-1}) = 1$
(so that $y_t = E(y_t \mid x_{t-1}) + \varepsilon_t$), while the models with multiplicative errors have
$r(x_{t-1}) = E(y_t \mid x_{t-1})$ (so that $y_t = E(y_t \mid x_{t-1})(1 + \varepsilon_t)$).
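If $\theta = (\alpha, \gamma, \delta, \phi)'$ is the vector of unknown model parameters and $x_0$ contains
given initial state values, then the corresponding (normal) log likelihood function
can be written as
\[
L\left(\theta, \sigma^2 \mid y, x_0\right) = -\frac{n}{2}\ln\left(2\pi\sigma^2\right) - \sum_{t=1}^{n}\ln\left|r(x_{t-1})\right| - \frac{1}{2}\sum_{t=1}^{n}\varepsilon_t^2/\sigma^2 \tag{14.65}
\]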
for observations $y_t$ from the vector $y = (y_1, \dots, y_n)'$. Taking the partial derivative
with respect to $\sigma^2$ and setting it to zero, one obtains the maximum likelihood estimate
of the innovation variance $\sigma^2$ as
\[
\hat{\sigma}^2 = \frac{1}{n}\sum_{t=1}^{n}\varepsilon_t^2. \tag{14.66}
\]
Substituting (14.66) into (14.65), one obtains the concentrated log likelihood.
Hence, maximum likelihood estimates of the parameters $\theta = (\alpha, \gamma, \delta, \phi)'$ can be obtained
by minimizing (twice) the negative log likelihood function, i.e.,
\[
\min_{\theta}\left\{ n\ln\left(\sum_{t=1}^{n}\varepsilon_t^2\right) + 2\sum_{t=1}^{n}\ln\left|r(x_{t-1})\right| \right\}, \tag{14.67}
\]
where $\{x_{t-1}\}$ and $\{\varepsilon_t\}$ are calculated recursively using the initial state values $x_0$ and
the observations $y_t$ from the vector $y = (y_1, \dots, y_n)'$. This estimation method can be
complemented by information criteria (e.g., AIC; see Sect. 6.3.1) to identify (select)
suitable state space models.
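For the simplest model ETS(A,N,N), with $L_t = L_{t-1} + \alpha\varepsilon_t$, $y_t = L_{t-1} + \varepsilon_t$, and $r(x_{t-1}) = 1$, the criterion (14.67) reduces to $n \ln \sum \varepsilon_t^2$. The sketch below evaluates it recursively and minimizes it over a grid; the function names, the fixed initial level, the toy series, and the use of a plain grid search instead of a numerical optimizer are assumptions made for illustration.

```python
import math

def ets_ann_criterion(alpha, y, level0):
    """Criterion (14.67) for ETS(A,N,N): n * ln(sum of squared one-step errors)."""
    L, sse = level0, 0.0
    for obs in y:
        eps = obs - L            # epsilon_t = y_t - L_{t-1}
        sse += eps * eps
        L = L + alpha * eps      # L_t = L_{t-1} + alpha * epsilon_t
    return len(y) * math.log(sse)

def fit_alpha(y, level0, grid_size=99):
    """Pick alpha on an equally spaced grid in (0, 1) that minimizes the criterion."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda a: ets_ann_criterion(a, y, level0))

# Toy series (made-up numbers) with a given initial level
y = [1.0, 2.0, 3.0, 4.0, 5.0]
alpha_hat = fit_alpha(y, level0=1.0)
crit_at_one = ets_ann_criterion(1.0, y, level0=1.0)
```

For models with multiplicative errors the term $2\sum \ln|r(x_{t-1})|$ would have to be accumulated in the same loop.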
Remark 14.3 One can generalize the given approach also for nonlinear state space
models, e.g., for time series with conditional heteroscedasticity using the model
\[ \ln y_t = L_{t-1} + \varepsilon_t, \tag{14.70} \]
where $\varepsilon_t \sim N(0, h_t)$ (sometimes the last term in (14.69) is supplemented by a further
positive parameter $v_3$ to the form $v_2\ln(|\varepsilon_t| + v_3)$ to reduce the problem of small
residuals as arguments of the logarithmic function).
⋄
Example 14.6 Hyndman et al. (2008) estimated the model (14.68)–(14.70) for
monthly closing prices of the Dow Jones Index (DJI) over the period January
1990–March 2007 as
\[ \ln y_t = L_{t-1} + \varepsilon_t. \]
⋄
14.3 Exercises
Exercise 14.1 Derive the recursive relations of exponential smoothing for particular
state space models in Tables 14.3, 14.4, and 14.5 (hint: e.g., for ETS(A,N,N) in the
first row of Table 14.3, using the model $L_t = L_{t-1} + \alpha\varepsilon_t$ and $y_t = L_{t-1} + \varepsilon_t$ one gets
$L_t = L_{t-1} + \alpha(y_t - L_{t-1}) = \alpha y_t + (1-\alpha)L_{t-1}$).
Exercise 14.2 Derive in detail the minimized expression in (14.67) when constructing
the maximum likelihood parameter estimates of state space models of exponential
smoothing.
References
Abraham, B., Ledolter, J.: Statistical Methods for Forecasting. Wiley, New York (1983)
Acerbi, C.: Spectral measures of risk: a coherent representation of subjective risk aversion. J. Bank.
Financ. 26, 1505–1518 (2002)
Adrian, T., Brunnermeier, M.K.: CoVaR. Federal Reserve Bank of New York, Staff Report
no. 348, September 2008
Ait-Sahalia, Y.: Testing continuous-time models for the spot interest rate. Rev. Financ. Stud. 9,
385–426 (1996)
Ait-Sahalia, Y., Jacod, J.: High-Frequency Financial Econometrics. Princeton University Press,
Princeton (2014)
Alexander, C.O., Chibumba, A.M.: Multivariate Orthogonal Factor GARCH. University of Sussex
Discussion Papers in Mathematics (1997)
Almon, S.: The distributed lag between capital appropriations and expenditures. Econometrica. 33,
178–196 (1965)
Al-Osh, M.A., Alzaid, A.A.: First-order integer-valued autoregressive (INAR(1)) process. J. Time
Ser. Anal. 8, 261–275 (1987)
Artzner, P., Delbaen, F., Eber, J.-M., Heath, D.: Coherent measures of risk. Math. Financ. 9,
203–228 (1999)
Bauwens, L., Laurent, S., Rombouts, J.: Multivariate GARCH models: a survey. J. Appl. Econ. 21,
79–109 (2006)
Baxter, M., Rennie, A.: Financial Calculus. An Introduction to Derivative Pricing. Cambridge
University Press, Cambridge (1996)
Bollerslev, T.: Generalized autoregressive conditional heteroscedasticity. J. Econ. 31, 307–327
(1986)
Bollerslev, T.: Modeling the coherence in short-run nominal exchange rates: a multivariate gener-
alized ARCH model. Rev. Econ. Stat. 72, 498–505 (1990)
Bollerslev, T., Wooldridge, J.M.: Quasi-maximum likelihood estimation and inference in dynamic
models with time varying covariances. Econ. Rev. 11, 143–172 (1992)
Bollerslev, T., Engle, R.F., Wooldridge, J.M.: A capital-asset pricing model with time-varying
covariances. J. Polit. Econ. 96, 116–131 (1988)
Bølviken, E.: New tests of significance in periodogram analysis. Scand. J. Statist. 10, 1–10 (1983)
Bowerman, B.L., O’Connell, R.T.: Time Series Forecasting. Duxbury Press, Boston (1987)
Box, G.E.P., Jenkins, G.M.: Time Series Analysis, Forecasting and Control. Holden-Day, San
Francisco (1970)
Box, G.E.P., Tiao, G.C.: Intervention analysis with applications to economic and environmental
problems. J. Am. Stat. Assoc. 70, 70–79 (1975)
Brock, W.A., Dechert, D., Scheinkman, H., LeBaron, B.: A test for independence based on the
correlation dimension. Econ. Rev. 15, 197–235 (1996)
Brockwell, P.J.: Lévy-driven continuous-time ARMA processes. In: Andersen, T.G., et al. (eds.)
Handbook of Financial Time Series, Part III: Topics in Continuous Time Processes. Springer,
Berlin (2009)
Brockwell, P.J., Davis, R.A.: Time Series: Theory and Methods. Springer, New York (1993)
Brockwell, P.J., Davis, R.A.: Introduction to Time Series and Forecasting. Springer, New York
(1996)
Brownlees, C.T., Engle, R.: Volatility, correlation and tails for systemic risk measurement. Stern
Center for Research Computing, New York University, New York (2012)
Campbell, J.Y., Lo, A.W., MacKinlay, A.C.: The Econometrics of Financial Markets. Princeton
University Press, Princeton (1997)
Chan, K.S., Tong, H.: On estimating thresholds in autoregressive models. J. Time Ser. Anal. 7,
179–190 (1986)
Chappel, D., Padmore, J., Mistry, P., Ellis, C.: A threshold model for French franc/Deutsch mark
exchange rate. J. Forecast. 15, 155–164 (1996)
Chen, R., Tsay, R.S.: Functional-coefficients autoregressive models. J. Am. Stat. Assoc. 88,
298–308 (1993)
Chiriac, R., Voev, V.: Modeling and forecasting multivariate realized volatility. J. Appl. Econ. 26,
922–947 (2011)
Cipra, T.: Financial and Insurance Formulas. Springer, New York (2010)
Cipra, T., Hendrych, R.: Systemic risk in financial risk regulation. Czech J. Econ. Financ. 67, 15–38
(2017)
Cipra, T., Hendrych, R.: Modeling of currency covolatilities. Statistika. 99(3), 259–271 (2019)
Clements, M., Harvey, D.: Forecast combination and encompassing. In: Mills, T., Patterson,
K. (eds.) The Palgrave Handbook of Econometrics, Applied Econometrics, vol. 2. Palgrave,
Oxford (2011)
Clements, A., Scott, A., Silvennoinen, A.: Forecasting Multivariate Volatility in Larger Dimen-
sions: Some Practical Issues. Working Paper #80, NCER Working Paper Series (2012)
Conley, T.G., Hansen, L.P., Luttmer, E.G.J., Scheinkman, J.A.: Short-term interest rates as
subordinate diffusions. Rev. Financ. Stud. 10, 525–577 (1997)
Dagum, E.B., Bianconcini, S.: Seasonal Adjustment Methods and Real Time Trend-Cycle Estima-
tion. Springer, New York (2016)
Davidson, J.: Econometric Theory. Blackwell, Oxford (2000)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM
algorithm. J. R. Stat. Soc. 39, 1–38 (1977)
Dickey, D.A., Fuller, W.A.: Distribution of estimators for time series regressions with a unit
root. J. Am. Stat. Assoc. 74, 427–431 (1979)
Dickey, D.A., Fuller, W.A.: Likelihood ratio statistics for autoregressive time series with a unit root.
Econometrica. 49, 1057–1072 (1981)
Ding, Z., Granger, C.W.J., Engle, R.F.: A long memory property of stock market returns and a new
model. J. Empir. Financ. 1, 83–106 (1993)
Duffie, D.: Security Markets: Stochastic Models. Academic, New York (1988)
Dupačová, J., Hurt, J., Štěpán, J.: Stochastic Modeling in Economics and Finance. Kluwer, Boston
(2002)
Durbin, J., Koopman, S.J.: Time Series Analysis by State Space Methods. Oxford University Press,
Oxford (2012)
Eilers, P.H.C., Marx, B.D.: Splines, knots and penalties. Wiley Interdiscip. Rev.: Comput. Stat. 2
(6), 637–653 (2010)
Elerian, O., Chib, S., Shephard, N.: Likelihood inference for discretely observed non-linear
diffusions. Econometrica. 69, 959–993 (2001)
Elliot, R.J., Kopp, P.E.: Mathematics of Financial Markets. Springer, New York (2004)
Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events. Springer, Berlin (1997)
Enders, W.: Applied Econometric Time Series. Wiley, New York (1995)
Engle, R.F.: Autoregressive conditional heteroscedasticity with the estimates of the variance of
United Kingdom inflations. Econometrica. 50, 987–1007 (1982)
Engle, R.F.: Dynamic conditional correlation—a simple class of multivariate GARCH
models. J. Bus. Econ. Stat. 20, 339–350 (2002)
Engle, R.F.: Anticipating Correlation. A New Paradigm for Risk Management. Theory and Practice.
Princeton University Press, Princeton (2009)
Engle, R.F., Granger, C.W.J.: Co-integration, and error correction: representation, estimation and
testing. Econometrica. 55, 251–276 (1987)
Engle, R.F., Kroner, K.F.: Multivariate simultaneous generalized GARCH. Econ. Theory. 11,
122–150 (1995)
Engle, R.F., Ng, V.K.: Measuring and testing the impact of news on volatility. J. Financ. 48,
1749–1778 (1993)
Engle, R.F., Russell, R.J.: Autoregressive conditional duration: a new model for irregularly spaced
transaction data. Econometrica. 66, 1127–1162 (1998)
Engle, R.F., Yoo, B.S.: Forecasting and testing in cointegrated systems. J. Econ. 35, 143–159
(1987)
Engle, R.F., Lilien, D.M., Robins, R.P.: Estimating time varying risk premia in the term structure:
the ARCH-M model. Econometrica. 55, 391–407 (1987)
Engle, R.F., Ng, V.K., Rothschild, M.: Asset pricing with a factor ARCH covariance structure:
empirical estimates for treasury bills. J. Econ. 45, 213–238 (1990)
Eraker, B.: MCMC analysis of diffusion models with applications to finance. J. Bus. Econ. Stat. 19,
177–191 (2001)
EViews 10. IHS Global Inc., Englewood (2018)
Fan, J., Yao, Q.: Nonlinear Time Series: Nonparametric and Parametric Methods. Springer,
New York (2005)
Fleming, J., Kirby, C., Ostdiek, B.: The economic value of volatility timing using “realized”
volatility. J. Financ. Econ. 67, 473–509 (2003)
Francq, C., Zakoian, J.-M.: GARCH Models. Wiley, Chichester (2010)
Franke, J., Härdle, W., Hafner, C.M.: Statistics of Financial Markets. Springer, New York (2004)
Franses, P.H., van Dijk, D.: Non-Linear Time Series Models in Empirical Finance. Cambridge
University Press, Cambridge (2000)
Fuller, W.A.: Introduction to Statistical Time Series. Wiley, New York (1976)
Gallant, A.R., Long, J.R.: Estimating stochastic diffusion equations efficiently by minimum
chi-squared. Biometrika. 84, 125–141 (1997)
Glosten, L.R., Jagannathan, R., Runkle, D.E.: On the relation between the expected value and the
volatility of the nominal excess return on stocks. J. Financ. 48, 1779–1801 (1993)
Gómez, V.: Multivariate Time Series with Linear State Space Structure. Springer, New York (2016)
Gourieroux, C.: ARCH Models and Financial Applications. Springer, New York (1997)
Gourieroux, C., Jasiak, J.: Financial Econometrics: Problems, Models, and Methods. Princeton
University Press, Princeton (2001)
Granger, C.W.J.: Investigating causal relations by econometric models and cross-spectral methods.
Econometrica. 37, 424–438 (1969)
Granger, C.W.J.: Long memory relationships and the aggregation of dynamic models. J. Econ. 14,
227–238 (1980)
Granger, C.W.J., Andersen, A.P.: An Introduction to Bilinear Time Series Models. Vandenhoek
and Ruprecht, Gottingen (1978)
Granger, C.W.J., Newbold, P.: Forecasting Economic Time Series. Academic, San Diego (1986)
Greene, W.H.: Econometric Analysis. Prentice Hall, New York (2012)
Haggan, V., Ozaki, T.: Modelling nonlinear vibrations using an amplitude-dependent auto-regres-
sive time series models. Biometrika. 68, 189–196 (1981)
Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and business
cycle. Econometrica. 57, 357–384 (1989)
Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton (1994)
Härdle, W.: Applied Nonparametric Regression. Cambridge University Press, New York (1990)
Harvey, A.C.: Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge
University Press, Cambridge (1989)
Hatanaka, M.: Time-Series-Based Econometrics. Oxford University Press, Oxford (1996)
Hautsch, N.: Econometrics of Financial High-Frequency Data. Springer, Heidelberg (2012)
Heij, C., de Boer, P., Franses, P.H., Kloek, T., van Dijk, H.K.: Econometric Methods with
Applications in Business and Economics. Oxford University Press, Oxford (2004)
Hendry, D.F.: Dynamic Econometrics. Oxford University Press, Oxford (1995)
Hendrych, R., Cipra, T.: On conditional covariance modelling: an approach using state space
models. Comput. Stat. Data Anal. 100, 304–317 (2016)
Hendrych, R., Cipra, T.: Systemic risk in financial risk regulation. Czech J. Econ. Financ. 67, 15–38
(2017)
Hendrych, R., Cipra, T.: Self-weighted recursive estimation of GARCH models. Commun. Stat.
Simul. Comput. 47, 315–328 (2018)
Hull, J.: Options, Futures, and Other Derivative Securities. Prentice Hall, Englewood Cliffs (1993)
Hurst, H.: Long term storage capacity of reservoirs. Trans. Am. Soc. Civil Eng. 116, 770–799
(1951)
Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D.: Forecasting with Exponential Smoothing.
Springer, Berlin (2008)
Jacobs, P., Lewis, P.: Stationary discrete autoregressive-moving average time series generated by
mixtures. J. Time Ser. Anal. 4, 19–36 (1983)
Jeantheau, T.: Strong consistency of estimators for multivariate ARCH models. Econ. Theory. 14,
70–86 (1998)
Johansen, S.: Estimation and hypothesis testing of cointegration vectors in Gaussian vector
autoregressive models. Econometrica. 59, 1551–1580 (1991)
Johansen, S.: Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford
University Press, Oxford (1995)
Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Springer, New York (1988)
Kedem, B.: Binary Time Series. Marcel Dekker, New York (1980)
Kedem, B., Fokianos, K.: Regression Models for Time Series Analysis. Wiley, Hoboken (2002)
Kendall, M.: Time-Series. Griffin, London (1976)
Kessler, M.: Estimation of an ergodic diffusion from discrete observations. Scand. J. Stat. 24, 1–19
(1997)
Koopmans, L.H.: The Spectral Analysis of Time Series. Academic, San Diego (1995)
Kroner, K.F., Ng, V.K.: Modelling asymmetric co-movements of asset returns. Rev. Financ. Stud.
11, 817–844 (1998)
Kwiatkowski, D., Phillips, P.C.B., Schmidt, P., Shin, Y.: Testing the null hypothesis of stationarity
against the alternative of a unit root. J. Econ. 54, 159–178 (1992)
Kwok, Y.-K.: Mathematical Models of Financial Derivatives. Springer, New York (1998)
Lanne, M., Saikkonen, P.: A multivariate generalized orthogonal factor GARCH model. J. Bus.
Econ. Stat. 25, 61–75 (2007)
Lim, K.G.: Financial Valuation and Econometrics. World Scientific, Singapore (2011)
Lin, W.L.: Alternative estimators for factor GARCH models – a Monte Carlo comparison. J. Appl.
Econ. 7, 259–279 (1992)
Ljung, L.: System Identification: Theory for the User. Prentice Hall PTR, Upper Saddle River
(1999)
Ljung, L., Söderström, T.: Theory and Practice of Recursive Identification. MIT Press, Cambridge
(1983)
Lo, A.W.: Maximum likelihood estimation of generalized Ito’s processes with discretely sampled
data. Econ. Theory. 4, 231–247 (1988)
Lütkepohl, H.: New Introduction to Multiple Time Series Analysis. Springer, Berlin (2005)
MacDonald, I., Zucchini, W.: Hidden Markov and Other Models for Discrete-Valued Time Series.
Chapman and Hall, London (1997)
MacKinnon, J.G.: Numerical distribution functions for unit root and cointegration tests. J. Appl.
Econ. 11, 601–618 (1996)
MacKinnon, J.G., Haugh, A.A., Michelis, L.: Numerical distribution functions of likelihood ratio
tests for cointegration. J. Appl. Econ. 14, 563–577 (1999)
Makridakis, S.: Accuracy measures: theoretical and practical concerns. Int. J. Forecast. 9, 527–529
(1993)
Malliaris, A.G., Brock, W.A.: Stochastic Methods in Economics and Finance. North-Holland,
Amsterdam (1982)
McKenzie, E.: Some ARMA models for dependent sequences of Poisson counts. Adv. Appl.
Probab. 20, 822–835 (1988)
McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management. Princeton University Press,
Princeton (2005)
Merigó, J.M., Yager, R.R.: Generalized moving averages, distance measures and OWA operators.
Int. J. Uncertainty Fuzziness Knowledge-Based Syst. 21, 533–559 (2013)
Mills, T.C.: The Econometric Modelling of Financial Time Series. Cambridge University Press,
Cambridge (1993)
Montgomery, D.C., Johnson, L.A.: Forecasting and Time Series Analysis. McGraw-Hill,
New York (1976)
Musiela, M., Rutkowski, M.: Martingale Methods in Financial Modelling. Springer, New York
(2004)
Neftci, S.N.: Mathematics of Financial Derivatives. Academic, New York (2000)
Nelson, D.B.: Conditional heteroskedasticity in asset returns: a new approach. Econometrica. 59,
347–370 (1991)
Newey, W.K., West, K.D.: A simple, positive semi-definite, heteroskedasticity and auto-correlation
consistent covariance matrix. Econometrica. 55, 703–708 (1987)
Nicholls, D.F., Quinn, B.G.: Random Coefficient Autoregressive Models: An Introduction, Lecture
Notes in Statistics 11. Springer, New York (1982)
Parkinson, M.: The extreme value method for estimating the variance of the rate of return. J. Bus.
53, 61–65 (1980)
Phillips, P.C.B., Perron, P.: Testing for a unit root in time series regression. Biometrika. 75,
335–346 (1988)
Poon, S.-H.: A Practical Guide to Forecasting Financial Market Volatility. Wiley, Chichester (2005)
Priestley, M.B.: Non-Linear and Non-Stationary Time Series Analysis. Academic, London (1988)
Priestley, M.B.: Spectral Analysis and Time Series (Volume 1 and 2). Academic, London (2001)
Rachev, S.T., Mittnik, S., Fabozzi, F.J., Focardi, S.M., Jašić, T.: Financial Econometrics: from
Basics to Advanced Modeling Techniques. Wiley, Chichester (2007)
Ramsey, J.B.: Tests for specification errors in classical linear least-squares regression analysis. J. R.
Stat. Soc. B. 31, 350–371 (1969)
Ripley, B.D.: Statistical aspects of neural network. In: Barndorff-Nielsen, O.B., et al. (eds.)
Networks and Chaos-Statistical and Probability Aspects. Chapman and Hall, London (1993)
Risk Metrics-Technical Document. J. P. Morgan/Reuters, New York. www.riskmetrics.com
(1996)
Ruppert, D.: Statistics and Finance. Springer, New York (2004)
Schlicht, E.: A seasonal adjustment principle and a seasonal adjustment method derived from this
principle. J. Am. Statist. Assoc. 76, 374–378 (1982)
Sentana, E.: Quadratic ARCH models. Rev. Econ. Stud. 62, 639–661 (1995)
Siegel, A.F.: Testing for periodicity in a time series. J. Am. Stat. Assoc. 75, 345–348 (1980)
Silvennoinen, A., Teräsvirta, T.: Multivariate GARCH models. In: Andersen, T.G., Davis, R.A.,
Kreiss, J.-P., Mikosch, T. (eds.) Handbook of Financial Time Series. Springer, New York
(2009)
Sims, C.A.: Money, income, and causality. Am. Econ. Rev. 62, 540–552 (1972)
Söderström, T., Stoica, P.: System Identification. Prentice Hall, New York (1989)
Steele, J.M.: Stochastic Calculus and Financial Applications. Springer, New York (2001)
Taylor, S.: Modelling Financial Time Series. Wiley, New York (1986)
Taylor, S.J.: Modelling stochastic volatility. Math. Financ. 4, 183–204 (1994)
Theil, H.: Applied Economic Forecasting. North-Holland, Amsterdam (1966)
Timmermann, A.: Forecast combinations. In: Handbook of Economic Forecasting, vol. 1, pp.
135–196. Elsevier (2006)
Tjøstheim, D.: Some doubly stochastic time series models. J. Time Ser. Anal. 7, 51–72 (1986)
Tong, H.: Threshold Models in Non-Linear Time Series Analysis. Springer, New York (1983)
Tong, H.: Non-Linear Time Series: A Dynamical Systems Approach. Oxford University Press,
Oxford (1990)
Tsay, R.S.: Analysis of Financial Time Series. Wiley, New York (2002)
Tse, Y.K., Tsui, A.K.C.: A multivariate GARCH model with time-varying correlations. J. Bus.
Econ. Stat. 20, 351–362 (2002)
van der Weide, R.: GO-GARCH: a multivariate generalized orthogonal GARCH model. J. Appl.
Econ. 17, 549–564 (2002)
Vasicek, O.: An equilibrium characterization of the term structure. J. Financ. Econ. 5, 177–188
(1977)
Vrontos, I.D., Dellaportas, P., Politis, D.N.: A full-factor multivariate GARCH model. Econ. J. 6,
311–333 (2003)
Wang, S.S.: A class of distortion operators for pricing financial and insurance risks. J. Risk Insur.
67, 15–36 (2000)
Wang, P.: Financial Econometrics: Methods and Models. Routledge, London (2003)
Wecker, W.E.: Asymmetric time series. J. Am. Stat. Assoc. 76, 16–21 (1981)
Wei, W.W.S.: Time Series Analysis: Univariate and Multivariate Methods. Addison-Wesley,
Boston (1994)
Weiss, C.H.: An Introduction to Discrete-Valued Time Series. Wiley, Chichester (2018)
Wilmott, P.: Quantitative Finance. Wiley, Chichester (2000)
Wu, C.F.J.: On the convergence of the EM algorithm. Ann. Stat. 11, 95–103 (1983)
Wu, W.B.: Nonlinear system theory: another look at dependence. Proc. Natl. Acad. Sci. 102(40),
14150–14156 (2005)
Yamai, Y., Yoshiba, T.: Value-at-risk versus expected shortfall: a practical perspective. J. Bank.
Financ. 29, 997–1015 (2005)
Zakoian, J.M.: Threshold heteroskedastic models. J. Econ. Dyn. Control. 18, 931–934 (1994)
economies
Review
The ARDL Method in the Energy-Growth Nexus Field; Best Implementation Strategies
Angeliki N. Menegaki
Department of Economics & Management of Tourist and Culture Units, Agricultural University of Athens,
33100 Amfissa, Greece; amenegaki@aua.gr
Received: 7 August 2019; Accepted: 14 October 2019; Published: 18 October 2019
Abstract: A vast number of energy-growth nexus researchers, as well as authors of other “X-variable-growth
nexus” studies, such as the tourism-growth nexus, the environment-growth nexus or
the food-growth nexus, have used the autoregressive distributed lag model (ARDL) bounds test
approach for cointegration testing. Their research papers rarely include all the ARDL procedure
steps in a detailed way, and thus they leave other researchers confused about the series of steps that
must be followed and about the best implementation paradigms that leave no aspect obscure.
This paper is a comprehensive review that suggests the steps that need to be taken before the ARDL
procedure takes place as well as the steps that should be taken afterward with respect to causality
investigation and robust analysis.
1. Introduction
Since the seminal work by Kraft and Kraft (1978) on the energy-growth nexus, various cointegration
and causality methods have been used in this field and in the “X-variable growth nexus” framework
in general. The most common of them have been the residual-based Engle and Granger (1987) method,
the Phillips and Hansen (1990) method with a modified ordinary least squares procedure, and the
Johansen (1988) and Johansen and Juselius (1990) maximum likelihood method.
However, some years later, it was realized that these methods may not be appropriate for small
samples (Narayan and Smyth 2005). Moreover, studies before the establishment of the ARDL, and this was
very much the case for the energy-growth nexus, used cross-sectional analysis through their panel data
configuration. This entailed that the countries included in those samples were not homogeneous
enough with respect to their economic development level (Odhiambo 2009). Unless results were
country specific, they were of little use for policy-making. This generated the need
for more sophisticated cointegration and causality methods. The econometric methods employed in
the older energy-growth nexus have shed light on other fields such as the tourism-growth nexus,
which this paper, for simplicity, terms the “X-variable-growth nexus.”
The initiation of the autoregressive distributed lag (ARDL) method or Bounds test is due to
Pesaran and Shin (1999), while its further development is due to Pesaran et al. (2001). It is acknowledged
as one of the most flexible methods in the econometric analysis of the energy-growth nexus, particularly
when the research framework is shaped by regime shifts and shocks. The latter change the pattern of
energy consumption or the evolution of covariates in the energy-growth models. Moreover, the fact
that the ARDL method tolerates different lags in different variables makes the method very
attractive and versatile.
The ability to host sufficient lags allows the data generating process to be captured well.
This means that the method can be applied irrespective of whether the time series is I(0),
namely stationary at levels, I(1), namely stationary at first differences, or fractionally integrated
(Pesaran et al. 2001). Nevertheless, within the ARDL framework, the series should not be I(2),
because this integration order invalidates the F-statistics and all critical values established by
Pesaran et al. (2001), which have been calculated for series that are I(0) and/or I(1).
Furthermore, the ARDL method provides unbiased estimates and valid t-statistics, irrespective
of the endogeneity of some regressors (Harris and Sollis 2003; Jalil and Ma 2008). Actually,
because of the appropriate lag selection, residual correlation is eliminated and thus the endogeneity
problem is also mitigated (Ali et al. 2016). As far as the short-run adjustments are concerned, they
can be integrated with the long-run equilibrium through the error correction mechanism (ECM).
This occurs through a linear transformation without sacrificing information about the long-run horizon
(Ali et al. 2017). One other aspect is that the method allows the correction of outliers with impulse
dummies (Marques et al. 2017, 2019) and the approach distinguishes between dependent and
independent variables.
Last but not least, the interpretation of the ARDL approach and its implementation is
quite straightforward (Rahman and Kashem 2017) and the ARDL framework requires a single
equation (Bayer and Hanck 2013), while other procedures require a system of equations. The ARDL
approach is more reliable for small samples than Johansen and Juselius’s cointegration
methodology (Haug 2002). Halicioglu (2007) also mentions two more advantages of the method:
the simultaneous estimation of short- and long-run effects and the ability to test hypotheses on
the estimated long-run coefficients. This is not possible in the Engle–Granger method.
This paper is organized as follows: Section 2 presents the methodology
together with best practice guidelines. Section 3 contains other versions of the ARDL approach and
ARDL implementation strategies to follow in one’s energy-growth nexus paper, and Section 4 concludes
the paper.
2. The Methodology
For illustration, we assume two series, Yt and Xt, in this paper, but
the reader can easily generalize to more variables; indeed, the production function equation
in the energy-growth nexus has more variables. In a bivariate energy-growth nexus model, Yt
stands for economic growth and Xt stands for energy consumption. It is also typical in the
energy-growth nexus to use logarithms of the variables so that the coefficients can be interpreted as elasticities.
The ARDL procedure consists of the investigation of: (i) stationarity, (ii) cointegration, and
(iii) causality. There are other ways to proceed to causality analysis without the first
two steps, but this occurs within other methodological frameworks.
2.1. Stationarity
After a presentation of the descriptive statistics of the series (mean, median, minimum and maximum
values, skewness, kurtosis, as well as the standard deviation, Jarque–Bera normality test and pairwise
correlation), the first step in the ARDL analysis is the unit root analysis. It informs about the degree of
integration of each variable. To satisfy the bounds test assumption of the ARDL models, each variable
must be I(0) or I(1). Under no circumstances should it be I(2). De Vita et al. (2006) also noted that the
dependent variable should be I(1). However, this is not widely claimed in the current literature. Unit root
analysis is performed with a long array of tests, such as the augmented Dickey Fuller (ADF),
the Kwiatkowski–Phillips–Schmidt–Shin (KPSS), the Phillips–Perron (PP), the Ng–Perron test, the
cross-sectional augmented IPS (CIPS) (Pesaran 2007), the LS (Lee and Strazicich 2003), and many others.
Each one is better suited to different data characteristics, but this paper will not discuss them for
brevity. However, it should be stressed that researchers should apply both the traditional and
structural break unit root tests to make sure that the variables are not I(2).
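As a rough illustration of how the order of integration can be screened, the sketch below computes the basic Dickey–Fuller t-statistic (no augmentation lags, intercept only) for a series in levels and in first differences. The function names, the simulated random walk, and the use of roughly −2.86 as the asymptotic 5% critical value for the intercept-only case are assumptions made for this example; applied work would rely on an established econometrics package and on the full set of tests listed above.

```python
import math
import random


def difference(series):
    """Return the first differences of a series."""
    return [b - a for a, b in zip(series[:-1], series[1:])]


def df_tstat(series):
    """t-statistic of gamma in  dy_t = c + gamma * y_{t-1} + e_t  (basic Dickey-Fuller)."""
    lagged = series[:-1]          # y_{t-1}
    dy = difference(series)       # dy_t
    n = len(dy)
    mean_x = sum(lagged) / n
    mean_d = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, dy))
    gamma = sxy / sxx
    const = mean_d - gamma * mean_x
    rss = sum((d - const - gamma * x) ** 2 for x, d in zip(lagged, dy))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


# Simulated random walk (hypothetical data, for illustration only).
random.seed(0)
walk, level = [], 0.0
for _ in range(300):
    level += random.gauss(0.0, 1.0)
    walk.append(level)

CRITICAL_5PCT = -2.86  # approximate asymptotic 5% value, intercept-only case (assumption)
t_level = df_tstat(walk)
t_diff = df_tstat(difference(walk))
print("levels:", t_level, "first differences:", t_diff)
```

A series whose first difference rejects the unit root while its level does not is treated as I(1); if even the first difference failed to reject, the series would be a candidate for I(2) and therefore unsuitable for the bounds test.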
Economies 2019, 7, 105 3 of 16
2.2. Cointegration
The core models in the ARDL bounds test framework are the following unrestricted error
correction models:
$$\Delta LY_t = a_0 + a_1 t + \sum_{i=1}^{m} a_{2i}\,\Delta LY_{t-i} + \sum_{i=0}^{n} a_{3i}\,\Delta LX_{t-i} + a_4 LY_{t-1} + a_5 LX_{t-1} + \mu_{1t} \qquad (1)$$
$$\Delta LX_t = \beta_0 + \beta_1 t + \sum_{i=1}^{m} \beta_{2i}\,\Delta LX_{t-i} + \sum_{i=0}^{n} \beta_{3i}\,\Delta LY_{t-i} + \beta_4 LX_{t-1} + \beta_5 LY_{t-1} + \mu_{2t} \qquad (2)$$
∆ is the first difference operator and µ is the error term, which must be white noise; in other
words, it is the residual term, which is supposed to be well behaved (serially independent,
homoskedastic and normally distributed). All a and β coefficients are non-zero, with a4 and β4 also being
negative (they represent the speed of adjustment). The parameters a2i and a3i represent the short-run
dynamic coefficients, while a4 and a5 are long-run coefficients in the energy-growth nexus relationship.
The a0 and β0 are drift components. What type of explanatory variables
must be incorporated in the energy-growth relationship is discussed in detail by Inglesi-Lotz (2018) in a
chapter written specifically on this topic, to which the interested reader is referred. Generally, one
can decide first on the framework in which one is going to work, namely whether that is a production function
approach, a demand function approach, or another such as the Kuznets curve hypothesis, and then
decide on the variables and other components. Other deterministic components are included on a trial
and error basis and to corroborate further the stability of an estimated relationship.
Overall, we observe in Equations (1) and (2) that each variable is represented as dependent on the
past values of itself, the past values of the other variable(s), and the past values of differenced values
of itself and the past values of differenced values of the other variable(s). Models (1) and (2) can be
formulated either as intercept or trend ARDL models, or both. Equations (1) and (2) contain both.
Halicioglu (2007) claims that it is possible to end up with two models, one with trend and one without
a trend. There is a method described in Bahmani-Oskooee and Goswami (2003), according to which
one ends up with a single long-run relationship through consecutive eliminations of the rest of the
relationships. The first stage of the ARDL estimation produces (p + 1)^k regressions so that
the optimal lag length for each variable is obtained, with p being the maximum number of lags and k
the number of variables in the equation. In our simple example, there is only one Xt variable.
In the framework described in Equations (1) and (2), the ARDL bounds cointegration test is carried out.
These equations are estimated with ordinary least squares (OLS).
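The count of candidate regressions in the lag-selection stage can be checked by enumerating every lag combination. This is a minimal sketch that follows the (p + 1)^k formula as stated in the text; the function name and the example values p = 4 and k = 2 are assumptions chosen for illustration.

```python
from itertools import product


def lag_combinations(max_lag, n_variables):
    """All lag-order tuples with each variable's lag running from 0 to max_lag."""
    return list(product(range(max_lag + 1), repeat=n_variables))


# Example: maximum lag p = 4 and k = 2 variables (hypothetical values).
candidates = lag_combinations(4, 2)
print(len(candidates))  # (4 + 1) ** 2 candidate specifications
```

Each tuple would then be estimated by OLS and ranked by an information criterion, which is why the number of regressions grows quickly with both p and k.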
$$H_1: a_1 \neq a_2 \neq a_n \neq 0$$
The setup of the hypotheses reads as follows: there is cointegration if the null hypothesis is
rejected. The F-statistics for testing are compared with the critical values developed by Pesaran et
al. (2001). Narayan critical values are more appropriate for small samples. Pesaran et al. (2001)
provide a table enumerated as CI and entitled: “Asymptotic critical value bounds for the F-statistic.
Testing for the existence of a levels relationship” in five versions. These are (i) no intercept and no
trend, (ii) restricted intercept and no trend, (iii) unrestricted intercept and no trend, (iv) unrestricted
intercept and restricted trend, (v) unrestricted intercept and unrestricted trend. They also provide a
table CII entitled “Asymptotic critical value bounds for the t-statistic. Testing for the existence of a levels
relationship” in three versions: (i) No intercept and no trend, (ii) unrestricted intercept and no trend, (iii)
unrestricted intercept and unrestricted trend. Next we reproduce a part of these tables (CI-iii and CI-v)
in order to explain how the decision for cointegration was made in Bölük and Mert (2015) based on
Pesaran tables. Note that Pesaran tables are not valid for I(2) variables (Ali et al. 2016). The interested
reader can find these tables in Pesaran et al. (2001).
Narayan and Smyth (2005), on the other hand, have estimated critical values for the bounds test for
four cases at three significance levels, for up to seven independent variables and up to eighty observations.
The critical values of the four cases are entitled as: (i) Case II: restricted intercept and no trend,
(ii) case III: unrestricted intercept and no trend, (iii) case IV: unrestricted intercept and restricted
trend, (iv) case V: unrestricted intercept and unrestricted trend. In Narayan tables, k stands for the
number of regressors, n is the sample size, I(0): stationary at levels, I(1): stationary at first differences.
The interested reader can find these tables in Narayan and Smyth (2005).
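The comparison of the F-statistic with the lower I(0) and upper I(1) critical bounds can be sketched as a small decision function. The function name and the numerical bounds in the example are placeholders invented for illustration; they are not taken from the Pesaran et al. (2001) or Narayan tables, which must be consulted for the appropriate case, number of regressors, and sample size.

```python
def bounds_test_decision(f_stat, lower_i0, upper_i1):
    """Classify a bounds-test F-statistic against the I(0) and I(1) critical bounds."""
    if f_stat > upper_i1:
        return "cointegration"          # null of no levels relationship rejected
    if f_stat < lower_i0:
        return "no cointegration"       # null cannot be rejected
    return "inconclusive"               # statistic falls between the two bounds


# Placeholder bounds, NOT actual table values.
for f_value in (6.0, 4.5, 3.0):
    print(f_value, bounds_test_decision(f_value, lower_i0=4.0, upper_i1=5.0))
```

Keeping the three outcomes explicit is useful because an inconclusive result calls for further evidence, for instance from the error correction term.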
When no cointegration is confirmed, we can proceed with simple Granger causality (unrestricted
VAR). The VAR equation should be specified on stationary data. There are various reasons why
cointegration is not confirmed (e.g., no relationship between the examined variables or due to omitted
variables). The Toda and Yamamoto (1995) test is a solution for Granger causality testing in this
case. After all, even when a long-run relationship does not exist in the data, this does not mean that
no short-run relationship exists either. Moreover, it needs to be remembered that the cointegration
equation provides the long-run elasticities. Short-run elasticities are presented by the coefficients
of the first differenced variables. In cases where more than one coefficient for a particular variable
has been estimated for the short-run case, these are added and their joint significance is tested with
a Wald test (Fuinhas and Marques 2012). However, if cointegration is the case (which occurs very
commonly, when there is a known and established theoretical connection between some variables),
then we can proceed with the establishment of the error correction mechanism (ECM). Evidence of
cointegration implies that there is a long-run relationship between the variables and their connection is
not a short-lived situation, but a more permanent one, which can be recovered every time there is a
disturbance. As an alternative to the F-test described above, a Wald test can be applied to
test the null hypothesis of no cointegration when there is more than one short-run coefficient of the
same variable (Tursoy and Faisal 2018).
1 A VAR model is a generalization of univariate AR models for multiple time series. Within a VAR framework, each variable
is represented by an equation that explains its evolution based on its own lags and the lags of the other variables in the
multivariate framework. The k variables are measured over a period of time t, each as a linear function of their
past values.
for valuable information in their ARDL models. The impulse response function mainly shows what
happens when the model is transferred to the one side of a dummy variable. For example, if the value
of 1 represents war time and the value of 0 represents peace time, then if we take the ones or zeros only
and separately, we have an impulse response function, one for war time and one for peace time. Thus,
they are also a useful tool to test the stability of a model across structural breaks. There are various
hypotheses that underlie the models after cointegration is confirmed. After the identification of the
long-run relationship in Equations (1) and (2), we can continue with the examination of the short-run
and the long-run Granger causality. The Granger causality refers to a situation where the past can
be used to predict the future. Thus, if past values of Xt significantly contribute to forecasting future
values of Yt , the Xt is said to Granger cause the Yt . However, evidence of correlation is not necessarily
evidence of causality.
2.5. Combined Cointegration Methods for the Robustness of the ARDL Model
In the particular case of a unique order of integration, Bayer and Hanck (2013) have developed a
test which borrows elements from a variety of previously developed cointegration tests. The combined
test borrows elements from Engle and Granger (1987); Johansen (1988); Boswijk (1994) and Banerjee et al.
(1998). The combined cointegration test uses Fisher’s formulae and the p-values of the aforementioned
individual tests.
$$\text{Engle and Granger} - \text{Johansen} = -2\left[\ln\left(P_{\text{Engle \& Granger}}\right) + \ln\left(P_{\text{Johansen}}\right)\right]$$
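The Fisher-type combination above is simple arithmetic on the p-values of the individual tests, as the following sketch shows. The function name and the two p-values are hypothetical and serve only to illustrate the calculation; the resulting statistic would still need to be compared with the critical values appropriate for the combined test.

```python
import math


def fisher_combined(p_values):
    """Fisher-type statistic: -2 times the sum of the log p-values."""
    return -2.0 * sum(math.log(p) for p in p_values)


# Hypothetical p-values for the Engle-Granger and Johansen tests.
p_engle_granger, p_johansen = 0.05, 0.01
statistic = fisher_combined([p_engle_granger, p_johansen])
print(statistic)
```

Smaller individual p-values produce a larger combined statistic, so strong evidence from either test pushes the combined test toward rejecting the null of no cointegration.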
2.6. Causality after the ARDL Bounds Test and the Importance of the Error Correction Term (ECT)
The investigation of causality is the third step in the energy-growth nexus analysis. The lagged
error correction term is derived from the cointegration equation. Thus the long-run information that is
missed through the differencing of the variables for stationarity purposes, is re-introduced in the system
of causality equations. This is a necessary step when variables are cointegrated. Cointegration implies
that there must be causality in some direction; however, it does not reveal in which direction that
causality runs. Therefore, additional causality analysis is required. Thus, before estimating
Equations (3) and (4) below, one needs to run another set of regressions in order to obtain the residuals
which will be inserted into Equations (3) and (4) as the ECT.
There are many strategies to follow in the examination and direction of causality. One such strategy
is the VECM approach (vector error correction model), which is a restricted form of unrestricted VAR
and is suitable, once the variables are integrated at I(1). According to this model setup, the dependent
variable is dependent on its own lagged values, as well as the lagged values of the independent variables,
the error correction term, and the residual term. This is shown in the following set of equations.
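$$\Delta \ln Y_t = a_1 + \sum_{i=1}^{l} a_{11}\,\Delta LY_{t-i} + \sum_{j=0}^{m} a_{22}\,\Delta X_{t-j} + n_1\,ECT_{t-1} + \mu_{1i} \qquad (3)$$
$$\Delta \ln X_t = a_1 + \sum_{i=1}^{l} a_{21}\,\Delta LX_{t-i} + \sum_{j=0}^{m} a_{22}\,\Delta Y_{t-j} + n_2\,ECT_{t-1} + \mu_{2i} \qquad (4)$$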
Residual terms in the above equations are assumed to be normally distributed. The coefficient of the
ECT must be negative to ensure system convergence from the short run toward the long run. An ECT
coefficient equal to −x% is interpreted to mean that x% of the previous period's deviation from the
long-run equilibrium is corrected in each period, so that the system eventually returns to the long-run
equilibrium path. The significant variables on the right hand
side of each equation show short-run causality for the dependent variable.
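To make the speed-of-adjustment reading concrete, the sketch below computes how much of an initial deviation from equilibrium remains after a number of periods, together with the implied half-life. It assumes the simplest case in which the gap shrinks by the same proportion every period and ignores the other short-run dynamics; the function names and the coefficient of −0.25 are hypothetical.

```python
import math


def remaining_gap(ect_coefficient, periods):
    """Share of an initial deviation still uncorrected after `periods` periods."""
    return (1.0 + ect_coefficient) ** periods


def half_life(ect_coefficient):
    """Number of periods needed to close half of the initial deviation."""
    return math.log(0.5) / math.log(1.0 + ect_coefficient)


ect = -0.25  # hypothetical estimate: 25% of the gap is corrected each period
print(remaining_gap(ect, 1))
print(remaining_gap(ect, 2))
print(half_life(ect))
```

A coefficient between −1 and 0 gives smooth convergence under this simplification, while a positive coefficient would make the gap grow, which is why a negative sign is required.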
$$H_0: b_1 = b_2 = \ldots = b_p = 0$$
$$H_1: X \text{ Granger causes } Y$$
A similar hypothesis set up can be constructed for the second equation, but this will not be done
here for space considerations. Please note the following rationale:
After we have calculated the diagnostics of the model and we have verified that the model is well
behaved, then the next step is the bounds test. The existence of a long-run relationship can be further
corroborated with the investigation of significance of the individual terms.
3.1. The Asymmetric Nonlinear or the Nonlinear Autoregressive Distributed Lag (NARDL) Approach
This version of the ARDL approach was introduced by Shin et al. (2011, 2014) and is an extension
of the method introduced by Pesaran et al. (2001). The nonlinear ARDL is used for testing whether
the positive shocks of the independent variables have the same effect as their negative shocks on the
dependent variables. In the typical ARDL, there is a symmetric relationship between the dependent
and the explanatory variables. This is not the case with the NARDL in which the ARDL relationship is
formulated as follows:
$$y_t = \alpha^+ x_t^+ + \alpha^- x_t^- + \varepsilon_t$$
The alphas are the long-run parameters, while xt is the following vector regressor:
$$x_t = x_0 + x_t^+ + x_t^-$$
With xt + being the positive partial sum and xt − being the negative partial sum as follows:
$$x_t^+ = \sum_{i=1}^{t} \Delta x_i^+ = \sum_{i=1}^{t} \max(\Delta x_i, 0)$$
$$x_t^- = \sum_{i=1}^{t} \Delta x_i^- = \sum_{i=1}^{t} \min(\Delta x_i, 0)$$
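The decomposition into partial sums can be sketched in a few lines, with the positive partial sum accumulating max(∆x_i, 0) and the negative partial sum accumulating min(∆x_i, 0). The function name and the short example series are assumptions for illustration only.

```python
def partial_sums(series):
    """Cumulative sums of the positive and negative changes of a series."""
    positive, negative = [], []
    pos_total, neg_total = 0.0, 0.0
    for previous, current in zip(series[:-1], series[1:]):
        change = current - previous
        pos_total += max(change, 0.0)   # accumulate increases only
        neg_total += min(change, 0.0)   # accumulate decreases only
        positive.append(pos_total)
        negative.append(neg_total)
    return positive, negative


x = [1.0, 3.0, 2.0, 5.0]                # hypothetical series
x_pos, x_neg = partial_sums(x)
print(x_pos, x_neg)

# The identity x_t = x_0 + x_t^+ + x_t^- holds at every date.
rebuilt = [x[0] + p + n for p, n in zip(x_pos, x_neg)]
print(rebuilt == x[1:])
```

Entering the two partial sums as separate regressors is what lets increases and decreases of the explanatory variable carry different coefficients.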
This means that the corresponding error correction model can be written as:
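$$\Delta y_t = \rho y_{t-1} + \theta^+ x_{t-1}^+ + \theta^- x_{t-1}^- + \sum_{i=1}^{j-1} \varphi_i \Delta y_{t-i} + \sum_{i=0}^{p} \left(\pi_i^+ \Delta x_{t-i}^+ + \pi_i^- \Delta x_{t-i}^-\right) + \varepsilon_t$$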
Using the F-statistic developed by Pesaran et al. (2001), one can test the hypothesis that θ+ = θ− = θ = 0.
The rejection of the null hypothesis indicates the presence of cointegration. The hypothesis of θ = 0
versus the alternative that θ < 0 is examined through a t-test (Banerjee et al. 1998).
Overall, the procedure steps are the same as in the conventional ARDL approach that has already been
presented in this paper. In addition, the method provides the cumulative dynamic multiplier
effects of x+ and x− on yt as follows:
$$m_k^+ = \sum_{i=0}^{k} \frac{\partial y_{t+i}}{\partial x_t^+} \quad \text{and} \quad m_k^- = \sum_{i=0}^{k} \frac{\partial y_{t+i}}{\partial x_t^-}$$
When k increases to infinity, the multipliers converge to the alphas. This method has been
applied by Shahbaz (2018) in a case study for the energy-growth nexus and by Al-hajj et al. (2018) for the
investigation of the oil price and stock returns nexus in Malaysia. The NARDL method is applicable if
all variables are integrated at I(1) or they have a flexible order of integration. The approach solves
multicollinearity through the choice of the appropriate lag length of variables (Shin et al. 2014). Thus,
the bounds test proposed by Shin et al. (2014) examines the presence of cointegration while at the
same time hosting asymmetries. As far as causality is concerned, a complete account of asymmetric
causality is presented in Apergis (2018) who provides a detailed account also on linear versus the
nonlinear causality.
3.2. The Pooled Mean Group (PMG) Estimator for Panel Data
The PMG allows for heterogeneity only in the short-run, compared to the mean group estimator, which
allows for heterogeneity both in the short and the long-run. The pooled mean group estimates are
superior to the fixed effects estimates, because they are robust to endogeneity and to the presence of unit
roots. Overall the PMG is an estimator that allows pooling and averaging. Besides the short-run and
long-run effects that are captured among the variables of a model, the PMG additionally investigates
the dynamic effects of the independent variables on the dependent variable.
The general form of the PMG can be seen in the following Equation:
$$Y_{it} = \sum_{j=1}^{p} \lambda_{ij}\, y_{i,t-j} + \sum_{j=0}^{q} \delta_{ij}\, X_{i,t-j} + \mu_t + \varepsilon_{it}$$
The error correction equation can be derived from the previous equation as:
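$$\Delta Y_{it} = \varphi_i \left(y_{i,t-j} - \theta_i X_{i,t-j}\right) + \sum_{j=1}^{p-1} \lambda_{ij}\, \Delta y_{i,t-j} + \sum_{j=0}^{q-1} \delta_{ij}\, \Delta X_{i,t-j} + \mu_t + \varepsilon_{it}$$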
Here ϕi indicates the speed of adjustment, which needs to be negative and significant in order
to have convergence in the long-run horizon. If the speed of adjustment is zero, then no long-run
relationship would be present. This equation provides the short-run dynamics that correspond to the
long-run ones described in the cointegration equation. Besides the sign of the adjustment coefficient,
the researcher must pay attention to the rest of the signs both in the cointegration and the error correction
equation and decide whether they are consistent with economic theory and established research in the
energy-growth nexus field. After short-run and long-run causal findings have been corroborated, it is
useful to document them with policy reasons, namely find out why a causal direction is happening,
whether it is due to some energy or environmental policy or whether it is due to the lack of some relevant
policy. Comparison with the findings of other studies is also essential at this point. The estimates from
the PMG estimator are consistent and asymptotically normal for both stationary and non-stationary
regressors. As with conventional ARDL, the appropriate lag length in the PMG can be determined by
the AIC and SBC criteria. Foremost, the more homogeneous the panels are, the more efficient the PMG
estimator is.
Researchers with panel data are additionally advised to perform
both the Pesaran (2004) CD test and the Pesaran and Yamagata slope homogeneity test.
The Pesaran (2004) CD test was formulated as an answer to the shortcomings of the scaled
LM test (Pesaran 2004; Breusch and Pagan 1980). Large panel data sets could not be handled with
the Breusch and Pagan test. Thus, Pesaran (2004) suggested the standardized version of that LM
test. Again, however, this solution had its own restrictions with large panels where cross sections
were large but the time span was not long enough. The CD test was proposed as a final solution that
could accommodate both smaller cross sections and shorter data spans. In many studies we make the
comfortable assumption that the slope coefficients are homogeneous. When the time span is
long and the cross section dimension is short, this can be tested with seemingly unrelated regressions
(SURE), but these dimensions are not always the case. Pesaran and Yamagata (2005) have proposed a
modified version of Swamy’s test of slope homogeneity. Swamy (1970) bases his test of slope homogeneity on
the dispersion of individual slope estimates from a suitable pooled estimator. For more on these tests,
the interested reader should consult the suggested bibliography.
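For orientation, the CD statistic in its usual balanced-panel form is the sum of all pairwise correlations between the units' residuals, scaled by the square root of 2T/(N(N − 1)). The sketch below implements that form using only the standard library. The formula is stated here as an assumption about the standard definition rather than something reproduced from this paper, and the function names and the tiny residual panel are hypothetical.

```python
import math


def pearson(a, b):
    """Sample correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pesaran_cd(residuals):
    """CD statistic for a balanced panel: one residual series per cross-section unit."""
    n_units = len(residuals)
    t_periods = len(residuals[0])
    pair_sum = sum(
        pearson(residuals[i], residuals[j])
        for i in range(n_units)
        for j in range(i + 1, n_units)
    )
    return math.sqrt(2.0 * t_periods / (n_units * (n_units - 1))) * pair_sum


# Two perfectly correlated residual series (hypothetical, for illustration).
panel = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0]]
cd_value = pesaran_cd(panel)
print(cd_value)
```

Under the null of no cross-sectional dependence the statistic is compared with the standard normal distribution, so large absolute values point toward dependence across units.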
3.3. What Are the ARDL Best Implementation Strategies to Follow in One’s Energy-Growth Nexus Paper?
This paper deals with the general outline of the research in the ARDL analysis and not the specific
direction that various studies may end up with, because of specific handlings dictated by data, theory,
and research demands. For example, Liu (2009) ends the ARDL analysis with a factor decomposition
model (FDM) analysis, which shows the yearly causal contribution of each variable to the dependent
variable. This is not how most ARDL energy-growth nexus studies end. The typical outline
of most of these studies is an investigation of the integrational properties of the variables, followed
by an ARDL cointegration analysis that ends with a causality analysis. In the following two tables
(Tables 1 and 2), the ARDL implementation strategies are provided with guidelines for every step and
variant. Table 1 contains guidelines for the time series data, while Table 2 contains guidelines for the
panel data version of the ARDL implementation. For more detailed discussions on time series and
panel data causality tests dependent on cointegration and integration results, the reader is advised to
consult the studies by Tugcu (2018) and Apergis (2018) in the book by Menegaki (2018) entitled as
“The Economics and the Econometrics of the energy-growth nexus” and by Marques et al. (2019) in the
book by Fuinhas and Marques (2019) entitled as “The extended energy-growth nexus.”
Table 1. Autoregressive distributed lag model (ARDL) implementation for time series data in the
energy-growth nexus.

Stages in Time-Series ARDL Implementation

First stage: Stationarity, unit roots, and order of integration
    ADF: Augmented Dickey–Fuller; PP: Phillips–Perron
        (Note: They have low power, but since the literature still uses them, it is good
        to report them for reference)
    KPSS: Kwiatkowski–Phillips–Schmidt–Shin
    ADF-WS: Augmented Dickey–Fuller Weighted Symmetric (Note: Good size and power properties)
    LS: Lee and Strazicich for breaks
    and various other tests depending on the assumptions made about the data or the
    knowledge of them . . .
    When contradictory results are reached, observing the correlogram is a good idea.
    Are the series I(0) or I(1)? If yes, proceed with ARDL cointegration.
    Outcomes: Yes: Stationarity | No: Stationarity

Second stage: Cointegration
    Maximum lag value is decided on the basis of AIC, BIC, and HQC. The F value for the
    cointegration test should be applied for all criteria (BIC, AIC, HQC).
    If cointegration evidence is inconclusive, then the decision about the long-run
    relationship is based on the ECT.
    Are long-run coefficients significant? Do they have the correct sign?
    Yes: Cointegration
        We need to augment the Granger-type causality test model with a one-period lagged ECT.
        Even if the ECT could be incorporated in all equations of the Granger causality model,
        only the equations in which the null hypothesis of no cointegration is rejected
        are estimated with an ECT (Narayan and Smyth 2006).
    No: Cointegration
        If we find no evidence of cointegration, then the model specification will be a vector
        autoregression (VAR) in first-difference form (Liu 2009).
    Is the cointegration equation robust? Answer: Use FMOLS and DOLS to check.

Third stage: Causality
    Granger causality is ideal both for small and large samples (Geweke et al. 1983).
    The ECT model allows the inclusion of the lagged ECT derived from the cointegration
    equation. Thus, the long-run information lost through differencing is reintroduced.
    Does the ECT have a negative sign?
    Are the estimated coefficients stable?
    Work with diagnostics to prove the robustness of your model.

Source: Author’s compilation. Note: BIC: Bayesian (Schwarz) information criterion, AIC: Akaike information
criterion, HQC: Hannan–Quinn criterion, ECT: error correction term, FMOLS: fully modified OLS, DOLS:
dynamic OLS.
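The second-stage decision in Table 1 can be sketched as a small helper. In the bounds-testing approach of Pesaran et al. (2001), the F statistic is compared with a lower (all regressors I(0)) and an upper (all regressors I(1)) critical value bound. The function names below are hypothetical, and the critical values must be supplied by the user from the published tables for the chosen significance level and number of regressors; no critical values are hard-coded here.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Classify a bounds-test F statistic against user-supplied critical bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "cointegration"
    if f_stat < lower_bound:
        return "no cointegration"
    return "inconclusive"


def next_step(decision):
    """Map the bounds-test outcome to the follow-up step listed in Table 1."""
    steps = {
        "cointegration": "Granger-type causality test augmented with a one-period lagged ECT",
        "no cointegration": "VAR in first differences",
        "inconclusive": "base the long-run decision on the ECT",
    }
    return steps[decision]
```

For example, with purely illustrative bounds of 3.0 and 4.0, an F statistic of 3.5 falls between the bounds, so the decision is inconclusive and the ECT is used to judge the long-run relationship.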
Economies 2019, 7, 105 11 of 16
Experienced researchers will have realized by now that panel data are many shorter time series
pooled together. The data generation process may or may not be the same across panels
(sub-groups of data). Therefore, several time series tests and procedures have been adapted
to panel data through a kind of averaging across panels (groups of data). Panel data are a
convenient way in energy economics to overcome problems such as collinearity. Furthermore, such
data provide more degrees of freedom and a more informed speed of adjustment. On top of that, with
this approach one can control for heterogeneity and gain efficiency in the identification and
measurement of economic issues (Tugcu 2018).
Panel data suffer from limitations such as cross-sectional dependence, which is attributed
to globalization and the unification of policies across panel units (e.g., countries). This makes energy
consumption patterns follow similar movements across the countries in a panel, particularly
if the countries are signatories to the same environmental and emissions-cutting agreement. The other
limitation comes from the fact that panel data are in essence two-dimensional data, and thus the error
term in modeling contains both unit-specific (e.g., country) information and time-specific information.
This may contribute to an endogeneity problem if these error components are correlated
with the explanatory variables. However, these drawbacks do not discourage researchers from using panel
data, which are the main type of data to expect in the energy-growth nexus research field.
Before closing this paper, it is useful to recommend sites for the implementation of ARDL and
NARDL coding in the EVIEWS and STATA software packages:
ARDL and NARDL coding and implementation in EVIEWS, available from:
http://www.eviews.com/help/helpintro.html#page/content/ardl-Estimating_ARDL_Models_in_EViews.html.
ARDL and NARDL coding and implementation in STATA, available from:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1434232-ardl-updated-stata-command-for-the-estimation-of-autoregressive-distributed-lag-and-error-correction-models.
Note: As far as NARDL coding and implementation in EVIEWS and STATA are concerned, since
NARDL is an ARDL model, it is just an estimation with lags of variables. One can specify it as a non-linear
estimation with the least squares estimator.
4. Conclusions
The economics of the energy-growth nexus is a field that attracts major research attention because of
the significant information it provides to policy-makers who consider energy conservation measures.
The ARDL method has been widely favored and used in the past decade owing to its merits (flexibility,
interpretability, eloquence, and statistical properties that are explained in the introduction of this
paper). The paper meets the needs of two groups of researchers. One group is new researchers who
have recently started using the ARDL method. Some points of its implementation
are not yet fully clear to them, because they are fragmented across various research papers and lecture
notes on the internet. This fragmentation causes delays in research and paper writing and
leaves room for journal reviewers to reject a paper or request major revisions. The other group is more
experienced researchers who have used the method many times, but for whom there is always an aspect of the
method that would benefit from additional light. Besides, the method is continuously
enriched in its applied dimension, and reading this paper will give experienced researchers
the opportunity to stay up to date with the method’s evolution.
The paper references applied work and knowledge throughout. It sometimes happens
that even experienced researchers use a test or a statistical concept whose exact meaning
needs brushing up, because they learned it during their undergraduate years at university.
Furthermore, the paper guides the ARDL energy-growth researcher through the steps that need to be
taken and the way that results should be presented and written in a paper in order to give
readers a feeling of transparency. Moreover, this will offer
comparability among papers and will enable apt meta-analysis, which is so valuable for the progress of
science and the evolution of society.
The paper can also serve as a review and reference paper for post-graduate students who are writing their
MA/MSc (not least PhD) dissertation and need to employ this method. The quintessence of the paper
lies in Tables 1 and 2 of Section 3.3, which separate the ARDL steps between the time-series
and panel-data frameworks. The degree of integration, cointegration, and causality steps are explained
and presented in a structured and well-connected manner, which relieves students of the stress of
selecting the correct test at every step of the implementation.
Last but not least, the content of this paper is useful not only for researchers of the
energy-growth nexus, but also for researchers in other fields such as the tourism-growth nexus or
the broader environment-growth nexus and the Kuznets curve studies.
References
Kraft, John, and Arthur Kraft. 1978. On the Relationship between Energy and GNP. Journal of Energy Development
3: 401–3.
Engle, Robert F., and Clive W. J. Granger. 1987. Co-Integration and Error Correction: Representation, Estimation,
and Testing. Econometrica 55: 251–76. [CrossRef]
Ali, Hamisu Sadi, Siong Hook Law, and Talha Ibrahim Zannah. 2016. Dynamic impact of urbanization, economic
growth, energy consumption, and trade openness on CO2 emissions in Nigeria. Environmental Science and
Pollution Research 23: 12435–43. [CrossRef] [PubMed]
Rahman, Mohammad Mafizur, and Mohammad Abul Kashem. 2017. Carbon emissions, energy consumption
and industrial growth in Bangladesh: Empirical evidence from ARDL cointegration and Granger causality
analysis. Energy Policy 110: 600–8. [CrossRef]
Bölük, Gülden, and Mehmet Mert. 2015. The renewable energy, growth and environmental Kuznets curve in
Turkey: An ARDL approach. Renewable and Sustainable Energy Reviews 52: 587–95. [CrossRef]
Boswijk, H. Peter. 1994. Testing for an unstable root in conditional and structural error correction models.
Journal of Econometrics 63: 37–60. [CrossRef]
Stock, James H., and Mark W. Watson. 1993. A Simple Estimator of Cointegrating Vectors in Higher Order
Integrated Systems. Econometrica 61: 783–820. [CrossRef]
Swamy, Paravastu A. V. B. 1970. Efficient inference in a random coefficient regression model. Econometrica
38: 311–23. [CrossRef]
Fuinhas, José Alberto, and António Cardoso Marques, eds. 2019. The Extended Energy–Growth Nexus. Cambridge:
Academic Press.
Dumitrescu, Elena-Ivona, and Christophe Hurlin. 2012. Testing for Granger non-causality in heterogeneous
panels. Economic Modelling 29: 1450–60. [CrossRef]
Al-hajj, Ekhlas, Usama Al-Mulali, and Sakiru Adebola Solarin. 2018. Oil price shocks and stock returns nexus for
Malaysia: Fresh evidence from nonlinear ARDL test. Energy Reports 4: 624–37. [CrossRef]
Ali, Wajahat, Azrai Abdullah, and Muhammad Azam. 2017. Re-visiting the environmental Kuznets curve
hypothesis for Malaysia: Fresh evidence from ARDL bounds testing approach. Renewable and Sustainable
Energy Reviews 77: 990–1000. [CrossRef]
Pedroni, Peter. 2007. Social Capital, Barriers to Production and Capital Shares: Implications for the Importance of
Parameter Heterogeneity from a Nonstationary Panel Approach. Journal of Applied Econometrics 22: 429–51.
[CrossRef]
Apergis, Nicholas. 2018. Testing for Causality: A Survey of the Current Literature. In The Economics and
Econometrics of the Energy-Growth Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press,
pp. 273–305.
Asafu-Adjaye, John. 2000. The relationship between energy consumption, energy prices and economic growth:
Time series evidence from Asian developing countries. Energy Economics 22: 615–25. [CrossRef]
Bahmani-Oskooee, M. Mohsen, and Gour G. Goswami. 2003. A disaggregated approach to test the J-Curve
phenomenon: Japan versus her major trading partners. Journal of Economics and Finance 27: 102–13. [CrossRef]
Bai, Jushan, and Serena Ng. 2004. A panic attack on unit roots and cointegration. Econometrica 72: 1127–77.
[CrossRef]
Baltagi, Badi H., Qu Feng, and Chihwa Kao. 2012. A Lagrange Multiplier test for cross-sectional dependence in a
fixed effects panel data model. Journal of Econometrics 170: 164–77. [CrossRef]
Banerjee, Anindya, Juan Dolado, and Ricardo Mestre. 1998. Error-correction mechanism tests for cointegration in
a single-equation framework. Journal of Time Series Analysis 19: 267–83. [CrossRef]
Bayer, Christian, and Christoph Hanck. 2013. Combining non-cointegration tests. Journal of Time Series Analysis
34: 83–95. [CrossRef]
Breitung, Jörg. 2000. The local power of some unit root tests for panel data. In Nonstationary Panels,
Panel Cointegration and Dynamic Panels. Edited by Badi H. Baltagi, Thomas B. Fomby and R. Carter Hill.
Bingley: Emerald Group Publishing Limited, vol. 15, pp. 161–78.
Breusch, Trevor S., and Adrian R. Pagan. 1980. The Lagrange multiplier test and its applications to model
specification in econometrics. The Review of Economic Studies 47: 239–53. [CrossRef]
Chang, Yoosoon. 2002. Nonlinear IV unit root tests in panels with cross-sectional dependency. Journal of Econometrics
110: 261–92. [CrossRef]
Choi, In. 2001. Unit root tests for panel data. Journal of International Money and Finance 20: 249–72. [CrossRef]
De Vita, Glauco, Klaus Endresen, and Lester C. Hunt. 2006. An empirical analysis of energy demand in Namibia.
Energy Policy 34: 3447–63. [CrossRef]
Driscoll, John C., and Aart C. Kraay. 1998. Consistent covariance matrix estimation with spatially dependent
panel data. Review of Economics and Statistics 80: 549–59. [CrossRef]
Fuinhas, José Alberto, and António Cardoso Marques. 2012. Energy consumption and economic growth nexus in
Portugal, Italy, Greece, Spain and Turkey: An ARDL bounds test approach (1965–2009). Energy Economics
34: 511–17. [CrossRef]
Geweke, John, Richard Meese, and Warren Dent. 1983. Comparing alternative tests of causality in temporal
systems. Analytic results and experimental evidence. Journal of Econometrics 21: 161–94. [CrossRef]
Groen, Jan J. J., and Frank Kleibergen. 2003. Likelihood-based cointegration analysis in panels of vector
error-correction models. Journal of Business and Economic Statistics 21: 295–318. [CrossRef]
Gutierrez, Luciano. 2003. On the power of panel cointegration tests: A Monte Carlo comparison. Economics Letters
80: 105–11. [CrossRef]
Hadri, Kaddour. 2000. Testing for stationarity in heterogeneous panel data. Econometrics Journal 3: 148–61.
[CrossRef]
Halicioglu, Ferda. 2007. Residential electricity demand dynamics in Turkey. Energy Economics 29: 199–210.
[CrossRef]
Harris, Richard, and Robert Sollis. 2003. Applied Time Series Modelling and Forecasting. West Sussex: Wiley.
Haug, Alfred A. 2002. Temporal aggregation and the power of cointegration tests: A Monte Carlo study.
Oxford Bulletin of Economics and Statistics 64: 399–412. [CrossRef]
Im, Kyung So, M. Hashem Pesaran, and Yongcheol Shin. 2003. Testing for unit roots in heterogeneous panels.
Journal of Econometrics 115: 53–74. [CrossRef]
Inglesi-Lotz, Roula. 2018. The role of potential factors/actors and regime switching modelling. In The Economics and
Econometrics of the Energy-Growth Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press,
p. 387.
Jalil, Abdul, and Ying Ma. 2008. Financial development and economic growth: Time series evidence from Pakistan
and China. Journal of Economic Cooperation among Islamic Countries 29: 29–68.
Johansen, Søren. 1988. Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control
12: 231–54. [CrossRef]
Johansen, Søren, and Katarina Juselius. 1990. Maximum likelihood estimation and inference on cointegration—with
applications to the demand for money. Oxford Bulletin of Economics and Statistics 52: 169–210. [CrossRef]
Kao, Chihwa. 1999. Spurious regression and residual–based tests for cointegration in panel data.
Journal of Econometrics 90: 1–44. [CrossRef]
Kao, Chihwa, and Min-Hsien Chiang. 2000. On the estimation and inference of cointegrated regression in panel
data. In Nonstationary Panels, Panel Cointegration, and Dynamic Panels (Advances in Econometrics). Edited by
Badi H. Baltagi, Thomas B. Fomby and R. Carter Hill. Bingley: Emerald Group Publishing Limited, vol. 15,
pp. 179–222.
Larsson, Rolf, Johan Lyhagen, and Mickael Löthgren. 2001. Likelihood-based cointegration tests in heterogeneous
panels. Econometrics Journal 4: 109–42. [CrossRef]
Lee, Chien-Chiang, and Chun-Ping Chang. 2008. Energy consumption and economic growth in Asian economies:
A more comprehensive analysis using panel data. Resource and Energy Economics 30: 50–65. [CrossRef]
Lee, Junsoo, and Mark C. Strazicich. 2003. Minimum Lagrange Multiplier Unit Root Test with Two Structural
Breaks. Review of Economics and Statistics 85: 1082–89. [CrossRef]
Levin, Andrew, Chien-Fu Lin, and Chia-Shang James Chu. 2002. Unit root tests in panel data: Asymptotic and
finite-sample properties. Journal of Econometrics 108: 1–24. [CrossRef]
Liu, Yaobin. 2009. Exploring the relationship between urbanization and energy consumption in China using
ARDL autoregressive distributed lag and FDM factor decomposition model. Energy 34: 1846–54. [CrossRef]
Maddala, G. S., Shaowen Wu, and Peter C. Liu. 1999. Do panel data rescue purchasing power parity (PPP)
theory? In Panel Data Econometrics: Future Directions. Edited by Jaya Krishnakumar and Elvezio Ronchetti.
New York: Elsevier.
Marques, Luís Miguel, José Alberto Fuinhas, and António Cardoso Marques. 2017. Augmented energy-growth
nexus: Economic, political and social globalization impacts. Energy Procedia 136: 97–101. [CrossRef]
Marques, Luís Miguel, José Alberto Fuinhas, and António Cardoso Marques. 2019. Chapter Four—The impacts
of China’s effect and globalization on the augmented energy–nexus: Evidence in four aggregated regions.
In The Extended Energy-Growth Nexus. Edited by José Alberto Fuinhas and António Cardoso Marques.
Cambridge: Academic Press, pp. 97–139.
Masih, Abul MM, and Rumi Masih. 1996. Energy consumption, real income and temporal causality: Results from
a multi-country study based on cointegration and error-correction modelling techniques. Energy Economics
18: 165–83. [CrossRef]
McCoskey, Suzanne, and Chihwa Kao. 1998. A residual-based test of the null of cointegration in panel data.
Econometric Reviews 17: 57–84. [CrossRef]
Menegaki, Angeliki N., ed. 2018. The Economics and Econometrics of the Energy-Growth Nexus. Cambridge: Academic Press.
Menegaki, Angeliki N., and Can Tansel Tugcu. 2016. Rethinking the energy-growth nexus: Proposing an index of
sustainable economic welfare for Sub-Saharan Africa. Energy Research and Social Science 17: 147–59. [CrossRef]
Menegaki, Angeliki N., and Can Tansel Tugcu. 2018. Two versions of the Index of Sustainable Economic Welfare
(ISEW) in the energy-growth nexus for selected Asian countries. Sustainable Production and Consumption
14: 21–35. [CrossRef]
Moon, Hyungsik Roger, and Benoit Perron. 2004. Testing for a unit root in panels with dynamic factors.
Journal of Econometrics 122: 81–126. [CrossRef]
Narayan, Paresh Kumar, and Russell Smyth. 2005. Electricity consumption, employment and real income in
Australia evidence from multivariate Granger causality tests. Energy Policy 33: 1109–16. [CrossRef]
Narayan, Paresh Kumar, and Russell Smyth. 2006. Higher education, real income and real investment in China:
Evidence from granger causality tests. Education Economics 14: 107–25. [CrossRef]
Odhiambo, Nicholas M. 2009. Energy consumption and economic growth nexus in Tanzania: An ARDL bounds
testing approach. Energy Policy 37: 617–22. [CrossRef]
Pedroni, Peter. 2004. Panel cointegration: Asymptotic and finite sample properties of pooled time series tests with
an application to the PPP hypothesis. Econometric Theory 20: 597–625. [CrossRef]
Pesaran, M. Hashem, Yongcheol Shin, and Richard J. Smith. 2001. Bounds testing approaches to the analysis of
level relationships. Journal of Applied Econometrics 16: 289–326. [CrossRef]
Pesaran, M. Hashem, and Yongcheol Shin. 1999. An Autoregressive Distributed Lag Modelling Approach to
Cointegration Analysis. In Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial
Symposium. Edited by Steinar Strøm. Cambridge: Cambridge University Press.
Pesaran, M. Hashem. 2004. General Diagnostic Tests for Cross Sectional Dependence in Panels. Cambridge Working
Papers in Econometrics, No: 0435. Cambridge: Faculty of Economics, University of Cambridge.
Pesaran, M. Hashem. 2007. A simple panel unit root test in the presence of cross-section dependence. Journal of
Applied Econometrics 22: 265–312. [CrossRef]
Pesaran, M. Hashem, and Takashi Yamagata. 2005. Testing Slope Homogeneity in Large Panels (March 2005). IEPR
Working Paper No. 05.14; CESifo Working Paper No. 1438. Available online: https://ssrn.com/abstract=671050
(accessed on 6 June 2019).
Phillips, Peter CB, and Bruce E. Hansen. 1990. Statistical inference in instrumental variables regression with I(1)
processes. Review of Economic Studies 57: 99–125. [CrossRef]
Shahbaz, Muhammad. 2018. Current Issues in Time-Series Analysis for the Energy-Growth Nexus (EGN);
Asymmetries and Nonlinearities, Case Study: Pakistan. In The Economics and Econometrics of the Energy-Growth
Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press, p. 387.
Shin, Yongcheol, Byungchul Yu, and Matthew Greenwood-Nimmo. 2011. Modelling Asymmetric Cointegration
and Dynamic Multiplier in a Nonlinear ARDL Framework. In Festschrift in Honor of Peter Schmidt.
Rochester: SSRN.
Shin, Yongcheol, Byungchul Yu, and Matthew Greenwood-Nimmo. 2014. Modelling Asymmetric Cointegration
and Dynamic Multipliers in a Nonlinear ARDL Framework. In Festschrift in Honor of Peter Schmidt. New York:
Springer, pp. 281–314.
Toda, Hiro Y., and Taku Yamamoto. 1995. Statistical inference in vector autoregressions with possibly integrated
processes. Journal of Econometrics 66: 225–50. [CrossRef]
Tugcu, Can Tansel. 2018. Panel Data Analysis in the Energy-Growth Nexus (EGN). In The Economics and
Econometrics of the Energy-Growth Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press, pp. 255–71.
Tursoy, Turgut, and Faisal Faisal. 2018. The impact of gold and crude oil prices on stock market in Turkey:
Empirical evidences from ARDL bounds test and combined cointegration. Resources Policy 55: 49–54.
[CrossRef]
Westerlund, Joakim. 2007. Testing for error correction in panel data. Oxford Bulletin of Economics and Statistics
69: 709–48. [CrossRef]
© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Modelling Asymmetric Cointegration and Dynamic
Multipliers in a Nonlinear ARDL Framework
Yongcheol Shin
University of York
Byungchul Yu
Department of International Trade, Dong-A University
Matthew Greenwood-Nimmo
Leeds University Business School
November 9, 2011
Abstract
short- and long-run nonlinearities are introduced via positive and negative partial
is estimable by OLS and that reliable long-run inference can be achieved by bounds-
asymmetric dynamic multipliers that graphically depict the traverse between the
short- and the long-run. The salient features of the model are illustrated using the
ers, Nonlinear ARDL (NARDL) ECM-based Estimation and Tests, Nonlinear Unemployment-
The nonlinearity of many macroeconomic variables and processes has long been recog-
nised. In a famous remark, Keynes (1936, p. 314) noted that “the substitution of a down-
ward for an upward tendency often takes place suddenly and violently, whereas there
is, as a rule, no such sharp turning point when an upward is substituted for a downward
tendency”. More recently, the joint fields of behavioural finance and economics associated
most notably with Daniel Kahneman, Amos Tversky and Robert Shiller (e.g. Kahneman
and Tversky, 1979; Shiller, 1993, 2005) have provided a considerable impetus to the mod-
elling of asymmetry, stressing that nonlinearity is endemic within the social sciences and
Since the mid-nineties, a substantial literature has considered the joint issues of nonsta-
tionarity and nonlinearity. This field has been dominated by three regime-switching mod-
els: the threshold ECM associated with Balke and Fomby (1997), the Markov-switching
ECM of Psaradakis et al. (2004), and the smooth transition regression ECM developed
by Kapetanios et al. (2006). The development of this literature reflects the belief that
the information revealed by linear models may be insufficiently rich to permit strong in-
ference or to yield reliable forecasts. More generally, it suggests a general concern that
the assumption of linear adjustment may be excessively restrictive in a wide range of eco-
The majority of these studies, however, maintain the assumption that the long-run re-
tic regressors. With the notable exceptions of Park and Phillips (2001), Saikkonen and
has been devoted to the analysis of nonlinear cointegration. Schorderet (2001, 2003) has
where output is decomposed into partial sum processes of positive and negative changes.
On the basis of this piecewise linear specification, he finds that the impact of recessions
an hysteretic relationship. Granger and Yoon (2002) further develop the notion that the
cointegrating relationship may be defined between the positive and negative components
Partial sum decompositions have been applied with some success to the analysis of dynamic
asymmetry. Examples include Webber’s (2000) analysis of the relationship between
the exchange rate and import prices, Lee’s (2000) and Virén’s (2001) work on asymmetries
in Okun’s Law and the research of Borenstein et al. (1997) and Bachmeier and Griffin
(2003) focusing on the asymmetric response of gasoline prices to fluctuations in the
oil price. However, most papers modelling short-run asymmetry employ the two-step
Engle-Granger technique, which is inherently less efficient than single-step ECM estimation.
Moreover, papers coherently modelling long- and short-run asymmetries jointly are
scarce.
Our purpose in this paper is to develop a simple and flexible nonlinear dynamic framework
capable of simultaneously and coherently modelling asymmetries both in the underlying
long-run relationship and in the patterns of dynamic adjustment. We make four
principal contributions. Firstly, we derive the dynamic error correction representation associated
with the asymmetric long-run cointegrating regression, resulting in the nonlinear
ARDL (NARDL) model. Secondly, following Pesaran and Shin (1998) and Pesaran et
al. (2001), we employ a pragmatic bounds-testing procedure for the existence of a stable
long-run relationship which is valid irrespective of whether the underlying regressors are
multipliers that allow us to trace out the asymmetric adjustment patterns following pos-
itive and negative shocks to the explanatory variables. This has substantial theoretical
rium following a perturbation to the system. Such is the flexibility of our framework
that it can readily accommodate the four general combinations of long- and short-run
asymmetry. Finally, we conduct a range of Monte Carlo experiments which largely val-
idate our estimation and inferential framework, revealing little bias in estimation and
considerable power of the key test statistics. Moreover, we compute empirical p-values
for the cointegration tests and confidence intervals for our dynamic multipliers by means
our proposed methodology: it is easily estimable by OLS and simple inferential methods
provide a straightforward and reliable means of discriminating between the various forms
We demonstrate the usefulness of the NARDL framework through two empirical ap-
and Japan over the period 1982m2–2003m11. We find strong evidence of long-run asym-
metry consistent with the growing consensus that unemployment is more sensitive to busts
that firms are quick to fire and slow to hire. Finally, the dynamic multipliers reveal a
pattern that is often obscured in discussions of persistence – although the half-life of an
the real impact in terms of jobs created/lost is larger in the recessionary case. It fol-
lows, therefore, that focusing on the half-life of a shock is insufficient when the long-run
relationship is asymmetric as this fails to convey relevant information about the relative
Our second application investigates the asymmetric responses of Korean retail gasoline
prices to fluctuations in the crude oil spot price and the Korean Won/US Dollar exchange
rate over the period 1991q1–2007q2. Our results indicate that the long-run relationship is
linear in both variables, indicating that retailers pass cost changes through to consumers
symmetrically in the long-run. However, the speed of upward adjustment exceeds that of
the ‘rockets and feathers’ hypothesis associated with Bacon (1991). Moreover, our results
support the findings of Asplund et al. (2000) that the short-run response of gasoline prices
to the exchange rate is more pronounced than that associated with fluctuations in the
Finally, the flexibility and utility of the NARDL technique are reflected in the growing
literature that has adopted our technique for the analysis of a range of economic issues.
Van Treeck (2008) has employed the NARDL model in his analysis of asymmetric wealth
effects on US consumption, and has found that liquidity constraints and loss-aversion can
be reconciled inter-temporally, with the former dominating in the short-run and the latter
in the long-run. More recently, Delatte and López-Villavicencio (2010, 2011) have applied
the NARDL technique in their analysis of long-run asymmetries in the pass-through from
exchange rates to consumer prices in developed economies. Nguyen and Shin (2010) have
estimated NARDL models on high frequency exchange rate data, revealing interesting
Shin and Van Treeck (2011) have estimated NARDL models of the interest rate pass-
through relationship in the USA and Germany, finding strong evidence of time-varying
asymmetry. An important and relatively common finding in this literature is that the
direction of asymmetry may switch between the short-run and the long-run. For example,
a positive shock may have a larger absolute effect in the short-run while a negative shock
has a larger absolute effect in the long-run (or vice-versa). The simplicity and flexibility
of NARDL renders it an ideal framework with which to model such complex phenomena.
regression model and derives the associated asymptotic theory. On this basis, the NARDL
model is derived including expressions for the asymmetric cumulative dynamic multipliers,
and the associated testing procedures are developed. Section 3 employs a range of Monte
Carlo simulations to investigate the finite sample properties of the proposed estimators
and the test statistics. Section 4 presents the results of our two empirical illustrations.
Lastly, Section 5 offers some concluding remarks, while mathematical proofs are collected
in the Appendix.
2 Modelling Asymmetries in a Nonlinear ARDL Framework
run relationships has led to the proliferation of regime-switching models. Among existing
studies, nonlinearity is typically confined to the error correction mechanism and estimation
proceeds on the basis of either the threshold ECM associated with Balke and Fomby
(1997), the Markov-Switching ECM of Psaradakis et al. (2004) or the smooth transition
regression ECM developed by Kapetanios et al. (2006). However, the common assumption
The three regime-switching type functional forms mentioned above are equally applicable
to the case of long-run asymmetry (Saikkonen and Choi, 2004; Escribano et al., 2006).
in the long-run relationship and the error correction mechanism coherently. In practice,
however, selection of the regime-switching variables and the transition functional forms
developing a nonlinear modelling framework based on the ARDL approach which provides
a simple and flexible vehicle for the analysis of joint long- and short-run asymmetries.
2.1 Nonlinear Asymmetric Cointegration
Before developing the full representation of the NARDL model, we introduce the following
$$y_t = \beta^{+} x_t^{+} + \beta^{-} x_t^{-} + u_t, \tag{2.1}$$
$$\Delta x_t = v_t, \tag{2.2}$$
where $y_t$ and $x_t$ are scalar I(1) variables, and $x_t$ is decomposed as $x_t = x_0 + x_t^{+} + x_t^{-}$, where
$x_t^{+}$ and $x_t^{-}$ are partial sum processes of positive and negative changes in $x_t$:
$$x_t^{+} = \sum_{j=1}^{t} \Delta x_j^{+} = \sum_{j=1}^{t} \max(\Delta x_j, 0), \qquad x_t^{-} = \sum_{j=1}^{t} \Delta x_j^{-} = \sum_{j=1}^{t} \min(\Delta x_j, 0). \tag{2.3}$$
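The partial sum decomposition in (2.3) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `partial_sums` and the sample series are hypothetical, and the first observation is treated as $x_0$.

```python
import numpy as np

def partial_sums(x):
    """Return the positive and negative partial sum processes of x, as in (2.3).

    The first observation is treated as the initial value x_0, so both
    partial sums start at zero and x = x_0 + x_pos + x_neg holds exactly.
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)  # first differences of x
    x_pos = np.concatenate(([0.0], np.cumsum(np.maximum(dx, 0.0))))
    x_neg = np.concatenate(([0.0], np.cumsum(np.minimum(dx, 0.0))))
    return x_pos, x_neg

# Hypothetical example series
x = np.array([1.0, 1.5, 1.2, 2.0, 1.0])
x_pos, x_neg = partial_sums(x)
```

By construction the two components add back to the original series once the initial value is restored, which is a convenient sanity check before estimation.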
This simple approach to modelling asymmetric cointegration based on partial sum decom-
positions has been applied by Schorderet (2001) in the context of the nonlinear relationship
Granger and Yoon (2002) advance the concept of ‘hidden cointegration’, where coin-
tegrating relationships may be defined between the positive and negative components of
the underlying variables. They demonstrate the relevance of this conceptual framework in
the context of the linkage between US short- and long-term interest rates and the output-
unemployment relationship, both of which are notable for the lack of robust evidence of
linear cointegration. Schorderet (2003) generalises this concept and defines the following
that standard linear (symmetric) cointegration is a special case of (2.4), obtained only
if β0+ = β0− and β1+ = β1− . Schorderet modifies (2.4) to analyse hidden cointegration,
where only one component of each series appears in (2.4), developing a model of the
Lardic and Mignon (2008) analyse hidden cointegration between the price of oil and GDP,
although they fail to provide any economically meaningful interpretation of the estimated
asymmetric coefficients.
Given the difficulty in interpreting the results of hidden cointegration analysis, we will
focus on (2.1), imposing the restriction β0+ = β0− = β0 in (2.4) such that β + = −β1+ /β0 and
β − = −β1− /β0 . To achieve the greatest possible clarity of exposition, we initially begin
with the case of a single regressor decomposed into the relevant partial sum processes.
Assumption 1 The disturbances ut and vt in (2.1) and (2.2) follow iid processes with
zero means and finite variances, and they are independently distributed.
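Theorem 1 Consider the asymmetric cointegrating regression, (2.1) and (2.2). Under
Assumption 1, the OLS estimators of $\beta^{+}$ and $\beta^{-}$ have the following asymptotic distributions:
$$T(\hat{\beta}^{+} - \beta^{+}) \Rightarrow -\frac{\mu^{-}\sigma_u}{\sigma_s}\, \frac{\frac{1}{3}\int W_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int r W_{\tilde{s}}(r)\,dr \left( W_{\tilde{u}}(1) - \int W_{\tilde{u}}(r)\,dr \right)}{\frac{1}{3}\int W_{\tilde{s}}(r)^{2}\,dr - \left( \int r W_{\tilde{s}}(r)\,dr \right)^{2}},$$
$$T(\hat{\beta}^{-} - \beta^{-}) \Rightarrow \frac{\mu^{+}\sigma_u}{\sigma_s}\, \frac{\frac{1}{3}\int W_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int r W_{\tilde{s}}(r)\,dr \left( W_{\tilde{u}}(1) - \int W_{\tilde{u}}(r)\,dr \right)}{\frac{1}{3}\int W_{\tilde{s}}(r)^{2}\,dr - \left( \int r W_{\tilde{s}}(r)\,dr \right)^{2}},$$
$\sigma_u^{2} := Var(u_t)$, $\sigma_s^{2} := Var(s_t)$, and $W_{\tilde{s}}(\cdot)$ and $W_{\tilde{u}}(\cdot)$ are two independent standard Brownian
motions defined on $r \in [0, 1]$, and obtained as the weak limit of partial sum processes,
$T^{-1/2}\sum_{j=1}^{T(\cdot)} \tilde{s}_j$ and $T^{-1/2}\sum_{j=1}^{T(\cdot)} \tilde{u}_j$, with $\tilde{u}_t := u_t/\sigma_u$ and $\tilde{s}_t := s_t/\sigma_s$. Furthermore,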
Remark 1 In the special case when vt follows a symmetric distribution with µ+ = µ−
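$$T(\hat{\beta}^{-} - \beta^{-}),\; T(\hat{\beta}^{+} - \beta^{+}) \Rightarrow \frac{\frac{1}{3}\int W_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int r W_{\tilde{s}}(r)\,dr \left( W_{\tilde{u}}(1) - \int W_{\tilde{u}}(r)\,dr \right)}{\frac{1}{3}\int W_{\tilde{s}}(r)^{2}\,dr - \left( \int r W_{\tilde{s}}(r)\,dr \right)^{2}},$$
such that $T\left\{ (\hat{\beta}^{-} - \beta^{-}) + (\hat{\beta}^{+} - \beta^{+}) \right\} = o_p(1)$.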
Remark 2 Let $\beta = (\beta^{+}, \beta^{-})'$, then
$$T(\hat{\beta} - \beta) \overset{a}{\sim} MN(0, V), \tag{2.5}$$
where $V = \mathrm{plim}_{T\to\infty}\, T^{2}(X'X)^{-1}\sigma_u^{2}$. Even though $x_t^{+}$ and $x_t^{-}$ are dominated by the
deterministic trends by construction, these leading terms cancel off in the derivation of
$(X'X)^{-1}$ such that $\mathrm{plim}_{T\to\infty}\, T^{2}(X'X)^{-1}$ is well-defined and standard inference on $\beta$ re-
Remark 3 In a similar manner, when an intercept term is included, we can obtain the
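$$T(\hat{\beta}^{+} - \beta^{+}) \Rightarrow -\frac{\mu^{-}\sigma_u}{\sigma_s}\, \frac{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int (r - \frac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr \int (r - \frac{1}{2})\,dW_{\tilde{u}}(r)}{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)^{2}\,dr - \left( \int (r - \frac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr \right)^{2}};$$
$$T(\hat{\beta}^{-} - \beta^{-}) \Rightarrow \frac{\mu^{+}\sigma_u}{\sigma_s}\, \frac{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int (r - \frac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr \int (r - \frac{1}{2})\,dW_{\tilde{u}}(r)}{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)^{2}\,dr - \left( \int (r - \frac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr \right)^{2}};$$
and $T\{\mu^{+}(\hat{\beta}^{+} - \beta^{+}) + \mu^{-}(\hat{\beta}^{-} - \beta^{-})\} = o_P(1)$, where $\tilde{W}_{\tilde{s}}(r) := W_{\tilde{s}}(r) - \int W_{\tilde{s}}(r)\,dr$ for
$r \in [0, 1]$.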
The simple case presented above is useful for exposition and will certainly cover some
empirical applications. However, it is too restrictive since it does not allow for weak en-
dogeneity of the regressors and/or serially correlated errors, factors that will significantly
affect both the asymptotic and the small sample properties of the estimators. In their
presence, the OLS estimator in (2.1) may remain super-consistent but the asymptotic dis-
tribution is non-Gaussian. Hence, hypothesis testing cannot be carried out in the usual
manner without removing both the serial correlation and the endogeneity of the regres-
sors. In particular, the resulting OLS estimator of the cointegrating parameter will be
In the linear cointegration literature, several solutions to these twin problems have
been proposed in the context of the static regression model (Phillips and Hansen, 1990;
Saikkonen, 1991) and the dynamic regression model (Pesaran and Shin, 1998). Given
that our interest is in developing a fully dynamic model, we naturally choose to extend
the ARDL approach popularised by Pesaran and Shin (1998) and Pesaran et al. (2001),
thereby developing a flexible dynamic parametric framework with which to model rela-
$$y_t = \sum_{j=1}^{p} \phi_j y_{t-j} + \sum_{j=0}^{q} \left( \theta_j^{+\prime} x_{t-j}^{+} + \theta_j^{-\prime} x_{t-j}^{-} \right) + \varepsilon_t, \tag{2.6}$$
where $x_t$ is a $k \times 1$ vector of multiple regressors defined such that $x_t = x_0 + x_t^{+} + x_t^{-}$, $\phi_j$ is
the autoregressive parameter, $\theta_j^{+}$ and $\theta_j^{-}$ are the asymmetric distributed-lag parameters,
and $\varepsilon_t$ is an iid process with zero mean and constant variance, $\sigma_\varepsilon^{2}$. Throughout this
paper we will focus on the case in which $x_t$ is decomposed into $x_t^{+}$ and $x_t^{-}$ around
a threshold of zero, thereby distinguishing between positive and negative changes in the
rate of growth of xt . The resulting partial sum processes maintain an intuitively appealing
correction form as
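$$\begin{aligned}
\Delta y_t &= \rho y_{t-1} + \theta^{+\prime} x_{t-1}^{+} + \theta^{-\prime} x_{t-1}^{-} + \sum_{j=1}^{p-1} \gamma_j \Delta y_{t-j} + \sum_{j=0}^{q-1} \left( \varphi_j^{+\prime} \Delta x_{t-j}^{+} + \varphi_j^{-\prime} \Delta x_{t-j}^{-} \right) + \varepsilon_t \\
&= \rho \xi_{t-1} + \sum_{j=1}^{p-1} \gamma_j \Delta y_{t-j} + \sum_{j=0}^{q-1} \left( \varphi_j^{+\prime} \Delta x_{t-j}^{+} + \varphi_j^{-\prime} \Delta x_{t-j}^{-} \right) + \varepsilon_t
\end{aligned} \tag{2.7}$$
where $\rho = \sum_{j=1}^{p} \phi_j - 1$, $\gamma_j = -\sum_{i=j+1}^{p} \phi_i$ for $j = 1, ..., p-1$, $\theta^{+} = \sum_{j=0}^{q} \theta_j^{+}$, $\theta^{-} = \sum_{j=0}^{q} \theta_j^{-}$,
$\varphi_0^{+} = \theta_0^{+}$, $\varphi_j^{+} = -\sum_{i=j+1}^{q} \theta_i^{+}$ for $j = 1, ..., q-1$, $\varphi_0^{-} = \theta_0^{-}$, $\varphi_j^{-} = -\sum_{i=j+1}^{q} \theta_i^{-}$
for $j = 1, ..., q-1$, and $\xi_t = y_t - \beta^{+\prime} x_t^{+} - \beta^{-\prime} x_t^{-}$ is the nonlinear error correction term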
the regressors and the residuals in (2.7) we now consider the following reduced form data
where v t ∼ iid (0, Σv ), with Σv being a k × k positive definite covariance matrix. Given
$$\varepsilon_t = \omega' v_t + e_t = \omega' \left( \Delta x_t - \sum_{j=1}^{q-1} \Lambda_j \Delta x_{t-j} \right) + e_t \tag{2.9}$$
where et is uncorrelated with v t by construction. Substituting (2.9) into (2.7) and rear-
It is clear that (2.10) corrects perfectly for the weak endogeneity of any nonstationary
explanatory variables and that the choice of an appropriate lag structure will render the
model free from residual serial correlation. Our model combines many of the desirable
attributes of the fully-modified and the ARDL-based dynamic corrections associated respectively
with Phillips and Hansen (1990) and Pesaran and Shin (1998) in a dynamic
parametric framework capable of modelling both long- and short-run asymmetries. More-
Following the conditions used in the derivations above, we now summarise the following
(2.8); (iii) et is uncorrelated with v t through the conditional modelling, (2.9); (iv) the
Following Theorems 3.1 and 3.2 in Pesaran and Shin (1998), it is straightforward to
show under Assumption 2 that: (i) the OLS estimators of all the short-run dynamic
parameters in (2.10) are $\sqrt{T}$-consistent and have the asymptotic normal distribution,
and (ii) the OLS estimators of the long-run parameters computed as $\hat{\beta}^{+} = -\hat{\theta}^{+}/\hat{\rho}$ and
$\hat{\beta}^{-} = -\hat{\theta}^{-}/\hat{\rho}$, are $T$-consistent and follow the mixture normal distribution as defined in
or symmetric short-run coefficients can be tested using the Wald statistic following an
asymptotic χ2 distribution. In order to assess the extent to which these theoretical pre-
dictions are validated in both large and small samples, we will conduct a series of Monte
We develop two operational testing procedures for the existence of an asymmetric (cointe-
grating) long-run relationship based on the NARDL ECM, (2.10). If ρ = 0, (2.10) reduces
to the regression involving only first differences, implying that there is no long-run relationship
between the levels of $y_t$, $x_t^{+}$ and $x_t^{-}$. We first follow Banerjee et al. (1998) and
propose the t-statistic testing ρ = 0 against ρ < 0 in (2.10). Next, we follow Pesaran,
Shin and Smith (2001) and propose an F-test of the joint null, ρ = θ + = θ − = 0 in (2.10).
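The two test statistics just described can be sketched as follows. This is an illustrative implementation under simplifying assumptions that are not taken from the paper: a single regressor, one lag (so the ECM contains only levels and contemporaneous differences), an unrestricted intercept, and hypothetical function names `ols` and `nardl_tests`. Both statistics have non-standard null distributions, so they should not be compared with conventional t or F tables.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, the residual sum of squares and the coefficient covariance."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    ssr = float(resid @ resid)
    sigma2 = ssr / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return b, ssr, cov

def nardl_tests(y, x_pos, x_neg):
    """t-statistic on rho and F-statistic for rho = theta+ = theta- = 0.

    Assumes a NARDL ECM with an intercept, lagged levels of y, x+ and x-,
    and contemporaneous first differences of x+ and x- only.
    """
    dy, dxp, dxn = np.diff(y), np.diff(x_pos), np.diff(x_neg)
    levels = np.column_stack([y[:-1], x_pos[:-1], x_neg[:-1]])  # lagged levels
    short = np.column_stack([np.ones_like(dy), dxp, dxn])       # intercept and differences
    X_u = np.column_stack([levels, short])                      # unrestricted model
    b, ssr_u, cov = ols(X_u, dy)
    _, ssr_r, _ = ols(short, dy)                                # restricted: no level terms
    t_stat = b[0] / np.sqrt(cov[0, 0])                          # tests rho = 0 against rho < 0
    n, k_u, n_restr = X_u.shape[0], X_u.shape[1], 3
    f_stat = ((ssr_r - ssr_u) / n_restr) / (ssr_u / (n - k_u))
    return t_stat, f_stat
```

The F-statistic is computed from restricted and unrestricted residual sums of squares, which is numerically equivalent to the usual Wald form for linear restrictions in a homoskedastic regression.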
The asymptotic distributions of these test statistics are non-standard under their respective
null hypotheses and their exact asymptotic distributions are generally complicated
to derive due to the complex dependence structure between $x_t^{+}$ and $x_t^{-}$, especially
when the means of $\Delta y_t$ and $\Delta x_t$ are non-zero.9 In light of these difficulties, we propose the
use of the pragmatic ‘bounds-testing’ approach advanced by Pesaran et al. (2001). Two
extreme cases can be identified, one in which the level regressors $x_t^{+}$ and $x_t^{-}$ in (2.10) are
all I(1), and the other in which they are all I(0). It follows that critical values tabulated
for these two scenarios provide critical value bounds for all classifications, irrespective of
whether the regressors are I(0), I(1) or mutually cointegrated. This approach is particularly
useful in the current context due to the various dependence structures (including
cointegration) that may exist between $x_t^{+}$ and $x_t^{-}$. Following Pesaran et al. (2001), we
differentiate between five cases of (2.10) for the $F_{PSS}$ statistic: (i) without intercept or
linear trend; (ii) with restricted intercept only; (iii) with unrestricted intercept only; (iv)
with intercept and restricted linear trend; and (v) with intercept and unrestricted linear
trend. Similarly, for the $t_{BDM}$ statistic we differentiate between cases (i), (iii) and (v).
Pesaran et al. (2001) tabulate the critical value bounds for both the $F_{PSS}$ and $t_{BDM}$
statistics under each of these cases for a range of values of k, the number of regressors
In the context of the NARDL model, due to the dependence structure that exists
between the partial sum decompositions $x_t^{+}$ and $x_t^{-}$, the exact value of $k$ is not clear.
In the simplest case where the long-run relationship is defined between $y_t$, $x_t^{+}$ and $x_t^{-}$, it
follows that the true value of k lies between 1 and 2.10 In general, we expect that the test
the k = 1 critical values results in a more conservative test (a higher critical value) so, at
a pragmatic level, rejecting the null of no long-run relationship using these critical values
the test can be readily resolved by bootstrapping, although in practice we find that the
pragmatic approach typically leads to the same conclusion. This observation is reinforced
below by a series of Monte Carlo simulation experiments designed to evaluate the finite
sample properties of the PSS test and the associated bootstrapping routine.
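The pragmatic bounds-testing rule can be summarised in a short sketch. The decision logic follows the usual bounds-test convention (reject above the I(1) bound, do not reject below the I(0) bound, inconclusive in between); the function names are hypothetical, and the numerical bounds below are placeholders chosen for illustration, not tabulated critical values, which should be taken from Pesaran et al. (2001) for the relevant case and significance level.

```python
def bounds_decision(f_stat, lower, upper):
    """Classify an F-statistic against the I(0) (lower) and I(1) (upper) bounds."""
    if f_stat > upper:
        return "reject"
    if f_stat < lower:
        return "do not reject"
    return "inconclusive"

def pragmatic_bounds(f_stat, bounds_by_k):
    """Apply the bounds rule for each candidate value of k (here k = 1 and k = 2)."""
    return {k: bounds_decision(f_stat, lo, hi) for k, (lo, hi) in bounds_by_k.items()}

# Placeholder bounds for illustration only (NOT tabulated critical values).
# The k = 1 bounds are set higher, mirroring the more conservative case.
illustrative_bounds = {1: (5.0, 6.0), 2: (4.0, 5.0)}
result = pragmatic_bounds(5.5, illustrative_bounds)
```

A rejection under the conservative k = 1 bounds implies a rejection under the k = 2 bounds as well, which is why the former provides the stronger evidence.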
2.4 Asymmetric Dynamic Multipliers
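in $x_t^{+}$ and $x_t^{-}$, respectively, on $y_t$. Consider the ARDL-in-levels representation of (2.10):
$$\phi(L) y_t = \theta^{+}(L) x_t^{+} + \theta^{-}(L) x_t^{-} + e_t, \tag{2.11}$$
$$y_t = \lambda^{+}(L) x_t^{+} + \lambda^{-}(L) x_t^{-} + [\phi(L)]^{-1} e_t, \tag{2.12}$$
where $\lambda^{+}(L) = \sum_{j=0}^{\infty} \lambda_j^{+} L^{j} = \phi(L)^{-1}\theta^{+}(L)$ and $\lambda^{-}(L) = \sum_{j=0}^{\infty} \lambda_j^{-} L^{j} = \phi(L)^{-1}\theta^{-}(L)$.12
The cumulative dynamic multiplier effects of $x_t^{+}$ and $x_t^{-}$ on $y_t$ can be evaluated as follows:
$$m_h^{+} = \sum_{j=0}^{h} \frac{\partial y_{t+j}}{\partial x_t^{+}} = \sum_{j=0}^{h} \lambda_j^{+}, \qquad m_h^{-} = \sum_{j=0}^{h} \frac{\partial y_{t+j}}{\partial x_t^{-}} = \sum_{j=0}^{h} \lambda_j^{-}, \qquad h = 0, 1, 2, ... \tag{2.13}$$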
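The multipliers in (2.13) can be computed recursively from the levels coefficients, since $\phi(L)\lambda(L) = \theta(L)$ implies $\lambda_j = \theta_j + \sum_{i=1}^{\min(j,p)} \phi_i \lambda_{j-i}$ with $\theta_j = 0$ for $j > q$. The sketch below illustrates this for a scalar regressor; the function name and the coefficient values are hypothetical.

```python
import numpy as np

def cumulative_multipliers(phi, theta, horizon):
    """Cumulative dynamic multipliers m_h for h = 0, ..., horizon.

    phi:   [phi_1, ..., phi_p], autoregressive coefficients of the levels ARDL
    theta: [theta_0, ..., theta_q], distributed-lag coefficients on one partial sum
    """
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    lam = np.zeros(horizon + 1)
    for j in range(horizon + 1):
        lam[j] = theta[j] if j < len(theta) else 0.0   # theta_j = 0 beyond lag q
        for i in range(1, min(j, len(phi)) + 1):
            lam[j] += phi[i - 1] * lam[j - i]          # recursion from phi(L) lambda(L) = theta(L)
    return np.cumsum(lam)

# Hypothetical coefficients: rho = phi_1 - 1 = -0.5 in both cases
m_pos = cumulative_multipliers(phi=[0.5], theta=[1.0], horizon=40)
m_neg = cumulative_multipliers(phi=[0.5], theta=[2.0, -0.5], horizon=40)
```

With these values the multipliers settle at $\sum_j \theta_j / (1 - \sum_j \phi_j)$, that is 2 for the positive component and 3 for the negative component, matching the long-run coefficients $-\theta/\rho$.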
Notice that, by construction, as $h \to \infty$, $m_h^{+} \to \beta^{+}$ and $m_h^{-} \to \beta^{-}$, where $\beta^{+} = -\theta^{+}/\rho$
and $\beta^{-} = -\theta^{-}/\rho$ are the asymmetric long-run coefficients. There is little reason to believe
that the dynamic adjustment patterns summarised by $m_h^{+}$ and $m_h^{-}$ should generally
be symmetric. Therefore, even though we do not directly model asymmetric error cor-
rection (i.e. we do not allow for regime-dependency of ρ in (2.10)) we may still observe
important feature of the NARDL model. In the interest of clarity, when discussing asymmetry
we tend to distinguish only between long- and short-run asymmetries. However,
the NARDL model in fact admits three general forms of asymmetry: (i) long-run or reaction
asymmetry, associated with $\beta^{+} \neq \beta^{-}$; (ii) impact asymmetry, associated with the
inequality of the coefficients on the contemporaneous first differences $\Delta x_t^{+}$ and $\Delta x_t^{-}$; (iii)
adjustment asymmetry, captured by the patterns of adjustment from initial equilibrium
to the new equilibrium following an economic perturbation (i.e. the dynamic multipliers).
Adjustment asymmetry derives from the interaction of impact and reaction asymmetries
In practice, the patterns of dynamic adjustment will depend on the model specification.
Four distinct cases can be identified: the unrestricted specification, (2.10), accommodating
asymmetries in both the short- and long-run and three restricted specifications obtained by
imposing short- and long-run symmetry restrictions in (2.10), either separately or jointly.
the response of retail gasoline prices to fluctuations in the price of crude oil by implicitly
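imposing the long-run symmetry restrictions $\theta^{+} = \theta^{-} = \theta$ such that (2.10) simplifies to13
$$\Delta y_t = \rho y_{t-1} + \theta x_{t-1} + \sum_{i=1}^{p-1} \gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1} \left( \pi_i^{+} \Delta x_{t-i}^{+} + \pi_i^{-} \Delta x_{t-i}^{-} \right) + e_t. \tag{2.14}$$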
Models of this form have also been employed by Shirvani and Wilbratte (2000) and Apergis
and Miller (2006) in their analysis of short-run asymmetric wealth effects on consumption
Short-run symmetry restrictions can take either of two forms: (i) $\pi_i^{+} = \pi_i^{-}$ for all
$i = 0, ..., q-1$ or (ii) $\sum_{i=0}^{q-1} \pi_i^{+} = \sum_{i=0}^{q-1} \pi_i^{-}$. When imposing such restrictions in the
Finally, the most restrictive specification is obtained when assuming linearity of the long-
$$\Delta y_t = \rho y_{t-1} + \theta x_{t-1} + \sum_{i=1}^{p-1} \gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1} \pi_i \Delta x_{t-i} + e_t. \tag{2.16}$$
It is clear that (2.14), (2.15) and (2.16) are special cases of the unrestricted specifica-
tion described by (2.10) and that the long- and short-run symmetry restrictions can be
easily tested in the usual manner following our proposed methodology. Our early experi-
mentation with the model, as well as the results adduced in Van Treeck (2008), Nguyen
and Shin (2010) and Greenwood-Nimmo, Shin and Van Treeck (2011), suggest that the
dynamic multipliers obtained from the various cases are generally significantly different
from one-another. Moreover, it is generally the case that the results of linear estimation
are profoundly misleading when the underlying relationship is, in fact, asymmetric. This
will become apparent during the discussion of our empirical illustrations in Section 4.
A simple and useful addition to the general typology developed above is the extension
to the case where a subset of regressors enters the long-run relationship symmetrically:15
$$y_t = \beta^{+\prime} x_t^{+} + \beta^{-\prime} x_t^{-} + \gamma' w_t + u_t, \tag{2.17}$$
where $x_t = x_0 + x_t^{+} + x_t^{-}$ is a $k \times 1$ vector of regressors entering the model asymmetri-
of partial asymmetry to both the long- and short-run within our NARDL model, we
obtain:
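$$\begin{aligned}
\Delta y_t &= \rho y_{t-1} + \theta^{+} x_{t-1}^{+} + \theta^{-} x_{t-1}^{-} + \theta_w w_{t-1} \\
&\quad + \sum_{i=1}^{p-1} \gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1} \left( \pi_i^{+} \Delta x_{t-i}^{+} + \pi_i^{-} \Delta x_{t-i}^{-} + \pi_{w,i} \Delta w_{t-i} \right) + e_t.
\end{aligned} \tag{2.18}$$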
In light of the bounds-testing approach employed above, it follows that estimation and
inference proceed exactly as before, irrespective of whether xt and wt are I(0), I(1) or
mutually cointegrated. Furthermore, it is once again clear that this partially asymmetric
In order to investigate the finite sample properties of the estimators we conduct a range of
Monte Carlo experiments based on the following simple data generating process (DGP):
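$$\Delta y_t = a + \rho \left( y_{t-1} - \beta^{+} x_{t-1}^{+} - \beta^{-} x_{t-1}^{-} \right) + \varphi^{+} \Delta x_t^{+} + \varphi^{-} \Delta x_t^{-} + u_t, \tag{3.19}$$
where $\Delta x_t = \varepsilon_t$, and $(u_t, \varepsilon_t)$ are serially uncorrelated and are generated according to the
$$\begin{pmatrix} u_t \\ \varepsilon_t \end{pmatrix} \sim N(0, \Omega), \qquad \Omega = \begin{pmatrix} 1 & \omega \\ \omega & 1 \end{pmatrix}. \tag{3.20}$$
$$\Delta y_t = a + \rho y_{t-1} + \theta^{+} x_{t-1}^{+} + \theta^{-} x_{t-1}^{-} + \pi^{+} \Delta x_t^{+} + \pi^{-} \Delta x_t^{-} + e_t, \tag{3.21}$$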
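A single replication of this experiment can be sketched as below: data are generated from (3.19) and the unrestricted regression (3.21) is estimated by OLS, with the long-run coefficients recovered as $-\hat{\theta}^{\pm}/\hat{\rho}$. This is an illustration only. It sets $\omega = 0$ for simplicity, and the values used for $\beta^{-}$ and $\varphi^{-}$, the sample size and the function names are hypothetical choices rather than the paper's design.

```python
import numpy as np

def simulate_nardl(T, rho, beta_pos, beta_neg, phi_pos, phi_neg, a=0.0, seed=0):
    """Simulate y, x+ and x- from the DGP (3.19) with omega = 0 (independent shocks)."""
    rng = np.random.default_rng(seed)
    dx = rng.standard_normal(T)                  # Delta x_t = eps_t
    u = rng.standard_normal(T)                   # u_t, drawn independently of eps_t
    dxp, dxn = np.maximum(dx, 0.0), np.minimum(dx, 0.0)
    x_pos = np.concatenate(([0.0], np.cumsum(dxp)))
    x_neg = np.concatenate(([0.0], np.cumsum(dxn)))
    y = np.zeros(T + 1)
    for t in range(1, T + 1):
        ect = y[t - 1] - beta_pos * x_pos[t - 1] - beta_neg * x_neg[t - 1]
        y[t] = y[t - 1] + a + rho * ect + phi_pos * dxp[t - 1] + phi_neg * dxn[t - 1] + u[t - 1]
    return y, x_pos, x_neg

def estimate_long_run(y, x_pos, x_neg):
    """Estimate (3.21) by OLS and return rho_hat and the implied long-run coefficients."""
    dy = np.diff(y)
    X = np.column_stack([np.ones_like(dy), y[:-1], x_pos[:-1], x_neg[:-1],
                         np.diff(x_pos), np.diff(x_neg)])
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    rho_hat, theta_pos, theta_neg = b[1], b[2], b[3]
    return rho_hat, -theta_pos / rho_hat, -theta_neg / rho_hat

# Hypothetical parameter values for a single illustrative replication
y, x_pos, x_neg = simulate_nardl(T=2000, rho=-0.2, beta_pos=0.5, beta_neg=1.0,
                                 phi_pos=0.5, phi_neg=1.0)
rho_hat, beta_pos_hat, beta_neg_hat = estimate_long_run(y, x_pos, x_neg)
```

Repeating the two calls over many seeds and sample sizes would give summary statistics for the bias and dispersion of the estimators.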
ically, under the assumptions that a = 0, β + = 0.5 and ϕ+ = 0.5, and denoting
ing parameters: ρ ∈ (−0.05, −0.1, −0.2), δβ ∈ (0.1, 0.2, 0.25, 0.5), δϕ ∈ (0.1, 0.2, 0.25, 0.5),
ω ∈ (−0.5, 0, 0.5), and T ∈ (100, 200, 400). Due to space constraints, we are unable to
report the results of all of these simulations herein.16 Rather, we summarise the key
findings that arise across these parameterisations and report in detail the results from a
baseline case in which we use ρ = −0.2, δβ = 0.5 and δϕ = 0.5, and where ω and T vary
In Table 1 we report a range of summary statistics for the parameter estimates based
on our simulations using 3,000 replications of our baseline case. We note that the bias
and error in the estimation of each of the parameters are largely negligible (this also holds
under the other parameterisations of the DGP that we consider). The only exception to
this generalisation is the error correction parameter, which shows a modest downward
bias especially when T ≤ 100. However, this observation is not unexpected given the
We also investigate the finite sample size and power of the Wald statistics for the null
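$$\Delta y_t = a + \rho \left( y_{t-1} - \beta x_{t-1} \right) + \varphi^{+} \Delta x_t^{+} + \varphi^{-} \Delta x_t^{-} + u_t, \tag{3.22}$$
$$\Delta y_t = a + \rho \left( y_{t-1} - \beta^{+} x_{t-1}^{+} - \beta^{-} x_{t-1}^{-} \right) + \varphi \Delta x_t + u_t, \tag{3.23}$$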
examine the finite sample size and power of the PSS bounds test of the null hypothesis
$$\Delta y_t = a + \varphi^{+} \Delta x_t^{+} + \varphi^{-} \Delta x_t^{-} + u_t. \tag{3.24}$$
and, as before, the alternative model is given by (3.19). As noted in Section 2.3, the
relevant critical value bounds for the PSS test depend on the number of regressors entering
the long-run relationship, $k$. However, given the dependence between $x_t^{+}$ and $x_t^{-}$, the
appropriate value of k is unclear. Thus, we propose a pragmatic solution using two sets
of critical values, one for which k is defined by counting the partial sums as separate I(1)
regressors (here, k = 2) and another by counting each set of partial sums collectively
as a single I(1) regressor (here, k = 1). It follows that the latter approach is the more
conservative.
Table 2 summarises the simulation results from our baseline case at a nominal size of
5%. For T = 100, the long-run Wald test has very high power and the short-run Wald and
PSS tests have moderate power, although this rapidly improves as T increases. Indeed,
when T = 400 all of the tests achieve close to 100% power. The short-run Wald test is
well-sized regardless of the value of $T$ while $W_{LR}$ is slightly oversized in small samples,
find that the power of the test is satisfactory even under the conservative case (k = 1).
Table 2 also reports the power of the bootstrapped PSS test. For each replication of
the simulation routine, using data generated under the alternative hypothesis, we generate
500 bootstrap samples non-parametrically using the resampled residuals from estimation
of (3.21) in conjunction with the estimated coefficients from (3.24) under the assumption
that the initial values and the x’s are known. It is then a simple matter to compute
the empirical p-value of the PSS test by estimating (3.21) on the bootstrap samples
and calculating the probability that the bootstrapped test statistic exceeds its original
value. On this basis, we note that the bootstrapping procedure achieves the desired size
One important finding that arises from the other parameterisations of the DGP is
that the power of the long- and short-run Wald tests is positively associated with the
distance between their respective null and alternative hypotheses. Moreover, we find that
the long-run Wald test becomes somewhat over-sized especially when the distance of the
alternative from the null is small, the error correction parameter is close to zero, and
T ≤ 100. These findings reflect the well known limitations of asymptotic inference under
adverse conditions. To overcome these issues, one could adopt the common practice within
the literature and compute empirical p-values for the short- and long-run Wald statistics
approach. By computing 95% bootstrap confidence intervals for the difference between
the asymmetric cumulative dynamic multipliers defined for positive and negative shocks,
respectively, we are able to convey relevant information about the statistical significance
Furthermore, in light of our simulations, and given the absence of precise asymptotic
critical values for the $F_{PSS}$ and $t_{BDM}$ test statistics, we choose to provide bootstrapped
4 Empirical Applications
To demonstrate both the simplicity and the flexibility of the NARDL approach, we will
present two empirical applications. Firstly, we will examine nonlinearities in the bivariate
relationship between output and unemployment in the US, Canada and Japan. Secondly,
we will apply our technique to the trivariate case of gasoline pricing in Korea.
The negative relationship between changes in the rate of unemployment and the rate
of output growth (Okun’s Law) remains one of the most commonly cited stylized facts
mission, representing the link between unemployment and output which underpins the
However, despite its importance, empirical assessments of Okun’s law over the last
three decades have been rather disappointing. The majority of this voluminous litera-
ture adheres to a linear paradigm, reflecting the assumption that cyclical upturns and
to believe that the labour market should behave in this simplistic fashion. If employers
dismiss a given quantity of labour after a negative growth shock, then they may not hire
exactly the same amount after a positive shock of equal magnitude (Lang and de Peretti,
2009). This may be discussed in terms of labour market hysteresis, the idea that cyclical
shocks may permanently affect structural unemployment. In this vein, Blanchard and
Summers (1987) explain the persistently high European unemployment of the 1980s us-
ing an insider-outsider wage setting model. They argue that adverse shocks that reduce
the proportion of insiders (union members) will increase outsider unemployment perma-
nently. There is, therefore, no tendency for the labour market to return to its initial state
even after economic growth has recovered (see also Hammermesh and Pfann, 1998, on the
Okun’s Law, the Phillips curve and the preferences of the central bank which has helped
to drive research in the field. Neftci (1984) laid the foundations for this literature with
his early study of business cycle effects on the patterns of correlation between major US
time series, which revealed that the output-unemployment relationship displays marked
asymmetry. Altissimo and Violante (2001) find evidence of nonlinearity between output
and unemployment using a nonlinear multivariate VAR model. Their results, which they
note are consistent with the majority of existing univariate threshold models, indicate
that shocks in the recessionary regime are considerably less persistent than those in the
specification of Okun’s law and finds that the contemporaneous effect of output growth
and that shocks to unemployment tend to be more persistent in the expansionary regime.
Attfield and Silverstone (1998) argue that if output and unemployment are coin-
tegrated and potential output and unemployment are defined by the stochastic trend
Okun’s coefficient can be interpreted as the cointegrating coefficient. However, the cointe-
gration test results are ambiguous: the single equation residual based ADF test is unable
to reject the null of no cointegration while it is rejected by the Johansen test. Using a
static asymmetric regression of the form of (2.1), Schorderet (2001) finds that nonlin-
earity hinders efforts to detect the stationary relationship between unemployment and
output.18 The contention that the appropriate modelling of nonlinearity strongly affects
In this section, we apply the NARDL technique to the simultaneous analysis of both
long- and short-run nonlinearities in the relationship between output and unemployment
in the US, Canada and Japan.19 This application demonstrates one of the key strengths
of our model: its flexibility and the ease with which it can be applied to each of the four
Firstly, to establish a reference point, we estimate the static linear regression of un-
employment on a constant, a time trend and output (Table 3(a)) and a static asymmetric
model of the form of (2.1), the results of which are reported in Table 3(b).
In keeping with the findings of Attfield and Silverstone (1998), Schorderet (2001) and
Granger and Yoon (2002), the EG test finds no evidence of linear cointegration. Moreover,
the EG test is unable to reject the null of no cointegration in the static asymmetric case,
a pronounced negative association between output and unemployment, with the results of
asymmetric analysis indicating strong non-linearity (the Wald tests reject the null in all
cases). However, the validity of these results is questionable given the evidence of severe
model mis-specifications.
Table 4 reports estimation results for the restricted symmetric ARDL regression of
the form of (2.16). Table 5 presents the results of the unrestricted NARDL case allowing
for both long- and short-run asymmetry. Notice that the cointegration tests are unable to
reject the null hypothesis in the restricted case but that both the $t_{BDM}$ and $F_{PSS}$ statistics
resoundingly reject the null when long-run asymmetry is modelled appropriately. This
result underscores the importance of correctly specifying the long-run relationship under
scrutiny. Moreover, the finding that the ECM-based tests are able to detect the asymmet-
ric long-run relationship while the EG residual-based approach cannot is generally consis-
tent with the works of Kremers, Ericsson and Dolado (1992), Hansen (1995), Banerjee et
al. (1998) and Pesaran et al. (2001). This reflects the well-established power-dominance
of the ECM-based tests resulting from their inclusion of potentially valuable information
relating to the correlation between the regressors and the underlying disturbances.
In the restricted symmetric models (Table 4), the estimated long-run coefficients for the
US, Canada and Japan are -1.66, -5.68 and 5.57, respectively, although none is statistically
significant due to the failure to accurately model the long-run relationship. Indeed, the
counterintuitive finding of a positive long-run coefficient in the case of Japan reflects the
fact that the model misspecification is so severe in this case that the estimated error
more general unrestricted model of the form (2.10), the $F_{PSS}$ and $t_{BDM}$ tests both reject
their respective null hypotheses in all cases, even using the conservative critical values for
the PSS test (see Table 5). Furthermore, the Wald tests are also able to firmly reject the
null hypothesis of long-run symmetry in all cases. In this case, the estimated long-run
coefficients on $y^{+}$ and $y^{-}$ are -9.76 and -28.88 for the US, -17.26 and -28.48 for Canada and
-7.28 and -11.26 for Japan, respectively. Therefore, we may conclude that an economic
downturn of just 3.5% achieves the opposite. The associated values for Canada are 5.8%
and 3.5% while in the case of Japan the figures translate to an economic upturn of 13.7%
and a downturn of 8.9%. The relatively muted response of the labour market to output
fluctuations in Japan reflects its restrictive employment policies and unusually long job
tenure (Tanaka, 2001), and is comparable to the linear estimation results achieved by
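The percentage figures quoted in this paragraph can be reproduced from the estimated long-run coefficients with simple arithmetic. The sketch below assumes, as an interpretation rather than something stated explicitly in the surrounding text, that each coefficient measures the percentage-point change in unemployment per unit change in log output, so that the output change associated with a one percentage point change in unemployment is 100 divided by the absolute coefficient. The dictionary and function names are hypothetical.

```python
# Long-run coefficients on y+ and y- as reported in the text (Table 5)
long_run = {
    "US": (-9.76, -28.88),
    "Canada": (-17.26, -28.48),
    "Japan": (-7.28, -11.26),
}

def output_change_for_one_point(coef):
    """Percentage change in output associated with a one-point change in unemployment.

    Assumes coef is the percentage-point unemployment response per unit of log output.
    """
    return 100.0 / abs(coef)

implied = {
    country: (round(output_change_for_one_point(up), 1),
              round(output_change_for_one_point(down), 1))
    for country, (up, down) in long_run.items()
}
```

Under this interpretation the implied downturn for the US is about 3.5%, the pair for Canada is about 5.8% and 3.5%, and the pair for Japan is about 13.7% and 8.9%, in line with the values quoted above.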
Turning to the analysis of short-run dynamic asymmetry, we find that the Wald test
cannot reject the null of (weak-form) summative symmetric adjustment in the US or
Japan but that it is rejected at the 10% level in Canada. Consulting the bootstrap con-
fidence intervals for the difference between the asymmetric dynamic multipliers reported
in Figures 1–3 supports this finding. However, as noted earlier, the pattern of dynamic
coefficient and the model dynamics. Therefore, although we find little evidence of additive
For the benefit of the reader, Figure 1 presents the dynamic multipliers for the US
under each of the four combinations of long- and short-run asymmetry. Notice that
the dynamic multipliers, resulting in marked overshooting where none was previously
observed. In conjunction with the results of a battery of diagnostic tests, we conclude that
model. This underscores the importance of correctly accounting for inherent nonlinearities
in the long-run relationship and cautions that failure to do so jeopardises the identification
of the long-run relationship and compromises the estimation of the model dynamics. In
light of the overwhelming rejection of the long-run symmetric models, the associated
For the US, the results of both long-run asymmetric models (Figures 1(a) and (c)) are
remarkably similar, indicating that the labour market responds rapidly and strongly to
cyclical downturns in the very short-run (correcting one quarter of disequilibrium within
one period) but that full adjustment to the new equilibrium is a relatively prolonged
process. By contrast, the labour market responds only mildly to the boom phase but full
adjustment is achieved within six months. This reflects the flexibility of the US labour
market, whereby firms are quick to fire in the short-run in order to cut costs but are also
quick to hire in the knowledge that they can easily and quickly release the additional
Figure 2 reveals that the pattern of dynamic adjustment is considerably richer in the
fully asymmetric case in Canada. We again find very rapid labour market adjustment
in the immediate wake of a recessionary shock, with more than 50% of the traverse to
equilibrium achieved within six months. Again, we find that the remaining disequilibrium
error is corrected relatively slowly. By contrast, the labour market response to the cyclical
upswing is more gradual, taking one year to achieve 50% of the adjustment toward equi-
librium. Furthermore, in panel (b), with the imposition of short-run symmetry, after the
initial rapid adjustment to the recessionary shock the gradient of the cumulative dynamic
the upward slope of the difference curve. In sum, our results suggest that Canadian firms
are quick to fire and slow to hire, reflecting conservatism on the part of their management.
Finally, we find little evidence of short-run asymmetry in Japan. Figure 3 reveals
that the Japanese labour market exhibits very muted responses to both booms and busts
when compared to the US and Canada, a finding that reflects the prevalence of restrictive
labour market institutions. Focusing on Figure 3(b), we note that 50% of the equilibrium
correction occurs within 10-12 months of either a positive or a negative shock, and that
after this initial phase, convergence upon long-run equilibrium occurs very slowly.
and 3. In general, the labour markets in all countries exhibit relatively rapid adjustment
in the first year with the absolute effect of an economic contraction being significantly
larger than that of an expansion. Following this initial period, the speed of adjustment
slows markedly, and subject to the imposition of short-run symmetry restrictions, we find
that the labour market response to output shocks remains somewhat more rapid in the
recessionary case than in the expansionary environment in both Canada and Japan. The
US can be viewed as a special case due to the widely discussed flexibility of its labour
market which permits very rapid adjustment to the expansionary shock as firms are eager
to hire in the knowledge that subsequent dismissals are neither difficult nor unduly costly.
The subtle patterns revealed by the dynamic multipliers suggest that the focus of the
literature on the persistence of shocks (Altissimo and Violante, 2001; Crespo Cuaresma,
2003) fails to convey important information regarding the magnitude of the implied ad-
justments to the labour market. Simply put, the impact of a recession in terms of jobs
lost is greater in both the short- and the long-run than the job creation associated with an
economic expansion of equal magnitude even though the discussion of the half-life of the
shocks in the US may indicate the opposite (i.e. 50% of the long-run effect of a recession-
ary shock is greater than 100% of the long-run impact of an expansionary shock of equal
der study when the long-run relationship is asymmetric. This serves to highlight one of
the primary attributes of the asymmetric cumulative dynamic multipliers; they help to
shed light on the traverse between the short-run and the long-run, a property whose use-
fulness and theoretical appeal is difficult to overstate. In a traditional ECM, the speed of
A large literature has developed around the observation that retail gasoline prices tend to
react asymmetrically to changes in the price of crude oil (an exhaustive survey is provided
by Grasso and Manera, 2007). This phenomenon has come to be referred to as the ‘rockets
and feathers’ hypothesis following the early contribution of Bacon (1991). Employing an
et al. (1997, BCG) derive strong support for asymmetry from a hybrid error correction
model where changes in gasoline and oil prices are decomposed into positive and negative
changes.20
Various theoretical explanations for asymmetric price adjustment have been adduced
in the literature, the dominant three being oligopolistic pricing behaviour (Radchenko,
2005), inventory capacity and costs (Borenstein and Shepard, 2002) and nonlinear consumer
search-effort (Johnson, 2002). While the literature on short-run dynamic asymme-
try is expansive, relatively little work has been done on potential long-run asymmetry.21
Reilly and Witt (1998) were among the first authors to investigate asymmetric pass-
through of the exchange rate to the retail price of gasoline, reflecting the convention of
quoting oil prices in US$ per barrel. Their results, derived from a simple ECM specifica-
ship between the exchange rate and retail gasoline prices for the UK. The authors report
that a Sterling depreciation is rapidly passed through to higher prices at the pump but
that a strengthening of the Pound is not met by a commensurate reduction in retail prices.
Similarly, Asplund, Eriksson and Friberg (2000) find that the impact of a depreciation is
more marked than that of an appreciation in Sweden, with retail gasoline prices reacting
more swiftly to the exchange rate than to crude oil price movements. More recently, Ga-
leotti, Lanza and Manera (2003) find compelling evidence that the speed of adjustment
to long-run equilibrium is asymmetric both with respect to oil price shocks and exchange
rate shocks. However, these papers consider only short-run dynamic asymmetries and
The majority of papers surveyed by Grasso and Manera (2007) rely on the two-step
ship is imposed in the first step. This methodology is only appropriate in the analysis of
short-run asymmetry where the long-run relationship is believed to be linear.22 Should the
underlying long-run relationship prove nonlinear, the imposition of linearity in the first
step is likely to provide misleading and spurious results as noted in the case of Okun’s
Law above. We contribute to this literature by applying our modelling strategy to the
case of asymmetric pass-through of crude oil price changes and exchange rate fluctuations
to the retail price of gasoline in Korea over the period 1991q1-2007q2.23 The choice of
Korean data is motivated by the need to find an industrial country which is entirely reliant
on imported oil, thereby circumventing any issues of endogeneity of regressors that may
arise in countries with significant oil extraction and refining activity. Given the extensive
literature surveyed above, we do not report static estimation results and merely note that
Table 6 presents the results of the benchmark symmetric ARDL model, the fully asym-
metric NARDL model and our preferred specification which combines long-run symmetry
with short-run asymmetry. Taken together, the $F_{PSS}$ and $t_{BDM}$ tests indicate cointegra-
The Wald tests fail to reject long-run symmetry with respect to either the oil price or
the exchange rate, indicating that the pass-through from input prices to the retail price
of gasoline is linear in the long-run. This may suggest that the retail gasoline industry
intervention in the energy industry in the early years of the sample. Turning to the
short-run, the Wald tests decisively reject the null of additive short-run symmetry with
respect to both the oil price and the exchange rate. This pattern of asymmetry determines
the shape of the dynamic multipliers presented in Figure 4. Focusing first on the retail
price response to the crude oil spot price, we observe a strong and rapid reaction to
positive changes but a more gradual response to falling crude prices in both panels (a)
and (b). The principal difference between these two figures derives from the considerable
uncertainty surrounding the long-run coefficient estimates in the fully asymmetric model.
This inflates the bootstrap confidence intervals in panel (a) but, interestingly, also seems to
exaggerate the observed short-run asymmetry. In conjunction with the weight of evidence
supporting the rockets and feathers hypothesis, we therefore regard the combination of
Turning to the case of exchange rate fluctuations, we again note that the long-run
symmetry restrictions cannot be rejected but that the additive short-run restrictions are
firmly rejected. Figures 4(c) and (d) reveal that gasoline prices increase rapidly and
contrast, the response to an appreciation is rather muted. Moreover, our results suggest
that exchange rate fluctuations have a more pronounced impact on retail gasoline prices
than movements in the price of crude oil quoted in US$. This effect is apparent in both
the long- and the short-run and is consistent with the findings of Asplund et al. (2000).
Overall, our results are largely consistent with the existing literature on dynamic
asymmetries, confirming that Korean gasoline prices respond more rapidly to the price
increases of crude oil than to decreases and that they are more sensitive to exchange
asymmetries exists against which to judge our results. However, at a pragmatic level,
one can argue that the presence of long-run asymmetries in the gasoline-pricing equation
may give rise to a logical inconsistency, and so our finding of long-run symmetry may
prices to increase more following a depreciation than they would decrease following an
appreciation of equal magnitude, for example, then there would be a ratchet mechanism
at work whereby prices would gradually increase through time under the assumption that
positive and negative shocks are of approximately equal magnitude and probability. This
outcome seems rather implausible and suggests that long-run linearity is the more natural
case.
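The ratchet argument above can be checked with a small simulation. The sketch below is illustrative only: it assumes a long-run level of the form beta_pos * x_pos + beta_neg * x_neg, where x_pos and x_neg are the partial sums of positive and negative changes, and it draws symmetric Gaussian shocks. The function names, coefficient values and sample sizes are hypothetical choices, not taken from the estimation results.

```python
import random


def simulate_long_run_level(beta_pos, beta_neg, n_periods, sigma=1.0, seed=0):
    """Final long-run level implied by one path of symmetric shocks.

    Assumption (illustrative): level = beta_pos * x_pos + beta_neg * x_neg,
    where x_pos and x_neg accumulate positive and negative changes.
    """
    rng = random.Random(seed)
    x_pos = 0.0  # partial sum of positive changes
    x_neg = 0.0  # partial sum of negative changes
    for _ in range(n_periods):
        dx = rng.gauss(0.0, sigma)  # symmetric shock: equal magnitude and probability
        if dx > 0:
            x_pos += dx
        else:
            x_neg += dx
    return beta_pos * x_pos + beta_neg * x_neg


def mean_final_level(beta_pos, beta_neg, n_periods=200, n_paths=2000):
    """Average final level across independent shock paths."""
    total = sum(
        simulate_long_run_level(beta_pos, beta_neg, n_periods, seed=s)
        for s in range(n_paths)
    )
    return total / n_paths


# Hypothetical coefficients: stronger long-run response to increases than to decreases.
asymmetric_drift = mean_final_level(beta_pos=1.0, beta_neg=0.5)
# Long-run symmetry: increases and decreases are passed through equally.
symmetric_drift = mean_final_level(beta_pos=1.0, beta_neg=1.0)

print(f"mean final level, asymmetric long run: {asymmetric_drift:.2f}")
print(f"mean final level, symmetric long run:  {symmetric_drift:.2f}")
```

Under these assumptions the asymmetric case drifts upward on average even though the shocks themselves have zero mean, while the symmetric case stays close to zero.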
5 Concluding Remarks
a prominent role in econometric research. This reflects the realisation that asymmetry
is pervasive within the social sciences and may be inherent in modern economies. In-
deed, the behavioural finance literature can be viewed as an attempt at formalising this
cointegration with a dynamically flexible ARDL model and have derived the associated
error correction framework. The desirable features of the NARDL model are threefold.
Firstly, the estimation of the ECM in one step is likely to improve the performance of
the model in small samples, particularly in terms of the power of the cointegration tests.
Secondly, the ability to simultaneously estimate both long- and short-run asymmetries in
a computationally simple and tractable manner reflects the flexibility of our modelling
approach. Moreover, our technique provides a straightforward means of testing both long-
and short-run symmetry restrictions. Finally, the use of asymmetric dynamic multipliers
between the short- and long-run, a result with significant theoretical appeal. While the
dynamic adjustment in most ECMs is discussed in terms of the percentage of the disequi-
librium error that is corrected in each period, our approach sheds light on the nature of
this dynamic adjustment, mapping the gradual movement of the process under scrutiny
from initial equilibrium through the shock and toward the new equilibrium.
These key strengths of the NARDL framework have been demonstrated in the case
of the long- and short-run asymmetry of the unemployment-output relationship and the
short-run asymmetry characterising retail gasoline price adjustments. The results suggest
that the imposition of long-run symmetry where the underlying relationship is nonlinear
will confound efforts to test for the existence of a stable long-run relationship and will
result in spurious dynamic responses. Similarly, our results stress the importance of
and long-run asymmetries yet developed. At this point, it seems appropriate to mention
three obvious extensions which present themselves. Firstly, the model can be related to
the threshold literature by generalising to the case of one or more unknown non-zero
thresholds for use in the construction of the partial sum processes. This is the subject
employ Hansen’s (2000) approach to estimation and inference in models with unknown
threshold parameters. One could further extend research in this vein by allowing for the
model capable of dealing with multiple long-run relationships would permit the analysis
model to the dynamic heterogeneous panel context may broaden its appeal further still.
The obvious starting point for such developments is the pooled mean group framework
advanced by Pesaran, Shin and Smith (1999), which is readily estimable by FIML under
Acknowledgments
This is a substantially revised version of the working paper by Shin and Yu (2004). Earlier
versions circulated under the titles “An ARDL Approach to an Analysis of Asymmetric
Cho, Ana-Maria Fuertes, Liang Hu, John Hunter, Minjoo Kim, Soyoung Kim, Gary
Koop, Kevin Lee, Camilla Mastromarco, Amy Mise, Viet Nguyen, Neville Norman, Kevin
Reilly, Hashem Pesaran, Laura Serlenga, Ron Smith, Till van Treeck and participants at
the ESEM conference (Vienna, 2006), the ICAETE conference (Hyderabad, 2009), and
research seminars at the IMK, the Bank of Korea, and the Universities of Bari, Lecce,
Leeds, Leicester, Korea and Yonsei for their helpful comments. This paper has been
widely circulated and the methodology adopted by a number of authors – we are pleased
partial financial support from the ESRC (Grant No. RES-000-22-3161). Yu is grateful for
the hospitality of Leeds University Business School during his visit. The usual disclaimer
applies.
Notes
1
The present version of the paper is a substantially revised version of Shin and Yu
(2004), which has benefited greatly from a sequence of incremental improvements and
additions arising from the constructive comments of conference and seminar participants
and from editorial feedback. Earlier versions of the paper circulated under the titles “An
work”. By virtue of its wide circulation and prolonged availability as a working paper, our
research has informed the development of a subsequent literature that we now discuss. In
all cases, however, the development of the NARDL model is properly credited.
2
The presence of long-run asymmetry will induce a ratchet mechanism if the respec-
tive positive and negative regime probabilities are approximately equal and the shocks
under each regime are of comparable magnitude. In the more general case in which these
3
Consider the threshold ECM as an example, in which case the choice of the transition
distribution of the test statistic for the null of linearity or symmetry is not only non-
4
The concept of asymmetric cointegration is easily conceptualised by use of a simple
regression, one models $y_t$ and $x_t$ subject to a common stochastic trend. As this relationship
is assumed to hold in the long-run, it represents the equilibrium to which the system
returns after a perturbation (i.e. it acts as a global attractor). However, in our framework,
the long-run relationship between $y_t$ and $x_t$ is modelled as piecewise linear subject to the
decomposition of $x_t$. Suppose that $|\beta^+| < |\beta^-|$ in (2.1). This suggests that the long-run
effect of a unit negative change in output will increase unemployment by a greater amount
than a unit positive change would reduce it. Thus, our model includes a regime-switching
cointegrating relationship in which regime transitions are governed by the sign of $\Delta x_t$.
The economic implication of this line of reasoning is that equilibrium need not be unique
in a globally linear sense. The link to the path dependency literature is apparent.
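A minimal sketch of the decomposition may help fix ideas. It assumes that the long-run relation takes the form beta_pos * x_pos + beta_neg * x_neg, with x_pos and x_neg the partial sums of positive and negative changes in x. The function names, the short output series and the coefficient values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def partial_sums(x):
    """Split a series into partial sums of its positive and negative changes.

    Both processes start at zero, so x[t] = x[0] + x_pos[t] + x_neg[t].
    """
    x_pos, x_neg = [0.0], [0.0]
    for prev, curr in zip(x, x[1:]):
        dx = curr - prev
        x_pos.append(x_pos[-1] + max(dx, 0.0))  # accumulates increases only
        x_neg.append(x_neg[-1] + min(dx, 0.0))  # accumulates decreases only
    return x_pos, x_neg


def long_run_level(x, beta_pos, beta_neg):
    """Piecewise-linear long-run level implied by the decomposition (assumed form)."""
    x_pos, x_neg = partial_sums(x)
    return [beta_pos * p + beta_neg * n for p, n in zip(x_pos, x_neg)]


# Hypothetical output path that ends exactly where it started.
output = [100.0, 102.0, 101.0, 104.0, 100.0]
x_pos, x_neg = partial_sums(output)

# Hypothetical coefficients with |beta_pos| < |beta_neg|.
unemployment = long_run_level(output, beta_pos=-0.5, beta_neg=-1.5)

print(x_pos)             # cumulative increases
print(x_neg)             # cumulative decreases
print(unemployment[-1])  # long-run level after the round trip
```

Although output returns to its starting value, the implied long-run level of unemployment does not, which is the sense in which equilibrium depends on the path taken.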
5
In the special case where $v_t$ is normally distributed with zero mean and constant
variance $\sigma_v^2$, it is well-established that the censored normal variates, $v_t^+ = \max[0, v_t]$ and
$v_t^- = \min[0, v_t]$, will have $E[v_t^+] = \sigma_v/\sqrt{2\pi}$, $E[v_t^-] = -\sigma_v/\sqrt{2\pi}$, and
$Var[v_t^+] = Var[v_t^-] = \sigma_v^2 (\pi - 1)/(2\pi)$. We are grateful to Jinseo Cho for pointing this issue out and encouraging us to
6
Notice that the analysis of short-run dynamic asymmetries is not straightforward in
the context of the static regression model employing the semiparametric approach.
7
In some cases, most notably where the growth rates of the series in xt are predomi-
nantly positive (negative), the use of a zero threshold may result in one regime containing
8
For convenience we employ the same lag order, q. One may also allow for feedback
9
While the associated critical values can be tabulated easily using stochastic simula-
tion, it is impractical to provide a meaningful set of critical values covering all possible
10
It is straightforward to extend similar reasoning to the more general case with multiple
11
The level parameters are obtained as follows:
12
The dynamic multipliers, $\lambda_j^+$ and $\lambda_j^-$ for $j = 0, 1, \ldots$, can be evaluated using the
following recursive relationships in which $\lambda_0^\ell = \theta_0^\ell$, $\phi_j = 0$ for $j < 1$ and $\lambda_j^\ell = 0$ for $j < 0$:
13
The final specification in Borenstein et al. (1997) differs slightly from (2.14) as the
lagged ∆yt ’s on the right hand side are also decomposed into positive and negative changes.
14
Short-run symmetry restrictions (especially the pair-wise restrictions) may be exces-
sively restrictive in many applications although they may be useful in providing more
ship in small samples. The additive symmetry restrictions are somewhat weaker and have
been discussed in the literature in terms of assessing the validity of the liquidity constraint
where $\sum_{i=0}^{q-1} \pi_i^+ < \sum_{i=0}^{q-1} \pi_i^-$ (e.g. Van Treeck, 2008).
15
Webber (2000) utilises a similar approach in his analysis of the asymmetric pass-
through from exchange rates, decomposed as the partial sum processes of appreciations
16
Full results are available on request.
17
We employ a non-parametric bootstrapping routine and use 50,000 replications after
rejecting those for which $\rho > -1 \times 10^{-4}$. Full details are available on request.
18
Further examples of the use of positive/negative decompositions in the modelling of
asymmetry in the unemployment-output relationship include Lee (2000) and Virén (2001).
19
Seasonally-adjusted monthly data for unemployment and industrial production cov-
ering the range 1982m2-2003m11 were collected from the OECD’s Main Economic Indi-
cators. Although not presented here, ADF testing lends overwhelming support to the
20
Bachmeier and Griffin (2003) criticise BCG for their use of ‘nonstandard estimation
methodology’ and low-frequency data, arguing that the two-step EG method finds no
evidence of asymmetry and, moreover, that the BCG method finds no evidence of asym-
metry when applied to their daily dataset. While there is some debate over the optimal
data frequency for the study of price shocks, the criticism of the one-step BCG estimation
process is unwarranted (cf. Pesaran and Shin, 1998). Indeed, as noted above, estimating
the ECM in a single step yields superior performance in small samples, particularly in
21
An early and notable paper combining both short- and long-run asymmetries in the
analysis of the nonlinearities characterising the relationship between upstream and down-
stream prices in the oil industry is Balke, Brown, and Yücel (1998). The authors extend
and large” asymmetries in all cases apart from their levels specification (p. 10).
22
As discussed above, in the presence of weakly endogenous regressors and/or serially
correlated errors, the OLS estimator in the first step remains consistent but is inefficient.
Furthermore, if the AR coefficients are significantly different from zero, the OLS estimator
becomes inconsistent and is thus poorly determined in finite samples (see Pesaran and
23
The Dubai spot price (US$/bbl), pot , was retrieved from the Korean Energy Eco-
KRW/USD exchange, xt , were retrieved from the Economic Statistics System of the Bank
24
The Engle-Granger residual-based test associated with the static linear regression
of price on a constant, time trend, oil price and exchange rate (all in logs) returns a
maximum value of -3.91 compared to a 5% critical value of -4.32. Similarly, for the static
asymmetric regression of price on $p^{o+}$, $p^{o-}$, $x^+$ and $x^-$, as well as a constant and a trend,
A Appendix: Proof of Theorem 1
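\[
\hat{\beta} =
\begin{bmatrix}
\sum_{t=1}^{T} (x_t^+)^2 & \sum_{t=1}^{T} x_t^+ x_t^- \\
\sum_{t=1}^{T} x_t^+ x_t^- & \sum_{t=1}^{T} (x_t^-)^2
\end{bmatrix}^{-1}
\begin{bmatrix}
\sum_{t=1}^{T} x_t^+ y_t \\
\sum_{t=1}^{T} x_t^- y_t
\end{bmatrix},
\]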
so that
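\[
\hat{\beta} - \beta = \frac{1}{D_T}
\begin{bmatrix}
\sum_{t=1}^{T} (x_t^-)^2 & -\sum_{t=1}^{T} x_t^+ x_t^- \\
-\sum_{t=1}^{T} x_t^+ x_t^- & \sum_{t=1}^{T} (x_t^+)^2
\end{bmatrix}
\begin{bmatrix}
\sum_{t=1}^{T} x_t^+ u_t \\
\sum_{t=1}^{T} x_t^- u_t
\end{bmatrix}
= \frac{1}{D_T}
\begin{bmatrix} A_T \\ B_T \end{bmatrix},
\]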
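where $D_T := \sum_{t=1}^{T} (x_t^+)^2 \sum_{t=1}^{T} (x_t^-)^2 - \left( \sum_{t=1}^{T} x_t^+ x_t^- \right)^2$,
$A_T := \sum_{t=1}^{T} (x_t^-)^2 \sum_{t=1}^{T} x_t^+ u_t - \sum_{t=1}^{T} x_t^+ x_t^- \sum_{t=1}^{T} x_t^- u_t$, and
$B_T := -\sum_{t=1}^{T} x_t^+ x_t^- \sum_{t=1}^{T} x_t^+ u_t + \sum_{t=1}^{T} (x_t^+)^2 \sum_{t=1}^{T} x_t^- u_t$. We
now let
\[
x_t^+ \equiv t\mu^+ + \sum_{j=1}^{t} w_j^+, \qquad x_t^- \equiv t\mu^- + \sum_{j=1}^{t} w_j^-
\]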
Hence, we obtain:
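\[
\begin{aligned}
D_T ={}& \left( \sum_{t=1}^{T} t^2 \right) \sum_{t=1}^{T} \left\{ (\mu^+)^2 \left( \sum_{j=1}^{t} w_j^- \right)^2 + (\mu^-)^2 \left( \sum_{j=1}^{t} w_j^+ \right)^2 - 2\mu^+\mu^- \left( \sum_{j=1}^{t} w_j^- \right) \left( \sum_{j=1}^{t} w_j^+ \right) \right\} \\
& - \left\{ (\mu^+)^2 \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^- \right)^2 + (\mu^-)^2 \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^+ \right)^2 - 2\mu^+\mu^- \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^- \right) \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^+ \right) \right\} \\
& + o_P(T^5).
\end{aligned}
\]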
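Here, $o_P(T^6)$ terms cancel, and the remaining next-order terms are stated as
\[
(\mu^+)^2 \left( \sum_{j=1}^{t} w_j^- \right)^2 + (\mu^-)^2 \left( \sum_{j=1}^{t} w_j^+ \right)^2 - 2\mu^+\mu^- \left( \sum_{j=1}^{t} w_j^- \right) \left( \sum_{j=1}^{t} w_j^+ \right) = \left( \sum_{j=1}^{t} s_j \right)^2
\]
where $s_j \equiv \mu^+ w_j^- - \mu^- w_j^+$ by the definitions of $w_j^-$ and $w_j^+$. Hence, by Donsker's FCLT,
\[
T^{-1/2} \sum_{j=1}^{T(\cdot)} s_j / \sigma_s \Rightarrow W_{\tilde{s}}(\cdot),
\]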
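where $\sigma_s^2 := Var(s_t)$, $\Rightarrow$ indicates weak convergence, and $W_{\tilde{s}}(r)$ is the standard Brownian
\[
T^{-2} \sum_{t=1}^{T} \left( \sum_{j=1}^{t} s_j \right)^2 \Rightarrow \sigma_s^2 \int_0^1 W_{\tilde{s}}(r)^2 \, dr
\]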
by the CMT (e.g. Eq. (17.3.22) of Hamilton (1994), p. 486). Also notice that
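\[
\begin{aligned}
& (\mu^+)^2 \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^- \right)^2 + (\mu^-)^2 \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^+ \right)^2 - 2\mu^+\mu^- \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^- \right) \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^+ \right) \\
&\quad = \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} \left( \mu^+ w_j^- - \mu^- w_j^+ \right) \right)^2 = \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} s_j \right)^2,
\end{aligned}
\]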
by the CMT. Collecting all these results we obtain:
\[
T^{-5} D_T \Rightarrow \sigma_s^2 \left[ \frac{1}{3} \int_0^1 W_{\tilde{s}}(r)^2 \, dr - \left( \int_0^1 r W_{\tilde{s}}(r) \, dr \right)^2 \right]. \tag{A.1}
\]
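Next, we consider the asymptotic weak limit of the numerator of $\hat{\beta}^+ - \beta^+$. For this, we
note that $O_P(T^{9/2})$ terms cancel, so that the remaining next-order terms are $O_P(T^4)$,
so that
\[
\begin{aligned}
A_T :={}& \sum_{t=1}^{T} (x_t^-)^2 \sum_{t=1}^{T} x_t^+ u_t - \sum_{t=1}^{T} x_t^+ x_t^- \sum_{t=1}^{T} x_t^- u_t \\
={}& \left\{ (\mu^-)^2 \sum_{t=1}^{T} t^2 \sum_{t=1}^{T} u_t \sum_{j=1}^{t} w_j^+ + 2\mu^-\mu^+ \sum_{t=1}^{T} t u_t \sum_{t=1}^{T} t \sum_{j=1}^{t} w_j^- \right\} \\
& - \left\{ \mu^+\mu^- \sum_{t=1}^{T} t^2 \sum_{t=1}^{T} u_t \sum_{j=1}^{t} w_j^- + \sum_{t=1}^{T} t \sum_{j=1}^{t} \left( \mu^+ w_j^- + \mu^- w_j^+ \right) \mu^- \sum_{t=1}^{T} t u_t \right\} + o_P(T^4) \\
={}& \mu^- \left\{ -\left( \sum_{t=1}^{T} t^2 \right) \left( \sum_{t=1}^{T} u_t \sum_{j=1}^{t} s_j \right) + \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} s_j \right) \left( \sum_{t=1}^{T} t u_t \right) \right\} + o_P(T^4)
\end{aligned}
\tag{A.2}
\]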
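where we also employ the definition of $s_j := \mu^+ w_j^- - \mu^- w_j^+$. Then, by the CMT (e.g. Eqs.
\[
T^{-1} \sum_{t=1}^{T} u_t \sum_{j=1}^{t} s_j \Rightarrow \sigma_s \sigma_u \int_0^1 W_{\tilde{s}}(r) \, dW_{\tilde{u}}(r) \tag{A.3}
\]
\[
T^{-3/2} \sum_{t=1}^{T} t u_t \Rightarrow \sigma_u \left( W_{\tilde{u}}(1) - \int_0^1 W_{\tilde{u}}(r) \, dr \right) \tag{A.4}
\]
where $W_{\tilde{u}}(\cdot)$ is the standard Brownian motion independent of $W_{\tilde{s}}(\cdot)$. Collecting all these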
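results and (A.4) and plugging them into $A_T$, we obtain by the CMT:
\[
T^{-4} A_T \Rightarrow \mu^- \sigma_s \sigma_u \left\{ -\frac{1}{3} \int_0^1 W_{\tilde{s}}(r) \, dW_{\tilde{u}}(r) + \int_0^1 r W_{\tilde{s}}(r) \, dr \left( W_{\tilde{u}}(1) - \int_0^1 W_{\tilde{u}}(r) \, dr \right) \right\} \tag{A.5}
\]
\[
B_T := \mu^+ \left\{ \left( \sum_{t=1}^{T} t^2 \right) \left( \sum_{t=1}^{T} u_t \sum_{j=1}^{t} s_j \right) - \left( \sum_{t=1}^{T} t \sum_{j=1}^{t} s_j \right) \left( \sum_{t=1}^{T} t u_t \right) \right\} + o_P(T^4), \tag{A.6}
\]
and
\[
T^{-4} B_T \Rightarrow \mu^+ \sigma_s \sigma_u \left\{ \frac{1}{3} \int_0^1 W_{\tilde{s}}(r) \, dW_{\tilde{u}}(r) - \int_0^1 r W_{\tilde{s}}(r) \, dr \left( W_{\tilde{u}}(1) - \int_0^1 W_{\tilde{u}}(r) \, dr \right) \right\} \tag{A.7}
\]
Combining (A.5) and (A.7) respectively with (A.1) we obtain the main results.
$\mu^+ A_T + \mu^- B_T = o_P(T^4)$,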
References
Apergis, N. and Miller, S. (2006). “Consumption Asymmetry and the Stock Market:
Altissimo, F. and Violante, G. (2001). “The Nonlinear Dynamics of Output and Unem-
Bachmeier, L.J. and Griffin, J.M. (2003). “New Evidence on Asymmetric Gasoline Price
Bacon, R.W. (1991). “Rockets and Feathers: The Asymmetric Speed of Adjustment of
Bae, Y. and de Jong, R.M. (2007). “Money Demand Function Estimation by Nonlinear
Balke, N.S. and Fomby, T.B. (1997). “Threshold Cointegration.” International Economic
Balke, N.S., Brown, S.P. and Yücel, M.K. (1998). “Crude Oil and Gasoline Prices: An
Banerjee, A., Dolado, J. and Mestre, R. (1998). “Error-correction Mechanism Tests for
(3), 267-283.
Blanchard, O.J. and Summers, L.H. (1987). “Hysteresis and the European Unemployment
Borenstein, S., Cameron, C. and Gilbert, R. (1997). “Do Gasoline Prices Respond Asym-
metrically to Crude Oil Price Changes?” The Quarterly Journal of Economics, 112
(1), 305-339.
Borenstein, S. and Shepard, A. (2002). “Sticky Prices, Inventories, and Market Power in
Crespo Cuaresma, J. (2003). “Okun’s Law Revisited.” Oxford Bulletin of Economics and
Engle, R.F. and Granger, C.W.J. (1987). “Co-integration and Error Correction: Repre-
Escribano, A., Sipols, A.E. and Aparicio, F.M. (2006). “Nonlinear Cointegration and
Galeotti, M., Lanza, A. and Manera, M. (2003). “Rockets and Feathers Revisited: An
175-190.
Grasso, M. and Manera, M. (2007). “Asymmetric Error Correction Models for the Oil-
Greenwood-Nimmo, M.J., Shin, Y. and Van Treeck, T. (2011). “The Great Modera-
tion and the Decoupling of Monetary Policy from Long-Term Rates in the U.S. and
Greenwood-Nimmo, M.J., Shin, Y. and Van Treeck, T. (2011). “The Asymmetric ARDL
Hamilton, J.D. (1994). Time Series Analysis. Princeton (NJ): Princeton University Press.
Hansen, B.E. (1995). “Rethinking the Univariate Approach to Unit Root Tests: How to
(3), 575-603.
Johnson, R.N. (2002). “Search Costs, Lags and Prices at the Pump.” Review of Industrial
Kapetanios, G., Shin, Y. and Snell, A. (2006). “Testing for Cointegration in Nonlinear
Keynes, J.M. (1936). The General Theory of Employment, Interest and Money. London:
Macmillan.
Kremers, J.J.M., Ericsson, N.R. and Dolado, J.J. (1992). “The Power of Cointegration
Lang, D. and de Peretti, C. (2009). “A Strong Hysteretic Model for Okun’s Law: Theory
445-462.
Lardic, S. and Mignon, V. (2008). “Oil Prices and Economic Activity: An Asymmetric
Lee, J. (2000). “The Robustness of Okun’s Law: Evidence from OECD countries.” Jour-
Neftci, S.N. (1984). “Are Economic Time Series Asymmetric over the Business Cycle?”
Nguyen, V.H. and Shin, Y. (2010). “Asymmetric Price Impacts of Order Flow on Ex-
Park, J.Y. and Phillips, P.C.B. (2001). “Nonlinear Regressions with Integrated Time
Pesaran M.H. and Shin, Y. (1998). “An Autoregressive Distributed Lag Modelling Ap-
Pesaran, M.H., Shin, Y. and Smith, R.J. (1999). “Pooled Mean Group Estimation of
(446) , 621-634.
Pesaran M.H., Shin, Y. and Smith, R.J. (2001). “Bounds Testing Approaches to the
Psaradakis, Z., Sola, M. and Spagnolo, F. (2004). “On Markov Error-Correction Models
19 (1), 69-88.
Radchenko, S. (2005). “Oil Price Volatility and the Asymmetric Response of Gasoline
Prices to Oil Price Increases and Decreases.” Energy Economics, 27 (5), 708-730.
Reilly, B. and Witt, R. (1998). “Petrol Price Asymmetries Revisited.” Energy Economics,
20 (3), 297-308.
of Geneva.
Shin, Y. and Yu, B. (2004). “An ARDL Approach to an Analysis of Asymmetric Long-run
Shiller, R.J. (1993). Macro Markets: Creating Institutions for Managing Society’s Largest
Shiller, R.J. (2005). Irrational Exuberance (2nd ed.). Princeton (NJ): Princeton Univer-
sity Press.
41-49.
Stock, J.H. and Watson, M.W. (1993). “A Simple Estimator of Cointegrating Vectors in
Tanaka, Y. (2001). “Employment Tenure, Job Expectancy and Earnings Profile in Japan.”
Van Treeck, T. (2008). “Asymmetric Income and Wealth Effects in a Non-linear Error
Virén, M. (2001). “The Okun Curve is Non-linear.” Economics Letters, 70 (2), 253-57.
Webber, A.G. (2000). “Newton’s Gravity Law and Import Prices in the Asia Pacific.”
Table 1: Monte Carlo Simulation Results: Bias, Standard Error and RMSE of the OLS Estimator

                      T = 100                  T = 200                  T = 400
            Coef    Bias   STDE   RMSE     Bias   STDE   RMSE     Bias   STDE   RMSE
ω = −0.5    α      0.001  0.308  6.932    0.001  0.194  3.082    0.001  0.130  1.451
            ρ     -0.063  0.070  2.121   -0.029  0.043  0.825   -0.014  0.028  0.354
            θ+     0.019  0.054  1.283    0.011  0.028  0.475    0.006  0.016  0.195
            θ−     0.051  0.073  2.001    0.026  0.044  0.806    0.013  0.029  0.351
            ϕ+    -0.001  0.179  4.019    0.000  0.122  1.930   -0.001  0.085  0.954
            ϕ−    -0.002  0.178  4.011   -0.001  0.122  1.937    0.001  0.085  0.954
            β+    -0.031  0.205  4.664   -0.010  0.102  1.628   -0.003  0.051  0.567
            β−    -0.031  0.205  4.661   -0.010  0.102  1.631   -0.003  0.051  0.567

ω = 0       α      0.002  0.366  8.230    0.001  0.229  3.633    0.000  0.150  1.681
            ρ     -0.075  0.077  2.427   -0.037  0.049  0.970   -0.018  0.033  0.417
            θ+     0.037  0.072  1.811    0.018  0.036  0.647    0.009  0.020  0.251
            θ−     0.075  0.098  2.773    0.037  0.056  1.062    0.018  0.035  0.441
            ϕ+     0.002  0.206  4.631    0.000  0.141  2.228    0.001  0.098  1.102
            ϕ−    -0.001  0.205  4.613    0.000  0.140  2.221    0.000  0.099  1.104
            β+    -0.002  0.227  5.104    0.000  0.114  1.812    0.000  0.057  0.637
            β−    -0.001  0.227  5.097    0.000  0.114  1.813    0.000  0.057  0.638

ω = 0.5     α      0.005  0.311  7.001    0.001  0.195  3.096    0.001  0.129  1.441
            ρ     -0.063  0.070  2.121   -0.029  0.043  0.826   -0.014  0.028  0.354
            θ+     0.044  0.074  1.929    0.018  0.036  0.634    0.008  0.019  0.234
            θ−     0.075  0.102  2.854    0.032  0.054  1.001    0.015  0.032  0.396
            ϕ+     0.002  0.178  4.001    0.001  0.123  1.948    0.000  0.085  0.952
            ϕ−     0.002  0.178  4.002    0.000  0.122  1.933    0.000  0.085  0.957
            β+     0.031  0.207  4.714    0.010  0.102  1.625    0.003  0.050  0.565
            β−     0.032  0.207  4.714    0.010  0.102  1.621    0.003  0.050  0.566
Note: Bias $= \hat{\theta}_R - \theta_0$, where $\theta_0$ is the true value of the coefficient $\theta$ and $\hat{\theta}_R$ is the mean of the estimates of $\theta$ across
replications, i.e., $\hat{\theta}_R = \sum_{i=1}^{R} \hat{\theta}_i / R$, where $R$ is the number of replications (we set $R = 3,000$ in all cases). STDE denotes
the standard error of the estimator, $\hat{\theta}_i$, across replications. RMSE denotes the root mean squared error of $\hat{\theta}_i$, defined as
$\sqrt{R^{-1} \sum_{i=1}^{R} (\hat{\theta}_i - \theta_0)^2}$.
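The three summary statistics defined in the note can be computed as in the following sketch. It is illustrative only: the estimates are artificial draws rather than output from the simulation design, the function and variable names are hypothetical, and the use of an R − 1 divisor for the standard error is an assumption, since the note does not state which divisor was used.

```python
import math
import random


def monte_carlo_summary(estimates, true_value):
    """Bias, standard error and RMSE of an estimator across replications."""
    r = len(estimates)
    mean_estimate = sum(estimates) / r
    bias = mean_estimate - true_value
    # Dispersion of the estimates around their own mean (R - 1 divisor assumed).
    stde = math.sqrt(sum((e - mean_estimate) ** 2 for e in estimates) / (r - 1))
    # Dispersion of the estimates around the true value.
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / r)
    return bias, stde, rmse


# Artificial example: an estimator with a small upward bias and Gaussian noise.
rng = random.Random(42)
true_theta = 1.0
n_replications = 3000
fake_estimates = [true_theta + 0.02 + rng.gauss(0.0, 0.1) for _ in range(n_replications)]

bias, stde, rmse = monte_carlo_summary(fake_estimates, true_theta)
print(f"bias={bias:.4f}  stde={stde:.4f}  rmse={rmse:.4f}")
```

By construction the squared RMSE equals the squared bias plus the variance of the estimates around their mean, so RMSE and STDE coincide only for an unbiased estimator.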
Table 2: Monte Carlo Simulation Results: Size and Power of Wald and PSS Tests

                              T = 100          T = 200          T = 400
            Test            Power   Size     Power   Size     Power   Size
ω = −0.5    W_LR            0.981  0.089     1.000  0.067     1.000  0.059
            W_SR            0.425  0.075     0.675  0.050     0.935  0.055
            F_PSS (k=1)     0.610  0.040     0.995  0.045     1.000  0.030
            F_PSS (k=2)     0.765  0.090     1.000  0.100     1.000  0.070
            F_PSS (b)       0.720  0.050     1.000  0.050     1.000  0.050

ω = 0       W_LR            0.974  0.100     1.000  0.075     1.000  0.062
            W_SR            0.308  0.051     0.548  0.053     0.870  0.035
            F_PSS (k=1)     0.329  0.036     0.947  0.030     1.000  0.025
            F_PSS (k=2)     0.527  0.080     0.988  0.072     1.000  0.070
            F_PSS (b)       0.422  0.050     0.976  0.050     1.000  0.050

ω = 0.5     W_LR            0.982  0.098     1.000  0.075     1.000  0.061
            W_SR            0.385  0.075     0.675  0.025     0.925  0.055
            F_PSS (k=1)     0.540  0.035     0.985  0.025     1.000  0.025
            F_PSS (k=2)     0.735  0.080     1.000  0.080     1.000  0.075
            F_PSS (b)       0.655  0.050     0.995  0.050     1.000  0.050

Note: $W_{LR}$ denotes the Wald test of the null hypothesis of long-run symmetry defined as
$\theta^+ = \theta^-$. $W_{SR}$ is the Wald test of the short-run symmetry restrictions $\phi^+ = \phi^-$. $F_{PSS}^{k=n}$
denotes the PSS F-test of the null hypothesis $\rho = \theta^+ = \theta^- = 0$ using the $k = n$ critical
values where $n = (1, 2)$ for the case where all regressors follow nonstationary I(1) processes.
$F_{PSS}^{(b)}$ refers to the bootstrapped PSS test.
Table 3: Static Estimation of the Unemployment-Output Relationship
Note: $y_t$ denotes the natural logarithm of industrial production and $y_t^+$ and $y_t^-$ the associated
positive and negative partial sum processes. Note also that in order to accommodate
the strong trending behavior of $y_t$, we include a deterministic time trend in the symmetric
case. $\chi^2_{SC}$, $\chi^2_{H}$, $\chi^2_{FF}$ and $\chi^2_{N}$ denote LM tests for serial correlation, heteroscedasticity,
functional form (Ramsey’s RESET test) and normality, respectively. Figures in square
brackets are the associated p-values. $W_{y^+ = y^-}$ denotes the Wald test of the equality
of the coefficients associated with $y_t^+$ and $y_t^-$. $EG_{MAX}$ denotes the largest value of the
Engle-Granger residual-based ADF test. The 95% critical values of the EG test are -3.42
(panel (a)) and -3.77 (panel (b)).
Table 4: Dynamic Linear Estimation of the Unemployment-Output Relationship

            US                           Canada                        Japan
Var.         Coeff.   S.E.     Var.         Coeff.   S.E.     Var.         Coeff.   S.E.
u_{t−1}      -0.03    0.01     u_{t−1}      -0.02    0.01     u_{t−1}       0.00    0.01
y_{t−1}      -0.04    0.07     y_{t−1}      -0.09    0.10     y_{t−1}      -0.02    0.06
∆u_{t−1}     -0.17    0.06     ∆u_{t−2}     -0.12    0.06     ∆u_{t−1}     -0.26    0.06
∆u_{t−11}     0.13    0.05     ∆y_t         -4.40    1.19     ∆u_{t−2}     -0.22    0.06
∆y_t         -8.17    1.61     ∆y_{t−2}     -2.83    1.21     ∆u_{t−10}     0.16    0.06
∆y_{t−2}     -4.73    1.58     ∆y_{t−6}     -3.01    1.16     ∆u_{t−12}    -0.18    0.06
∆y_{t−4}     -4.04    1.50     Const.        0.57    0.55     ∆y_{t−1}     -1.37    0.42
Const.        0.38    0.35                                    ∆y_{t−2}     -1.27    0.45
                                                              ∆y_{t−3}     -1.30    0.43
                                                              ∆y_{t−9}     -1.16    0.39
                                                              Const.        0.09    0.27

L_y          -1.66    2.03     L_y          -5.68    3.89     L_y           5.57   20.88
R²            0.29             R²            0.13             R²            0.23
R̄²            0.27             R̄²            0.11             R̄²            0.20
χ²_SC        10.75[.550]       χ²_SC         9.35[.673]       χ²_SC        11.95[.450]
χ²_FF         1.94[.163]       χ²_FF         0.26[.609]       χ²_FF         0.03[.867]
χ²_NOR        3.72[.156]       χ²_NOR       12.35[.002]       χ²_NOR        0.92[.632]
χ²_HET       15.19[.000]       χ²_HET        0.09[.770]       χ²_HET        0.41[.521]
t_BDM        -2.34[.136]       t_BDM        -1.27[.820]       t_BDM         0.57[1.000]
F_PSS         4.69[.081]       F_PSS         0.81[.927]       F_PSS         0.18[.890]
Note: $u_t$ denotes the rate of unemployment, measured in percentage points. Here we follow
the general-to-specific approach to select the final ARDL specification. The preferred
specification is chosen by starting with max $p$ = max $q$ = 12 and dropping all insignificant
stationary regressors. $t_{BDM}$ is the BDM t-statistic while $F_{PSS}$ denotes the PSS F-statistic
testing the null hypothesis $\rho = \theta = 0$. The long-run coefficient $L_y$ is defined by $\hat{\beta} = -\hat{\theta}/\hat{\rho}$.
Pesaran, Shin and Smith (2001) tabulate the 5% critical values for $k = 1$ as follows:
$t_{crit} = -3.22$; $F_{crit} = 5.73$. Empirical p-values are quoted for the BDM t-statistic and the
PSS F-statistic.
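As a rough check on how the long-run coefficient follows from the level coefficients, the sketch below applies the formula from the note to the US column of Table 4 as printed. Because the printed coefficients are rounded to two decimals, the recomputed value can only approximate the reported one; the rounding band is an illustrative device, and the function names are hypothetical.

```python
def long_run_coefficient(theta_hat, rho_hat):
    """Long-run coefficient -theta/rho from the level terms of the ECM."""
    if rho_hat == 0.0:
        raise ZeroDivisionError("rho_hat is zero: the long-run coefficient is undefined")
    return -theta_hat / rho_hat


def rounding_band(theta_hat, rho_hat, half_width=0.005):
    """Range of -theta/rho when both inputs are known only to two decimals.

    Evaluating the four corners is sufficient only if the band for rho
    excludes zero, which holds for the values used below.
    """
    corners = [
        long_run_coefficient(theta_hat + d_theta, rho_hat + d_rho)
        for d_theta in (-half_width, half_width)
        for d_rho in (-half_width, half_width)
    ]
    return min(corners), max(corners)


# Printed US values from Table 4: coefficient on y_{t-1} and on u_{t-1}.
theta_us, rho_us = -0.04, -0.03
reported_ly_us = -1.66

point = long_run_coefficient(theta_us, rho_us)
low, high = rounding_band(theta_us, rho_us)
print(f"recomputed L_y = {point:.2f}, rounding band = [{low:.2f}, {high:.2f}]")
print(f"reported L_y inside band: {low <= reported_ly_us <= high}")
```

The same calculation cannot be carried out for Japan from the printed values, since the coefficient on $u_{t-1}$ rounds to 0.00 there.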
Table 5: Dynamic Asymmetric Estimation of the Unemployment-Output Relationship

            US                           Canada                        Japan
Var.         Coeff.   S.E.     Var.         Coeff.   S.E.     Var.         Coeff.   S.E.
u_{t−1}      -0.06    0.01     u_{t−1}      -0.07    0.02     u_{t−1}      -0.05    0.01
y+_{t−1}     -0.55    0.17     y+_{t−1}     -1.27    0.28     y+_{t−1}     -0.34    0.10
y−_{t−1}     -1.62    0.50     y−_{t−1}     -2.09    0.46     y−_{t−1}     -0.53    0.14
∆u_{t−1}     -0.19    0.06     ∆u_{t−2}     -0.13    0.06     ∆u_{t−1}     -0.23    0.06
∆u_{t−11}     0.11    0.05     ∆u_{t−12}    -0.12    0.06     ∆u_{t−2}     -0.19    0.06
∆y+_t        -8.42    2.23     ∆y+_t        -5.24    1.86     ∆u_{t−10}     0.13    0.06
∆y+_{t−2}    -4.82    1.99     ∆y+_{t−3}     3.69    1.86     ∆u_{t−12}    -0.22    0.06
∆y−_t        -8.24    4.28     ∆y−_t        -5.15    2.60     ∆y+_{t−1}    -1.61    0.65
∆y−_{t−4}    -9.74    3.77     ∆y−_{t−3}    -5.89    2.64     ∆y+_{t−9}    -1.71    0.66
Const.        0.38    0.11     Const.        0.72    0.19     ∆y−_t        -1.80    0.71
                                                              Const.        0.16    0.04

L_y+         -9.76    1.74     L_y+        -17.26    2.15     L_y+         -7.28    1.64
L_y−        -28.88    6.33     L_y−        -28.48    4.04     L_y−        -11.26    1.97
R²            0.32             R²            0.20             R²            0.24
R̄²            0.30             R̄²            0.17             R̄²            0.21
χ²_SC         9.23[.683]       χ²_SC         8.11[.777]       χ²_SC        11.85[.458]
χ²_FF         0.53[.466]       χ²_FF         9.74[.002]       χ²_FF         0.11[.744]
χ²_NOR        1.79[.409]       χ²_NOR       12.62[.002]       χ²_NOR        0.30[.861]
χ²_HET       12.81[.000]       χ²_HET        0.38[.537]       χ²_HET        2.77[.096]
t_BDM        -3.97[.007]       t_BDM        -4.12[.006]       t_BDM        -3.34[.033]
F_PSS         6.98[.010]       F_PSS         7.13[.005]       F_PSS         5.38[.038]
W_LR         16.33[.000]       W_LR         32.49[.000]       W_LR         76.69[.000]
W_SR          0.46[.498]       W_SR          3.65[.056]       W_SR          2.35[.125]
Note: $L_{y^+}$ and $L_{y^-}$ denote the long-run coefficients associated with positive and negative
changes of output, respectively. $W_{LR}$ refers to the Wald test of long-run symmetry (i.e.
$L_{y^+} = L_{y^-}$) while $W_{SR}$ denotes the Wald test of the additive short-run symmetry condition.
Pesaran, Shin and Smith (2001) tabulate the 5% critical values of $t_{BDM}$ as -3.53
and -3.22 for $k = 2$ and $k = 1$, respectively, while the equivalent values for $F_{PSS}$ are 4.85
and 5.73. Empirical p-values are reported for both tests.
Table 6: Dynamic Asymmetric Estimation of Gasoline Price Adjustments
Note: pt denotes the natural logarithm of the gasoline price index (2000Y=ln(100)), pot
denotes the natural logarithm of the price of crude oil (US$/bbl) while xt denotes the
natural logarithm of the KRW/USD exchange rate. The superscripts ‘+’ and ‘-’ denote
positive and negative partial sums, respectively. Lpo+ , Lpo− , Lx+ and Lx− denote the long-
run coefficients associated with positive and negative changes in the price of crude oil and
positive and negative changes in the KRW/USD exchange rate, respectively. WLR, po
refers to the Wald test of the restriction Lpo+ = Lpo− while WLR, x refers to the Wald
test of Lx+ = Lx− . WSR, po and WSR, x refer to the Wald tests of the short-run additive
symmetry restrictions. The relevant 5% critical values of the tBDM test are -3.99 for k = 4
and -3.53 for k = 2. Similarly, the critical values of the FP SS test are 4.01 with k = 4 and
4.85 with k = 2. Empirical p-values are reported for both tests.
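The positive and negative partial sums mentioned in the note can be sketched as follows. This is a minimal illustration assuming the standard decomposition, in which the partial sums accumulate the positive and the negative first differences of a series separately; the function name is hypothetical.

```python
import numpy as np


def partial_sums(x):
    """Split a series into cumulative sums of its positive and negative changes."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                           # first differences
    x_pos = np.cumsum(np.maximum(dx, 0.0))    # accumulates increases only
    x_neg = np.cumsum(np.minimum(dx, 0.0))    # accumulates decreases only
    return x_pos, x_neg


pos, neg = partial_sums([1.0, 3.0, 2.0, 5.0])
```

By construction the two partial sums add up to the cumulative change in the original series, so no information is lost by the decomposition.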
[Figure panels omitted; captions not recoverable. Each panel plots the series y-, Diff and y+ over horizons 10 to 80.]
[Figure 4 panels omitted: four plots over horizons 10 to 80; recoverable legend entries are p-, Diff and p+.]
Figure 4: Dynamic Multipliers w.r.t. Oil Price and Exchange Rate Shocks
Bounds Testing Approaches to the Analysis of Level Relationships
Author(s): M. Hashem Pesaran, Yongcheol Shin and Richard J. Smith
Source: Journal of Applied Econometrics, Vol. 16, No. 3, Special Issue in Memory of John
Denis Sargan, 1924-1996: Studies in Empirical Macroeconometrics (May - Jun., 2001), pp.
289-326
Published by: Wiley
Stable URL: http://www.jstor.org/stable/2678547
Accessed: 08-08-2016 17:39 UTC
Wiley is collaborating with JSTOR to digitize, preserve and extend access to Journal of Applied
Econometrics
This content downloaded from 129.49.5.35 on Mon, 08 Aug 2016 17:39:12 UTC
All use subject to http://about.jstor.org/terms
JOURNAL OF APPLIED ECONOMETRICS
J. Appl. Econ. 16: 289-326 (2001)
DOI: 10.1002/jae.616
SUMMARY
This paper develops a new approach to the problem of testing the existence of a level relationship between
a dependent variable and a set of regressors, when it is not known with certainty whether the underlying
regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics
used to test the significance of the lagged levels of the variables in a univariate equilibrium correction
mechanism. The asymptotic distributions of these statistics are non-standard under the null hypothesis that
there exists no level relationship, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic
critical values are provided: one when all regressors are purely I(1) and the other if they are all purely
I(0). These two sets of critical values provide a band covering all possible classifications of the regressors
into purely I(0), purely I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are
proposed. It is shown that the proposed tests are consistent, and their asymptotic distribution under the null
and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is
demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric
model. Copyright © 2001 John Wiley & Sons, Ltd.
1. INTRODUCTION
Over the past decade considerable attention has been paid in empirical economics to testing for
the existence of relationships in levels between variables. In the main, this analysis has been
based on the use of cointegration techniques. Two principal approaches have been adopted: the
two-step residual-based procedure for testing the null of no-cointegration (see Engle and Granger,
1987; Phillips and Ouliaris, 1990) and the system-based reduced rank regression approach due to
Johansen (1991, 1995). In addition, other procedures such as the variable addition approach of Park
(1990), the residual-based procedure for testing the null of cointegration by Shin (1994), and the
stochastic common trends (system) approach of Stock and Watson (1988) have been considered.
All of these methods concentrate on cases in which the underlying variables are integrated of order
one. This inevitably involves a certain degree of pre-testing, thus introducing a further degree of
uncertainty into the analysis of levels relationships. (See, for example, Cavanagh, Elliott and Stock,
1995.)
This paper proposes a new approach to testing for the existence of a relationship between
variables in levels which is applicable irrespective of whether the underlying regressors are purely
* Correspondence to: M. H. Pesaran, Faculty of Economics and Politics, University of Cambridge, Sidgwick Avenue,
Cambridge CB3 9DD. E-mail: hashem.pesaran@econ.cam.ac.uk
Contract/grant sponsor: ESRC; Contract/grant numbers: R000233608; R000237334.
Contract/grant sponsor: Isaac Newton Trust of Trinity College, Cambridge.
Copyright © 2001 John Wiley & Sons, Ltd. Received 16 February 1999
Revised 13 February 2001
I(0), purely I(1) or mutually cointegrated. The statistic underlying our procedure is the familiar
Wald or F-statistic in a generalized Dickey-Fuller type regression used to test the significance
of lagged levels of the variables under consideration in a conditional unrestricted equilibrium
correction model (ECM). It is shown that the asymptotic distributions of both statistics are
non-standard under the null hypothesis that there exists no relationship in levels between the
included variables, irrespective of whether the regressors are purely I(0), purely I(1) or mutually
cointegrated. We establish that the proposed test is consistent and derive its asymptotic distribution
under the null and suitably defined local alternatives, again for a set of regressors which are a
mixture of I(0)/I(1) variables.
Two sets of asymptotic critical values are provided for the two polar cases which assume that all
the regressors are, on the one hand, purely I(1) and, on the other, purely I(0). Since these two sets
of critical values provide critical value bounds for all classifications of the regressors into purely
I(1), purely I(0) or mutually cointegrated, we propose a bounds testing procedure. If the computed
Wald or F-statistic falls outside the critical value bounds, a conclusive inference can be drawn
without needing to know the integration/cointegration status of the underlying regressors. However,
if the Wald or F-statistic falls inside these bounds, inference is inconclusive and knowledge of the
order of the integration of the underlying variables is required before conclusive inferences can be
made. A bounds procedure is also provided for the related cointegration test proposed by Banerjee
et al. (1998) which is based on earlier contributions by Banerjee et al. (1986) and Kremers et al.
(1992). Their test is based on the t-statistic associated with the coefficient of the lagged dependent
variable in an unrestricted conditional ECM. The asymptotic distribution of this statistic is obtained
for cases in which all regressors are purely I(1), which is the primary context considered by these
authors, as well as when the regressors are purely I(0) or mutually cointegrated. The relevant
critical value bounds for this t-statistic are also detailed.
The empirical relevance of the proposed bounds procedure is demonstrated in a re-examination
of the earnings equation included in the UK Treasury macroeconometric model. This is a
particularly relevant application because there is considerable doubt concerning the order of
integration of variables such as the degree of unionization of the workforce, the replacement
ratio (unemployment benefit-wage ratio) and the wedge between the 'real product wage' and the
'real consumption wage' that typically enter the earnings equation. There is another consideration
in the choice of this application. Under the influence of the seminal contributions of Phillips (1958)
and Sargan (1964), econometric analysis of wages and earnings has played an important role in
the development of time series econometrics in the UK. Sargan's work is particularly noteworthy
as it is some of the first to articulate and apply an ECM to wage rate determination. Sargan,
however, did not consider the problem of testing for the existence of a levels relationship between
real wages and its determinants.
The relationship in levels underlying the UK Treasury's earning equation relates real average
earnings of the private sector to labour productivity, the unemployment rate, an index of union
density, a wage variable (comprising a tax wedge and an import price wedge) and the replacement
ratio (defined as the ratio of the unemployment benefit to the wage rate). These are the variables
predicted by the bargaining theory of wage determination reviewed, for example, in Layard
et al. (1991). In order to identify our model as corresponding to the bargaining theory of wage
determination, we require that the level of the unemployment rate enters the wage equation, but not
vice versa; see Manning (1993). This assumption, of course, does not preclude the rate of change
of earnings from entering the unemployment equation, or there being other level relationships
between the remaining four variables. Our approach accommodates both of these possibilities.
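$-(I_{k+1} - \sum_{i=1}^{p} \Phi_i)$, and the short-run response matrix lag polynomial $\Gamma(L) \equiv I_{k+1} - \sum_{i=1}^{p-1} \Gamma_i L^i$,
$\Gamma_i = -\sum_{j=i+1}^{p} \Phi_j$, $i = 1, \ldots, p-1$. Hence, the VAR(p) model (1) may be rewritten in vector
ECM form as
$$\Delta z_t = a_0 + a_1 t + \Pi z_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta z_{t-i} + \varepsilon_t \qquad (2)$$
$$\Omega = \begin{pmatrix} \omega_{yy} & \omega_{yx} \\ \omega_{xy} & \Omega_{xx} \end{pmatrix}$$
we may express $\varepsilon_{yt}$ conditionally in terms of $\varepsilon_{xt}$ as
$$\varepsilon_{yt} = \omega_{yx} \Omega_{xx}^{-1} \varepsilon_{xt} + u_t \qquad (4)$$
$$\Pi = \begin{pmatrix} \pi_{yy} & \pi_{yx} \\ \pi_{xy} & \Pi_{xx} \end{pmatrix}$$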
2 See also Nielsen and Rahbek (1998) for an analysis of similarity issues in cointegrated systems.
Thus, we may regard the process $\{x_t\}_{t=1}^{\infty}$ as long-run forcing for $\{y_t\}_{t=1}^{\infty}$ as there is no feedback
from the level of $y_t$ in (7); see Granger and Lin (1995).3 Assumption 3 restricts consideration to
cases in which there exists at most one conditional level relationship between $y_t$ and $x_t$, irrespective
of the level of integration of the process $\{x_t\}_{t=1}^{\infty}$; see (10) below.4
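Under Assumption 3, the conditional ECM (5) now becomes
$$\Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (8)$$
$t = 1, 2, \ldots$, where $\pi_{yx.x} \equiv \pi_{yx} - \omega' \Pi_{xx}$.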
Under Assumption 4, from (7), we may express $\Pi_{xx}$ as $\Pi_{xx} = \alpha_{xx} \beta_{xx}'$, where $\alpha_{xx}$ and $\beta_{xx}$ are both
$(k, r)$ matrices of full column rank; see, for example, Engle and Granger (1987) and Johansen
(1991). If the maximal order of integration of the system (8) and (7) is unity, under Assumptions
1, 3 and 4, the process $\{x_t\}_{t=1}^{\infty}$ is mutually cointegrated of order $r$, $0 \le r \le k$. However, in
contradistinction to, for example, Banerjee, Dolado and Mestre (1998), BDM henceforth, who
concentrate on the case $r = 0$, we do not wish to impose an a priori specification of $r$.6 When
$\pi_{xy} = 0$ and $\Pi_{xx} = 0$, then $x_t$ is weakly exogenous for $\pi_{yy}$ and $\pi_{yx.x} = \pi_{yx}$ in (8); see, for example,
Johansen (1995, Theorem 8.1, p. 122). In the more general case where $\Pi_{xx}$ is non-zero, as $\pi_{yy}$ and
$\pi_{yx.x} = \pi_{yx} - \omega' \Pi_{xx}$ are variation-free from the parameters in (7), $x_t$ is also weakly exogenous for
the parameters of (8).
Note that under Assumption 4 the maximal cointegrating rank of the long-run multiplier
matrix $\Pi$ for the system (8) and (7) is $r + 1$ and the minimal cointegrating rank of $\Pi$ is $r$. The
next assumptions provide the conditions for the maximal order of integration of the system (8)
and (7) to be unity. First, we consider the requisite conditions for the case in which rank$(\Pi) = r$.
In this case, under Assumptions 1, 3 and 4, $\pi_{yy} = 0$ and $\pi_{yx} - \phi' \Pi_{xx} = 0'$ for some $k$-vector $\phi$.
Note that $\pi_{yx.x} = 0'$ implies the latter condition. Thus, under Assumptions 1, 3 and 4, $\Pi$ has rank
$r$ and is given by
$$\Pi = \begin{pmatrix} 0 & \pi_{yx} \\ 0 & \Pi_{xx} \end{pmatrix}$$
Hence, we may express $\Pi = \alpha \beta'$ where $\alpha = (\alpha_{yx}', \alpha_{xx}')'$ and $\beta = (0, \beta_{xx}')'$ are $(k + 1, r)$ matrices of
full column rank; cf. HJNR, p. 390. Let the columns of the $(k + 1, k - r + 1)$ matrices $(\alpha_y^{\perp}, \alpha^{\perp})$
and $(\beta_y^{\perp}, \beta^{\perp})$, where $\alpha_y^{\perp}$, $\beta_y^{\perp}$ and $\alpha^{\perp}$, $\beta^{\perp}$ are respectively $(k + 1)$-vectors and $(k + 1, k - r)$
matrices, denote bases for the orthogonal complements of respectively $\alpha$ and $\beta$; in particular,
$(\alpha_y^{\perp}, \alpha^{\perp})' \alpha = 0$ and $(\beta_y^{\perp}, \beta^{\perp})' \beta = 0$.
Assumption 5a. If rank$(\Pi) = r$, the matrix $(\alpha_y^{\perp}, \alpha^{\perp})' \Gamma (\beta_y^{\perp}, \beta^{\perp})$ is full rank $k - r + 1$, $0 \le r \le k$.
Assumption 5b. If rank$(\Pi) = r + 1$, the matrix $\alpha^{\perp\prime} \Gamma \beta^{\perp}$ is full rank $k - r$, $0 \le r \le k$.
Assumptions 1, 3, 4 and 5a and 5b permit the two polar cases for $\{x_t\}_{t=1}^{\infty}$. First, if $\{x_t\}_{t=1}^{\infty}$ is a
purely I(0) vector process, then $\Pi_{xx}$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$, are nonsingular. Second, if $\{x_t\}_{t=1}^{\infty}$
is purely I(1), then $\Pi_{xx} = 0$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$ are also null matrices.
Using (A.1) in Appendix A, it is easily seen that $\pi_{y.x}(z_t - \mu - \gamma t) = \pi_{y.x} C^*(L) \varepsilon_t$, where
$\{C^*(L) \varepsilon_t\}$ is a mean zero stationary process. Therefore, under Assumptions 1, 3, 4 and 5b, that is,
$\pi_{yy} \neq 0$, it immediately follows that there exists a conditional level relationship between $y_t$ and
$x_t$ defined by
$$y_t = \theta_0 + \theta_1 t + \theta x_t + v_t, \qquad t = 1, 2, \ldots \qquad (10)$$
BDM test for the exclusion of $y_{t-1}$ in (11) when $r = 0$, that is, $\beta_{xx} = 0$ in (11) or $\Pi_{xx} = 0$ in
(7) and, thus, $\{x_t\}$ is purely I(1); cf. HJNR and PSS.9 Therefore, BDM consider the hypothesis
$\alpha_{yy} = 0$ (or $\pi_{yy} = 0$).10 More generally, when $0 < r \le k$, BDM require the imposition of the
untested subsidiary hypothesis $\alpha_{yx} - \omega' \alpha_{xx} = 0'$; that is, the limiting distribution of the BDM test
is obtained under the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0$ in (8).
In the following sections of the paper, we focus on (8) and differentiate between five cases of
interest delineated according to how the deterministic components are specified:
7 This joint hypothesis may be justified by the application of Roy's union-intersection principle to tests of $\pi_{yy} = 0$
in (8) given $\pi_{yx.x}$. Let $W_{\pi_{yy}}(\pi_{yx.x})$ be the Wald statistic for testing $\pi_{yy} = 0$ for a given value of $\pi_{yx.x}$. The test
$\max_{\pi_{yx.x}} W_{\pi_{yy}}(\pi_{yx.x})$ is identical to the Wald test of $\pi_{yy} = 0$ and $\pi_{yx.x} = 0$ in (8).
8 A related approach to that of this paper is Hansen's (1995) test for a unit root in a univariate time series which, in our
context, would require the imposition of the subsidiary hypothesis $\pi_{yx.x} = 0'$.
9 The BDM test is based on earlier contributions of Kremers et al. (1992), Banerjee et al. (199
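10 Partitioning $\Gamma_{xi} = (\gamma_{xy,i}, \Gamma_{xx,i})$, $i = 1, \ldots, p - 1$, conformably with $z_t = (y_t, x_t')'$, BDM also set $\gamma_{xy,i} = 0$, $i =
1, \ldots, p - 1$, which implies $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$; that is, $\Delta y_t$ does not Granger cause $\Delta x_t$.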
It should be emphasized that the DGPs for Cases II and III are treated as identical as are those
for Cases IV and V. However, as in the test for a unit root proposed by Dickey and Fuller (1979)
compared with that of Dickey and Fuller (1981) for univariate models, estimation and hypothesis
testing in Cases III and V proceed ignoring the constraints linking respectively the intercept and
trend coefficient, $c_0$ and $c_1$, to the parameter vector $(\pi_{yy}, \pi_{yx.x})$ whereas Cases II and IV fully
incorporate the restrictions in (9).
In the following exposition, we concentrate on Case IV, that is, (15), which may be specialized
to yield the remainder.
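For orientation, the five deterministic specifications are commonly written as below. This is a sketch rather than a reproduction of displays (12)-(16): the shorthand $d_t$ for the short-run terms is introduced here purely for illustration, and the symbols $\psi_i$, $\mu_y$, $\mu_x$, $\gamma_y$ and $\gamma_x$ (lagged-difference coefficients, and the intercept and trend components of $y_t$ and $x_t$) are assumed notation.

```latex
% Assumed shorthand: d_t = \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t
\begin{align*}
\text{Case I (no intercept, no trend):}\quad
  & \Delta y_t = \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + d_t + u_t \\
\text{Case II (restricted intercept, no trend):}\quad
  & \Delta y_t = \pi_{yy} (y_{t-1} - \mu_y) + \pi_{yx.x} (x_{t-1} - \mu_x) + d_t + u_t \\
\text{Case III (unrestricted intercept, no trend):}\quad
  & \Delta y_t = c_0 + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + d_t + u_t \\
\text{Case IV (unrestricted intercept, restricted trend):}\quad
  & \Delta y_t = c_0 + \pi_{yy} (y_{t-1} - \gamma_y t) + \pi_{yx.x} (x_{t-1} - \gamma_x t) + d_t + u_t \\
\text{Case V (unrestricted intercept, unrestricted trend):}\quad
  & \Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + d_t + u_t
\end{align*}
```

Written this way, Cases II and IV differ from Cases III and V only in whether the intercept or trend is forced inside the level term, which is the distinction the paragraph above draws.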
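In this section we develop bounds procedures for testing for the existence of a level relationship
between $y_t$ and $x_t$ using (12)-(16); see (10). The main approach taken here, cf. Engle and
Granger (1987) and BDM, is to test for the absence of any level relationship between $y_t$ and
$x_t$ via the exclusion of the lagged level variables $y_{t-1}$ and $x_{t-1}$ in (12)-(16). Consequently, we
define the constituent null hypotheses $H_0^{\pi_{yy}}: \pi_{yy} = 0$, $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$, and alternative hypotheses
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$. Hence, the joint null hypothesis of interest in (12)-(16) is
given by:
$$H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}} \qquad (17)$$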
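For ease of exposition, we consider Case IV and rewrite (15) in matrix notation as
$$\Delta y = \iota_T c_0 + Z^*_{-1} \pi^*_{y.x} + \Delta Z_- \psi + u \qquad (19)$$
$$\pi^*_{y.x} = \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix} \begin{pmatrix} \pi_{yy} \\ \pi_{yx.x}' \end{pmatrix}$$
$$W \equiv \hat{\pi}^{*\prime}_{y.x} \tilde{Z}^{*\prime}_{-1} \bar{P}_{\Delta \tilde{Z}_-} \tilde{Z}^*_{-1} \hat{\pi}^*_{y.x} / \hat{\omega}_{uu}, \qquad F \equiv \frac{W}{k + 2} \qquad (21)$$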
where $\hat{\omega}_{uu} \equiv (T - m)^{-1} \sum_{t=1}^{T} \tilde{u}_t^2$, $m \equiv (k + 1)(p + 1) + 1$ is the number of estimated coefficients
and $\tilde{u}_t$, $t = 1, 2, \ldots, T$, are the least squares (LS) residuals from (19).
The next theorem presents the asymptotic null distribution of the Wald statistic; the limit
behaviour of the F-statistic is a simple corollary and is not presented here or subsequently.
Let $W_{k-r+1}(a) \equiv (W_u(a), W_{k-r}(a)')'$ denote a $(k - r + 1)$-dimensional standard Brownian motion
partitioned into the scalar and $(k - r)$-dimensional sub-vector independent standard Brownian
motions $W_u(a)$ and $W_{k-r}(a)$, $a \in [0, 1]$. We will also require the corresponding de-meaned $(k -
r + 1)$-vector standard Brownian motion $\tilde{W}_{k-r+1}(a) \equiv W_{k-r+1}(a) - \int_0^1 W_{k-r+1}(a)\,da$, and de-
meaned and de-trended $(k - r + 1)$-vector standard Brownian motion $\hat{W}_{k-r+1}(a) \equiv \tilde{W}_{k-r+1}(a) -
12\left(a - \tfrac{1}{2}\right) \int_0^1 \left(a - \tfrac{1}{2}\right) \tilde{W}_{k-r+1}(a)\,da$, and their respective partitioned counterparts $\tilde{W}_{k-r+1}(a) =
(\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'$, and $\hat{W}_{k-r+1}(a) = (\hat{W}_u(a), \hat{W}_{k-r}(a)')'$, $a \in [0, 1]$.
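$$W \Rightarrow z_r' z_r + \int_0^1 dW_u(a)\, F_{k-r+1}(a)' \left( \int_0^1 F_{k-r+1}(a) F_{k-r+1}(a)'\, da \right)^{-1} \int_0^1 F_{k-r+1}(a)\, dW_u(a) \qquad (22)$$
where $z_r \sim N(0, I_r)$ is distributed independently of the second term in (22).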
The asymptotic distribution of the Wald statistic W of (21) depends on the dimension and
cointegration rank of the forcing variables $\{x_t\}$, $k$ and $r$ respectively. In Case IV, referring to
(11), the first component in (22), $z_r' z_r \sim \chi^2(r)$, corresponds to testing for the exclusion of the $r$-
dimensional stationary vector $\beta_{xx}' x_{t-1}$, that is, the hypothesis $\alpha_{yx} - \omega' \alpha_{xx} = 0'$, whereas the second
term in (22), which is a non-standard Dickey-Fuller unit-root distribution, corresponds to testing
for the exclusion of the $(k - r + 1)$-dimensional I(1) vector $(\beta_y^{\perp}, \beta^{\perp})' z_{t-1}$ and, in Cases II and
IV, the intercept and time-trend respectively or, equivalently, $\alpha_{yy} = 0$.
We specialize Theorem 3.1 to the two polar cases in which, first, the process for the forcing
variables $\{x_t\}$ is purely integrated of order zero, that is, $r = k$ and $\Pi_{xx}$ is of full rank, and, second,
the $\{x_t\}$ process is not mutually cointegrated, $r = 0$, and, hence, the $\{x_t\}$ process is purely integrated
of order one.
Corollary 3.1 (Limiting distribution of W if $\{x_t\} \sim$ I(0)). If Assumptions 1-4 and 5a hold
and $r = k$, that is, $\{x_t\} \sim$ I(0), then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the
asymptotic distribution of the Wald statistic W of (21) has the representation
where $F_{k+1}(a)$ is defined in Theorem 3.1 for Cases I-V, $a \in [0, 1]$.
$\{x_t\}$ process is $0 < r < k$.11 Hence, these two sets of critical values provide critical value
bounds covering all possible classifications of $\{x_t\}$ into I(0), I(1) and mutually cointegrated
processes. Asymptotic critical value bounds for the F-statistics covering Cases I-V are set out in
Tables CI(i)-CI(v) for sizes 0.100, 0.050, 0.025 and 0.010; the lower bound values assume that
the forcing variables $\{x_t\}$ are purely I(0), and the upper bound values assume that $\{x_t\}$ are purely
I(1).12
Hence, we suggest a bounds procedure to test $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17) within the
conditional ECMs (12)-(16). If the computed Wald or F-statistics fall outside the critical value
bounds, a conclusive decision results without needing to know the cointegration rank $r$ of the
$\{x_t\}$ process. If, however, the Wald or F-statistic fall within these bounds, inference would be
inconclusive. In such circumstances, knowledge of the cointegration rank $r$ of the forcing variables
$\{x_t\}$ is required to proceed further.
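The decision rule just described can be sketched in a few lines. The function name and return strings are hypothetical; the example bounds 4.94 and 5.73 are read from the k = 1 row of the third panel of Table CI and are taken here to be the 5% bounds for the unrestricted-intercept, no-trend case.

```python
def bounds_decision(f_stat: float, lower: float, upper: float) -> str:
    """Classify an F-statistic against the I(0) (lower) and I(1) (upper) bounds."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper:
        return "reject H0"           # conclusive: evidence of a level relationship
    if f_stat < lower:
        return "do not reject H0"    # conclusive: no evidence of a level relationship
    return "inconclusive"            # the cointegration rank r is needed to go further


# Assumed 5% bounds for k = 1 (third panel of Table CI)
LOWER, UPPER = 4.94, 5.73
for stat in (6.98, 5.38, 3.00):
    print(stat, bounds_decision(stat, LOWER, UPPER))
```

The same structure applies to the Wald version of the test once the bounds are rescaled; for the t-statistic the inequalities reverse because the critical values are negative.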
The conditional ECMs (12)-(16), derived from the underlying VAR(p) model (2), may also be
interpreted as an autoregressive distributed lag model of orders $(p, p, \ldots, p)$ (ARDL$(p, \ldots, p)$).
However, one could also allow for differential lag lengths on the lagged variables $y_{t-i}$ and
$x_{t-i}$ in (2) to arrive at, for example, an ARDL$(p, p_1, \ldots, p_k)$ model without affecting the
asymptotic results derived in this section. Hence, our approach is quite general in the sense that
one can use a flexible choice for the dynamic lag structure in (12)-(16) as well as allowing
for short-run feedbacks from the lagged dependent variables, $\Delta y_{t-i}$, $i = 1, \ldots, p$, to $\Delta x_t$ in
(7). Moreover, within the single-equation context, the above analysis is more general than the
cointegration analysis of partial systems carried out by Boswijk (1992, 1995), HJNR, Johansen
(1992, 1995), PSS, and Urbain (1992), where it is assumed in addition that $\Pi_{xx} = 0$ or $x_t$ is purely
I(1) in (7).
To conclude this section, we reconsider the approach of BDM. There are three scenarios for
the deterministics given by (12), (14) and (16). Note that the restrictions on the deterministics'
coefficients (9) are ignored in Cases II of (13) and IV of (15) and, thus, Cases II and IV are now
subsumed by Cases III of (14) and V of (16) respectively. As noted below (11), BDM impose
but do not test the implicit hypothesis $\alpha_{yx} - \omega' \alpha_{xx} = 0'$; that is, the limiting distributional results
given below are also obtained under the joint hypothesis $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17). BDM
test $\alpha_{yy} = 0$ (or $H_0^{\pi_{yy}}: \pi_{yy} = 0$) via the exclusion of $y_{t-1}$ in Cases I, III and V. For example, in
Case V, they consider the t-statistic
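$$t_{\pi_{yy}} \equiv \frac{\hat{y}_{-1}' \bar{P}_{\Delta \hat{Z}_-, \hat{X}_{-1}} \Delta \hat{y}}{\hat{\omega}_{uu}^{1/2} \left( \hat{y}_{-1}' \bar{P}_{\Delta \hat{Z}_-, \hat{X}_{-1}} \hat{y}_{-1} \right)^{1/2}} \qquad (24)$$
where $\hat{\omega}_{uu}$ is defined in the line after (21), $\Delta \hat{y} \equiv P_{\iota,\tau} \Delta y$, $\hat{y}_{-1} \equiv P_{\iota,\tau} y_{-1}$, $y_{-1} \equiv
(y_0, \ldots, y_{T-1})'$, $\hat{X}_{-1} \equiv P_{\iota,\tau} X_{-1}$, $X_{-1} \equiv (x_0, \ldots, x_{T-1})'$, $\Delta \hat{Z}_- \equiv P_{\iota,\tau} \Delta Z_-$, $P_{\iota,\tau} \equiv P_{\iota} -
P_{\iota} \tau_T (\tau_T' P_{\iota} \tau_T)^{-1} \tau_T' P_{\iota}$, $\bar{P}_{\Delta \hat{Z}_-, \hat{X}_{-1}} \equiv \bar{P}_{\Delta \hat{Z}_-} - \bar{P}_{\Delta \hat{Z}_-} \hat{X}_{-1} (\hat{X}_{-1}' \bar{P}_{\Delta \hat{Z}_-} \hat{X}_{-1})^{-1} \hat{X}_{-1}' \bar{P}_{\Delta \hat{Z}_-}$ and $\bar{P}_{\Delta \hat{Z}_-} \equiv
I_T - \Delta \hat{Z}_- (\Delta \hat{Z}_-' \Delta \hat{Z}_-)^{-1} \Delta \hat{Z}_-'$.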
11 The critical values of the Wald and F-statistics in the general case (not reported here) may be computed via stochastic
simulations with different combinations of values for $k$ and $0 < r < k$.
12 The critical values for the Wald version of the bounds test are given by $k + 1$ times the critical values of the F-test in
Cases I, III and V, and $k + 2$ times in Cases II and IV.
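The conversion in footnote 12 can be written out as a small helper. This is only a sketch: the function name and the use of Roman-numeral strings to label the cases are assumptions made for illustration.

```python
def wald_critical_value(f_crit: float, k: int, case: str) -> float:
    """Scale an F critical value to its Wald counterpart, following footnote 12."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # k + 1 restrictions in Cases I, III and V; k + 2 in Cases II and IV
    multipliers = {"I": k + 1, "II": k + 2, "III": k + 1, "IV": k + 2, "V": k + 1}
    if case not in multipliers:
        raise ValueError("case must be one of I, II, III, IV, V")
    return multipliers[case] * f_crit


# Example: an F bound of 5.73 with k = 1 in Case III corresponds to a Wald bound of 2 * 5.73
wald_upper = wald_critical_value(5.73, k=1, case="III")
```

The multiplier is simply the number of restrictions being tested, which is why Cases II and IV, where a deterministic term sits inside the level relationship, carry one extra restriction.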
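Table CI. Asymptotic critical value bounds for the F-statistic. Testing for the existence of a levels
relationship^a
Table CI(i), Case I. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)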
0 3.00 3.00 4.20 4.20 5.47 5.47 7.17 7.17 1.16 1.16 2.32 2.32
1 2.44 3.28 3.15 4.11 3.88 4.92 4.81 6.02 1.08 1.54 1.08 1.73
2 2.17 3.19 2.72 3.83 3.22 4.50 3.88 5.30 1.05 1.69 0.70 1.27
3 2.01 3.10 2.45 3.63 2.87 4.16 3.42 4.84 1.04 1.77 0.52 0.99
4 1.90 3.01 2.26 3.48 2.62 3.90 3.07 4.44 1.03 1.81 0.41 0.80
5 1.81 2.93 2.14 3.34 2.44 3.71 2.82 4.21 1.02 1.84 0.34 0.67
6 1.75 2.87 2.04 3.24 2.32 3.59 2.66 4.05 1.02 1.86 0.29 0.58
7 1.70 2.83 1.97 3.18 2.22 3.49 2.54 3.91 1.02 1.88 0.26 0.51
8 1.66 2.79 1.91 3.11 2.15 3.40 2.45 3.79 1.02 1.89 0.23 0.46
9 1.63 2.75 1.86 3.05 2.08 3.33 2.34 3.68 1.02 1.90 0.20 0.41
10 1.60 2.72 1.82 2.99 2.02 3.27 2.26 3.60 1.02 1.91 0.19 0.37
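Table CI(ii), Case II. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)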
0 3.80 3.80 4.60 4.60 5.39 5.39 6.44 6.44 2.03 2.03 1.77 1.77
1 3.02 3.51 3.62 4.16 4.18 4.79 4.94 5.58 1.69 2.02 1.01 1.25
2 2.63 3.35 3.10 3.87 3.55 4.38 4.13 5.00 1.52 2.02 0.69 0.96
3 2.37 3.20 2.79 3.67 3.15 4.08 3.65 4.66 1.41 2.02 0.52 0.78
4 2.20 3.09 2.56 3.49 2.88 3.87 3.29 4.37 1.34 2.01 0.42 0.65
5 2.08 3.00 2.39 3.38 2.70 3.73 3.06 4.15 1.29 2.00 0.35 0.56
6 1.99 2.94 2.27 3.28 2.55 3.61 2.88 3.99 1.26 2.00 0.30 0.49
7 1.92 2.89 2.17 3.21 2.43 3.51 2.73 3.90 1.23 2.01 0.26 0.44
8 1.85 2.85 2.11 3.15 2.33 3.42 2.62 3.77 1.21 2.01 0.23 0.40
9 1.80 2.80 2.04 3.08 2.24 3.35 2.50 3.68 1.19 2.01 0.21 0.36
10 1.76 2.77 1.98 3.04 2.18 3.28 2.41 3.61 1.17 2.00 0.19 0.33
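Table CI(iii), Case III. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)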
0 6.58 6.58 8.21 8.21 9.80 9.80 11.79 11.79 3.05 3.05 7.07 7.07
1 4.04 4.78 4.94 5.73 5.77 6.68 6.84 7.84 2.03 2.52 2.28 2.89
2 3.17 4.14 3.79 4.85 4.41 5.52 5.15 6.36 1.69 2.35 1.23 1.77
3 2.72 3.77 3.23 4.35 3.69 4.89 4.29 5.61 1.51 2.26 0.82 1.27
4 2.45 3.52 2.86 4.01 3.25 4.49 3.74 5.06 1.41 2.21 0.60 0.98
5 2.26 3.35 2.62 3.79 2.96 4.18 3.41 4.68 1.34 2.17 0.48 0.79
6 2.12 3.23 2.45 3.61 2.75 3.99 3.15 4.43 1.29 2.14 0.39 0.66
7 2.03 3.13 2.32 3.50 2.60 3.84 2.96 4.26 1.26 2.13 0.33 0.58
8 1.95 3.06 2.22 3.39 2.48 3.70 2.79 4.10 1.23 2.12 0.29 0.51
9 1.88 2.99 2.14 3.30 2.37 3.60 2.65 3.97 1.21 2.10 0.25 0.45
10 1.83 2.94 2.06 3.24 2.28 3.50 2.54 3.86 1.19 2.09 0.23 0.41
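Table CI(iv), Case IV. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)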
0 5.37 5.37 6.29 6.29 7.14 7.14 8.26 8.26 3.17 3.17 2.68 2.68
1 4.05 4.49 4.68 5.15 5.30 5.83 6.10 6.73 2.45 2.77 1.41 1.65
2 3.38 4.02 3.88 4.61 4.37 5.16 4.99 5.85 2.09 2.57 0.92 1.20
3 2.97 3.74 3.38 4.23 3.80 4.68 4.30 5.23 1.87 2.45 0.67 0.93
4 2.68 3.53 3.05 3.97 3.40 4.36 3.81 4.92 1.72 2.37 0.51 0.76
5 2.49 3.38 2.81 3.76 3.11 4.13 3.50 4.63 1.62 2.31 0.42 0.64
6 2.33 3.25 2.63 3.62 2.90 3.94 3.27 4.39 1.54 2.27 0.35 0.55
7 2.22 3.17 2.50 3.50 2.76 3.81 3.07 4.23 1.48 2.24 0.31 0.49
8 2.13 3.09 2.38 3.41 2.62 3.70 2.93 4.06 1.44 2.22 0.27 0.44
9 2.05 3.02 2.30 3.33 2.52 3.60 2.79 3.93 1.40 2.20 0.24 0.40
10 1.98 2.97 2.21 3.25 2.42 3.52 2.68 3.84 1.36 2.18 0.22 0.36
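Table CI(v), Case V. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)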
0 9.81 9.81 11.64 11.64 13.36 13.36 15.73 15.73 5.33 5.33 11.35 11.35
1 5.59 6.26 6.56 7.30 7.46 8.27 8.74 9.63 3.17 3.64 3.33 3.91
2 4.19 5.06 4.87 5.85 5.49 6.59 6.34 7.52 2.44 3.09 1.70 2.23
3 3.47 4.45 4.01 5.07 4.52 5.62 5.17 6.36 2.08 2.81 1.08 1.51
4 3.03 4.06 3.47 4.57 3.89 5.07 4.40 5.72 1.86 2.64 0.77 1.14
5 2.75 3.79 3.12 4.25 3.47 4.67 3.93 5.23 1.72 2.53 0.59 0.91
6 2.53 3.59 2.87 4.00 3.19 4.38 3.60 4.90 1.62 2.45 0.48 0.75
7 2.38 3.45 2.69 3.83 2.98 4.16 3.34 4.63 1.54 2.39 0.40 0.64
8 2.26 3.34 2.55 3.68 2.82 4.02 3.15 4.43 1.48 2.35 0.34 0.56
9 2.16 3.24 2.43 3.56 2.67 3.87 2.97 4.24 1.43 2.31 0.30 0.49
10 2.07 3.16 2.33 3.46 2.56 3.76 2.84 4.10 1.40 2.28 0.26 0.44
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = P x_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and
$\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $x_t$ is purely I(1), $P = I_k$ whereas $P = 0$ if $x_t$
is purely I(0). The critical values for $k = 0$ correspond to the squares of the critical values of Dickey and Fuller's (1979)
unit root t-statistics for Cases I, III and V, while they match those for Dickey and Fuller's (1981) unit root F-statistics
for Cases II and IV. The columns headed 'I(0)' refer to the lower critical values bound obtained when $x_t$ is purely I(0),
while the columns headed 'I(1)' refer to the upper bound obtained when $x_t$ is purely I(1).
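The simulation design in the note can be sketched as follows for the unrestricted-intercept, no-trend case. Several things here are assumptions rather than details given in the note: the test regression (Δy_t on y_{t-1}, x_{t-1} and a constant, with the F-statistic testing exclusion of the lagged levels), the sample size and replication count (kept small so the sketch runs quickly), and the function name. It is not intended to reproduce the tabulated values to two decimals.

```python
import numpy as np


def simulate_f_stats(k=1, T=200, reps=2000, x_is_i1=True, seed=0):
    """Simulate F-statistics for excluding (y_{t-1}, x_{t-1}) when y_t is a pure random walk."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    ones = np.ones((T, 1))
    for r in range(reps):
        eps = rng.standard_normal((T, k + 1))
        # y_t = y_{t-1} + e_1t with y_0 = 0
        y = np.concatenate(([0.0], np.cumsum(eps[:, 0])))
        # x_t = P x_{t-1} + e_2t with x_0 = 0: P = I_k gives I(1), P = 0 gives I(0)
        shocks = eps[:, 1:]
        x_path = np.cumsum(shocks, axis=0) if x_is_i1 else shocks
        x = np.vstack([np.zeros((1, k)), x_path])
        dy = np.diff(y)
        # Unrestricted regression: lagged levels plus intercept
        regressors = np.column_stack([y[:-1], x[:-1], ones])
        beta = np.linalg.lstsq(regressors, dy, rcond=None)[0]
        resid = dy - regressors @ beta
        rss_u = resid @ resid
        # Restricted regression: intercept only
        rss_r = np.sum((dy - dy.mean()) ** 2)
        q = k + 1                               # number of excluded lagged levels
        dof = T - regressors.shape[1]
        stats[r] = ((rss_r - rss_u) / q) / (rss_u / dof)
    return stats


if __name__ == "__main__":
    for label, flag in (("I(0)", False), ("I(1)", True)):
        f = simulate_f_stats(k=1, x_is_i1=flag, seed=42)
        print(label, "simulated 95% quantile:", round(float(np.quantile(f, 0.95)), 2))
```

The simulated quantile for the I(1) design should sit above the one for the I(0) design, which is the ordering that makes the two columns usable as upper and lower bounds.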
Theorem 3.2 (Limiting distribution of $t_{\pi_{yy}}$). If Assumptions 1-4 and 5a hold and $\gamma_{xy} = 0$, where
$\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic
distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
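where
$$\int_0^1 dW_u(a)\, F(a) \left( \int_0^1 F(a)^2\, da \right)^{-1/2}$$
where
$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ \tilde{W}_u(a) & \text{Case III} \\ \hat{W}_u(a) & \text{Case V} \end{cases}$$
and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.
$$\int_0^1 dW_u(a)\, F_k(a) \left( \int_0^1 F_k(a)^2\, da \right)^{-1/2}$$
where $F_k(a)$ is defined in Theorem 3.2 for Cases I, III and V.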
lower and upper bounds respectively for those corresponding to the general case considered in
Theorem 3.2. Hence, a bounds procedure for testing $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on these two polar cases
may be implemented as described above based on the t-statistic $t_{\pi_{yy}}$ for the exclusion of $y_{t-1}$ in
the conditional ECMs (12), (14) and (16) without prior knowledge of the cointegrating rank $r$.
These asymptotic critical value bounds are given in Tables CII(i), CII(iii) and CII(v) for Cases I,
III and V for sizes 0.100, 0.050, 0.025 and 0.010.
As is emphasized in the Proof of Theorem 3.2 given in Appendix A, if the asymptotic analysis
for the t-statistic $t_{\pi_{yy}}$ of (24) is conducted under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ only, the resultant limit distribution
for $t_{\pi_{yy}}$ depends on the nuisance parameter $\omega - \phi$ in addition to the cointegrating rank $r$, where,
under Assumption 5a, $\alpha_{yx} - \phi' \alpha_{xx} = 0'$. Moreover, if $\Delta y_t$ is allowed to Granger-cause $\Delta x_t$, that is,
$\gamma_{xy,i} \neq 0$ for some $i = 1, \ldots, p - 1$, then the limit distribution also is dependent on the nuisance
parameter $\gamma_{xy}/(\gamma_{yy} - \phi' \gamma_{xy})$; see Appendix A. Consequently, in general, where $\omega \neq \phi$ or $\gamma_{xy} \neq 0$,
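Table CII. Asymptotic critical value bounds of the t-statistic. Testing for the existence of a levels relationship
Table CII(i), Case I. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)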
0 -1.62 -1.62 -1.95 -1.95 -2.24 -2.24 -2.58 -2.58 -0.42 -0.42 0.98 0.98
1 -1.62 -2.28 -1.95 -2.60 -2.24 -2.90 -2.58 -3.22 -0.42 -0.98 0.98 1.12
2 -1.62 -2.68 -1.95 -3.02 -2.24 -3.31 -2.58 -3.66 -0.42 -1.39 0.98 1.12
3 -1.62 -3.00 -1.95 -3.33 -2.24 -3.64 -2.58 -3.97 -0.42 -1.71 0.98 1.09
4 -1.62 -3.26 -1.95 -3.60 -2.24 -3.89 -2.58 -4.23 -0.42 -1.98 0.98 1.07
5 -1.62 -3.49 -1.95 -3.83 -2.24 -4.12 -2.58 -4.44 -0.42 -2.22 0.98 1.05
6 -1.62 -3.70 -1.95 -4.04 -2.24 -4.34 -2.58 -4.67 -0.42 -2.43 0.98 1.04
7 -1.62 -3.90 -1.95 -4.23 -2.24 -4.54 -2.58 -4.88 -0.42 -2.63 0.98 1.04
8 -1.62 -4.09 -1.95 -4.43 -2.24 -4.72 -2.58 -5.07 -0.42 -2.81 0.98 1.04
9 -1.62 -4.26 -1.95 -4.61 -2.24 -4.89 -2.58 -5.25 -0.42 -2.98 0.98 1.04
10 -1.62 -4.42 -1.95 -4.76 -2.24 -5.06 -2.58 -5.44 -0.42 -3.15 0.98 1.03
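Table CII(iii), Case III. Column pairs give the I(0) and I(1) bounds for sizes 0.100, 0.050, 0.025 and 0.010, followed by the mean and variance.
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)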
0 -2.57 -2.57 -2.86 -2.86 -3.13 -3.13 -3.43 -3.43 -1.53 -1.53 0.72 0.71
1 -2.57 -2.91 -2.86 -3.22 -3.13 -3.50 -3.43 -3.82 -1.53 -1.80 0.72 0.81
2 -2.57 -3.21 -2.86 -3.53 -3.13 -3.80 -3.43 -4.10 -1.53 -2.04 0.72 0.86
3 -2.57 -3.46 -2.86 -3.78 -3.13 -4.05 -3.43 -4.37 -1.53 -2.26 0.72 0.89
4 -2.57 -3.66 -2.86 -3.99 -3.13 -4.26 -3.43 -4.60 -1.53 -2.47 0.72 0.91
5 -2.57 -3.86 -2.86 -4.19 -3.13 -4.46 -3.43 -4.79 -1.53 -2.65 0.72 0.92
6 -2.57 -4.04 -2.86 -4.38 -3.13 -4.66 -3.43 -4.99 -1.53 -2.83 0.72 0.93
7 -2.57 -4.23 -2.86 -4.57 -3.13 -4.85 -3.43 -5.19 -1.53 -3.00 0.72 0.94
8 -2.57 -4.40 -2.86 -4.72 -3.13 -5.02 -3.43 -5.37 -1.53 -3.16 0.72 0.96
9 -2.57 -4.56 -2.86 -4.88 -3.13 -5.18 -3.42 -5.54 -1.53 -3.31 0.72 0.96
10 -2.57 -4.69 -2.86 -5.03 -3.13 -5.34 -3.43 -5.68 -1.53 -3.46 0.72 0.96
Copyright © 2001 John Wiley & Sons, Ltd. J. Appl. Econ. 16: 289-326 (2001)
304 M. H. PESARAN, Y. SHIN AND R. J. SMITH
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -3.13 -3.13 -3.41 -3.41 -3.65 -3.66 -3.96 -3.97 -2.18 -2.18 0.57 0.57
1 -3.13 -3.40 -3.41 -3.69 -3.65 -3.96 -3.96 -4.26 -2.18 -2.37 0.57 0.67
2 -3.13 -3.63 -3.41 -3.95 -3.65 -4.20 -3.96 -4.53 -2.18 -2.55 0.57 0.74
3 -3.13 -3.84 -3.41 -4.16 -3.65 -4.42 -3.96 -4.73 -2.18 -2.72 0.57 0.79
4 -3.13 -4.04 -3.41 -4.36 -3.65 -4.62 -3.96 -4.96 -2.18 -2.89 0.57 0.82
5 -3.13 -4.21 -3.41 -4.52 -3.65 -4.79 -3.96 -5.13 -2.18 -3.04 0.57 0.85
6 -3.13 -4.37 -3.41 -4.69 -3.65 -4.96 -3.96 -5.31 -2.18 -3.20 0.57 0.87
7 -3.13 -4.53 -3.41 -4.85 -3.65 -5.14 -3.96 -5.49 -2.18 -3.34 0.57 0.88
8 -3.13 -4.68 -3.41 -5.01 -3.65 -5.30 -3.96 -5.65 -2.18 -3.49 0.57 0.90
9 -3.13 -4.82 -3.41 -5.15 -3.65 -5.44 -3.96 -5.79 -2.18 -3.62 0.57 0.91
10 -3.13 -4.96 -3.41 -5.29 -3.65 -5.59 -3.96 -5.94 -2.18 -3.75 0.57 0.92
wt = 0 Case I
wt = 1 Case III
wt =(1, t)' Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = P x_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$,
and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $x_t$ is purely I(1), $P = I_k$, whereas $P = 0$
if $x_t$ is purely I(0). The critical values for $k = 0$ correspond to those of Dickey and Fuller's (1979) unit root t-statistics.
The columns headed 'I(0)' refer to the lower critical value bound obtained when $x_t$ is purely I(0), while the columns
headed 'I(1)' refer to the upper bound obtained when $x_t$ is purely I(1).
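The simulation design described in the table notes can be sketched in Python as follows. This is a minimal illustration, not the authors' program: the function name `simulate_t_bound`, the sample size, the number of replications, the seed, the starting value $x_0 = 0$, and the exact form of the test regression (a regression of $\Delta y_t$ on $y_{t-1}$, $x_{t-1}$ and $w_t$, with the t-ratio taken on $y_{t-1}$) are assumptions made for the example.

```python
import numpy as np


def simulate_t_bound(k, x_is_i1, case="III", T=200, reps=2000, level=0.05, seed=0):
    """Approximate one critical value bound for the t-ratio on y_{t-1}.

    Assumed test regression (an assumption of this sketch):
        dy_t = b * y_{t-1} + d' x_{t-1} + a' w_t + error,
    with w_t empty (Case I), 1 (Case III) or (1, t)' (Case V).
    """
    rng = np.random.default_rng(seed)
    trend = np.arange(1, T + 1, dtype=float)
    w = {
        "I": np.empty((T, 0)),
        "III": np.ones((T, 1)),
        "V": np.column_stack([np.ones(T), trend]),
    }[case]
    t_ratios = np.empty(reps)
    for r in range(reps):
        eps = rng.standard_normal((T, k + 1))
        # y_t = y_{t-1} + eps_1t with y_0 = 0
        y = np.concatenate([[0.0], np.cumsum(eps[:, 0])])
        # x_t = P x_{t-1} + eps_2t: P = I_k gives I(1), P = 0 gives I(0)
        shocks = eps[:, 1:]
        x = np.cumsum(shocks, axis=0) if x_is_i1 else shocks
        x = np.vstack([np.zeros((1, k)), x])  # x_0 = 0 (assumed)
        dy = np.diff(y)
        X = np.column_stack([y[:-1], x[:-1], w])
        beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
        resid = dy - X @ beta
        s2 = resid @ resid / (T - X.shape[1])
        var_b = s2 * np.linalg.inv(X.T @ X)[0, 0]
        t_ratios[r] = beta[0] / np.sqrt(var_b)
    # Lower-tail quantile of the simulated t-ratios
    return np.quantile(t_ratios, level)
```

Calling `simulate_t_bound(1, False)` and `simulate_t_bound(1, True)` gives rough counterparts of a lower and an upper bound for $k = 1$ in the $w_t = 1$ design. With so few replications and a moderate $T$, the results only approximate tabulated asymptotic values.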
although the t-statistic t, has a well-defined limiting distribution under H ' = 0, the above
IT - ~yy - 0, the above
bounds testing procedure for Hor : r,,,,
Consequently, in the light of the co
Section 4, see Theorems 4.1, 4.2 and 4.
the existence of a level relationship betw
based on the Wald or F-statistic of (21
proceed no further; (b) if Ho is rejecte
the t-statistic t,, of (24) from Coroll
t,!! should result, at least asymptotically
Yt and xt, which, however, may be deg
This section first demonstrates that the proposed bounds testing procedu
statistic of (21) described in Section 3 is consistent. Second, it derives the asym
14 In principle, the asymptotic distribution of t,V,, under H"!' : Ty,, = 0 may be simulated from t
given in the Proof of Theorem 3.2 of Appendix A after substitution of consistent estimators for 0
Ho' y = 0, where yY,x Y /y - /Xy. Although such estimators may be obtained straightfo
they necessitate the use of parameter estimators from the marginal ECM (7) for {xt}t°l
BOUNDS TESTING FOR LEVEL RELATIONSHIPS 30
Theorem 4.1 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions
1-4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ of (18) the Wald statistic W (21) is consistent against
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ in Cases I-V defined in (12)-(16).
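Theorem 4.2 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$). If
Assumptions 1-4 and 5a hold, then under $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ of (18) and $H_0^{\pi_{yy}}: \pi_{yy} = 0$ of (17) the
Wald statistic W (21) is consistent against $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ in Cases I-V defined in (12)-(16).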
Hence, combining Theorems 4.1 and 4.2, the bounds procedure of Section 3 based on the Wald
statistic W (21) defines a consistent test of $H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}}$ of (17) against $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$
of (18). This result holds irrespective of whether the forcing variables $\{x_t\}$ are purely I(0), purely
I(1) or mutually cointegrated.
We now turn to consider the asymptotic distribution of the Wald statistic (21)
specified sequence of local alternatives. Recall that under Assumption 5b, t7rV,v[
(ayytyy, ayalfiy + (aoty -W - w/a)/5x). Consequently, we define the sequence of
Theorem 4.3 (Limiting distribution of W under H ir). If Assumptions 1 -4 and 5a hold, then unde
H1T : ry.x = T y-lyyfy + T-1/2(8y - w'8xx)' of (26), as T -- oo, the asymptotic distribution
the Wald statistic W of (21) has the representation
where z, N (Q1/
xx)', is distributed in
Jkr.+i(a) Case I
Case IIII
(Jk-r+l (a)', 1)' Case
Fk-r+l(a)= < Jk+i(a) Case III
(J_.l (a)', a- 1/2)' Case IV
Jkr+l(a) Case V
Theorem 4.4 (Consistency of the t-statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions
1-4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ of (18) the t-statistic $t_{\pi_{yy}}$ (24) is consistent against
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ in Cases I, III and V defined in (12), (14) and (16).
As noted at the end of Section 3, Theorem 4.4 suggests the possibility of using $t_{\pi_{yy}}$ to
discriminate between $H_0^{\pi_{yy}}: \pi_{yy} = 0$ and $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, although, if $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is false,
the bounds procedure given via Corollaries 3.3 and 3.4 is not asymptotically similar.
Following the modelling approach described earlier, this section provides a re-examination of the
earnings equation included in the UK Treasury macroeconometric model described in Chan, Savage
and Whittaker (1995), CSW hereafter. The theoretical basis of the Treasury's earnings equation
is the bargaining model advanced in Nickell and Andrews (1983) and reviewed, for example, in
Layard et al. (1991, Chapter 2). Its theoretical derivation is based on a Nash bargaining framework
where firms and unions set wages to maximize a weighted average of firms' profits and unions'
utility. Following Darby and Wren-Lewis (1993), the theoretical real wage
the Treasury's earnings equation is given by
Prodt
off' dummy variables.17 Let $z_t = (w_t, Prod_t, UR_t, Wedge_t, Union_t)' = (w_t, x_t')'$. Then, using the
analysis of Section 2, the conditional ECM of interest can be written as
p-l
15 The wedge effect is further decomposed into a tax wedge and an import price wedge in the Treasury model, but this
decomposition is not pursued here.
16 It is important, however, that, at a future date, a fresh investigation of the possible effects of the replacement ratio on
real wages should be undertaken.
17 However, both the asymptotic theory and associated critical values must be modified if the fraction of periods in which
the dummy variables are non-zero does not tend to zero with the sample size T. In the present application, both dummy
variables included in the earnings equation are zero after 1979, and the fractions of observations where $D7475_t$ and $D7579_t$
are non-zero are only 7.6% and 19.2% respectively.
Under the assumption that lagged real wages, $w_{t-1}$, do not enter the sub-VAR model for $x_t$,
the above real wage equation is identified and can be estimated consistently by LS.18 Notice,
however, that this assumption does not rule out the inclusion of lagged changes in real wages in
the unemployment or productivity equations, for example. The exclusion of the level of real wages
from these equations is an identification requirement for the bargaining theory of wages which
permits it to be distinguished from other alternatives, such as the efficiency wage theory which
postulates that labour productivity is partly determined by the level of real wages.19 It is clear
that, in our framework, the bargaining theory and the efficiency wage theory cannot be entertained
simultaneously, at least not in the long run.
The above specification is also based on the assumption that the disturbances $u_t$ are serially
uncorrelated. It is therefore important that the lag order p of the underlying VAR is selected
appropriately. There is a delicate balance between choosing p sufficiently large to mitigate the
residual serial correlation problem and, at the same time, sufficiently small so that the conditional
ECM (30) is not unduly over-parameterized, particularly in view of the limited time series data
which are available.
Finally, a decision must be made concerning the time trend in (30) and whether its coefficient
should be restricted.20 This issue can only be settled in light of the particular sample period under
consideration. The time series data used are quarterly, cover the period 1970q1-1997q4, and are
seasonally adjusted (where relevant).21 To ensure comparability of results for different choices of
p, all estimations use the same sample period, 1972q1-1997q4 (T = 104), with the first eight
observations reserved for the construction of lagged variables.
The five variables in the earnings equation were constructed from primary sources in the
following manner: $w_t = \ln(ERPR_t/PYNONG_t)$, $Wedge_t = \ln(1 + TE_t) + \ln(1 - TD_t) - \ln(RPIX_t/PYNONG_t)$,
$UR_t = \ln(100 \times ILOU_t/(ILOU_t + WFEMP_t))$, $Prod_t = \ln((YPROM_t + 278.29 \times
YMF_t)/(EMF_t + ENMF_t))$, and $Union_t = \ln(UDEN_t)$, where $ERPR_t$ is average private sector
earnings per employee (£), $PYNONG_t$ is the non-oil non-government GDP deflator, $YPROM_t$
is output in the private, non-oil, non-manufacturing, and public traded sectors at constant factor
cost (£ million, 1990), $YMF_t$ is the manufacturing output index adjusted for stock changes
(1990 = 100), $EMF_t$ and $ENMF_t$ are respectively employment in UK manufacturing and
non-manufacturing sectors (thousands), $ILOU_t$ is the International Labour Office (ILO) measure
of unemployment (thousands), $WFEMP_t$ is total employment (thousands), $TE_t$ is the average
employers' National Insurance contribution rate, $TD_t$ is the average direct tax rate on
employment incomes, $RPIX_t$ is the Retail Price Index excluding mortgage payments, and $UDEN_t$ is
union density (used to proxy 'union power') measured by union membership as a percentage of
employment.22 The time series plots of the five variables included in the VAR model are given in
Figures 1-3.
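The variable definitions above translate directly into code. The following Python sketch is only an illustration: the function name `earnings_variables` and the input values in the usage lines are hypothetical and are not taken from the Treasury data set, and the ratio $RPIX_t/PYNONG_t$ inside the wedge term is read from the definition as given above.

```python
import math


def earnings_variables(ERPR, PYNONG, TE, TD, RPIX, ILOU, WFEMP,
                       YPROM, YMF, EMF, ENMF, UDEN):
    """Return the five log variables of the earnings equation for one quarter."""
    return {
        # real wage: log of earnings deflated by the non-oil non-government GDP deflator
        "w": math.log(ERPR / PYNONG),
        # wedge: employer tax, direct tax and relative price terms
        "Wedge": math.log(1 + TE) + math.log(1 - TD) - math.log(RPIX / PYNONG),
        # unemployment rate in per cent, in logs
        "UR": math.log(100 * ILOU / (ILOU + WFEMP)),
        # labour productivity: output per employee
        "Prod": math.log((YPROM + 278.29 * YMF) / (EMF + ENMF)),
        # union density
        "Union": math.log(UDEN),
    }


# Hypothetical inputs, chosen only to exercise the formulas.
example = earnings_variables(
    ERPR=4000.0, PYNONG=1.0, TE=0.10, TD=0.20, RPIX=1.0,
    ILOU=2000.0, WFEMP=23000.0, YPROM=90000.0, YMF=100.0,
    EMF=4000.0, ENMF=19000.0, UDEN=35.0,
)
print(example)
```

With these made-up inputs the unemployment rate is 8 per cent, so `example["UR"]` equals $\ln 8$.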
18 See Assumption 3 and the following discussion. By construction, the contemporaneous effects $\Delta x_t$ are uncorrelated
with the disturbance term $u_t$, and instrumental variable estimation, which has been particularly popular in the empirical
wage equation literature, is not necessary. Indeed, given the unrestricted nature of the lag distribution of the conditional
ECM (30), it is difficult to find suitable instruments: namely, variables that are not already included in the model, which
are uncorrelated with $u_t$ and also have a reasonable degree of correlation with the included variables in (30).
19 For a discussion of the issues that surround the identification of wage equations, see Manning (1993).
20 See, for example, PSS and the discussion in Section 2.
21 We are grateful to Andrew Gurney and Rod Whittaker for providing us with the data. For further details about the
sources and the descriptions of the variables, see CSW, pp. 46-51 and p. 11 of the Annex.
22 The data series for UDEN assumes a constant rate of unionization from 1980q4 onwards.
Figure 1. (a) Real wages and labour productivity. (b) Rate of change of real wages and labour productivity. [Horizontal axis: Quarters, 1972Q1-1997Q1.]
It is clear from Figure 1 that real wages (average earnings) and productivity show steadily rising
trends with real wages growing at a faster rate than productivity.23 This suggests, at least initially,
that a linear trend should be included in the real wage equation (30). Also the application of unit
root tests to the five variables, perhaps not surprisingly, yields mixed results with strong evidence
in favour of the unit root hypothesis only in the cases of real wages and productivity. This does
not necessarily preclude the other three variables (UR, Wedge, and Union) having levels impact
on real wages. Following the methodology developed in this paper, it is possible to test for the
existence of a real wage equation involving the levels of these five variables irrespective of whether
they are purely I(0), purely I(1), or mutually cointegrated.
23 Over the period 1972q1-97q4, real wages grew by 2.14% per annum as compared to labour productivity that increased
by an annual average rate of 1.54% over the same period.
[Figure: UNION and WEDGE series. Horizontal axis: Quarters, 1972Q1-1997Q1.]
[Figure: UR series. Horizontal axis: Quarters, 1972Q1-1997Q1.]
To determine the appropriate lag length p and whether a deterministic linear trend is required
in addition to the productivity variable, we estimated the conditional model (30) by LS, with
and without a linear time trend, for p = 1, 2, ..., 7. As pointed out earlier, all regressions were
computed over the same period 1972q1-1997q4. We found that lagged changes of the productivity
variable, $\Delta Prod_{t-1}, \Delta Prod_{t-2}, \ldots$, were insignificant (either singly or jointly) in all regressions.
Therefore, for the sake of parsimony and to avoid unnecessary over-parameterization, we decided
to re-estimate the regressions without these lagged variables, but including lagged changes of
all other variables. Table I gives Akaike's and Schwarz's Bayesian Information Criteria, denoted
respectively by AIC and SBC, and Lagrange multiplier (LM) statistics for testing the hypothesis
of no residual serial correlation against orders 1 and 4 denoted by $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ respectively.
As might be expected, the lag order selected by AIC, $\hat{p}_{aic} = 6$, irrespective of whether a
deterministic trend term is included or not, is much larger than that selected by SBC. This latter
criterion gives estimates $\hat{p}_{sbc} = 1$ if a trend is included and $\hat{p}_{sbc} = 4$ if not. The $\chi^2_{SC}$ statistics also
suggest using a relatively high lag order: 4 or more. In view of the importance of the assumption
of serially uncorrelated errors for the validity of the bounds tests, it seems prudent to select p to
be either 5 or 6.24 Nevertheless, for completeness, in what follows we report test results for p = 4
and 5, as well as for our preferred choice, namely p = 6. The results in Table I also indicate
that there is little to choose between the conditional ECM with or without a linear deterministic
trend.
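The lag-order comparison can be illustrated with a short Python sketch. It is an assumption-laden example rather than the authors' procedure: the data are synthetic random walks, the helper names `ecm_design` and `lag_order_criteria` are hypothetical, lagged changes of all variables are retained, the dummy variables and trend are omitted, and the criteria are written in the "larger is better" form $AIC = LL - s$ and $SBC = LL - (s/2)\ln T$, where $LL$ is the maximized Gaussian log-likelihood and $s$ the number of estimated coefficients. What it does preserve is the point made above that every regression is run over the same sample.

```python
import numpy as np


def ecm_design(z, p, start):
    """Build (dy, X) for a conditional ECM of order p, for t = start, ..., n-1.

    z is an (n, k+1) array whose first column is the dependent variable.
    Regressors: intercept, z_{t-1}, dx_t, and dz_{t-i} for i = 1, ..., p-1.
    """
    dz = np.diff(z, axis=0)  # dz[t-1] holds z_t - z_{t-1}
    rows = []
    for t in range(start, z.shape[0]):
        row = [1.0, *z[t - 1], *dz[t - 1, 1:]]
        for i in range(1, p):
            row.extend(dz[t - 1 - i])
        rows.append(row)
    return dz[start - 1:, 0], np.asarray(rows)


def lag_order_criteria(z, max_p):
    """Return {p: (AIC, SBC)}, with every regression run on the same sample."""
    start = max_p + 1  # reserve the first max_p + 1 observations for lags
    out = {}
    for p in range(1, max_p + 1):
        dy, X = ecm_design(z, p, start)
        T, s = X.shape
        beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
        rss = float(np.sum((dy - X @ beta) ** 2))
        loglik = -0.5 * T * (1.0 + np.log(2.0 * np.pi * rss / T))
        out[p] = (loglik - s, loglik - 0.5 * s * np.log(T))
    return out


# Synthetic data: 112 observations on three independent random walks.
rng = np.random.default_rng(0)
z = np.cumsum(rng.standard_normal((112, 3)), axis=0)
crit = lag_order_criteria(z, max_p=7)
p_aic = max(crit, key=lambda p: crit[p][0])
p_sbc = max(crit, key=lambda p: crit[p][1])
print(p_aic, p_sbc)
```

With 112 observations and `max_p=7`, eight observations are reserved and each regression uses T = 104, mirroring the sample arrangement described in the text. Because SBC penalizes each coefficient by $(\ln T)/2 > 1$ here, it tends to favour shorter lags than AIC.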
Table II gives the values of the F- and t-statistics for testing the existence of a level earnings
equation under three different scenarios for the deterministics, Cases III, IV and V of (14), (15)
and (16) respectively; see Sections 2 and 3 for detailed discussions.
The various statistics in Table II should be compared with the critical value bounds provided
in Tables CI and CII. First, consider the bounds F-statistic. As argued in PSS, the statistic $F_{IV}$,
which sets the trend coefficient to zero under the null hypothesis of no level relationship, Case
IV of (15), is more appropriate than $F_V$, Case V of (16), which ignores this constraint. Note that,
if the trend coefficient $c_1$ is not subject to this restriction, (30) implies a quadratic trend in the
level of real wages under the null hypothesis of $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$, which is empirically
implausible. The critical value bounds for the statistics $F_{IV}$ and $F_V$ are given in Tables CI(iv) and
CI(v). Since k = 4, the 0.05 critical value bounds are (3.05, 3.97) and (3.47, 4.57) for $F_{IV}$ and
$F_V$, respectively.25 The test outcome depends on the choice of the lag order p. For p = 4, the
Table I. Statistics for selecting the lag order of the earnings equation
     With deterministic trends     Without deterministic trends
p    F_IV    F_V    t_V            F_III    t_III
hypothesis that there exists no level earnings equation is not rejected at the 0.05 level, irrespective
of whether the regressors are purely I(0), purely I(1) or mutually cointegrated. For p = 5, the
bounds test is inconclusive. For p = 6 (selected by AIC), the statistic $F_V$ is still inconclusive, but
$F_{IV} = 4.78$ lies outside the 0.05 critical value bounds and rejects the null hypothesis that there
exists no level earnings equation, irrespective of whether the regressors are purely I(0), purely
I(1) or mutually cointegrated.26 This finding is even more conclusive when the bounds F-test is
applied to the earnings equations without a linear trend. The relevant test statistic is $F_{III}$ and the
associated 0.05 critical value bounds are (2.86, 4.01).27 For p = 4, $F_{III} = 3.63$, and the test result
is inconclusive. However, for p = 5 and 6, the values of $F_{III}$ are 5.23 and 5.42 respectively and
the hypothesis of no levels earnings equation is conclusively rejected.
The results from the application of the bounds t-test to the earnings equations are less clear-cut
and do not allow the imposition of the trend restrictions discussed above. The 0.05 critical value
bounds for $t_{III}$ and $t_V$, when k = 4, are (-2.86, -3.99) and (-3.41, -4.36).28 Therefore, if a
linear trend is included, the bounds t-test does not reject the null even if p = 5 or 6. However,
when the trend term is excluded, the null is rejected for p = 5. Overall, these test results support
the existence of a levels earnings equation when a sufficiently high lag order is selected and
when the statistically insignificant deterministic trend term is excluded from the conditional ECM
(30). Such a specification is in accord with the evidence on the performance of the alternative
conditional ECMs set out in Table I.
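The way the bounds are read in the preceding paragraphs can be summarized in a small Python helper. The function name `bounds_decision` is hypothetical and the t-statistic values used in the example are made up for illustration; the F-statistics and all critical value bounds are those quoted in the text.

```python
def bounds_decision(stat, lower, upper, kind="F"):
    """Classify a bounds test outcome against the I(0) and I(1) bounds.

    kind="F": reject above the upper (I(1)) bound, do not reject below the
              lower (I(0)) bound, otherwise inconclusive.
    kind="t": the bounds are negative, so reject below the I(1) bound and
              do not reject above the I(0) bound.
    """
    if kind == "F":
        if stat > upper:
            return "reject"
        if stat < lower:
            return "do not reject"
        return "inconclusive"
    if kind == "t":
        if stat < upper:
            return "reject"
        if stat > lower:
            return "do not reject"
        return "inconclusive"
    raise ValueError("kind must be 'F' or 't'")


# F-statistics and 0.05 bounds quoted in the text (k = 4).
print(bounds_decision(4.78, 3.05, 3.97))   # F_IV, p = 6
print(bounds_decision(3.63, 2.86, 4.01))   # F_III, p = 4
print(bounds_decision(5.23, 2.86, 4.01))   # F_III, p = 5
print(bounds_decision(5.42, 2.86, 4.01))   # F_III, p = 6

# Hypothetical t-statistics against the quoted 0.05 bounds for t_III.
print(bounds_decision(-4.20, -2.86, -3.99, kind="t"))
print(bounds_decision(-3.50, -2.86, -3.99, kind="t"))
```

The inconclusive region is what distinguishes the procedure from a conventional test: a statistic falling between the two bounds cannot be classified without knowing the order of integration of the regressors.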
In testing the null hypothesis that there are no level effects in (30), namely ($\pi_{ww} = 0$, $\pi_{wx.x} = 0'$),
it is important that the coefficients of lagged changes remain unrestricted, otherwise these tests
could be subject to a pre-testing problem. However, for the subsequent estimation of levels effects
and short-run dynamics of real wage adjustments, the use of a more parsimonious specification
seems advisable. To this end we adopt the ARDL approach to the estimation of the level relations
discussed in Pesaran and Shin (1999).29 First, the (estimated) orders of an ARDL($p, p_1, p_2, p_3, p_4$)
model in the five variables ($w_t$, $Prod_t$, $UR_t$, $Wedge_t$, $Union_t$) were selected by searching across
the $7^5 = 16,807$ ARDL models, spanned by $p = 0, 1, \ldots, 6$, and $p_i = 0, 1, \ldots, 6$, $i = 1, \ldots, 4$,
using the AIC criterion.30 This resulted in the choice of an ARDL(6, 0, 5, 4, 5) specification with
estimates of the levels relationship given by
$$w_t = \underset{(0.050)}{1.063}\,Prod_t - \underset{(0.034)}{0.105}\,UR_t - \underset{(0.265)}{0.943}\,Wedge_t + \underset{(0.311)}{1.481}\,Union_t + \underset{(0.242)}{2.701} + v_t \qquad (31)$$
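As a rough illustration of the model search and of how the estimated levels relationship (31) is used, the Python sketch below enumerates the grid of candidate lag orders and evaluates the equilibrium error $v_t$ implied by the point estimates. The function names and the input values in the last lines are hypothetical; only the coefficients, the selected orders (6, 0, 5, 4, 5) and the count of 16,807 models come from the text.

```python
from itertools import product

# Point estimates of the levels relationship (31).
COEF = {"Prod": 1.063, "UR": -0.105, "Wedge": -0.943, "Union": 1.481}
INTERCEPT = 2.701


def level_wage(Prod, UR, Wedge, Union):
    """Level of the log real wage implied by (31), excluding v_t."""
    return (INTERCEPT + COEF["Prod"] * Prod + COEF["UR"] * UR
            + COEF["Wedge"] * Wedge + COEF["Union"] * Union)


def equilibrium_error(w, Prod, UR, Wedge, Union):
    """v_t: the deviation of the observed log real wage from the fitted level."""
    return w - level_wage(Prod, UR, Wedge, Union)


# Candidate orders: each of the five lag orders ranges over 0, 1, ..., 6.
orders = list(product(range(7), repeat=5))
print(len(orders))  # 7**5 candidate ARDL specifications

# Hypothetical values of the regressors, for illustration only.
print(equilibrium_error(w=3.5, Prod=1.0, UR=2.0, Wedge=-0.6, Union=-0.5))
```

In an actual search each tuple in `orders` would index one ARDL regression, and the specification with the best information criterion would be retained; that estimation step is not reproduced here.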
29 Note that the ARDL approach advanced in Pesaran and Shin (1999) is applicable irrespective of whether the regressors
are purely I(0), purely I(1) or mutually cointegrated.
30 For further details, see Section 18.19 and Lesson 16.5 in Pesaran and Pesaran (1997).
31 CSW do not report standard errors for the levels estimates of the Treasury earnings equation.
32 We are grateful to a referee for drawing our attention to this point.
33 Clearly, it is possible to simplify the model further, but this would go beyond the remit of this section which is first to
test for the existence of a level relationship using an unrestricted ARDL specification and, second, if we are satisfied that
such a levels relationship exists, to select a parsimonious specification.
34 The standard errors of the estimates reported in Table III allow for the uncertainty associated with the estimation of the
levels coefficients. This is important in the present application where it is not known with certainty whether the regressors
are purely I(0), purely I(1) or mutually cointegrated. It is only in the case when it is known for certain that all regressors
are I(1) that it would be reasonable in large samples to treat these estimates as known because of their super-consistency.
35 The equilibrium correction coefficient in the Treasury's earnings equation is estimated to be -0.1848 (0.052
is smaller than our estimate; see p. 11 in Annex of CSW. This seems to be because of the shorter lag lengths
Treasury's specification rather than the shorter time period 1971ql-1994q3. Note also that the t-ratio reporte
coefficient does not have the standard t-distribution; see Theorem 3.2.
36 The complex roots are 0.34293 ± 0.67703i and -0.17307 ± 0.61386i, where $i = \sqrt{-1}$.
the 0.05 level which may be linked to the presence of some non-linear effects or asymmetries in
the adjustment of the real wage process that our linear specification is incapable of taking into
account.37 Recursive estimation of the conditional ECM and the associated cumulative sum and
cumulative sum of squares plots also suggest that the regression coefficients are generally stable
over the sample period. However, these tests are known to have low power and, thus, may have
missed important breaks. Overall, the conditional ECM earnings equation presented in Table III
has a number of desirable features and provides a sound basis for further research.
6. CONCLUSIONS
We confine the main proof of Theorem 3.1 to that for Case IV and briefly detail t
necessary for the other cases. Under Assumptions 1-4 and 5a, the process {zt}l
moving-average representation,
38 For an excellent review of this early literature, see Hendry et al. (1984).
39 Of course, the system approach developed by Johansen (1991, 1995) can also be applied to a set of variables containin
possibly a mixture of 1(0) and 1(1) regressors.
where (gB, fBI) is a (k + 1, k- r + 1) matrix whose columns are a basis for the orthogonal
complement of f. Hence, (f, , B) is a basis for Zk+l. Let 4 be the (k + 2)-unit vector (1, 0')'
Then, (P,,, , 8) is a basis for Rk+2. It therefore follows that
where zt* = (t, z),) Bk+l (a) is a (k + 1)-vector Brownian motion with variance matrix Q and [T
denotes the integer part of Ta, a e [0, 1]; see Phillips and Solo (1992, Theorem 3.15, p. 983). Also,
T-l'zt* = T-lt = a. Similarly, noting that B'C = 0, we have that ft*,zt = Pt'/ + ItC*(L)Et =
Op(l). Hence, from Phillips and Solo (1992, Theorem 3.16, p. 983), defining V Z- P,Z* and
AZ_ - P,AZ_, it follows that
T-l ' = Z1
Op(l),fT-lZ*AZ
, Z_ = Op(l), T-1'Z_AZ = Op(l)
Cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3
The next result follows from Phillips and Solo (1992, Theorem 3.15, p
(1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146) and P
(1986).
Lemma A.1 Let BT (8, T-1/25) and define G(a) = (G (a)', G2(a))', whe
CBk+ (a), Bk+1(a)[= (B1(a)', Bk(a)')'] = Bk+1(a) - f0 Bk+1(a)da, and G2
Then
where B*(a) - 1
Proof of Theore
(tuW ~-uW
= - -1 up _ -1 1P_ PT_fi = U
= i'P
Z Z1A
A T (AZ'
TZ_P-1
Z*TA)
T A Z*'1P^Z
- _
A' (T-1/2fl/
P fi u+'*'pl--i (A5)
- \ T lBZ* + op() (A5)
Finally, the estimatorfor the error variance cou, (defined in the line afte
~/ -_ i'P
wuu = (T - m)-1 [u'f -- *Z1AT(AZP^
A-* --Z* -1ZA/,~.*/ - ]
-1iA) -AZP^
= (T - m)-ii + op(1) = ( o+ + op(1) (A6)
W = T-1i'P- Z*-1fi (T 1Z 1P z*
+ T P~-_ _T
-2UZi'B BT
[rT
-1 T--I'Br
BZ.Z.
T ~ --z B ' B(A7)
BZ*UiB _i/)oIu +
We consider each of the te
to state
(T-l/z2*p
T - 1-PT z*
-_ fi1)
- 1/*-1/2
T - _T /2 ii/o1/
-PT-N ,_ 2 = : Zr N(O, I,)
Hence, the first term in (A7) converges in distribution to z.z,., a chi-square random variable with
r degrees offreedom; that is,
dB*
a-U( )da)(a) 1 1
2O1a a 1 a-
\Jo a 2/
X,/ (( 1ay B al
a I
where B*(a) = B1(a) - w'Bk(a) is independent of Bk(a) and Bk+1(a) = (B1(a), Bk(a)')' is
titioned according to z, = (Yt, x)', a e [0, 1]. Hence, the second term in (A7) has the follo
asymptotic representation:
dW,(a 1 (a I 1 da
jd lW
aJ-a- (a) (Wk
- a- - ± (
x /l ( + (a-2 ) dl,(a) (A9)
Note that dWu(a) in (A9) may be replaced by dW, (a), a e [0, 1].
the result of Theorem 3.1.
For the remaining cases, we need only make minor modificati
In Case I, 8 = (fyir, 'il) with (P, fyi, ) a basis for Rk+1 and
-1 = (tr, Z 1)', we have
Ik+
Proof of Corollary 3.2 Follows immediately from Theorem 3.1 by setting $r = 0$. □
Proof of Theorem 3.2 We provide a prooffor Case V which may be simply adapted for
and III. To emphasize the potential dependence of the limit distribution on nuisance paramete
the proof is initially conducted under Assumptions 1-4 together with Assumption 5a which i
Ho'' tyy = 0 but not necessarily HO ' =rV.xX 0'; in particular, note that we may write a
(1, -o')' for some k-vector 0. The t-statistic for Ho'"! ' y = 0 may be expressed as the
root of
Tr-lP_iZiBr
T--UPlB( 2BT T (T ZI ZiBr)1
--B'' T -TZ
21B B T) T-1BT ^P i/r- (All)
' A0111
where, for convenience, but without loss of generality, we have set y = (p1
(0, y), ) y = Yxy /Yy.x, Yy. Yyy - Yx y y. Yyx - 0 x and (a) B (a)
B,(a) = B1(a) - 0/Bk(a), a e [0, 1]. Hence, (All) weakly converges to
,{ B
- /
x axx u \UIUk xx
x al
Under
012W1 (a) and ax'B(a)[= a Bk (a)] (a), a e [0, 1].
Proof of Corollary 3.4 Follows immediately from Theorem 3.2 by setting $r = 0$. □
Proof of Theorem 4.1 Again, we consider Case IV; the remaining Cases I-III and V may be
dealt with similarly. Under H ' : .7r,y / 0, Assumption Sb holds and, thus, n = ay py + ap' where
ay = (a, 0')' and fy = (Pyy, Px)'; see above Assumption 5b. Under Assumptions 1-4 and 5b,
the process {zt]lz has the infinite moving-average representation, zt = /i + yt + Cst + C*(L)st,
where now C - '[a "'p1]-la '. We redefine P* and 8 as the (k + 2, r + 1) and (k + 2, k - r
matrices,
and ,
8 (Y -f )1
1jk+1/
where is a (k + 1, k - r) matrix whose columns are a basis for the orthogonal complement of
(fly, j). Hence, (y, , (y, , ) is a basis for Z k+1 and, thus, (I*, , 8) a basis for Zk+2, where again
4 is the (k + 2)-unit vector (1, 0')'. It therefore follows that
Also, as above, T-10'z* = T-1t = a and P*z7 = (fy, P)Itt + (fy, 8)'C*(L)st = Op(l).
The Wald statistic (21) multiplied by eui may be written as
(B1)
(B 1)
where , (, a)'(1, -w')', AT - T-1/2(, T-1/2B) and Br - (, T-1/2k). Note that (A6)
continues to hold under H!'' 71 T y k O. A similar argument to that in the Proof of Theorem 3.1
demonstrates that the first term in (Bl) divided by w,,c has the limiting representation
where z,.+ 1 N(O, I,+i), Fk_,(a) = (Wk_,(a)', a - ) and Wk_ (a) (a 'Qa )- 1/
is a (k - r)-vector of de-meaned independent standard Brownian motions independen
standard Brownian motion W,(a), a e [0, 1]; cf (22). Now, fo Fk_,(a)dW,(a) is mixed
with conditional variance matrix fo Fk_,-(a)Fk_,(a)'da. Therefore, the second term i
unconditionally distributed as a X2(k - r) random variable and is independent of the first
(A4). Hence, the first term in (Bl) divided by Iw,i has a limiting X2(k + 1) distribution.
The second term in (Bl) may be written as
Proof of Theorem 4.2 A similar decomposition to (Bl) for the Wald statistic (21) holds under
HIV n Ho-'- except that f, and 8 are now as defined in the Proof of Theorem 3.1. Although
H 7ryy = 0 holds, we have Hl' ' Jryx, O'. Therefore, as in Theorem 3.2, note that we may
write al = (1, -')' for some k-vector 0 = w. Consequently, the first term divided by w,, may be
written as
+T-2UiZ*
-1 1BT
T [T-2B/rTZ*
T -1 lZ- iB B'Z- i/,,1 + o
cf (A7). As in the Pro
where z,. ~ N(0, I,.); cf
[I 7(/% Bk(a) \
x ( cx^Bk(a)
Coyrgh dB(a)/w,,,
©201 on ily Sns td J Api = Op(1)
Eon 6:28-36 201
JO \ a--
where B (a) Bl(a) - 'Bk(a), a e [0, 1]; cf. Proof of Theorem 3.2. The seco
becomes
Proof of Theorem 4.3 We concentrate on Case IV; the remaining Cases I-III and V are
proved by a similar argument. Let {ztr}lI denote the process under HIT of (26). Hence,
41(L)(ztT - L -yt) = StT, where tT - (fT - n)[z(t-_)T - - y(t - 1)] + Et and nT - n is
given in (27). Therefore, A(ztT - / - yt) = C tT+ C*(L)AtT, C(z) = C + (1 - z)C*(z) and
C = (fi-, B')[(a, a')'F(fl, al')] -(a, a1)', and thus,
s-1
x[Cs(t-i)T + C*(L)AU(t-i)T]
Note that ArT = (tri - n)A[z(t-_ ) - t - y(t - 1)] + Ast. It thereforefollows tha
( (g,, B )'CJk+l(a), where 8 is defined above Lemma A.1 and zt* = (t, z'T', Jk+
{ayfi,C(a - r)dBk+l (r) is an Ornstein-Uhlenbeck process and Bk+l (a) is a (k + 1)
nian motion with variance matrix Q, a E [0, 1]; cf Johansen (1995, Theorem 14.1
Similarly to (A4),
- r-2a'yP"
AP-IIZ
~ B-
TT,
-1B'Z-1
^,Br]
1PB_
rZ!P _ Ay + op(1) (B
rfiP--1 ~ T ,- ~ z -1 ,
+ 2T'i'Pz Z2* f(T l-' /Z* P^ Z-1 ) p Z* P Z*<
+wT
yT -1rT, 1z(laP^z
-1 yT_ 2* ( (T
where T T- 1auTyyfi' + T1/2( - w
T 2pl/2 /*1P'
rT-/2* P*'-z2*i<_-T1/2Z/
Z*- 1T, = T 2*'/2
p Z* 22 Z* (, _lPay
lP--
= T-lf*Z*lPAzZ* + op(l) (B9)
B/ZlT-2B'P ZL-Y*ay
=_Z+ T-32BT*
T7-1 ' , P_cif,P~_Z*_,y
Z- = '* ,T-B + o1 +
) op(l))
1 ( fy, fi )CJk+l (a) JT+fl ( La) ayyda
Therefore,
where Ju(a) = J1(a) - W'Jk(a) is independentf k(a) and o k+l(a) (Ji(a), Jk(a)')', a e [0, 1].
Now , Jk r+l(a) satisfies the stochastic integral and differential equations, Jk*-,+l (a) = Wk-r+
(a) + ab' Jo Jk- + (r) dr and dJk.+ 1 (a) = dWk-,.+1 (a) + ab'Jl .+1 (a) da, where a = [(a, a)'
Ž(af, al)]-1/2(f, a)'a and b = [(aL, a)'Q)(al, a')]1/2 x [(f, Bly)'r(al, a)]-( Y )
fy; cf. Johansen (1995, Theorem 14.4, p. 207). Note that the first element of Jk*-_+l (a) satisfies
J(a) = Wu(a) + wouu-/2ayb' f i Jk- ( (r) dr and dJ( = dW(a) = d +(a)C + 1/2 ' 1 (a) da.
Therefore,
T-1BT-*'IP-_AY
B lB Z*P j ((IY,
:=o CJk+l
d1*( (a) ) )1/2 d(a)
T -J I \ a I ^
noting T-1/2v u = Op(l). Similarly, as (1 - )(, a) 7 0', T-/Z'' 1AZ_ = Op(1), T-'1Z'2
X- iPxx_ = Op(l) and T-1''J IX_ i-l = Op(l). Therefore,
ACKNOWLEDGEMENTS
We are grateful to the Editor (David Hendry) and three anonymous refere
comments on an earlier version of this paper. Our thanks are also owed to Mic
Burridge, Clive Granger, Brian Henry, Joon-Yong Park, Ron Smith, Rod Whit
participants at the University of Birmingham. Partial financial support from
R000233608 and R000237334) and the Isaac Newton Trust of Trinity Coll
gratefully acknowledged. Previous versions of this paper appeared as DAE Wor
Nos. 9622 and 9907, University of Cambridge.
REFERENCES
Banerjee A, Dolado J, Galbraith JW, Hendry DF. 1993. Co-Integration, Error Correction, and the Econometric Analysis of Non-Stationary Data. Oxford University Press: Oxford.
Banerjee A, Dolado J, Mestre R. 1998. Error-correction mechanism tests for cointegration in a single-equation framework. Journal of Time Series Analysis 19: 267-283.
Banerjee A, Galbraith JW, Hendry DF, Smith GW. 1986. Exploring equilibrium relationships in econometrics through static models: some Monte Carlo evidence. Oxford Bulletin of Economics and Statistics 48: 253-277.
Blanchard OJ, Summers L. 1986. Hysteresis and the European Unemployment Problem. In NBER Macroeconomics Annual 15-78.
Boswijk P. 1992. Cointegration, Identification and Exogeneity: Inference in Structural Error Correction Models. Tinbergen Institute Research Series.
Boswijk HP. 1994. Testing for an unstable root in conditional and structural error correction models. Journal of Econometrics 63: 37-70.
Boswijk HP. 1995. Efficient inference on cointegration parameters in structural error correction models. Journal of Econometrics 69: 133-158.
Cavanagh CL, Elliott G, Stock JH. 1995. Inference in models with nearly integrated regressors. Econometric Theory 11: 1131-1147.
Chan A, Savage D, Whittaker R. 1995. The new treasury model. Government Economic Service Working Paper No. 128, (Treasury Working Paper No. 70).
Darby J, Wren-Lewis S. 1993. Is there a cointegrating vector for UK wages? Journal of Economic Studies 20: 87-115.
Dickey DA, Fuller WA. 1979. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74: 427-431.
Dickey DA, Fuller WA. 1981. Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 49: 1057-1072.
Engle RF, Granger CWJ. 1987. Cointegration and error correction representation: estimation and testing. Econometrica 55: 251-276.
Granger CWJ, Lin J-L. 1995. Causality in the long run. Econometric Theory 11: 530-536.
Hansen BE. 1995. Rethinking the univariate approach to unit root testing: using covariates to increase power. Econometric Theory 11: 1148-1171.
Harbo I, Johansen S, Nielsen B, Rahbek A. 1998. Asymptotic inference on cointegrating rank in partial systems. Journal of Business Economics and Statistics 16: 388-399.
Hendry DF, Pagan AR, Sargan JD. 1984. Dynamic specification. In Handbook of Econometrics, Vol. II, Griliches Z, Intriligator MD (eds). Elsevier: Amsterdam.
Johansen S. 1991. Estimation and hypothesis testing of cointegrating vectors in Gaussian vector autoregressive models. Econometrica 59: 1551-1580.
Johansen S. 1992. Cointegration in partial systems and the efficiency of single-equation analysis. Journal of Econometrics 52: 389-402.
Johansen S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford University Press: Oxford.
Kremers JJM, Ericsson NR, Dolado JJ. 1992. The power of cointegration tests. Oxford Bulletin of Economics and Statistics 54: 325-348.
Layard R, Nickell S, Jackman R. 1991. Unemployment: Macroeconomic Performance and the Labour Market. Oxford University Press: Oxford.
Lindbeck A, Snower D. 1989. The Insider Outsider Theory of Employment and Unemployment. MIT Press: Cambridge, MA.
Manning A. 1993. Wage bargaining and the Phillips curve: the identification and specification of aggregate wage equations. Economic Journal 103: 98-118.
Nickell S, Andrews M. 1983. Real wages and employment in Britain. Oxford Economic Papers 35:
Nielsen B, Rahbek A. 1998. Similarity issues in cointegration analysis. Preprint No. 7, Department of Theoretical Statistics, University of Copenhagen.
Park JY. 1990. Testing for unit roots by variable addition. In Advances in Econometrics: Cointegration, Spurious Regressions and Unit Roots, Fomby TB, Rhodes RF (eds). JAI Press: Greenwich, CT.
Pesaran MH, Pesaran B. 1997. Working with Microfit 4.0: Interactive Econometric Analysis. Oxford University Press: Oxford.
Pesaran MH, Shin Y. 1999. An autoregressive distributed lag modelling approach to cointegration analysis. Chapter 11 in Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial Symposium, Strom S (ed.). Cambridge University Press: Cambridge.
Pesaran MH, Shin Y, Smith RJ. 2000. Structural analysis of vector error correction models with exogenous I(1) variables. Journal of Econometrics 97: 293-343.
Phillips AW. 1958. The relationship between unemployment and the rate of change of money wage rates in the United Kingdom, 1861-1957. Economica 25: 283-299.
Phillips PCB, Durlauf S. 1986. Multiple time series with integrated variables. Review of Economic Studies 53: 473-496.
Phillips PCB, Ouliaris S. 1990. Asymptotic properties of residual based tests for cointegration. Econometrica 58: 165-193.
Phillips PCB, Solo V. 1992. Asymptotics for linear processes. Annals of Statistics 20: 971-1001.
Rahbek A, Mosconi R. 1999. Cointegration rank inference with stationary regressors in VAR models. Econometrics Journal 2: 76-91.
Sargan JD. 1964. Real wages and prices in the U.K. Econometric Analysis of National Economic Planning, Hart PE, Mills G, Whittaker JK (eds). Macmillan: New York. Reprinted in Hendry DF, Wallis KF (eds), Econometrics and Quantitative Economics. Basil Blackwell: Oxford; 275-314.
Shin Y. 1994. A residual-based test of the null of cointegration against the alternative of no cointegration. Econometric Theory 10: 91-115.
Stock J, Watson MW. 1988. Testing for common trends. Journal of the American Statistical Association 83: 1097-1107.
Urbain JP. 1992. On weak exogeneity in error correction models. Oxford Bulletin of Economics and Statistics 52: 187-202.
JOURNAL OF APPLIED ECONOMETRICS
J. Appl. Econ. 16: 289– 326 (2001)
DOI: 10.1002/jae.616
SUMMARY
This paper develops a new approach to the problem of testing the existence of a level relationship between
a dependent variable and a set of regressors, when it is not known with certainty whether the underlying
regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics
used to test the significance of the lagged levels of the variables in a univariate equilibrium correction
mechanism. The asymptotic distributions of these statistics are non-standard under the null hypothesis that
there exists no level relationship, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic
critical values are provided: one when all regressors are purely I(1) and the other if they are all purely
I(0). These two sets of critical values provide a band covering all possible classifications of the regressors
into purely I(0), purely I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are
proposed. It is shown that the proposed tests are consistent, and their asymptotic distributions under the null
and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is
demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric
model. Copyright © 2001 John Wiley & Sons, Ltd.
1. INTRODUCTION
Over the past decade considerable attention has been paid in empirical economics to testing for
the existence of relationships in levels between variables. In the main, this analysis has been
based on the use of cointegration techniques. Two principal approaches have been adopted: the
two-step residual-based procedure for testing the null of no-cointegration (see Engle and Granger,
1987; Phillips and Ouliaris, 1990) and the system-based reduced rank regression approach due to
Johansen (1991, 1995). In addition, other procedures such as the variable addition approach of Park
(1990), the residual-based procedure for testing the null of cointegration by Shin (1994), and the
stochastic common trends (system) approach of Stock and Watson (1988) have been considered.
All of these methods concentrate on cases in which the underlying variables are integrated of order
one. This inevitably involves a certain degree of pre-testing, thus introducing a further degree of
uncertainty into the analysis of levels relationships. (See, for example, Cavanagh, Elliott and Stock,
1995.)
This paper proposes a new approach to testing for the existence of a relationship between
variables in levels which is applicable irrespective of whether the underlying regressors are purely
* Correspondence to: M. H. Pesaran, Faculty of Economics and Politics, University of Cambridge, Sidgwick Avenue,
Cambridge CB3 9DD. E-mail: hashem.pesaran@econ.cam.ac.uk
Contract/grant sponsor: ESRC; Contract/grant numbers: R000233608; R000237334.
Contract/grant sponsor: Isaac Newton Trust of Trinity College, Cambridge.
Copyright 2001 John Wiley & Sons, Ltd. Received 16 February 1999
Revised 13 February 2001
I(0), purely I(1) or mutually cointegrated. The statistic underlying our procedure is the familiar
Wald or F-statistic in a generalized Dickey–Fuller type regression used to test the significance
of lagged levels of the variables under consideration in a conditional unrestricted equilibrium
correction model (ECM). It is shown that the asymptotic distributions of both statistics are
non-standard under the null hypothesis that there exists no relationship in levels between the
included variables, irrespective of whether the regressors are purely I(0), purely I(1) or mutually
cointegrated. We establish that the proposed test is consistent and derive its asymptotic distribution
under the null and suitably defined local alternatives, again for a set of regressors which are a
mixture of I(0)/I(1) variables.
Two sets of asymptotic critical values are provided for the two polar cases which assume that all
the regressors are, on the one hand, purely I(1) and, on the other, purely I(0). Since these two sets
of critical values provide critical value bounds for all classifications of the regressors into purely
I(1), purely I(0) or mutually cointegrated, we propose a bounds testing procedure. If the computed
Wald or F-statistic falls outside the critical value bounds, a conclusive inference can be drawn
without needing to know the integration/cointegration status of the underlying regressors. However,
if the Wald or F-statistic falls inside these bounds, inference is inconclusive and knowledge of the
order of the integration of the underlying variables is required before conclusive inferences can be
made. A bounds procedure is also provided for the related cointegration test proposed by Banerjee
et al. (1998) which is based on earlier contributions by Banerjee et al. (1986) and Kremers et al.
(1992). Their test is based on the t-statistic associated with the coefficient of the lagged dependent
variable in an unrestricted conditional ECM. The asymptotic distribution of this statistic is obtained
for cases in which all regressors are purely I(1), which is the primary context considered by these
authors, as well as when the regressors are purely I(0) or mutually cointegrated. The relevant
critical value bounds for this t-statistic are also detailed.
The empirical relevance of the proposed bounds procedure is demonstrated in a re-examination
of the earnings equation included in the UK Treasury macroeconometric model. This is a
particularly relevant application because there is considerable doubt concerning the order of
integration of variables such as the degree of unionization of the workforce, the replacement
ratio (unemployment benefit–wage ratio) and the wedge between the ‘real product wage’ and the
‘real consumption wage’ that typically enter the earnings equation. There is another consideration
in the choice of this application. Under the influence of the seminal contributions of Phillips (1958)
and Sargan (1964), econometric analysis of wages and earnings has played an important role in
the development of time series econometrics in the UK. Sargan’s work is particularly noteworthy
as it is some of the first to articulate and apply an ECM to wage rate determination. Sargan,
however, did not consider the problem of testing for the existence of a levels relationship between
real wages and its determinants.
The relationship in levels underlying the UK Treasury’s earnings equation relates real average
earnings of the private sector to labour productivity, the unemployment rate, an index of union
density, a wedge variable (comprising a tax wedge and an import price wedge) and the replacement
ratio (defined as the ratio of the unemployment benefit to the wage rate). These are the variables
predicted by the bargaining theory of wage determination reviewed, for example, in Layard
et al. (1991). In order to identify our model as corresponding to the bargaining theory of wage
determination, we require that the level of the unemployment rate enters the wage equation, but not
vice versa; see Manning (1993). This assumption, of course, does not preclude the rate of change
of earnings from entering the unemployment equation, or there being other level relationships
between the remaining four variables. Our approach accommodates both of these possibilities.
A number of conditional ECMs in these five variables were estimated and we found that, if a
sufficiently high order is selected for the lag lengths of the included variables, the hypothesis that
there exists no relationship in levels between these variables is rejected, irrespective of whether
they are purely I(0), purely I(1) or mutually cointegrated. Given a level relationship between these
variables, the autoregressive distributed lag (ARDL) modelling approach (Pesaran and Shin, 1999)
is used to estimate our preferred ECM of average earnings.
The plan of the paper is as follows. The vector autoregressive (VAR) model which underpins
the analysis of this and later sections is set out in Section 2. This section also addresses the
issues involved in testing for the existence of relationships in levels between variables. Section 3
considers the Wald statistic (or the F-statistic) for testing the hypothesis that there exists no
level relationship between the variables under consideration and derives the associated asymptotic
theory together with that for the t-statistic of Banerjee et al. (1998). Section 4 discusses the power
properties of these tests. Section 5 describes the empirical application. Section 6 provides some
concluding remarks. The Appendices detail proofs of results given in Sections 3 and 4.
The following notation is used. The symbol $\Rightarrow$ signifies ‘weak convergence in probability
measure’, $I_m$ ‘an identity matrix of order $m$’, $I(d)$ ‘integrated of order $d$’, $O_P(K)$ ‘of the same
order as $K$ in probability’ and $o_P(K)$ ‘of smaller order than $K$ in probability’.
Assumption 1 permits the elements of $z_t$ to be purely I(1), purely I(0) or cointegrated but excludes
the possibility of seasonal unit roots and explosive roots.¹ Assumption 2 may be relaxed somewhat
to permit $\{\varepsilon_t\}_{t=1}^{\infty}$ to be a conditionally mean zero and homoscedastic process; see, for example,
PSS, Assumption 4.1.
We may re-express the lag polynomial $\Phi(L)$ in vector equilibrium correction model (ECM)
form; i.e. $\Phi(L) \equiv -\Pi L + \Gamma(L)(1 - L)$ in which the long-run multiplier matrix is defined by
$\Pi \equiv -(I_{k+1} - \sum_{i=1}^{p} \Phi_i)$, and the short-run response matrix lag polynomial
$\Gamma(L) \equiv I_{k+1} - \sum_{i=1}^{p-1} \Gamma_i L^i$, $\Gamma_i = -\sum_{j=i+1}^{p} \Phi_j$, $i = 1, \ldots, p-1$. Hence, the VAR(p) model (1) may be rewritten in vector
ECM form as

$$
\Delta z_t = a_0 + a_1 t + \Pi z_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta z_{t-i} + \varepsilon_t, \quad t = 1, 2, \ldots \tag{2}
$$

¹ Assumptions 5a and 5b below further restrict the maximal order of integration of $\{z_t\}_{t=1}^{\infty}$ to unity.
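The mapping between the VAR coefficient matrices and the ECM parameters can be checked numerically. The following is a minimal sketch in Python with NumPy and is not part of the original paper; the function names `var_to_vecm` and `vecm_to_var` are hypothetical. It computes $\Pi$ and $\Gamma_1, \ldots, \Gamma_{p-1}$ from $\Phi_1, \ldots, \Phi_p$ using the definitions above, and inverts the mapping as a consistency check.

```python
import numpy as np


def var_to_vecm(phis):
    """Map VAR(p) matrices [Phi_1, ..., Phi_p] to (Pi, [Gamma_1, ..., Gamma_{p-1}])."""
    phis = [np.asarray(phi, dtype=float) for phi in phis]
    n = phis[0].shape[0]
    p = len(phis)
    # Long-run multiplier matrix: Pi = -(I - sum_i Phi_i)
    pi = -(np.eye(n) - sum(phis))
    # Short-run matrices: Gamma_i = -sum_{j=i+1}^{p} Phi_j, i = 1, ..., p-1
    # (phis is 0-based, so phis[i:] holds Phi_{i+1}, ..., Phi_p)
    gammas = [-sum(phis[i:]) for i in range(1, p)]
    return pi, gammas


def vecm_to_var(pi, gammas):
    """Invert the mapping: recover [Phi_1, ..., Phi_p] from (Pi, Gammas)."""
    pi = np.asarray(pi, dtype=float)
    n = pi.shape[0]
    p = len(gammas) + 1
    if p == 1:
        return [np.eye(n) + pi]
    phis = [np.eye(n) + pi + gammas[0]]                            # Phi_1
    phis += [gammas[i] - gammas[i - 1] for i in range(1, p - 1)]   # Phi_2, ..., Phi_{p-1}
    phis.append(-gammas[-1])                                       # Phi_p
    return phis
```

For a scalar VAR(2) with illustrative coefficients 0.5 and 0.3, the sketch gives a long-run multiplier of about -0.2 and a single short-run coefficient of -0.3, in line with the definitions of $\Pi$ and $\Gamma_i$.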
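where $u_t \sim IN(0, \omega_{uu})$, $\omega_{uu} \equiv \omega_{yy} - \omega_{yx}\Omega_{xx}^{-1}\omega_{xy}$ and $u_t$ is independent of $\varepsilon_{xt}$. Substitution of (4)
into (2) together with a similar partitioning of $a_0 = (a_{y0}, a_{x0}')'$, $a_1 = (a_{y1}, a_{x1}')'$, $\Pi = (\pi_y', \Pi_x')'$,
$\Gamma = (\gamma_y', \Gamma_x')'$, $\Gamma_i = (\gamma_{yi}', \Gamma_{xi}')'$, $i = 1, \ldots, p-1$, provides a conditional model for $\Delta y_t$ in terms of
$z_{t-1}$, $\Delta x_t$, $\Delta z_{t-1}, \ldots$; i.e. the conditional ECM

$$
\Delta y_t = c_0 + c_1 t + \pi_{y.x} z_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t, \quad t = 1, 2, \ldots \tag{5}
$$

where $\omega \equiv \Omega_{xx}^{-1}\omega_{xy}$, $c_0 \equiv a_{y0} - \omega' a_{x0}$, $c_1 \equiv a_{y1} - \omega' a_{x1}$, $\psi_i' \equiv \gamma_{yi} - \omega'\Gamma_{xi}$, $i = 1, \ldots, p-1$, and
$\pi_{y.x} \equiv \pi_y - \omega'\Pi_x$. The deterministic relations (3) are modified to

$$
c_0 = -\pi_{y.x}\mu + (\gamma_{y.x} + \pi_{y.x})\gamma, \qquad c_1 = -\pi_{y.x}\gamma \tag{6}
$$

where $\gamma_{y.x} \equiv \gamma_y - \omega'\Gamma_x$.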
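The conditioning step behind (5) is a regression of the $y$-equation error on the $x$-equation errors. As a minimal illustration (a sketch only, not from the original paper; the function name `conditional_parameters` and the numerical values are hypothetical), the quantities $\omega$, $\omega_{uu}$ and $\pi_{y.x}$ can be computed from a given error covariance matrix $\Omega$ and long-run matrix $\Pi$, with $y_t$ ordered first in $z_t$:

```python
import numpy as np


def conditional_parameters(omega_mat, pi_mat):
    """Return (omega, omega_uu, pi_y_dot_x) for z_t = (y_t, x_t')' with y_t ordered first."""
    omega_mat = np.asarray(omega_mat, dtype=float)
    pi_mat = np.asarray(pi_mat, dtype=float)
    w_yy = omega_mat[0, 0]          # omega_yy (scalar)
    w_xy = omega_mat[1:, 0]         # omega_xy (k-vector)
    big_omega_xx = omega_mat[1:, 1:]
    # omega = Omega_xx^{-1} omega_xy
    omega = np.linalg.solve(big_omega_xx, w_xy)
    # omega_uu = omega_yy - omega_yx Omega_xx^{-1} omega_xy
    omega_uu = w_yy - w_xy @ omega
    # pi_{y.x} = pi_y - omega' Pi_x, where pi_y is the first row of Pi
    pi_y_dot_x = pi_mat[0, :] - omega @ pi_mat[1:, :]
    return omega, omega_uu, pi_y_dot_x


# Illustrative values only (k = 1).
omega, omega_uu, pi_y_dot_x = conditional_parameters(
    [[2.0, 1.0], [1.0, 1.0]],
    [[-0.5, 0.2], [0.0, -0.1]],
)
```

Here $\omega_{uu}$ is the Schur complement of $\Omega_{xx}$ in $\Omega$, so it is strictly positive whenever $\Omega$ is positive definite.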
We now partition the long-run multiplier matrix $\Pi$ conformably with $z_t = (y_t, x_t')'$ as

$$
\Pi = \begin{pmatrix} \pi_{yy} & \pi_{yx} \\ \pi_{xy} & \Pi_{xx} \end{pmatrix}
$$
2 See also Nielsen and Rahbek (1998) for an analysis of similarity issues in cointegrated systems.
$t = 1, 2, \ldots$, where
Under Assumption 4, from (7), we may express $\Pi_{xx}$ as $\Pi_{xx} = \alpha_{xx}\beta_{xx}'$, where $\alpha_{xx}$ and $\beta_{xx}$ are both
$(k, r)$ matrices of full column rank; see, for example, Engle and Granger (1987) and Johansen
(1991). If the maximal order of integration of the system (8) and (7) is unity, under Assumptions
1, 3 and 4, the process $\{x_t\}_{t=1}^{\infty}$ is mutually cointegrated of order $r$, $0 \le r \le k$. However, in
contradistinction to, for example, Banerjee, Dolado and Mestre (1998), BDM henceforth, who
concentrate on the case $r = 0$, we do not wish to impose an a priori specification of $r$.⁶ When
$\pi_{xy} = 0$ and $\Pi_{xx} = 0$, then $x_t$ is weakly exogenous for $\pi_{yy}$ and $\pi_{yx.x} = \pi_{yx}$ in (8); see, for example,
Johansen (1995, Theorem 8.1, p. 122). In the more general case where $\Pi_{xx}$ is non-zero, as $\pi_{yy}$ and
$\pi_{yx.x} = \pi_{yx} - \omega'\Pi_{xx}$ are variation-free from the parameters in (7), $x_t$ is also weakly exogenous for
the parameters of (8).
Note that under Assumption 4 the maximal cointegrating rank of the long-run multiplier
matrix $\Pi$ for the system (8) and (7) is $r + 1$ and the minimal cointegrating rank of $\Pi$ is $r$. The
next assumptions provide the conditions for the maximal order of integration of the system (8)
and (7) to be unity. First, we consider the requisite conditions for the case in which $\mathrm{rank}(\Pi) = r$.
In this case, under Assumptions 1, 3 and 4, $\pi_{yy} = 0$ and $\pi_{yx} - \phi'\Pi_{xx} = 0'$ for some $k$-vector $\phi$.
Note that $\pi_{yx.x} = 0'$ implies the latter condition. Thus, under Assumptions 1, 3 and 4, $\Pi$ has rank
$r$ and is given by

$$
\Pi = \begin{pmatrix} 0 & \pi_{yx} \\ 0 & \Pi_{xx} \end{pmatrix}
$$

Hence, we may express $\Pi = \alpha\beta'$ where $\alpha = (\alpha_{yx}', \alpha_{xx}')'$ and $\beta = (0, \beta_{xx}')'$ are $(k+1, r)$ matrices of
full column rank; cf. HJNR, p. 390. Let the columns of the $(k+1, k-r+1)$ matrices $(\alpha_y^{\perp}, \alpha^{\perp})$
and $(\beta_y^{\perp}, \beta^{\perp})$, where $\alpha_y^{\perp}$, $\beta_y^{\perp}$ and $\alpha^{\perp}$, $\beta^{\perp}$ are respectively $(k+1)$-vectors and $(k+1, k-r)$
matrices, denote bases for the orthogonal complements of respectively $\alpha$ and $\beta$; in particular,
$(\alpha_y^{\perp}, \alpha^{\perp})'\alpha = 0$ and $(\beta_y^{\perp}, \beta^{\perp})'\beta = 0$.
Assumptions 1, 3, 4 and 5a and 5b permit the two polar cases for $\{x_t\}_{t=1}^{\infty}$. First, if $\{x_t\}_{t=1}^{\infty}$ is a
purely I(0) vector process, then $\Pi_{xx}$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$, are nonsingular. Second, if $\{x_t\}_{t=1}^{\infty}$
is purely I(1), then $\Pi_{xx} = 0$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$ are also null matrices.
Using (A.1) in Appendix A, it is easily seen that $\pi_{y.x}(z_t - \mu - \gamma t) = \pi_{y.x}C^*(L)\varepsilon_t$, where
$\{C^*(L)\varepsilon_t\}$ is a mean zero stationary process. Therefore, under Assumptions 1, 3, 4 and 5b, that is,
$\pi_{yy} \neq 0$, it immediately follows that there exists a conditional level relationship between $y_t$ and
$x_t$ defined by

$$
y_t = \theta_0 + \theta_1 t + \theta x_t + v_t, \quad t = 1, 2, \ldots \tag{10}
$$

where $\theta_0 \equiv \pi_{y.x}\mu/\pi_{yy}$, $\theta_1 \equiv \pi_{y.x}\gamma/\pi_{yy}$, $\theta \equiv -\pi_{yx.x}/\pi_{yy}$ and $v_t = \pi_{y.x}C^*(L)\varepsilon_t/\pi_{yy}$, also a zero mean
stationary process. If $\pi_{yx.x} = \alpha_{yy}\beta_{yx}' + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' \neq 0'$, the level relationship between $y_t$
and $x_t$ is non-degenerate. Hence, from (10), $y_t \sim I(0)$ if $\mathrm{rank}(\beta_{yx}, \beta_{xx}) = r$ and $y_t \sim I(1)$ if
$\mathrm{rank}(\beta_{yx}, \beta_{xx}) = r + 1$. In the former case, $\theta$ is the vector of conditional long-run multipliers and,
in this sense, (10) may be interpreted as a conditional long-run level relationship between $y_t$ and
$x_t$, whereas, in the latter, because the processes $\{y_t\}_{t=1}^{\infty}$ and $\{x_t\}_{t=1}^{\infty}$ are cointegrated, (10) represents
the conditional long-run level relationship between $y_t$ and $x_t$. Two degenerate cases arise. First,
if $\pi_{yy} \neq 0$ and $\pi_{yx.x} = 0'$, clearly, from (10), $y_t$ is (trend) stationary or $y_t \sim I(0)$ whatever the
value of $r$. Consequently, the differenced variable $\Delta y_t$ depends only on its own lagged level $y_{t-1}$
in the conditional ECM (8) and not on the lagged levels $x_{t-1}$ of the forcing variables. Second, if
$\pi_{yy} = 0$, that is, Assumption 5a holds, and $\pi_{yx.x} = (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' \neq 0'$, as $\mathrm{rank}(\Pi) = r$, $\pi_{yx.x} =
(\phi - \omega)'\alpha_{xx}\beta_{xx}'$ which, from the above, yields $\pi_{yx.x}(x_t - \mu_x - \gamma_x t) = \pi_{y.x}C^*(L)\varepsilon_t$, $t = 1, 2, \ldots$,
where $\mu = (\mu_y, \mu_x')'$ and $\gamma = (\gamma_y, \gamma_x')'$ are partitioned conformably with $z_t = (y_t, x_t')'$. Thus, in
(8), $\Delta y_t$ depends only on the lagged level $x_{t-1}$ through the linear combination $(\phi - \omega)'\alpha_{xx}$ of the
lagged mutually cointegrating relations $\beta_{xx}'x_{t-1}$ for the process $\{x_t\}_{t=1}^{\infty}$. Consequently, $y_t \sim I(1)$
whatever the value of $r$. Finally, if both $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$, there are no level effects in the
conditional ECM (8) with no possibility of any level relationship between $y_t$ and $x_t$, degenerate
or otherwise, and, again, $y_t \sim I(1)$ whatever the value of $r$.
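The four configurations just described can be summarized in a small helper. This is a minimal sketch under stated assumptions rather than anything from the original paper: the function name `classify_level_effects` is hypothetical, and the inputs are treated as known population values of $\pi_{yy}$ and $\pi_{yx.x}$, whereas in practice they are unknown and are the object of the tests developed below.

```python
import numpy as np


def classify_level_effects(pi_yy, pi_yx_x, tol=1e-12):
    """Classify the level relationship implied by (pi_yy, pi_yx.x) in the conditional ECM.

    Returns a (label, theta) pair; theta = -pi_yx.x / pi_yy is only reported
    in the non-degenerate case, otherwise it is None.
    """
    pi_yx_x = np.atleast_1d(np.asarray(pi_yx_x, dtype=float))
    own_level = abs(pi_yy) > tol                       # pi_yy != 0
    forcing_levels = bool(np.any(np.abs(pi_yx_x) > tol))  # pi_yx.x != 0'
    if own_level and forcing_levels:
        return "non-degenerate", -pi_yx_x / pi_yy
    if own_level:
        # y_t is (trend) stationary; x_{t-1} plays no role in the ECM
        return "degenerate: pi_yx.x = 0", None
    if forcing_levels:
        # y_t ~ I(1); only the cointegrating relations among x_{t-1} enter
        return "degenerate: pi_yy = 0", None
    return "no level effects", None


# Illustrative values only.
label, theta = classify_level_effects(-0.5, [0.25, -0.1])
```

With the illustrative inputs above, the implied long-run multipliers are $\theta = (0.5, -0.2)$, following $\theta \equiv -\pi_{yx.x}/\pi_{yy}$.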
Therefore, in order to test for the absence of level effects in the conditional ECM (8) and, more
crucially, the absence of a level relationship between $y_t$ and $x_t$, the emphasis in this paper is a
test of the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ in (8).⁷,⁸ In contradistinction, the approach of
BDM may be described in terms of (8) using Assumption 5b:

$$
\Delta y_t = c_0 + c_1 t + \alpha_{yy}(\beta_{yy} y_{t-1} + \beta_{yx}' x_{t-1}) + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \tag{11}
$$

BDM test for the exclusion of $y_{t-1}$ in (11) when $r = 0$, that is, $\beta_{xx} = 0$ in (11) or $\Pi_{xx} = 0$ in
(7) and, thus, $\{x_t\}$ is purely I(1); cf. HJNR and PSS.⁹ Therefore, BDM consider the hypothesis
$\alpha_{yy} = 0$ (or $\pi_{yy} = 0$).¹⁰ More generally, when $0 < r \le k$, BDM require the imposition of the
untested subsidiary hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distribution of the BDM test
is obtained under the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ in (8).
In the following sections of the paper, we focus on (8) and differentiate between five cases of
interest delineated according to how the deterministic components are specified:
• Case I (no intercepts; no trends) $c_0 = 0$ and $c_1 = 0$. That is, $\mu = 0$ and $\gamma = 0$. Hence, the
ECM (8) becomes

$$
\Delta y_t = \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \tag{12}
$$

• Case II (restricted intercepts; no trends) $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ and $c_1 = 0$. Here, $\gamma = 0$. The
ECM is

$$
\Delta y_t = \pi_{yy}(y_{t-1} - \mu_y) + \pi_{yx.x}(x_{t-1} - \mu_x) + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \tag{13}
$$
⁷ This joint hypothesis may be justified by the application of Roy’s union-intersection principle to tests of $\pi_{yy} = 0$
in (8) given $\pi_{yx.x}$. Let $W_{\pi_{yy}}(\pi_{yx.x})$ be the Wald statistic for testing $\pi_{yy} = 0$ for a given value of $\pi_{yx.x}$. The test
$\max_{\pi_{yx.x}} W_{\pi_{yy}}(\pi_{yx.x})$ is identical to the Wald test of $\pi_{yy} = 0$ and $\pi_{yx.x} = 0$ in (8).
⁸ A related approach to that of this paper is Hansen’s (1995) test for a unit root in a univariate time series which, in our
context, would require the imposition of the subsidiary hypothesis $\pi_{yx.x} = 0'$.
⁹ The BDM test is based on earlier contributions of Kremers et al. (1992), Banerjee et al. (1993), and Boswijk (1994).
¹⁰ Partitioning $\Gamma_{xi} = (\gamma_{xy,i}, \Gamma_{xx,i})$, $i = 1, \ldots, p-1$, conformably with $z_t = (y_t, x_t')'$, BDM also set $\gamma_{xy,i} = 0$, $i =
1, \ldots, p-1$, which implies $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$; that is, $\Delta y_t$ does not Granger cause $\Delta x_t$.
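• Case III (unrestricted intercepts; no trends) $c_0 \neq 0$ and $c_1 = 0$. The ECM is

$$
\Delta y_t = c_0 + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \tag{14}
$$

• Case IV (unrestricted intercepts; restricted trends) $c_0 \neq 0$ and $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$. The ECM is

$$
\Delta y_t = c_0 + \pi_{yy}(y_{t-1} - \gamma_y t) + \pi_{yx.x}(x_{t-1} - \gamma_x t) + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \tag{15}
$$

• Case V (unrestricted intercepts; unrestricted trends) $c_0 \neq 0$ and $c_1 \neq 0$. The ECM is

$$
\Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \tag{16}
$$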
It should be emphasized that the DGPs for Cases II and III are treated as identical as are those
for Cases IV and V. However, as in the test for a unit root proposed by Dickey and Fuller (1979)
compared with that of Dickey and Fuller (1981) for univariate models, estimation and hypothesis
testing in Cases III and V proceed ignoring the constraints linking respectively the intercept and
trend coefficient, $c_0$ and $c_1$, to the parameter vector $(\pi_{yy}, \pi_{yx.x})$ whereas Cases II and IV fully
incorporate the restrictions in (9).
In the following exposition, we concentrate on Case IV, that is, (15), which may be specialized
to yield the remainder.
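The five deterministic specifications differ only in which terms enter unrestrictedly and which are tied to the lagged levels. A minimal sketch follows (not from the original paper; the names `CASES`, `deterministic_regressors` and `n_tested_restrictions` are hypothetical). It assumes that a restricted term is estimated alongside the lagged levels and therefore counted among the tested coefficients; this matches the treatment of the trend in Case IV below and is assumed here, by analogy, for the intercept in Case II.

```python
import numpy as np

# Which deterministic terms enter freely and which are restricted to the
# level relationship, for Cases I-V.
CASES = {
    "I": {"unrestricted": (), "restricted": ()},
    "II": {"unrestricted": (), "restricted": ("intercept",)},
    "III": {"unrestricted": ("intercept",), "restricted": ()},
    "IV": {"unrestricted": ("intercept",), "restricted": ("trend",)},
    "V": {"unrestricted": ("intercept", "trend"), "restricted": ()},
}


def deterministic_regressors(case, T):
    """Return (unrestricted, restricted) deterministic columns, each of shape (T, q)."""
    columns = {"intercept": np.ones(T), "trend": np.arange(1.0, T + 1.0)}
    spec = CASES[case]

    def stack(names):
        if not names:
            return np.empty((T, 0))
        return np.column_stack([columns[name] for name in names])

    return stack(spec["unrestricted"]), stack(spec["restricted"])


def n_tested_restrictions(case, k):
    """Lagged levels of (y, x) contribute k + 1; restricted deterministics add one each."""
    return (k + 1) + len(CASES[case]["restricted"])
```

For example, with $k$ forcing variables the sketch counts $k + 2$ tested coefficients in Case IV and $k + 1$ in Cases III and V.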
However, as indicated in Section 2, not only does the alternative hypothesis $H_1$ of (18) cover the
case of interest in which $\pi_{yy} \neq 0$ and $\pi_{yx.x} \neq 0'$ but also permits $\pi_{yy} \neq 0$, $\pi_{yx.x} = 0'$ and $\pi_{yy} = 0$
and $\pi_{yx.x} \neq 0'$; cf. (8). That is, the possibility of degenerate level relationships between $y_t$ and $x_t$
is admitted under $H_1$ of (18). We comment further on these alternatives at the end of this section.
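For ease of exposition, we consider Case IV and rewrite (15) in matrix notation as

$$
\Delta y = \iota_T c_0 + Z_{-1}^* \pi_{y.x}^* + \Delta Z_- \psi + u \tag{19}
$$

where $\iota_T$ is a $T$-vector of ones, $\Delta y \equiv (\Delta y_1, \ldots, \Delta y_T)'$, $\Delta X \equiv (\Delta x_1, \ldots, \Delta x_T)'$, $\Delta Z_{-i} \equiv
(\Delta z_{1-i}, \ldots, \Delta z_{T-i})'$, $i = 1, \ldots, p-1$, $\psi \equiv (\omega', \psi_1', \ldots, \psi_{p-1}')'$, $\Delta Z_- \equiv (\Delta X, \Delta Z_{-1}, \ldots,
\Delta Z_{1-p})$, $Z_{-1}^* \equiv (\tau_T, Z_{-1})$, $\tau_T \equiv (1, \ldots, T)'$, $Z_{-1} \equiv (z_0, \ldots, z_{T-1})'$, $u \equiv (u_1, \ldots, u_T)'$ and

$$
\pi_{y.x}^* = \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix} \begin{pmatrix} \pi_{yy} \\ \pi_{yx.x}' \end{pmatrix}
$$

The least squares (LS) estimator of $\pi_{y.x}^*$ is given by:

$$
\hat{\pi}_{y.x}^* \equiv (\tilde{Z}_{-1}^{*\prime} \bar{P}_{\Delta\tilde{Z}_-} \tilde{Z}_{-1}^*)^{-1} \tilde{Z}_{-1}^{*\prime} \bar{P}_{\Delta\tilde{Z}_-} \Delta\tilde{y} \tag{20}
$$

where $\tilde{Z}_{-1}^* \equiv P_{\iota_T} Z_{-1}^*$, $\Delta\tilde{Z}_- \equiv P_{\iota_T} \Delta Z_-$, $\Delta\tilde{y} \equiv P_{\iota_T} \Delta y$, $P_{\iota_T} \equiv I_T - \iota_T(\iota_T'\iota_T)^{-1}\iota_T'$ and
$\bar{P}_{\Delta\tilde{Z}_-} \equiv I_T - \Delta\tilde{Z}_-(\Delta\tilde{Z}_-'\Delta\tilde{Z}_-)^{-1}\Delta\tilde{Z}_-'$. The Wald and the F-statistics for testing the null hypothesis $H_0$ of
(17) against the alternative hypothesis $H_1$ of (18) are respectively:

$$
W \equiv \hat{\pi}_{y.x}^{*\prime} \tilde{Z}_{-1}^{*\prime} \bar{P}_{\Delta\tilde{Z}_-} \tilde{Z}_{-1}^* \hat{\pi}_{y.x}^* / \hat{\omega}_{uu}, \qquad F \equiv \frac{W}{k+2} \tag{21}
$$

where $\hat{\omega}_{uu} \equiv (T - m)^{-1} \sum_{t=1}^{T} \tilde{u}_t^2$, $m \equiv (k+1)(p+1) + 1$ is the number of estimated coefficients
and $\tilde{u}_t$, $t = 1, 2, \ldots, T$, are the least squares (LS) residuals from (19).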
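The construction in (19)–(21) can be mirrored step by step in NumPy. The following is a minimal sketch for Case IV, not the authors' code: the function names `annihilator`, `case_iv_design` and `wald_f_case_iv` are hypothetical, the common lag order $p$ is taken as given, and the trend is indexed by the observation number, which differs from $\tau_T$ only by a constant that the intercept absorbs.

```python
import numpy as np


def annihilator(A):
    """Return I - A (A'A)^{-1} A', using the pseudo-inverse for the projection."""
    return np.eye(A.shape[0]) - A @ np.linalg.pinv(A)


def case_iv_design(y, X, p):
    """Build (dy, Z_star, dZ_minus) of (19) from levels y (T0,) and X (T0, k)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    Z = np.column_stack([y, X])            # rows are z_t' = (y_t, x_t')
    dZ = np.diff(Z, axis=0)                # dZ[j] = z_{j+1} - z_j
    t = np.arange(p, len(y))               # dates for which all lags are available
    dy = dZ[t - 1, 0]                      # Delta y_t
    dX = dZ[t - 1, 1:]                     # Delta x_t
    Z_star = np.column_stack([t.astype(float), Z[t - 1]])   # (trend, z_{t-1})
    lagged_diffs = [dZ[t - 1 - i] for i in range(1, p)]      # Delta z_{t-i}
    dZ_minus = np.column_stack([dX] + lagged_diffs)
    return dy, Z_star, dZ_minus


def wald_f_case_iv(y, X, p):
    """Wald and F statistics of (21) for the exclusion of the trend and lagged levels."""
    dy, Z_star, dZ_minus = case_iv_design(y, X, p)
    T = dy.shape[0]
    k = Z_star.shape[1] - 2
    P_iota = np.eye(T) - np.full((T, T), 1.0 / T)    # removes the intercept
    Zs, dZm, dys = P_iota @ Z_star, P_iota @ dZ_minus, P_iota @ dy
    P_bar = annihilator(dZm)                          # partials out Delta Z_-
    A = Zs.T @ P_bar @ Zs
    pi_hat = np.linalg.solve(A, Zs.T @ P_bar @ dys)   # equation (20)
    resid = P_bar @ (dys - Zs @ pi_hat)               # LS residuals from (19)
    m = (k + 1) * (p + 1) + 1                         # number of estimated coefficients
    omega_uu_hat = resid @ resid / (T - m)
    W = pi_hat @ A @ pi_hat / omega_uu_hat
    return W, W / (k + 2)


# Illustrative data only: independent random walks, so H0 holds by construction.
rng = np.random.default_rng(0)
X_demo = np.cumsum(rng.standard_normal((150, 2)), axis=0)
y_demo = np.cumsum(rng.standard_normal(150))
W_demo, F_demo = wald_f_case_iv(y_demo, X_demo, p=2)
```

By the usual partitioned-regression algebra, the quadratic form in (21) equals the difference between the restricted and unrestricted residual sums of squares divided by $\hat{\omega}_{uu}$, so the same statistic can be obtained from two ordinary least squares fits.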
The next theorem presents the asymptotic null distribution of the Wald statistic; the limit
behaviour of the F-statistic is a simple corollary and is not presented here or subsequently.
Let $W_{k-r+1}(a) \equiv (W_u(a), W_{k-r}(a)')'$ denote a $(k-r+1)$-dimensional standard Brownian motion
partitioned into the scalar and $(k-r)$-dimensional sub-vector independent standard Brownian
motions $W_u(a)$ and $W_{k-r}(a)$, $a \in [0, 1]$. We will also require the corresponding de-meaned
$(k-r+1)$-vector standard Brownian motion $\tilde{W}_{k-r+1}(a) \equiv W_{k-r+1}(a) - \int_0^1 W_{k-r+1}(a)\,da$, and
de-meaned and de-trended $(k-r+1)$-vector standard Brownian motion
$\hat{W}_{k-r+1}(a) \equiv \tilde{W}_{k-r+1}(a) - 12(a - \tfrac{1}{2}) \int_0^1 (a - \tfrac{1}{2}) \tilde{W}_{k-r+1}(a)\,da$, and their respective
partitioned counterparts $\tilde{W}_{k-r+1}(a) = (\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'$, and $\hat{W}_{k-r+1}(a) = (\hat{W}_u(a), \hat{W}_{k-r}(a)')'$, $a \in [0, 1]$.
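The de-meaning and de-trending operations can be illustrated on a discretized path. This is a minimal sketch, not part of the original paper; the function names are hypothetical, and the integrals over $[0, 1]$ are approximated by averages over an equally spaced grid, with $a - \tfrac{1}{2}$ evaluated at interval midpoints so that it averages to zero exactly.

```python
import numpy as np


def simulate_bm(n_steps, dim, rng):
    """Discretized standard Brownian motion on [0, 1]: scaled cumulative Gaussian sums."""
    increments = rng.standard_normal((n_steps, dim)) / np.sqrt(n_steps)
    return np.cumsum(increments, axis=0)


def demean(W):
    """W(a) minus its (approximate) integral over [0, 1]."""
    return W - W.mean(axis=0)


def demean_detrend(W):
    """De-meaned path minus 12 (a - 1/2) times the integral of (a - 1/2) W_tilde(a)."""
    n = W.shape[0]
    centred_a = (np.arange(1, n + 1) - 0.5) / n - 0.5    # grid values of a - 1/2
    W_tilde = demean(W)
    slope = 12.0 * (centred_a[:, None] * W_tilde).mean(axis=0)
    return W_tilde - centred_a[:, None] * slope


rng = np.random.default_rng(0)
W_path = simulate_bm(2000, 3, rng)       # e.g. k - r + 1 = 3 (illustrative)
W_tilde_path = demean(W_path)
W_hat_path = demean_detrend(W_path)
```

Up to discretization error, `W_tilde_path` integrates to zero and `W_hat_path` is in addition orthogonal to the linear trend $a - \tfrac{1}{2}$, which is what the factor 12 (the reciprocal of $\int_0^1 (a - \tfrac{1}{2})^2\,da$) delivers.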
Theorem 3.1 (Limiting distribution of W) If Assumptions 1–4 and 5a hold, then under $H_0$:
$\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of
(21) has the representation

$$
W \Rightarrow z_r' z_r + \int_0^1 dW_u(a) F_{k-r+1}(a)' \left( \int_0^1 F_{k-r+1}(a) F_{k-r+1}(a)'\,da \right)^{-1} \int_0^1 F_{k-r+1}(a)\,dW_u(a) \tag{22}
$$
The asymptotic distribution of the Wald statistic $W$ of (21) depends on the dimension and
cointegration rank of the forcing variables $\{x_t\}$, $k$ and $r$ respectively. In Case IV, referring to
(11), the first component in (22), $z_r' z_r \sim \chi^2(r)$, corresponds to testing for the exclusion of the
$r$-dimensional stationary vector $\beta_{xx}' x_{t-1}$, that is, the hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$, whereas the second
term in (22), which is a non-standard Dickey–Fuller unit-root distribution, corresponds to testing
for the exclusion of the $(k-r+1)$-dimensional I(1) vector $(\beta_y^{\perp}, \beta^{\perp})' z_{t-1}$ and, in Cases II and
IV, the intercept and time-trend respectively or, equivalently, $\alpha_{yy} = 0$.
We specialize Theorem 3.1 to the two polar cases in which, first, the process for the forcing
variables $\{x_t\}$ is purely integrated of order zero, that is, $r = k$ and $\Pi_{xx}$ is of full rank, and, second,
the $\{x_t\}$ process is not mutually cointegrated, $r = 0$, and, hence, the $\{x_t\}$ process is purely integrated
of order one.
Corollary 3.1 (Limiting distribution of W if $\{x_t\} \sim I(0)$). If Assumptions 1–4 and 5a hold
and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the
asymptotic distribution of the Wald statistic $W$ of (21) has the representation

$$
W \Rightarrow z_k' z_k + \frac{\left( \int_0^1 F(a)\,dW_u(a) \right)^2}{\int_0^1 F(a)^2\,da} \tag{23}
$$
Corollary 3.2 (Limiting distribution of W if $\{x_t\} \sim I(1)$). If Assumptions 1–4 and 5a hold
and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the
asymptotic distribution of the Wald statistic $W$ of (21) has the representation

$$
W \Rightarrow \int_0^1 dW_u(a) F_{k+1}(a)' \left( \int_0^1 F_{k+1}(a) F_{k+1}(a)'\,da \right)^{-1} \int_0^1 F_{k+1}(a)\,dW_u(a)
$$

where $F_{k+1}(a)$ is defined in Theorem 3.1 for Cases I–V, $a \in [0, 1]$.
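Limit functionals of this kind have no closed-form quantiles, so critical values are obtained by simulation. The sketch below is illustrative only and is not the authors' procedure: the function name `simulate_wald_limit_case_i` and all tuning values are hypothetical, and it assumes the simplest deterministic specification (Case I), for which $F_{k+1}(a)$ is taken to be the standard Brownian motion $W_{k+1}(a)$ itself, with $W_u$ its first element. The stochastic integrals are approximated by Itô sums on an equally spaced grid.

```python
import numpy as np


def simulate_wald_limit_case_i(k, n_steps=500, n_reps=2000, seed=0):
    """Draws from a discretized version of the Corollary 3.2 functional (Case I assumed)."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_reps)
    for rep in range(n_reps):
        dW = rng.standard_normal((n_steps, k + 1)) / np.sqrt(n_steps)
        W = np.cumsum(dW, axis=0)
        # Left-endpoint values F(a_{i-1}), so the sums below are Ito sums
        F = np.vstack([np.zeros((1, k + 1)), W[:-1]])
        dWu = dW[:, 0]                    # increments of W_u
        s = F.T @ dWu                     # approximates int F dW_u
        Q = F.T @ F / n_steps             # approximates int F F' da
        draws[rep] = s @ np.linalg.solve(Q, s)
    return draws


# Illustrative use: an approximate upper 5% point of the Wald functional for k = 2.
draws = simulate_wald_limit_case_i(k=2, n_steps=200, n_reps=500, seed=42)
approx_upper_point = np.quantile(draws, 0.95)
```

Dividing the draws by the number of tested coefficients converts them to the scale of the F-statistic; a serious tabulation would use far more replications and steps than this demonstration.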
In practice, however, it is unlikely that one would possess a priori knowledge of the rank $r$
of $\Pi_{xx}$; that is, the cointegration rank of the forcing variables $\{x_t\}$ or, more particularly, whether
$\{x_t\} \sim I(0)$ or $\{x_t\} \sim I(1)$. Long-run analysis of (12)–(16) predicated on a prior determination
of the cointegration rank $r$ in (7) is prone to the possibility of a pre-test specification error;
see, for example, Cavanagh et al. (1995). However, it may be shown by simulation that the
asymptotic critical values obtained from Corollaries 3.1 ($r = k$ and $\{x_t\} \sim I(0)$) and 3.2 ($r = 0$
and $\{x_t\} \sim I(1)$) provide lower and upper bounds respectively for those corresponding to the
general case considered in Theorem 3.1 when the cointegration rank of the forcing variables
$\{x_t\}$ process is $0 \le r \le k$.¹¹ Hence, these two sets of critical values provide critical value
bounds covering all possible classifications of $\{x_t\}$ into I(0), I(1) and mutually cointegrated
processes. Asymptotic critical value bounds for the F-statistics covering Cases I–V are set out in
Tables CI(i)–CI(v) for sizes 0.100, 0.050, 0.025 and 0.010; the lower bound values assume that
the forcing variables $\{x_t\}$ are purely I(0), and the upper bound values assume that $\{x_t\}$ are purely
I(1).¹²
Hence, we suggest a bounds procedure to test $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17) within the
conditional ECMs (12)–(16). If the computed Wald or F-statistics fall outside the critical value
bounds, a conclusive decision results without needing to know the cointegration rank $r$ of the
$\{x_t\}$ process. If, however, the Wald or F-statistic falls within these bounds, inference would be
inconclusive. In such circumstances, knowledge of the cointegration rank $r$ of the forcing variables
$\{x_t\}$ is required to proceed further.
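The decision rule of the bounds procedure is simple enough to state in a few lines. The following minimal sketch is illustrative rather than taken from the paper: the function name `bounds_decision` is hypothetical, it applies to the Wald or F-statistic (which rejects for large values), and the numerical bounds in the usage lines are placeholders, not entries from Tables CI(i)–CI(v).

```python
def bounds_decision(statistic, lower_bound, upper_bound):
    """Three-way outcome of the bounds test for a Wald or F-statistic.

    lower_bound: critical value assuming all regressors are purely I(0).
    upper_bound: critical value assuming all regressors are purely I(1).
    """
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if statistic > upper_bound:
        return "reject H0"            # conclusive whatever the value of r
    if statistic < lower_bound:
        return "do not reject H0"     # conclusive whatever the value of r
    return "inconclusive"             # the cointegration rank r must be known


# Placeholder bounds for illustration only.
outcomes = [bounds_decision(f, 3.0, 4.0) for f in (2.5, 3.5, 5.0)]
```

Only the middle outcome requires any knowledge of the integration or cointegration properties of the regressors, which is the practical appeal of the procedure.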
The conditional ECMs (12)–(16), derived from the underlying VAR($p$) model (2), may also be
interpreted as an autoregressive distributed lag model of orders $(p, p, \ldots, p)$ (ARDL($p, \ldots, p$)).
However, one could also allow for differential lag lengths on the lagged variables $y_{t-i}$ and
$x_{t-i}$ in (2) to arrive at, for example, an ARDL($p, p_1, \ldots, p_k$) model without affecting the
asymptotic results derived in this section. Hence, our approach is quite general in the sense that
one can use a flexible choice for the dynamic lag structure in (12)–(16) as well as allowing
for short-run feedbacks from the lagged dependent variables, $\Delta y_{t-i}$, $i = 1, \ldots, p$, to $\Delta x_t$ in
(7). Moreover, within the single-equation context, the above analysis is more general than the
cointegration analysis of partial systems carried out by Boswijk (1992, 1995), HJNR, Johansen
(1992, 1995), PSS, and Urbain (1992), where it is assumed in addition that $\Pi_{xx} = 0$ or $x_t$ is purely
$I(1)$ in (7).
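To conclude this section, we reconsider the approach of BDM. There are three scenarios for
the deterministics given by (12), (14) and (16). Note that the restrictions on the deterministics’
coefficients (9) are ignored in Cases II of (13) and IV of (15) and, thus, Cases II and IV are now
subsumed by Cases III of (14) and V of (16) respectively. As noted below (11), BDM impose
but do not test the implicit hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distributional results
given below are also obtained under the joint hypothesis $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17). BDM
test $\alpha_{yy} = 0$ (or $H_0^{\pi_{yy}}: \pi_{yy} = 0$) via the exclusion of $y_{t-1}$ in Cases I, III and V. For example, in
Case V, they consider the t-statistic
$$
t_{\pi_{yy}} = \frac{\hat{y}_{-1}' \bar{P}_{\Delta\hat{Z}_-,\hat{X}_{-1}} \Delta\hat{y}}{\hat{\omega}_{uu}^{1/2} \left(\hat{y}_{-1}' \bar{P}_{\Delta\hat{Z}_-,\hat{X}_{-1}} \hat{y}_{-1}\right)^{1/2}} \qquad (24)
$$
where $\hat{\omega}_{uu}$ is defined in the line after (21), $\Delta\hat{y} \equiv P_{\iota_T,\tau_T}\Delta y$, $\hat{y}_{-1} \equiv P_{\iota_T,\tau_T} y_{-1}$,
$y_{-1} \equiv (y_0, \ldots, y_{T-1})'$, $\hat{X}_{-1} \equiv P_{\iota_T,\tau_T} X_{-1}$, $X_{-1} \equiv (x_0, \ldots, x_{T-1})'$, $\Delta\hat{Z}_- \equiv P_{\iota_T,\tau_T}\Delta Z_-$,
$P_{\iota_T,\tau_T} \equiv P_{\iota_T} - P_{\iota_T}\tau_T(\tau_T' P_{\iota_T}\tau_T)^{-1}\tau_T' P_{\iota_T}$,
$\bar{P}_{\Delta\hat{Z}_-,\hat{X}_{-1}} \equiv \bar{P}_{\Delta\hat{Z}_-} - \bar{P}_{\Delta\hat{Z}_-}\hat{X}_{-1}(\hat{X}_{-1}'\bar{P}_{\Delta\hat{Z}_-}\hat{X}_{-1})^{-1}\hat{X}_{-1}'\bar{P}_{\Delta\hat{Z}_-}$ and
$\bar{P}_{\Delta\hat{Z}_-} \equiv I_T - \Delta\hat{Z}_-(\Delta\hat{Z}_-'\Delta\hat{Z}_-)^{-1}\Delta\hat{Z}_-'$.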
11 The critical values of the Wald and F-statistics in the general case (not reported here) may be computed via stochastic
simulations with different combinations of values for $k$ and $0 \le r \le k$.
12 The critical values for the Wald version of the bounds test are given by $k + 1$ times the critical values of the F-test in
Cases I, III and V, and $k + 2$ times in Cases II and IV.
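The scaling in footnote 12 is simple arithmetic and can be sketched as follows. The function name `wald_bounds_from_f` and the string case labels are hypothetical conveniences, not notation from the paper.

```python
def wald_bounds_from_f(f_lower, f_upper, k, case):
    """Convert F-statistic critical value bounds into Wald-statistic bounds.

    Uses the rule of footnote 12: multiply by k + 1 in Cases I, III and V,
    and by k + 2 in Cases II and IV (hypothetical helper).
    """
    if case not in ("I", "II", "III", "IV", "V"):
        raise ValueError("case must be one of I, II, III, IV, V")
    if k < 0:
        raise ValueError("k must be non-negative")
    multiplier = k + 2 if case in ("II", "IV") else k + 1
    return multiplier * f_lower, multiplier * f_upper


# Example: F bounds (3.05, 3.97) with k = 4 regressors in Case IV
print(wald_bounds_from_f(3.05, 3.97, k=4, case="IV"))
```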
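Table CI. Asymptotic critical value bounds for the F-statistic. Testing for the existence of a levels relationshipᵃ
Table CI(i) Case I: No intercept and no trend
k   0.100       0.050       0.025       0.010       Mean        Variance
    I(0) I(1)   I(0) I(1)   I(0) I(1)   I(0) I(1)   I(0) I(1)   I(0) I(1)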
0 3.00 3.00 4.20 4.20 5.47 5.47 7.17 7.17 1.16 1.16 2.32 2.32
1 2.44 3.28 3.15 4.11 3.88 4.92 4.81 6.02 1.08 1.54 1.08 1.73
2 2.17 3.19 2.72 3.83 3.22 4.50 3.88 5.30 1.05 1.69 0.70 1.27
3 2.01 3.10 2.45 3.63 2.87 4.16 3.42 4.84 1.04 1.77 0.52 0.99
4 1.90 3.01 2.26 3.48 2.62 3.90 3.07 4.44 1.03 1.81 0.41 0.80
5 1.81 2.93 2.14 3.34 2.44 3.71 2.82 4.21 1.02 1.84 0.34 0.67
6 1.75 2.87 2.04 3.24 2.32 3.59 2.66 4.05 1.02 1.86 0.29 0.58
7 1.70 2.83 1.97 3.18 2.22 3.49 2.54 3.91 1.02 1.88 0.26 0.51
8 1.66 2.79 1.91 3.11 2.15 3.40 2.45 3.79 1.02 1.89 0.23 0.46
9 1.63 2.75 1.86 3.05 2.08 3.33 2.34 3.68 1.02 1.90 0.20 0.41
10 1.60 2.72 1.82 2.99 2.02 3.27 2.26 3.60 1.02 1.91 0.19 0.37
Table CI(ii) Case II: Restricted intercept and no trend
0 3.80 3.80 4.60 4.60 5.39 5.39 6.44 6.44 2.03 2.03 1.77 1.77
1 3.02 3.51 3.62 4.16 4.18 4.79 4.94 5.58 1.69 2.02 1.01 1.25
2 2.63 3.35 3.10 3.87 3.55 4.38 4.13 5.00 1.52 2.02 0.69 0.96
3 2.37 3.20 2.79 3.67 3.15 4.08 3.65 4.66 1.41 2.02 0.52 0.78
4 2.20 3.09 2.56 3.49 2.88 3.87 3.29 4.37 1.34 2.01 0.42 0.65
5 2.08 3.00 2.39 3.38 2.70 3.73 3.06 4.15 1.29 2.00 0.35 0.56
6 1.99 2.94 2.27 3.28 2.55 3.61 2.88 3.99 1.26 2.00 0.30 0.49
7 1.92 2.89 2.17 3.21 2.43 3.51 2.73 3.90 1.23 2.01 0.26 0.44
8 1.85 2.85 2.11 3.15 2.33 3.42 2.62 3.77 1.21 2.01 0.23 0.40
9 1.80 2.80 2.04 3.08 2.24 3.35 2.50 3.68 1.19 2.01 0.21 0.36
10 1.76 2.77 1.98 3.04 2.18 3.28 2.41 3.61 1.17 2.00 0.19 0.33
Table CI(iii) Case III: Unrestricted intercept and no trend
0 6.58 6.58 8.21 8.21 9.80 9.80 11.79 11.79 3.05 3.05 7.07 7.07
1 4.04 4.78 4.94 5.73 5.77 6.68 6.84 7.84 2.03 2.52 2.28 2.89
2 3.17 4.14 3.79 4.85 4.41 5.52 5.15 6.36 1.69 2.35 1.23 1.77
3 2.72 3.77 3.23 4.35 3.69 4.89 4.29 5.61 1.51 2.26 0.82 1.27
4 2.45 3.52 2.86 4.01 3.25 4.49 3.74 5.06 1.41 2.21 0.60 0.98
5 2.26 3.35 2.62 3.79 2.96 4.18 3.41 4.68 1.34 2.17 0.48 0.79
6 2.12 3.23 2.45 3.61 2.75 3.99 3.15 4.43 1.29 2.14 0.39 0.66
7 2.03 3.13 2.32 3.50 2.60 3.84 2.96 4.26 1.26 2.13 0.33 0.58
8 1.95 3.06 2.22 3.39 2.48 3.70 2.79 4.10 1.23 2.12 0.29 0.51
9 1.88 2.99 2.14 3.30 2.37 3.60 2.65 3.97 1.21 2.10 0.25 0.45
10 1.83 2.94 2.06 3.24 2.28 3.50 2.54 3.86 1.19 2.09 0.23 0.41
Table CI(iv) Case IV: Unrestricted intercept and restricted trend
0 5.37 5.37 6.29 6.29 7.14 7.14 8.26 8.26 3.17 3.17 2.68 2.68
1 4.05 4.49 4.68 5.15 5.30 5.83 6.10 6.73 2.45 2.77 1.41 1.65
2 3.38 4.02 3.88 4.61 4.37 5.16 4.99 5.85 2.09 2.57 0.92 1.20
3 2.97 3.74 3.38 4.23 3.80 4.68 4.30 5.23 1.87 2.45 0.67 0.93
4 2.68 3.53 3.05 3.97 3.40 4.36 3.81 4.92 1.72 2.37 0.51 0.76
5 2.49 3.38 2.81 3.76 3.11 4.13 3.50 4.63 1.62 2.31 0.42 0.64
6 2.33 3.25 2.63 3.62 2.90 3.94 3.27 4.39 1.54 2.27 0.35 0.55
7 2.22 3.17 2.50 3.50 2.76 3.81 3.07 4.23 1.48 2.24 0.31 0.49
8 2.13 3.09 2.38 3.41 2.62 3.70 2.93 4.06 1.44 2.22 0.27 0.44
9 2.05 3.02 2.30 3.33 2.52 3.60 2.79 3.93 1.40 2.20 0.24 0.40
10 1.98 2.97 2.21 3.25 2.42 3.52 2.68 3.84 1.36 2.18 0.22 0.36
Table CI(v) Case V: Unrestricted intercept and unrestricted trend
0 9.81 9.81 11.64 11.64 13.36 13.36 15.73 15.73 5.33 5.33 11.35 11.35
1 5.59 6.26 6.56 7.30 7.46 8.27 8.74 9.63 3.17 3.64 3.33 3.91
2 4.19 5.06 4.87 5.85 5.49 6.59 6.34 7.52 2.44 3.09 1.70 2.23
3 3.47 4.45 4.01 5.07 4.52 5.62 5.17 6.36 2.08 2.81 1.08 1.51
4 3.03 4.06 3.47 4.57 3.89 5.07 4.40 5.72 1.86 2.64 0.77 1.14
5 2.75 3.79 3.12 4.25 3.47 4.67 3.93 5.23 1.72 2.53 0.59 0.91
6 2.53 3.59 2.87 4.00 3.19 4.38 3.60 4.90 1.62 2.45 0.48 0.75
7 2.38 3.45 2.69 3.83 2.98 4.16 3.34 4.63 1.54 2.39 0.40 0.64
8 2.26 3.34 2.55 3.68 2.82 4.02 3.15 4.43 1.48 2.35 0.34 0.56
9 2.16 3.24 2.43 3.56 2.67 3.87 2.97 4.24 1.43 2.31 0.30 0.49
10 2.07 3.16 2.33 3.46 2.56 3.76 2.84 4.10 1.40 2.28 0.26 0.44
a The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the F-statistic
for testing $\phi = 0$ in the regression: $\Delta y_t = \phi' z_{t-1} + a' w_t + \xi_t$, $t = 1, \ldots, T$, where $x_t = (x_{1t}, \ldots, x_{kt})'$ and
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = 0$ Case I
$z_{t-1} = (y_{t-1}, x_{t-1}', 1)'$, $w_t = 0$ Case II
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = 1$ Case III
$z_{t-1} = (y_{t-1}, x_{t-1}', t)'$, $w_t = 1$ Case IV
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = (1, t)'$ Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = P x_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and
$\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$
is purely $I(0)$. The critical values for $k = 0$ correspond to the squares of the critical values of Dickey and Fuller’s (1979)
unit root t-statistics for Cases I, III and V, while they match those for Dickey and Fuller’s (1981) unit root F-statistics
for Cases II and IV. The columns headed ‘I(0)’ refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$,
while the columns headed ‘I(1)’ refer to the upper bound obtained when $x_t$ is purely $I(1)$.
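The simulation design in the table note can be sketched with NumPy for one case. The sketch below covers Case III only (unrestricted intercept, no trend) and uses far fewer replications and a smaller $T$ than the note, so its output is only a rough, finite-sample approximation of the tabulated bounds. The function names, the default settings and the seed are assumptions made for illustration.

```python
import numpy as np


def rss(y, X):
    """Residual sum of squares from a least squares regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def f_stat_case_iii(y, x):
    """F-statistic for phi = 0 in dy_t = phi' z_{t-1} + a + xi_t, z_{t-1} = (y_{t-1}, x_{t-1}')'."""
    dy = np.diff(y)                         # dy_t for t = 1, ..., T
    T = dy.shape[0]
    Z = np.column_stack([y[:-1], x[:-1]])   # lagged levels z_{t-1}
    W = np.ones((T, 1))                     # w_t = 1 (intercept only)
    X_u = np.hstack([Z, W])
    rss_u, rss_r = rss(dy, X_u), rss(dy, W)
    q = Z.shape[1]                          # k + 1 restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (T - X_u.shape[1]))


def simulate_quantile(k, integrated, T=200, reps=1000, level=0.95, seed=0):
    """Simulated quantile of the Case III F-statistic under the null.

    integrated=True sets P = I_k (x_t purely I(1)); False sets P = 0 (purely I(0)).
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for i in range(reps):
        e1 = rng.standard_normal(T)
        e2 = rng.standard_normal((T, k))
        y = np.concatenate([[0.0], np.cumsum(e1)])           # random walk, y_0 = 0
        x_path = np.cumsum(e2, axis=0) if integrated else e2
        x = np.vstack([np.zeros((1, k)), x_path])            # x_0 = 0
        stats[i] = f_stat_case_iii(y, x)
    return float(np.quantile(stats, level))


lower = simulate_quantile(k=4, integrated=False)
upper = simulate_quantile(k=4, integrated=True)
print(f"approximate 0.05 bounds, Case III, k = 4: ({lower:.2f}, {upper:.2f})")
```

Raising `T` and `reps` towards the values quoted in the note should bring the two quantiles closer to the tabulated entries, at the cost of run time.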
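Theorem 3.2 (Limiting distribution of $t_{\pi_{yy}}$). If Assumptions 1-4 and 5a hold and $\gamma_{xy} = 0$, where
$\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic
distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$
\frac{\int_0^1 dW_u(a)\,F_{k-r}(a)}{\left(\int_0^1 F_{k-r}(a)^2\,da\right)^{1/2}} \qquad (25)
$$
where
$$
F_{k-r}(a) = \begin{cases}
W_u(a) - \int_0^1 W_u(a)W_{k-r}(a)'\,da \left(\int_0^1 W_{k-r}(a)W_{k-r}(a)'\,da\right)^{-1} W_{k-r}(a) & \text{Case I} \\
\tilde{W}_u(a) - \int_0^1 \tilde{W}_u(a)\tilde{W}_{k-r}(a)'\,da \left(\int_0^1 \tilde{W}_{k-r}(a)\tilde{W}_{k-r}(a)'\,da\right)^{-1} \tilde{W}_{k-r}(a) & \text{Case III} \\
\hat{W}_u(a) - \int_0^1 \hat{W}_u(a)\hat{W}_{k-r}(a)'\,da \left(\int_0^1 \hat{W}_{k-r}(a)\hat{W}_{k-r}(a)'\,da\right)^{-1} \hat{W}_{k-r}(a) & \text{Case V}
\end{cases}
$$
$r = 0, \ldots, k$, and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.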
The form of the asymptotic representation (25) is similar to that of a Dickey–Fuller test for
a unit root except that the standard Brownian motion $W_u(a)$ is replaced by the residual from
an asymptotic regression of $W_u(a)$ on the independent $(k - r)$-vector standard Brownian motion
$W_{k-r}(a)$ (or their de-meaned, and de-meaned and de-trended, counterparts).
Similarly to the analysis following Theorem 3.1, we detail the limiting distribution of the
t-statistic $t_{\pi_{yy}}$ in the two polar cases in which the forcing variables $\{x_t\}$ are purely integrated of
order zero and one respectively.
Corollary 3.3 (Limiting distribution of $t_{\pi_{yy}}$ if $\{x_t\} \sim I(0)$). If Assumptions 1-4 and 5a hold
and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the
asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$
\frac{\int_0^1 dW_u(a)\,F(a)}{\left(\int_0^1 F(a)^2\,da\right)^{1/2}}
$$
where
$$
F(a) = \begin{cases}
W_u(a) & \text{Case I} \\
\tilde{W}_u(a) & \text{Case III} \\
\hat{W}_u(a) & \text{Case V}
\end{cases}
$$
and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.
Corollary 3.4 (Limiting distribution of $t_{\pi_{yy}}$ if $\{x_t\} \sim I(1)$). If Assumptions 1-4 and 5a hold,
$\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, as
$T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$
\frac{\int_0^1 dW_u(a)\,F_k(a)}{\left(\int_0^1 F_k(a)^2\,da\right)^{1/2}}
$$
where $F_k(a)$ is defined in Theorem 3.2 for Cases I, III and V, $a \in [0, 1]$.
As above, it may be shown by simulation that the asymptotic critical values obtained from
Corollaries 3.3 ($r = k$ and $\{x_t\}$ is purely $I(0)$) and 3.4 ($r = 0$ and $\{x_t\}$ is purely $I(1)$) provide
lower and upper bounds respectively for those corresponding to the general case considered in
Theorem 3.2. Hence, a bounds procedure for testing $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on these two polar cases
may be implemented as described above based on the t-statistic $t_{\pi_{yy}}$ for the exclusion of $y_{t-1}$ in
the conditional ECMs (12), (14) and (16) without prior knowledge of the cointegrating rank $r$.¹³
These asymptotic critical value bounds are given in Tables CII(i), CII(iii) and CII(v) for Cases I,
III and V for sizes 0.100, 0.050, 0.025 and 0.010.
As is emphasized in the Proof of Theorem 3.2 given in Appendix A, if the asymptotic analysis
for the t-statistic $t_{\pi_{yy}}$ of (24) is conducted under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ only, the resultant limit distribution
for $t_{\pi_{yy}}$ depends on the nuisance parameter $\omega - \phi$ in addition to the cointegrating rank $r$, where,
under Assumption 5a, $\alpha_{yx} - \phi'\alpha_{xx} = 0'$. Moreover, if $\Delta y_t$ is allowed to Granger-cause $\Delta x_t$, that is,
$\gamma_{xy,i} \ne 0$ for some $i = 1, \ldots, p - 1$, then the limit distribution also is dependent on the nuisance
parameter gxy /*yy f0 gxy ; see Appendix A. Consequently, in general, where w 6D f or gxy 6D 0,
Table CII. Asymptotic critical value bounds of the t-statistic. Testing for the existence of a levels relationshipᵃ
Table CII(i): Case I: No intercept and no trend
k   0.100       0.050       0.025       0.010       Mean        Variance
    I(0) I(1)   I(0) I(1)   I(0) I(1)   I(0) I(1)   I(0) I(1)   I(0) I(1)
0 -1.62 -1.62 -1.95 -1.95 -2.24 -2.24 -2.58 -2.58 -0.42 -0.42 0.98 0.98
1 -1.62 -2.28 -1.95 -2.60 -2.24 -2.90 -2.58 -3.22 -0.42 -0.98 0.98 1.12
2 -1.62 -2.68 -1.95 -3.02 -2.24 -3.31 -2.58 -3.66 -0.42 -1.39 0.98 1.12
3 -1.62 -3.00 -1.95 -3.33 -2.24 -3.64 -2.58 -3.97 -0.42 -1.71 0.98 1.09
4 -1.62 -3.26 -1.95 -3.60 -2.24 -3.89 -2.58 -4.23 -0.42 -1.98 0.98 1.07
5 -1.62 -3.49 -1.95 -3.83 -2.24 -4.12 -2.58 -4.44 -0.42 -2.22 0.98 1.05
6 -1.62 -3.70 -1.95 -4.04 -2.24 -4.34 -2.58 -4.67 -0.42 -2.43 0.98 1.04
7 -1.62 -3.90 -1.95 -4.23 -2.24 -4.54 -2.58 -4.88 -0.42 -2.63 0.98 1.04
8 -1.62 -4.09 -1.95 -4.43 -2.24 -4.72 -2.58 -5.07 -0.42 -2.81 0.98 1.04
9 -1.62 -4.26 -1.95 -4.61 -2.24 -4.89 -2.58 -5.25 -0.42 -2.98 0.98 1.04
10 -1.62 -4.42 -1.95 -4.76 -2.24 -5.06 -2.58 -5.44 -0.42 -3.15 0.98 1.03
Table CII(iii): Case III: Unrestricted intercept and no trend
0 -2.57 -2.57 -2.86 -2.86 -3.13 -3.13 -3.43 -3.43 -1.53 -1.53 0.72 0.71
1 -2.57 -2.91 -2.86 -3.22 -3.13 -3.50 -3.43 -3.82 -1.53 -1.80 0.72 0.81
2 -2.57 -3.21 -2.86 -3.53 -3.13 -3.80 -3.43 -4.10 -1.53 -2.04 0.72 0.86
3 -2.57 -3.46 -2.86 -3.78 -3.13 -4.05 -3.43 -4.37 -1.53 -2.26 0.72 0.89
4 -2.57 -3.66 -2.86 -3.99 -3.13 -4.26 -3.43 -4.60 -1.53 -2.47 0.72 0.91
5 -2.57 -3.86 -2.86 -4.19 -3.13 -4.46 -3.43 -4.79 -1.53 -2.65 0.72 0.92
6 -2.57 -4.04 -2.86 -4.38 -3.13 -4.66 -3.43 -4.99 -1.53 -2.83 0.72 0.93
7 -2.57 -4.23 -2.86 -4.57 -3.13 -4.85 -3.43 -5.19 -1.53 -3.00 0.72 0.94
8 -2.57 -4.40 -2.86 -4.72 -3.13 -5.02 -3.43 -5.37 -1.53 -3.16 0.72 0.96
9 -2.57 -4.56 -2.86 -4.88 -3.13 -5.18 -3.42 -5.54 -1.53 -3.31 0.72 0.96
10 -2.57 -4.69 -2.86 -5.03 -3.13 -5.34 -3.43 -5.68 -1.53 -3.46 0.72 0.96
Table CII(v): Case V: Unrestricted intercept and unrestricted trend
0 -3.13 -3.13 -3.41 -3.41 -3.65 -3.66 -3.96 -3.97 -2.18 -2.18 0.57 0.57
1 -3.13 -3.40 -3.41 -3.69 -3.65 -3.96 -3.96 -4.26 -2.18 -2.37 0.57 0.67
2 -3.13 -3.63 -3.41 -3.95 -3.65 -4.20 -3.96 -4.53 -2.18 -2.55 0.57 0.74
3 -3.13 -3.84 -3.41 -4.16 -3.65 -4.42 -3.96 -4.73 -2.18 -2.72 0.57 0.79
4 -3.13 -4.04 -3.41 -4.36 -3.65 -4.62 -3.96 -4.96 -2.18 -2.89 0.57 0.82
5 -3.13 -4.21 -3.41 -4.52 -3.65 -4.79 -3.96 -5.13 -2.18 -3.04 0.57 0.85
6 -3.13 -4.37 -3.41 -4.69 -3.65 -4.96 -3.96 -5.31 -2.18 -3.20 0.57 0.87
7 -3.13 -4.53 -3.41 -4.85 -3.65 -5.14 -3.96 -5.49 -2.18 -3.34 0.57 0.88
8 -3.13 -4.68 -3.41 -5.01 -3.65 -5.30 -3.96 -5.65 -2.18 -3.49 0.57 0.90
9 -3.13 -4.82 -3.41 -5.15 -3.65 -5.44 -3.96 -5.79 -2.18 -3.62 0.57 0.91
10 -3.13 -4.96 -3.41 -5.29 -3.65 -5.59 -3.96 -5.94 -2.18 -3.75 0.57 0.92
a The critical values are computed via stochastic simulations using $T = 1000$ and 40 000 replications for the t-statistic for
testing $\phi = 0$ in the regression: $\Delta y_t = \phi y_{t-1} + \delta' x_{t-1} + a' w_t + \xi_t$, $t = 1, \ldots, T$, where $x_t = (x_{1t}, \ldots, x_{kt})'$ and
$w_t = 0$ Case I
$w_t = 1$ Case III
$w_t = (1, t)'$ Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = P x_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$
and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$
if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to those of Dickey and Fuller’s (1979) unit root t-statistics.
The columns headed ‘I(0)’ refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns
headed ‘I(1)’ refer to the upper bound obtained when $x_t$ is purely $I(1)$.
13 Although Corollary 3.3 does not require $\gamma_{xy} = 0$ and $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is automatically satisfied under the conditions
of Corollary 3.4, the simulation critical value bounds result requires $\gamma_{xy} = 0$ and $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ for $0 < r < k$.
although the t-statistic $t_{\pi_{yy}}$ has a well-defined limiting distribution under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the above
bounds testing procedure for $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on $t_{\pi_{yy}}$ is not asymptotically similar.¹⁴
Consequently, in the light of the consistency results for the above statistics discussed in
Section 4, see Theorems 4.1, 4.2 and 4.4, we suggest the following procedure for ascertaining
the existence of a level relationship between $y_t$ and $x_t$: test $H_0$ of (17) using the bounds procedure
based on the Wald or F-statistic of (21) from Corollaries 3.1 and 3.2: (a) if $H_0$ is not rejected,
proceed no further; (b) if $H_0$ is rejected, test $H_0^{\pi_{yy}}: \pi_{yy} = 0$ using the bounds procedure based on
the t-statistic $t_{\pi_{yy}}$ of (24) from Corollaries 3.3 and 3.4. If $H_0^{\pi_{yy}}: \pi_{yy} = 0$ is false, a large value of
$t_{\pi_{yy}}$ should result, at least asymptotically, confirming the existence of a level relationship between
$y_t$ and $x_t$, which, however, may be degenerate (if $\pi_{yx.x} = 0'$).
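The two-step procedure can be sketched as a small function. This is an illustrative reading of steps (a) and (b), not code from the paper: the name `two_step_level_test`, the return strings and the treatment of statistics falling between the bounds as "inconclusive" are assumptions. The t-statistic bounds are entered as negative numbers, as for Dickey–Fuller statistics, so "large" means large in absolute value.

```python
def two_step_level_test(f_stat, t_stat, f_bounds, t_bounds):
    """Sketch of the sequential bounds procedure (hypothetical helper).

    f_bounds: (I(0) bound, I(1) bound) for the F-statistic, both positive
    t_bounds: (I(0) bound, I(1) bound) for the t-statistic, both negative,
              with the I(1) bound the more negative of the two
    """
    f_lower, f_upper = f_bounds
    t_i0, t_i1 = t_bounds
    if not (f_lower <= f_upper and t_i1 <= t_i0):
        raise ValueError("bounds are not ordered as expected")
    # Step (a): joint test of H0 with the F-statistic.
    if f_stat < f_lower:
        return "step (a): H0 not rejected, proceed no further"
    if f_stat <= f_upper:
        return "step (a): inconclusive"
    # Step (b): H0 rejected, so test pi_yy = 0 with the t-statistic.
    if t_stat < t_i1:
        return "step (b): pi_yy = 0 rejected, level relationship supported"
    if t_stat > t_i0:
        return "step (b): pi_yy = 0 not rejected"
    return "step (b): inconclusive"


# Illustrative statistics with Case V, k = 4, size 0.050 bounds
print(two_step_level_test(5.2, -4.8, f_bounds=(3.47, 4.57), t_bounds=(-3.41, -4.36)))
```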
!
14 In principle, the asymptotic distribution of t!yy under H0 yy : !yy D 0 may be simulated from the limiting representation
2 2
given in the Proof of Theorem 3.2 of Appendix A after substitution of consistent estimators for f and lxy gxy /*yy.x under
!yy 2 0
H0 : !yy D 0, where *yy.x *yy f *xy . Although such estimators may be obtained straightforwardly, unfortunately,
they necessitate the use of parameter estimators from the marginal ECM (7) for fxt g1 tD1 .
of the Wald statistic of (21) under a sequence of local alternatives. Finally, we show that the
bounds procedure based on the t-statistic of (24) is consistent.
In the discussion of the consistency of the bounds test procedure based on the Wald statistic
of (21), because the rank of the long-run multiplier matrix $\Pi$ may be either $r$ or $r + 1$ under the
alternative hypothesis $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18) where $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ and $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$, it is
necessary to deal with these two possibilities. First, under $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$, the rank of $\Pi$ is $r + 1$ so
Assumption 5b applies; in particular, $\alpha_{yy} \ne 0$. Second, under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the rank of $\Pi$ is $r$ so
Assumption 5a applies; in this case, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$ holds and, in particular, $\alpha_{yx} - \omega'\alpha_{xx} \ne 0'$.
Theorem 4.1 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions
1-4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ of (18) the Wald statistic $W$ (21) is consistent against
$H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ in Cases I–V defined in (12)–(16).
Theorem 4.2 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$). If
Assumptions 1–4 and 5a hold, then under $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$ of (18) and $H_0^{\pi_{yy}}: \pi_{yy} = 0$ of (17) the
Wald statistic $W$ (21) is consistent against $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$ in Cases I–V defined in (12)–(16).
Hence, combining Theorems 4.1 and 4.2, the bounds procedure of Section 3 based on the Wald
statistic $W$ (21) defines a consistent test of $H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}}$ of (17) against $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$
of (18). This result holds irrespective of whether the forcing variables $\{x_t\}$ are purely $I(0)$, purely
$I(1)$ or mutually cointegrated.
We now turn to consider the asymptotic distribution of the Wald statistic (21) under a suitably
specified sequence of local alternatives. Recall that under Assumption 5b, $\pi_{y.x}\,[= (\pi_{yy}, \pi_{yx.x})] =
(\alpha_{yy}\beta_{yy},\ \alpha_{yy}\beta_{xy}' + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}')$. Consequently, we define the sequence of local alternatives
$$
H_{1T}: \pi_{y.xT}\,[= (\pi_{yyT}, \pi_{yx.xT})] = \left(T^{-1}\alpha_{yy}\beta_{yy},\ T^{-1}\alpha_{yy}\beta_{xy}' + T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta_{xx}'\right) \qquad (26)
$$
In order to detail the limit distribution of the Wald statistic under the sequence of local
alternatives $H_{1T}$ of (26), it is necessary to define the $(k - r + 1)$-dimensional Ornstein–Uhlenbeck
process $J^*_{k-r+1}(a) = (J^*_u(a), J^*_{k-r}(a)')'$ which obeys the stochastic integral and differential equations,
$$
J^*_{k-r+1}(a) = W_{k-r+1}(a) + ab'\int_0^a J^*_{k-r+1}(r)\,dr \quad \text{and} \quad dJ^*_{k-r+1}(a) = dW_{k-r+1}(a) + ab'J^*_{k-r+1}(a)\,da,
$$
Theorem 4.3 (Limiting distribution of $W$ under $H_{1T}$). If Assumptions 1–4 and 5a hold, then under
$H_{1T}: \pi_{y.x} = T^{-1}\alpha_{yy}\beta_y' + T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta'$ of (26), as $T \to \infty$, the asymptotic distribution of
the Wald statistic $W$ of (21) has the representation
$$
W \Rightarrow z_r'z_r + \int_0^1 dJ^*_u(a)\,F_{k-r+1}(a)' \left(\int_0^1 F_{k-r+1}(a)F_{k-r+1}(a)'\,da\right)^{-1} \int_0^1 F_{k-r+1}(a)\,dJ^*_u(a) \qquad (28)
$$
where zr ¾ NQ1/2 h, Ir , Q[D Q1/20 Q1/2 ] D p limT!1 T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ , h dyx w
0
The first component of (28), $z_r'z_r$, is non-central chi-square distributed with $r$ degrees of
freedom and non-centrality parameter $\eta'Q\eta$ and corresponds to the local alternative
$H_{1T}^{\pi_{yx.x}}: \pi_{yx.xT} = T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta_{xx}'$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$. The second term in (28) is a non-standard
Dickey–Fuller unit-root distribution under the local alternative $H_{1T}^{\pi_{yy}}: \pi_{yyT} = T^{-1}\alpha_{yy}\beta_{yy}$ and
$\delta_{yx} - \omega'\delta_{xx} = 0'$. Note that under $H_0$ of (17), that is, $\alpha_{yy} = 0$ and $\delta_{yx} - \omega'\delta_{xx} = 0'$, the limiting
representation (28) reduces to (22) as should be expected.
The proof for the consistency of the bounds test procedure based on the t-statistic of (24)
requires that the rank of the long-run multiplier matrix $\Pi$ is $r + 1$ under the alternative hypothesis
$H_1^{\pi_{yy}}: \pi_{yy} \ne 0$. Hence, Assumption 5b applies; in particular, $\alpha_{yy} \ne 0$.
Theorem 4.4 (Consistency of the t-statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions
1–4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ of (18) the t-statistic $t_{\pi_{yy}}$ (24) is consistent against
$H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ in Cases I, III and V defined in (12), (14) and (16).
As noted at the end of Section 3, Theorem 4.4 suggests the possibility of using $t_{\pi_{yy}}$ to
discriminate between $H_0^{\pi_{yy}}: \pi_{yy} = 0$ and $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$, although, if $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is false,
the bounds procedure given via Corollaries 3.3 and 3.4 is not asymptotically similar.
utility. Following Darby and Wren-Lewis (1993), the theoretical real wage equation underlying
the Treasury’s earnings equation is given by
$$
w_t = \frac{Prod_t}{1 + f(UR_t)(1 - RR_t)/Union_t} \qquad (29)
$$
where $w_t$ is the real wage, $Prod_t$ is labour productivity, $RR_t$ is the replacement ratio defined as
the ratio of unemployment benefit to the wage rate, $Union_t$ is a measure of ‘union power’, and
$f(UR_t)$ is the probability of a union member becoming unemployed, which is assumed to be an
increasing function of the unemployment rate $UR_t$. The econometric specification is based on a
log-linearized version of (29) after allowing for a wedge effect that takes account of the difference
between the ‘real product wage’ which is the focus of the firms’ decision, and the ‘real consumption
wage’ which concerns the union.¹⁵ The theoretical arguments for a possible long-run wedge effect
on real wages are mixed and, as emphasized by CSW, whether such long-run effects are present
is an empirical matter. The change in the unemployment rate ($\Delta UR_t$) is also included in the
Treasury’s wage equation. CSW cite two different theoretical rationales for the inclusion of $\Delta UR_t$
in the wage equation: the differential moderating effects of long- and short-term unemployed
on real wages, and the ‘insider–outsider’ theories which argue that only rising unemployment
will be effective in significantly moderating wage demands. See Blanchard and Summers (1986)
and Lindbeck and Snower (1989). The ARDL model and its associated unrestricted equilibrium
correction formulation used here automatically allow for such lagged effects.
We begin our empirical analysis from the maintained assumption that the time series properties
of the key variables in the Treasury’s earnings equation can be well approximated by a log-linear
VAR($p$) model, augmented with appropriate deterministics such as intercepts and time trends.
To ensure comparability of our results with those of the Treasury, the replacement ratio is not
included in the analysis. CSW, p. 50, report that ‘... it has not proved possible to identify a
significant effect from the replacement ratio, and this had to be omitted from our specification’.¹⁶
Also, as in CSW, we include two dummy variables to account for the effects of incomes policies
on average earnings. These dummy variables are defined by
$D7475_t = 1$, over the period 1974q1–1975q4, 0 elsewhere
$D7579_t = 1$, over the period 1975q1–1979q4, 0 elsewhere
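As a small illustration, the two dummy variables can be built over a quarterly sample in plain Python. The helper names `quarters` and `step_dummy` are hypothetical, and the sample 1972q1–1997q4 used below is the estimation period quoted later in the text.

```python
def quarters(start, end):
    """List of (year, quarter) pairs from start to end inclusive (hypothetical helper)."""
    y, q = start
    out = []
    while (y, q) <= end:
        out.append((y, q))
        y, q = (y, q + 1) if q < 4 else (y + 1, 1)
    return out


def step_dummy(sample, first, last):
    """1 for quarters between first and last inclusive, 0 elsewhere."""
    return [1 if first <= t <= last else 0 for t in sample]


sample = quarters((1972, 1), (1997, 4))
d7475 = step_dummy(sample, (1974, 1), (1975, 4))
d7579 = step_dummy(sample, (1975, 1), (1979, 4))

# Share of sample quarters in which each dummy is non-zero
print(len(sample), sum(d7475) / len(sample), sum(d7579) / len(sample))
```

With 104 quarters in the sample, the dummies are non-zero in 8 and 20 quarters respectively, which is what makes the "one-off" interpretation a question of how small these shares are relative to $T$.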
The asymptotic theory developed in the paper is not affected by the inclusion of such ‘one-off’
dummy variables.¹⁷ Let $z_t = (w_t, Prod_t, UR_t, Wedge_t, Union_t)' = (w_t, x_t')'$. Then, using the
analysis of Section 2, the conditional ECM of interest can be written as
$$
\Delta w_t = c_0 + c_1 t + c_2 D7475_t + c_3 D7579_t + \pi_{ww} w_{t-1} + \pi_{wx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \delta' \Delta x_t + u_t \qquad (30)
$$
15 The wedge effect is further decomposed into a tax wedge and an import price wedge in the Treasury model, but this
decomposition is not pursued here.
16 It is important, however, that, at a future date, a fresh investigation of the possible effects of the replacement ratio on
real wages should be undertaken.
17 However, both the asymptotic theory and associated critical values must be modified if the fraction of periods in which
the dummy variables are non-zero does not tend to zero with the sample size T. In the present application, both dummy
variables included in the earning equation are zero after 1979, and the fractions of observations where D7475t and D7579t
are non-zero are only 7.6% and 19.2% respectively.
Under the assumption that lagged real wages, $w_{t-1}$, do not enter the sub-VAR model for $x_t$,
the above real wage equation is identified and can be estimated consistently by LS.¹⁸ Notice,
however, that this assumption does not rule out the inclusion of lagged changes in real wages in
the unemployment or productivity equations, for example. The exclusion of the level of real wages
from these equations is an identification requirement for the bargaining theory of wages which
permits it to be distinguished from other alternatives, such as the efficiency wage theory which
postulates that labour productivity is partly determined by the level of real wages.19 It is clear
that, in our framework, the bargaining theory and the efficiency wage theory cannot be entertained
simultaneously, at least not in the long run.
The above specification is also based on the assumption that the disturbances ut are serially
uncorrelated. It is therefore important that the lag order p of the underlying VAR is selected
appropriately. There is a delicate balance between choosing p sufficiently large to mitigate the
residual serial correlation problem and, at the same time, sufficiently small so that the conditional
ECM (30) is not unduly over-parameterized, particularly in view of the limited time series data
which are available.
Finally, a decision must be made concerning the time trend in (30) and whether its coefficient
should be restricted.²⁰ This issue can only be settled in light of the particular sample period under
consideration. The time series data used are quarterly, cover the period 1970q1–1997q4, and are
seasonally adjusted (where relevant).²¹ To ensure comparability of results for different choices of
$p$, all estimations use the same sample period, 1972q1–1997q4 ($T = 104$), with the first eight
observations reserved for the construction of lagged variables.
The five variables in the earnings equation were constructed from primary sources in the
following manner: $w_t = \ln(ERPR_t/PYNONG_t)$, $Wedge_t = \ln(1 + TE_t) + \ln(1 - TD_t) - \ln(RPIX_t/PYNONG_t)$,
$UR_t = \ln[100 \times ILOU_t/(ILOU_t + WFEMP_t)]$, $Prod_t = \ln[(YPROM_t + 278.29 \times YMF_t)/(EMF_t + ENMF_t)]$,
and $Union_t = \ln(UDEN_t)$, where $ERPR_t$ is average private sector
earnings per employee (£), $PYNONG_t$ is the non-oil non-government GDP deflator, $YPROM_t$
is output in the private, non-oil, non-manufacturing, and public traded sectors at constant factor
cost (£ million, 1990), $YMF_t$ is the manufacturing output index adjusted for stock changes
(1990 = 100), $EMF_t$ and $ENMF_t$ are respectively employment in UK manufacturing and
non-manufacturing sectors (thousands), $ILOU_t$ is the International Labour Office (ILO) measure
of unemployment (thousands), $WFEMP_t$ is total employment (thousands), $TE_t$ is the average
employers’ National Insurance contribution rate, $TD_t$ is the average direct tax rate on employment
incomes, $RPIX_t$ is the Retail Price Index excluding mortgage payments, and $UDEN_t$ is
union density (used to proxy ‘union power’) measured by union membership as a percentage of
employment.²² The time series plots of the five variables included in the VAR model are given in
Figures 1–3.
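The five transformations can be written out directly. The sketch below applies them to a single observation; the function name `earnings_variables` is hypothetical, the argument names simply reuse the source series mnemonics, and the input numbers in the usage example are made up for illustration rather than taken from the data.

```python
import math


def earnings_variables(ERPR, PYNONG, TE, TD, RPIX, ILOU, WFEMP, YPROM, YMF, EMF, ENMF, UDEN):
    """Construct the five model variables for one quarter from the primary series."""
    return {
        "w": math.log(ERPR / PYNONG),
        "Wedge": math.log(1 + TE) + math.log(1 - TD) - math.log(RPIX / PYNONG),
        "UR": math.log(100 * ILOU / (ILOU + WFEMP)),
        "Prod": math.log((YPROM + 278.29 * YMF) / (EMF + ENMF)),
        "Union": math.log(UDEN),
    }


# Made-up inputs, for illustration only
example = earnings_variables(
    ERPR=4000.0, PYNONG=100.0, TE=0.10, TD=0.25, RPIX=110.0,
    ILOU=2000.0, WFEMP=26000.0, YPROM=90000.0, YMF=100.0,
    EMF=4500.0, ENMF=21000.0, UDEN=35.0,
)
print(example)
```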
18 See Assumption 3 and the following discussion. By construction, the contemporaneous effects $\Delta x_t$ are uncorrelated
with the disturbance term $u_t$ and instrumental variable estimation which has been particularly popular in the empirical
wage equation literature is not necessary. Indeed, given the unrestricted nature of the lag distribution of the conditional
ECM (30), it is difficult to find suitable instruments: namely, variables that are not already included in the model, which
are uncorrelated with ut and also have a reasonable degree of correlation with the included variables in (30).
19 For a discussion of the issues that surround the identification of wage equations, see Manning (1993).
20 See, for example, PSS and the discussion in Section 2.
21 We are grateful to Andrew Gurney and Rod Whittaker for providing us with the data. For further details about the
sources and the descriptions of the variables, see CSW, pp. 46–51 and p. 11 of the Annex.
22 The data series for UDEN assumes a constant rate of unionization from 1980q4 onwards.
[Figure 1: panel (a) plots Real Wages and Productivity on a log scale against quarters; panel (b) plots the quarterly rates of change of the Real Wage and Productivity series.]
Figure 1. (a) Real wages and labour productivity. (b) Rate of change of real wages and labour productivity
It is clear from Figure 1 that real wages (average earnings) and productivity show steadily rising
trends with real wages growing at a faster rate than productivity.²³ This suggests, at least initially,
that a linear trend should be included in the real wage equation (30). Also the application of unit
root tests to the five variables, perhaps not surprisingly, yields mixed results with strong evidence
in favour of the unit root hypothesis only in the cases of real wages and productivity. This does
not necessarily preclude the other three variables (UR, Wedge, and Union) having levels impact
on real wages. Following the methodology developed in this paper, it is possible to test for the
existence of a real wage equation involving the levels of these five variables irrespective of whether
they are purely $I(0)$, purely $I(1)$, or mutually cointegrated.
23 Over the period 1972q1– 97q4, real wages grew by 2.14% per annum as compared to labour productivity that increased
by an annual average rate of 1.54% over the same period.
[Figures 2 and 3: time series plots of the UNION and WEDGE variables, and of UR (log scale), against quarters; the figure captions are not recoverable from the source.]
To determine the appropriate lag length $p$ and whether a deterministic linear trend is required
in addition to the productivity variable, we estimated the conditional model (30) by LS, with
and without a linear time trend, for $p = 1, 2, \ldots, 7$. As pointed out earlier, all regressions were
computed over the same period 1972q1–1997q4. We found that lagged changes of the productivity
variable, $\Delta Prod_{t-1}, \Delta Prod_{t-2}, \ldots$, were insignificant (either singly or jointly) in all regressions.
Therefore, for the sake of parsimony and to avoid unnecessary over-parameterization, we decided
to re-estimate the regressions without these lagged variables, but including lagged changes of
all other variables. Table I gives Akaike’s and Schwarz’s Bayesian Information Criteria, denoted
respectively by AIC and SBC, and Lagrange multiplier (LM) statistics for testing the hypothesis
of no residual serial correlation against orders 1 and 4 denoted by $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ respectively.
As might be expected, the lag order selected by AIC, $\hat{p}_{aic} = 6$, irrespective of whether a
deterministic trend term is included or not, is much larger than that selected by SBC. This latter
criterion gives estimates $\hat{p}_{sbc} = 1$ if a trend is included and $\hat{p}_{sbc} = 4$ if not. The $\chi^2_{SC}$ statistics also
suggest using a relatively high lag order: 4 or more. In view of the importance of the assumption
of serially uncorrelated errors for the validity of the bounds tests, it seems prudent to select $p$ to
be either 5 or 6.²⁴ Nevertheless, for completeness, in what follows we report test results for $p = 4$
and 5, as well as for our preferred choice, namely $p = 6$. The results in Table I also indicate
that there is little to choose between the conditional ECM with or without a linear deterministic
trend.
Table II gives the values of the F- and t-statistics for testing the existence of a level earnings
equation under three different scenarios for the deterministics, Cases III, IV and V of (14), (15)
and (16) respectively; see Sections 2 and 3 for detailed discussions.
The various statistics in Table II should be compared with the critical value bounds provided
in Tables CI and CII. First, consider the bounds F-statistic. As argued in PSS, the statistic $F_{IV}$
which sets the trend coefficient to zero under the null hypothesis of no level relationship, Case
IV of (15), is more appropriate than $F_V$, Case V of (16), which ignores this constraint. Note that,
if the trend coefficient $c_1$ is not subject to this restriction, (30) implies a quadratic trend in the
level of real wages under the null hypothesis of $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$, which is empirically
implausible. The critical value bounds for the statistics $F_{IV}$ and $F_V$ are given in Tables CI(iv) and
CI(v). Since $k = 4$, the 0.05 critical value bounds are (3.05, 3.97) and (3.47, 4.57) for $F_{IV}$ and
$F_V$, respectively.²⁵ The test outcome depends on the choice of the lag order $p$. For $p = 4$, the
Table I. Statistics for selecting the lag order of the earnings equation
Notes: $p$ is the lag order of the underlying VAR model for the conditional ECM (30), with zero restrictions on the
coefficients of lagged changes in the productivity variable. $AIC_p \equiv LL_p - s_p$ and $SBC_p \equiv LL_p - (s_p/2) \ln T$ denote
Akaike's and Schwarz's Bayesian Information Criteria for a given lag order $p$, where $LL_p$ is the maximized log-likelihood
value of the model, $s_p$ is the number of freely estimated coefficients and $T$ is the sample size. $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ are LM
statistics for testing no residual serial correlation against orders 1 and 4. The symbols *, **, and *** denote significance
at 0.01, 0.05 and 0.10 levels, respectively.
24 In the Treasury model, different lag orders are chosen for different variables. The highest lag order selected is 4 applied
to the log of the price deflator and the wedge variable. The estimation period of the earnings equation in the Treasury
model is 1971q1– 1994q3.
25 Following a suggestion from one of the referees we also computed critical value bounds for our sample size, namely
$T = 104$. For $k = 4$, the 5% critical value bounds associated with the $F_{IV}$ and $F_V$ statistics turned out to be (3.19, 4.16) and
(3.61, 4.76), respectively, which are only marginally different from the asymptotic critical value bounds.
312 M. H. PESARAN, Y. SHIN AND R. J. SMITH
Table II columns: $p$; with deterministic trends: $F_{IV}$, $F_V$, $t_V$; without deterministic trends: $F_{III}$, $t_{III}$.
Notes: See the notes to Table I. $F_{IV}$ is the F-statistic for testing
$\pi_{ww} = 0$, $\pi_{wx.x} = 0'$ and $c_1 = 0$ in (30). $F_V$ is the F-statistic for
testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30). $F_{III}$ is the F-statistic for
testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30) with $c_1$ set equal to 0. $t_V$
and $t_{III}$ are the t-ratios for testing $\pi_{ww} = 0$ in (30) with and without
a deterministic linear trend. The superscript a indicates that the statistic lies below
the 0.05 lower bound, b that it falls within the 0.05 bounds, and c
that it lies above the 0.05 upper bound.
hypothesis that there exists no level earnings equation is not rejected at the 0.05 level, irrespective
of whether the regressors are purely I(0), purely I(1) or mutually cointegrated. For $p = 5$, the
bounds test is inconclusive. For $p = 6$ (selected by AIC), the statistic $F_V$ is still inconclusive, but
$F_{IV} = 4.78$ lies outside the 0.05 critical value bounds and rejects the null hypothesis that there
exists no level earnings equation, irrespective of whether the regressors are purely I(0), purely
I(1) or mutually cointegrated.²⁶ This finding is even more conclusive when the bounds F-test is
applied to the earnings equations without a linear trend. The relevant test statistic is $F_{III}$ and the
associated 0.05 critical value bounds are (2.86, 4.01).²⁷ For $p = 4$, $F_{III} = 3.63$, and the test result
is inconclusive. However, for $p = 5$ and 6, the values of $F_{III}$ are 5.23 and 5.42 respectively and
the hypothesis of no levels earnings equation is conclusively rejected.
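The three-way decision rule behind these statements can be written as a small function. This is only an illustrative sketch: the function name and the outcome labels are assumptions, while the bounds and statistics are the ones quoted in the text.

```python
def bounds_f_decision(f_stat: float, lower: float, upper: float) -> str:
    """Classify a bounds F-statistic against a pair of critical value bounds.

    Below the lower bound the null of no level relationship is not rejected,
    above the upper bound it is rejected, and in between the test is
    inconclusive without knowing the order of integration of the regressors.
    """
    if f_stat < lower:
        return "not rejected"
    if f_stat > upper:
        return "rejected"
    return "inconclusive"


# 0.05 bounds for F_III with k = 4 and the F_III values quoted for p = 4, 5, 6.
F_III_BOUNDS = (2.86, 4.01)
f_iii = {4: 3.63, 5: 5.23, 6: 5.42}
results = {p: bounds_f_decision(f, *F_III_BOUNDS) for p, f in f_iii.items()}
print(results)

# F_IV = 4.78 at p = 6 against the 0.05 bounds (3.05, 3.97).
print(bounds_f_decision(4.78, 3.05, 3.97))
```

For the bounds t-test the same logic applies with the inequalities reversed, since the t-statistic rejects for large negative values.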
The results from the application of the bounds t-test to the earnings equations are less clear-cut
and do not allow the imposition of the trend restrictions discussed above. The 0.05 critical value
bounds for $t_{III}$ and $t_V$, when $k = 4$, are (−2.86, −3.99) and (−3.41, −4.36).²⁸ Therefore, if a
linear trend is included, the bounds t-test does not reject the null even if $p = 5$ or 6. However,
when the trend term is excluded, the null is rejected for $p = 5$. Overall, these test results support
the existence of a levels earnings equation when a sufficiently high lag order is selected and
when the statistically insignificant deterministic trend term is excluded from the conditional ECM
(30). Such a specification is in accord with the evidence on the performance of the alternative
conditional ECMs set out in Table I.
In testing the null hypothesis that there are no level effects in (30), namely ($\pi_{ww} = 0$, $\pi_{wx.x} = 0'$),
it is important that the coefficients of lagged changes remain unrestricted, otherwise these tests
could be subject to a pre-testing problem. However, for the subsequent estimation of levels effects
and short-run dynamics of real wage adjustments, the use of a more parsimonious specification
seems advisable. To this end we adopt the ARDL approach to the estimation of the level relations
discussed in Pesaran and Shin (1999).²⁹ First, the (estimated) orders of an ARDL($p, p_1, p_2, p_3, p_4$)
model in the five variables ($w_t$, $Prod_t$, $UR_t$, $Wedge_t$, $Union_t$) were selected by searching across
the $7^5 = 16{,}807$ ARDL models, spanned by $p = 0, 1, \ldots, 6$, and $p_i = 0, 1, \ldots, 6$, $i = 1, \ldots, 4$,
using the AIC criterion.³⁰ This resulted in the choice of an ARDL(6, 0, 5, 4, 5) specification with
estimates of the levels relationship given by
$$w_t = \underset{(0.050)}{1.063}\, Prod_t - \underset{(0.034)}{0.105}\, UR_t - \underset{(0.265)}{0.943}\, Wedge_t + \underset{(0.311)}{1.481}\, Union_t + \underset{(0.242)}{2.701} + \hat{v}_t \qquad (31)$$
where $\hat{v}_t$ is the equilibrium correction term, and the standard errors are given in parentheses.
All levels estimates are highly significant and have the expected signs. The coefficients of the
productivity and the wedge variables are not significantly different from unity in absolute value. In the Treasury's
earnings equation, the levels coefficient of the productivity variable is imposed as unity and the
above estimates can be viewed as providing empirical support for this a priori restriction. Our
levels estimates of the effects of the unemployment rate and the union variable on real wages,
namely −0.105 and 1.481, are also in line with the Treasury estimates of −0.09 and 1.31.³¹
The main difference between the two sets of estimates concerns the levels coefficient of the
wedge variable. We obtain a much larger estimate, almost twice that obtained by the Treasury.
Setting the levels coefficients of the $Prod_t$ and $Wedge_t$ variables to unity provides the alternative
interpretation that the share of wages (net of taxes and computed using RPIX rather than the
implicit GDP deflator) has varied negatively with the rate of unemployment and positively with
union strength.³²
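As a rough check on the statement that the productivity and wedge coefficients are close to unity, one can form simple t-ratios from the point estimates and standard errors reported in (31). This is a back-of-the-envelope sketch only: it works with magnitudes, and comparing the ratios with the conventional 1.96 threshold assumes an approximately normal reference distribution, which is an assumption of this illustration rather than a statement from the text.

```python
def t_ratio(estimate: float, hypothesized: float, std_error: float) -> float:
    """Distance of an estimate from a hypothesized value, in standard errors."""
    return (estimate - hypothesized) / std_error


# Magnitudes of the levels coefficients and their standard errors from (31).
t_prod = t_ratio(1.063, 1.0, 0.050)   # productivity against unity
t_wedge = t_ratio(0.943, 1.0, 0.265)  # wedge (in absolute value) against unity

print(round(t_prod, 2), round(t_wedge, 2))
```

Both ratios are well below 1.96 in absolute value, which is consistent with the unit restrictions discussed above.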
The conditional ECM regression associated with the above level relationship is given in
Table III.³³ These estimates provide further direct evidence on the complicated dynamics that seem
to exist between real wage movements and their main determinants.³⁴ All five lagged changes in
real wages are statistically significant, further justifying the choice of $p = 6$. The equilibrium
correction coefficient is estimated as −0.229 (0.0586) which is reasonably large and highly
significant.³⁵ The auxiliary equation of the autoregressive part of the estimated conditional ECM
has real roots 0.9231 and 0.9095 and two pairs of complex roots with moduli 0.7589 and 0.6381,
which suggests an initially cyclical real wage process that slowly converges towards the equilibrium
described by (31).³⁶ The regression fits reasonably well and passes the diagnostic tests against non-
normal errors and heteroscedasticity. However, it fails the functional form misspecification test at
29 Note that the ARDL approach advanced in Pesaran and Shin (1999) is applicable irrespective of whether the regressors
are purely I(0), purely I(1) or mutually cointegrated.
30 For further details, see Section 18.19 and Lesson 16.5 in Pesaran and Pesaran (1997).
31 CSW do not report standard errors for the levels estimates of the Treasury earnings equation.
32 We are grateful to a referee for drawing our attention to this point.
33 Clearly, it is possible to simplify the model further, but this would go beyond the remit of this section which is first to
test for the existence of a level relationship using an unrestricted ARDL specification and, second, if we are satisfied that
such a levels relationship exists, to select a parsimonious specification.
34 The standard errors of the estimates reported in Table III allow for the uncertainty associated with the estimation of the
levels coefficients. This is important in the present application where it is not known with certainty whether the regressors
are purely I(0), purely I(1) or mutually cointegrated. It is only in the case when it is known for certain that all regressors
are I(1) that it would be reasonable in large samples to treat these estimates as known because of their super-consistency.
35 The equilibrium correction coefficient in the Treasury's earnings equation is estimated to be −0.1848 (0.0528), which
is smaller in absolute value than our estimate; see p. 11 in Annex of CSW. This seems to be because of the shorter lag lengths used in the
Treasury's specification rather than the shorter time period 1971q1–1994q3. Note also that the t-ratio reported for this
coefficient does not have the standard t-distribution; see Theorem 3.2.
36 The complex roots are 0.34293 ± 0.67703i and 0.17307 ± 0.61386i, where $i = \sqrt{-1}$.
the 0.05 level which may be linked to the presence of some non-linear effects or asymmetries in
the adjustment of the real wage process that our linear specification is incapable of taking into
account.³⁷ Recursive estimation of the conditional ECM and the associated cumulative sum and
cumulative sum of squares plots also suggest that the regression coefficients are generally stable
over the sample period. However, these tests are known to have low power and, thus, may have
missed important breaks. Overall, the conditional ECM earnings equation presented in Table III
has a number of desirable features and provides a sound basis for further research.
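The moduli quoted for the complex roots can be reproduced from the root values given in footnote 36. The snippet below is a minimal check using the published (rounded) digits as printed, so agreement is expected only up to rounding; the variable names are illustrative.

```python
# Roots of the autoregressive part of the conditional ECM, as quoted in the text.
complex_roots = [complex(0.34293, 0.67703), complex(0.17307, 0.61386)]
real_roots = [0.9231, 0.9095]

# The modulus of a + bi is sqrt(a^2 + b^2); conjugate pairs share the same modulus.
moduli = [abs(z) for z in complex_roots]
dominant = max([abs(r) for r in real_roots] + moduli)

print([round(m, 4) for m in moduli], dominant)
```

All moduli lie inside the unit circle, and the dominant root of about 0.92 is what drives the slow convergence towards the level relationship noted in the text.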
$\bar{R}^2 = 0.5589$, $\hat{\sigma} = 0.0083$, AIC = 339.57, SBC = 302.55,
$\chi^2_{SC}(4) = 8.74\,[0.068]$, $\chi^2_{FF}(1) = 4.86\,[0.027]$,
$\chi^2_{N}(2) = 0.01\,[0.993]$, $\chi^2_{H}(1) = 0.66\,[0.415]$.
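The bracketed figures next to the diagnostic statistics can be read as p-values of chi-square variates with 4, 1, 2 and 1 degrees of freedom (serial correlation, functional form, normality and heteroscedasticity, respectively). The sketch below recomputes them with closed-form tail probabilities from the standard library. It covers only the degrees of freedom needed here, and small discrepancies are to be expected because the statistics are reported to two decimals. The last lines back out the number of estimated coefficients implied by the reported AIC and SBC, using the definitions in the notes to Table I and $T = 104$.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate (df = 1, 2 or 4 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 4:
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("this sketch only implements df in {1, 2, 4}")


# (statistic, degrees of freedom, reported p-value)
reported = {
    "SC(4)": (8.74, 4, 0.068),
    "FF(1)": (4.86, 1, 0.027),
    "N(2)": (0.01, 2, 0.993),
    "H(1)": (0.66, 1, 0.415),
}
for name, (stat, df, p_reported) in reported.items():
    print(name, round(chi2_sf(stat, df), 3), p_reported)

# AIC - SBC = s * (0.5 * ln T - 1), so the coefficient count s can be recovered.
s_implied = (339.57 - 302.55) / (0.5 * math.log(104) - 1.0)
print(round(s_implied, 2))
```

The implied coefficient count is close to an integer, which suggests that the reported AIC and SBC are mutually consistent under these definitions.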
37 The conditional ECM regression in Table III also passes the test against residual serial correlation but, as the model
was specified to deal with this problem, it should not therefore be given any extra credit!
6. CONCLUSIONS
Empirical analysis of level relationships has been an integral part of time series econometrics
and pre-dates the recent literature on unit roots and cointegration.³⁸ However, the emphasis of this
earlier literature was on the estimation of level relationships rather than testing for their presence (or
otherwise). Cointegration analysis attempts to fill this vacuum, but, typically, under the relatively
restrictive assumption that the regressors, $x_t$, entering the determination of the dependent variable of
interest, $y_t$, are all integrated of order 1 or more. This paper demonstrates that the problem of testing
for the existence of a level relationship between $y_t$ and $x_t$ is non-standard even if all the regressors
under consideration are I(0) because, under the null hypothesis of no level relationship between $y_t$
and $x_t$, the process describing $y_t$ is I(1), irrespective of whether the regressors $x_t$ are
purely I(0), purely I(1) or mutually cointegrated. The asymptotic theory developed in this paper
provides a simple univariate framework for testing the existence of a single level relationship
between $y_t$ and $x_t$ when it is not known with certainty whether the regressors are purely I(0),
purely I(1) or mutually cointegrated.³⁹ Moreover, it is unnecessary that the order of integration
of the underlying regressors be ascertained prior to testing the existence of a level relationship
between $y_t$ and $x_t$. Therefore, unlike typical applications of cointegration analysis, this method is
not subject to this particular kind of pre-testing problem. The application of the proposed bounds
testing procedure to the UK earnings equation highlights this point, where one need not take an a
priori position as to whether, for example, the rate of unemployment or the union density variable
is I(1) or I(0).
The analysis of this paper is based on a single-equation approach. Consequently, it is inappropriate
in situations where there may be more than one level relationship involving $y_t$. An extension of
this paper and those of HJNR and PSS to deal with such cases is part of our current research, but
the consequent theoretical developments will require the computation of further tables of critical
values.
38 For an excellent review of this early literature, see Hendry et al. (1984).
39 Of course, the system approach developed by Johansen (1991, 1995) can also be applied to a set of variables containing
possibly a mixture of I(0) and I(1) regressors.
where $(\beta_y^\perp, \beta^\perp)$ is a $(k+1, k-r+1)$ matrix whose columns are a basis for the orthogonal
complement of $\beta$. Hence, $(\beta, \beta_y^\perp, \beta^\perp)$ is a basis for $R^{k+1}$. Let $\xi$ be the $(k+2)$-unit vector $(1, 0')'$.
Then, $(\beta_*, \xi, \delta)$ is a basis for $R^{k+2}$. It therefore follows that
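where $z^*_t = (t, z_t')'$, $B_{k+1}(a)$ is a $(k+1)$-vector Brownian motion with variance matrix $\Omega$ and $[Ta]$
denotes the integer part of $Ta$, $a \in [0, 1]$; see Phillips and Solo (1992, Theorem 3.15, p. 983). Also,
$T^{-1}\xi' z^*_t = T^{-1} t \Rightarrow a$. Similarly, noting that $\beta' C = 0$, we have that $\beta_*' z^*_t = \beta'\mu + \beta' C^*(L)\varepsilon_t =
O_P(1)$. Hence, from Phillips and Solo (1992, Theorem 3.16, p. 983), defining $\tilde{Z}^*_{-1} \equiv P_\iota Z^*_{-1}$ and
$\Delta\tilde{Z}_- \equiv P_\iota \Delta Z_-$, it follows that
$$T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}\beta_* = O_P(1), \quad T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\Delta\tilde{Z}_- = O_P(1), \quad T^{-1}\Delta\tilde{Z}_-'\Delta\tilde{Z}_- = O_P(1),$$
$$T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}\beta_* = O_P(1), \quad T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\Delta\tilde{Z}_- = O_P(1), \qquad (A2)$$
where $B_T \equiv (\delta, T^{-1/2}\xi)$. Similarly, defining $\tilde{u} \equiv P_\iota u$,
$$T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1}\tilde{u} = O_P(1), \quad T^{-1/2}\Delta\tilde{Z}_-'\tilde{u} = O_P(1). \qquad (A3)$$
Cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146).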
The next result follows from Phillips and Solo (1992, Theorem 3.15, p. 983); cf. Johansen
(1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146) and Phillips and Durlauf
(1986).
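Lemma A.1 Let $B_T \equiv (\delta, T^{-1/2}\xi)$ and define $G(a) = (G_1(a)', G_2(a))'$, where $G_1(a) \equiv (\beta_y^\perp, \beta^\perp)' C \tilde{B}_{k+1}(a)$,
$\tilde{B}_{k+1}(a)\,[= (\tilde{B}_1(a), \tilde{B}_k(a)')'] \equiv B_{k+1}(a) - \int_0^1 B_{k+1}(a)\,da$, and $G_2(a) \equiv a - \tfrac{1}{2}$, $a \in [0, 1]$.
Then
$$T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}B_T \Rightarrow \int_0^1 G(a)G(a)'\,da, \qquad T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u} \Rightarrow \int_0^1 G(a)\,d\tilde{B}^*_u(a),$$
where $\tilde{B}^*_u(a) \equiv \tilde{B}_1(a) - w'\tilde{B}_k(a)$ and $\tilde{B}_{k+1}(a) = (\tilde{B}_1(a), \tilde{B}_k(a)')'$, $a \in [0, 1]$.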
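Proof of Theorem 3.1 Under $H_0$ of (17), the Wald statistic $W$ of (21) can be written as
$$\hat{\omega}_{uu} W = \tilde{u}' P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\left(\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\right)^{-1}\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}$$
$$= \tilde{u}' P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\left(A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\right)^{-1}A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u},$$
where $A_T \equiv (T^{-1/2}\beta_*, T^{-1}B_T)$. Consider the matrix $A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T$. It follows from (A2)
and Lemma A.1 that
$$A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T = \begin{pmatrix} T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_* & 0' \\ 0 & T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}B_T \end{pmatrix} + o_P(1). \qquad (A4)$$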
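Next, consider $A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}$. From (A3) and Lemma A.1,
$$A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u} = \begin{pmatrix} T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u} \\ T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u} \end{pmatrix} + o_P(1). \qquad (A5)$$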
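Finally, the estimator for the error variance $\omega_{uu}$ (defined in the line after (21)),
$$\hat{\omega}_{uu} = (T-m)^{-1}\left[\tilde{u}'\tilde{u} - \tilde{u}' P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\left(A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\right)^{-1}A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}\right]$$
$$= (T-m)^{-1}\tilde{u}'\tilde{u} + o_P(1) = \omega_{uu} + o_P(1). \qquad (A6)$$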
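We consider each of the terms in the representation (A7) in turn. A central limit theorem allows us
to state
$$\left(T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\right)^{-1/2} T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}/\omega_{uu}^{1/2} \Rightarrow z_r \sim N(0, I_r).$$
Hence, the first term in (A7) converges in distribution to $z_r'z_r$, a chi-square random variable with
$r$ degrees of freedom; that is,
$$T^{-1}\tilde{u}' P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\left(T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\right)^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}/\omega_{uu} \Rightarrow z_r'z_r \sim \chi^2(r). \qquad (A8)$$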
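which, as $C = (\beta_y^\perp, \beta^\perp)[(\alpha_y^\perp, \alpha^\perp)'\Gamma(\beta_y^\perp, \beta^\perp)]^{-1}(\alpha_y^\perp, \alpha^\perp)'$, may be expressed as
$$\int_0^1 d\tilde{B}^*_u(a)\begin{pmatrix}(\alpha_y^\perp, \alpha^\perp)'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}'\left[\int_0^1 \begin{pmatrix}(\alpha_y^\perp, \alpha^\perp)'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}\begin{pmatrix}(\alpha_y^\perp, \alpha^\perp)'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}' da\right]^{-1}$$
$$\times \int_0^1 \begin{pmatrix}(\alpha_y^\perp, \alpha^\perp)'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix} d\tilde{B}^*_u(a)/\omega_{uu}.$$
Now, noting that under $H_0$ of (17) we may express $\alpha_y^\perp = (1, -w')'$ and $\alpha^\perp = (0, \alpha_{xx}^{\perp\prime})'$ where
$\alpha_{xx}^{\perp\prime}\alpha_{xx} = 0$, we define the $(k-r+1)$-vector of independent de-meaned standard Brownian
motions,
$$\tilde{W}_{k-r+1}(a)\,[= (\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'] \equiv [(\alpha_y^\perp, \alpha^\perp)'\Omega(\alpha_y^\perp, \alpha^\perp)]^{-1/2}(\alpha_y^\perp, \alpha^\perp)'\tilde{B}_{k+1}(a) = \begin{pmatrix}\omega_{uu}^{-1/2}\tilde{B}^*_u(a) \\ (\alpha_{xx}^{\perp\prime}\Omega_{xx}\alpha_{xx}^\perp)^{-1/2}\alpha_{xx}^{\perp\prime}\tilde{B}_k(a)\end{pmatrix},$$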
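where $\tilde{B}^*_u(a) = \tilde{B}_1(a) - w'\tilde{B}_k(a)$ is independent of $\tilde{B}_k(a)$ and $\tilde{B}_{k+1}(a) \equiv (\tilde{B}_1(a), \tilde{B}_k(a)')'$ is
partitioned according to $z_t = (y_t, x_t')'$, $a \in [0, 1]$. Hence, the second term in (A7) has the following
asymptotic representation:
$$\int_0^1 d\tilde{W}_u(a)\begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}'\left[\int_0^1 \begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}\begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}' da\right]^{-1}\int_0^1 \begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix} d\tilde{W}_u(a). \qquad (A9)$$
Note that $d\tilde{W}_u(a)$ in (A9) may be replaced by $dW_u(a)$, $a \in [0, 1]$. Combining (A8) and (A9) gives
the result of Theorem 3.1.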
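For the remaining cases, we need only make minor modifications to the proof for Case IV.
In Case I, $\delta = (\beta_y^\perp, \beta^\perp)$ with $(\beta, \beta_y^\perp, \beta^\perp)$ a basis for $R^{k+1}$ and $B_T = \delta$. For Case II, where
$Z^*_{-1} = (\iota_T, Z_{-1})$, we have
$$\beta_* = \begin{pmatrix}-\mu' \\ I_{k+1}\end{pmatrix}\beta$$
and, consequently, we define $\xi$ as in Case IV,
$$\delta = \begin{pmatrix}-\mu' \\ I_{k+1}\end{pmatrix}(\beta_y^\perp, \beta^\perp) \quad \text{and} \quad B_T = (\delta, \xi).$$
Case III is similar to Case I as is Case V.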
Proof of Corollary 3.1 Follows immediately from Theorem 3.1 by setting $r = k$.
Proof of Corollary 3.2 Follows immediately from Theorem 3.1 by setting $r = 0$.
Proof of Theorem 3.2 We provide a proof for Case V which may be simply adapted for Cases I
and III. To emphasize the potential dependence of the limit distribution on nuisance parameters,
the proof is initially conducted under Assumptions 1–4 together with Assumption 5a which implies
$H_0^{\pi_{yy}}: \pi_{yy} = 0$ but not necessarily $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$; in particular, note that we may write $\alpha_y^\perp =
(1, -\phi')'$ for some $k$-vector $\phi$. The t-statistic for $H_0^{\pi_{yy}}: \pi_{yy} = 0$ may be expressed as the square
root of 1
0P
y 0 0
A0T Ẑ01 P
Z ,X̂1
Ẑ 1 A T A Ẑ P
T 1 Z Ẑ 1 A T Z ,X̂1 y/ωO uu A10
Under the conditions of the theorem, f D w and l2xy D 0 and, therefore, BO u2 a[D BO uŁ a] D
0 2
1/2 O
ωuu Wu a and a? ?0 ?0 ? 1/2
xx B̂k a[D axx B̂k a] D axx Zxx axx Ŵkr a, a 2 [0, 1].
Proof of Corollary 3.3 Follows immediately from Theorem 3.2 by setting $r = k$.
Proof of Corollary 3.4 Follows immediately from Theorem 3.2 by setting $r = 0$.
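Proof of Theorem 4.1 Again, we consider Case IV; the remaining Cases I–III and V may be
dealt with similarly. Under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, Assumption 5b holds and, thus, $\Pi = \alpha_y\beta_y' + \alpha\beta'$ where
$\alpha_y = (\alpha_{yy}, 0')'$ and $\beta_y = (\beta_{yy}, \beta_{yx}')'$; see above Assumption 5b. Under Assumptions 1–4 and 5b,
the process $\{z_t\}_{t=1}^{\infty}$ has the infinite moving-average representation, $z_t = \mu + \gamma t + C s_t + C^*(L)\varepsilon_t$,
where now $C \equiv \beta^\perp[\alpha^{\perp\prime}\Gamma\beta^\perp]^{-1}\alpha^{\perp\prime}$. We redefine $\beta_*$ and $\delta$ as the $(k+2, r+1)$ and $(k+2, k-r)$
matrices,
$$\beta_* \equiv \begin{pmatrix}-\gamma' \\ I_{k+1}\end{pmatrix}(\beta_y, \beta) \quad \text{and} \quad \delta \equiv \begin{pmatrix}-\gamma' \\ I_{k+1}\end{pmatrix}\beta^\perp,$$
where $\beta^\perp$ is a $(k+1, k-r)$ matrix whose columns are a basis for the orthogonal complement of
$(\beta_y, \beta)$. Hence, $(\beta_y, \beta, \beta^\perp)$ is a basis for $R^{k+1}$ and, thus, $(\beta_*, \xi, \delta)$ a basis for $R^{k+2}$, where again
$\xi$ is the $(k+2)$-unit vector $(1, 0')'$. It therefore follows that
$$T^{-1/2}\delta' z^*_{[Ta]} = T^{-1/2}\beta^{\perp\prime}\mu + T^{-1/2}\beta^{\perp\prime} C s_{[Ta]} + \beta^{\perp\prime} T^{-1/2} C^*(L)\varepsilon_{[Ta]} \Rightarrow \beta^{\perp\prime} C B_{k+1}(a).$$
Also, as above, $T^{-1}\xi' z^*_t = T^{-1} t \Rightarrow a$ and $\beta_*' z^*_t = (\beta_y, \beta)'\mu + (\beta_y, \beta)' C^*(L)\varepsilon_t = O_P(1)$.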
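The Wald statistic (21) multiplied by $\hat{\omega}_{uu}$ may be written as
$$\tilde{u}' P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\left(A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\right)^{-1}A_T'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u} + 2\lambda_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u} + \lambda_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\lambda_*, \qquad (B1)$$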
where $\lambda_* \equiv \beta_*(\alpha_y, \alpha)'(1, -w')'$, $A_T \equiv (T^{-1/2}\beta_*, T^{-1}B_T)$ and $B_T \equiv (\delta, T^{-1/2}\xi)$. Note that (A6)
continues to hold under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$. A similar argument to that in the Proof of Theorem 3.1
demonstrates that the first term in (B1) divided by $\omega_{uu}$ has the limiting representation
$$z_{r+1}'z_{r+1} + \int_0^1 dW_u(a)F_{k-r}(a)'\left[\int_0^1 F_{k-r}(a)F_{k-r}(a)'\,da\right]^{-1}\int_0^1 F_{k-r}(a)\,dW_u(a), \qquad (B2)$$
where $z_{r+1} \sim N(0, I_{r+1})$, $F_{k-r}(a) = (\tilde{W}_{k-r}(a)', a - \tfrac{1}{2})'$ and $\tilde{W}_{k-r}(a) \equiv (\alpha_{xx}^{\perp\prime}\Omega_{xx}\alpha_{xx}^\perp)^{-1/2}\alpha_{xx}^{\perp\prime}\tilde{B}_k(a)$
is a $(k-r)$-vector of de-meaned independent standard Brownian motions independent of the
standard Brownian motion $W_u(a)$, $a \in [0, 1]$; cf. (22). Now, $\int_0^1 F_{k-r}(a)\,dW_u(a)$ is mixed normal
with conditional variance matrix $\int_0^1 F_{k-r}(a)F_{k-r}(a)'\,da$. Therefore, the second term in (B2) is
unconditionally distributed as a $\chi^2(k-r)$ random variable and is independent of the first term; cf.
(A4). Hence, the first term in (B1) divided by $\omega_{uu}$ has a limiting $\chi^2(k+1)$ distribution.
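The second term in (B1) may be written as
$$2(1, -w')(\alpha_y, \alpha)\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u} = 2T^{1/2}(1, -w')(\alpha_y, \alpha)\left(T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}\right) = O_P(T^{1/2}), \qquad (B3)$$
as $T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*$ converges in probability to a positive definite matrix. Moreover, as
$(1, -w')(\alpha_y, \alpha) \neq 0'$ under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, the Theorem is proved.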
Proof of Theorem 4.2 A similar decomposition to (B1) for the Wald statistic (21) holds under
$H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$ except that $\beta_*$ and $\delta$ are now as defined in the Proof of Theorem 3.1. Although
$H_0^{\pi_{yy}}: \pi_{yy} = 0$ holds, we have $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$. Therefore, as in Theorem 3.2, note that we may
write $\alpha_y^\perp = (1, -\phi')'$ for some $k$-vector $\phi \neq w$. Consequently, the first term divided by $\omega_{uu}$ may be
written as
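$$T^{-1}\tilde{u}' P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\left(T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\right)^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1} P_{\Delta\tilde{Z}_-}\tilde{u}/\omega_{uu}$$
$$+\; T^{-2}\tilde{u}'\tilde{Z}^*_{-1}B_T\left(T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}B_T\right)^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u}/\omega_{uu} + o_P(1); \qquad (B5)$$
cf. (A7). As in the Proof of Theorem 3.1, the first term of (B5) has the limiting representation $z_r'z_r$
where $z_r \sim N(0, I_r)$; cf. (22). The second term of (B5) has the limiting representation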
Q2 1
1 Bu a 0
1 BQ u2 a Q2
Bu a 0
dBQ uŁ a a? 0
xx B̃k a
a? 0
xx B̃k a a? 0
xx B̃k a da
1 1
0 a 2 0 a 2 a 12
1 BQ u2 a
ð a? 0
xx B̃k a dBQ uŁ a/ωuu D OP 1
0 1
a 2
where BQ uf a BQ 1 a f0 B̃k a, a 2 [0, 1]; cf. Proof of Theorem 3.2. The second term of (B1)
becomes
0
1/2 1/2 0 Ł0
21, w0 ab0Ł Z̃Ł1 PZ ũ D 2T 1, w0
a T b Z̃ P
Ł 1 Z ũ D OP T1/2
0
ð T1 b0Ł Z̃Ł1 PZ 1Z̃ Ł
b Ł a0 1, w0 0 D OP T
! p
The Theorem follows as 1, w0 a 6D 00 under H0 yy : !yy D 0 and H1 yx.x : pyx.x 6D 00 .
Proof of Theorem 4.3 We concentrate on Case IV; the remaining Cases I–III and V are
proved by a similar argument. Let fztT gTtD1 denote the process under H1T of (26). Hence,
8LztT m gt D xtT , where xtT 5T 5[zt1T m gt 1] C et and 5T 5 is
given in (27). Therefore, ztT ) gt D CxtT C CŁ LxtT , Cz D C C 1 zCŁ z and
?
C D b? ? ? ? 0 ? 1 ? ? 0
y , b [ay , a 0(by , b )] ay , a , and thus,
[IkC1 IkC1 C T1 Cay b0y L]ztT m gt D CetT C CŁ LxtT B6
where
dyx
etT T1/2 b0 [zt1T m gt 1] C et , t D 1, . . . , T, T D 1, 2, . . .
dxx
Note that xtT D 5T 5[zt1T m gt 1] C et . It therefore follows that T1/2 d0 zŁ[Ta]T
a
) b? ? 0 Ł 0 0
y , b CJkC1 a, where d is defined above Lemma A.1 and ztT D t, ztT , JkC1 a 0 exp
0
fay by Ca rgdBkC1 r is an Ornstein-Uhlenbeck process and BkC1 a is a k C 1-vector Brow-
nian motion with variance matrix Z, a 2 [0, 1]; cf. Johansen (1995, Theorem 14.1, p. 202).
Similarly to (A4),
1 0 Ł0 Ł
T bŁ Z̃1 PZ Z̃1 bŁ 00
A0T Z̃01 P Z̃ A
Z 1 T D 0 C oP 1
0 T2 B0T Z̃Ł1 Z̃Ł1 BT
Therefore, expression (B1) for the Wald statistic (21) multiplied by ωO uu is revised to
1
% 0 yP
ωO uu W D T1 Ł
T 1 0 Ł0 Ł 0
b0Ł Z̃Ł1 P
Z
Z̃ 1 b Ł b Ł Z̃ 1 P Z
Z̃ 1 b Ł Z y
1
C T2 % 0 yP Ł
T 2 0 Ł0 Ł 0
B0T Z̃Ł1 P
Z̃
Z 1 T
B B Z̃ Z̃
T 1 1 T B Z y C oP 1 B7
where pŁyT T1 ˛yy b0yŁ C T1/2 dyx w0 dxx b0Ł . Defining h dyx w0 dxx 0 , consider
0 0 0
T1/2 b0Ł Z̃Ł1 P Ł Ł
Z Z̃1 pyT D T
1/2 0 Ł Ł 1
Z Z̃1 byŁ ˛yy T C bŁ hT
bŁ Z̃1 P 1/2
0
D T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ h C oP 1 B9
where we have made use of T1/2 b0yŁ zŁ[Ta]T ) b0y CJkC1 a. Therefore, (B8) divided by ωuu may be
re-expressed as
0
0
1/2 0 Ł0
T1/2 b0Ł Z̃Ł1 P
Z ũ C Qh Q 1
T b Z̃
Ł 1 P Z ũ C Qh /ωuu C oP 1 D z0r zr C oP 1
B9
1 0 Ł0 Ł 1/2
where Q p limT!1 T bŁ Z̃1 P Z Z̃1 bŁ and zr ¾ NQ h, Ir .
As Ł Ł0 0
T1 B0T Z̃Ł1 P 1 0 Ł0 Ł Ł0
P Z y D P Z Z̃1 pyT C ũ, Z0 y D T B0 T Z̃1 PZ Z̃1 pyT C ũ.
Consider the second term in (B7), in particular, T1 B0T Z̃Ł1 P Ł Ł
Z Z̃1 pyT which after substitution
Ł
for pyT becomes
0 0 0
T2 B0T Z̃Ł1 P Ł
Z Z̃1 byŁ ˛yy C T
3/2 0 Ł
BT Z̃1 P Ł 2 0 Ł
Z Z̃1 bŁ h D T BT Z̃1 P
Ł
Z Z̃1 byŁ ˛yy C oP 1
1 ? ? 0
by , b CJ̃kC1 a
) 1 J̃kC1 a0 C0 by ˛yy da
0 a 2
Therefore,
1
0
b? ? 0
y , b CJ̃kC1 a 1/2 Q
T1 B0T Z̃Ł1 P
Z y ) ωuu dWu a C J̃kC1 a0 C0 by ˛yy da
0 a 12
Consider
by ; cf. Johansen (1995, Theorem 14.4, p. 207). Note that the first element of J̃ŁkrC1 a satisfies
QJŁu a D WQ u a C ωuu 0 a Ł
1/2
˛yy b 0 J̃krC1 r dr and dJQ Łu a D dWQ u a C ωuu
1/2
˛yy b0 JQ ŁkrC1 a da.
Therefore,
1
1 0
b? ? 0
y , b CJ̃kC1 a 1/2 Q Ł
T B0T Z̃Ł1 P
Z Y ) ωuu dJu a
0 a 12
Hence, the second term in (B7) weakly converges to
1
1 1
1
ωuu dJQ Łu aFkrC1 a0 FkrC1 aFkrC1 a da 0
FkrC1 a dJQ Łu a B10
0 0 0
Proof of Theorem 4.4 We consider Case V; the remaining Cases I and III may be dealt with
!
similarly. Under H1 yy : !yy 6D 0, from (10), ŷ1 D X̂1 q C v̂1 , where v̂1 P Z ,X̂1 v1 and
0 0
v1 D 0, v1 , . . . , vT1 . Therefore, ŷ1 P 0 0
Z ,X̂1 y D v̂1 P
Z ,X̂1 Y and ŷ1 P
Z ,X̂1 ŷ1 D
0
Z ,X̂1 v̂1 .
v̂1 P
As in Appendix A,
T1/2 b? 0
xx x[Ta] D T
1/2 ? 0
bxx mx C T1/2 b? 0
xx gx t C T
1/2 ? 0 ?
bxx bxx a?0 0b? 1 a?0 s[Ta]
C 0, b? 0
xx T
1/2 Ł
C Le[Ta]
T1 ŷ01 P
Z ŷ1 D T1 v̂01 P
Z v̂1 T1 v̂01 P
Z ? v̂1 C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1 v̂01 P
Z v̂1 C oP 1
,X̂1 bxx
0 0 1 0 0
where P
Z ,X̂1 bxx P
Z PZ X̂1 bxx bxx X̂1 P
Z X̂1 bxx bxx X̂1 P
Z and P
Z ,X̂1 b?xx
? ?0 0 ? 1 ? 0 0 1 0
Z X̂1 bxx bxx X̂1 P
P
Z X̂1 bxx bxx X̂1 P
Z . Therefore, as T v̂1 v̂1 D OP 1,
T1 ŷ01 P
Z ŷ1 D OP 1 B11
,X̂1
T1/2 v̂01 P
Z û D T1/2 v̂01 P
Z û T1/2 v̂01 P
Z ? û C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1/2 v̂01 P
Z û C oP 1 D OP 1
,X̂1 bxx
T1 v̂01 P
Z Ẑ1 l D T1 v̂01 P
Z Ẑ1 l T1 v̂01 P
Z ? Ẑ1 l C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1 v̂01 P
Z Ẑ1 l C oP 1 D OP 1
,X̂1 bxx
T1/2 v̂01 P
Z Ẑ1 l D OP T1/2 . B12
,X̂1
Because ωO uu ωuu D oP 1, combining (B11) and (B12) yields the desired result.
ACKNOWLEDGEMENTS
We are grateful to the Editor (David Hendry) and three anonymous referees for their helpful
comments on an earlier version of this paper. Our thanks are also owed to Michael Binder, Peter
Burridge, Clive Granger, Brian Henry, Joon-Yong Park, Ron Smith, Rod Whittaker and seminar
participants at the University of Birmingham. Partial financial support from the ESRC (grant Nos
R000233608 and R000237334) and the Isaac Newton Trust of Trinity College, Cambridge, is
gratefully acknowledged. Previous versions of this paper appeared as DAE Working Paper Series,
Nos. 9622 and 9907, University of Cambridge.
REFERENCES
Banerjee A, Dolado J, Galbraith JW, Hendry DF. 1993. Co-Integration, Error Correction, and the Econo-
metric Analysis of Non-Stationary Data. Oxford University Press: Oxford.
Banerjee A, Dolado J, Mestre R. 1998. Error-correction mechanism tests for cointegration in single-equation
framework. Journal of Time Series Analysis 19: 267–283.
Banerjee A, Galbraith JW, Hendry DF, Smith GW. 1986. Exploring equilibrium relationships in economet-
rics through static models: some Monte Carlo Evidence. Oxford Bulletin of Economics and Statistics 48:
253–277.
Blanchard OJ, Summers L. 1986. Hysteresis and the European Unemployment Problem. In NBER Macroe-
conomics Annual 15–78.
Boswijk P. 1992. Cointegration, Identification and Exogeneity: Inference in Structural Error Correction
Models. Tinbergen Institute Research Series.
Boswijk HP. 1994. Testing for an unstable root in conditional and structural error correction models. Journal
of Econometrics 63: 37–70.
Boswijk HP. 1995. Efficient inference on cointegration parameters in structural error correction models.
Journal of Econometrics 69: 133–158.
Cavanagh CL, Elliott G, Stock JH. 1995. Inference in models with nearly integrated regressors. Econometric
Theory 11: 1131–1147.
Chan A, Savage D, Whittaker R. 1995. The new treasury model. Government Economic Series Working
Paper No. 128, (Treasury Working Paper No. 70).
Darby J, Wren-Lewis S. 1993. Is there a cointegrating vector for UK wages? Journal of Economic Studies
20: 87–115.
Dickey DA, Fuller WA. 1979. Distribution of the estimators for autoregressive time series with a unit root.
Journal of the American Statistical Association 74: 427–431.
Dickey DA, Fuller WA. 1981. Likelihood ratio statistics for autoregressive time series with a unit root.
Econometrica 49: 1057–1072.
Engle RF, Granger CWJ. 1987. Cointegration and error correction representation: estimation and testing.
Econometrica 55: 251–276.
Granger CWJ, Lin J-L. 1995. Causality in the long run. Econometric Theory 11: 530–536.
Hansen BE. 1995. Rethinking the univariate approach to unit root testing: using covariates to increase power.
Econometric Theory 11: 1148–1171.
Harbo I, Johansen S, Nielsen B, Rahbek A. 1998. Asymptotic inference on cointegrating rank in partial
systems. Journal of Business and Economic Statistics 16: 388–399.
Hendry DF, Pagan AR, Sargan JD. 1984. Dynamic specification. In Handbook of Econometrics (Vol. II),
Griliches Z, Intriligator MD (eds). Elsevier: Amsterdam.
Johansen S. 1991. Estimation and hypothesis testing of cointegrating vectors in Gaussian vector autoregres-
sive models. Econometrica 59: 1551–1580.
Johansen S. 1992. Cointegration in partial systems and the efficiency of single-equation analysis. Journal of
Econometrics 52: 389–402.
Johansen S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford Uni-
versity Press: Oxford.
Kremers JJM, Ericsson NR, Dolado JJ. 1992. The power of cointegration tests. Oxford Bulletin of Economics
and Statistics 54: 325–348.
Layard R, Nickell S, Jackman R. 1991. Unemployment: Macroeconomic Performance and the Labour
Market. Oxford University Press: Oxford.
Lindbeck A, Snower D. 1989. The Insider Outsider Theory of Employment and Unemployment, MIT Press:
Cambridge, MA.
Manning A. 1993. Wage bargaining and the Phillips curve: the identification and specification of aggregate
wage equations. Economic Journal 103: 98–118.
Nickell S, Andrews M. 1983. Real wages and employment in Britain. Oxford Economic Papers 35: 183–206.
Nielsen B, Rahbek A. 1998. Similarity issues in cointegration analysis. Preprint No. 7, Department of
Theoretical Statistics, University of Copenhagen.
Park JY. 1990. Testing for unit roots by variable addition. In Advances in Econometrics: Cointegration,
Spurious Regressions and Unit Roots, Fomby TB, Rhodes RF (eds). JAI Press: Greenwich, CT.
Pesaran MH, Pesaran B. 1997. Working with Microfit 4.0: Interactive Econometric Analysis, Oxford Univer-
sity Press: Oxford.
Pesaran MH, Shin Y. 1999. An autoregressive distributed lag modelling approach to cointegration analysis.
Chapter 11 in Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial
Symposium, Strom S (ed.). Cambridge University Press: Cambridge.
Pesaran MH, Shin Y, Smith RJ. 2000. Structural analysis of vector error correction models with exogenous
I(1) variables. Journal of Econometrics 97: 293–343.
Phillips AW. 1958. The relationship between unemployment and the rate of change of money wage rates in
the United Kingdom, 1861–1957. Economica 25: 283–299.
Phillips PCB, Durlauf S. 1986. Multiple time series with integrated variables. Review of Economic Studies
53: 473–496.
Phillips PCB, Ouliaris S. 1990. Asymptotic properties of residual based tests for cointegration. Econometrica
58: 165–193.
Phillips PCB, Solo V. 1992. Asymptotics for linear processes. Annals of Statistics 20: 971–1001.
Rahbek A, Mosconi R. 1999. Cointegration rank inference with stationary regressors in VAR models. The
Econometrics Journal 2: 76–91.
Sargan JD. 1964. Real wages and prices in the U.K. In Econometric Analysis of National Economic Planning,
Hart PE, Mills G, Whittaker JK (eds). Macmillan: New York. Reprinted in Hendry DF, Wallis KF (eds)
Econometrics and Quantitative Economics. Basil Blackwell: Oxford; 275–314.
Copyright 2001 John Wiley & Sons, Ltd. J. Appl. Econ. 16: 289–326 (2001)
326 M. H. PESARAN, Y. SHIN AND R. J. SMITH
Shin Y. 1994. A residual-based test of the null of cointegration against the alternative of no cointegration.
Econometric Theory 10: 91–115.
Stock J, Watson MW. 1988. Testing for common trends. Journal of the American Statistical Association 83:
1097–1107.
Urbain JP. 1992. On weak exogeneity in error correction models. Oxford Bulletin of Economics and Statistics
52: 187–202.
JOURNAL OF APPLIED ECONOMETRICS
J. Appl. Econ. 16: 289–326 (2001)
DOI: 10.1002/jae.616
SUMMARY
This paper develops a new approach to the problem of testing the existence of a level relationship between
a dependent variable and a set of regressors, when it is not known with certainty whether the underlying
regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics
used to test the significance of the lagged levels of the variables in a univariate equilibrium correction
mechanism. The asymptotic distributions of these statistics are non-standard under the null hypothesis that
there exists no level relationship, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic
critical values are provided: one when all regressors are purely I(1) and the other if they are all purely
I(0). These two sets of critical values provide a band covering all possible classifications of the regressors
into purely I(0), purely I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are
proposed. It is shown that the proposed tests are consistent, and their asymptotic distributions under the null
and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is
demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric
model.
1. INTRODUCTION
Over the past decade considerable attention has been paid in empirical economics to testing for
the existence of relationships in levels between variables. In the main, this analysis has been
based on the use of cointegration techniques. Two principal approaches have been adopted: the
two-step residual-based procedure for testing the null of no-cointegration (see Engle and Granger,
1987; Phillips and Ouliaris, 1990) and the system-based reduced rank regression approach due to
Johansen (1991, 1995). In addition, other procedures such as the variable addition approach of Park
(1990), the residual-based procedure for testing the null of cointegration by Shin (1994), and the
stochastic common trends (system) approach of Stock and Watson (1988) have been considered.
All of these methods concentrate on cases in which the underlying variables are integrated of order
one. This inevitably involves a certain degree of pre-testing, thus introducing a further degree of
uncertainty into the analysis of levels relationships. (See, for example, Cavanagh, Elliott and Stock,
1995.)
This paper proposes a new approach to testing for the existence of a relationship between
variables in levels which is applicable irrespective of whether the underlying regressors are purely
I(0), purely I(1) or mutually cointegrated. The statistic underlying our procedure is the familiar
Wald or F-statistic in a generalized Dickey–Fuller type regression used to test the significance
of lagged levels of the variables under consideration in a conditional unrestricted equilibrium
correction model (ECM). It is shown that the asymptotic distributions of both statistics are
non-standard under the null hypothesis that there exists no relationship in levels between the
included variables, irrespective of whether the regressors are purely I(0), purely I(1) or mutually
cointegrated. We establish that the proposed test is consistent and derive its asymptotic distribution
under the null and suitably defined local alternatives, again for a set of regressors which are a
mixture of I(0)/I(1) variables.
* Correspondence to: M. H. Pesaran, Faculty of Economics and Politics, University of Cambridge, Sidgwick Avenue,
Cambridge CB3 9DD. E-mail: hashem.pesaran@econ.cam.ac.uk
Contract/grant sponsor: ESRC; Contract/grant numbers: R000233608; R000237334.
Contract/grant sponsor: Isaac Newton Trust of Trinity College, Cambridge.
Received 16 February 1999; Revised 13 February 2001
Two sets of asymptotic critical values are provided for the two polar cases which assume that all
the regressors are, on the one hand, purely I(1) and, on the other, purely I(0). Since these two sets
of critical values provide critical value bounds for all classifications of the regressors into purely
I(1), purely I(0) or mutually cointegrated, we propose a bounds testing procedure. If the computed
Wald or F-statistic falls outside the critical value bounds, a conclusive inference can be drawn
without needing to know the integration/cointegration status of the underlying regressors. However,
if the Wald or F-statistic falls inside these bounds, inference is inconclusive and knowledge of the
order of the integration of the underlying variables is required before conclusive inferences can be
made. A bounds procedure is also provided for the related cointegration test proposed by Banerjee
et al. (1998) which is based on earlier contributions by Banerjee et al. (1986) and Kremers et al.
(1992). Their test is based on the t-statistic associated with the coefficient of the lagged dependent
variable in an unrestricted conditional ECM. The asymptotic distribution of this statistic is obtained
for cases in which all regressors are purely I(1), which is the primary context considered by these
authors, as well as when the regressors are purely I(0) or mutually cointegrated. The relevant
critical value bounds for this t-statistic are also detailed.
The empirical relevance of the proposed bounds procedure is demonstrated in a re-examination
of the earnings equation included in the UK Treasury macroeconometric model. This is a
particularly relevant application because there is considerable doubt concerning the order of
integration of variables such as the degree of unionization of the workforce, the replacement
ratio (unemployment benefit–wage ratio) and the wedge between the ‘real product wage’ and the
‘real consumption wage’ that typically enter the earnings equation. There is another consideration
in the choice of this application. Under the influence of the seminal contributions of Phillips (1958)
and Sargan (1964), econometric analysis of wages and earnings has played an important role in
the development of time series econometrics in the UK. Sargan’s work is particularly noteworthy
as it is some of the first to articulate and apply an ECM to wage rate determination. Sargan,
however, did not consider the problem of testing for the existence of a levels relationship between
real wages and its determinants.
The relationship in levels underlying the UK Treasury’s earnings equation relates real average
earnings of the private sector to labour productivity, the unemployment rate, an index of union
density, a wage variable (comprising a tax wedge and an import price wedge) and the replacement
ratio (defined as the ratio of the unemployment benefit to the wage rate). These are the variables
predicted by the bargaining theory of wage determination reviewed, for example, in Layard
et al. (1991). In order to identify our model as corresponding to the bargaining theory of wage
determination, we require that the level of the unemployment rate enters the wage equation, but not
vice versa; see Manning (1993). This assumption, of course, does not preclude the rate of change
of earnings from entering the unemployment equation, or there being other level relationships
between the remaining four variables. Our approach accommodates both of these possibilities.
BOUNDS TESTING FOR LEVEL RELATIONSHIPS 291
A number of conditional ECMs in these five variables were estimated and we found that, if a
sufficiently high order is selected for the lag lengths of the included variables, the hypothesis that
there exists no relationship in levels between these variables is rejected, irrespective of whether
they are purely I(0), purely I(1) or mutually cointegrated. Given a level relationship between these
variables, the autoregressive distributed lag (ARDL) modelling approach (Pesaran and Shin, 1999)
is used to estimate our preferred ECM of average earnings.
The plan of the paper is as follows. The vector autoregressive (VAR) model which underpins
the analysis of this and later sections is set out in Section 2. This section also addresses the
issues involved in testing for the existence of relationships in levels between variables. Section 3
considers the Wald statistic (or the F-statistic) for testing the hypothesis that there exists no
level relationship between the variables under consideration and derives the associated asymptotic
theory together with that for the t-statistic of Banerjee et al. (1998). Section 4 discusses the power
properties of these tests. Section 5 describes the empirical application. Section 6 provides some
concluding remarks. The Appendices detail proofs of results given in Sections 3 and 4.
The following notation is used. The symbol $\Rightarrow$ signifies ‘weak convergence in probability
measure’, $\mathbf{I}_m$ ‘an identity matrix of order m’, $I(d)$ ‘integrated of order d’, $O_P(K)$ ‘of the same
order as K in probability’ and $o_P(K)$ ‘of smaller order than K in probability’.
Assumption 1 permits the elements of $\mathbf{z}_t$ to be purely I(1), purely I(0) or cointegrated but excludes
the possibility of seasonal unit roots and explosive roots.1 Assumption 2 may be relaxed somewhat
to permit $\{\boldsymbol{\varepsilon}_t\}_{t=1}^{\infty}$ to be a conditionally mean zero and homoscedastic process; see, for example,
PSS, Assumption 4.1.
We may re-express the lag polynomial $\boldsymbol{\Phi}(L)$ in vector equilibrium correction model (ECM)
form; i.e. $\boldsymbol{\Phi}(L) \equiv -\boldsymbol{\Pi}L + \boldsymbol{\Gamma}(L)(1 - L)$, in which the long-run multiplier matrix is defined by
$\boldsymbol{\Pi} \equiv -(\mathbf{I}_{k+1} - \sum_{i=1}^{p}\boldsymbol{\Phi}_i)$, and the short-run response matrix lag polynomial
$\boldsymbol{\Gamma}(L) \equiv \mathbf{I}_{k+1} - \sum_{i=1}^{p-1}\boldsymbol{\Gamma}_i L^i$, $\boldsymbol{\Gamma}_i = -\sum_{j=i+1}^{p}\boldsymbol{\Phi}_j$, $i = 1, \ldots, p-1$. Hence, the VAR(p) model (1) may be
rewritten in vector ECM form as
$$\Delta\mathbf{z}_t = \mathbf{a}_0 + \mathbf{a}_1 t + \boldsymbol{\Pi}\mathbf{z}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\Gamma}_i\Delta\mathbf{z}_{t-i} + \boldsymbol{\varepsilon}_t, \quad t = 1, 2, \ldots \tag{2}$$
1 Assumptions 5a and 5b below further restrict the maximal order of integration of $\{\mathbf{z}_t\}_{t=1}^{\infty}$ to unity.
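The mapping from the VAR coefficient matrices to the ECM matrices in (2) can be checked numerically. The following is a minimal sketch in Python with NumPy and is not part of the original paper: the function name `var_to_vecm` and the randomly drawn coefficient matrices are illustrative assumptions, and the deterministic terms are omitted for brevity.

```python
import numpy as np

def var_to_vecm(phis):
    """Map VAR(p) matrices [Phi_1, ..., Phi_p] to (Pi, [Gamma_1, ..., Gamma_{p-1}]).

    Pi      = -(I - sum_i Phi_i)
    Gamma_i = -sum_{j=i+1}^{p} Phi_j
    """
    n = phis[0].shape[0]
    pi = -(np.eye(n) - sum(phis))
    gammas = [-sum(phis[i:]) for i in range(1, len(phis))]
    return pi, gammas

# Hypothetical VAR(3) in k + 1 = 3 variables (coefficients drawn at random).
rng = np.random.default_rng(0)
phis = [0.3 * rng.standard_normal((3, 3)) for _ in range(3)]
pi, gammas = var_to_vecm(phis)

# One step of the recursion written in both forms, with the same shock.
z3, z2, z1 = rng.standard_normal((3, 3))   # z_{t-3}, z_{t-2}, z_{t-1}
e = rng.standard_normal(3)
z_var = phis[0] @ z1 + phis[1] @ z2 + phis[2] @ z3 + e
z_ecm = z1 + pi @ z1 + gammas[0] @ (z1 - z2) + gammas[1] @ (z2 - z3) + e

print(np.allclose(z_var, z_ecm))  # prints True
```

Both forms give the same value of z_t because the ECM is only a reparameterization of the VAR; no restriction is imposed at this stage.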
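where $u_t \sim IN(0, \omega_{uu})$, $\omega_{uu} \equiv \omega_{yy} - \boldsymbol{\omega}_{yx}\boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\omega}_{xy}$ and $u_t$ is independent of $\boldsymbol{\varepsilon}_{xt}$. Substitution of (4)
into (2) together with a similar partitioning of $\mathbf{a}_0 = (a_{y0}, \mathbf{a}_{x0}')'$, $\mathbf{a}_1 = (a_{y1}, \mathbf{a}_{x1}')'$, $\boldsymbol{\Pi} = (\boldsymbol{\pi}_y', \boldsymbol{\Pi}_x')'$,
$\boldsymbol{\Gamma} = (\boldsymbol{\gamma}_y', \boldsymbol{\Gamma}_x')'$, $\boldsymbol{\Gamma}_i = (\boldsymbol{\gamma}_{yi}', \boldsymbol{\Gamma}_{xi}')'$, $i = 1, \ldots, p-1$, provides a conditional model for $\Delta y_t$ in terms of
$\mathbf{z}_{t-1}$, $\Delta\mathbf{x}_t$, $\Delta\mathbf{z}_{t-1}, \ldots$; i.e. the conditional ECM
$$\Delta y_t = c_0 + c_1 t + \boldsymbol{\pi}_{y.x}\mathbf{z}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t, \quad t = 1, 2, \ldots \tag{5}$$
where $\boldsymbol{\omega} \equiv \boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\omega}_{xy}$, $c_0 \equiv a_{y0} - \boldsymbol{\omega}'\mathbf{a}_{x0}$, $c_1 \equiv a_{y1} - \boldsymbol{\omega}'\mathbf{a}_{x1}$, $\boldsymbol{\psi}_i' \equiv \boldsymbol{\gamma}_{yi} - \boldsymbol{\omega}'\boldsymbol{\Gamma}_{xi}$, $i = 1, \ldots, p-1$, and
$\boldsymbol{\pi}_{y.x} \equiv \boldsymbol{\pi}_y - \boldsymbol{\omega}'\boldsymbol{\Pi}_x$. The deterministic relations (3) are modified to
where $\boldsymbol{\gamma}_{y.x} \equiv \boldsymbol{\gamma}_y - \boldsymbol{\omega}'\boldsymbol{\Gamma}_x$.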
We now partition the long-run multiplier matrix $\boldsymbol{\Pi}$ conformably with $\mathbf{z}_t = (y_t, \mathbf{x}_t')'$ as
$$\boldsymbol{\Pi} = \begin{pmatrix} \pi_{yy} & \boldsymbol{\pi}_{yx} \\ \boldsymbol{\pi}_{xy} & \boldsymbol{\Pi}_{xx} \end{pmatrix}$$
2 See also Nielsen and Rahbek (1998) for an analysis of similarity issues in cointegrated systems.
$t = 1, 2, \ldots$, where
Under Assumption 4, from (7), we may express $\boldsymbol{\Pi}_{xx}$ as $\boldsymbol{\Pi}_{xx} = \boldsymbol{\alpha}_{xx}\boldsymbol{\beta}_{xx}'$, where $\boldsymbol{\alpha}_{xx}$ and $\boldsymbol{\beta}_{xx}$ are both
$(k, r)$ matrices of full column rank; see, for example, Engle and Granger (1987) and Johansen
(1991). If the maximal order of integration of the system (8) and (7) is unity, under Assumptions
1, 3 and 4, the process $\{\mathbf{x}_t\}_{t=1}^{\infty}$ is mutually cointegrated of order $r$, $0 \le r \le k$. However, in
contradistinction to, for example, Banerjee, Dolado and Mestre (1998), BDM henceforth, who
concentrate on the case $r = 0$, we do not wish to impose an a priori specification of $r$.6 When
$\boldsymbol{\pi}_{xy} = \mathbf{0}$ and $\boldsymbol{\Pi}_{xx} = \mathbf{0}$, then $\mathbf{x}_t$ is weakly exogenous for $\pi_{yy}$ and $\boldsymbol{\pi}_{yx.x} = \boldsymbol{\pi}_{yx}$ in (8); see, for example,
Johansen (1995, Theorem 8.1, p. 122). In the more general case where $\boldsymbol{\Pi}_{xx}$ is non-zero, as $\pi_{yy}$ and
$\boldsymbol{\pi}_{yx.x} = \boldsymbol{\pi}_{yx} - \boldsymbol{\omega}'\boldsymbol{\Pi}_{xx}$ are variation-free from the parameters in (7), $\mathbf{x}_t$ is also weakly exogenous for
the parameters of (8).
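Note that under Assumption 4 the maximal cointegrating rank of the long-run multiplier
matrix $\boldsymbol{\Pi}$ for the system (8) and (7) is $r + 1$ and the minimal cointegrating rank of $\boldsymbol{\Pi}$ is $r$. The
next assumptions provide the conditions for the maximal order of integration of the system (8)
and (7) to be unity. First, we consider the requisite conditions for the case in which rank$(\boldsymbol{\Pi}) = r$.
In this case, under Assumptions 1, 3 and 4, $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx} - \boldsymbol{\phi}'\boldsymbol{\Pi}_{xx} = \mathbf{0}'$ for some k-vector $\boldsymbol{\phi}$.
Note that $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ implies the latter condition. Thus, under Assumptions 1, 3 and 4, $\boldsymbol{\Pi}$ has rank
$r$ and is given by
$$\boldsymbol{\Pi} = \begin{pmatrix} 0 & \boldsymbol{\pi}_{yx} \\ \mathbf{0} & \boldsymbol{\Pi}_{xx} \end{pmatrix}$$
Hence, we may express $\boldsymbol{\Pi} = \boldsymbol{\alpha}\boldsymbol{\beta}'$ where $\boldsymbol{\alpha} = (\boldsymbol{\alpha}_{yx}', \boldsymbol{\alpha}_{xx}')'$ and $\boldsymbol{\beta} = (\mathbf{0}, \boldsymbol{\beta}_{xx}')'$ are $(k+1, r)$ matrices of
full column rank; cf. HJNR, p. 390. Let the columns of the $(k+1, k-r+1)$ matrices $(\boldsymbol{\alpha}_y^{\perp}, \boldsymbol{\alpha}^{\perp})$
and $(\boldsymbol{\beta}_y^{\perp}, \boldsymbol{\beta}^{\perp})$, where $\boldsymbol{\alpha}_y^{\perp}$, $\boldsymbol{\beta}_y^{\perp}$ and $\boldsymbol{\alpha}^{\perp}$, $\boldsymbol{\beta}^{\perp}$ are respectively $(k+1)$-vectors and $(k+1, k-r)$
matrices, denote bases for the orthogonal complements of respectively $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$; in particular,
$(\boldsymbol{\alpha}_y^{\perp}, \boldsymbol{\alpha}^{\perp})'\boldsymbol{\alpha} = \mathbf{0}$ and $(\boldsymbol{\beta}_y^{\perp}, \boldsymbol{\beta}^{\perp})'\boldsymbol{\beta} = \mathbf{0}$.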
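Assumptions 1, 3, 4 and 5a and 5b permit the two polar cases for $\{\mathbf{x}_t\}_{t=1}^{\infty}$. First, if $\{\mathbf{x}_t\}_{t=1}^{\infty}$ is a
purely I(0) vector process, then $\boldsymbol{\Pi}_{xx}$, and, hence, $\boldsymbol{\alpha}_{xx}$ and $\boldsymbol{\beta}_{xx}$, are nonsingular. Second, if $\{\mathbf{x}_t\}_{t=1}^{\infty}$
is purely I(1), then $\boldsymbol{\Pi}_{xx} = \mathbf{0}$, and, hence, $\boldsymbol{\alpha}_{xx}$ and $\boldsymbol{\beta}_{xx}$ are also null matrices.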
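Using (A.1) in Appendix A, it is easily seen that $\boldsymbol{\pi}_{y.x}(\mathbf{z}_t - \boldsymbol{\mu} - \boldsymbol{\gamma}t) = \boldsymbol{\pi}_{y.x}\mathbf{C}^*(L)\boldsymbol{\varepsilon}_t$, where
$\{\mathbf{C}^*(L)\boldsymbol{\varepsilon}_t\}$ is a mean zero stationary process. Therefore, under Assumptions 1, 3, 4 and 5b, that is,
$\pi_{yy} \neq 0$, it immediately follows that there exists a conditional level relationship between $y_t$ and
$\mathbf{x}_t$ defined by
$$y_t = \theta_0 + \theta_1 t + \boldsymbol{\theta}\mathbf{x}_t + v_t, \quad t = 1, 2, \ldots \tag{10}$$
where $\theta_0 \equiv \boldsymbol{\pi}_{y.x}\boldsymbol{\mu}/\pi_{yy}$, $\theta_1 \equiv \boldsymbol{\pi}_{y.x}\boldsymbol{\gamma}/\pi_{yy}$, $\boldsymbol{\theta} \equiv -\boldsymbol{\pi}_{yx.x}/\pi_{yy}$ and $v_t = \boldsymbol{\pi}_{y.x}\mathbf{C}^*(L)\boldsymbol{\varepsilon}_t/\pi_{yy}$, also a zero mean
stationary process. If $\boldsymbol{\pi}_{yx.x} = \alpha_{yy}\boldsymbol{\beta}_{yx}' + (\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx})\boldsymbol{\beta}_{xx}' \neq \mathbf{0}'$, the level relationship between $y_t$
and $\mathbf{x}_t$ is non-degenerate. Hence, from (10), $y_t \sim I(0)$ if rank$(\boldsymbol{\beta}_{yx}, \boldsymbol{\beta}_{xx}) = r$ and $y_t \sim I(1)$ if
rank$(\boldsymbol{\beta}_{yx}, \boldsymbol{\beta}_{xx}) = r + 1$. In the former case, $\boldsymbol{\theta}$ is the vector of conditional long-run multipliers and,
in this sense, (10) may be interpreted as a conditional long-run level relationship between $y_t$ and
$\mathbf{x}_t$, whereas, in the latter, because the processes $\{y_t\}_{t=1}^{\infty}$ and $\{\mathbf{x}_t\}_{t=1}^{\infty}$ are cointegrated, (10) represents
the conditional long-run level relationship between $y_t$ and $\mathbf{x}_t$. Two degenerate cases arise. First,
if $\pi_{yy} \neq 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$, clearly, from (10), $y_t$ is (trend) stationary or $y_t \sim I(0)$ whatever the
value of $r$. Consequently, the differenced variable $\Delta y_t$ depends only on its own lagged level $y_{t-1}$
in the conditional ECM (8) and not on the lagged levels $\mathbf{x}_{t-1}$ of the forcing variables. Second, if
$\pi_{yy} = 0$, that is, Assumption 5a holds, and $\boldsymbol{\pi}_{yx.x} = (\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx})\boldsymbol{\beta}_{xx}' \neq \mathbf{0}'$, as rank$(\boldsymbol{\Pi}) = r$, $\boldsymbol{\pi}_{yx.x} =
(\boldsymbol{\phi} - \boldsymbol{\omega})'\boldsymbol{\alpha}_{xx}\boldsymbol{\beta}_{xx}'$ which, from the above, yields $\boldsymbol{\pi}_{yx.x}(\mathbf{x}_t - \boldsymbol{\mu}_x - \boldsymbol{\gamma}_x t) = \boldsymbol{\pi}_{y.x}\mathbf{C}^*(L)\boldsymbol{\varepsilon}_t$, $t = 1, 2, \ldots$,
where $\boldsymbol{\mu} = (\mu_y, \boldsymbol{\mu}_x')'$ and $\boldsymbol{\gamma} = (\gamma_y, \boldsymbol{\gamma}_x')'$ are partitioned conformably with $\mathbf{z}_t = (y_t, \mathbf{x}_t')'$. Thus, in
(8), $\Delta y_t$ depends only on the lagged level $\mathbf{x}_{t-1}$ through the linear combination $(\boldsymbol{\phi} - \boldsymbol{\omega})'\boldsymbol{\alpha}_{xx}$ of the
lagged mutually cointegrating relations $\boldsymbol{\beta}_{xx}'\mathbf{x}_{t-1}$ for the process $\{\mathbf{x}_t\}_{t=1}^{\infty}$. Consequently, $y_t \sim I(1)$
whatever the value of $r$. Finally, if both $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$, there are no level effects in the
conditional ECM (8) with no possibility of any level relationship between $y_t$ and $\mathbf{x}_t$, degenerate
or otherwise, and, again, $y_t \sim I(1)$ whatever the value of $r$.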
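The long-run multipliers in (10) follow directly from the coefficients on the lagged levels in the conditional ECM. As a minimal illustrative sketch (the function name `long_run_multipliers` and the numerical coefficient values are hypothetical and do not come from the paper), the mapping from the lagged-level coefficients to the long-run multipliers may be written as:

```python
import numpy as np

def long_run_multipliers(pi_yy, pi_yx):
    """Return theta = -pi_yx.x / pi_yy, the conditional long-run multipliers in (10).

    pi_yy : coefficient on y_{t-1} in the conditional ECM (scalar)
    pi_yx : coefficients on x_{t-1} in the conditional ECM (length-k sequence)
    """
    pi_yx = np.asarray(pi_yx, dtype=float)
    if pi_yy == 0:
        # With pi_yy = 0 the normalization on y_t in (10) is not available.
        raise ValueError("pi_yy = 0: no level relationship of the form (10)")
    return -pi_yx / pi_yy

# Hypothetical ECM coefficients: pi_yy = -0.5 and pi_yx.x = (1.0, 0.25).
theta = long_run_multipliers(-0.5, [1.0, 0.25])
print(theta)  # theta equals (2.0, 0.5)
```

The sign convention matters: a negative coefficient on the lagged dependent variable combined with positive coefficients on the lagged regressors yields positive long-run multipliers.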
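Therefore, in order to test for the absence of level effects in the conditional ECM (8) and, more
crucially, the absence of a level relationship between $y_t$ and $\mathbf{x}_t$, the emphasis in this paper is a
test of the joint hypothesis $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ in (8).7,8 In contradistinction, the approach of
BDM may be described in terms of (8) using Assumption 5b:
$$\Delta y_t = c_0 + c_1 t + \alpha_{yy}(\beta_{yy}y_{t-1} + \boldsymbol{\beta}_{yx}'\mathbf{x}_{t-1}) + (\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx})\boldsymbol{\beta}_{xx}'\mathbf{x}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t \tag{11}$$
BDM test for the exclusion of $y_{t-1}$ in (11) when $r = 0$, that is, $\boldsymbol{\beta}_{xx} = \mathbf{0}$ in (11) or $\boldsymbol{\Pi}_{xx} = \mathbf{0}$ in
(7) and, thus, $\{\mathbf{x}_t\}$ is purely I(1); cf. HJNR and PSS.9 Therefore, BDM consider the hypothesis
$\alpha_{yy} = 0$ (or $\pi_{yy} = 0$).10 More generally, when $0 < r \le k$, BDM require the imposition of the
untested subsidiary hypothesis $\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx} = \mathbf{0}'$; that is, the limiting distribution of the BDM test
is obtained under the joint hypothesis $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ in (8).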
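In the following sections of the paper, we focus on (8) and differentiate between five cases of
interest delineated according to how the deterministic components are specified:
• Case I (no intercepts; no trends) $c_0 = 0$ and $c_1 = 0$. That is, $\boldsymbol{\mu} = \mathbf{0}$ and $\boldsymbol{\gamma} = \mathbf{0}$. Hence, the
ECM (8) becomes
$$\Delta y_t = \pi_{yy}y_{t-1} + \boldsymbol{\pi}_{yx.x}\mathbf{x}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t \tag{12}$$
• Case II (restricted intercepts; no trends) $c_0 = -(\pi_{yy}, \boldsymbol{\pi}_{yx.x})\boldsymbol{\mu}$ and $c_1 = 0$. Here, $\boldsymbol{\gamma} = \mathbf{0}$. The
ECM is
$$\Delta y_t = \pi_{yy}(y_{t-1} - \mu_y) + \boldsymbol{\pi}_{yx.x}(\mathbf{x}_{t-1} - \boldsymbol{\mu}_x) + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t \tag{13}$$
• Case III (unrestricted intercepts; no trends). The ECM is
$$\Delta y_t = c_0 + \pi_{yy}y_{t-1} + \boldsymbol{\pi}_{yx.x}\mathbf{x}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t \tag{14}$$
• Case IV (unrestricted intercepts; restricted trends). The ECM is
$$\Delta y_t = c_0 + \pi_{yy}(y_{t-1} - \gamma_y t) + \boldsymbol{\pi}_{yx.x}(\mathbf{x}_{t-1} - \boldsymbol{\gamma}_x t) + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t \tag{15}$$
• Case V (unrestricted intercepts; unrestricted trends). The ECM is
$$\Delta y_t = c_0 + c_1 t + \pi_{yy}y_{t-1} + \boldsymbol{\pi}_{yx.x}\mathbf{x}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\omega}'\Delta\mathbf{x}_t + u_t \tag{16}$$
7 This joint hypothesis may be justified by the application of Roy’s union-intersection principle to tests of $\pi_{yy} = 0$
in (8) given $\boldsymbol{\pi}_{yx.x}$. Let $W_{\pi_{yy}}(\boldsymbol{\pi}_{yx.x})$ be the Wald statistic for testing $\pi_{yy} = 0$ for a given value of $\boldsymbol{\pi}_{yx.x}$. The test
$\max_{\boldsymbol{\pi}_{yx.x}} W_{\pi_{yy}}(\boldsymbol{\pi}_{yx.x})$ is identical to the Wald test of $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ in (8).
8 A related approach to that of this paper is Hansen’s (1995) test for a unit root in a univariate time series which, in our
context, would require the imposition of the subsidiary hypothesis $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$.
9 The BDM test is based on earlier contributions of Kremers et al. (1992), Banerjee et al. (1993), and Boswijk (1994).
10 Partitioning $\boldsymbol{\Gamma}_{xi} = (\boldsymbol{\gamma}_{xy,i}, \boldsymbol{\Gamma}_{xx,i})$, $i = 1, \ldots, p-1$, conformably with $\mathbf{z}_t = (y_t, \mathbf{x}_t')'$, BDM also set $\boldsymbol{\gamma}_{xy,i} = \mathbf{0}$, $i =
1, \ldots, p-1$, which implies $\boldsymbol{\gamma}_{xy} = \mathbf{0}$, where $\boldsymbol{\Gamma}_x = (\boldsymbol{\gamma}_{xy}, \boldsymbol{\Gamma}_{xx})$; that is, $\Delta y_t$ does not Granger cause $\Delta\mathbf{x}_t$.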
It should be emphasized that the DGPs for Cases II and III are treated as identical as are those
for Cases IV and V. However, as in the test for a unit root proposed by Dickey and Fuller (1979)
compared with that of Dickey and Fuller (1981) for univariate models, estimation and hypothesis
testing in Cases III and V proceed ignoring the constraints linking respectively the intercept and
trend coefficient, $c_0$ and $c_1$, to the parameter vector $(\pi_{yy}, \boldsymbol{\pi}_{yx.x})$ whereas Cases II and IV fully
incorporate the restrictions in (9).
In the following exposition, we concentrate on Case IV, that is, (15), which may be specialized
to yield the remainder.
However, as indicated in Section 2, not only does the alternative hypothesis $H_1$ of (18) cover the
case of interest in which $\pi_{yy} \neq 0$ and $\boldsymbol{\pi}_{yx.x} \neq \mathbf{0}'$ but also permits $\pi_{yy} \neq 0$, $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ and $\pi_{yy} = 0$
and $\boldsymbol{\pi}_{yx.x} \neq \mathbf{0}'$; cf. (8). That is, the possibility of degenerate level relationships between $y_t$ and $\mathbf{x}_t$
is admitted under $H_1$ of (18). We comment further on these alternatives at the end of this section.
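For ease of exposition, we consider Case IV and rewrite (15) in matrix notation as
$$\Delta\mathbf{y} = \boldsymbol{\iota}_T c_0 + \mathbf{Z}_{-1}^*\boldsymbol{\pi}_{y.x}^* + \Delta\mathbf{Z}_-\boldsymbol{\psi} + \mathbf{u} \tag{19}$$
where $\boldsymbol{\iota}_T$ is a T-vector of ones, $\Delta\mathbf{y} \equiv (\Delta y_1, \ldots, \Delta y_T)'$, $\Delta\mathbf{X} \equiv (\Delta\mathbf{x}_1, \ldots, \Delta\mathbf{x}_T)'$, $\Delta\mathbf{Z}_{-i} \equiv
(\Delta\mathbf{z}_{1-i}, \ldots, \Delta\mathbf{z}_{T-i})'$, $i = 1, \ldots, p-1$, $\boldsymbol{\psi} \equiv (\boldsymbol{\omega}', \boldsymbol{\psi}_1', \ldots, \boldsymbol{\psi}_{p-1}')'$, $\Delta\mathbf{Z}_- \equiv (\Delta\mathbf{X}, \Delta\mathbf{Z}_{-1}, \ldots,
\Delta\mathbf{Z}_{1-p})$, $\mathbf{Z}_{-1}^* \equiv (\boldsymbol{\tau}_T, \mathbf{Z}_{-1})$, $\boldsymbol{\tau}_T \equiv (1, \ldots, T)'$, $\mathbf{Z}_{-1} \equiv (\mathbf{z}_0, \ldots, \mathbf{z}_{T-1})'$, $\mathbf{u} \equiv (u_1, \ldots, u_T)'$ and
$$\boldsymbol{\pi}_{y.x}^* = \begin{pmatrix} -\boldsymbol{\gamma}' \\ \mathbf{I}_{k+1} \end{pmatrix}\begin{pmatrix} \pi_{yy} \\ \boldsymbol{\pi}_{yx.x}' \end{pmatrix}$$
The least squares (LS) estimator of $\boldsymbol{\pi}_{y.x}^*$ is given by:
$$\hat{\boldsymbol{\pi}}_{y.x}^* \equiv \left(\tilde{\mathbf{Z}}_{-1}^{*\prime}\bar{\mathbf{P}}_{\Delta\tilde{Z}_-}\tilde{\mathbf{Z}}_{-1}^*\right)^{-1}\tilde{\mathbf{Z}}_{-1}^{*\prime}\bar{\mathbf{P}}_{\Delta\tilde{Z}_-}\Delta\tilde{\mathbf{y}} \tag{20}$$
where $\tilde{\mathbf{Z}}_{-1}^* \equiv \mathbf{P}_{\iota}\mathbf{Z}_{-1}^*$, $\Delta\tilde{\mathbf{Z}}_- \equiv \mathbf{P}_{\iota}\Delta\mathbf{Z}_-$, $\Delta\tilde{\mathbf{y}} \equiv \mathbf{P}_{\iota}\Delta\mathbf{y}$, $\mathbf{P}_{\iota} \equiv \mathbf{I}_T - \boldsymbol{\iota}_T(\boldsymbol{\iota}_T'\boldsymbol{\iota}_T)^{-1}\boldsymbol{\iota}_T'$ and
$\bar{\mathbf{P}}_{\Delta\tilde{Z}_-} \equiv \mathbf{I}_T - \Delta\tilde{\mathbf{Z}}_-(\Delta\tilde{\mathbf{Z}}_-'\Delta\tilde{\mathbf{Z}}_-)^{-1}\Delta\tilde{\mathbf{Z}}_-'$. The Wald and the F-statistics for testing the null hypothesis $H_0$ of
(17) against the alternative hypothesis $H_1$ of (18) are respectively:
$$W \equiv \hat{\boldsymbol{\pi}}_{y.x}^{*\prime}\tilde{\mathbf{Z}}_{-1}^{*\prime}\bar{\mathbf{P}}_{\Delta\tilde{Z}_-}\tilde{\mathbf{Z}}_{-1}^*\hat{\boldsymbol{\pi}}_{y.x}^*/\hat{\omega}_{uu}, \quad F \equiv \frac{W}{k+2} \tag{21}$$
where $\hat{\omega}_{uu} \equiv (T - m)^{-1}\sum_{t=1}^{T}\tilde{u}_t^2$, $m \equiv (k+1)(p+1) + 1$ is the number of estimated coefficients
and $\tilde{u}_t$, $t = 1, 2, \ldots, T$, are the least squares (LS) residuals from (19).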
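To make the construction of the test statistic concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions rather than the paper's own computation: it treats Case III (unrestricted intercept, no trend) instead of Case IV, it obtains the F-statistic from restricted and unrestricted residual sums of squares (the usual equivalent form for a linear exclusion restriction), and the names `bounds_f_statistic` and `_rss` as well as the simulated data are hypothetical.

```python
import numpy as np

def _rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def bounds_f_statistic(y, x, p=1):
    """F-statistic for excluding the lagged levels (y_{t-1}, x_{t-1}) from a
    Case III conditional ECM with p - 1 lagged differences of z_t = (y_t, x_t')'."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    k = x.shape[1]
    z = np.column_stack((y, x))
    dz = np.diff(z, axis=0)                  # dz[j] holds Delta z at time j + 1
    t = np.arange(p, len(y))                 # usable observations
    dy = dz[t - 1, 0]                        # Delta y_t
    levels = z[t - 1]                        # z_{t-1}
    dx = dz[t - 1, 1:]                       # Delta x_t
    lagged = [dz[t - 1 - i] for i in range(1, p)]   # Delta z_{t-i}, i = 1..p-1
    others = np.hstack([np.ones((len(t), 1)), dx] + lagged)
    full = np.hstack((levels, others))
    rss_u = _rss(dy, full)                   # unrestricted: levels included
    rss_r = _rss(dy, others)                 # restricted: levels excluded
    dof = len(t) - full.shape[1]
    return ((rss_r - rss_u) / (k + 1)) / (rss_u / dof)

# Simulated example with a genuine level relationship (long-run multiplier 2).
rng = np.random.default_rng(1)
n = 300
x = np.cumsum(rng.standard_normal(n))        # I(1) regressor
y = np.zeros(n)
for s in range(1, n):
    y[s] = 0.5 * y[s - 1] + x[s] + rng.standard_normal()
f_stat = bounds_f_statistic(y, x, p=2)
print(f_stat)
```

In this design the lagged levels carry strong explanatory power, so the statistic should lie far above any of the tabulated upper bounds; with independent random walks it would typically be small.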
The next theorem presents the asymptotic null distribution of the Wald statistic; the limit
behaviour of the F-statistic is a simple corollary and is not presented here or subsequently.
Let $\mathbf{W}_{k-r+1}(a) \equiv (W_u(a), \mathbf{W}_{k-r}(a)')'$ denote a $(k-r+1)$-dimensional standard Brownian motion
partitioned into the scalar and $(k-r)$-dimensional sub-vector independent standard Brownian
motions $W_u(a)$ and $\mathbf{W}_{k-r}(a)$, $a \in [0, 1]$. We will also require the corresponding de-meaned
$(k-r+1)$-vector standard Brownian motion $\tilde{\mathbf{W}}_{k-r+1}(a) \equiv \mathbf{W}_{k-r+1}(a) - \int_0^1\mathbf{W}_{k-r+1}(a)\,da$, and de-meaned
and de-trended $(k-r+1)$-vector standard Brownian motion
$\hat{\mathbf{W}}_{k-r+1}(a) \equiv \tilde{\mathbf{W}}_{k-r+1}(a) - 12\left(a - \tfrac{1}{2}\right)\int_0^1\left(a - \tfrac{1}{2}\right)\tilde{\mathbf{W}}_{k-r+1}(a)\,da$,
and their respective partitioned counterparts $\tilde{\mathbf{W}}_{k-r+1}(a) =
(\tilde{W}_u(a), \tilde{\mathbf{W}}_{k-r}(a)')'$, and $\hat{\mathbf{W}}_{k-r+1}(a) = (\hat{W}_u(a), \hat{\mathbf{W}}_{k-r}(a)')'$, $a \in [0, 1]$.
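The de-meaning and de-trending operations can be illustrated on a discretized path. The sketch below, in Python with NumPy, is only an approximation under stated assumptions: the Brownian motion is replaced by a scaled random walk on a midpoint grid of n points (both the grid and n are illustrative choices), and integrals over [0, 1] are replaced by sample means. It checks that the de-meaned and de-trended path is orthogonal to a constant and to the centred trend a − 1/2, which is what the factor 12 in the definition achieves, since the integral of (a − 1/2)² over [0, 1] equals 1/12.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
a = (np.arange(n) + 0.5) / n                          # midpoint grid on [0, 1]
w = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)    # approximate standard Brownian motion

# De-meaned path: subtract the (approximate) integral of W over [0, 1].
w_tilde = w - w.mean()

# De-meaned and de-trended path: also project out the centred trend a - 1/2.
w_hat = w_tilde - 12.0 * (a - 0.5) * np.mean((a - 0.5) * w_tilde)

mean_w_hat = w_hat.mean()                     # approximates the integral of W-hat
trend_w_hat = np.mean((a - 0.5) * w_hat)      # approximates the integral of (a - 1/2) W-hat
print(abs(mean_w_hat) < 1e-8, abs(trend_w_hat) < 1e-6)
```

Both quantities are zero up to discretization and rounding error, mirroring the fact that the de-trended process is the residual from a continuous-time projection on an intercept and a linear trend.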
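Theorem 3.1 (Limiting distribution of W) If Assumptions 1–4 and 5a hold, then under $H_0$:
$\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic W of
(21) has the representation
$$W \Rightarrow \mathbf{z}_r'\mathbf{z}_r + \int_0^1 dW_u(a)\mathbf{F}_{k-r+1}(a)'\left(\int_0^1\mathbf{F}_{k-r+1}(a)\mathbf{F}_{k-r+1}(a)'\,da\right)^{-1}\int_0^1\mathbf{F}_{k-r+1}(a)\,dW_u(a) \tag{22}$$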
The asymptotic distribution of the Wald statistic W of (21) depends on the dimension and
cointegration rank of the forcing variables $\{\mathbf{x}_t\}$, $k$ and $r$ respectively. In Case IV, referring to
(11), the first component in (22), $\mathbf{z}_r'\mathbf{z}_r \sim \chi^2(r)$, corresponds to testing for the exclusion of the r-
dimensional stationary vector $\boldsymbol{\beta}_{xx}'\mathbf{x}_{t-1}$, that is, the hypothesis $\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx} = \mathbf{0}'$, whereas the second
term in (22), which is a non-standard Dickey–Fuller unit-root distribution, corresponds to testing
for the exclusion of the $(k-r+1)$-dimensional I(1) vector $(\boldsymbol{\beta}_y^{\perp}, \boldsymbol{\beta}^{\perp})'\mathbf{z}_{t-1}$ and, in Cases II and
IV, the intercept and time-trend respectively or, equivalently, $\alpha_{yy} = 0$.
We specialize Theorem 3.1 to the two polar cases in which, first, the process for the forcing
variables $\{\mathbf{x}_t\}$ is purely integrated of order zero, that is, $r = k$ and $\boldsymbol{\Pi}_{xx}$ is of full rank, and, second,
the $\{\mathbf{x}_t\}$ process is not mutually cointegrated, $r = 0$, and, hence, the $\{\mathbf{x}_t\}$ process is purely integrated
of order one.
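Corollary 3.1 (Limiting distribution of W if $\{\mathbf{x}_t\} \sim I(0)$). If Assumptions 1–4 and 5a hold
and $r = k$, that is, $\{\mathbf{x}_t\} \sim I(0)$, then under $H_0$: $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17), as $T \to \infty$, the
asymptotic distribution of the Wald statistic W of (21) has the representation
$$W \Rightarrow \mathbf{z}_k'\mathbf{z}_k + \frac{\left(\int_0^1 F(a)\,dW_u(a)\right)^2}{\int_0^1 F(a)^2\,da} \tag{23}$$
Corollary 3.2 (Limiting distribution of W if $\{\mathbf{x}_t\} \sim I(1)$). If Assumptions 1–4 and 5a hold
and $r = 0$, that is, $\{\mathbf{x}_t\} \sim I(1)$, then under $H_0$: $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17), as $T \to \infty$, the
asymptotic distribution of the Wald statistic W of (21) has the representation
$$W \Rightarrow \int_0^1 dW_u(a)\mathbf{F}_{k+1}(a)'\left(\int_0^1\mathbf{F}_{k+1}(a)\mathbf{F}_{k+1}(a)'\,da\right)^{-1}\int_0^1\mathbf{F}_{k+1}(a)\,dW_u(a)$$
where $\mathbf{F}_{k+1}(a)$ is defined in Theorem 3.1 for Cases I–V, $a \in [0, 1]$.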
In practice, however, it is unlikely that one would possess a priori knowledge of the rank $r$
of $\boldsymbol{\Pi}_{xx}$; that is, the cointegration rank of the forcing variables $\{\mathbf{x}_t\}$ or, more particularly, whether
$\{\mathbf{x}_t\} \sim I(0)$ or $\{\mathbf{x}_t\} \sim I(1)$. Long-run analysis of (12)–(16) predicated on a prior determination
of the cointegration rank $r$ in (7) is prone to the possibility of a pre-test specification error;
see, for example, Cavanagh et al. (1995). However, it may be shown by simulation that the
asymptotic critical values obtained from Corollaries 3.1 ($r = k$ and $\{\mathbf{x}_t\} \sim I(0)$) and 3.2 ($r = 0$
and $\{\mathbf{x}_t\} \sim I(1)$) provide lower and upper bounds respectively for those corresponding to the
general case considered in Theorem 3.1 when the cointegration rank of the forcing variables
$\{\mathbf{x}_t\}$ process is $0 \le r \le k$.11 Hence, these two sets of critical values provide critical value
bounds covering all possible classifications of $\{\mathbf{x}_t\}$ into I(0), I(1) and mutually cointegrated
processes. Asymptotic critical value bounds for the F-statistics covering Cases I–V are set out in
Tables CI(i)–CI(v) for sizes 0.100, 0.050, 0.025 and 0.010; the lower bound values assume that
the forcing variables $\{\mathbf{x}_t\}$ are purely I(0), and the upper bound values assume that $\{\mathbf{x}_t\}$ are purely
I(1).12
Hence, we suggest a bounds procedure to test $H_0$: $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17) within the
conditional ECMs (12)–(16). If the computed Wald or F-statistic falls outside the critical value
bounds, a conclusive decision results without needing to know the cointegration rank $r$ of the
$\{\mathbf{x}_t\}$ process. If, however, the Wald or F-statistic falls within these bounds, inference would be
inconclusive. In such circumstances, knowledge of the cointegration rank $r$ of the forcing variables
$\{\mathbf{x}_t\}$ is required to proceed further.
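The decision rule of the bounds procedure can be summarized in a few lines. The following minimal sketch in Python is illustrative only: the function name `bounds_test_decision` and the three example statistics are hypothetical, while the bounds 4.94 and 5.73 are the 0.050 values listed for k = 1 in the third panel of Table CI (Case III).

```python
def bounds_test_decision(f_stat, lower, upper):
    """Classify a computed F-statistic against the I(0) lower and I(1) upper bounds."""
    if f_stat > upper:
        # Above the I(1) bound: reject H0 whatever the order of integration of x_t.
        return "reject H0"
    if f_stat < lower:
        # Below the I(0) bound: H0 cannot be rejected whatever the order of integration.
        return "do not reject H0"
    # Between the bounds: the outcome depends on the cointegration rank r of x_t.
    return "inconclusive"

lower, upper = 4.94, 5.73      # k = 1, size 0.050, Case III bounds from Table CI
for f_stat in (3.10, 5.20, 7.85):   # hypothetical computed statistics
    print(f_stat, bounds_test_decision(f_stat, lower, upper))
```

Only the middle value falls inside the band, so only in that case would further information on the integration properties of the regressors be needed.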
The conditional ECMs (12)–(16), derived from the underlying VAR(p) model (2), may also be
interpreted as an autoregressive distributed lag model of orders (p, p, . . . , p) (ARDL(p, . . . , p)).
However, one could also allow for differential lag lengths on the lagged variables $y_{t-i}$ and
$\mathbf{x}_{t-i}$ in (2) to arrive at, for example, an ARDL($p, p_1, \ldots, p_k$) model without affecting the
asymptotic results derived in this section. Hence, our approach is quite general in the sense that
one can use a flexible choice for the dynamic lag structure in (12)–(16) as well as allowing
for short-run feedbacks from the lagged dependent variables, $\Delta y_{t-i}$, $i = 1, \ldots, p$, to $\Delta\mathbf{x}_t$ in
(7). Moreover, within the single-equation context, the above analysis is more general than the
cointegration analysis of partial systems carried out by Boswijk (1992, 1995), HJNR, Johansen
(1992, 1995), PSS, and Urbain (1992), where it is assumed in addition that $\boldsymbol{\Pi}_{xx} = \mathbf{0}$ or $\mathbf{x}_t$ is purely
I(1) in (7).
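To conclude this section, we reconsider the approach of BDM. There are three scenarios for
the deterministics given by (12), (14) and (16). Note that the restrictions on the deterministics’
coefficients (9) are ignored in Cases II of (13) and IV of (15) and, thus, Cases II and IV are now
subsumed by Cases III of (14) and V of (16) respectively. As noted below (11), BDM impose
but do not test the implicit hypothesis $\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx} = \mathbf{0}'$; that is, the limiting distributional results
given below are also obtained under the joint hypothesis $H_0$: $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17). BDM
test $\alpha_{yy} = 0$ (or $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$) via the exclusion of $y_{t-1}$ in Cases I, III and V. For example, in
Case V, they consider the t-statistic
$$t_{\pi_{yy}} \equiv \frac{\hat{\mathbf{y}}_{-1}'\bar{\mathbf{P}}_{\Delta\hat{Z}_-,\hat{X}_{-1}}\Delta\hat{\mathbf{y}}}{\hat{\omega}_{uu}^{1/2}\left(\hat{\mathbf{y}}_{-1}'\bar{\mathbf{P}}_{\Delta\hat{Z}_-,\hat{X}_{-1}}\hat{\mathbf{y}}_{-1}\right)^{1/2}} \tag{24}$$
where $\hat{\omega}_{uu}$ is defined in the line after (21), $\Delta\hat{\mathbf{y}} \equiv \mathbf{P}_{\iota_T,\tau_T}\Delta\mathbf{y}$, $\hat{\mathbf{y}}_{-1} \equiv \mathbf{P}_{\iota_T,\tau_T}\mathbf{y}_{-1}$, $\mathbf{y}_{-1} \equiv
(y_0, \ldots, y_{T-1})'$, $\hat{\mathbf{X}}_{-1} \equiv \mathbf{P}_{\iota_T,\tau_T}\mathbf{X}_{-1}$, $\mathbf{X}_{-1} \equiv (\mathbf{x}_0, \ldots, \mathbf{x}_{T-1})'$, $\Delta\hat{\mathbf{Z}}_- \equiv \mathbf{P}_{\iota_T,\tau_T}\Delta\mathbf{Z}_-$,
$\mathbf{P}_{\iota_T,\tau_T} \equiv \mathbf{P}_{\iota_T} - \mathbf{P}_{\iota_T}\boldsymbol{\tau}_T(\boldsymbol{\tau}_T'\mathbf{P}_{\iota_T}\boldsymbol{\tau}_T)^{-1}\boldsymbol{\tau}_T'\mathbf{P}_{\iota_T}$,
$\bar{\mathbf{P}}_{\Delta\hat{Z}_-,\hat{X}_{-1}} = \bar{\mathbf{P}}_{\Delta\hat{Z}_-} - \bar{\mathbf{P}}_{\Delta\hat{Z}_-}\hat{\mathbf{X}}_{-1}(\hat{\mathbf{X}}_{-1}'\bar{\mathbf{P}}_{\Delta\hat{Z}_-}\hat{\mathbf{X}}_{-1})^{-1}\hat{\mathbf{X}}_{-1}'\bar{\mathbf{P}}_{\Delta\hat{Z}_-}$ and
$\bar{\mathbf{P}}_{\Delta\hat{Z}_-} \equiv \mathbf{I}_T - \Delta\hat{\mathbf{Z}}_-(\Delta\hat{\mathbf{Z}}_-'\Delta\hat{\mathbf{Z}}_-)^{-1}\Delta\hat{\mathbf{Z}}_-'$.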
11 The critical values of the Wald and F-statistics in the general case (not reported here) may be computed via stochastic
simulations with different combinations of values for $k$ and $0 \le r \le k$.
12 The critical values for the Wald version of the bounds test are given by $k + 1$ times the critical values of the F-test in
Cases I, III and V, and $k + 2$ times in Cases II and IV.
Table CI. Asymptotic critical value bounds for the F-statistic. Testing for the existence of a levels
relationship (a)

Table CI(i) Case I: No intercept and no trend
  k    0.100           0.050           0.025           0.010           Mean            Variance
       I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)
  0    3.00    3.00    4.20    4.20    5.47    5.47    7.17    7.17    1.16    1.16    2.32    2.32
  1    2.44    3.28    3.15    4.11    3.88    4.92    4.81    6.02    1.08    1.54    1.08    1.73
  2    2.17    3.19    2.72    3.83    3.22    4.50    3.88    5.30    1.05    1.69    0.70    1.27
  3    2.01    3.10    2.45    3.63    2.87    4.16    3.42    4.84    1.04    1.77    0.52    0.99
  4    1.90    3.01    2.26    3.48    2.62    3.90    3.07    4.44    1.03    1.81    0.41    0.80
  5    1.81    2.93    2.14    3.34    2.44    3.71    2.82    4.21    1.02    1.84    0.34    0.67
  6    1.75    2.87    2.04    3.24    2.32    3.59    2.66    4.05    1.02    1.86    0.29    0.58
  7    1.70    2.83    1.97    3.18    2.22    3.49    2.54    3.91    1.02    1.88    0.26    0.51
  8    1.66    2.79    1.91    3.11    2.15    3.40    2.45    3.79    1.02    1.89    0.23    0.46
  9    1.63    2.75    1.86    3.05    2.08    3.33    2.34    3.68    1.02    1.90    0.20    0.41
 10    1.60    2.72    1.82    2.99    2.02    3.27    2.26    3.60    1.02    1.91    0.19    0.37

Table CI(ii) Case II: Restricted intercept and no trend
  k    0.100           0.050           0.025           0.010           Mean            Variance
       I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)
  0    3.80    3.80    4.60    4.60    5.39    5.39    6.44    6.44    2.03    2.03    1.77    1.77
  1    3.02    3.51    3.62    4.16    4.18    4.79    4.94    5.58    1.69    2.02    1.01    1.25
  2    2.63    3.35    3.10    3.87    3.55    4.38    4.13    5.00    1.52    2.02    0.69    0.96
  3    2.37    3.20    2.79    3.67    3.15    4.08    3.65    4.66    1.41    2.02    0.52    0.78
  4    2.20    3.09    2.56    3.49    2.88    3.87    3.29    4.37    1.34    2.01    0.42    0.65
  5    2.08    3.00    2.39    3.38    2.70    3.73    3.06    4.15    1.29    2.00    0.35    0.56
  6    1.99    2.94    2.27    3.28    2.55    3.61    2.88    3.99    1.26    2.00    0.30    0.49
  7    1.92    2.89    2.17    3.21    2.43    3.51    2.73    3.90    1.23    2.01    0.26    0.44
  8    1.85    2.85    2.11    3.15    2.33    3.42    2.62    3.77    1.21    2.01    0.23    0.40
  9    1.80    2.80    2.04    3.08    2.24    3.35    2.50    3.68    1.19    2.01    0.21    0.36
 10    1.76    2.77    1.98    3.04    2.18    3.28    2.41    3.61    1.17    2.00    0.19    0.33

Table CI(iii) Case III: Unrestricted intercept and no trend
  k    0.100           0.050           0.025           0.010           Mean            Variance
       I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)
  0    6.58    6.58    8.21    8.21    9.80    9.80   11.79   11.79    3.05    3.05    7.07    7.07
  1    4.04    4.78    4.94    5.73    5.77    6.68    6.84    7.84    2.03    2.52    2.28    2.89
  2    3.17    4.14    3.79    4.85    4.41    5.52    5.15    6.36    1.69    2.35    1.23    1.77
  3    2.72    3.77    3.23    4.35    3.69    4.89    4.29    5.61    1.51    2.26    0.82    1.27
  4    2.45    3.52    2.86    4.01    3.25    4.49    3.74    5.06    1.41    2.21    0.60    0.98
  5    2.26    3.35    2.62    3.79    2.96    4.18    3.41    4.68    1.34    2.17    0.48    0.79
  6    2.12    3.23    2.45    3.61    2.75    3.99    3.15    4.43    1.29    2.14    0.39    0.66
  7    2.03    3.13    2.32    3.50    2.60    3.84    2.96    4.26    1.26    2.13    0.33    0.58
  8    1.95    3.06    2.22    3.39    2.48    3.70    2.79    4.10    1.23    2.12    0.29    0.51
  9    1.88    2.99    2.14    3.30    2.37    3.60    2.65    3.97    1.21    2.10    0.25    0.45
 10    1.83    2.94    2.06    3.24    2.28    3.50    2.54    3.86    1.19    2.09    0.23    0.41

Table CI(iv) Case IV: Unrestricted intercept and restricted trend
  k    0.100           0.050           0.025           0.010           Mean            Variance
       I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)
  0    5.37    5.37    6.29    6.29    7.14    7.14    8.26    8.26    3.17    3.17    2.68    2.68
  1    4.05    4.49    4.68    5.15    5.30    5.83    6.10    6.73    2.45    2.77    1.41    1.65
  2    3.38    4.02    3.88    4.61    4.37    5.16    4.99    5.85    2.09    2.57    0.92    1.20
  3    2.97    3.74    3.38    4.23    3.80    4.68    4.30    5.23    1.87    2.45    0.67    0.93
  4    2.68    3.53    3.05    3.97    3.40    4.36    3.81    4.92    1.72    2.37    0.51    0.76
  5    2.49    3.38    2.81    3.76    3.11    4.13    3.50    4.63    1.62    2.31    0.42    0.64
  6    2.33    3.25    2.63    3.62    2.90    3.94    3.27    4.39    1.54    2.27    0.35    0.55
  7    2.22    3.17    2.50    3.50    2.76    3.81    3.07    4.23    1.48    2.24    0.31    0.49
  8    2.13    3.09    2.38    3.41    2.62    3.70    2.93    4.06    1.44    2.22    0.27    0.44
  9    2.05    3.02    2.30    3.33    2.52    3.60    2.79    3.93    1.40    2.20    0.24    0.40
 10    1.98    2.97    2.21    3.25    2.42    3.52    2.68    3.84    1.36    2.18    0.22    0.36

Table CI(v) Case V: Unrestricted intercept and unrestricted trend
  k    0.100           0.050           0.025           0.010           Mean            Variance
       I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)    I(0)    I(1)
  0    9.81    9.81   11.64   11.64   13.36   13.36   15.73   15.73    5.33    5.33   11.35   11.35
  1    5.59    6.26    6.56    7.30    7.46    8.27    8.74    9.63    3.17    3.64    3.33    3.91
  2    4.19    5.06    4.87    5.85    5.49    6.59    6.34    7.52    2.44    3.09    1.70    2.23
  3    3.47    4.45    4.01    5.07    4.52    5.62    5.17    6.36    2.08    2.81    1.08    1.51
  4    3.03    4.06    3.47    4.57    3.89    5.07    4.40    5.72    1.86    2.64    0.77    1.14
  5    2.75    3.79    3.12    4.25    3.47    4.67    3.93    5.23    1.72    2.53    0.59    0.91
  6    2.53    3.59    2.87    4.00    3.19    4.38    3.60    4.90    1.62    2.45    0.48    0.75
  7    2.38    3.45    2.69    3.83    2.98    4.16    3.34    4.63    1.54    2.39    0.40    0.64
  8    2.26    3.34    2.55    3.68    2.82    4.02    3.15    4.43    1.48    2.35    0.34    0.56
  9    2.16    3.24    2.43    3.56    2.67    3.87    2.97    4.24    1.43    2.31    0.30    0.49
 10    2.07    3.16    2.33    3.46    2.56    3.76    2.84    4.10    1.40    2.28    0.26    0.44
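a The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the F-statistic for testing $\boldsymbol{\phi} = \mathbf{0}$ in the regression: $\Delta y_t = \boldsymbol{\phi}'\mathbf{z}_{t-1} + \mathbf{a}'\mathbf{w}_t + \xi_t$, $t = 1, \ldots, T$, where $\mathbf{x}_t = (x_{1t}, \ldots, x_{kt})'$ and
$\mathbf{z}_{t-1} = (y_{t-1}, \mathbf{x}_{t-1}')'$, $\mathbf{w}_t = 0$ (Case I)
$\mathbf{z}_{t-1} = (y_{t-1}, \mathbf{x}_{t-1}', 1)'$, $\mathbf{w}_t = 0$ (Case II)
$\mathbf{z}_{t-1} = (y_{t-1}, \mathbf{x}_{t-1}')'$, $\mathbf{w}_t = 1$ (Case III)
$\mathbf{z}_{t-1} = (y_{t-1}, \mathbf{x}_{t-1}', t)'$, $\mathbf{w}_t = 1$ (Case IV)
$\mathbf{z}_{t-1} = (y_{t-1}, \mathbf{x}_{t-1}')'$, $\mathbf{w}_t = (1, t)'$ (Case V)
The variables $y_t$ and $\mathbf{x}_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $\mathbf{x}_t = \mathbf{P}\mathbf{x}_{t-1} + \boldsymbol{\varepsilon}_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $\mathbf{x}_0 = \mathbf{0}$ and $\boldsymbol{\varepsilon}_t = (\varepsilon_{1t}, \boldsymbol{\varepsilon}_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $\mathbf{x}_t$ is purely I(1), $\mathbf{P} = \mathbf{I}_k$ whereas $\mathbf{P} = \mathbf{0}$ if $\mathbf{x}_t$ is purely I(0). The critical values for $k = 0$ correspond to the squares of the critical values of Dickey and Fuller’s (1979) unit root t-statistics for Cases I, III and V, while they match those for Dickey and Fuller’s (1981) unit root F-statistics for Cases II and IV. The columns headed ‘I(0)’ refer to the lower critical values bound obtained when $\mathbf{x}_t$ is purely I(0), while the columns headed ‘I(1)’ refer to the upper bound obtained when $\mathbf{x}_t$ is purely I(1).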
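The simulation design described in the table notes can be sketched in a few lines of Python. This is a minimal illustration for Case III only, not the authors' code: the function name `simulate_case3_f`, the seeds, and the much smaller sample size and replication count in the usage lines are assumptions chosen so that the sketch runs quickly, so its quantiles will only approximate the tabulated bounds.

```python
import numpy as np


def simulate_case3_f(k, T=1000, reps=40_000, integrated=True, seed=0):
    """Simulate the Case III F-statistic for phi = 0 under the null.

    integrated=True sets P = I_k (x_t purely I(1), upper bound);
    integrated=False sets P = 0 (x_t purely I(0), lower bound).
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    w = np.ones((T, 1))                          # w_t = 1 (Case III)
    for i in range(reps):
        eps = rng.standard_normal((T, k + 1))
        # y_t = y_{t-1} + eps_1t with y_0 = 0
        y = np.concatenate(([0.0], np.cumsum(eps[:, 0])))
        # x_t = P x_{t-1} + eps_2t with x_0 = 0
        x_path = np.cumsum(eps[:, 1:], axis=0) if integrated else eps[:, 1:]
        x = np.vstack((np.zeros((1, k)), x_path))
        dy = np.diff(y)                          # Delta y_t, t = 1, ..., T
        z_lag = np.column_stack((y[:-1], x[:-1]))  # z_{t-1} = (y_{t-1}, x_{t-1}')'
        X = np.hstack((z_lag, w))
        # Residuals from the unrestricted and restricted (phi = 0) regressions
        resid_u = dy - X @ np.linalg.lstsq(X, dy, rcond=None)[0]
        resid_r = dy - w @ np.linalg.lstsq(w, dy, rcond=None)[0]
        rss_u, rss_r = resid_u @ resid_u, resid_r @ resid_r
        q = k + 1                                # number of restrictions in phi
        stats[i] = ((rss_r - rss_u) / q) / (rss_u / (T - X.shape[1]))
    return stats


if __name__ == "__main__":
    # Reduced settings (an assumption made for speed), k = 4 regressors
    lower = np.quantile(simulate_case3_f(4, T=200, reps=2000, integrated=False), 0.95)
    upper = np.quantile(simulate_case3_f(4, T=200, reps=2000, integrated=True, seed=1), 0.95)
    print(f"0.05 bounds, Case III, k=4: I(0) ~ {lower:.2f}, I(1) ~ {upper:.2f}")
```

With the full settings of the table notes ($T = 1000$, 40,000 replications) the two quantiles should move towards the tabulated Case III bounds for $k = 4$, though Monte Carlo error means they need not match to two decimals.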
302 M. H. PESARAN, Y. SHIN AND R. J. SMITH
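Theorem 3.2 (Limiting distribution of $t_{\pi_{yy}}$). If Assumptions 1-4 and 5a hold and $\boldsymbol{\gamma}_{xy} = \mathbf{0}$, where $\boldsymbol{\Gamma}_x = (\boldsymbol{\gamma}_{xy}, \boldsymbol{\Gamma}_{xx})$, then under $H_0$: $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17), as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$\int_0^1 dW_u(a)\, F_{k-r}(a) \left( \int_0^1 F_{k-r}(a)^2\, da \right)^{-1/2} \qquad (25)$$
where
$$F_{k-r}(a) = \begin{cases}
W_u(a) - \int_0^1 W_u(a)\mathbf{W}_{k-r}(a)'\,da \left( \int_0^1 \mathbf{W}_{k-r}(a)\mathbf{W}_{k-r}(a)'\,da \right)^{-1} \mathbf{W}_{k-r}(a) & \text{Case I} \\
\tilde{W}_u(a) - \int_0^1 \tilde{W}_u(a)\tilde{\mathbf{W}}_{k-r}(a)'\,da \left( \int_0^1 \tilde{\mathbf{W}}_{k-r}(a)\tilde{\mathbf{W}}_{k-r}(a)'\,da \right)^{-1} \tilde{\mathbf{W}}_{k-r}(a) & \text{Case III} \\
\hat{W}_u(a) - \int_0^1 \hat{W}_u(a)\hat{\mathbf{W}}_{k-r}(a)'\,da \left( \int_0^1 \hat{\mathbf{W}}_{k-r}(a)\hat{\mathbf{W}}_{k-r}(a)'\,da \right)^{-1} \hat{\mathbf{W}}_{k-r}(a) & \text{Case V}
\end{cases}$$
$r = 0, \ldots, k$, and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.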
The form of the asymptotic representation (25) is similar to that of a Dickey–Fuller test for a unit root except that the standard Brownian motion $W_u(a)$ is replaced by the residual from an asymptotic regression of $W_u(a)$ on the independent $(k - r)$-vector standard Brownian motion $\mathbf{W}_{k-r}(a)$ (or their de-meaned and de-meaned and de-trended counterparts).
Similarly to the analysis following Theorem 3.1, we detail the limiting distribution of the t-statistic $t_{\pi_{yy}}$ in the two polar cases in which the forcing variables $\{\mathbf{x}_t\}$ are purely integrated of order zero and one respectively.
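Corollary 3.3 (Limiting distribution of $t_{\pi_{yy}}$ if $\{\mathbf{x}_t\} \sim I(0)$). If Assumptions 1-4 and 5a hold and $r = k$, that is, $\{\mathbf{x}_t\} \sim I(0)$, then under $H_0$: $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ of (17), as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$\int_0^1 dW_u(a)\, F(a) \left( \int_0^1 F(a)^2\, da \right)^{-1/2}$$
where
$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ \tilde{W}_u(a) & \text{Case III} \\ \hat{W}_u(a) & \text{Case V} \end{cases}$$
and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.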
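Corollary 3.4 (Limiting distribution of $t_{\pi_{yy}}$ if $\{\mathbf{x}_t\} \sim I(1)$). If Assumptions 1-4 and 5a hold, $\boldsymbol{\gamma}_{xy} = \mathbf{0}$, where $\boldsymbol{\Gamma}_x = (\boldsymbol{\gamma}_{xy}, \boldsymbol{\Gamma}_{xx})$, and $r = 0$, that is, $\{\mathbf{x}_t\} \sim I(1)$, then under $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$, as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$\int_0^1 dW_u(a)\, F_k(a) \left( \int_0^1 F_k(a)^2\, da \right)^{-1/2}$$
where $F_k(a)$ is defined in Theorem 3.2 for Cases I, III and V, $a \in [0, 1]$.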
As above, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.3 ($r = k$ and $\{\mathbf{x}_t\}$ is purely I(0)) and 3.4 ($r = 0$ and $\{\mathbf{x}_t\}$ is purely I(1)) provide
lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.2. Hence, a bounds procedure for testing $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ based on these two polar cases may be implemented as described above based on the t-statistic $t_{\pi_{yy}}$ for the exclusion of $y_{t-1}$ in the conditional ECMs (12), (14) and (16) without prior knowledge of the cointegrating rank $r$.13 These asymptotic critical value bounds are given in Tables CII(i), CII(iii) and CII(v) for Cases I, III and V for sizes 0.100, 0.050, 0.025 and 0.010.
As is emphasized in the Proof of Theorem 3.2 given in Appendix A, if the asymptotic analysis for the t-statistic $t_{\pi_{yy}}$ of (24) is conducted under $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ only, the resultant limit distribution for $t_{\pi_{yy}}$ depends on the nuisance parameter $\boldsymbol{\omega} - \boldsymbol{\phi}$ in addition to the cointegrating rank $r$, where, under Assumption 5a, $\boldsymbol{\alpha}_{yx} - \boldsymbol{\phi}'\boldsymbol{\alpha}_{xx} = \mathbf{0}'$. Moreover, if $y_t$ is allowed to Granger-cause $\mathbf{x}_t$, that is, $\boldsymbol{\gamma}_{xy,i} \neq \mathbf{0}$ for some $i = 1, \ldots, p - 1$, then the limit distribution also is dependent on the nuisance
parameter gxy /*yy f0 gxy ; see Appendix A. Consequently, in general, where w 6D f or gxy 6D 0,
Table CII. Asymptotic critical value bounds of the t-statistic. Testing for the existence of a levels relationship^a
Table CII(i): Case I: No intercept and no trend
Columns: k; I(0) and I(1) bounds at the 0.100, 0.050, 0.025 and 0.010 levels; I(0) and I(1) mean; I(0) and I(1) variance
0 -1.62 -1.62 -1.95 -1.95 -2.24 -2.24 -2.58 -2.58 -0.42 -0.42 0.98 0.98
1 -1.62 -2.28 -1.95 -2.60 -2.24 -2.90 -2.58 -3.22 -0.42 -0.98 0.98 1.12
2 -1.62 -2.68 -1.95 -3.02 -2.24 -3.31 -2.58 -3.66 -0.42 -1.39 0.98 1.12
3 -1.62 -3.00 -1.95 -3.33 -2.24 -3.64 -2.58 -3.97 -0.42 -1.71 0.98 1.09
4 -1.62 -3.26 -1.95 -3.60 -2.24 -3.89 -2.58 -4.23 -0.42 -1.98 0.98 1.07
5 -1.62 -3.49 -1.95 -3.83 -2.24 -4.12 -2.58 -4.44 -0.42 -2.22 0.98 1.05
6 -1.62 -3.70 -1.95 -4.04 -2.24 -4.34 -2.58 -4.67 -0.42 -2.43 0.98 1.04
7 -1.62 -3.90 -1.95 -4.23 -2.24 -4.54 -2.58 -4.88 -0.42 -2.63 0.98 1.04
8 -1.62 -4.09 -1.95 -4.43 -2.24 -4.72 -2.58 -5.07 -0.42 -2.81 0.98 1.04
9 -1.62 -4.26 -1.95 -4.61 -2.24 -4.89 -2.58 -5.25 -0.42 -2.98 0.98 1.04
10 -1.62 -4.42 -1.95 -4.76 -2.24 -5.06 -2.58 -5.44 -0.42 -3.15 0.98 1.03
Table CII(iii): Case III: Unrestricted intercept and no trend
Columns: k; I(0) and I(1) bounds at the 0.100, 0.050, 0.025 and 0.010 levels; I(0) and I(1) mean; I(0) and I(1) variance
0 -2.57 -2.57 -2.86 -2.86 -3.13 -3.13 -3.43 -3.43 -1.53 -1.53 0.72 0.71
1 -2.57 -2.91 -2.86 -3.22 -3.13 -3.50 -3.43 -3.82 -1.53 -1.80 0.72 0.81
2 -2.57 -3.21 -2.86 -3.53 -3.13 -3.80 -3.43 -4.10 -1.53 -2.04 0.72 0.86
3 -2.57 -3.46 -2.86 -3.78 -3.13 -4.05 -3.43 -4.37 -1.53 -2.26 0.72 0.89
4 -2.57 -3.66 -2.86 -3.99 -3.13 -4.26 -3.43 -4.60 -1.53 -2.47 0.72 0.91
5 -2.57 -3.86 -2.86 -4.19 -3.13 -4.46 -3.43 -4.79 -1.53 -2.65 0.72 0.92
6 -2.57 -4.04 -2.86 -4.38 -3.13 -4.66 -3.43 -4.99 -1.53 -2.83 0.72 0.93
7 -2.57 -4.23 -2.86 -4.57 -3.13 -4.85 -3.43 -5.19 -1.53 -3.00 0.72 0.94
8 -2.57 -4.40 -2.86 -4.72 -3.13 -5.02 -3.43 -5.37 -1.53 -3.16 0.72 0.96
9 -2.57 -4.56 -2.86 -4.88 -3.13 -5.18 -3.42 -5.54 -1.53 -3.31 0.72 0.96
10 -2.57 -4.69 -2.86 -5.03 -3.13 -5.34 -3.43 -5.68 -1.53 -3.46 0.72 0.96
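13 Although Corollary 3.3 does not require $\boldsymbol{\gamma}_{xy} = \mathbf{0}$ and $H_0^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ is automatically satisfied under the conditions of Corollary 3.4, the simulation critical value bounds result requires $\boldsymbol{\gamma}_{xy} = \mathbf{0}$ and $H_0^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ for $0 < r < k$.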
Table CII(v): Case V: Unrestricted intercept and unrestricted trend
Columns: k; I(0) and I(1) bounds at the 0.100, 0.050, 0.025 and 0.010 levels; I(0) and I(1) mean; I(0) and I(1) variance
0 -3.13 -3.13 -3.41 -3.41 -3.65 -3.66 -3.96 -3.97 -2.18 -2.18 0.57 0.57
1 -3.13 -3.40 -3.41 -3.69 -3.65 -3.96 -3.96 -4.26 -2.18 -2.37 0.57 0.67
2 -3.13 -3.63 -3.41 -3.95 -3.65 -4.20 -3.96 -4.53 -2.18 -2.55 0.57 0.74
3 -3.13 -3.84 -3.41 -4.16 -3.65 -4.42 -3.96 -4.73 -2.18 -2.72 0.57 0.79
4 -3.13 -4.04 -3.41 -4.36 -3.65 -4.62 -3.96 -4.96 -2.18 -2.89 0.57 0.82
5 -3.13 -4.21 -3.41 -4.52 -3.65 -4.79 -3.96 -5.13 -2.18 -3.04 0.57 0.85
6 -3.13 -4.37 -3.41 -4.69 -3.65 -4.96 -3.96 -5.31 -2.18 -3.20 0.57 0.87
7 -3.13 -4.53 -3.41 -4.85 -3.65 -5.14 -3.96 -5.49 -2.18 -3.34 0.57 0.88
8 -3.13 -4.68 -3.41 -5.01 -3.65 -5.30 -3.96 -5.65 -2.18 -3.49 0.57 0.90
9 -3.13 -4.82 -3.41 -5.15 -3.65 -5.44 -3.96 -5.79 -2.18 -3.62 0.57 0.91
10 -3.13 -4.96 -3.41 -5.29 -3.65 -5.59 -3.96 -5.94 -2.18 -3.75 0.57 0.92
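a The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the t-statistic for testing $\phi = 0$ in the regression: $\Delta y_t = \phi y_{t-1} + \boldsymbol{\delta}'\mathbf{x}_{t-1} + \mathbf{a}'\mathbf{w}_t + \xi_t$, $t = 1, \ldots, T$, where $\mathbf{x}_t = (x_{1t}, \ldots, x_{kt})'$ and
$\mathbf{w}_t = 0$ (Case I)
$\mathbf{w}_t = 1$ (Case III)
$\mathbf{w}_t = (1, t)'$ (Case V)
The variables $y_t$ and $\mathbf{x}_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $\mathbf{x}_t = \mathbf{P}\mathbf{x}_{t-1} + \boldsymbol{\varepsilon}_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $\mathbf{x}_0 = \mathbf{0}$ and $\boldsymbol{\varepsilon}_t = (\varepsilon_{1t}, \boldsymbol{\varepsilon}_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $\mathbf{x}_t$ is purely I(1), $\mathbf{P} = \mathbf{I}_k$ whereas $\mathbf{P} = \mathbf{0}$ if $\mathbf{x}_t$ is purely I(0). The critical values for $k = 0$ correspond to those of Dickey and Fuller’s (1979) unit root t-statistics. The columns headed ‘I(0)’ refer to the lower critical values bound obtained when $\mathbf{x}_t$ is purely I(0), while the columns headed ‘I(1)’ refer to the upper bound obtained when $\mathbf{x}_t$ is purely I(1).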
although the t-statistic $t_{\pi_{yy}}$ has a well-defined limiting distribution under $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$, the above bounds testing procedure for $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ based on $t_{\pi_{yy}}$ is not asymptotically similar.14
Consequently, in the light of the consistency results for the above statistics discussed in Section 4, see Theorems 4.1, 4.2 and 4.4, we suggest the following procedure for ascertaining the existence of a level relationship between $y_t$ and $\mathbf{x}_t$: test $H_0$ of (17) using the bounds procedure based on the Wald or F-statistic of (21) from Corollaries 3.1 and 3.2: (a) if $H_0$ is not rejected, proceed no further; (b) if $H_0$ is rejected, test $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ using the bounds procedure based on the t-statistic $t_{\pi_{yy}}$ of (24) from Corollaries 3.3 and 3.4. If $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ is false, a large value of $t_{\pi_{yy}}$ should result, at least asymptotically, confirming the existence of a level relationship between $y_t$ and $\mathbf{x}_t$, which, however, may be degenerate (if $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$).
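The two-step procedure just described can be written as a small decision routine. The sketch below is illustrative only: the function names are hypothetical, the statistics passed in the usage lines are made-up numbers, and it assumes that the t-statistic bounds are negative (Dickey–Fuller type), so that rejection occurs for values below the I(1) bound.

```python
def bounds_decision(stat, i0_bound, i1_bound, lower_tail=False):
    """Three-way bounds decision.

    For an F-statistic, reject when stat exceeds the I(1) bound.
    For a t-statistic (lower_tail=True), the bounds are negative and the
    null is rejected when stat falls below the I(1) bound.
    """
    if lower_tail:
        stat, i0_bound, i1_bound = -stat, -i0_bound, -i1_bound
    if stat < i0_bound:
        return "not rejected"
    if stat > i1_bound:
        return "rejected"
    return "inconclusive"


def level_relationship_test(f_stat, f_bounds, t_stat, t_bounds):
    """Step (a): bounds F-test of H0; step (b): bounds t-test only if (a) rejects."""
    step_a = bounds_decision(f_stat, *f_bounds)
    if step_a != "rejected":
        return step_a, None          # proceed no further
    step_b = bounds_decision(t_stat, *t_bounds, lower_tail=True)
    return step_a, step_b


# Hypothetical statistics, with 0.05 bounds for Case III and k = 4
print(level_relationship_test(5.0, (2.86, 4.01), -4.2, (-2.86, -3.99)))
```

Returning `None` for the second step makes explicit that the t-test is only consulted once the F-test has rejected.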
!
14 In principle, the asymptotic distribution of t!yy under H0 yy : !yy D 0 may be simulated from the limiting representation
2 2
given in the Proof of Theorem 3.2 of Appendix A after substitution of consistent estimators for f and lxy gxy /*yy.x under
!yy 2 0
H0 : !yy D 0, where *yy.x *yy f *xy . Although such estimators may be obtained straightforwardly, unfortunately,
they necessitate the use of parameter estimators from the marginal ECM (7) for fxt g1 tD1 .
of the Wald statistic of (21) under a sequence of local alternatives. Finally, we show that the
bounds procedure based on the t-statistic of (24) is consistent.
In the discussion of the consistency of the bounds test procedure based on the Wald statistic of (21), because the rank of the long-run multiplier matrix $\boldsymbol{\Pi}$ may be either $r$ or $r + 1$ under the alternative hypothesis $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18) where $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$ and $H_1^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} \neq \mathbf{0}'$, it is necessary to deal with these two possibilities. First, under $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$, the rank of $\boldsymbol{\Pi}$ is $r + 1$ so Assumption 5b applies; in particular, $\alpha_{yy} \neq 0$. Second, under $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$, the rank of $\boldsymbol{\Pi}$ is $r$ so Assumption 5a applies; in this case, $H_1^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} \neq \mathbf{0}'$ holds and, in particular, $\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx} \neq \mathbf{0}'$.
Theorem 4.1 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions 1-4 and 5b hold, then under $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$ of (18) the Wald statistic $W$ (21) is consistent against $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$ in Cases I–V defined in (12)–(16).
Theorem 4.2 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$). If Assumptions 1–4 and 5a hold, then under $H_1^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} \neq \mathbf{0}'$ of (18) and $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ of (17) the Wald statistic $W$ (21) is consistent against $H_1^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} \neq \mathbf{0}'$ in Cases I–V defined in (12)–(16).
Hence, combining Theorems 4.1 and 4.2, the bounds procedure of Section 3 based on the Wald statistic $W$ (21) defines a consistent test of $H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}}$ of (17) against $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18). This result holds irrespective of whether the forcing variables $\{\mathbf{x}_t\}$ are purely I(0), purely I(1) or mutually cointegrated.
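We now turn to consider the asymptotic distribution of the Wald statistic (21) under a suitably specified sequence of local alternatives. Recall that under Assumption 5b, $\boldsymbol{\pi}_{y.x}\,[= (\pi_{yy}, \boldsymbol{\pi}_{yx.x})] = (\alpha_{yy}\beta_{yy},\ \alpha_{yy}\boldsymbol{\beta}_{xy}' + (\boldsymbol{\alpha}_{yx} - \boldsymbol{\omega}'\boldsymbol{\alpha}_{xx})\boldsymbol{\beta}_{xx}')$. Consequently, we define the sequence of local alternatives
$$H_{1T}: \boldsymbol{\pi}_{y.xT}\,[= (\pi_{yyT}, \boldsymbol{\pi}_{yx.xT})] = (T^{-1}\alpha_{yy}\beta_{yy},\ T^{-1}\alpha_{yy}\boldsymbol{\beta}_{xy}' + T^{-1/2}(\boldsymbol{\delta}_{yx} - \boldsymbol{\omega}'\boldsymbol{\delta}_{xx})\boldsymbol{\beta}_{xx}') \qquad (26)$$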
In order to detail the limit distribution of the Wald statistic under the sequence of local alternatives $H_{1T}$ of (26), it is necessary to define the $(k - r + 1)$-dimensional Ornstein–Uhlenbeck process $\mathbf{J}^*_{k-r+1}(a) = (J^*_u(a), \mathbf{J}^*_{k-r}(a)')'$ which obeys the stochastic integral and differential equations,
$$\mathbf{J}^*_{k-r+1}(a) = \mathbf{W}_{k-r+1}(a) + \mathbf{a}\mathbf{b}' \int_0^a \mathbf{J}^*_{k-r+1}(r)\,dr \quad \text{and} \quad d\mathbf{J}^*_{k-r+1}(a) = d\mathbf{W}_{k-r+1}(a) + \mathbf{a}\mathbf{b}'\mathbf{J}^*_{k-r+1}(a)\,da,$$
Theorem 4.3 (Limiting distribution of W under H1T ). If Assumptions 1–4 and 5a hold, then under
H1T : !y.x D T1 ˛yy b0y C T1/2 dyx w0 dxx b0 of (26), as T ! 1, the asymptotic distribution of
the Wald statistic W of (21) has the representation
1
1 1
1
W ) z0r zr C dJŁu aFkrC1 a0 FkrC1 aFkrC1 a0 da FkrC1 a dJŁu a 28
0 0 0
0
where zr ¾ NQ1/2 h, Ir , Q[D Q1/20 Q1/2 ] D p limT!1 T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ , h dyx w
0
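The first component of (28), $\mathbf{z}_r'\mathbf{z}_r$, is non-central chi-square distributed with $r$ degrees of freedom and non-centrality parameter $\boldsymbol{\eta}'\mathbf{Q}\boldsymbol{\eta}$ and corresponds to the local alternative $H_{1T}^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.xT} = T^{-1/2}(\boldsymbol{\delta}_{yx} - \boldsymbol{\omega}'\boldsymbol{\delta}_{xx})\boldsymbol{\beta}_{xx}'$ under $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$. The second term in (28) is a non-standard Dickey–Fuller unit-root distribution under the local alternative $H_{1T}^{\pi_{yy}}$: $\pi_{yyT} = T^{-1}\alpha_{yy}\beta_{yy}$ and $\boldsymbol{\delta}_{yx} - \boldsymbol{\omega}'\boldsymbol{\delta}_{xx} = \mathbf{0}'$. Note that under $H_0$ of (17), that is, $\alpha_{yy} = 0$ and $\boldsymbol{\delta}_{yx} - \boldsymbol{\omega}'\boldsymbol{\delta}_{xx} = \mathbf{0}'$, the limiting representation (28) reduces to (22) as should be expected.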
The proof for the consistency of the bounds test procedure based on the t-statistic of (24) requires that the rank of the long-run multiplier matrix $\boldsymbol{\Pi}$ is $r + 1$ under the alternative hypothesis $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$. Hence, Assumption 5b applies; in particular, $\alpha_{yy} \neq 0$.
Theorem 4.4 (Consistency of the t-statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions 1–4 and 5b hold, then under $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$ of (18) the t-statistic $t_{\pi_{yy}}$ (24) is consistent against $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$ in Cases I, III and V defined in (12), (14) and (16).
As noted at the end of Section 3, Theorem 4.4 suggests the possibility of using $t_{\pi_{yy}}$ to discriminate between $H_0^{\pi_{yy}}$: $\pi_{yy} = 0$ and $H_1^{\pi_{yy}}$: $\pi_{yy} \neq 0$, although, if $H_0^{\pi_{yx.x}}$: $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ is false, the bounds procedure given via Corollaries 3.3 and 3.4 is not asymptotically similar.
utility. Following Darby and Wren-Lewis (1993), the theoretical real wage equation underlying
the Treasury’s earnings equation is given by
$$w_t = \frac{Prod_t}{1 + f(UR_t)(1 - RR_t)/Union_t} \qquad (29)$$
where $w_t$ is the real wage, $Prod_t$ is labour productivity, $RR_t$ is the replacement ratio defined as the ratio of unemployment benefit to the wage rate, $Union_t$ is a measure of ‘union power’, and $f(UR_t)$ is the probability of a union member becoming unemployed, which is assumed to be an increasing function of the unemployment rate $UR_t$. The econometric specification is based on a log-linearized version of (29) after allowing for a wedge effect that takes account of the difference between the ‘real product wage’ which is the focus of the firms’ decision, and the ‘real consumption wage’ which concerns the union.15 The theoretical arguments for a possible long-run wedge effect on real wages are mixed and, as emphasized by CSW, whether such long-run effects are present is an empirical matter. The change in the unemployment rate ($\Delta UR_t$) is also included in the Treasury’s wage equation. CSW cite two different theoretical rationales for the inclusion of $\Delta UR_t$
in the wage equation: the differential moderating effects of long- and short-term unemployed
on real wages, and the ‘insider–outsider’ theories which argue that only rising unemployment
will be effective in significantly moderating wage demands. See Blanchard and Summers (1986)
and Lindbeck and Snower (1989). The ARDL model and its associated unrestricted equilibrium
correction formulation used here automatically allow for such lagged effects.
We begin our empirical analysis from the maintained assumption that the time series properties
of the key variables in the Treasury’s earnings equation can be well approximated by a log-linear
VAR($p$) model, augmented with appropriate deterministics such as intercepts and time trends.
To ensure comparability of our results with those of the Treasury, the replacement ratio is not
included in the analysis. CSW, p. 50, report that ‘... it has not proved possible to identify a
significant effect from the replacement ratio, and this had to be omitted from our specification’.16
Also, as in CSW, we include two dummy variables to account for the effects of incomes policies
on average earnings. These dummy variables are defined by
$D7475_t = 1$ over the period 1974q1–1975q4, 0 elsewhere
$D7579_t = 1$ over the period 1975q1–1979q4, 0 elsewhere
The asymptotic theory developed in the paper is not affected by the inclusion of such ‘one-off’ dummy variables.17 Let $\mathbf{z}_t = (w_t, Prod_t, UR_t, Wedge_t, Union_t)' = (w_t, \mathbf{x}_t')'$. Then, using the analysis of Section 2, the conditional ECM of interest can be written as
$$\Delta w_t = c_0 + c_1 t + c_2 D7475_t + c_3 D7579_t + \pi_{ww} w_{t-1} + \boldsymbol{\pi}_{wx.x}\mathbf{x}_{t-1} + \sum_{i=1}^{p-1} \boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \boldsymbol{\delta}'\Delta\mathbf{x}_t + u_t \qquad (30)$$
15 The wedge effect is further decomposed into a tax wedge and an import price wedge in the Treasury model, but this
decomposition is not pursued here.
16 It is important, however, that, at a future date, a fresh investigation of the possible effects of the replacement ratio on
real wages should be undertaken.
17 However, both the asymptotic theory and associated critical values must be modified if the fraction of periods in which
the dummy variables are non-zero does not tend to zero with the sample size T. In the present application, both dummy
variables included in the earnings equation are zero after 1979, and the fractions of observations where $D7475_t$ and $D7579_t$
are non-zero are only 7.6% and 19.2% respectively.
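As a quick check of the fractions quoted in footnote 17, the two incomes-policy dummies can be constructed over the estimation sample 1972q1–1997q4. The sketch below is only an illustration: the helper name `period_dummy` and the representation of quarters as `(year, quarter)` tuples are assumptions made for the example.

```python
# Quarterly estimation sample 1972q1-1997q4 as (year, quarter) tuples
quarters = [(year, q) for year in range(1972, 1998) for q in range(1, 5)]


def period_dummy(start, end):
    """Return 1 for quarters within [start, end] (inclusive), 0 elsewhere."""
    return [1 if start <= yq <= end else 0 for yq in quarters]


d7475 = period_dummy((1974, 1), (1975, 4))
d7579 = period_dummy((1975, 1), (1979, 4))

T = len(quarters)
share_7475 = 100 * sum(d7475) / T   # percentage of non-zero observations
share_7579 = 100 * sum(d7579) / T

print(T, sum(d7475), sum(d7579), round(share_7475, 2), round(share_7579, 2))
```

The dummies are non-zero in 8 and 20 of the 104 quarters, that is, roughly 7.7% and 19.2% of the sample, in line with the figures reported in the footnote.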
Under the assumption that lagged real wages, $w_{t-1}$, do not enter the sub-VAR model for $\mathbf{x}_t$,
the above real wage equation is identified and can be estimated consistently by LS.18 Notice,
however, that this assumption does not rule out the inclusion of lagged changes in real wages in
the unemployment or productivity equations, for example. The exclusion of the level of real wages
from these equations is an identification requirement for the bargaining theory of wages which
permits it to be distinguished from other alternatives, such as the efficiency wage theory which
postulates that labour productivity is partly determined by the level of real wages.19 It is clear
that, in our framework, the bargaining theory and the efficiency wage theory cannot be entertained
simultaneously, at least not in the long run.
The above specification is also based on the assumption that the disturbances ut are serially
uncorrelated. It is therefore important that the lag order p of the underlying VAR is selected
appropriately. There is a delicate balance between choosing p sufficiently large to mitigate the
residual serial correlation problem and, at the same time, sufficiently small so that the conditional
ECM (30) is not unduly over-parameterized, particularly in view of the limited time series data
which are available.
Finally, a decision must be made concerning the time trend in (30) and whether its coefficient
should be restricted.20 This issue can only be settled in light of the particular sample period under
consideration. The time series data used are quarterly, cover the period 1970q1–1997q4, and are seasonally adjusted (where relevant).21 To ensure comparability of results for different choices of $p$, all estimations use the same sample period, 1972q1–1997q4 ($T = 104$), with the first eight
observations reserved for the construction of lagged variables.
The five variables in the earnings equation were constructed from primary sources in the following manner: $w_t = \ln(ERPR_t/PYNONG_t)$, $Wedge_t = \ln(1 + TE_t) + \ln(1 - TD_t) - \ln(RPIX_t/PYNONG_t)$, $UR_t = \ln(100 \times ILOU_t/(ILOU_t + WFEMP_t))$, $Prod_t = \ln((YPROM_t + 278.29 \times YMF_t)/(EMF_t + ENMF_t))$, and $Union_t = \ln(UDEN_t)$, where $ERPR_t$ is average private sector earnings per employee (£), $PYNONG_t$ is the non-oil non-government GDP deflator, $YPROM_t$ is output in the private, non-oil, non-manufacturing, and public traded sectors at constant factor cost (£ million, 1990), $YMF_t$ is the manufacturing output index adjusted for stock changes (1990 = 100), $EMF_t$ and $ENMF_t$ are respectively employment in UK manufacturing and non-manufacturing sectors (thousands), $ILOU_t$ is the International Labour Office (ILO) measure of unemployment (thousands), $WFEMP_t$ is total employment (thousands), $TE_t$ is the average employers’ National Insurance contribution rate, $TD_t$ is the average direct tax rate on employment incomes, $RPIX_t$ is the Retail Price Index excluding mortgage payments, and $UDEN_t$ is union density (used to proxy ‘union power’) measured by union membership as a percentage of employment.22 The time series plots of the five variables included in the VAR model are given in Figures 1–3.
18 See Assumption 3 and the following discussion. By construction, the contemporaneous effects $\Delta\mathbf{x}_t$ are uncorrelated
with the disturbance term ut and instrumental variable estimation which has been particularly popular in the empirical
wage equation literature is not necessary. Indeed, given the unrestricted nature of the lag distribution of the conditional
ECM (30), it is difficult to find suitable instruments: namely, variables that are not already included in the model, which
are uncorrelated with ut and also have a reasonable degree of correlation with the included variables in (30).
19 For a discussion of the issues that surround the identification of wage equations, see Manning (1993).
20 See, for example, PSS and the discussion in Section 2.
21 We are grateful to Andrew Gurney and Rod Whittaker for providing us with the data. For further details about the
sources and the descriptions of the variables, see CSW, pp. 46–51 and p. 11 of the Annex.
22 The data series for UDEN assumes a constant rate of unionization from 1980q4 onwards.
[Figure 1 image: panel (a) plots Real Wages and Productivity on a log scale; panel (b) plots the rates of change of Real Wage and Productivity; horizontal axis in both panels: Quarters, 1972Q1–1997Q4.]
Figure 1. (a) Real wages and labour productivity. (b) Rate of change of real wages and labour productivity
It is clear from Figure 1 that real wages (average earnings) and productivity show steadily rising
trends with real wages growing at a faster rate than productivity.23 This suggests, at least initially,
that a linear trend should be included in the real wage equation (30). Also the application of unit
root tests to the five variables, perhaps not surprisingly, yields mixed results with strong evidence
in favour of the unit root hypothesis only in the cases of real wages and productivity. This does
not necessarily preclude the other three variables (UR, Wedge, and Union) having levels impact
on real wages. Following the methodology developed in this paper, it is possible to test for the
existence of a real wage equation involving the levels of these five variables irrespective of whether
they are purely I(0), purely I(1), or mutually cointegrated.
23 Over the period 1972q1– 97q4, real wages grew by 2.14% per annum as compared to labour productivity that increased
by an annual average rate of 1.54% over the same period.
[Figures 2 and 3 images: one panel plots UNION and WEDGE and the other plots UR on a log scale; horizontal axis in both: Quarters, 1972Q1–1997Q4.]
To determine the appropriate lag length p and whether a deterministic linear trend is required
in addition to the productivity variable, we estimated the conditional model (30) by LS, with
and without a linear time trend, for $p = 1, 2, \ldots, 7$. As pointed out earlier, all regressions were computed over the same period 1972q1–1997q4. We found that lagged changes of the productivity variable, $\Delta Prod_{t-1}, \Delta Prod_{t-2}, \ldots$, were insignificant (either singly or jointly) in all regressions.
Therefore, for the sake of parsimony and to avoid unnecessary over-parameterization, we decided
to re-estimate the regressions without these lagged variables, but including lagged changes of
all other variables. Table I gives Akaike’s and Schwarz’s Bayesian Information Criteria, denoted
respectively by AIC and SBC, and Lagrange multiplier (LM) statistics for testing the hypothesis of no residual serial correlation against orders 1 and 4 denoted by $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ respectively. As might be expected, the lag order selected by AIC, $\hat{p}_{aic} = 6$, irrespective of whether a deterministic trend term is included or not, is much larger than that selected by SBC. This latter criterion gives estimates $\hat{p}_{sbc} = 1$ if a trend is included and $\hat{p}_{sbc} = 4$ if not. The $\chi^2_{SC}$ statistics also suggest using a relatively high lag order: 4 or more. In view of the importance of the assumption of serially uncorrelated errors for the validity of the bounds tests, it seems prudent to select $p$ to be either 5 or 6.24 Nevertheless, for completeness, in what follows we report test results for $p = 4$ and 5, as well as for our preferred choice, namely $p = 6$. The results in Table I also indicate
that there is little to choose between the conditional ECM with or without a linear deterministic
trend.
Table II gives the values of the F- and t-statistics for testing the existence of a level earnings
equation under three different scenarios for the deterministics, Cases III, IV and V of (14), (15)
and (16) respectively; see Sections 2 and 3 for detailed discussions.
The various statistics in Table II should be compared with the critical value bounds provided
in Tables CI and CII. First, consider the bounds F-statistic. As argued in PSS, the statistic $F_{IV}$ which sets the trend coefficient to zero under the null hypothesis of no level relationship, Case IV of (15), is more appropriate than $F_V$, Case V of (16), which ignores this constraint. Note that, if the trend coefficient $c_1$ is not subject to this restriction, (30) implies a quadratic trend in the level of real wages under the null hypothesis of $\pi_{ww} = 0$ and $\boldsymbol{\pi}_{wx.x} = \mathbf{0}'$, which is empirically implausible. The critical value bounds for the statistics $F_{IV}$ and $F_V$ are given in Tables CI(iv) and CI(v). Since $k = 4$, the 0.05 critical value bounds are (3.05, 3.97) and (3.47, 4.57) for $F_{IV}$ and $F_V$, respectively.25 The test outcome depends on the choice of the lag order $p$. For $p = 4$, the
Table I. Statistics for selecting the lag order of the earnings equation
Notes: $p$ is the lag order of the underlying VAR model for the conditional ECM (30), with zero restrictions on the coefficients of lagged changes in the productivity variable. $AIC_p \equiv LL_p - s_p$ and $SBC_p \equiv LL_p - (s_p/2)\ln T$ denote Akaike’s and Schwarz’s Bayesian Information Criteria for a given lag order $p$, where $LL_p$ is the maximized log-likelihood value of the model, $s_p$ is the number of freely estimated coefficients and $T$ is the sample size. $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ are LM statistics for testing no residual serial correlation against orders 1 and 4. The symbols *, **, and *** denote significance at 0.01, 0.05 and 0.10 levels, respectively.
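The two criteria defined in the notes are stated in "larger is better" form, so the selected lag order is the one that maximizes each criterion. A minimal sketch follows; the function names and the log-likelihood values and parameter counts in `example_fits` are hypothetical and are not taken from Table I.

```python
import math


def aic(log_lik, n_params):
    """AIC_p = LL_p - s_p (larger is better in this form)."""
    return log_lik - n_params


def sbc(log_lik, n_params, T):
    """SBC_p = LL_p - (s_p / 2) ln T (larger is better in this form)."""
    return log_lik - 0.5 * n_params * math.log(T)


def select_lag(fits, T):
    """fits maps lag order p -> (LL_p, s_p); returns (AIC choice, SBC choice)."""
    p_aic = max(fits, key=lambda p: aic(*fits[p]))
    p_sbc = max(fits, key=lambda p: sbc(*fits[p], T))
    return p_aic, p_sbc


# Hypothetical log-likelihoods and parameter counts, for illustration only
example_fits = {1: (300.0, 10), 2: (307.0, 14), 3: (312.5, 18)}
print(select_lag(example_fits, T=104))
```

With $T = 104$ the SBC penalty per coefficient is $\tfrac{1}{2}\ln 104 \approx 2.32$ against 1 for AIC, which is why SBC tends to select the more parsimonious lag order.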
24 In the Treasury model, different lag orders are chosen for different variables. The highest lag order selected is 4 applied
to the log of the price deflator and the wedge variable. The estimation period of the earnings equation in the Treasury
model is 1971q1– 1994q3.
25 Following a suggestion from one of the referees we also computed critical value bounds for our sample size, namely $T = 104$. For $k = 4$, the 5% critical value bounds associated with $F_{IV}$ and $F_V$ statistics turned out to be (3.19, 4.16) and (3.61, 4.76), respectively, which are only marginally different from the asymptotic critical value bounds.
Column headings: $p$; with deterministic trends: $F_{IV}$, $F_V$, $t_V$; without deterministic trends: $F_{III}$, $t_{III}$
Notes: See the notes to Table I. $F_{IV}$ is the F-statistic for testing $\pi_{ww} = 0$, $\boldsymbol{\pi}_{wx.x} = \mathbf{0}'$ and $c_1 = 0$ in (30). $F_V$ is the F-statistic for testing $\pi_{ww} = 0$ and $\boldsymbol{\pi}_{wx.x} = \mathbf{0}'$ in (30). $F_{III}$ is the F-statistic for testing $\pi_{ww} = 0$ and $\boldsymbol{\pi}_{wx.x} = \mathbf{0}'$ in (30) with $c_1$ set equal to 0. $t_V$ and $t_{III}$ are the t-ratios for testing $\pi_{ww} = 0$ in (30) with and without a deterministic linear trend. a indicates that the statistic lies below the 0.05 lower bound, b that it falls within the 0.05 bounds, and c that it lies above the 0.05 upper bound.
hypothesis that there exists no level earnings equation is not rejected at the 0.05 level, irrespective of whether the regressors are purely I(0), purely I(1) or mutually cointegrated. For $p = 5$, the bounds test is inconclusive. For $p = 6$ (selected by AIC), the statistic $F_V$ is still inconclusive, but $F_{IV} = 4.78$ lies outside the 0.05 critical value bounds and rejects the null hypothesis that there exists no level earnings equation, irrespective of whether the regressors are purely I(0), purely I(1) or mutually cointegrated.26 This finding is even more conclusive when the bounds F-test is applied to the earnings equations without a linear trend. The relevant test statistic is $F_{III}$ and the associated 0.05 critical value bounds are (2.86, 4.01).27 For $p = 4$, $F_{III} = 3.63$, and the test result is inconclusive. However, for $p = 5$ and 6, the values of $F_{III}$ are 5.23 and 5.42 respectively and the hypothesis of no levels earnings equation is conclusively rejected.
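The comparisons reported in this paragraph can be reproduced mechanically from the statistics and 0.05 bounds quoted in the text for $k = 4$. The snippet below is a small illustrative check; the function and variable names are hypothetical, and only statistics whose values appear in the text are included.

```python
def bounds_verdict(stat, lower, upper):
    """Compare an F-statistic with its I(0) (lower) and I(1) (upper) bounds."""
    if stat < lower:
        return "not rejected"
    if stat > upper:
        return "rejected"
    return "inconclusive"


# 0.05 critical value bounds for k = 4, as quoted in the text
BOUNDS_05_K4 = {"F_IV": (3.05, 3.97), "F_III": (2.86, 4.01)}

# (statistic, lag order p, reported value), as quoted in the text
REPORTED = [("F_IV", 6, 4.78), ("F_III", 4, 3.63), ("F_III", 5, 5.23), ("F_III", 6, 5.42)]

verdicts = {
    (name, p): bounds_verdict(value, *BOUNDS_05_K4[name])
    for name, p, value in REPORTED
}
for (name, p), verdict in verdicts.items():
    print(f"{name}, p={p}: {verdict}")
```

The outcomes agree with the discussion above: $F_{IV}$ rejects for $p = 6$, while $F_{III}$ is inconclusive for $p = 4$ and rejects for $p = 5$ and 6.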
The results from the application of the bounds t-test to the earnings equations are less clear-cut and do not allow the imposition of the trend restrictions discussed above. The 0.05 critical value bounds for $t_{III}$ and $t_V$, when $k = 4$, are $(-2.86, -3.99)$ and $(-3.41, -4.36)$.28 Therefore, if a linear trend is included, the bounds t-test does not reject the null even if $p = 5$ or 6. However, when the trend term is excluded, the null is rejected for $p = 5$. Overall, these test results support
the existence of a levels earnings equation when a sufficiently high lag order is selected and
when the statistically insignificant deterministic trend term is excluded from the conditional ECM
(30). Such a specification is in accord with the evidence on the performance of the alternative
conditional ECMs set out in Table I.
In testing the null hypothesis that there are no level effects in (30), namely ($\pi_{ww} = 0$, $\boldsymbol{\pi}_{wx.x} = \mathbf{0}'$)
it is important that the coefficients of lagged changes remain unrestricted, otherwise these tests
could be subject to a pre-testing problem. However, for the subsequent estimation of levels effects
and short-run dynamics of real wage adjustments, the use of a more parsimonious specification
seems advisable. To this end we adopt the ARDL approach to the estimation of the level relations
discussed in Pesaran and Shin (1999).29 First, the (estimated) orders of an ARDL($p, p_1, p_2, p_3, p_4$) model in the five variables ($w_t$, $Prod_t$, $UR_t$, $Wedge_t$, $Union_t$) were selected by searching across the $7^5 = 16{,}807$ ARDL models, spanned by $p = 0, 1, \ldots, 6$, and $p_i = 0, 1, \ldots, 6$, $i = 1, \ldots, 4$, using the AIC criterion.30 This resulted in the choice of an ARDL(6, 0, 5, 4, 5) specification with estimates of the levels relationship given by
$$w_t = \underset{(0.050)}{1.063}\,Prod_t - \underset{(0.034)}{0.105}\,UR_t - \underset{(0.265)}{0.943}\,Wedge_t + \underset{(0.311)}{1.481}\,Union_t + \underset{(0.242)}{2.701} + \hat{v}_t \qquad (31)$$
where $\hat{v}_t$ is the equilibrium correction term, and the standard errors are given in parentheses.
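The size of the search over lag orders follows directly from allowing each of the five orders to take the seven values 0 to 6. The sketch below enumerates the grid and shows the selection pattern; `toy_criterion` is a made-up stand-in for the AIC of an estimated ARDL model and is not the criterion computed in the paper.

```python
from itertools import product

MAX_LAG = 6


def ardl_order_grid(n_orders=5, max_lag=MAX_LAG):
    """All combinations (p, p1, ..., p4) with each order in 0..max_lag."""
    return list(product(range(max_lag + 1), repeat=n_orders))


def select_orders(grid, criterion):
    """Return the order combination that maximizes the supplied criterion."""
    return max(grid, key=criterion)


def toy_criterion(orders):
    # Hypothetical score that peaks at an arbitrary target; in practice this
    # would be the AIC of the ARDL model estimated with these orders.
    target = (1, 2, 3, 4, 5)
    return -sum((o - t) ** 2 for o, t in zip(orders, target))


grid = ardl_order_grid()
print(len(grid))                      # number of candidate ARDL models
print(select_orders(grid, toy_criterion))
```

An exhaustive search of this size is feasible because each candidate is a single least squares regression.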
All levels estimates are highly significant and have the expected signs. The coefficients of the
productivity and the wedge variables are insignificantly different from unity. In the Treasury’s
earnings equation, the levels coefficient of the productivity variable is imposed as unity and the
above estimates can be viewed as providing empirical support for this a priori restriction. Our
levels estimates of the effects of the unemployment rate and the union variable on real wages,
namely $-0.105$ and 1.481, are also in line with the Treasury estimates of $-0.09$ and 1.31.31
The main difference between the two sets of estimates concerns the levels coefficient of the
wedge variable. We obtain a much larger estimate, almost twice that obtained by the Treasury.
Setting the levels coefficients of the $Prod_t$ and $Wedge_t$ variables to unity provides the alternative
interpretation that the share of wages (net of taxes and computed using RPIX rather than the
implicit GDP deflator) has varied negatively with the rate of unemployment and positively with
union strength.32
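Two of the numerical claims in this passage can be checked directly: the size of the lag-order search grid, and the statement that the productivity and wedge coefficients are insignificantly different from unity. The sketch below is illustrative only. It assumes the coefficients are compared with one in magnitude and that an asymptotic normal critical value of 1.96 is an acceptable yardstick; the helper name is hypothetical.

```python
from itertools import product

# Every combination of lag orders (p, p1, p2, p3, p4), each running from 0 to 6.
lag_grid = list(product(range(7), repeat=5))
n_models = len(lag_grid)
print(n_models)  # 7**5 = 16807

def t_ratio_against(value: float, estimate: float, std_error: float) -> float:
    """t-ratio for the hypothesis that a coefficient equals `value`."""
    return (estimate - value) / std_error

# Magnitudes of the reported levels estimates and their standard errors.
t_prod = t_ratio_against(1.0, 1.063, 0.050)
t_wedge = t_ratio_against(1.0, 0.943, 0.265)

for name, t in (("Prod", t_prod), ("Wedge", t_wedge)):
    verdict = "not rejected" if abs(t) < 1.96 else "rejected"
    print(f"{name}: t = {t:.3f}, unit coefficient {verdict} at the 0.05 level")
```

Both ratios are well inside the conventional band, which is consistent with the reading that a unit coefficient on each variable is not contradicted by these estimates.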
The conditional ECM regression associated with the above level relationship is given in
Table III.33 These estimates provide further direct evidence on the complicated dynamics that seem
to exist between real wage movements and their main determinants.34 All five lagged changes in
real wages are statistically significant, further justifying the choice of $p = 6$. The equilibrium
correction coefficient is estimated as $-0.229$ (0.0586) which is reasonably large and highly
significant.35 The auxiliary equation of the autoregressive part of the estimated conditional ECM
has real roots 0.9231 and 0.9095 and two pairs of complex roots with moduli 0.7589 and 0.6381,
which suggests an initially cyclical real wage process that slowly converges towards the equilibrium
described by (31).36 The regression fits reasonably well and passes the diagnostic tests against non-
normal errors and heteroscedasticity. However, it fails the functional form misspecification test at
29 Note that the ARDL approach advanced in Pesaran and Shin (1999) is applicable irrespective of whether the regressors
are purely I(0), purely I(1) or mutually cointegrated.
30 For further details, see Section 18.19 and Lesson 16.5 in Pesaran and Pesaran (1997).
31 CSW do not report standard errors for the levels estimates of the Treasury earnings equation.
32 We are grateful to a referee for drawing our attention to this point.
33 Clearly, it is possible to simplify the model further, but this would go beyond the remit of this section which is first to
test for the existence of a level relationship using an unrestricted ARDL specification and, second, if we are satisfied that
such a levels relationship exists, to select a parsimonious specification.
34 The standard errors of the estimates reported in Table III allow for the uncertainty associated with the estimation of the
levels coefficients. This is important in the present application where it is not known with certainty whether the regressors
are purely I(0), purely I(1) or mutually cointegrated. It is only in the case when it is known for certain that all regressors
are I(1) that it would be reasonable in large samples to treat these estimates as known because of their super-consistency.
35 The equilibrium correction coefficient in the Treasury’s earnings equation is estimated to be $-0.1848$ (0.0528), which
is smaller in absolute value than our estimate; see p. 11 in Annex of CSW. This seems to be because of the shorter lag lengths used in the
Treasury’s specification rather than the shorter time period 1971q1–1994q3. Note also that the t-ratio reported for this
coefficient does not have the standard t-distribution; see Theorem 3.2.
36 The complex roots are $0.34293 \pm 0.67703i$ and $0.17307 \pm 0.61386i$, where $i = \sqrt{-1}$.
314 M. H. PESARAN, Y. SHIN AND R. J. SMITH
the 0.05 level which may be linked to the presence of some non-linear effects or asymmetries in
the adjustment of the real wage process that our linear specification is incapable of taking into
account.37 Recursive estimation of the conditional ECM and the associated cumulative sum and
cumulative sum of squares plots also suggest that the regression coefficients are generally stable
over the sample period. However, these tests are known to have low power and, thus, may have
missed important breaks. Overall, the conditional ECM earnings equation presented in Table III
has a number of desirable features and provides a sound basis for further research.
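The link between the reported complex roots and their moduli can be verified with a short calculation. The sketch below uses the root values as printed in the footnote; the implied cycle length in quarters is an additional illustrative quantity, computed from the argument of each root, and is not reported in the paper.

```python
import cmath
import math

# One member of each complex-conjugate pair, as printed in the footnote.
roots = [complex(0.34293, 0.67703), complex(0.17307, 0.61386)]

moduli = [abs(z) for z in roots]

# A root with argument theta generates oscillations of period 2*pi/theta
# observations (quarters here).
periods = [2 * math.pi / abs(cmath.phase(z)) for z in roots]

for z, m, p in zip(roots, moduli, periods):
    print(f"root {z.real:.5f} +/- {z.imag:.5f}i: modulus {m:.4f}, period {p:.1f} quarters")

# Every modulus below one implies damped cycles, i.e. convergence to equilibrium.
all_stable = all(m < 1 for m in moduli)
print(all_stable)
```

The moduli come out close to the reported 0.7589 and 0.6381 (small differences in the last decimals can arise from rounding of the printed roots), and since all are below one the cycles die out, which matches the description of a cyclical but convergent real wage process.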
$\bar{R}^2 = 0.5589$, $\hat{\sigma} = 0.0083$, AIC $= 339.57$, SBC $= 302.55$,
$\chi^2_{SC}(4) = 8.74\,[0.068]$, $\chi^2_{FF}(1) = 4.86\,[0.027]$,
$\chi^2_{N}(2) = 0.01\,[0.993]$, $\chi^2_{H}(1) = 0.66\,[0.415]$.
37 The conditional ECM regression in Table III also passes the test against residual serial correlation but, as the model
was specified to deal with this problem, it should not therefore be given any extra credit!
6. CONCLUSIONS
Empirical analysis of level relationships has been an integral part of time series econometrics
and pre-dates the recent literature on unit roots and cointegration.38 However, the emphasis of this
earlier literature was on the estimation of level relationships rather than testing for their presence (or
otherwise). Cointegration analysis attempts to fill this vacuum, but, typically, under the relatively
restrictive assumption that the regressors, $x_t$, entering the determination of the dependent variable of
interest, $y_t$, are all integrated of order 1 or more. This paper demonstrates that the problem of testing
for the existence of a level relationship between $y_t$ and $x_t$ is non-standard even if all the regressors
under consideration are I(0) because, under the null hypothesis of no level relationship between $y_t$
and $x_t$, the process describing $y_t$ is I(1), irrespective of whether the regressors $x_t$ are
purely I(0), purely I(1) or mutually cointegrated. The asymptotic theory developed in this paper
provides a simple univariate framework for testing the existence of a single level relationship
between $y_t$ and $x_t$ when it is not known with certainty whether the regressors are purely I(0),
purely I(1) or mutually cointegrated.39 Moreover, it is unnecessary that the order of integration
of the underlying regressors be ascertained prior to testing the existence of a level relationship
between $y_t$ and $x_t$. Therefore, unlike typical applications of cointegration analysis, this method is
not subject to this particular kind of pre-testing problem. The application of the proposed bounds
testing procedure to the UK earnings equation highlights this point, where one need not take an a
priori position as to whether, for example, the rate of unemployment or the union density variable
is I(1) or I(0).
The analysis of this paper is based on a single-equation approach. Consequently, it is inappropriate
in situations where there may be more than one level relationship involving $y_t$. An extension of
this paper and those of HJNR and PSS to deal with such cases is part of our current research, but
the consequent theoretical developments will require the computation of further tables of critical
values.
38 For an excellent review of this early literature, see Hendry et al. (1984).
39 Of course, the system approach developed by Johansen (1991, 1995) can also be applied to a set of variables containing
possibly a mixture of I(0) and I(1) regressors.
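where $(\beta_y^{\perp}, \beta^{\perp})$ is a $(k+1, k-r+1)$ matrix whose columns are a basis for the orthogonal
complement of $\beta$. Hence, $(\beta, \beta_y^{\perp}, \beta^{\perp})$ is a basis for $\mathbb{R}^{k+1}$. Let $\xi$ be the $(k+2)$-unit vector $(1, 0')'$.
Then, $(\beta_*, \xi, \delta)$ is a basis for $\mathbb{R}^{k+2}$. It therefore follows that
$$T^{-1/2}\delta' z^*_{[Ta]} \Rightarrow (\beta_y^{\perp}, \beta^{\perp})' C\, B_{k+1}(a),$$
where $z_t^* = (t, z_t')'$, $B_{k+1}(a)$ is a $(k+1)$-vector Brownian motion with variance matrix $\Omega$ and $[Ta]$
denotes the integer part of $Ta$, $a \in [0, 1]$; see Phillips and Solo (1992, Theorem 3.15, p. 983). Also,
$T^{-1}\xi' z_t^* = T^{-1}t \Rightarrow a$. Similarly, noting that $\beta' C = 0$, we have that $\beta_*' z_t^* = \beta'\mu + \beta' C^*(L)\varepsilon_t =
O_P(1)$. Hence, from Phillips and Solo (1992, Theorem 3.16, p. 983), defining $\tilde{Z}^*_{-1} \equiv \bar{P}_{\iota} Z^*_{-1}$ and
$\Delta\tilde{Z}_- \equiv \bar{P}_{\iota}\Delta Z_-$, it follows that
$$T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}\beta_* = O_P(1), \quad T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\Delta\tilde{Z}_- = O_P(1), \quad T^{-1}\Delta\tilde{Z}_-'\Delta\tilde{Z}_- = O_P(1),$$
$$T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}\beta_* = O_P(1), \quad T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\Delta\tilde{Z}_- = O_P(1), \qquad (A2)$$
where $B_T \equiv (\delta, T^{-1/2}\xi)$. Similarly, defining $\tilde{u} \equiv \bar{P}_{\iota} u$,
$$T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1}\tilde{u} = O_P(1), \quad T^{-1/2}\Delta\tilde{Z}_-'\tilde{u} = O_P(1). \qquad (A3)$$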
Cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146).
The next result follows from Phillips and Solo (1992, Theorem 3.15, p. 983); cf. Johansen
(1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146) and Phillips and Durlauf
(1986).
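Lemma A.1 Let $B_T \equiv (\delta, T^{-1/2}\xi)$ and define $G(a) = (G_1(a)', G_2(a))'$, where $G_1(a) \equiv (\beta_y^{\perp}, \beta^{\perp})' C \tilde{B}_{k+1}(a)$,
$\tilde{B}_{k+1}(a)\,[= (\tilde{B}_1(a), \tilde{B}_k(a)')'] = B_{k+1}(a) - \int_0^1 B_{k+1}(a)\,da$, and $G_2(a) \equiv a - \tfrac{1}{2}$, $a \in [0, 1]$.
Then
$$T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}B_T \Rightarrow \int_0^1 G(a)G(a)'\,da, \qquad T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u} \Rightarrow \int_0^1 G(a)\,d\tilde{B}_u^*(a),$$
where $\tilde{B}_u^*(a) \equiv \tilde{B}_1(a) - \omega'\tilde{B}_k(a)$ and $\tilde{B}_{k+1}(a) = (\tilde{B}_1(a), \tilde{B}_k(a)')'$, $a \in [0, 1]$.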
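Proof of Theorem 3.1 Under $H_0$ of (17), the Wald statistic $W$ of (21) can be written as
$$\hat{\omega}_{uu} W = \tilde{u}'\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\left(\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\right)^{-1}\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u}$$
$$= \tilde{u}'\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\left(A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\right)^{-1}A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u},$$
where $A_T \equiv (T^{-1/2}\beta_*, T^{-1}B_T)$. Consider the matrix $A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T$. It follows from (A2)
and Lemma A.1 that
$$A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T = \begin{pmatrix} T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_* & 0' \\ 0 & T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^*_{-1}B_T \end{pmatrix} + o_P(1). \qquad (A4)$$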
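Next, consider $A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u}$. From (A3) and Lemma A.1,
$$A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u} = \begin{pmatrix} T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u} \\ T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u} \end{pmatrix} + o_P(1). \qquad (A5)$$
Finally, the estimator for the error variance $\omega_{uu}$ (defined in the line after (21)),
$$\hat{\omega}_{uu} = (T-m)^{-1}\left[\tilde{u}'\tilde{u} - \tilde{u}'\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\left(A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}A_T\right)^{-1}A_T'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u}\right]$$
$$= (T-m)^{-1}\tilde{u}'\tilde{u} + o_P(1) = \omega_{uu} + o_P(1). \qquad (A6)$$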
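We consider each of the terms in the representation (A7) in turn. A central limit theorem allows us
to state
$$\left(T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\right)^{-1/2} T^{-1/2}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u}/\omega_{uu}^{1/2} \Rightarrow z_r \sim N(0, I_r).$$
Hence, the first term in (A7) converges in distribution to $z_r' z_r$, a chi-square random variable with
$r$ degrees of freedom; that is,
$$T^{-1}\tilde{u}'\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\left(T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\right)^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{u}/\omega_{uu} \Rightarrow z_r' z_r \sim \chi^2(r). \qquad (A8)$$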
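which, as $C = (\beta_y^{\perp}, \beta^{\perp})[(\alpha_y^{\perp}, \alpha^{\perp})'\Gamma(\beta_y^{\perp}, \beta^{\perp})]^{-1}(\alpha_y^{\perp}, \alpha^{\perp})'$, may be expressed as
$$\int_0^1 d\tilde{B}_u^*(a)\begin{pmatrix}(\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}'\left(\int_0^1 \begin{pmatrix}(\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}\begin{pmatrix}(\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}' da\right)^{-1}$$
$$\times \int_0^1 \begin{pmatrix}(\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix} d\tilde{B}_u^*(a)/\omega_{uu}.$$
Now, noting that under $H_0$ of (17) we may express $\alpha_y^{\perp} = (1, -\omega')'$ and $\alpha^{\perp} = (0, \alpha_{xx}^{\perp\prime})'$ where
$\alpha_{xx}^{\perp\prime}\alpha_{xx} = 0$, we define the $(k-r+1)$-vector of independent de-meaned standard Brownian
motions,
$$\tilde{W}_{k-r+1}(a)\,[= (\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'] \equiv [(\alpha_y^{\perp}, \alpha^{\perp})'\Omega(\alpha_y^{\perp}, \alpha^{\perp})]^{-1/2}(\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a)$$
$$= \begin{pmatrix}\omega_{uu}^{-1/2}\tilde{B}_u^*(a) \\ (\alpha_{xx}^{\perp\prime}\Omega_{xx}\alpha_{xx}^{\perp})^{-1/2}\alpha_{xx}^{\perp\prime}\tilde{B}_k(a)\end{pmatrix},$$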
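where $\tilde{B}_u^*(a) = \tilde{B}_1(a) - \omega'\tilde{B}_k(a)$ is independent of $\tilde{B}_k(a)$ and $\tilde{B}_{k+1}(a) \equiv (\tilde{B}_1(a), \tilde{B}_k(a)')'$ is
partitioned according to $z_t = (y_t, x_t')'$, $a \in [0, 1]$. Hence, the second term in (A7) has the following
asymptotic representation:
$$\int_0^1 d\tilde{W}_u(a)\begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}'\left(\int_0^1 \begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}\begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix}' da\right)^{-1}$$
$$\times \int_0^1 \begin{pmatrix}\tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2}\end{pmatrix} d\tilde{W}_u(a). \qquad (A9)$$
Note that $d\tilde{W}_u(a)$ in (A9) may be replaced by $dW_u(a)$, $a \in [0, 1]$. Combining (A8) and (A9) gives
the result of Theorem 3.1.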
For the remaining cases, we need only make minor modifications to the proof for Case IV.
In Case I, d D b? ? ?
y , b with b, by , b
?
a basis for RkC1 and BT D d. For Case II, where
Ł 0 0
Z1 D iT , Z1 , we have
m0
bŁ D b
IkC1
and, consequently, we define x as in Case IV,
m0
dD b? ?
y , b and BT D d, x.
IkC1
Case III is similar to Case I as is Case V.
Proof of Corollary 3.1 Follows immediately from Theorem 3.1 by setting r D k.
Proof of Corollary 3.2 Follows immediately from Theorem 3.1 by setting r D 0.
Proof of Theorem 3.2 We provide a proof for Case V which may be simply adapted for Cases I
and III. To emphasize the potential dependence of the limit distribution on nuisance parameters,
the proof is initially conducted under Assumptions 1-4 together with Assumption 5a which implies
! p
H0 yy : !yy D 0 but not necessarily H0 yx.x : pyx.x D 00 ; in particular, note that we may write a? y D
!
1, f0 0 for some k-vector f. The t-statistic for H0 yy : !yy D 0 may be expressed as the square
root of 1
0P
y 0 0
A0T Ẑ01 P
Z ,X̂1
Ẑ 1 A T A Ẑ P
T 1 Z Ẑ 1 A T Z ,X̂1 y/ωO uu A10
Under the conditions of the theorem, f D w and l2xy D 0 and, therefore, BO u2 a[D BO uŁ a] D
0 2
1/2 O
ωuu Wu a and a? ?0 ?0 ? 1/2
xx B̂k a[D axx B̂k a] D axx Zxx axx Ŵkr a, a 2 [0, 1].
Proof of Corollary 3.3 Follows immediately from Theorem 3.2 by setting r D k.
Proof of Corollary 3.4 Follows immediately from Theorem 3.2 by setting r D 0.
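Proof of Theorem 4.1 Again, we consider Case IV; the remaining Cases I–III and V may be
dealt with similarly. Under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, Assumption 5b holds and, thus, $\Pi = \alpha_y\beta_y' + \alpha\beta'$ where
$\alpha_y = (\alpha_{yy}, 0')'$ and $\beta_y = (\beta_{yy}, \beta_{yx}')'$; see above Assumption 5b. Under Assumptions 1–4 and 5b,
the process $\{z_t\}_{t=1}^{\infty}$ has the infinite moving-average representation, $z_t = \mu + \gamma t + Cs_t + C^*(L)\varepsilon_t$,
where now $C \equiv \beta^{\perp}[\alpha^{\perp\prime}\Gamma\beta^{\perp}]^{-1}\alpha^{\perp\prime}$. We redefine $\beta_*$ and $\delta$ as the $(k+2, r+1)$ and $(k+2, k-r)$
matrices,
$$\beta_* \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}(\beta_y, \beta)$$
and
$$\delta \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\beta^{\perp},$$
where $\beta^{\perp}$ is a $(k+1, k-r)$ matrix whose columns are a basis for the orthogonal complement of
$(\beta_y, \beta)$. Hence, $(\beta_y, \beta, \beta^{\perp})$ is a basis for $\mathbb{R}^{k+1}$ and, thus, $(\beta_*, \xi, \delta)$ a basis for $\mathbb{R}^{k+2}$, where again
$\xi$ is the $(k+2)$-unit vector $(1, 0')'$. It therefore follows that
$$T^{-1/2}\delta' z^*_{[Ta]} = T^{-1/2}\beta^{\perp\prime}\mu + T^{-1/2}\beta^{\perp\prime}Cs_{[Ta]} + \beta^{\perp\prime}T^{-1/2}C^*(L)\varepsilon_{[Ta]} \Rightarrow \beta^{\perp\prime}CB_{k+1}(a).$$
Also, as above, $T^{-1}\xi' z_t^* = T^{-1}t \Rightarrow a$ and $\beta_*' z_t^* = (\beta_y, \beta)'\mu + (\beta_y, \beta)'C^*(L)\varepsilon_t = O_P(1)$.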
The Wald statistic (21) multiplied by ωO uu may be written as
1
ũ P 0 Ł0 0
Ł Ł 0 Ł0 0 Ł0
Z Z̃1 AT AT Z̃1 P
Z Z̃1 AT A0T Z̃Ł1 P
Z ũ C 2lŁ Z̃1 P
Ł
Z Z̃1 lŁ ,
Z ũ C lŁ Z̃1 P
B1
where lŁ bŁ ay , a0 1, w0 0 , AT T1/2 bŁ , T1/2 BT and BT d, T1/2 x. Note that (A6)
!
continues to hold under H1 yy : !yy 6D 0. A similar argument to that in the Proof of Theorem 3.1
demonstrates that the first term in (B1) divided by ωuu has the limiting representation
1
1 1
1
z0rC1 zrC1 C dWu aFkr a0 Fkr aFkr a0 da Fkr adWu a B2
0 0 0
where zrC1 ¾ N0, IrC1 , Fkr a D W̃kr a0 , a 12 0 and W̃kr a a? 0 ? 1/2 ? 0
xx Zxx axx axx B̃k a
is a k r-vector of de-meaned independent standard Brownian 1 motions independent of the
standard Brownian motion Wu a, a 2 [0, 1]; cf. (22). Now, 0 Fkr adWu a is mixed normal
1
with conditional variance matrix 0 Fkr aFkr a0 da. Therefore, the second term in (B2) is
unconditionally distributed as a / 2 k r random variable and is independent of the first term; cf.
(A4). Hence, the first term in (B1) divided by ωuu has a limiting / 2 k C 1 distribution.
The second term in (B1) may be written as
0
1/2 1/2 0 Ł0
21, w0 ay , ab0Ł Z̃Ł1 P
Z ũ D 2T 1, w0
ay , a T b Z̃ P
Ł 1 Z ũ D OP T1/2 , B3
0
as T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ converges in probability to a positive definite matrix. Moreover, as
!
1, w0 ay , a 6D 00 under H1 yy : !yy 6D 0, the Theorem is proved.
Proof of Theorem 4.2 A similar decomposition to (B1) for the Wald statistic (21) holds under
! !
H1 yx.x \ H0 yy except that bŁ and d are now as defined in the Proof of Theorem 3.1. Although
!yy !
H0 : !yy D 0 holds, we have H1 yx.x : pyx.x 6D 00 . Therefore, as in Theorem 3.2, note that we may
write a? 0 0
y D 1, f for some k-vector f 6D w. Consequently, the first term divided by ωuu may be
written as
1
1 0 Ł0 0
T1 ũ0 P Z̃
Z 1
Ł
b Ł T b Z̃
Ł 1 P Z̃
Z 1
Ł
b Ł b0Ł Z̃Ł1 P
Z ũ/ωuu
0
1 0
C T2 ũ0 Z̃Ł1 BT T2 B0T Z̃Ł1 Z̃Ł1 BT B0T Z̃Ł1 ũ/ωuu C oP 1 B5
cf. (A7). As in the Proof of Theorem 3.1, the first term of (B5) has the limiting representation z0r zr
where zr ¾ N0, Ir ; cf. (22). The second term of (B5) has the limiting representation
Q2 1
1 Bu a 0
1 BQ u2 a Q2
Bu a 0
dBQ uŁ a a? 0
xx B̃k a
a? 0
xx B̃k a a? 0
xx B̃k a da
1 1
0 a 2 0 a 2 a 12
1 BQ u2 a
ð a? 0
xx B̃k a dBQ uŁ a/ωuu D OP 1
0 1
a 2
where BQ uf a BQ 1 a f0 B̃k a, a 2 [0, 1]; cf. Proof of Theorem 3.2. The second term of (B1)
becomes
0
1/2 1/2 0 Ł0
21, w0 ab0Ł Z̃Ł1 PZ ũ D 2T 1, w0
a T b Z̃ P
Ł 1 Z ũ D OP T1/2
0
ð T1 b0Ł Z̃Ł1 PZ 1Z̃ Ł
b Ł a0 1, w0 0 D OP T
! p
The Theorem follows as 1, w0 a 6D 00 under H0 yy : !yy D 0 and H1 yx.x : pyx.x 6D 00 .
Proof of Theorem 4.3 We concentrate on Case IV; the remaining Cases I–III and V are
proved by a similar argument. Let fztT gTtD1 denote the process under H1T of (26). Hence,
8LztT m gt D xtT , where xtT 5T 5[zt1T m gt 1] C et and 5T 5 is
given in (27). Therefore, ztT ) gt D CxtT C CŁ LxtT , Cz D C C 1 zCŁ z and
?
C D b? ? ? ? 0 ? 1 ? ? 0
y , b [ay , a 0(by , b )] ay , a , and thus,
[IkC1 IkC1 C T1 Cay b0y L]ztT m gt D CetT C CŁ LxtT B6
where
dyx
etT T1/2 b0 [zt1T m gt 1] C et , t D 1, . . . , T, T D 1, 2, . . .
dxx
Note that xtT D 5T 5[zt1T m gt 1] C et . It therefore follows that T1/2 d0 zŁ[Ta]T
a
) b? ? 0 Ł 0 0
y , b CJkC1 a, where d is defined above Lemma A.1 and ztT D t, ztT , JkC1 a 0 exp
0
fay by Ca rgdBkC1 r is an Ornstein-Uhlenbeck process and BkC1 a is a k C 1-vector Brow-
nian motion with variance matrix Z, a 2 [0, 1]; cf. Johansen (1995, Theorem 14.1, p. 202).
Similarly to (A4),
1 0 Ł0 Ł
T bŁ Z̃1 PZ Z̃1 bŁ 00
A0T Z̃01 P Z̃ A
Z 1 T D 0 C oP 1
0 T2 B0T Z̃Ł1 Z̃Ł1 BT
Therefore, expression (B1) for the Wald statistic (21) multiplied by ωO uu is revised to
1
% 0 yP
ωO uu W D T1 Ł
T 1 0 Ł0 Ł 0
b0Ł Z̃Ł1 P
Z
Z̃ 1 b Ł b Ł Z̃ 1 P Z
Z̃ 1 b Ł Z y
1
C T2 % 0 yP Ł
T 2 0 Ł0 Ł 0
B0T Z̃Ł1 P
Z̃
Z 1 T
B B Z̃ Z̃
T 1 1 T B Z y C oP 1 B7
where pŁyT T1 ˛yy b0yŁ C T1/2 dyx w0 dxx b0Ł . Defining h dyx w0 dxx 0 , consider
0 0 0
T1/2 b0Ł Z̃Ł1 P Ł Ł
Z Z̃1 pyT D T
1/2 0 Ł Ł 1
Z Z̃1 byŁ ˛yy T C bŁ hT
bŁ Z̃1 P 1/2
0
D T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ h C oP 1 B9
where we have made use of T1/2 b0yŁ zŁ[Ta]T ) b0y CJkC1 a. Therefore, (B8) divided by ωuu may be
re-expressed as
0
0
1/2 0 Ł0
T1/2 b0Ł Z̃Ł1 P
Z ũ C Qh Q 1
T b Z̃
Ł 1 P Z ũ C Qh /ωuu C oP 1 D z0r zr C oP 1
B9
1 0 Ł0 Ł 1/2
where Q p limT!1 T bŁ Z̃1 P Z Z̃1 bŁ and zr ¾ NQ h, Ir .
As Ł Ł0 0
T1 B0T Z̃Ł1 P 1 0 Ł0 Ł Ł0
P Z y D P Z Z̃1 pyT C ũ, Z0 y D T B0 T Z̃1 PZ Z̃1 pyT C ũ.
Consider the second term in (B7), in particular, T1 B0T Z̃Ł1 P Ł Ł
Z Z̃1 pyT which after substitution
Ł
for pyT becomes
0 0 0
T2 B0T Z̃Ł1 P Ł
Z Z̃1 byŁ ˛yy C T
3/2 0 Ł
BT Z̃1 P Ł 2 0 Ł
Z Z̃1 bŁ h D T BT Z̃1 P
Ł
Z Z̃1 byŁ ˛yy C oP 1
1 ? ? 0
by , b CJ̃kC1 a
) 1 J̃kC1 a0 C0 by ˛yy da
0 a 2
Therefore,
1
0
b? ? 0
y , b CJ̃kC1 a 1/2 Q
T1 B0T Z̃Ł1 P
Z y ) ωuu dWu a C J̃kC1 a0 C0 by ˛yy da
0 a 12
Consider
by ; cf. Johansen (1995, Theorem 14.4, p. 207). Note that the first element of J̃ŁkrC1 a satisfies
QJŁu a D WQ u a C ωuu 0 a Ł
1/2
˛yy b 0 J̃krC1 r dr and dJQ Łu a D dWQ u a C ωuu
1/2
˛yy b0 JQ ŁkrC1 a da.
Therefore,
1
1 0
b? ? 0
y , b CJ̃kC1 a 1/2 Q Ł
T B0T Z̃Ł1 P
Z Y ) ωuu dJu a
0 a 12
Hence, the second term in (B7) weakly converges to
1
1 1
1
ωuu dJQ Łu aFkrC1 a0 FkrC1 aFkrC1 a da 0
FkrC1 a dJQ Łu a B10
0 0 0
Proof of Theorem 4.4 We consider Case V; the remaining Cases I and III may be dealt with
!
similarly. Under H1 yy : !yy 6D 0, from (10), ŷ1 D X̂1 q C v̂1 , where v̂1 P Z ,X̂1 v1 and
0 0
v1 D 0, v1 , . . . , vT1 . Therefore, ŷ1 P 0 0
Z ,X̂1 y D v̂1 P
Z ,X̂1 Y and ŷ1 P
Z ,X̂1 ŷ1 D
0
Z ,X̂1 v̂1 .
v̂1 P
As in Appendix A,
T1/2 b? 0
xx x[Ta] D T
1/2 ? 0
bxx mx C T1/2 b? 0
xx gx t C T
1/2 ? 0 ?
bxx bxx a?0 0b? 1 a?0 s[Ta]
C 0, b? 0
xx T
1/2 Ł
C Le[Ta]
T1 ŷ01 P
Z ŷ1 D T1 v̂01 P
Z v̂1 T1 v̂01 P
Z ? v̂1 C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1 v̂01 P
Z v̂1 C oP 1
,X̂1 bxx
0 0 1 0 0
where P
Z ,X̂1 bxx P
Z PZ X̂1 bxx bxx X̂1 P
Z X̂1 bxx bxx X̂1 P
Z and P
Z ,X̂1 b?xx
? ?0 0 ? 1 ? 0 0 1 0
Z X̂1 bxx bxx X̂1 P
P
Z X̂1 bxx bxx X̂1 P
Z . Therefore, as T v̂1 v̂1 D OP 1,
T1 ŷ01 P
Z ŷ1 D OP 1 B11
,X̂1
T1/2 v̂01 P
Z û D T1/2 v̂01 P
Z û T1/2 v̂01 P
Z ? û C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1/2 v̂01 P
Z û C oP 1 D OP 1
,X̂1 bxx
T1 v̂01 P
Z Ẑ1 l D T1 v̂01 P
Z Ẑ1 l T1 v̂01 P
Z ? Ẑ1 l C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1 v̂01 P
Z Ẑ1 l C oP 1 D OP 1
,X̂1 bxx
T1/2 v̂01 P
Z Ẑ1 l D OP T1/2 . B12
,X̂1
Because ωO uu ωuu D oP 1, combining (B11) and (B12) yields the desired result.
ACKNOWLEDGEMENTS
We are grateful to the Editor (David Hendry) and three anonymous referees for their helpful
comments on an earlier version of this paper. Our thanks are also owed to Michael Binder, Peter
Burridge, Clive Granger, Brian Henry, Joon-Yong Park, Ron Smith, Rod Whittaker and seminar
participants at the University of Birmingham. Partial financial support from the ESRC (grant Nos
R000233608 and R000237334) and the Isaac Newton Trust of Trinity College, Cambridge, is
gratefully acknowledged. Previous versions of this paper appeared as DAE Working Paper Series,
Nos. 9622 and 9907, University of Cambridge.
REFERENCES
Banerjee A, Dolado J, Galbraith JW, Hendry DF. 1993. Co-Integration, Error Correction, and the Econo-
metric Analysis of Non-Stationary Data. Oxford University Press: Oxford.
Banerjee A, Dolado J, Mestre R. 1998. Error-correction mechanism tests for cointegration in single-equation
framework. Journal of Time Series Analysis 19: 267–283.
Banerjee A, Galbraith JW, Hendry DF, Smith GW. 1986. Exploring equilibrium relationships in economet-
rics through static models: some Monte Carlo Evidence. Oxford Bulletin of Economics and Statistics 48:
253–277.
Blanchard OJ, Summers L. 1986. Hysteresis and the European Unemployment Problem. In NBER Macroe-
conomics Annual 15–78.
Boswijk P. 1992. Cointegration, Identification and Exogeneity: Inference in Structural Error Correction
Models. Tinbergen Institute Research Series.
Boswijk HP. 1994. Testing for an unstable root in conditional and structural error correction models. Journal
of Econometrics 63: 37–70.
Boswijk HP. 1995. Efficient inference on cointegration parameters in structural error correction models.
Journal of Econometrics 69: 133–158.
Cavanagh CL, Elliott G, Stock JH. 1995. Inference in models with nearly integrated regressors. Econometric
Theory 11: 1131–1147.
Chan A, Savage D, Whittaker R. 1995. The new treasury model. Government Economic Series Working
Paper No. 128, (Treasury Working Paper No. 70).
Darby J, Wren-Lewis S. 1993. Is there a cointegrating vector for UK wages? Journal of Economic Studies
20: 87–115.
Dickey DA, Fuller WA. 1979. Distribution of the estimators for autoregressive time series with a unit root.
Journal of the American Statistical Association 74: 427–431.
Dickey DA, Fuller WA. 1981. Likelihood ratio statistics for autoregressive time series with a unit root.
Econometrica 49: 1057–1072.
Engle RF, Granger CWJ. 1987. Cointegration and error correction representation: estimation and testing.
Econometrica 55: 251–276.
Granger CWJ, Lin J-L. 1995. Causality in the long run. Econometric Theory 11: 530–536.
Hansen BE. 1995. Rethinking the univariate approach to unit root testing: using covariates to increase power.
Econometric Theory 11: 1148–1171.
Harbo I, Johansen S, Nielsen B, Rahbek A. 1998. Asymptotic inference on cointegrating rank in partial
systems. Journal of Business Economics and Statistics 16: 388–399.
Hendry DF, Pagan AR, Sargan JD. 1984. Dynamic specification. In Handbook of Econometrics (Vol. II),
Griliches Z, Intriligator MD (eds). Elsevier: Amsterdam.
Johansen S. 1991. Estimation and hypothesis testing of cointegrating vectors in Gaussian vector autoregres-
sive models. Econometrica 59: 1551–1580.
Johansen S. 1992. Cointegration in partial systems and the efficiency of single-equation analysis. Journal of
Econometrics 52: 389–402.
Johansen S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford Uni-
versity Press: Oxford.
Kremers JJM, Ericsson NR, Dolado JJ. 1992. The power of cointegration tests. Oxford Bulletin of Economics
and Statistics 54: 325–348.
Layard R, Nickell S, Jackman R. 1991. Unemployment: Macroeconomic Performance and the Labour
Market. Oxford University Press: Oxford.
Lindbeck A, Snower D. 1989. The Insider Outsider Theory of Employment and Unemployment, MIT Press:
Cambridge, MA.
Manning A. 1993. Wage bargaining and the Phillips curve: the identification and specification of aggregate
wage equations. Economic Journal 103: 98–118.
Nickell S, Andrews M. 1983. Real wages and employment in Britain. Oxford Economic Papers 35: 183–206.
Nielsen B, Rahbek A. 1998. Similarity issues in cointegration analysis. Preprint No. 7, Department of
Theoretical Statistics, University of Copenhagen.
Park JY. 1990. Testing for unit roots by variable addition. In Advances in Econometrics: Cointegration,
Spurious Regressions and Unit Roots, Fomby TB, Rhodes RF (eds). JAI Press: Greenwich, CT.
Pesaran MH, Pesaran B. 1997. Working with Microfit 4.0: Interactive Econometric Analysis, Oxford Univer-
sity Press: Oxford.
Pesaran MH, Shin Y. 1999. An autoregressive distributed lag modelling approach to cointegration analysis.
Chapter 11 in Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial
Symposium, Strom S (ed.). Cambridge University Press: Cambridge.
Pesaran MH, Shin Y, Smith RJ. 2000. Structural analysis of vector error correction models with exogenous
I(1) variables. Journal of Econometrics 97: 293–343.
Phillips AW. 1958. The relationship between unemployment and the rate of change of money wage rates in
the United Kingdom, 1861–1957. Economica 25: 283–299.
Phillips PCB, Durlauf S. 1986. Multiple time series with integrated variables. Review of Economic Studies
53: 473–496.
Phillips PCB, Ouliaris S. 1990. Asymptotic properties of residual based tests for cointegration. Econometrica
58: 165–193.
Phillips PCB, Solo V. 1992. Asymptotics for linear processes. Annals of Statistics 20: 971–1001.
Rahbek A, Mosconi R. 1999. Cointegration rank inference with stationary regressors in VAR models. The
Econometrics Journal 2: 76–91.
Sargan JD. 1964. Real wages and prices in the U.K. Econometric Analysis of National Economic Planning,
Hart PE, Mills G, Whittaker JK (eds). Macmillan: New York. Reprinted in Hendry DF, Wallis KF (eds)
Econometrics and Quantitative Economics. Basil Blackwell: Oxford; 275–314.
Shin Y. 1994. A residual-based test of the null of cointegration against the alternative of no cointegration.
Econometric Theory 10: 91–115.
Stock J, Watson MW. 1988. Testing for common trends. Journal of the American Statistical Association 83:
1097–1107.
Urbain JP. 1992. On weak exogeneity in error correction models. Oxford Bulletin of Economics and Statistics
52: 187–202.
The Journal of Finance and Data Science xx (2018) 1–19
http://www.keaipublishing.com/en/journals/jfds/
Abstract
Economists face a method selection problem when working with time series data. As time series data may possess specific
properties such as trend and structural break, common methods used to analyze other types of data may not be appropriate for the
analysis of time series data. This paper discusses the properties of time series data, compares common data analysis methods and
presents a methodological framework for time series data analysis. The framework greatly helps in choosing appropriate test
methods. To present an example, Nepal's money–price relationship is examined. Test results obtained following this methodological
framework are found to be more robust and reliable.
© 2018 China Science Publishing & Media Ltd. Production and hosting by Elsevier on behalf of KeAi Communications Co. Ltd.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Time series analysis; Unit root test; Methodological framework; Money–price relationship in Nepal
1. Introduction
Time series data is a sequence of observations of a defined variable at uniform intervals over a period of time, in
successive order. The most common series are at annual, quarterly, monthly, weekly and daily frequencies. Economic time
series data often possess distinctive features such as a clear trend, a high degree of persistence of shocks, higher volatility
over time, and meandering and shared co-movements with other series.1 Researchers need to understand such features
of time series data properly and address them.
In time series analysis, it is important to understand the behavior of variables, their interactions and integrations
over time. If major characteristics of time series data are understood and addressed properly, a simple regression
analysis using such data can also tell us about the pattern of relationships among variables of interest. This paper
attempts to highlight the basic econometric issues related to the time series data and provides a basic methodological
framework for time series analysis. In addition, the paper analyses the relationship between money and price in Nepal
using the methodological framework presented in this paper to provide a practical example.
*
Note: Preliminary version of this paper was published as NRB Working Paper No. 36 (March 2017).
* Corresponding author.
E-mail addresses: minbshrestha@gmail.com (M.B. Shrestha), bhatta.gunaraj@gmail.com (G.R. Bhatta).
Peer review under responsibility of China Science Publishing & Media Ltd.
https://doi.org/10.1016/j.jfds.2017.11.001
2405-9188/© 2018 China Science Publishing & Media Ltd. Production and hosting by Elsevier on behalf of KeAi Communications Co. Ltd. This
is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Please cite this article in press as: Shrestha MB, Bhatta GR, Selecting appropriate methodological framework for time series data analysis,
(2018), https://doi.org/10.1016/j.jfds.2017.11.001
Time series data may be related to their own previous values. The autoregressive (AR) character of
a time series model indicates that the present value of a variable is determined by its past values and some adjustment
factors. Such adjustment factors are estimated from the relation of the current value with past values. If the current value is
based solely on the immediately preceding value, the process is termed first order autoregressive, AR(1); if it is based on
the two preceding values, second order autoregressive, AR(2); and so on.
A univariate linear regression model$^{c}$ can be estimated as:
$$Y_t = \mu + \rho Y_{t-1} + \varepsilon_t \qquad (1)$$
where $Y_t$ is the dependent variable $Y$ at period $t$, $\mu$ is a constant parameter, and $\varepsilon_t$ is the error, that is, the unexplained
gap between the actual data and the line fitted by the regression equation. $Y_{t-1}$ is the first lagged value of $Y$, and $\rho$ is the coefficient of $Y_{t-1}$.
Eq. (1) says that the value of $Y_t$ equals the constant $\mu$ plus $\rho$ times its previous value and an unknown component $\varepsilon_t$.
The model to be estimated in Eq. (1) is an AR(1) process.
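To make Eq. (1) concrete, the following sketch simulates an AR(1) series and recovers the constant and the autoregressive coefficient by ordinary least squares. It is an illustration under stated assumptions: the parameter values (1.0 and 0.6), the sample size, the seed and the function names are all hypothetical choices, and the errors are drawn as standard normal.

```python
import random

def simulate_ar1(mu: float, rho: float, n: int, seed: int = 0) -> list:
    """Generate n observations of Y_t = mu + rho * Y_{t-1} + eps_t, with |rho| < 1."""
    rng = random.Random(seed)
    y = [mu / (1.0 - rho)]  # start at the long-run mean of the process
    for _ in range(n - 1):
        y.append(mu + rho * y[-1] + rng.gauss(0.0, 1.0))
    return y

def fit_ar1(y: list) -> tuple:
    """OLS of Y_t on a constant and Y_{t-1}; returns (mu_hat, rho_hat)."""
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    rho_hat = sxy / sxx
    mu_hat = mean_cur - rho_hat * mean_lag
    return mu_hat, rho_hat

y = simulate_ar1(mu=1.0, rho=0.6, n=2000, seed=42)
mu_hat, rho_hat = fit_ar1(y)
print(f"mu_hat = {mu_hat:.3f}, rho_hat = {rho_hat:.3f}")
```

With a long enough sample the estimates should lie near the values used in the simulation, which is what estimating the model in Eq. (1) amounts to in practice.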
Similarly,
A time series is called stationary if its value tends to revert to its long-run average value and the properties of the
series are not affected by a change in time alone (Fig. 1).$^{e}$ In contrast, a non-stationary time series does not tend
to return to its long-run average value; hence, its mean, variance and covariance also change over time (Fig. 2).
Most of the macroeconomic variables such as volume of gross domestic product (GDP), consumption, consumer
price index, etc. exhibit a strong upward or downward movement over time with no tendency to revert to a fixed mean.
c For details, see Stigler (1981).^2
d Error terms are the unobserved factors of a regression that may affect the dependent variable. They are the residuals between the actual and fitted values of a regression and are represented by ε or u. Wooldridge (2002) mentions that “dealing with this error term is the most important component of any econometric analysis”.
e For details, see Verbeek (2017),^3 Chapter 8.
Hence, they are non-stationary series. A non-stationary time series of this kind is said to have a unit root. Therefore, in econometrics, the stationarity of a time series is examined by conducting a unit root test.
Mathematically, the series $Y_t$ is stationary if:
where,
$E(Y_t)$ = expected value of Y at period t
Var = variance, the variation or spread of $Y_t$ around $E(Y_t)$
Cov = covariance, the joint variation of $Y_t$ and $Y_{t-s}$
$Y_{t-s}$ = value of Y lagged by s periods
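For reference, the usual textbook statement of weak (covariance) stationarity can be written with the quantities defined above as in the sketch below. The symbols $\mu$, $\sigma^2$ and $\gamma_s$ are notation assumed here for illustration and may differ from the authors' own display.

```latex
\begin{aligned}
E(Y_t) &= \mu && \text{(constant mean, the same for every } t\text{)} \\
\operatorname{Var}(Y_t) &= E\big[(Y_t - \mu)^2\big] = \sigma^2 && \text{(constant, finite variance)} \\
\operatorname{Cov}(Y_t, Y_{t-s}) &= E\big[(Y_t - \mu)(Y_{t-s} - \mu)\big] = \gamma_s && \text{(depends only on the lag } s\text{, not on } t\text{)}
\end{aligned}
```

A series that violates any of the three conditions, for example one whose variance grows with t, is non-stationary in this sense.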
A trend is a sustained upward or downward movement in time series data over the long run (Fig. 3). A cycle is a short-run fluctuation that recurs at a given interval, such as monthly, quarterly or annually (Fig. 4). Trends are always non-stationary, whereas cycles may be either stationary or non-stationary.
Seasonality is a recurring pattern in high-frequency data such as quarterly, monthly, weekly or daily series. For instance, in Nepal we may observe a high volume of sales in the festive season, more currency coming into circulation during the Dashain festival and increased government spending in the last quarter of the fiscal year.^f
Most of the modeling techniques applied in time series analysis are primarily concerned with the stationarity of the data. The starting point is to examine the properties of the series graphically and then confirm them statistically. Graphs are a preliminary tool that gives a rough idea of the stationarity of the series; statistical tests are required for the final decision. Unit root tests provide statistical evidence on the stationarity of a given series.
f The easiest way of identifying seasonality is with seasonal graphs, which can be drawn using EViews and give a graphical plot of the series for each season. If the series is found to be higher or lower than the average in a particular season, say a month, we can conclude that there is seasonality. If seasonality is detected in the series, it should be addressed while modeling. One possible solution is to generate a seasonally adjusted series using the available methods. In EViews, Census X13, Census X12, X11 (Historical), Tramo/Seats and Moving Average Methods are available. These methods generate an additional, seasonally adjusted variable from the original unadjusted series.
Fig. 3. Trend.
Fig. 4. Cycle. (Plot omitted; horizontal axis: quarters 1981Q1 to 1988Q1; vertical axis: 0 to 60.)
The statistical procedure used to determine the stationarity of a series is called a ‘unit root test’. The following section discusses the widely used test methods, namely the Augmented Dickey-Fuller (ADF), Phillips-Perron (PP) and KPSS tests.
where,
$\delta = \alpha - 1$
$\alpha$ = coefficient of $y_{t-1}$
$\Delta y_t$ = first difference of $y_t$, i.e. $y_t - y_{t-1}$
The null hypothesis of the ADF test is $\delta = 0$ against the alternative hypothesis of $\delta < 0$. If we do not reject the null, the series is non-stationary, whereas rejection means the series is stationary.
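The hypothesis test just described can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses the simplest Dickey-Fuller regression with a constant and no lagged differences, the function name `dickey_fuller_t` is hypothetical, and the two simulated series are made up for the example.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-ratio on delta in: dy_t = c + delta * y_{t-1} + e_t (no lagged differences)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                                    # y_t - y_{t-1}
    X = np.column_stack([np.ones(len(dy)), y[:-1]])    # constant and y_{t-1}
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    sigma2 = resid @ resid / (len(dy) - X.shape[1])    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)              # OLS covariance matrix
    return coef[1] / np.sqrt(cov[1, 1])

# Illustrative data (assumed): a random walk (alpha = 1, delta = 0)
# and a stationary AR(1) with alpha = 0.5 (delta = -0.5).
rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)
random_walk = np.cumsum(shocks)
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + shocks[t]

t_rw = dickey_fuller_t(random_walk)
t_ar = dickey_fuller_t(ar1)
print("random walk:", t_rw, " stationary AR(1):", t_ar)
```

The t-ratio is compared with Dickey-Fuller critical values (about -2.86 at the 5 percent level for a large sample with a constant), not with the usual Student-t table; a value well below the critical value leads to rejection of the unit root null.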
where,
$e_t$ is I(0) with zero mean and $D_{t-i}$ is a deterministic trend component.
The hypothesis is tested for $\pi = 0$. The basic difference between the ADF and PP tests is that PP is a non-parametric test, meaning that it does not need to specify the form of the serial correlation of $\Delta y_t$ under the null hypothesis. Thus, the calculation of the t-ratio for $\pi$ differs. Furthermore, PP corrects the statistics for autocorrelation and heteroskedasticity. The hypothesis testing procedure is similar to that of the ADF test.
Although the ADF test has been reported to be more reliable than the PP test, size distortion and low power make both tests less useful.^4 For larger volumes of financial data, the PP test is also suggested.
In the above model, the hypothesis is tested for $u_t$. The reported critical values of the KPSS test are derived from the Lagrange Multiplier (LM) test statistics.
A structural break is a sudden jump or fall in an economic time series that occurs due to a change in regime, a change in policy direction or an external shock, among others. A structural break may occur in the intercept, the trend or both (Fig. 5).
Structural breaks can create difficulties in unit root testing. As shown by Perron (1989),^5 in the presence of a structural break, conventional unit root tests may indicate that a time series is non-stationary when it is in fact stationary, because these tests make no adjustment for the break.
To address this issue, Perron (1989)^5 developed a unit root test that accommodates a known structural break in the time series. More recently, new methods have been proposed for unit root testing that allow for single and multiple structural breaks at unknown dates.^{6-9,g}
Applying an appropriate methodology is the most crucial part of time series analysis, as a wrongly specified model or a wrong method provides biased and unreliable estimates. The choice of method is based primarily on the unit root test results, which determine the stationarity of the variables. Methods commonly used to analyze stationary time series cannot be used to analyze non-stationary series. If all the variables of interest are stationary, the methodology becomes simple: ordinary least squares (OLS) or vector autoregressive (VAR) models can provide unbiased estimates. If all the variables of interest are non-stationary, OLS or VAR models may not be appropriate to analyze the relationship. An additional problem arises when the variables used in the analysis are of mixed type, i.e., some are stationary and others are non-stationary.
The following is a general methodological framework for time series analysis.
The method selection criteria of Fig. 6 should be treated as the most basic approach, because there are several other considerations in time series models.
Non-stationary variables can be made stationary by taking the first difference. Similarly, non-stationary data with a persistent long-run trend can be made stationary by either i) including a time variable in the regression or ii) extracting trends and cycles from the single series with popular filtering techniques such as the Hodrick-Prescott (HP) filter. Nevertheless, it should be noted that the long-run relationship/information of the variables may be lost when they are modified to make them stationary, such as by differencing, de-trending or filtering.
g See Shrestha and Chowdhury (2005)^10 for a detailed discussion of unit root tests with structural breaks.
Fig. 5. Structural break in time series data, a. Structural break in intercept, b. Structural break in intercept, c. Structural break in trend, d. Structural break in intercept and trend. (Plots omitted; horizontal axes: years 1981 to 2009.)
Fig. 6. Method selection for time series data. OLS: Ordinary least squares; VAR: Vector autoregressive; ARDL: Autoregressive distributed lags; ECM: Error correction models. (Flowchart omitted; surviving node labels: Unit root tests; All Variables; Nonstationary; No cointegration; Cointegration; ECM; Causality test.)
The first step in time series analysis is to conduct a unit root test. If the unit root test results show that all the variables being analyzed are stationary, then the OLS method can be used to determine the relationship between the given variables. A bivariate linear regression model, estimated by ordinary least squares (OLS), can be written as:
$$Y_i = \beta_1 + \beta_2 X_i + e_i \qquad (9)$$
$$e_i = Y_i - \hat{Y}_i \qquad (10)$$
$$\phantom{e_i} = Y_i - \beta_1 - \beta_2 X_i \qquad (11)$$
The above model shows that the residuals ($e_i$) are simply the difference between the actual ($Y_i$) and estimated ($\hat{Y}_i$) values. OLS chooses $\beta_1$ and $\beta_2$ so as to minimize the residual sum of squares.^h
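The minimization described above has a closed-form solution in the bivariate case, which the following sketch computes directly. The function name `ols_bivariate` and the five data points are made up for illustration.

```python
import numpy as np

def ols_bivariate(x, y):
    """Closed-form OLS for Y_i = b1 + b2 * X_i + e_i; returns (b1, b2, residuals)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    b2 = (x_dev @ (y - y.mean())) / (x_dev @ x_dev)   # slope
    b1 = y.mean() - b2 * x.mean()                     # intercept
    resid = y - (b1 + b2 * x)                         # e_i = Y_i - fitted value
    return b1, b2, resid

# Illustrative data (assumed, not from the paper)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.1, 6.9, 9.2, 10.8, 13.0])
b1, b2, e = ols_bivariate(x, y)
print(b1, b2, e.sum())
```

With an intercept in the model, the OLS residuals sum to zero by construction, which the last printed value confirms up to rounding error.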
As mentioned above, a non-stationary time series can be converted into a stationary series by differencing. If a time series becomes stationary after being differenced once, the series is said to be integrated of order one, denoted I(1). Similarly, if a time series has to be differenced twice to make it stationary, it is called integrated of order two, written I(2). As a stationary time series need not be differenced, it is denoted I(0).
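The link between the order of integration and the number of differences can be illustrated mechanically. In this sketch the short `shocks` array is a made-up stand-in for a stationary I(0) series; accumulating it once or twice produces series that need one or two differences to get back to it.

```python
import numpy as np

# Stand-in for a stationary I(0) series (illustrative values)
shocks = np.array([0.5, -1.0, 0.3, 0.8, -0.2, 0.1])

i1 = np.cumsum(shocks)   # accumulated once: one difference recovers the shocks
i2 = np.cumsum(i1)       # accumulated twice: two differences are needed

d1 = np.diff(i1)         # first difference of the once-accumulated series
d2 = np.diff(i2, n=2)    # second difference of the twice-accumulated series

print(d1)   # same values as shocks[1:]
print(d2)   # same values as shocks[2:]
```

Each difference costs one observation, which is why the differenced series are shorter than the original.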
Differencing the non-stationary time series and applying OLS once all the variables are stationary may seem an easy way to analyze the relationship. However, the differences represent only the short-run changes in the time series and miss the long-run information entirely. Hence, this method is not suggested for the analysis of non-stationary variables.
The vector autoregressive (VAR) model allows feedback, or reverse causality, among the variables by using their own past values. A general VAR model requires no exogenous variables, as it treats all the variables as endogenous. The simplest VAR specification,^i for two variables X and Y with only one lag, is given below:
$$Y_t = \delta_1 + \theta_{11} Y_{t-1} + \theta_{12} X_{t-1} + \varepsilon_{1t} \qquad (12)$$
$$X_t = \delta_2 + \theta_{21} Y_{t-1} + \theta_{22} X_{t-1} + \varepsilon_{2t} \qquad (13)$$
where $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are uncorrelated white noise disturbances or error terms.
Choosing an appropriate lag length is important in VAR modeling. The optimal number of lags can be selected using the available lag length selection criteria. The most popular criteria are the Akaike Information Criterion (AIC), the Schwarz Bayesian Criterion (SBC) and the Hannan-Quinn Criterion (HQC).
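A two-variable VAR with one lag can be estimated equation by equation with OLS, since both equations share the same regressors. The sketch below simulates such a system and recovers its coefficients; the intercepts, the coefficient matrix `theta` and the sample size are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
delta = np.array([0.5, -0.2])          # intercepts (assumed values)
theta = np.array([[0.5, 0.2],          # row 1: Y equation, coefficients on Y_{t-1}, X_{t-1}
                  [0.1, 0.4]])         # row 2: X equation, coefficients on Y_{t-1}, X_{t-1}

# Simulate the system; columns of `data` are Y and X
data = np.zeros((n, 2))
for t in range(1, n):
    data[t] = delta + theta @ data[t - 1] + rng.standard_normal(2)

# Regress [Y_t, X_t] on [1, Y_{t-1}, X_{t-1}]; each column of `coef` is one equation
Z = np.column_stack([np.ones(n - 1), data[:-1]])
coef, *_ = np.linalg.lstsq(Z, data[1:], rcond=None)
delta_hat = coef[0]
theta_hat = coef[1:].T                 # transpose so rows are equations, as in `theta`

print(delta_hat)
print(theta_hat)
```

Lag selection with AIC, SBC or HQC would repeat this regression for several lag lengths and compare the criteria; that step is omitted here for brevity.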
Using ordinary least squares or similar methods on non-stationary time series may produce spurious results. In other words, the regression results may show a significant relationship between two variables that are in fact uncorrelated. This type of regression is termed a ‘spurious regression’ and occurs mainly because of the non-stationarity of the time series used in the regression model. On the other hand, two or more variables may form a long-term equilibrium relationship even though they deviate from the equilibrium in the short run. To deal with these issues, Engle and Granger (1987)^13 developed the cointegration test to analyze relationships among non-stationary variables.
If two or more variables are linked to form an equilibrium relationship spanning the long run, these variables are said to be cointegrated. In effect, one variable drags the other along over time and hence both share the same movement. Fig. 7 shows the movement of two cointegrated time series.
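The residual-based idea behind the Engle and Granger approach can be sketched in two steps: estimate the long-run regression by OLS, then check whether its residuals are stationary. The simulated pair below (a random walk `x` and a series `y` built around it) and all parameter values are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

rng = np.random.default_rng(2)
n = 500
x = np.cumsum(rng.standard_normal(n))          # I(1) random walk
y = 1.0 + 2.0 * x + rng.standard_normal(n)     # shares the stochastic trend of x

# Step 1: long-run (cointegrating) regression of y on x
coef, resid = ols(np.column_stack([np.ones(n), x]), y)

# Step 2: Dickey-Fuller regression on the residuals (no constant, no lags)
d_resid = np.diff(resid)
lag = resid[:-1]
delta_hat = (lag @ d_resid) / (lag @ lag)
u = d_resid - delta_hat * lag
se = np.sqrt((u @ u) / (len(u) - 1) / (lag @ lag))
t_stat = delta_hat / se

print("long-run slope:", coef[1], " residual DF t-ratio:", t_stat)
```

A strongly negative t-ratio points to stationary residuals and hence cointegration. Because the residuals are estimated, the statistic has to be compared with critical values tabulated for residual-based cointegration tests rather than the ordinary Dickey-Fuller ones.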
(Fig. 7 plot omitted; series: Remittance and Import; horizontal axis: 2001 to 2015; vertical axis: 0 to 900.)
$$x_t = A_1 x_{t-1} + \varepsilon_t, \qquad (14)$$
so that
$$\Delta x_t = A_1 x_{t-1} - x_{t-1} + \varepsilon_t \qquad (15)$$
$$\phantom{\Delta x_t} = (A_1 - I)\, x_{t-1} + \varepsilon_t = \Pi\, x_{t-1} + \varepsilon_t \qquad (16)$$
where,
$x_t$ and $\varepsilon_t$ are (n × 1) vectors
$A_1$ = an (n × n) matrix of parameters
$I$ = an (n × n) identity matrix
$\Pi = A_1 - I$
We test the rank of the matrix $A_1 - I$. If the rank of $\Pi$ is 0, the sequences are unit root processes. If the rank of $\Pi$ equals k, the number of variables, the series are stationary; if the rank of $\Pi$ is less than k but greater than zero, known as reduced rank, cointegration exists. Hence, with 3 variables in a cointegration test (k = 3), the cointegration rank must be less than three, so there can be at most two cointegrating relations.
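The three cases distinguished by the rank of Π can be checked numerically for a known coefficient matrix. The helper `classify` and the three example matrices are hypothetical; in applied work Π is estimated and its rank is tested statistically rather than computed exactly.

```python
import numpy as np

def classify(A1):
    """Classify a VAR(1) x_t = A1 x_{t-1} + e_t by the rank of Pi = A1 - I."""
    A1 = np.asarray(A1, dtype=float)
    k = A1.shape[0]
    Pi = A1 - np.eye(k)
    r = int(np.linalg.matrix_rank(Pi))
    if r == 0:
        return r, "unit root processes, no cointegration"
    if r == k:
        return r, "stationary"
    return r, "cointegrated (reduced rank)"

# Illustrative coefficient matrices (assumed)
r_unit, label_unit = classify(np.eye(2))                     # Pi = 0
r_coint, label_coint = classify([[0.5, 0.5], [0.5, 0.5]])    # Pi has rank 1
r_stat, label_stat = classify(0.5 * np.eye(2))               # Pi has full rank

print(r_unit, label_unit)
print(r_coint, label_coint)
print(r_stat, label_stat)
```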
If the variables are I(1) and a cointegration relationship exists, then an Error Correction Model (ECM) can be derived. Consider the following bivariate relationship:
$$Y_t = \mu + \beta_1 X_t + \varepsilon_t \qquad (17)$$
Based on the representation theorem of Engle and Granger (1987),^13 we establish a link between cointegration and the Error Correction Model (ECM) by transforming Eq. (17).
The cointegration equation between $Y_t$ and $X_t$ is as follows:
$$\varepsilon_t = Y_t - \mu - \beta_1 X_t \qquad (18)$$
$$\Delta Y_t = \mu_Y + \alpha_Y \varepsilon_{t-1} + \sum_{h=1}^{l} a_{1h}\, \Delta Y_{t-h} + \sum_{h=1}^{l} b_{1h}\, \Delta X_{t-h} + u_{Yt} \qquad (19)$$
$$\Delta X_t = \mu_X + \alpha_X \varepsilon_{t-1} + \sum_{h=1}^{l} a_{2h}\, \Delta Y_{t-h} + \sum_{h=1}^{l} b_{2h}\, \Delta X_{t-h} + u_{Xt} \qquad (20)$$
where $u_{Yt}$ and $u_{Xt}$ are stationary white noise processes for some number of lags l. The model can be extended to the multivariate case in a similar way.
The coefficients in the cointegration equation give the estimated long-run relationship among the variables, and the coefficients in the ECM describe how deviations from that long-run relationship affect changes in the variables in the next period. The parameters $\alpha_Y$ and $\alpha_X$ of Eqs. (19) and (20) measure the speed of adjustment of $Y_t$ and $X_t$, respectively, towards the long-run equilibrium.
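The speed-of-adjustment parameter can be illustrated with a two-step estimate on simulated data. This is a simplified sketch, not the authors' estimation: the data generating process, the true adjustment speed of -0.5 and the long-run slope of 2 are assumptions, and the lagged-difference terms of Eqs. (19) and (20) are left out (l = 0) to keep it short.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = np.cumsum(rng.standard_normal(n))        # I(1) driver
y = np.zeros(n)
for t in range(1, n):
    gap = y[t - 1] - 2.0 * x[t - 1]          # last period's deviation from y = 2x
    y[t] = y[t - 1] - 0.5 * gap + rng.standard_normal()   # half the gap is closed each period

# Step 1: long-run (cointegration) equation, y_t = mu + beta1 * x_t + eps_t
X1 = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
eps = y - X1 @ beta

# Step 2: error correction equation, dy_t = mu_Y + alpha_Y * eps_{t-1} + u_t
dy = np.diff(y)
X2 = np.column_stack([np.ones(n - 1), eps[:-1]])
gamma, *_ = np.linalg.lstsq(X2, dy, rcond=None)
alpha_y = gamma[1]

print("long-run slope:", beta[1], " speed of adjustment:", alpha_y)
```

A negative, significant adjustment coefficient means that a positive deviation from the long-run relation in one period is followed by a downward correction in the next.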
The Johansen cointegration test cannot be applied directly if the variables of interest are of mixed order of integration or if not all of them are non-stationary, as this method requires all the variables to be I(1). An autoregressive distributed lag (ARDL) model is an ordinary least squares (OLS) based model that is applicable both to non-stationary time series and to time series with mixed order of integration.^{16,17} This model takes a sufficient number of lags to capture the data generating process in a general-to-specific modeling framework.
A dynamic error correction model (ECM) can be derived from the ARDL model through a simple linear transformation. The ECM integrates the short-run dynamics with the long-run equilibrium without losing long-run information and avoids problems such as spurious relationships resulting from non-stationary time series data.
To illustrate the ARDL modeling approach, the following simple model can be considered:
$$y_t = \alpha + \beta x_t + \delta z_t + e_t \qquad (21)$$
The first part of the equation, with $\beta$, $\delta$ and $\varepsilon$, represents the short-run dynamics of the model. The second part, with the $\lambda$s, represents the long-run relationship. The null hypothesis in the equation is $\lambda_1 + \lambda_2 + \lambda_3 = 0$, which means non-existence of a long-run relationship.
If two variables Y and X are cointegrated, then any of three relationships may exist: a) X affects Y, b) Y affects X, and c) X and Y affect each other. The first two are unidirectional relationships, while the third is bidirectional. If two variables are not cointegrated, then one does not affect the other and they are independent. To determine the pattern of such relationships, Granger (1969)^18 developed a causality test. If current and lagged values of X improve the prediction of the future value of Y, then X is said to ‘Granger cause’ Y. The simple model of Granger causality is as follows:
$$\Delta Y_t = \sum_{i=1}^{n} \alpha_i\, \Delta Y_{t-i} + \sum_{j=1}^{n} \beta_j\, \Delta X_{t-j} + u_{1t} \qquad (23)$$
$$\Delta X_t = \sum_{i=1}^{n} \lambda_i\, \Delta X_{t-i} + \sum_{j=1}^{n} \delta_j\, \Delta Y_{t-j} + u_{2t} \qquad (24)$$
Eq. (23) shows that the current value of $\Delta Y$ is related to its own past values and to the past values of $\Delta X$. Similarly, Eq. (24) postulates that $\Delta X$ is related to its own past values and to those of $\Delta Y$.
The null hypothesis in Eq. (23) is $\beta_j = 0$, which means “$\Delta X$ does not Granger cause $\Delta Y$”. Similarly, the null hypothesis in Eq. (24) is $\delta_j = 0$, which states “$\Delta Y$ does not Granger cause $\Delta X$.” The rejection or non-rejection of the null hypothesis is based on the F-statistic.
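The F-statistic for Eq. (23) compares a restricted regression (own lags only) with an unrestricted one (own lags plus lags of the other variable). The sketch below follows that logic on simulated differenced series; the helper names `rss` and `granger_f`, the two-lag choice and the data are assumptions for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

def granger_f(cause, effect, n_lags):
    """F-statistic for H0: lags of `cause` do not help predict `effect`."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    T = len(effect)
    y = effect[n_lags:]
    own = np.column_stack([effect[n_lags - i:T - i] for i in range(1, n_lags + 1)])
    other = np.column_stack([cause[n_lags - j:T - j] for j in range(1, n_lags + 1)])
    rss_r = rss(own, y)                           # restricted: own lags only
    rss_u = rss(np.hstack([own, other]), y)       # unrestricted: both sets of lags
    dof = len(y) - 2 * n_lags                     # no intercept, as in Eqs. (23)-(24)
    return ((rss_r - rss_u) / n_lags) / (rss_u / dof)

# Illustrative differenced series (assumed): dx leads dy by one period
rng = np.random.default_rng(4)
dx = rng.standard_normal(500)
dy = np.zeros(500)
dy[1:] = 0.5 * dx[:-1] + rng.standard_normal(499)

f_x_to_y = granger_f(dx, dy, n_lags=2)
f_y_to_x = granger_f(dy, dx, n_lags=2)
print("X -> Y:", f_x_to_y, "  Y -> X:", f_y_to_x)
```

Each statistic is compared with the F distribution with `n_lags` and `dof` degrees of freedom; a large value for one direction and a small one for the other suggests unidirectional causality.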
To make the estimated model robust and unbiased, we need to assess the fit of the model by checking goodness of fit statistics and conducting diagnostic tests.
A rough impression of the robustness of the estimated regression coefficients can be formed by examining how well the regression line explains the data, whether there is serial correlation in the residuals and whether the overall model is
significant, among others. Goodness of fit statistics are displayed together with the estimated coefficients by almost all types of software.
Common measures of goodness of fit include R², which equals the squared correlation in the bivariate case; a value closer to 1 is considered better. In a multivariate regression, adjusted R² is preferred to R²: R² never decreases when a variable is added, while adjusted R² increases only when the new variable improves the predictive power. The Durbin-Watson (DW) statistic indicates whether there is autocorrelation in the residuals. If the value of DW is close to two, the model is considered to be ‘autocorrelation free’.
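The three statistics discussed above follow from simple formulas, sketched here with hypothetical helper names and a tiny made-up data set chosen so the arithmetic can be checked by hand.

```python
import numpy as np

def r_squared(y, y_hat):
    """Share of the variation in y explained by the fitted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_regressors):
    """R-squared penalized for the number of regressors (excluding the constant)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

def durbin_watson(resid):
    """Sum of squared changes in the residuals over the sum of squared residuals."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Illustrative values (assumed): actual and fitted observations
y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.5, 1.5, 3.5, 3.5])

r2 = r_squared(y, y_hat)                       # 1 - 1/5
adj_r2 = adjusted_r_squared(r2, n_obs=4, n_regressors=1)
dw = durbin_watson(y - y_hat)                  # residuals alternate in sign

print(r2, adj_r2, dw)
```

The DW statistic is roughly 2(1 - r), where r is the first-order autocorrelation of the residuals, so values well below two point to positive autocorrelation and values well above two, as in this alternating example, to negative autocorrelation.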
Diagnostic tests tell us about the robustness of the estimated coefficients. Diagnostic test statistics are generally not reported automatically by software and thus should be computed separately. The type of diagnostic test depends on the modeling technique being used. However, the most common types are lag structure, coefficient diagnostics and residual diagnostics. Residual diagnostics are the most crucial part of diagnostic testing in economic modeling, since regression models try to minimize errors (or residuals). The error terms must be white noise (independently and identically distributed, i.i.d.), and residual diagnostics examine whether they are. The Lagrange multiplier (LM) test, the correlogram and heteroskedasticity tests are the major methods for residual diagnostics. Stability diagnostics examine whether the parameters of the estimated model are stable across various sub-samples of the data.
The diagnostic tests are discussed in detail in Annex 1.
Classical and neoclassical economists believe that an over-supply of money leads to an increase in the price level. The well-known quantity theory of money by Fisher (1922)^19 expresses the money-price relationship as follows:
$$MV = PT \qquad (25)$$
where M denotes money supply, V refers to the velocity of money, P is the average price level and T indicates the total volume of transactions of goods and services in an economy. The modern quantity theory of money (QTM) holds that firm-specific cost increases cannot be inflationary as long as they are not related to, or accommodated by, increases in the money supply. The relationship can be expressed as:
$$MV = PY \qquad (26)$$
In the above equation, if the output of the economy, Y, and the velocity of money, V, are given, then an increase in M will proportionately increase P.
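The proportionality claim follows in one line from Eq. (26). In the sketch below, the subscripts 0 and 1 for the periods before and after the change in money supply are notation assumed here for illustration.

```latex
P = \frac{MV}{Y}
\quad\Longrightarrow\quad
\frac{P_1}{P_0} = \frac{M_1 V / Y}{M_0 V / Y} = \frac{M_1}{M_0}
\qquad (V \text{ and } Y \text{ held fixed})
```

For example, with V and Y unchanged, a 10 percent rise in M ($M_1 / M_0 = 1.10$) implies a 10 percent rise in P.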
In developing countries like Nepal, where supply-side bottlenecks are also a big issue, demand-side inflation may be dominated by structural constraints. This paper empirically analyses the money-price relationship in Nepal by following the econometric framework discussed in the preceding sections.
In line with the methodological framework discussed above, Nepal's money-price relationship is analyzed with due consideration to the properties of the time series. We include the monthly series of the Nepalese consumer price index (CPI), nominal effective exchange rate (NEER), broad money (M2) in Rs. million, and Indian CPI (CPII) from January 2000 through April 2014. The graphical plots of the series are presented in Annex 2.
From the graphs shown in Annex 2, we can see possible non-stationarity in Nepalese CPI, Indian CPI and M2 but cannot determine its nature. NEER looks like a stationary series, but we cannot be sure about it.
Unit root tests were carried out on the monthly series of CPI, M2, NEER and CPII, both in levels and as transformed series (logs and first differences), with an intercept and with both trend and intercept, using the three popular test methods discussed in Section 3.1: ADF, PP and KPSS. The unit root test results are presented below in Table 1.
The ADF tests show that all four variables are non-stationary in levels as well as in logs. The level series of NEER and CPII become stationary at first difference. CPI and M2, however, are non-stationary even at first difference, although M2 becomes stationary at first difference after taking logs. None of the series is trend stationary, since all of them remain non-stationary after a time trend is included in the ADF test equation. CPI is non-stationary at first difference both with and without logs. However, CPI is stationary at the 5 percent level at first difference if both trend and intercept are included in the ADF test equation.
The Phillips-Perron (PP) test results also show that all the variables are non-stationary in levels (Table 2). The results are consistent with the ADF test results.
The KPSS test for stationarity also shows that all the series are non-stationary in levels (with and without logs). The results differ, however, at first difference. Although CPI and log(CPI) were both non-stationary at first difference (without trend) in the ADF tests, the KPSS test reports that log(CPI) is stationary at first difference but CPI is not. Both were stationary at first difference in the PP tests. In the case of M2, both M2 and log(M2) were non-stationary in the KPSS tests even at first difference, but log(M2) was stationary at first difference in the PP and ADF tests. M2 at first difference was stationary only in the PP tests. Surprisingly, although log(CPII) was stationary at first difference in the ADF and PP tests, it is not stationary in the KPSS tests (Table 3).
As shown in Table 4, although non-stationarity can be confirmed by any of the available test methods, the way we make the variables stationary and retest them for confirmation can sometimes produce inconsistent results. We have to be particularly careful with variables that do not become stationary even at first difference and with those at the borderline of the decision points. One good approach might be to choose the property that is repeated or similar across the various test results.
Following the methodology illustrated in Section 3, we should not estimate an OLS model, as the unit root tests show that all the variables included in our model are non-stationary. However, for comparison purposes, we run the following OLS regression using the log data in order to measure elasticities.
$$\log(CPI_t) = \alpha + \beta_1 \log(M2_t) + \beta_2 \log(CPII_t) + \beta_3 \log(NEER_t) + \varepsilon_t \qquad (27)$$
Adj. R² = 0.999, F-stat: 12040, DW stat: 0.686. *: significant at 5 percent or lower level.
Table 1
ADF test results.
Variable      Intercept                                Trend and intercept
              Level              First difference      Level              First difference
              t-stat   p-value   t-stat   p-value      t-stat   p-value   t-stat   p-value
CPI           2.252    1.000     0.996    0.754        0.004    0.996     3.689    0.026
log(CPI)      1.375    0.999     1.953    0.307        2.150    0.514     2.722    0.229
M2            4.6176   1.000     0.653    0.854        0.904    0.999     6.679    0.000
log(M2)       2.147    0.999     12.160   0.000        1.399    0.858     12.576   0.000
NEER          1.456    0.553     10.736   0.000        1.162    0.914     10.805   0.000
log(NEER)     1.441    0.561     10.845   0.000        1.139    0.918     10.916   0.000
CPII          5.149    1.000     9.498    0.000        0.566    0.999     7.613    0.000
log(CPII)     3.748    1.000     10.406   0.000        1.177    0.911     7.605    0.000
Table 2
Phillips-Perron test results.
Variable      Intercept                                Trend and intercept
              Level              First difference      Level              First difference
              t-stat   p-value   t-stat   p-value      t-stat   p-value   t-stat   p-value
CPI           3.705    1.000     10.755   0.000        1.130    0.920     12.232   0.000
log(CPI)      2.541    1.000     13.249   0.000        2.535    0.311     18.626   0.000
M2            7.46     1.000     12.359   0.000        1.351    1.000     14.187   0.000
log(M2)       1.662    0.999     13.917   0.000        1.352    0.871     14.265   0.000
NEER          1.127    0.704     10.657   0.000        0.928    0.949     10.777   0.000
log(NEER)     1.127    0.704     10.819   0.000        0.944    0.948     10.876   0.000
CPII          3.892    1.000     9.603    0.000        0.587    0.978     10.422   0.000
log(CPII)     2.541    1.000     13.249   0.000        2.535    0.311     18.626   0.000
Table 3
KPSS test results (LM statistics).
Variable      Intercept                            Trend and intercept
              (critical value @ 5% = 0.463)        (critical value @ 5% = 0.146)
              Level    First difference            Level    First difference
CPI           1.579    0.732                       0.417    0.032
log(CPI)      1.638    0.461                       0.405    0.112
M2            1.519    1.385                       0.415    0.108
log(M2)       1.647    0.527                       0.347    0.102
NEER          0.466    0.267                       0.342    0.072
log(NEER)     0.471    0.261                       0.344    0.073
CPII          1.537    1.235                       0.412    0.051
log(CPII)     1.604    0.696                       0.411    0.061
Note: The null hypothesis is that the variable is stationary. The null is not rejected when the LM statistic is smaller than the critical value, and is rejected otherwise.
Table 4
Comparison of results of three unit root test methods.
Variables     ADF                                  PP                                   KPSS
CPI           Non-stationary at first difference   Non-stationary at first difference   Stationary at first difference
log(CPI)      Non-stationary at first difference   Stationary at first difference       Stationary at first difference
M2            Non-stationary at first difference   Stationary at first difference       Non-stationary at first difference
log(M2)       Stationary at first difference       Stationary at first difference       Non-stationary at first difference
CPII          Stationary at first difference       Stationary at first difference       Non-stationary at first difference
log(CPII)     Stationary at first difference       Stationary at first difference       Non-stationary at first difference
When the time series properties of the data are ignored, the level-data estimates look robust, with a high adjusted R² value, a significant F-statistic and all variables being significant. But given the preliminary evidence of non-stationarity in these series, the estimates might be spurious. This is indicated by the low value of the DW statistic (0.686), which is even lower than the R² value. Also, an adjusted R² value very close to 1 (0.999) is typical of a spurious relation. Furthermore, the plotted residuals of the model do not fluctuate around zero (Annex 3 Figure A5), which violates the OLS assumptions.
The VAR models of those four variables using level data with two lags can be represented as follows:
The estimates of the VAR model (Eqs. (29)-(32)) are obtained as follows:
The unit root tests show that all the series included in the money-price model are I(1). In this case, the series might be cointegrated, which, if not addressed, may result in spurious estimates. This has been shown by the test results of the OLS and VAR models presented in the preceding sections. Hence, we conduct the Johansen cointegration test employing the monthly series of log(CPI), log(CPII), log(M2) and log(NEER). The test results given by the EViews software are presented in Table 5.
The software reports two different types of test statistics: the trace statistic and the maximum eigenvalue statistic. The way the rank of the matrix is determined differs slightly between them. The trace statistic tests the null hypothesis of r cointegrating relations against the alternative of k cointegrating relations, where k is the number of endogenous variables. The maximum eigenvalue statistic, on the other hand, tests the null hypothesis of r cointegrating relations against the alternative of r + 1. In both methods, we proceed sequentially from r = 0 to r = k − 1 until we fail to reject the null hypothesis. The two methods generally lead to similar decisions on the number of cointegrating relations. Where they conflict, the convention is to interpret the result based on economic logic and the purpose of the study.
With this process, the unrestricted cointegration rank tests based on the trace and maximum eigenvalue statistics both indicate that there exists one cointegration relationship. However, it is relatively weak since we reject the null hypothesis only at the 10 percent level of significance. Still, with the logic that Nepal's inflation might have cointegrated with Indian inflation (as graphical plots indicate), we can assume one cointegration relation.
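The sequential procedure described above can be written as a short loop. The sketch below applies it to the statistics and 5 percent critical values reported in Table 5; the function name `cointegration_rank` is hypothetical.

```python
def cointegration_rank(statistics, critical_values):
    """Return the first r for which the null of r cointegrating relations is not rejected."""
    for r, (stat, crit) in enumerate(zip(statistics, critical_values)):
        if stat <= crit:          # fail to reject: stop at this r
            return r
    return len(statistics)        # every null rejected: full rank

# Statistics and 5 percent critical values as reported in Table 5
trace_stats = [67.036, 29.673, 10.541, 2.213]
trace_crit = [47.86, 29.80, 15.49, 3.84]
max_eig_stats = [37.36, 19.13, 8.33, 2.21]
max_eig_crit = [27.58, 21.13, 14.26, 3.84]

rank_trace = cointegration_rank(trace_stats, trace_crit)
rank_max_eig = cointegration_rank(max_eig_stats, max_eig_crit)
print(rank_trace, rank_max_eig)
```

Both sequences stop at the second row: the null of no cointegration is rejected, while the null of at most one relation is not, although the trace statistic of 29.673 falls only just short of its critical value of 29.80.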
Besides the number of cointegrating vectors, Johansen cointegration test also jointly estimates the long-run and
short-run relationships of the variables incorporated in the model. The long-run estimates are called Beta relations
while short-run estimates are Alpha relations. However, the beta coefficients are only identified when some re-
strictions are imposed in VECM to normalize the relationship amongst the variables.
The results of the Johansen cointegration relations are presented in Table 6.
The Johansen cointegration test results indicate that all the three variables have significant positive impact on
inflation in the long run. The magnitude of impact (the coefficient) of Indian inflation is largest while that of money
supply and exchange rate are almost similar. The short run relation statistics show a significant positive impact on
inflation of its lag values and a negative impact of money supply.
Based on the Johansen test result of one cointegrated relation, we estimate an error correction model (ECM) as
described in Section 4.5.
To identify the optimal number of lags, we can estimate an unrestricted VAR and check the lag-order selection criteria for
the series. In our case, the optimal lag length is three, as indicated by the FPE, AIC and HQ criteria (Table 7).
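A minimal sketch of this selection step follows, using the criterion values printed in Table 7. Two things are assumptions: the helper name `best_lag` is hypothetical, and the AIC, SC and HQ values are entered with a negative sign, because the extracted table shows magnitudes only and the marked optima are consistent with negative values.

```python
# Criterion values by lag, from Table 7.
# ASSUMPTION: AIC, SC and HQ are negative (the extracted table lost any sign).
criteria = {
    "FPE": {1: 2.96e-16, 2: 2.68e-16, 3: 2.28e-16, 4: 2.41e-16},
    "AIC": {1: -24.404, 2: -24.504, 3: -24.666, 4: -24.611},
    "SC": {1: -24.099, 2: -23.894, 3: -23.751, 4: -23.391},
    "HQ": {1: -24.280, 2: -24.256, 3: -24.295, 4: -24.116},
}


def best_lag(values_by_lag):
    """Each criterion is minimized: return the lag with the smallest value."""
    return min(values_by_lag, key=values_by_lag.get)


chosen = {name: best_lag(values) for name, values in criteria.items()}
print(chosen)
```

Under that sign assumption FPE, AIC and HQ pick lag 3 while SC picks lag 1, matching the markers in Table 7.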
The cointegration and error correction equations for LNCPI, LNCPII, LNM2 and LNNEER can be estimated as
given below. The VECM approach first estimates the long-run relationship (the cointegration equation) and then the
short-run relationships for each of the variables (the error correction equations).
Cointegration Equation:
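$$\Delta LNCPI_t = \mu_{LNCPI} + \alpha_{LNCPI}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{1h}\,\Delta LNCPI_{t-h} + \sum_{h=1}^{3} b_{1h}\,\Delta LNCPII_{t-h} + \sum_{h=1}^{3} c_{1h}\,\Delta LNM2_{t-h} + \sum_{h=1}^{3} d_{1h}\,\Delta LNNEER_{t-h} + u_{LNCPI,t} \tag{38}$$

$$\Delta LNCPII_t = \mu_{LNCPII} + \alpha_{LNCPII}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{2h}\,\Delta LNCPI_{t-h} + \sum_{h=1}^{3} b_{2h}\,\Delta LNCPII_{t-h} + \sum_{h=1}^{3} c_{2h}\,\Delta LNM2_{t-h} + \sum_{h=1}^{3} d_{2h}\,\Delta LNNEER_{t-h} + u_{LNCPII,t} \tag{39}$$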
Table 5
Johansen cointegration test results.

Unrestricted cointegration rank test (Trace)
Hypothesized no. of CE(s)   Eigenvalue   Trace statistic   0.05 critical value   Prob.
None*                       0.203        67.036            47.86                 0.00
At most 1                   0.109        29.673            29.80                 0.05
At most 2                   0.049        10.541            15.49                 0.24
At most 3                   0.013        2.213             3.84                  0.12

Unrestricted cointegration rank test (Maximum eigenvalue)
Hypothesized no. of CE(s)   Eigenvalue   Statistic         Critical value        Prob.
None*                       0.203        37.36             27.58                 0.002
At most 1                   0.109        19.13             21.13                 0.093
At most 2                   0.049        8.33              14.26                 0.346
At most 3                   0.013        2.21              3.84                  0.137

* Rejection of hypothesis at 5 percent significance level.
Table 6
Johansen cointegration relations results.

Coefficient                   Estimates   Standard error
Long-run (Beta) relations
  LNCPII_t                    0.72^b      0.079
  LNM2_t                      0.13^b      0.035
  LNNEER_t                    0.125^b     0.056
Short-run (Alpha) relations
  Δlog(CPII)_t                0.055       0.056
  Δlog(NEER)_t                0.049       0.038
  Δlog(M2)_t                  0.16^a      0.090
  Δlog(CPI)_t                 0.49^b      0.065

a Significant at 10 percent level.
b Significant at 5 percent or lower level.
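$$\Delta LNM2_t = \mu_{LNM2} + \alpha_{LNM2}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{3h}\,\Delta LNCPI_{t-h} + \sum_{h=1}^{3} b_{3h}\,\Delta LNCPII_{t-h} + \sum_{h=1}^{3} c_{3h}\,\Delta LNM2_{t-h} + \sum_{h=1}^{3} d_{3h}\,\Delta LNNEER_{t-h} + u_{LNM2,t} \tag{40}$$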
The estimates of the coefficients of Eqs. (37-41), obtained with EViews 8, are given in Table 8. The long-run
relationships indicate that the contemporaneous impact of Indian CPI on Nepal's CPI is about 68 percent, whereas broad
money supply (M2) and the nominal effective exchange rate (NEER) account for about 15 percent and 10 percent,
respectively. In the cointegration equation, the coefficients on LNCPII and LNM2 are significant at the 5 percent level or lower, while the coefficient on LNNEER is significant at the 10 percent level.
The short-run adjustment coefficients of the ECM (the α's) indicate that M2 helps correct the disequilibrium in Nepal's
inflation, whereas the exchange rate and Indian inflation do not. The coefficient for LNM2 is 0.18 and significant at the 10
percent level, indicating some degree of central bank control over inflation in both the short and the long run.
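One common way to read an error-correction coefficient is as a speed of adjustment. The sketch below is not part of the paper's analysis: it assumes that a share |α| of last period's disequilibrium is removed each month, that the coefficient on the LNCPI equation carries a negative sign (Table 8 shows only the magnitude 0.462), and the helper name `half_life` is hypothetical.

```python
import math


def half_life(alpha):
    """Periods needed to close half of a disequilibrium when a share |alpha| is corrected per period."""
    speed = abs(alpha)
    if not 0 < speed < 1:
        raise ValueError("expects 0 < |alpha| < 1")
    # After n periods a fraction (1 - speed)**n remains; solve (1 - speed)**n = 0.5 for n.
    return math.log(0.5) / math.log(1 - speed)


alpha_lncpi = -0.462  # magnitude from Table 8; the negative sign is an assumption
half_life_months = half_life(alpha_lncpi)
print(round(half_life_months, 2))
```

Under these assumptions, about half of a deviation from the long-run relation would be corrected in slightly more than one month.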
The diagnostic test results indicate that the VECM estimates are robust. The residual plots for the regressors and the cointegration
equation show random, zero-mean disturbances. Likewise, the inverse roots of the AR characteristic polynomial lie inside the unit circle.
The LM test does not reject the null hypothesis of no autocorrelation in the residuals up to three lags (Annex 4).
As mentioned earlier, we estimated the determinants of Nepal's consumer price index (CPI) by including broad
money supply (M2), Indian CPI (CPII) and the nominal effective exchange rate (NEER) in the model. The Johansen test
indicated a weak cointegration relation. On the other hand, graphical plots of CPI and CPII show a common trend,
indicating a cointegration relation. Thus, it would not be wise to take first differences and estimate models on them, as doing so may
ignore the long-run relationship. In this case, an ARDL model can capture both the long-run and the short-run relations of the
cointegrated variables. Hence, the ARDL model discussed in Section 4.6 has been employed to revisit the money-price
relationship in Nepal. The following model is used, with data in log (LN) form:
$$LNCPI_t = a + b\,LNM2_t + c\,LNNEER_t + d\,LNCPII_t + e_t \tag{42}$$
Table 7
VAR lag order selection criteria.

Lag   LogL       LR       FPE          AIC        SC         HQ
1     1992.727   NA       2.96e-16     24.404     24.099^a   24.280
2     2016.836   45.837   2.68e-16     24.504     23.894     24.256
3     2045.989   53.986   2.28e-16^a   24.666^a   23.751     24.295^a
4     2057.528   20.799   2.41e-16     24.611     23.391     24.116

LR: Likelihood ratio; FPE: Final prediction error; AIC: Akaike information criterion; SC: Schwarz criterion; HQ: Hannan-Quinn criterion.
a Optimal lag length.
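$$\Delta LNCPI_t = \varepsilon_0 + \sum_{i=1}^{p} \phi_i\,\Delta LNCPI_{t-i} + \sum_{i=1}^{p} \varphi_i\,\Delta LNM2_{t-i} + \sum_{i=1}^{p} \gamma_i\,\Delta LNNEER_{t-i} + \sum_{i=1}^{p} \eta_i\,\Delta LNCPII_{t-i} + \lambda_1\,LNCPI_{t-1} + \lambda_2\,LNM2_{t-1} + \lambda_3\,LNNEER_{t-1} + \lambda_4\,LNCPII_{t-1} + u_t \tag{43}$$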
The above models were estimated in Microfit. The ARDL(1,0,0,1) model was selected on the basis of the Akaike Information
Criterion (Table 9).
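For reference, the link between the level coefficients of Eq. (43) and the long-run coefficients of Eq. (42) can be written out. This is the standard steady-state argument for an ARDL model in error-correction form, offered as a sketch rather than as the paper's own derivation: set all differenced terms and the error to zero, assume the coefficient on the lagged dependent variable is non-zero, and solve for LNCPI. Here the symbols lambda_1 to lambda_4 denote the coefficients on the lagged levels of LNCPI, LNM2, LNNEER and LNCPII, and epsilon_0 is the intercept.

```latex
\begin{align*}
0 &= \varepsilon_0 + \lambda_1\,LNCPI + \lambda_2\,LNM2 + \lambda_3\,LNNEER + \lambda_4\,LNCPII \\
LNCPI &= -\frac{\varepsilon_0}{\lambda_1} - \frac{\lambda_2}{\lambda_1}\,LNM2
         - \frac{\lambda_3}{\lambda_1}\,LNNEER - \frac{\lambda_4}{\lambda_1}\,LNCPII \\
\text{so that}\quad
a &= -\frac{\varepsilon_0}{\lambda_1},\qquad
b = -\frac{\lambda_2}{\lambda_1},\qquad
c = -\frac{\lambda_3}{\lambda_1},\qquad
d = -\frac{\lambda_4}{\lambda_1}.
\end{align*}
```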
The long-run estimates of the ARDL model (Eqs. (42) and (43)) show that M2 and CPII are the determinants of inflation in
Nepal. According to the results, a one percent change in money supply (M2) brings a change of about 0.27 percent
in inflation, while a one percent change in Indian inflation changes Nepal's inflation by 0.43 percent.
However, NEER does not seem to affect inflation.
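Because the model is specified in logs, the long-run coefficients read as elasticities. The short sketch below illustrates the arithmetic with the Table 9 estimates; the helper name `pct_change_in_y` is hypothetical, and the calculation holds the other regressors fixed.

```python
def pct_change_in_y(elasticity, pct_change_in_x):
    """Exact percent change in y implied by a log-log coefficient, other regressors held fixed."""
    return ((1 + pct_change_in_x / 100) ** elasticity - 1) * 100


# Long-run coefficients from Table 9
effect_m2 = pct_change_in_y(0.268, 1.0)    # response to a 1 percent rise in M2
effect_cpii = pct_change_in_y(0.432, 1.0)  # response to a 1 percent rise in Indian CPI
print(round(effect_m2, 3), round(effect_cpii, 3))
```

For small changes the exact figure is very close to the coefficient itself, which is why the coefficient is usually quoted directly.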
The diagnostic tests for the ARDL estimates indicate white-noise error terms with homoskedasticity and
normality: the null hypothesis of no residual serial correlation (Lagrange multiplier test) cannot be rejected, the
functional form is adequate, the error terms are normally distributed, and the null hypothesis of homoskedastic errors cannot
be rejected (Annex 5).
As described in Section 4.7, Granger causality tests show pairwise relationships, which may be one-way, two-way
or absent. The test serves as a complement that helps justify the inclusion of variables in the model, validate the
cointegration relation and determine the direction of the relationship. The summary results of the Granger
causality test are presented in Table 10.
In a nutshell, the Granger causality test confirms that all the variables (CPII, NEER and M2) included in the model
influence the CPI. These relationships are also theoretically valid, and no other problems such as endogeneity are
observed.
Table 8
Cointegration and EC estimates of Eqs. (37-41).

Coefficient                        Estimates   t-stats   Equation No.
Long-run cointegration estimates
  φ                                0.965       -         Eq. 37
  b1 (LNCPII_t)                    0.682^a     8.20
  b2 (LNM2_t)                      0.150^a     3.997
  b3 (LNNEER_t)                    0.097^b     1.709
Short-run ECM estimates
  α_LNCPI                          0.462^a     5.968     Eq. 38
  α_LNCPII                         0.106^a     2.516     Eq. 39
  α_LNM2                           0.18^b      1.728     Eq. 40
  α_LNNEER                         0.052       1.172     Eq. 41

a Significant at 5 percent or lower level.
b Significant at 10 percent level.
Table 9
ARDL test statistics.

Coefficient                        Estimates   t-stats
Long-run estimates (Eq. (42))
  a (Constant)                     1.386**
  b (LNM2)                         0.268**     2.935
  c (LNNEER)                       0.130       0.795
  d (LNCPII)                       0.432**     2.106
Short-run estimates (Eq. (43))
  ΔConstant                        0.414*      1.80
  ΔLNM2                            0.80**      2.558
  ΔLNNEER                          0.039       4.178
  ΔLNCPII                          0.008       2.653
Adjusted R2                        0.99
DW stat                            2.15
F-stat. F(5, 162)                  3316

*Significant at 10 percent level.
**Significant at 5 percent level.
Table 10
Granger causality tests.

Pair 1
  CPII does not Granger cause CPI: F-stat 24.729 (p-value 0.000)
  CPI does not Granger cause CPII: F-stat 0.8509 (p-value 0.428)
  Explanation: Only the first hypothesis is rejected. Indian inflation has a unidirectional relationship with Nepal's inflation.
Pair 2
  NEER does not Granger cause CPI: F-stat 3.658 (p-value 0.027)
  CPI does not Granger cause NEER: F-stat 1.066 (p-value 0.346)
  Explanation: Only the first hypothesis is rejected. There is a unidirectional relationship of NEER with CPI.
Pair 3
  M2 does not Granger cause CPI: F-stat 6.089 (p-value 0.002)
  CPI does not Granger cause M2: F-stat 1.957 (p-value 0.144)
  Explanation: Only the first hypothesis is rejected. There is a unidirectional relationship of M2 with CPI.
Pair 4
  NEER does not Granger cause CPII: F-stat 0.609 (p-value 0.545)
  CPII does not Granger cause NEER: F-stat 0.964 (p-value 0.383)
  Explanation: Neither null hypothesis is rejected. There is no relationship between NEER and CPII.
Pair 5
  M2 does not Granger cause CPII: F-stat 1.1823 (p-value 0.309)
  CPII does not Granger cause M2: F-stat 1.684 (p-value 0.188)
  Explanation: As neither hypothesis is rejected, we can infer that M2 and CPII are independent of each other.
Pair 6
  M2 does not Granger cause NEER: F-stat 1.196 (p-value 0.304)
  NEER does not Granger cause M2: F-stat 7.962 (p-value 0.000)
  Explanation: The second hypothesis is rejected. NEER affects M2 but M2 does not affect NEER.
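The reading of Table 10 follows a simple decision rule on the two p-values of each pair. The sketch below applies that rule to the reported p-values; the function name `direction` and the 5 percent threshold are assumptions made for illustration, not details stated in the paper.

```python
def direction(x, y, p_x_to_y, p_y_to_x, level=0.05):
    """Classify a pair from the p-values of 'x does not Granger cause y' and the reverse null."""
    x_causes_y = p_x_to_y < level  # reject 'x does not Granger cause y'
    y_causes_x = p_y_to_x < level  # reject 'y does not Granger cause x'
    if x_causes_y and y_causes_x:
        return f"{x} <-> {y}"
    if x_causes_y:
        return f"{x} -> {y}"
    if y_causes_x:
        return f"{y} -> {x}"
    return f"{x}, {y}: no Granger causality"


# (x, y, p-value for "x does not Granger cause y", p-value for the reverse null), from Table 10
pairs = [
    ("CPII", "CPI", 0.000, 0.428),
    ("NEER", "CPI", 0.027, 0.346),
    ("M2", "CPI", 0.002, 0.144),
    ("NEER", "CPII", 0.545, 0.383),
    ("M2", "CPII", 0.309, 0.188),
    ("M2", "NEER", 0.304, 0.000),
]
results = [direction(*pair) for pair in pairs]
for line in results:
    print(line)
```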
Nepal's money-price relationship has been modeled following the methodological framework described in Section
4 of this paper. Different models yield different coefficients, as shown in Table 11.
Table 11
Summary results of estimated models.

OLS
  Variables incorporated: CPI, CPII, M2, NEER (Monthly, 2000 Jan-2014 Apr)
  Estimates: M2 = 0.106; Adjusted R2 = 0.99; DW = 0.686
  Remarks: Estimates are significant but DW stat is lower than R2 value. It shows that the model is spurious.
VAR
  Estimates: M2 = 0.14; Adjusted R2 = 0.99; LM Stat: 58; p-value: 0.00
  Remarks: Estimates are significant but we reject the null hypothesis of no autocorrelation in residuals.
Johansen cointegration
  Estimates: Long run: M2 = 0.13; Short run: M2 = 0.16
  Remarks: Shows one cointegration equation but relatively weak (we reject null at 10%).
ECM
  Estimates: Long run: M2 = 0.15; Short run: M2 = 0.18; LM stat: 13.06; p-value: 0.667
  Remarks: Estimates are significant and robust.
ARDL
  Estimates: Long run: M2 = 0.27; Short run: ΔM2 = 0.80; Adjusted R2 = 0.99; DW = 2.15
  Remarks: Estimates are significant and robust.
Granger causality
  A unidirectional relationship: M2 affects CPI but CPI does not affect M2.
As discussed above, the various methods report different coefficients for the impact of money on consumer price
inflation in Nepal. The OLS results suggest that a one percent change in M2 leads to a 0.11 percent change in CPI.
According to the VAR results, a one percent change in M2 brings a change of 0.14 percent in CPI. However, the model fit
indicators show that these results are spurious, mainly because of the non-stationarity of the variables included in
the model. The Johansen cointegration test gives a long-run coefficient on M2 of 0.13, while the coefficient on M2 estimated
by the VECM is 0.15. According to the ARDL results, the coefficient on M2 is 0.27, indicating that a one
percent change in M2 leads to a 0.27 percent change in CPI.
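The spuriousness check used for the OLS model in Table 11 compares the R2 value with the Durbin-Watson statistic. A minimal sketch of that rule of thumb is given below; the function name `looks_spurious` is hypothetical, and the rule is only a warning sign, not a formal test.

```python
def looks_spurious(r_squared, durbin_watson):
    """Rule of thumb: an R2 value above the DW statistic is a warning sign of spurious regression."""
    return r_squared > durbin_watson


# Adjusted R2 and DW values reported in Table 11
ols_flag = looks_spurious(0.99, 0.686)
ardl_flag = looks_spurious(0.99, 2.15)
print(ols_flag, ardl_flag)
```

With the reported values, the rule flags the OLS regression and does not flag the ARDL model.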
The unique features of time series data make method selection difficult when analyzing
relationships among economic variables. Autoregressiveness, stationarity, trends, cycles, seasonality and structural
breaks are the most common properties of time series. These properties should be duly accommodated or addressed to
make the models robust. In particular, researchers must be aware of spurious relationships among variables. This paper
suggests a general framework for time series analysis that can help avoid spurious regression and obtain
robust results.
The unit root test is the starting point for time series analysis. Methods and
models should be selected on the basis of its results. OLS, VAR or other similar models are suggested if all the
variables are stationary. However, these models may yield spurious relationships if some or all of the variables are
non-stationary. Diagnostic testing is an important part of time series analysis for identifying spuriousness and assessing robustness.
The Johansen cointegration test is employed when all the variables included in the model are nonstationary. In
the case of mixed variables, i.e., some stationary and others nonstationary, the Johansen cointegration method
cannot be used. In such a case, ARDL models are appropriate. ARDL models can also be employed when all the
variables are nonstationary.
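The selection rule summarized above can be expressed as a small function. This is an illustrative sketch: the name `suggest_method` is hypothetical, the input is assumed to be the order of integration found by a unit root test for each variable, and "nonstationary" is read as integrated of order one, which is an assumption rather than something the text states.

```python
def suggest_method(integration_orders):
    """Map unit root results (variable -> order of integration) to a suggested method."""
    orders = set(integration_orders.values())
    if not orders:
        raise ValueError("no variables supplied")
    if orders == {0}:
        return "OLS or VAR"  # all variables stationary
    if orders == {1}:
        return "Johansen cointegration test (ARDL is also possible)"  # all nonstationary
    if orders <= {0, 1}:
        return "ARDL"  # mix of stationary and nonstationary variables
    raise ValueError("orders above I(1) fall outside the framework described here")


print(suggest_method({"LNCPI": 1, "LNCPII": 1, "LNM2": 1, "LNNEER": 1}))
print(suggest_method({"x": 0, "y": 1}))
```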
Nepal's money-price relationship is analyzed following the methodological framework suggested in this paper. The
framework greatly helps in choosing appropriate test methods for data analysis. Analysis of the money-price relation
with the ARDL model shows that, in the long run, a one percent change in money supply changes consumer prices by about 0.27 percent.
Based on the model fit statistics, we can argue that this estimate is more robust and reliable than the estimates
given by the other methods.
References
1. Enders Walter. Applied Econometric Time Series. 4th ed. USA: John Wiley & Sons; 2014.
2. Stigler Stephen M. Gauss and the invention of least squares. Ann Stat. 1981;9(3):465-474.
3. Verbeek Marno. A Guide to Modern Econometrics. 5th ed. Australia: John Wiley & Sons Ltd; 2017.
4. Maddala GS, Kim IM. Unit Roots, Cointegration, and Structural Change. Cambridge: Cambridge University Press; 2003.
5. Perron Pierre. The great crash, the oil price shock, and the unit root hypothesis. Econometrica. 1989;57(6):1361-1401.
6. Perron Pierre, Vogelsang Timothy J. Nonstationarity and level shifts with an application to purchasing power parity. J Bus Econ Stat. 1992;10(3):301-320.
7. Perron Pierre. Further evidence on breaking trend functions in macroeconomic variables. J Econom. 1997;80:355-385.
8. Lumsdaine R, Papell DH. Multiple trend breaks and the unit root hypothesis. Rev Econ Stat. 1997;79:212-218.
9. Bai Jushan, Perron Pierre. Computation and analysis of multiple structural change models. J Appl Econom. 2003;18:1-22.
10. Shrestha Min B, Chowdhury Khorshed. Sequential Procedure for Testing Unit Roots in the Presence of Structural Break in Time Series Data. Economics Working Papers. NSW, Australia: School of Economics, University of Wollongong; 2005.
11. Gujarati Damodar N. Basic Econometrics. New York: McGraw-Hill; 1995.
12. Sims C, Goldfeld S, Sachs J. Policy analysis with econometric models. Brookings Pap Econ Activ. 1982;1982(1):107-164.
13. Engle Robert F, Granger CWJ. Co-integration and error correction: representation, estimation, and testing. Econometrica. 1987;55(2):251-276.
14. Johansen S. Statistical analysis of cointegration vectors. J Econ Dynam Contr. 1988;12(2-3):231-254.
15. Johansen S, Juselius K. Maximum likelihood estimation and inference on cointegration, with applications to the demand for money. Oxf Bull Econ Stat. 1990;52:169-210.
16. Pesaran M Hashem, Pesaran Bahram. Working with Microfit 4.0: Interactive Econometric Analysis. Oxford: Oxford University Press; 1997.
17. Pesaran M Hashem, Shin Yongcheol. An autoregressive distributed lag modelling approach to cointegration analysis. In: Strom S, Holly A, Diamond P, eds. Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial Symposium. Cambridge: Cambridge University Press; 1999.
18. Granger CWJ. Investigating causal relations by econometric models and cross-spectral methods. Econometrica. 1969;37(3):424-438.
19. Fisher Irving. The Purchasing Power of Money: Its Determination and Relation to Credit, Interest, and Crises. New York: The Macmillan Co; 1922.
By
Clément YELOU
Read and approved by:
In the beginning God created the heavens and the earth. The earth was formless and empty. ... God said: Let
... Then God said: Let the earth bring forth vegetation, plants yielding seed, and fruit trees
bearing fruit according to their kind, with their seed in them, on the earth. And it was
so.
... Today, He makes me lie down in green pastures, He leads me beside still waters....
Also, He answers whoever comes to Him, for He says: "Let the one who is thirsty come, let the one who
The initial ideas of this work were enriched in many ways throughout
the research by my supervisor, Professor Aly Ahmadou MBAYE; I express
my full gratitude to him and thank him for his encouragement and for the example of
rigor and precision he set for me. May Professor Bouna Birahim NIANG receive
here my thanks for the example of precision and humility that characterizes
working with him. The coordination and schedule of this work were
well managed by Professor M. LABIDI, head of the training division of IDEP. May
all the staff of the IDEP library, as well as the staff of the
documentation center of the World Bank resident mission in Dakar, be assured of
my gratitude for their various documentation and information services. The
director of the Enterprise Division of the Direction de la Prévision et de la Statistique
(DPS) of Senegal, Mr. SAMBA BA, provided me with the data needed for the analyses; may he
May my parents find here a share of the fruit of the upbringing they gave me from
childhood; it was such as to encourage me to make an effort. To my brother Emmanuel GOLOU and
his family, I offer my thanks; they surround me with great
attention and much care. May the FANTODJI and DANS OU families find in this
work the result of their ever-renewed affection for me. May my sister Fatima
Myriam VICENS be assured of my gratitude for her attention, her encouragement and
her support. To Mrs. Soukheynatou KABA, who has always shown concern for
my social situation, I owe much advice and support; may she be
pleased with this work. My brother Thimothé AMOUSSOU has always helped me in
many ways; may he find in this work a fruit of all those efforts. I also owe
a great deal to my sisters Ablavi GOZA and SENAVOR DJIGBODI for the various
kinds of help they gave me in my many tasks during this
work.
To my friends, who have been concerned with the progress of this work ever since they learned that
I had to do it, I ask that they take all possible pride in it: Blandine, Angélique,
Stéphanie, Antoine, Aby, Olivier, Marius, Wilfrid, Kourouma, Alexis, Sabine, Lydienne,
Hervé, Fernand, Essowaza, Reine, Biaka, Calixte, David, Félicité, Stéphane, Ngaradoum,
Allasra, Symphorien and the others whose names are not written here.
May my friends Narcisse KOUTON, Appolinaire HOUENOU and Damien Fousséni CHABI-YO
take great pleasure in this work; I know well that they always think of me.
Above all, this training would not have been possible without the constant support of God.
I believe it is His will and I give Him thanks; He removed every obstacle, and His faithfulness
is for me a good source of hope and courage.
SUMMARY
Since independence, despite the efforts made to diversify
economic activities and the various economic policies applied, the Senegalese
economy has not experienced steady and continuous growth. This absence of a real economic
take-off suggests that the factors relevant to economic growth
have not been well mastered in the economy. This study therefore tried to
answer the question: "What factors explain the variations in per capita GDP
in the Senegalese economy?" The method of analysis consists of estimating
successively a growth model with Solow residuals, a model with human
capital, and then a model with economic policy variables as
control variables. This approach aims to endogenize the Solow residual, as suggested by
endogenous growth models. Thus, we assume that economic policy
variables affect the growth rate through their effect on the Solow
residual. The data cover the period 1971-1997, which takes in
all the main macroeconomic and sectoral reforms adopted
in the economy.
The results suggest that neither human capital nor physical capital has been well
exploited in the economy, for lack of an adequate working environment and motivation.
In fact, macroeconomic policies and the production framework did not allow
full exploitation of resources. Moreover, government consumption
expenditure policies induced attitudes favorable to growth. In addition,
climatic changes causing severe drought have negative repercussions
on the economy. Finally, trade liberalization policies prove to be
measures that encourage both local production firms and exporting
firms to seek better competitiveness through higher
productivity and improved product quality; they thereby promote
growth.
ABSTRACT
Since independence, despite the efforts made to diversify economic
activities and the various economic policies applied, the Senegalese economy has not known steady
and regular growth. This absence of a real economic take-off suggests that the factors pertinent
to economic growth have not been mastered in the economy. This study has therefore
tried to answer the question: "What are the explanatory factors of the variations in per capita GDP
in the Senegalese economy?" The method of analysis consists of estimating
successively a growth model with the Solow residual, a model with human capital, and a model
with economic policy variables treated as control variables. This procedure aims at
endogenizing the Solow residual, as suggested by endogenous growth models. We thus
suppose that economic policy variables influence the growth rate through their impact
on the Solow residual. The data cover the period 1971-1997, which includes
all of the main sectoral and macroeconomic reforms adopted in the economy.
The results suggest that neither human capital nor physical capital has been well
exploited, because of the lack of an adequate working environment and motivation. In fact,
macroeconomic policies and the production framework did not allow full exploitation of
resources. Besides, public consumption expenditure policy induced attitudes favourable to
growth. Furthermore, climatic changes causing severe drought have negative effects on the
economy. Finally, trade liberalization policies are measures that encourage both
local production enterprises and exporting enterprises to seek better
competitiveness through higher productivity and improved product
quality; they thus promote growth.
These analyses suggest that government policy should define a framework in which enterprises
can work freely toward better productivity. The different actors in the economy should put
forth all their strength to better exploit production opportunities.
CONTENTS
ACKNOWLEDGMENTS
SUMMARY
GENERAL INTRODUCTION
1. Introduction
2. Central Problem of the Study
3. Justification of the Study
4. Objectives of the Study
5. Research Hypotheses
6. Organization of the Study
CHAPTER ONE:
CONCLUSION
RECOMMENDATIONS
1) A sound macroeconomic policy
2) Reform of public intervention focused on good governance
3) Public expenditure reforms
4) Strengthening human capital
5) Rapid expansion of exports
6) Information, Education and Communication on the Environment
7) Making the reforms irreversible
BIBLIOGRAPHY
ANNEXES
GENERAL INTRODUCTION
1. INTRODUCTION
In sub-Saharan Africa, several measures were adopted after independence
to secure rapid economic growth capable of leading each country toward
import substitution were adopted in most countries. But the high and
permanent protection of local firms that this strategy requires gave rise to
productive not being on a par with those used in developed countries, local
products could not meet competitiveness criteria in terms of
quality. Despite the weak quality competitiveness of local products, they are often
sold at higher prices than comparable imported products, which
does not encourage consumers to buy them. This growth strategy has therefore
In the agricultural sector, the natural pillar of the economy in Africa, the annual growth rate
of output averaged only 2% between 1965 and 1980, which is less
HOEVEN and VAN DER KRAAIJ, 1994). The other sectors of the economies of sub-Saharan
Africa did not record better results, so that since the mid-
1970s the annual growth rate has fallen below its value for the period
1965-1973 (WORLD BANK, 1993). Indeed, whereas in this region the rate
was 4.7% over the period 1965-1973, it fell to 3.2% over the period 1974-1980,
then to 1.2% between 1981 and 1985, before recovering to 2.5% between 1986 and 1990
This unfavorable economic situation at the end of the 1970s led most
(IMF) and the World Bank. An assessment of these policies shows that the results
indeed, in most countries, even though financial stabilization took place,
the economy merely stagnated and per capita incomes rose only
very slightly. This weak economic performance seriously
affected the population's standard of living: the average growth rate of per capita
income in sub-Saharan Africa was negative between 1981 (when it was -1.8%) and 1992
(when it was -0.4%). As a result, poverty problems worsened and the
standard of living declined year after year (UNDP, 1998). Thus, these policies gave rise to
Moreover, external adjustment policies were applied in some countries
1994 in the African countries of the franc zone was meant to restore the competitiveness of their
economies. Although this measure led to a fall in the purchasing power of
sub-Saharan Africa or the franc zone were implemented with certain
specific features, such as accompanying measures. But they did not allow a
the devaluation of the CFA franc that one records economic growth rates that are fairly
Under these conditions, the authorities should take action to
concern for individual and collective well-being, it is clear that the growth of aggregate income
This study looks for the factors that condition the growth of aggregate income in
the economy of Senegal. The approach adopted emphasizes the factors that promote
The search for the causes of differences between the economic results achieved by various
countries over the same period, or between those recorded by the same country in two
different periods, is a major concern of economic theory. For this
production function. The production function is defined from the factors used
physical capital. According to the first growth models (the model of Solow (1956), of
real which is not captured by these two factors. This unexplained share is Total
Factor Productivity (TFP), referred to as the "Solow residual"; it is assumed
large share of the difference in the results obtained, whether over time or across space
(Chenery, 1991). Thus, at a similar level of development, the countries that increase
These shortcomings of two-factor models prompted the search for other factors
likely to capture a good part of the Solow residual. Thus, from the mid-1980s,
endogenous growth theories examine the main factors
that explain growth dynamics and their self-sustaining character. These new
diminishing marginal returns, or that part of the additional output is used for
economic activities. These theories also suggest that policy reforms, through
indeed, they break with the traditional conceptions of the function of
long term. Long-term growth can come about through increasing returns to
economic growth should therefore no longer be limited to analyzing only the variations in
endogenous technical progress.
In Senegal, it must be said that during the 1960s and 1970s policies were adopted that were
characterized by public intervention in goods and factor markets, a
policies resulted in low levels of saving and investment which, in the face of
rapid population growth, led in turn to stagnation of per capita GDP
(World Bank, 1997). The adjustment policies initiated in the early 1980s
partially succeeded in restoring macroeconomic balances. But in the early
1990s economic conditions deteriorated again following a deep
public sector, had broadly positive results. Thus, over the period 1994-1997,
annual growth rates of around 5% were recorded, which is higher
than the results of earlier periods. But these results in terms of economic
growth are still very low given the current level of the economy.
Factor Productivity (TFP) has not been steady since 1960. Indeed, Berthélémy et al. (1996)
show that TFP grew steadily and positively only over the period 1960-
1966. Between 1967 and 1990, TFP fluctuated widely, even recording negative
annual growth rates. These authors find that, despite this weak performance, TFP
contributed nearly 22% of GDP growth between 1960 and 1990. One can
its sharp fluctuations may have produced the weak results recorded in terms of
growth since independence in 1960. In fact, since then, reform policies
international has changed a great deal. Efforts have also been made to
increase human skills and abilities. In this regard, it should be noted that
health and security services have seen both an improvement in quality and
wider geographical coverage. Enrollment rates at the various levels
of education have risen considerably compared with 1960 (68% in 1996
against 41% in primary education, World Development Indicators CD-Rom - World
Bank, 1999), and the fields of training have diversified. These various policies
growth of TFP. Under these conditions, it is important to investigate what
their real effects on the variations in per capita GDP have been. This study looks for the factors that
best explain the variations in per capita GDP in the Senegalese economy
3. JUSTIFICATION OF THE STUDY
seek to raise per capita income. Admittedly, this is an indicator based on a calculation
distribution of national wealth. However, growth in national income over a
long period is a prior and necessary phase for reducing poverty. It remains
true, therefore, that macroeconomic policies must continue to focus on the objective of
sustainable growth. The variables on which to act in order to accelerate
growth are identified by growth theories and models. But these models do not
take into account the growth factors that may be linked to the economic environment,
both internal and external, in which production takes place.
Yet this type of factor is important. Indeed, labour and physical capital can
operate only in a climate favourable to their full use and their
renewal. The interest of this study is to show that the two traditional factors
that condition the efficiency of the productive system and thus constitute factors
We will then have new strategies for guiding policies
4. OBJECTIVES OF THE STUDY
continuous growth of GDP per worker. Specifically, it aims to identify:
5. RESEARCH HYPOTHESES
Drawing on the results of various empirical studies analysing growth
in several types of countries (in terms of level of development), we formulate the
small part of the variations in the growth rate of gross domestic product (GDP) per capita of
public consumption are the main factors that explain the variations in the
growth rate of GDP per capita in Senegal. These factors are expected to be positively related
These hypotheses thus emphasise the endogenous factors which are, apart from
6. ORGANISATION OF THE STUDY
main policies and reforms implemented there since 1960. Chapter
Chapter 1: Recent developments in the Senegalese economy
Situated at the far western edge of Africa, Senegal covers an area
of 196,722 km² and contains an enclave, The Gambia, of 10,300 km². The country's terrain is,
for the most part, flat and does not rise above 130 metres; only the
South-East region is somewhat hilly. The climate is subject to geographical influences and differs
markedly between the coastal zone and the inland regions. In addition, atmospheric
circulation, helped by the absence of mountain barriers, places the country under the effects
whose durations vary from one region to another; rainfall decreases progressively
Ziguinchor (south), 800 mm in the Kaolack region (central zone), 330 mm at Podor
(north). Apart from its two rivers (the Senegal and the Gambia), Senegal has
thorny shrubs predominate; wooded savannah rich in wildlife characterises the
Sudanian zones, and dense forest is found in the sub-Guinean zone, confined to
Basse-Casamance.
All these physical characteristics have greatly limited the range of choices in
relations between the metropole and its West African colonies. Indeed, for
Second World War. Senegal had thus been favoured by the colonial authorities
Senegal had played a leading role in colonial affairs in Africa and has
in the years following independence that the national economy turned,
Over the two decades following independence (1960-1980), Senegal's economic
performance was unsatisfactory overall, even compared with
other sub-Saharan African countries. GDP grew by 2.1% a year on average while
the population grew by 2.8% a year, which produced a decline in real
per capita income. Moreover, of all the African countries spared by war, Senegal is the one
that recorded the lowest growth rate over this period. In fact, one can
distinguish within this period four sub-periods marked by different trends
granted by France to its agricultural exports, the management of the national economy
was relatively sound and growth reached about 3.5% a year, more than the
Between 1967 and 1974, the year in which the world oil price quadrupled, Senegal's GDP
grew by only 1.3% a year and groundnut production fell by nearly half.
roughly equal to that of the population, which is largely explained by good
severe droughts and the sharp fall in world groundnut prices, which
At the end of this period, the main economic indicators all showed
negative and total consumption exceeded GDP. Moreover, between 1975 and 1980,
inflation accelerated to a rate of 12% while the terms of trade
also declined by 12%. This situation points to the need for reforms
implemented there since 1980, we first give an overview of the structure of
national production.
The country's labour force is unevenly distributed across these sectors; likewise the
overall view of the structure of the labour force across these three sectors is given in the
following table.
           1980   1983   1991   1995   1997
Primary    32.5   23.1   20.6   21.0   19.0
is explained by the strong expansion of informal-sector activities and by the rural exodus, which
reduces the population engaged in agricultural activities. Since 1983, the
secondary sector's share of employment seems to have stagnated at around 20% of total employment;
this results from the importance still attached to industrial activities in the country. This
different sectors.
Table 2: Average annual growth rate of real sectoral value added
(constant 1987 prices), 1970 to 1997 (%).
           1970-74  1975-79  1980-84  1985-89  1990-93  1994-97
Primary       1.66     0.89     0.42     2.38    -2.38     1.88
Source: Author's calculations from the sectoral GDP series in the annual economic
database of the Direction de la Prévision et de la Statistique of Senegal.
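Table 2 reports average annual growth rates by sub-period. The source does not say which averaging convention was used; one common choice is the compound (geometric) average, sketched below in Python. The function name and the sample figures are hypothetical and are not taken from the thesis data.

```python
def average_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Compound average annual growth rate, in percent.

    Assumption: growth is averaged geometrically between two end points;
    the source table may have used a different convention.
    """
    if start_value <= 0 or end_value <= 0:
        raise ValueError("values must be positive")
    if years <= 0:
        raise ValueError("years must be positive")
    return ((end_value / start_value) ** (1.0 / years) - 1.0) * 100.0


# Hypothetical example: real value added rising from 100 to 110 over 4 years.
rate = average_annual_growth(100.0, 110.0, 4)
print(f"{rate:.2f}% per year")
```

A negative result, as in the 1990-93 column for the primary sector, simply means the end value lies below the start value.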
in the tertiary sector. By contrast, the persistently low growth rates of the primary sector may
1970 1975 1980 1985 1990 1993
Primary 1.7 1.1 -4.8 1.6 2.2
a) The primary sector
This mainly comprises agriculture, livestock farming, fishing and forestry services
agriculture in the economy and in the subsistence of the population, notably with
a view to food self-sufficiency, should prompt better management of the assets and
constraints associated with it in Senegal. Yet for two decades this agricultural branch
has faced difficulties due to unfavourable rainfall, the reduction of subsidies and
18.8% over the period 1960-1986 to 11% over the period 1987-1993 (MEPF, 1997).
One notes, despite the measures taken by the authorities from 1981 onward to reduce the
Livestock farming contributes on average nearly 7.3% of GDP. This branch has seen
development efforts but does not yet meet the whole demand for milk (MEPF,
1997). It does, however, allow exports of hides and skins. It suffers from the
fragility of pasture, due to the country's sparse vegetation and poor management of the natural
environment.
Fishing has grown steadily, which today places it in first rank among
artisanal, which has adopted new techniques since the 1970s, and industrial fishing
b) The secondary sector
Public Works (BTP) and crafts. Although the contribution of the
other components to the national economy is not negligible, industrial activities constitute the branch
it is enough to strengthen the synergy between production branches to obtain
advantages in terms of production costs and product quality. Also, possibilities
textiles and energy. By the criteria of value added and employment opportunities
the latter provides on average 40% of industrial value added and nearly half
of total employment in manufacturing (Latreille, T., 1996). It should be noted that in the
imports was pursued until 1985; this strategy aimed to protect local industry
through tariff and non-tariff barriers. Heavy protection did not allow
industry to be competitive either in foreign markets or against imported
products. Since 1986, however, liberalisation policies have been adopted and industry
must henceforth strive to improve its price and quality competitiveness.
c) The tertiary sector
transport and by informal-sector activities. Transport, postal and
Senegal means that many people engage in commercial activities or in
activities belonging to the informal sector. This accounts for the tertiary sector's large contribution to GDP.
that it creates. Tourism benefits from the country's geographical position, the quality of its
In telecommunications, Senegal is the most competitive country in the
UEMOA zone. It also has the most numerous and highest-quality telecommunications products,
including internet access. This sector has also just been liberalised.
Moreover, its telecommunications infrastructure is among the most modern in West
Africa; its fault rate of 39% is among the lowest in sub-Saharan Africa
In Senegal, fiscal and monetary policies are characterised by strong
State intervention and by constraints arising from membership of UEMOA¹. They
produced macroeconomic imbalances that worsened towards the end of the
1970s. International institutions then intervened through policies of
¹ As part of its credit policy, this union imposes a statutory ceiling on advances granted to
governments, which may not exceed 20% of the ordinary budget revenue of the previous
fiscal year. This limits the financing of public deficits by the Union's Central Bank.
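The statutory ceiling described in this footnote amounts to a one-line rule. The Python sketch below is only an illustration of that rule: the function names and the revenue and deficit figures are hypothetical, and the code ignores institutional details (such as how outstanding advances are measured) that the footnote does not give.

```python
def max_central_bank_advance(previous_year_ordinary_revenue: float,
                             ceiling_ratio: float = 0.20) -> float:
    """Ceiling on central bank advances: 20% of the previous fiscal
    year's ordinary budget revenue (ratio taken from the footnote)."""
    if previous_year_ordinary_revenue < 0:
        raise ValueError("revenue must be non-negative")
    return ceiling_ratio * previous_year_ordinary_revenue


def uncovered_deficit(deficit: float,
                      previous_year_ordinary_revenue: float) -> float:
    """Part of a deficit that must be financed by other means once the
    statutory ceiling is reached (hypothetical helper)."""
    ceiling = max_central_bank_advance(previous_year_ordinary_revenue)
    return max(0.0, deficit - ceiling)


# Hypothetical figures, in billions of CFA francs.
print(max_central_bank_advance(500.0))
print(uncovered_deficit(130.0, 500.0))
```

With these made-up figures the ceiling is one fifth of 500, so a deficit of 130 leaves a remainder that has to be covered by borrowing or external support.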
In the late 1970s, the Senegalese government began to realise the
phosphates during 1973-1977 and a context of easy borrowing encouraged
expansionary domestic policies (Diagne, A., 1995). Internal and
call on the Bretton Woods institutions to implement programmes of
In a first phase, between 1980 and 1984, the country strove to implement
of the current account: this was the stabilisation phase. This phase was marked by
in November 1979 over a five-year period. The measures contained in this plan
parastatal. An Extended Fund Facility (EFF) arrangement was also concluded in August 1980 with the
IMF, followed by a structural adjustment loan with the World Bank in December 1980.
The EFF contained fiscal measures aimed at increasing revenue and reducing
public expenditure. But as the Senegalese government did not apply the measures
requested by the IMF, no purchase could be made under the EFF and in September
Subsequently, an annual stand-by programme was concluded with the IMF for 1981-
1982, with a mid-term review, and purchases under this arrangement were
programme were: higher prices for basic foodstuffs, an increase in
indirect tax rates, a civil service wage freeze, compression of
In 1982, another programme concluded for 1982-1983 implemented the
following measures: limits on civil service staffing, higher input
prices, reduction of staffing costs in the agricultural marketing chains and in certain enterprises
credit.
For 1983-1984, the measures taken aimed at raising the prices of basic
products, increasing the levy on producer prices, reducing expenditure
December 1985 with the adoption of a Medium- and Long-Term Adjustment Programme
the stagnation of the Senegalese economy is attributable to three categories of factors:
domestic demand exceeding GDP, the weak growth potential of the primary sector,
and an oversized and inefficient public sector. The measures contained in the PAMLT
this end, agricultural, industrial and trade policies were drawn up within the framework
The New Industrial Policy (NPI), an action plan finalised in July 1986, was
adopted to tackle the excessive protection that Senegalese industry had always
the elimination of all rigidities relating to employment and wage
setting.
For its implementation, planned over three years (1986-1988), the following measures were
The macroeconomic effects of these programmes were mixed.
institutional changes occurred during the period in which they were applied: success of the
(market deregulation, removal of price controls, better management
of parastatal enterprises) (Elliot Berg and associates, 1990; Ministère de l'Economie
associates (1990) reaches the central conclusion: "Senegal did not make major
changes in terms of adjustment during the 1980s, a decade during which
policy" (p. 32). According to this study, structural adjustment was postponed in Senegal. For
1992 can be summed up by the phrase: "Stabilisation, perhaps; growth, no" (p.
XI). Diagne, A. (1995) indeed finds that the reduction in the budget deficit between 1985 and
1992 was achieved through measures incompatible with the growth objective. In his
view, the tax burden increased and taxes on technical factors (water,
energy, telecommunications) raised their prices, which reduced the
competitiveness of firms. Moreover, maintenance and investment resources
fell while the wage bill rose. For Berg, E. and associates, the slowness
of Senegal's structural adjustment efforts can be explained by several
factors: 1) exogenous factors such as drought, the structure of prices of
policy); 3) the presence of donors who allegedly lacked the capacity to
As regards the impact of the NPI, the studies carried out show that the expected
results were not achieved and that the policy even had negative effects on
certain aspects of the national economy. Indeed, the World Bank (1992) shows that
phosphate, all exports in fact either stagnated or declined.
In the early 1990s, most African countries in the franc zone saw their
adopted a 50% devaluation of the CFA franc on 11 January 1994. The main objective is
of tradable goods relative to that of non-tradable goods, improves the trade
balance and encourages a reallocation of resources towards the exportable goods
sector.
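The relative-price mechanism behind the devaluation can be sketched with a little arithmetic. The Python snippet below is an illustration only: it assumes the parity moved from 50 to 100 CFA francs per French franc (a figure not stated in the text above), full pass-through to the CFA price of tradables, and an unchanged CFA price for non-tradables. The world price and the non-tradable price are hypothetical.

```python
def rate_after_devaluation(cfa_per_ff: float, devaluation_pct: float) -> float:
    """CFA francs per French franc after the CFA franc loses
    devaluation_pct percent of its value in French francs."""
    if not 0.0 <= devaluation_pct < 100.0:
        raise ValueError("devaluation_pct must be in [0, 100)")
    return cfa_per_ff / (1.0 - devaluation_pct / 100.0)


def relative_price(world_price_ff: float, cfa_per_ff: float,
                   non_tradable_price_cfa: float) -> float:
    """Price of a tradable good relative to a non-tradable good, both in CFA."""
    return world_price_ff * cfa_per_ff / non_tradable_price_cfa


old_rate = 50.0                                  # assumed pre-devaluation parity
new_rate = rate_after_devaluation(old_rate, 50.0)

before = relative_price(10.0, old_rate, 500.0)   # hypothetical prices
after = relative_price(10.0, new_rate, 500.0)
print(new_rate, before, after)
```

Under these assumptions a 50% devaluation doubles the price of tradables relative to non-tradables, which is the incentive for shifting resources towards exportables described above.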
the size of the public sector. Since 1994, the following reform measures have been well
to export, as well as of reference prices at the customs cordon, reduction of tariffs
prior approvals required for redundancies made on economic grounds;
In terms of social measures, the 10% wage increase and the establishment of a
With respect to the growth objective, the effects of the devaluation can be analysed
support for the private sector, etc. Overall, the effects have been mixed and are not
very encouraging.
Three studies² conducted in 1996 on the response of small and medium-sized enterprises (SMEs) in
SMEs in Senegal use many imported inputs, which are difficult
to replace with local inputs. The devaluation therefore did not allow the
firms to move into import-competing sectors. Increased
competition has arisen from micro-enterprises and the informal sector, which
can produce goods at lower cost. Under these conditions,
firms operating in the local market have been forced to adjust downward both the
All export-oriented sectors have expanded strongly, especially
the fishing industry;
The sectors protected from foreign competition (electricity, water, energy), but which
are important suppliers to the external sector, have continued to weaken the efficiency
of the latter sector;
does not lead to job creation. Firms prefer to raise wages and
² The references for these studies are: 1) Les petites et moyennes industries après la dévaluation du franc CFA:
Conséquences, réactions et potentiels au Sénégal, by R. QUALMANN; R. FRACKMANN; T.
GANSLAMYR; B. GERHARDUS and B. SCHONEWALD, Etudes et rapports d'expertise 15/1996. Institut
Allemand de développement, Berlin, 1996.
2) L'offre des entreprises manufacturières deux ans après la dévaluation du franc CFA: le cas du Sénégal, by
G. COLLANGE, Département des Politiques et des Etudes, Division de l'ajustement et de la macro-économie
(CFD), January 1996.
3) Impact de la dévaluation sur le secteur productif. Rapport provisoire. Ministère de l'Economie, des
Finances et du Plan. Unité de Politique Economique, Dakar, March 1996.
there was therefore a fall in the under-utilisation of production capacity, which
since 1994. Both the government and donors consider that the effects
of the devaluation and the reforms have been positive (Harold, 1995; World Bank, 1997).
However, account must be taken of factors exogenous to this policy which may
Gross domestic investment / GDP 12.6 13.1 13.7 15.6 16.3 16.7
Sources: World Bank, World Tables, 1992, 1995; African Development Indicators, 1997,
1998/1999
GDP was positive in 1994 and went from 4.8% in 1995 to 5.6% in 1996, then to 4.7% in 1997,
which contrasts with the stagnation of the early 1990s. The budget deficit and the
current account deficit (excluding grants) went respectively from 5.7% and 9.3% of GDP in 1994 to
2% and 7.2% of GDP in 1996, then to 1.3% and 6.1% in 1997. Gross domestic saving
rose from 7.4% of GDP in 1994 to 10.9% in 1996. After declining by 3.7% a year in the
Macroeconomic policies have thus had mixed results in terms of growth.
genuine take-off in terms of growth. Although climatic hazards may have
played a part, the infrastructure advantages the country enjoyed from the outset
(independence) appear to have been poorly exploited. This absence of a genuine
take-off should therefore rather be linked to errors in the implementation of the
various economic policies adopted in the country. But the new orientations
consultative group of donors, a new strategy based on accelerating
growth, whose objectives are: 1) to balance the State's financial operations
from 1997; 2) to reduce the current account deficit from 8.3% of GDP in
1994 to 6.8% in 2000; 3) to control inflation so as to keep it at levels
To achieve these objectives, it was planned to focus on the following four axes: 1) The
Within this framework, several measures were implemented that did not fail to stimulate
the economy: fiscal and budgetary measures, measures aimed at sound
Although growth has increased since 1994, the target growth rate of 6%
has not yet been reached (the maximum of 5.6% was reached in 1996). Achievements in
terms of investment are encouraging (15.2% of GDP in 1995 and 17% in 1997), but
political and social stability that characterises most of the country (apart from the region of
(at least within sub-Saharan Africa), Senegal can attract substantial
for these opportunities to be properly exploited, adequate policy measures are needed.
It is in this sense that the rest of this work is devoted to identifying the factors that
best explain movements in the growth rate of aggregate output per capita of
will be approached from a theoretical standpoint through the results of a few studies
CHAPTER TWO:
FACTORS OF ECONOMIC GROWTH:
A REVIEW OF THE LITERATURE
This review is structured around four aspects. The first concerns the evolution of
endogenous. The second deals with the concept of human capital and its role in economic
growth. After presenting the links between trade strategy and growth
According to PERROUX, F., economic growth is "the sustained increase, over one
or several long periods, of a dimensional indicator: for a nation, aggregate
output in real terms". To speak of economic growth, the
quantity of material goods and services produced in the economy must rise over a
long period. Moreover, if the distribution of the income created is not too unequal,
capture these qualitative changes. As a result, more recent definitions of growth
(1973), cited by TERLECKYJ (1984), holds that "modern economic growth
reflects a permanent capacity to offer a growing population an
increased quantity of goods and services per inhabitant". More broadly, TERLECKYJ (1984) defines
growth so as to cover cases where it is negative or positive on the one
hand, and cases where it concerns aggregate output or output per inhabitant on the other: "One
may legitimately describe as economic growth a capacity to sustain numbers
real output per capita of the economy over a long period so as to improve, if
Economic growth, thus defined, is a long-run process whose purpose is
more to the improvement of living standards than all the analyses of
short-run macroeconomic policies have done. In this sense, economists have turned their
attention to the characteristics of the growth process. For Kaldor (1963), cited by
Barro and Sala-i-Martin (1996), economic growth is characterised by six main
physical capital per capita grows over time; 3) the rate of return on capital is
modern economic growth; he stresses the rapid rate of transformations
written in 1928. The optimality conditions introduced by RAMSEY are much
or business cycle theory. Between RAMSEY and the late 1940s,
HARROD (1939) and DOMAR (1946) tried to reconcile Keynesian analysis with
certain elements of economic growth. To do so, they use production functions
was inherently unstable. Subsequently, the work of SOLOW (1956) and SWAN (1956)
developed a production function of neoclassical form. Their production
function postulates that returns to scale are constant, that returns
to each factor of production are diminishing, and that the elasticity of
real GDP per capita is low relative to its long-run or steady-state position, the faster the
growth rate. This notion, which follows from the assumption of diminishing
returns to capital, explains a large part of the differences in
economic growth rates between certain countries or regions. Indeed, economies
close to their long-run position grow more slowly than those that are further
technology, per capita growth eventually stops, which is linked to the diminishing
marginal return on capital. This result is, however, contradicted by empirical
observations of positive per capita growth rates with no clear downward trend
The work was then continued by other neoclassical theorists under the postulate
of exogenous technical progress. The idea of exogenous technical progress assumes that technical
progress does not result from an economic activity and that its level cannot be
determined within the economic sphere. The new models obtained then show that
the per capita growth rate is determined by the rate of technical progress and by the
population growth rate, both exogenous to the models. Under these conditions, the
Shortly afterwards, the models of ARROW (1962) and SHESHINSKI (1967) introduced the ideas
doing). In these models, each individual's discoveries spread immediately
throughout the whole economy by a process of diffusion. But work on the
effects of the diffusion of ideas in the economy did not develop much further.
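The convergence property attributed earlier in this section to the Solow-Swan model can be illustrated numerically. The Python sketch below is a minimal simulation under stated assumptions, not the specification used in the works cited: it assumes a Cobb-Douglas form y = k^alpha, no technical progress, and hypothetical parameter values (the saving rate s, depreciation delta and alpha are invented; n is set near the population growth figure quoted in Chapter 1).

```python
def steady_state_capital(s: float, n: float, delta: float, alpha: float) -> float:
    """Capital per capita at which s*k**alpha equals (n + delta)*k."""
    return (s / (n + delta)) ** (1.0 / (1.0 - alpha))


def output_growth_path(k0: float, s: float = 0.2, n: float = 0.028,
                       delta: float = 0.05, alpha: float = 0.33,
                       periods: int = 50) -> list:
    """Growth rates of output per capita along the transition path."""
    k = k0
    rates = []
    for _ in range(periods):
        y = k ** alpha
        # Capital accumulation: saving minus dilution and depreciation.
        k_next = k + s * y - (n + delta) * k
        rates.append(k_next ** alpha / y - 1.0)
        k = k_next
    return rates


k_star = steady_state_capital(0.2, 0.028, 0.05, 0.33)
far_from_steady_state = output_growth_path(0.1 * k_star)
near_steady_state = output_growth_path(0.8 * k_star)

print(f"initial growth, far economy:  {far_from_steady_state[0]:.4f}")
print(f"initial growth, near economy: {near_steady_state[0]:.4f}")
print(f"far economy after 50 periods: {far_from_steady_state[-1]:.4f}")
```

In this sketch the economy starting further below its steady state grows faster at first, and growth in both economies fades as capital approaches the steady-state level, which is the result the text says is contradicted by observed long-run growth.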
It was only from the mid-1980s that research on economic growth
took off again thanks to the work of Romer (1986) and Lucas (1988). These
short run. But a model then had to be built in which long-run per capita growth
is no longer tied to exogenous variables as in the neoclassical models, but
is explained by variables internal to the economic model. Their models are
therefore called endogenous growth models. In these models, growth can
continue indefinitely because the return on investment in a
category of capital goods (including human capital) does not diminish as
external effects of human capital form part of the growth process by counteracting
increasing returns to capital, but the economy then risks experiencing explosive growth,
a phenomenon whose effects can be very unfavourable to development. Also, the
Development (R&D). There is then no risk of ideas running out, and the economy's
growth rate can remain positive in the long run. It is then that the
run; these models thus tend to rehabilitate the role of the State in economic activity.
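The claim that growth can continue indefinitely when returns to capital do not diminish can be illustrated with the simplest such specification, often called the AK model. The text above does not name this model, so it is used here only as a stand-in; the parameter values are hypothetical.

```python
def ak_growth_rate(s: float, A: float, n: float, delta: float) -> float:
    """Growth rate of capital per capita when output is y = A*k:
    g = s*A - (n + delta), independent of the level of k."""
    return s * A - (n + delta)


def ak_capital_path(k0: float, s: float, A: float, n: float, delta: float,
                    periods: int) -> list:
    """Capital per capita over time with constant returns to capital."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append(k + s * A * k - (n + delta) * k)
    return path


# Hypothetical parameter values chosen only for illustration.
params = dict(s=0.2, A=0.5, n=0.028, delta=0.05)
g = ak_growth_rate(**params)
path = ak_capital_path(1.0, periods=30, **params)
growth_rates = [b / a - 1.0 for a, b in zip(path, path[1:])]

print(f"g = {g:.3f}")
print(f"first and last period growth: {growth_rates[0]:.3f}, {growth_rates[-1]:.3f}")
```

Unlike the neoclassical case, the growth rate here does not fall as capital accumulates, and it responds permanently to the saving rate, which is one reason such models give policy a role in long-run growth.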
fact that the return on investment in capital goods does not
necessarily diminish as the economy develops. The second model of the
new wave was developed by LUCAS, R. in 1988 and emphasises the accumulation of human
capital by individuals. Along the same lines, G. Becker et al. (1990) take up
consider that economies have an interest in limiting population growth in order to
guarantee it a better level of human capital and thus be able to sustain a process of
of inputs. The new inputs, produced with increasing returns, allow
the economy. In addition, in his work of 1990 and 1991, R. BARRO
public infrastructure in the growth process. In these models, public
goods improve the productivity of private agents and enhance the growth
process.
technological diffusion, which postulate that through imitation less developed countries can use
the discoveries of more advanced countries, which is less costly than technological
innovation.
In short, theories of endogenous growth break with the theories
growth as a self-sustaining process that can be influenced by returns that are
constant and by various types of externality such as technological innovation, goods
According to the economic encyclopaedia, human capital is the stock of human capacities
human beings. This definition underlines the importance of human capital as a factor
professional". Although these definitions favour experience gained from education in
the formation of human capital, it should be noted that the latter also results from other types
and fosters these other types of investment, which makes it central to the accumulation of
stems from the idea that the volume of output created by an economic unit depends
transformed (BEHRMAN, J.R. and TAUBMAN, P.J., 1984). It is then that the trilogy of
economic analysis to decisions relating to investment in human beings. In fact, it is
since 1960 that THEODORE SCHULTZ, GARY BECKER and JACOB MINCER
introduced the idea that people invest in themselves to increase their stock of
human capital (BEHRMAN, J.R. and TAUBMAN, P.J., 1984). This idea was received
with interest by economists and a large number of studies were devoted to it
(PSACHAROPOULOS, 1988). It was only in the second half of the 1980s that
work on the role of human capital resumed, together with the revival of
physical capital will not reach its full potential unless there has been investment in the people who are
Empirical results on the role of human capital in growth across
various countries do not allow a conclusion on the nature of its effect. This effect varies with the
country's level of development or with the economic policies implemented
there.
LAU, JAMISON and LOUAT (1991) estimated a Cobb-Douglas production function
using first differences of the logarithms of the variables, for reasons of
stationarity. The study covers a panel of 58 countries in 5 regions of the world
growth. It emerges that human capital has a negative effect in Africa and the Middle
East and a non-significant effect in South Asia and Latin America. Only in
East Asia does education have a positive and significant impact. These results may be the
reflection of the fact that the effect of human capital on growth may depend on the level of
Bank, 1995).
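The estimation strategy reported above for LAU, JAMISON and LOUAT (1991), a Cobb-Douglas function estimated on first differences of logarithms, can be sketched as follows. This is not their specification or data: the code uses a single synthetic time series rather than a 58-country panel, a hypothetical human capital index entering in logs, invented elasticities (0.4, 0.5, 0.1), and plain ordinary least squares written with the standard library only.

```python
import math
import random


def first_log_differences(series):
    """ln(x_t) - ln(x_{t-1}), the transformation used for stationarity."""
    logs = [math.log(x) for x in series]
    return [b - a for a, b in zip(logs, logs[1:])]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares through the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def estimate_growth_regression(output, capital, labour, human_capital):
    """Regress output growth on the growth of capital, labour and human capital."""
    dy = first_log_differences(output)
    regressors = zip(first_log_differences(capital),
                     first_log_differences(labour),
                     first_log_differences(human_capital))
    rows = [[1.0, dk, dl, dh] for dk, dl, dh in regressors]
    const, b_k, b_l, b_h = ols(rows, dy)
    return {"const": const, "capital": b_k, "labour": b_l, "human_capital": b_h}


def synthetic_economy(periods=200, seed=0):
    """Hypothetical data generated from Y = A * K**0.4 * L**0.5 * H**0.1."""
    rng = random.Random(seed)
    k = l = h = 1.0
    output, capital, labour, human = [], [], [], []
    for t in range(periods):
        k *= math.exp(rng.gauss(0.03, 0.02))
        l *= math.exp(rng.gauss(0.02, 0.02))
        h *= math.exp(rng.gauss(0.01, 0.02))
        a = math.exp(0.01 * t + rng.gauss(0.0, 0.001))
        capital.append(k)
        labour.append(l)
        human.append(h)
        output.append(a * k ** 0.4 * l ** 0.5 * h ** 0.1)
    return output, capital, labour, human


estimates = estimate_growth_regression(*synthetic_economy())
print({name: round(value, 3) for name, value in estimates.items()})
```

On real data the sign and significance of the human capital coefficient is exactly what the studies surveyed here disagree about; the sketch only shows the mechanics of the log-difference regression, with the constant term picking up average productivity growth.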
BARRO ( 1991) a regressé les revenus par tête des pays de son échantillon d'étude sur un
comme variable mesurant le capital humain. Ses estimations ont montré que le niveau
PYO (1995) carries out an empirical estimation using time-series data for the United States
and the Republic of Korea. Human capital is captured by total expenditure invested in human
capital formation, in the form of subsidies or education spending. Although the effect of
human capital on growth is positive and significant for both countries, the author notes
that in the case of Korea, as in that of developing countries, human capital plays rather a
role [...]
PRITCHETT (1996) analysed growth factors using panel data covering 91 countries. His
results show that human capital accumulation, measured with education data, has a large
negative and [...]
would have prevented skilled labour from working in the activities that promote growth.
BERTHÉLEMY et al. (1997) contributed to the analysis of the role of human capital in growth
using panel data for 83 countries and six 5-year periods, from 1960-1965 to 1985-1990.
Their study is motivated by the observation that no econometric validation based on panel
data had yet been carried out for the hypothesis that human capital contributes to growth.
In a first estimation of the augmented Solow model, these authors obtained a negative
effect of human capital on growth. Then, introducing an explanatory variable that makes [...]
representing human capital. Referring to the analysis of Gould and Ruffin (1995), they
note that the trade regime influences an economy's capacity to mobilise its [...]
between human capital and trade regime. Ultimately, their study finds that human capital
can exert a positive effect on growth, but that this effect depends on the economy's
capacity to channel its human resources into activities that generate [...]
SACERDOTI et al. (1998) start from the results of certain studies showing that the positive
relationship between school enrolment rates and growth should not [...]
being very weakly correlated with human capital accumulation. The aim of their study is to
identify the factors that influence economic growth in 9 West African countries and to
compute data series on human capital accumulation for each of these countries. Using a
growth-accounting methodology, they find that an increase in physical capital, especially [...]
wonder how the high returns from a higher level of education could have only a weak or
even a negative impact on the growth of output per worker. They pursued their analysis by
building models that take into account country-specific factors, such as exogenous shock
or economic policy variables. They identify the terms of trade, the degree of openness [...]
total as the main components of the specific effects. They therefore conclude that, to
have a significant impact on growth, education should be [...]
[... social]. They recommend that economic policies should then aim at creating [...]
RAMON, L. et al. (1998) pointed out that no country has experienced sustained development
without having truly invested in human capital. Yet the facts have also shown that some
countries adopted good education policies without subsequently recording good economic
results. Faced with this contrast between empirical facts and theoretical results on the
role of human capital in growth, their study tried to answer the question of when and how
education can generate notable effects in the economy. The study brought out two
explanatory factors: the distribution of education and the economic policies implemented.
Thus, using panel data on a set of 12 countries in Asia and Latin America over the period
1970-1994, they investigated the links between education, [...]
First, a very unequal distribution of education among workers tends to have a negative
impact on per capita income in most countries. When one uses a [...]
positive and significant, whereas if the distribution is not taken into account, the
effect is non-significant or even negative for some countries. Moreover, the effect of
reforms on the impact of education on growth is captured in the model by the coefficient [...]
economic-reform dummy, which takes the value 1 for years in which reforms are implemented
and 0 for years without any specific measure. Next, the results show that economic
policies that suppress market forces [...]
human [...] can have only a weak effect on growth unless education is [...]
can also help improve the quality of the effects of education; it can induce a [...]
VERNER (1999) used a data set on workers on the one hand and on their respective firms on
the other to estimate a production function and wage equations for Ghanaian firms. She
used data from [...]
which covers a sample of 215 manufacturing firms, from micro-enterprises to the largest
firms. The model used to explain wages and productivity is given [...]
productivity (v).
The explanatory variables F are firm characteristics; they are factors of [...]
measuring the marginal impacts of the explanatory variables (I and F respectively) on
wages and productivity.
This approach allowed her not only to measure the marginal impact of different
characteristics (of both workers and the firm) on wages, but also to [...]
Women are paid less than men in firms, without this [...]
The more training and education workers have, the higher their [...]
Productivity differences are distinguished for five levels of education. Productivity
gaps are larger than wage gaps across these education levels, which shows that wages are
not strictly indexed to productivity.
[...] internal, generates higher wages without having a notable impact on productivity.
She concludes that even in the short run investment in human capital improves
productivity.
NGUYEN and SCHWAB (1999) empirically tested the role of human capital in [...]
adds to these two variables another one representing human capital, measured by the number
of the country's workers who completed the first and second cycles of college. The models
are specified in log-linear form and were estimated for four newly emerging Asian
countries: Indonesia, Malaysia, the Philippines and Thailand; the analyses were also [...]
The estimation results for the first model reveal the positive role of physical capital
and [...]
Thailand is the exception, with a negative but non-significant coefficient for the [...]
determination $R^2$. Human capital has positive coefficients in most cases, which implies
that an increase in the total number of people who have completed studies of [...]
coefficients are not significant, which calls into question the effect just mentioned.
[...] during working life could be behind the non-significant coefficients obtained for
human capital in explaining growth in developing countries. They observe that learning by
doing is predominant and that estimates of the role of human capital should take this into
account. In the case of Thailand, the coefficient of human capital is negative and
significant at 10%. This may be explained by the fact that, in countries where the stock
of human capital is relatively low (as in Thailand), there may be high fixed costs in the [...]
acquisition of education. Also, in underdeveloped countries, educated workers [...]
Overall, this work has made it possible to identify the conditions under which human
capital can be used to contribute effectively to economic growth. But since the economy
operates in a global context, trade policies can also [...]
Since the early 1980s, a new interventionist approach linked to trade [...]
coexistence of only a very small number of firms that earn profits above the opportunity
cost of capital because they have enough power to set prices. A relatively large country
can then secure a large share of this [...]
a certain period to allow it to achieve economies of scale (infant-industry theory)
(Krugman, P., 1996). This argument seems unrealistic in [...] countries [...]
development, states that certain sectors of activity generate [...] effects
[...] external [...] cannot occur through the free play of local market mechanisms alone [...]
local [...] of technical know-how will be sub-optimal. Under these conditions, some tariff
protection and export subsidies become beneficial measures for the nation's welfare. In
fact, the idea behind this second argument suggests that activities [...]
But for developing countries, what type of trade strategy should [...]
is called for. What, then, can be expected from the liberalisation of an economy for its
growth [...]
on productivity. He points out that total factor productivity growth is a much more
important growth factor in developed countries than [...]
in the first group of countries against less than a third in the second. The role of
productivity in explaining growth is thus smaller for developing countries. The question
of trade strategy having thus been clearly posed, without the author providing any
elements of an answer, LAHOUEL (1996) examined, somewhat more [...]
trade liberalisation as a means of improving productivity are at least four in number:
[...] reduction of their costs. In this case, import liberalisation can encourage them to
improve their productivity in order to keep their market share, their products being [...]
installed production capacity, especially when the local market is very limited, and to [...]
productivity of local factors. This argument seems quite tenable even if, for some authors
such as RODRICK (1992), the bottlenecks mentioned result from a [...]
3) The third argument takes into account the economies of scale that will benefit firms
producing goods in which the country has comparative advantages that will be exploited
with liberalisation. This argument holds especially in countries where the [...]
4) The fourth reason concerns the circulation of ideas, goods and new management methods
that takes place through openness to the external market. This argument about the new
ideas likely to accompany external trade was spelled out by [...]
ideas will be accepted more quickly in societies where people are accustomed to
change... A country that is isolated is, by contrast, less inclined to absorb new ideas
quickly . . ." (LAHOUEL, 1996). In fact, apart from flows of [...]
Moreover, within the empirical work on the importance of the [...]
relationship between the growth rate of GDP and that of exports. He studied the specific
case of the African countries of the franc zone using a cross-sectional analysis, then a
time-series analysis over the period 1962-1979. The results proved disappointing, since a
significant positive correlation is observed in only three cases out of twelve: Côte
d'Ivoire, Senegal and Cameroon, which are the most developed countries in the zone. The
author then makes three remarks related to the characteristics of [...]
needed to obtain a given rate of economic growth is all the lower as the country's export
rate, that is, its degree of external openness, is [...]
economic [...] seem to depend on the structure of exports; these effects are larger the
more concentrated the structure of exports (strong homogeneity of exports). Third, the
effect of export growth is all the more [...]
conclusion that outward-oriented firms experience larger productivity gains than those
working for the local market. In his study, the [...]
[...] built a model that incorporates the externalities of liberalisation into computable
general equilibrium models. The study covered South Korea and considered two types of
externalities, one associated with the expansion of exports, the other with the expansion
of imports of capital goods. It finds that orientation towards external markets raises
factor productivity in the light and heavy manufacturing sectors, but does not directly
affect the two other sectors, agriculture and services. One can thus understand that the
beneficial effects linked to [...]
capital goods. It should be noted that in the De Melo and Robinson model, the mechanisms
through which trade liberalisation generates externalities are not spelled out clearly.
FOSU, A.K. (1990) focused on the specific case of African countries because, in his view,
these countries have more similarities among themselves than with other developing
countries, both [...]
exports on the average annual GDP growth rate of 28 African countries over the [...]
and exports. His results showed that export growth has an impact [...]
growth depends both on countries' level of development and on the structure of their
exports. Using a cross-sectional analysis of all the least developed countries for which
data are available, he found that economic growth is strongly correlated with the share of
manufactured products in total exports, [...]
non-oil-exporting developing [...] over the periods 1970-1981 and 1973-1985. The [...]
of factors is influenced both by export growth and by export instability, whose effects
are assumed to depend on the policy of external openness. The model estimates confirmed
the hypothesis of a positive effect of the growth of [...]
[...] growth differences among the developing countries in the sample. For the author,
"The policy of openness appears to exert a favourable influence on growth in three main
ways: it acts through investment rates (but this effect is barely perceptible in the
sample considered); next, it raises export growth [...]
given country benefits from imports of intermediate goods embodying new technologies, this
variable being taken as a proxy for foreign investment in the [...]
eight OECD countries (Sweden and the G7 countries) between 1970 and 1991 to estimate a
model establishing the links between trade and growth. His results show [...]
country that carried out the R&D activities. That is, the quality of new technology varies
from one country to another. Second, the effect of a country's domestic R&D is larger than
that of the R&D carried out on average by a foreign country, which is explained by the
fact that local technological achievements are better suited to local needs and skills.
However, the author cautions that certain alternative mechanisms, such as foreign direct
investment, should be taken into account when [...]
technology. Third, the structure of a country's imports does not significantly affect the
degree to which it benefits from foreign R&D. Among other reasons for this [...]
general externality [...] arising from foreign R&D investment. This effect would not be
linked to international trade but would be transmitted through other mechanisms such as [...]
up to 20%.
[...] international, the emphasis is no longer on the static effects of liberalisation,
which are effects of resource reallocation among factors. Preference is given instead to
the [...] effects [...]
technology and, more generally, effects linked to productivity gains.
[...] of the political authorities to manage the economy efficiently. Thus, by
facilitating decisions [...]
adequate [...] of relative prices are necessary conditions for a stable environment. In [...]
of the inflation rate, of variations in the real exchange rate, of the weight of the
budget deficit and [...]
[...] of the external sector. By contrast, faced with the oil boom, Nigeria waited until
the devaluation imposed by a later crisis. These differences in reaction to economic
shocks led Indonesia, which was poorer than Nigeria in 1960, to surpass Nigeria by the
early 1980s in terms of GDP per capita and export structure.
Moreover, Côte d'Ivoire and Malaysia are both endowed with rich agricultural land and
minerals. Between 1961 and 1970, Côte d'Ivoire's GDP grew at a rate of about 12.4% per
year, well above that of the newly industrialised economies of Asia. Between 1965 and the
late 1970s, the per capita GDP of the two countries grew at the same rate. But since the
late 1970s, while Malaysia has continued to grow, per capita GDP has fallen in Côte
d'Ivoire. This fall was caused by [...]
a monetary union). A real exchange rate appreciation in Malaysia in the early 1980s caused
a temporary decline that was corrected within a few years.
[...] subsidies.
In the third group, the policies adopted in Thailand created an environment [...]
in Tanzania, by contrast, private investment was discouraged by direct controls. Thailand
maintained a stable exchange rate policy, with only a gradual depreciation of its currency
against the dollar of about 15% towards the end of the 1980s. Ghana and Tanzania, however,
experienced large swings in their exchange rates: in the first half of the 1980s their
currencies appreciated by more than 100%, and at the end of the same decade exchange rates
were [...]
[...] of external debt relative to gross domestic product. According to FISCHER (1993), [...]
The impact of inflation on growth is often poorly captured in models [...]
lower the real interest rate and then causes a portfolio adjustment from real monetary
assets to physical capital assets. This induces an increase in the volume of investment
and hence higher growth. However, in the case of [...] countries [...]
consequently, high expected inflation should reduce private investment and therefore [...]
1961 and 1993, Africa recorded a very low growth rate of GDP per capita (0.3%) compared
with that of East Asia (7.4%) or South Asia (4.3%). They then looked for the causes of
Africa's slow economic growth by comparing Asian and African economies. In order to
control for initial conditions, they chose three groups of countries from South Asia and
sub-Saharan Africa [...]
third [...]
possibility of reducing the adverse effects of the German crisis by adopting a restrictive
fiscal policy and a prudent monetary policy. Also, rapid exchange rate adjustments in
response to changes in the price of oil made it possible to avoid the crisis [...]
These results show that a stable macroeconomic policy is an essential condition for
growth. In this sense, it is important to keep inflation low by restraining public
spending and adopting a prudent monetary policy. In addition, exchange rate policy must
allow some flexibility [...]
SACHS and WARNER (1996) estimated a convergence model in which the growth rate is affected
by the gap between the equilibrium level of income and its current level. It emerges that
the lower the current level of income relative to equilibrium income, the higher the
growth rate. Moreover, a country's equilibrium income level is influenced by economic
policy variables and by [...]
The results also confirm the convergence hypothesis if one controls for these two [...]
countries requires, beyond understanding the link between growth and policies [...]
BURNSIDE, C. and DOLLAR, D. (1997) examined the relationship between foreign aid, the
policies adopted by governments and the growth of GNP per capita. Their study covered a
sample of 56 developing countries observed over six four-year periods (1970-1993). The
results show that the policies with the greatest influence on economic growth are those
concerning fiscal matters, inflation and external openness. Aid turns out to have a
positive impact on growth in countries [...]
However, aid has no significant influence on these policies.
TAKATOSHO ITO (1997) stresses that the experiences of Japan and other Asian countries [...]
economic growth is a dynamic process comprising several stages. In a first stage, when
conditions are normal, the economy begins to take off from a stagnant state, then
accelerates growth to a double-digit rate. This [...]
manufacturing, first simple, then sophisticated. In a second stage, the growth rate slows,
the share of manufactured products in GDP having reached its plateau and technology being
at the frontier. The author points out that convergence models capture the second stage
well and that greater attention should be paid to the [...]
[...] of social and political stability seem to be particularly [...] conditions
For ALBERTO ALESINA (1997), institutional quality - measured by the efficiency [...]
the law - is important for growth. Political stability and civil and economic liberties
are thus essential. In countries where institutions are weak, the [...]
vicious [...]. Moreover, in these countries, public consumption does not improve the
indicators [...]
besides, given that foreign aid serves mainly to increase consumption [...]
plan of technical and financial assistance to countries that do not meet the [...]
conditions [...] institutional.
Ultimately, from the work discussed in this overview of the literature, one understands [...]
concern for individual welfare. Also, human capital, defined as the stock of [...]
externalities on certain productive sectors of the economy, which allows it to influence
the growth rate. In addition, the effect of human capital on growth depends on the [...]
developed abroad favour, in developing countries, the emergence of [...]
education among individuals is another factor that can, where inequality is high, inhibit
the role of human capital (RAMON et al., 1998). Finally, it should be stressed that
macroeconomic and political stability is necessary to encourage the private sector to
invest [...]
Theoretical models include only a very limited number of factors in the [...]
explained by these to technical progress A. This greatly limits the scope for action to
promote economic growth. It is therefore appropriate to broaden the framework defined by
the theoretical models so as to have enough variables that can be manipulated to promote
growth. Also, the approach based solely on the direct factors of production does not allow
policies for increasing these factors to be derived; it only indicates how the available
factors are used. Yet increased accumulation [...]
Moreover, although several relevant institutional variables are highlighted in the
empirical work discussed, they are not the same for all authors. But to capture the effect
of one institutional variable as clearly as possible, it is necessary to control for all
the others that are likely to influence growth. For example, the effect of trade
distortions on growth can only be properly [...]
Since the purpose of this study is to identify, in the economic context of Senegal, the
relevant factors explaining variations in economic growth over time, we can now build a
methodological framework that makes it possible to identify the relevant variables and to
measure their net effect as accurately as possible. Our model will seek to take into
account, in a production function, the effects of [...] capital [...]
CHAPTER THREE
EMPIRICAL STUDY OF GROWTH FACTORS:
METHODOLOGY AND RESULTS
Drawing on the lessons of the literature review on growth, we can define an analytical
framework for investigating the factors of economic growth in Senegal. We first specify a
model explaining variations in the annual growth rate of gross domestic product (GDP) per
capita; we then present the econometric technique used to estimate the model.
To build the econometric model for identifying the main factors that explain variations in
the GDP growth rate of the Senegalese economy in the [...]
account human capital, then other factors that influence the growth rate [...]
This form of the function assumes that technology augments output (neutrality of technical
progress in the Hicks sense). Taking logarithms of both sides and [...]
aggregate production:
Multiply and divide the first element in parentheses by $K$ and the second element in [...]
competitive. Indeed, in such a context, the marginal product of each factor equals its
price, so $AF_K$ equals the price of capital $r$ and $AF_L$ equals the wage rate $w$. It
follows that the term $(AF_K \cdot K/Y)$ represents the share of national income that pays
for invested capital and $(AF_L \cdot L/Y)$ represents the share of national income going
to wages.
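
The decomposition described above can be sketched as follows. This is a reconstruction under the assumption that the production function has the Hicks-neutral form $Y = A\,F(K, L)$ implied by the text; the notation $F_K$, $F_L$ for the partial derivatives and $\Delta$ for changes over time is chosen here for illustration and may differ from the author's.

```latex
\begin{align*}
Y &= A\,F(K, L) \\[4pt]
\frac{\Delta Y}{Y} &= \frac{\Delta A}{A}
   + \frac{A F_K\,\Delta K}{Y} + \frac{A F_L\,\Delta L}{Y} \\[4pt]
 &= \frac{\Delta A}{A}
   + \left(\frac{A F_K\,K}{Y}\right)\frac{\Delta K}{K}
   + \left(\frac{A F_L\,L}{Y}\right)\frac{\Delta L}{L}
\end{align*}
```

With competitive factor markets, $AF_K = r$ and $AF_L = w$, so the two weights are the capital share $rK/Y$ and the labour share $wL/Y$ of national income.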
This equation assumes that variations in the growth rate of aggregate output are explained
by those of the growth rates of the two factors of production. The part of the [...]
$g_t = c + a\,k_t + b\,l_t + u_t$ (1)
where $g$ is the GDP growth rate, $c$ is the constant term, $k$ and $l$ the growth rates [...]
$(\Delta A_t / A_t)$ by $c + u_t$. In this model, total factor productivity (TFP) is
assumed to be exogenous in Solow's sense.
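
As a rough illustration of equation (1), TFP growth can be backed out as the part of output growth not accounted for by factor growth. The sketch below assumes the coefficients a and b are already known; the function name `solow_residual` and all numbers are hypothetical and are not taken from the study.

```python
def solow_residual(g, k, l, a, b):
    """Return TFP growth per period: output growth minus weighted factor growth.

    g, k, l are sequences of growth rates (output, capital, labour);
    a and b are the capital and labour coefficients of equation (1).
    """
    return [g_t - a * k_t - b * l_t for g_t, k_t, l_t in zip(g, k, l)]


# Made-up growth rates for two periods (illustrative only).
g = [0.05, 0.02]
k = [0.04, 0.03]
l = [0.03, 0.02]

residual = solow_residual(g, k, l, a=0.5, b=0.5)
print(residual)
```

In the exogenous-TFP reading of equation (1), this residual corresponds to the term c + u_t.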
However, to take into account its endogenous character, highlighted by endogenous growth
theories, we consider that TFP is determined by various factors, including human capital.
A model that isolates the effect of human capital is therefore a better approach to
identifying growth factors; such a model indeed reduces the [...]
modify this model to take into account the factors that operate through [...]
of effective labour (AL); V is a vector of variables that influence Total Factor
Productivity [...]
Let: $k = K / AL$ ; $h = H / AL$ ; $y = Y / AL$ and $v = V / AL$
TXPIBT is the annual growth rate of Gross Domestic Product per worker;
[...] worker;
V is a vector of factors that can influence TFP apart from human capital.
Although model (2) is a better approach than the Solow model, it does not take into
account the influence of the economic policy implemented in the country [...]
unstable [...] can lead to inefficient use of factors or under-utilisation of [...]
This is the human capital model (model 2) augmented by the introduction of variables
related to [...]
the external sector (export performance, to capture trade distortions) and to climatic
hazards. These variables are introduced because of the importance attributed to them both
in endogenous growth theories and in the empirical work examined in the literature review.
We assume that these variables enter [...]
EXPORT is the variable representing external openness;
SECHER is a dummy variable intended to capture the possible effect of climate on growth in
years marked by particularly severe drought. It should be noted here that introducing
rainfall data did not give conclusive results, which led us to prefer this dummy variable,
which proves significant in our analyses. It [...]
The dependent variable of the model is the growth rate of GDP per worker. It equals the
annual relative change in the ratio GDP/labour force. GDP is measured in [...]
Labour
Labour is represented by the total labour force of Senegal. The series of rates [...]
Physical capital
[...] during a period with a view to increasing it constitutes the investment expenditure
for that period. Investment may be carried out by the public sector or by the private
sector.
We compute a physical capital series by the perpetual inventory method, which consists of
cumulating gross fixed investment figures and correcting the result by an estimate of the
depreciation of the existing stock. This method assumes that, for two dates t and t-1,
investment ($I_t$) and physical capital ($K_t$) are linked by a relation of the type:
This means that physical capital at date t ($K_t$) equals physical capital at date t- [...]
To generate the physical capital series from this iterative relation, we will take [...]
Analysis of the ratio of physical capital per head (capital intensity, K/L) will make it
possible to check whether, over time, each worker has more and more material capacity [...]
3 This value of the depreciation coefficient has already been used by Sacerdoti et al.
(1998) and by Berthelemy et al. 1996.
4 Sacerdoti et al. (1998) already used this initial relation for the year 1970 for Senegal
and other West African countries. Our going back to 1960 is justified by the fact that the
iterative relation could already be stationary from 1970 onward (fairly rapid convergence
of the iterative relation), which makes the results obtained for the period 1970-1997, the
focus of our study, more robust.
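
The perpetual inventory method described above can be sketched as follows. This is a minimal illustration that assumes the common textbook form K_t = (1 - delta) * K_{t-1} + I_t; the exact timing convention, the depreciation rate and the initial stock used in the study are not recoverable from the text, so `delta`, `k0` and the investment figures below are hypothetical.

```python
def perpetual_inventory(investment, k0, delta):
    """Build a capital stock series from an initial stock and investment flows.

    Assumed recursion (textbook form, not confirmed by the source):
        K_t = (1 - delta) * K_{t-1} + I_t
    Returns [K_0, K_1, ..., K_T].
    """
    stocks = [k0]
    for inv in investment:
        # Depreciate last period's stock, then add this period's investment.
        stocks.append((1.0 - delta) * stocks[-1] + inv)
    return stocks


# Made-up inputs: initial stock of 100, constant investment of 10, 5% depreciation.
capital = perpetual_inventory([10.0, 10.0, 10.0], k0=100.0, delta=0.05)
print(capital)
```

Because each step multiplies the inherited stock by (1 - delta), the influence of the initial value `k0` fades geometrically, which is the convergence property footnote 4 relies on when starting the recursion in 1960.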
Human capital
Defined as the set of skills and qualifications held by the workers of an economy, human
capital appears as a factor that improves the productivity of labour. It can therefore
explain part of the effects traditionally attributed [...]
notably variables that reflect the efforts made in the areas of health and education. But
the lack of long-run data on health expenditure or health care coverage forces us to
retain only education variables. We took three stock variables into account. Although flow
variables (for example the primary or total enrolment rate) capture the effect of the
efforts made each year to develop human resources, the data are not available over a long
period. Stock variables show the effect of all the efforts made up to that point. These
efforts have made it possible to accumulate the human capital whose stock is used in
economic activity. The variables are:
The average number of years of schooling per individual in the labour force (in the [...]
cycle [...]
of civil service wages on the level of education. The calculation of this index assumes
that the wage is a good estimate of the productivity of the human capital held by workers.
Productivity is assumed to increase with human capital.
This index was calculated for Senegal by Sacerdoti et al. (1998), who used the wage [...]
The external openness variable
External openness is captured by the weight of exports in the national economy. We proxy it
by the contribution of exports to GDP growth. For a given year, this contribution is defined
as the product of the annual growth rate of exports and the share of exports in GDP in the
previous year. Exports and GDP are valued at constant 1987 prices. Export activities are
expected to be positively related to economic growth
development of export activities leads to a better allocation of resources according to
comparative advantage, allows greater utilisation of the capacity of
foreign. This is likely to improve productivity and hence to raise total output
per worker.
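The definition of the export contribution given above can be sketched in a few lines. This is only an illustration of the stated formula (growth rate of exports in year t times the share of exports in GDP in year t-1); the function name `export_contribution` and the numbers are hypothetical and are not the thesis data.

```python
def export_contribution(exports, gdp):
    """Contribution of exports to GDP growth for each year after the first.

    contribution_t = (X_t - X_{t-1}) / X_{t-1}  *  X_{t-1} / Y_{t-1}
    Both series are assumed to be at constant prices and aligned by year.
    """
    contributions = []
    for t in range(1, len(exports)):
        growth = (exports[t] - exports[t - 1]) / exports[t - 1]  # export growth rate
        share_prev = exports[t - 1] / gdp[t - 1]                 # previous-year export share
        contributions.append(growth * share_prev)
    return contributions


# Hypothetical constant-price series (illustrative values only).
exports = [200.0, 220.0, 209.0]
gdp = [1000.0, 1050.0, 1100.0]
contrib = export_contribution(exports, gdp)
```

Note that the product simplifies to the change in exports divided by the previous year's GDP, which is why the variable can be read as the part of GDP growth accounted for by exports.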
Moreover, export performance reflects the extent of trade distortions. Heavy protection gives
protected sectors no incentive to improve their productivity, because competition is absent.
The result is weak competitiveness both on the market
monetary policy: public deficit including grants / GDP, M2 money supply / GDP net of the
economy's growth rate, and total debt / GDP. For each year, the index gives a measure of the
quality of economic policy compared with the policy implemented during the
other years of the 1970-1997 period. A good-quality macroeconomic policy is
In practice, for each of the three ratios, we ranked the 28 years of the study period
(1970-1997) according to the size of its values. After the ranking, the year with the lowest
rank (rank 1) is the one that recorded the largest value of the ratio considered; it is
therefore the year in which the economic policy underlying the ratio was the worst over the
whole period. Conversely, that policy was most favourable in the year with the highest rank
(rank 28). The rank variable derived from each ratio thus increases as the economic policy
tied to that ratio improves. This yields the rank variables
The macroeconomic quality index (QUALMACRO) is then defined, for each year, as the simple
arithmetic mean of its values for the three rank variables Rdéficit, Rmonnaie and Rdette
(note 5). By construction, this index increases with the quality of macroeconomic
policy.
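The construction of the index can be illustrated with a short sketch. It assumes, as described above, that rank 1 goes to the largest value of a ratio (worst policy) and the highest rank to the smallest value, and that the index is the simple mean of the three ranks. The function names and the four-year sample are hypothetical; ties are broken by order of appearance, which is an assumption the text does not address.

```python
def rank_descending(values):
    """Rank 1 for the largest value, rank n for the smallest.

    Ties are broken by order of appearance (an assumption made for this sketch).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def qualmacro(deficit_ratio, money_ratio, debt_ratio):
    """Simple arithmetic mean of the three rank variables, year by year."""
    r_deficit = rank_descending(deficit_ratio)
    r_money = rank_descending(money_ratio)
    r_debt = rank_descending(debt_ratio)
    return [(a + b + c) / 3 for a, b, c in zip(r_deficit, r_money, r_debt)]


# Hypothetical ratios (in % of GDP) for four years, not the thesis data.
deficit = [5.0, 3.0, 4.0, 1.0]
money = [30.0, 25.0, 28.0, 20.0]
debt = [60.0, 70.0, 50.0, 40.0]
index = qualmacro(deficit, money, debt)
```

In this sample the last year has the smallest value of all three ratios, so it receives the highest rank on each and therefore the highest index value.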
The aggregates used in this estimation are measured at constant 1987 prices.
Public consumption
Public consumption data include the State's current expenditure and
social, maintenance expenditure. They include neither military expenditure nor
5 Wacziarg (1998) used a similar approach to build a macroeconomic policy quality variable;
he sorted the years into decile classes for the three ratios and then defined the quality
variable as the mean of each year's ranks over the three ratios. Note, however, that the
inflation rate is preferable to the M2/GDP ratio and that the public deficit excluding grants
gives a better picture of the State's expenditure policy.
The climatic hazard variable
This is a dummy variable that takes the value 1 for the years that were marked by
Information on drought years was obtained from the department of
We preferred a dummy variable to a rainfall measure because we want to capture the effect of
the exogenous shocks that droughts represent. Moreover, introducing rainfall data did not
give conclusive results.
The analysis of the determinants of growth proceeds in several steps. At each step, we
introduce a new variable in order to test the relevance of various hypotheses of growth
theory in the case of Senegal. We thus estimate the following models:
This model looks for the factors that affect total factor productivity (PGF): human capital, openness
EXPORT: growth rate of exports weighted by their relative share in
GDP.
CAPHUM is the human capital variable; it is represented in turn by
the following three variables: HUMPRI: average number of years spent by a worker in
primary school; HUMTOT: average number of years spent by a worker in
1997. Before estimating the models, we check the basic assumptions
required for estimation by OLS. In particular, we carry out the test of
3.1.2.J. Ramsey RESET specification test
The Ramsey RESET specification test checks whether there are variables that we should have
included in the model. The idea of the test is that if the model omits a relevant variable,
it will be possible to introduce one (or more) artificial variables into the model; this
will be a variable which, apart from the explanatory variables
concludes that our specification is complete and takes into account all the variables
Another condition required for OLS estimation of a model using time
series is that each variable in the model be stationary. A stationary time series
is a series for which:
The covariance between its values at two dates t and t+k does not depend on t, but on the
factor changing over time. The point of the stationarity condition is that the effect
of a shock on a series with a trend or a time-dependent factor
(a non-stationary series) is transitory. The shock cannot significantly affect the trend,
and the series returns to its trend path. Under these conditions, it is difficult to identify
clearly the effect of another series on the variations of a non-stationary series. This is
what leads to spurious regressions for models containing
There are also hypothesis tests that allow more precise conclusions: these are the
unit root tests. These tests were first developed by Dickey and Fuller and were later
improved to give the Augmented Dickey-Fuller (ADF) tests. The test
Unit root tests test, for a given series $X_t$, the hypothesis H0 of a unit root (an
autoregressive coefficient equal to 1, that is $\rho = 0$ in the differenced forms below) in
any one of the following forms; accepting H0 means that the series is non-stationary:

\[ \Delta X_t = \rho X_{t-1} - \sum_{j=2}^{p} \phi_j \, \Delta X_{t-j+1} + \varepsilon_t \]

\[ \Delta X_t = \rho X_{t-1} - \sum_{j=2}^{p} \phi_j \, \Delta X_{t-j+1} + c + \varepsilon_t \]

\[ \Delta X_t = \rho X_{t-1} - \sum_{j=2}^{p} \phi_j \, \Delta X_{t-j+1} + bt + c + \varepsilon_t \]

(we take p = 4)
series $X_t$ is non-stationary. This is the case if the t-statistic associated with the coefficient $\rho$ is
greater than the critical value of the test at the chosen significance level. We use the
Phillips-Perron unit root test, which allows for possible heteroskedasticity in the errors of
the test regressions. In practice, the E-views software reports the empirical value of the
test statistic together with the theoretical values corresponding to the 5% and 1% levels.
If the absolute value of the empirical value exceeds the theoretical value for a given level
α, we conclude that the series is stationary with a probability of error equal to α;
otherwise, the series is non-stationary and the test is repeated on its first difference or its difference
When the variables of a model are not stationary, one looks for the order of differencing
that makes each of them stationary. If a variable X must be differenced d times to become
stationary, X is said to be integrated of order d. In that case, one checks whether the
variables have the same order of integration and seeks to specify an error-correction model
(short-run model); the model is then estimated by the cointegration
method.
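The decision rule described above can be sketched with the simplest Dickey-Fuller regression (no constant, no trend, no lagged differences). This is not the Phillips-Perron test used in the study and it does not reproduce the E-views output; it only shows how the t-statistic on ρ is obtained and compared with a critical value. The function names are hypothetical, and the default critical value of -1.95 is the approximate large-sample 5% Dickey-Fuller value for the no-constant case, supplied here as an assumption; small-sample tables should be used in practice.

```python
import math


def dickey_fuller_t(series):
    """OLS estimate of rho and its t-statistic in  dX_t = rho * X_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    residuals = [d - rho * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in residuals) / (len(diffs) - 1)  # one estimated parameter
    t_stat = rho / math.sqrt(sigma2 / sxx)
    return rho, t_stat


def looks_stationary(series, critical_value=-1.95):
    """Reject the unit root when the t-statistic is below the (negative) critical value."""
    _, t_stat = dickey_fuller_t(series)
    return t_stat < critical_value


# Hypothetical series: one oscillating around zero, one trending upward.
oscillating = [1.0, -0.8, 0.9, -0.7, 0.6, -0.9, 0.8, -0.6]
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
rho, t_stat = dickey_fuller_t(oscillating)
```

The t-statistic must be compared with Dickey-Fuller critical values rather than Student values, since its distribution under the unit root hypothesis is not standard.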
However, the unit root tests showed that all the variables of our model are
stationary in levels; the regression can therefore be run without risk of error linked to
data on the three stock variables representing human capital come from the
series computed by Sacerdoti et al. Data on the GDP of France and the United
States, the exchange rate and Senegal's annual money supply are taken from
1998. Information on the public deficit and public debt is collected from
Mondiale.
The reliability of these data is taken as given, since these sources have long been
used for economic studies that proved conclusive. We can therefore
3.2. RESULTS
The Phillips-Perron (PP) test on the time series corresponding to the variables used
in our models showed that they are all stationary in levels. They are therefore
integrated of order 0. Consequently, they can be used to estimate the models by
ordinary least squares without risk of spurious regression. Table 3.1
below shows that for each variable the absolute value of the PP test statistic
Table 3.1.: Results of the unit root tests on the study variables (PP test).
Chapter 3: Empirical study of growth factors: Methodology and Results
Table 3.2 below gives the simple correlation coefficients between the study variables,
taken pairwise. For the models to be estimable and interpretable, there must be no
multicollinearity among the explanatory variables entering the
same model. The test we use consists in checking that the squares of all the
simple correlation coefficients for the pairs of explanatory variables of a model
Table 3.2. Matrix of simple correlations between the variables of the growth model

            TXPIBT     TXHTOT     TXHREV     TXHPRJ     TXCONSG    TXCAPITALT GAPUSA     GAPFRANCE  EXPORT     QUALMACRO
TXPIBT      1.000000  -0.065094   0.284465   0.164568   0.112726  -0.376189   0.081171   0.149922   0.706372   0.156178
TXHTOT     -0.065094   1.000000  -0.143104   0.184153  -0.216263  -0.062465   0.198379   0.217770   0.034751  -0.064682
TXHREV      0.284465  -0.143104   1.000000  -0.055949   0.171867  -0.046733   0.009514   0.019254   0.008447   0.102510
TXHPRJ      0.164568   0.184153  -0.055949   1.000000  -0.166263   0.102802   0.103086   0.238240   0.332503  -0.198337
TXCONSG     0.112726  -0.216263   0.171867  -0.166263   1.000000  -0.102690  -0.330252  -0.334362  -0.136968   0.148706
TXCAPITALT -0.376189  -0.062465  -0.046733   0.102802  -0.102690   1.000000  -0.033075  -0.004805  -0.304289   0.044491
GAPUSA      0.081171   0.198379   0.009514   0.103086  -0.330252  -0.033075   1.000000   0.887827  -0.019402  -0.257994
GAPFRANCE   0.149922   0.217770   0.019254   0.238240  -0.334362  -0.004805   0.887827   1.000000   0.044510  -0.101442
EXPORT      0.706372   0.034751   0.008447   0.332503  -0.136968  -0.304289  -0.019402   0.044510   1.000000  -0.035116
QUALMACRO   0.156178  -0.064682   0.102510  -0.198337   0.148706   0.044491  -0.257994  -0.101442  -0.035116   1.000000
The table shows that the correlations are fairly weak. After estimating each model, however, we run the
multicollinearity test using the coefficients in this table and the coefficient of determination R² obtained. If we conclude that there is no
multicollinearity among a model's explanatory variables, the coefficients can be interpreted as "all else being equal"
effects.
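The multicollinearity check described here compares each squared pairwise correlation between explanatory variables with the model's R²; this comparison is often called Klein's rule. A minimal sketch follows. The correlations are taken from Table 3.2, but the choice of variable pairs, the function name `klein_rule` and the two R² values used in the usage note are illustrative assumptions, not a reproduction of the estimated models.

```python
def klein_rule(correlations, r_squared):
    """Return the pairs whose squared simple correlation exceeds the model's R^2.

    An empty list means the rule does not signal harmful multicollinearity.
    """
    return [pair for pair, r in correlations.items() if r * r > r_squared]


# Pairwise correlations read from Table 3.2 (subset chosen for illustration).
pairs = {
    ("TXHPRJ", "TXCAPITALT"): 0.102802,
    ("TXHPRJ", "EXPORT"): 0.332503,
    ("TXCAPITALT", "EXPORT"): -0.304289,
    ("EXPORT", "QUALMACRO"): -0.035116,
}

flagged_high_r2 = klein_rule(pairs, r_squared=0.66)  # R^2 of the order reported for model (III)
flagged_low_r2 = klein_rule(pairs, r_squared=0.02)   # hypothetical very low R^2
```

With an R² of about 0.66 no pair is flagged, whereas with a very low R² even moderate correlations exceed the threshold, which is the situation the rule is meant to detect.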
3.2.1.3. Results of model estimation
We estimated the four nested models specified in chapter
the evolution of total factor productivity (PGF). For each of the other three
equations, we ran a series of three regressions that differ in the measure
corresponds to the series of average years of primary schooling per worker; the
second (2) represents it by the average number of years of schooling a worker has spent
in the whole education system; in the third regression (3), it is represented by a
schooling measure indexed on the wages paid in the civil service. As
stated in chapter 3, these three human capital variables were defined and computed
by Sacerdoti et al. (1998). In the tables presenting these results, we wrote
(resp. 10%) if the absolute value of the associated Student t is greater than 1.96 (resp. 1.64). For
the diagnostic tests we reported the value of the empirical statistics
judged satisfactory with respect to a given diagnostic test if the associated p-value is
a) Estimation of model (I)
These results suggest that neither of the two traditional factors of production explains
R² (2%) and the Fisher F statistic (0.31) show that the model has no explanatory
power. Moreover, although the residuals are not autocorrelated, the square
of the correlation coefficient between the two explanatory variables is greater than the R² of the
model, which indicates strong collinearity between the two variables
We should then drop one of the strongly correlated explanatory variables.
But since our model has only two explanatory variables, dropping one
of them would leave a model with a single explanatory variable. There is no
theoretical basis for a one-factor production function. For this reason, we
do not correct for multicollinearity, and our analysis focuses on the results of the
other models.
Estimating model (II), which explains the growth rate of GDP per worker by
two explanatory variables, the growth rate of physical capital per head and the growth
rate of the human capital stock, shows that this model is better than model (I).
Explanatory power (14 to 19%) is somewhat higher but remains very low, and the
Fisher statistic shows that at least one variable has a significant effect.
significant at the 10% level and even at the 5% level [equations (1) and (2)]. For all the
measures of human capital, the effect is not significantly different from zero. Also, the
between explanatory variables). Moreover, the diagnostic tests are all satisfactory; however, the
model cannot be used as a basis for forecasting. So when one raises the rate of
human, the result is a fall in the growth rate of total output of about 2.4%,
2.3% or 2%, depending on whether human capital is measured by a worker's average length of
schooling in the primary cycle (equation 1), in the whole education cycle
(equation 2), or by a measure based on the pay of
civil-service employees (equation 3). By contrast, human capital appears to have
no effect on growth in the three equations. The three measures of
human capital thus lead to similar results, both as regards the
non-significant effect of human capital and as regards the order of magnitude of
would not have been well exploited in the economy during the 1970-1997 period. But it
must be noted that in this model, only human capital is controlled for when
physical capital is varied. It may be that other variables of the environment vary
of this negative relationship, we estimate the models that take into account the variables
Ramsey RESET   0.043 (0.837)   0.054 (0.818)   0.267 (0.611)
We then estimated model (III), which contains, in addition to the variables of model (II),
drought. The results are more conclusive and are similar for the three human capital
variables. The model explains more than 66% of the variation in the growth rate of
GDP per worker. The Fisher statistic shows that the model is highly significant
overall. In addition, there is no problem of error autocorrelation (in equation (3)
non-multicollinearity of the explanatory variables (R² higher than the squared correlation
coefficients between pairs of explanatory variables). The estimated coefficients
are therefore interpretable. Furthermore, the Ramsey test shows that the model can be
accepted as well specified. The growth rate of physical capital still has a
negative effect, but it is no longer significant; its order of magnitude is the same in the
three equations. The effect of human capital, now positive for all measures of human
capital, is significant at 5% only for the measure indexed on the wages paid in the
civil service [equation 3]. The years workers spent in the education
system (primary cycle or whole education cycle) do not seem to have significant positive
repercussions on economic activity; they would therefore not improve their
productivity significantly. By contrast, when the wages paid to workers
wages therefore seem to translate into an improvement in workers' productivity,
which has a positive influence on the growth rate. The four new variables are
all relevant, with positive effects (except the drought dummy, whose effect
appears the largest in all equations. The positive effect of public
consumption is significant (at the 10% level) in only one equation [equation 1].
Moreover, for each explanatory variable, the order of magnitude of its coefficient is the same
in the three equations. This model thus appears satisfactory both in terms of
statistical properties and the relevance of the results. However, it cannot be used
These results make it possible to explain the mechanisms through which one can improve
The preceding results show that research hypothesis 2 of our study is
verified. Regarding hypothesis 1, the two factors, physical capital and human capital,
the economy. The negative and non-significant effect of the growth rate of physical capital per
worker is surprising, since it is one of the main growth factors identified
by growth theories, from the Solow model to the new theories
al. (1998) for West African countries. Even if this difference in results may be
First, the economic environment seems not to have been
favourable to a good use of the capital stock existing in the economy. This
argument rests on the fact that once economic policy factors are controlled for, the
coefficient on physical capital ceases to be significant. In model (II), the
coefficient measures the effect of a change in the growth rate of physical capital on the
growth rate of GDP per worker when only the growth rate of human capital is
controlled for and held constant. In model (III), this effect is measured under the additional
assumption that the same economic policy measures are implemented and that
climatic conditions are identical. Working conditions have therefore not been
favourable to a good use of the production capacity newly acquired in
the economy. These conditions could, for example, have led to under-use of
of industrial firms in Senegal are characterised by high rates of under-utilisation
of production capacity (MTOA, SONACOS, etc.) (MEFP, 1997). Also, the
employment situation in the country, which is characterised by difficulties of automatic
placement after training, does not allow newly trained skills to be
Another argument relates to the strong dependence of the economy on
climate. As the economy is still dominated by agricultural activities and industries
unproductive if rainfall has not been favourable to agricultural activities. There is a
direct effect of unfavourable rainfall on agricultural activities and a knock-on effect on
From these two arguments, it follows that efforts to accumulate physical
capital should be accompanied by good economic policies. The analysis of the result
The effect of human capital is positive and significant only when it is represented by the
measure indexed on the wages paid in the civil service. The other two
measures of human capital have non-significant and sometimes negative effects. This result is
consistent with those obtained by Sacerdoti et al. (1998). It suggests that improving
6 Sacerdoti et al. (1998) estimated a fixed-effects model and then a common-constant model on panel data,
whereas our analysis is purely longitudinal.
workers' skills does not lead to a significant increase in the level
Yet we know that a higher level of education allows workers to improve their
productivity and hence to achieve higher productive returns. The non-significant effect
of the number of years of schooling therefore shows that the marginal skills acquired
each year through the education system generally do not have a considerable impact on
following.
First, from a statistical point of view, the growth rate of human capital varies very
little, whereas the growth rate of GDP per worker fluctuates
widely. It is therefore hard to explain large fluctuations with a
fairly stable series. But if the GDP growth rate fluctuated so strongly, this suggests that
growth could not be controlled; in particular, the efforts made to keep the growth
rate of human capital stable (note 7) have had no notable impact on economic
activity.
Next, one must ask why the marginal skills acquired each year
fail to improve the results of economic activity. One factor is the fairly
high unemployment rate in the country. Today, it is rare for a young person leaving the
education system to find work automatically and thus be involved in the sector
skills. The human capital variables used in our estimations overestimate
account the new skills. Since the effect given by the econometric analysis is the one
induced by new skills, it seems normal that it is not significant.
In addition, one may point to the strong tertiarisation of the economy. The activities of the sector
But these activities are generally carried out by people who are illiterate or have
only a low level of education. In fact, these people engage in trading
activities that yield high value added but do not require high
intellectual qualifications. It follows that even when individuals are diverted from their
schooling and properly channelled into the trade sectors, the
level of the economy will improve. However, in order to supply the trade
sector with good-quality local products, it is important to encourage the
training of high-level skills. These skills will be used in the industries that
process agricultural products in order to ensure a good dynamic of the sectors
technical skills. Efforts in human capital formation should therefore
Furthermore, one may point to the inhibition, by political forces, of growth-friendly
initiatives that highly educated individuals might
take. The quality of the economy's strategic orientations matters
for giving workers the incentive and motivation to apply themselves fully. But in our African
countries, the politician dominates the economist, so that the latter works in the service
will of the government in power. Thus, the strategic decisions of the economy are not
exactly those that the economist's analyses suggest, but those desired by
7 The stability of the growth rate of human capital implies that human capital has always increased
from year to year at an almost constant rate. Its evolution has therefore been continuous.
the government. A weakness in worker productivity then appears, linked to a
lack of motivation. This argument thus holds that the skills produced by
education are not always used objectively, which does not allow
human capital by the average number of years of schooling per worker carries a large
inequality bias. This argument points to the role played by the distribution of education among
workers in its economic use (note 8). This role was identified by Ramon et al.
(1998), who find that the negative coefficient on education becomes positive and significant
once inequality in the distribution of education is controlled for. Strong inequality in the
distribution of education implies that most of the skills and work aptitudes
produced by the education system are held by only a few individuals, most
individuals appearing not to have attended school. Under these conditions, a change in human
capital measured by the average level of education does not necessarily reflect a
may thus fail to improve following an increase in the human capital variable
justified insofar as the human capital variable based on the wages paid in the
civil service has a positive and significant effect in the various models. In fact, this
measure covers only people working in the civil service; since they
are all educated, the inequality of education among them is smaller than that which
Overall, one can say that human capital contributes positively to economic
growth in Senegal. The human capital measures based on the average number of
years of schooling are an average assessment within the labour force, which
is characterised by large disparities between individuals. This could not account for
Moreover, the climatic shocks suffered by the economy in the late 1970s and in the
mid-1980s severely hampered economic activity. There is a
direct negative repercussion on primary-sector activities. Indirectly,
manufacturing firms see their activity fall as a result of possible
shortages of local primary inputs. It is also possible that the
fall in agricultural output generated generalised inflation in the economy, which
the economy. Risk aversion then leads agents to reduce their
The growth rate of public consumption expenditure has a positive effect on
growth. But this effect is not very robust; it is not significant in all the
equations. This expenditure includes education, health, wages and maintenance; it
excludes capital expenditure, which is investment expenditure.
The positive effect obtained is, however, contrary to that found by Barro and Sala-I-Martin
(1996) from a cross-sectional analysis. But note that their public
consumption variable does not include education expenditure. Although expenditure
8 Strong inequality in the distribution of education makes it impossible to capture its role in the economy.
on the running of the State may have had a negative influence on economic growth,
expenditure devoted to social sectors such as education and health certainly
helped improve productivity. This positive effect of public expenditure on
worker productivity can be explained by the strengthening of human capacities
by the structural adjustment programmes (PAS), this positive effect can be explained by an
opposite reaction of the population to the reduction in State expenditure. Indeed, the
raise their incomes. To get by in the face of the social crisis, individuals had to
create secondary activities constituting new potential sources of income. This
Moreover, the measures to restore public finances taken under the
PAS aimed to reduce unproductive expenditure. This reduction in
unproductive expenditure and the priority given to social sectors since the early
1990s have thus brought public expenditure into line with development
objectives. These two orientations have the effect of improving the productivity of
relating to wages seem to have been compatible with growth objectives, as reflected by
human based on the wages paid by the civil service. In fact, this can be explained
by the measures to reduce the wage bill dictated by the PAS. These measures, which
were implemented through the voluntary departure programme, helped eliminate
idle staff from the civil service. There should therefore be efficiency in the
privatisations carried out under the PAS greatly reduced the State's wage
bill while bringing about better management of the privatised national
enterprises. The result was greater productive efficiency at a time when
productive sectors are the priority in the allocation of State resources.
Ultimately, the preceding arguments justify the existence of a relationship
of this effect (roughly an increase of 0.26% in the growth rate of GDP per worker
fragility (lack of robustness) would be due to the negative behaviour of staff
non-optimal.
In fact, these three ratios indicate the degree of prudence in the action of the State or the authorities
implies that the State has kept its spending under control relative to its revenue. As a result,
donors and bilateral partners would be willing to grant it loans and
even aid to finance investment needs. This increases
the resources available in the economy; if they are well used, there will be a rise in
the level of output. If the ratio of money supply to GDP is low, this suggests
that prices will be fairly well controlled in the country and hence that the investment risks linked
to uncertainty will be low. Firms will then have an incentive to invest and other
agents will be able to save more. In this sense, it seems that agents
current) of the late 1970s, it was necessary to reduce them in order to aspire to
growth. One can say that the PAS were well conducted in their first phase,
stabilisation. In addition, the measures to reduce the money supply that followed the
However, one might think that if debt is used to finance productive
activities, it would be positively related to growth. But the positive effect of the
confidence associated with a low debt burden seems to outweigh, in the long run, the
positive effect that debt used in productive activities can have. Since GDP
growth rates have always been low, growth policies that rely on
(economic growth rate lower than the interest rate on the debt). This would
consequently mean a total loss of confidence among donors, followed by a
shortage of financial means in the economy. But note in this regard that
Senegal does not suffer from an excessive debt burden. Its debt was judged sustainable in
88
1998 par l'initiative PPTE 9 ; ce qui suggère que la politique d'endettement du Sénégal a été
of output. It should be noted, however, that in Senegal the periods in which public
spending policy began by defining priority sectors also coincide with
periods of spending cuts (the structural adjustment period). It follows that
productive spending corresponds mainly to spending that is limited but well allocated
according to growth priorities. This instead supports a negative relationship between
GDP reflects the importance of trade liberalization measures for production
activities. Although this significant relationship may result from a simultaneity effect,
the variable nonetheless reflects the quality of trade policies. Indeed, the
openness to foreign trade. The favourable effect of openness policies on
production activities can be explained by mechanisms of factor reallocation
foreign competition, both on the domestic market and on foreign markets.
Import liberalization indeed allows any product manufactured abroad to enter the domestic
market. Thus, even firms producing for the local market
face competition from foreign-made products. To survive, they are
exporting firms are led to reduce their production costs in order to be
They will, for example, seek the least costly supplies of raw
materials.

9 Under the Heavily Indebted Poor Countries (HIPC) initiative, a country's debt is judged sustainable
when the main debt ratios show that the country can settle its debt itself without
any special relief measure; in that case, the country is not selected to benefit from the initiative.
should allow local firms to produce under conditions comparable to
In short, the estimation results for the growth models suggest that neither
human capital nor physical capital has been well exploited in the economy, for lack of a
State consumption have induced attitudes favourable to growth. Liberalization
policies emerge as measures that compel both the firms of
growth.
3.3. ECONOMIC POLICY IMPLICATIONS
The preceding analyses point to important actions for the State and the other actors of
government policy must define a framework within which firms can
production must be free to adjust flexibly and efficiently
in order to seize the new opportunities offered by the international environment. The
private sector, non-governmental organizations and individuals must show
concern for collective well-being. They must therefore work to exploit to
avoid cuts in public investment and must multiply incentives, both
to invest and to save, through appropriate fiscal policies. But,
in order to avoid the discouraging effects of aversion to the risk associated with a high
fraud and corruption are actions that the State must implement. In this sense, the
State policy will seek to contain the price of investment goods in order to facilitate
productive capacities, which implies lower production costs and a
public and thus lead to higher production costs. Moreover, these attitudes
encourage agents neither to exert effort nor to care about the sound use of
resources. In addition, the State will continue to identify the priority sectors that it will favour
will be to ensure that the level of output achieved by the economy matches its
potential. If this full use of capacity is accompanied by a development
Both the State and private entrepreneurs must implement policies of
technological in the country in order to multiply local technical capabilities. These
capabilities can be grouped into the following three categories: investment
will not develop adequately. Or, if the necessary skills are created but not
the efficiency with which accumulated capital is used is of crucial importance. To
ensure this synergy among the three categories of capabilities, it will be necessary to: a) encourage the
development of training establishments and institutes in order to create a local
workforce with appropriate professional and technical skills; b) create
new products and close links between university training establishments or
financed by the World Bank is already a favourable step. The Library component of this
programme, which is already under way, provides support for R&D. In addition,
foreign investors could forge links with local companies in order to contribute
will then be able to benefit from assimilating both new management practices and
modern technologies.
Overall, the objective of technological development is to facilitate access to new
technology and to ensure that the technology acquired is used as efficiently as
3.3.3. Developing human capital
The World Bank notes of the high-performing Asian economies: "Between 60 and
human. Productivity growth has been stronger than in other developing
economies and it matters for Asia's success, but it is not
the dominant factor" (World Bank, 1993). Our results showed that the environment
and macroeconomic policies did not foster good use of
human capital within the labour force did not make it possible to capture its real effects on
growth from our models. It nevertheless remains true, through the synergy highlighted
above, that an efficient combination of human capital and physical capital
"primary health care for all". In addition, work structures will have to be prepared
to accommodate the newly trained skills. Both the State and the agencies
social among the population and to increase steadily the funds allocated to
increasing specialization mean that general education can offer only
basic general aptitudes. On-the-job training is therefore necessary in order to
pass new techniques on to company staff. Moreover,
entrepreneurs will have to send their staff to seminars or practical internships
organized at the international level in order to benefit more easily from the ideas developed
It should be said that policies for developing human capital aim to train a skilled
workforce whose skills are always kept up to date with the evolution
The trade liberalization measures taken in 1986 under the New Industrial Policy (NPI) must
be continued. The public authorities should insist on implementing
export promotion measures. To this end, export taxes could be revised
downward, especially those levied on products
that can allow local firms to reduce their supply costs and
hence their production costs. Imports should also benefit from
liberalization measures, but only for products that are important and
useful for domestic production. For example, imports of capital goods
specific in the country. It is therefore up to local firms to control the quality and
price of their products so that domestic consumers do not prefer
eliminate and avoid trade distortions so that firms, both
those producing for the local market and exporters, are able
to offer competitive products on local or international markets. But taking
into account the climatic environment, which is subject to several hazards such as drought, is
another important aspect of improving the economy's chances of performing well.
of poor rainfall, makes it necessary to strengthen policies to combat
without effect because the climate was not favourable. To that end, as a first step,
people should be educated about the importance of protecting the fauna and flora of their
living environment, which are already poor (a Sahelian country). The water and forestry services should
for example), should be continued and strengthened. At any time, every citizen
could take part in reforestation by planting at least one tree somewhere
in the national territory.
These various policies should be implemented through specific measures and
actions that call on the public authorities as well as civil society
CONCLUSION AND RECOMMENDATIONS
CONCLUSION
During the two decades following Senegal's independence, the economy
went through several swings that ultimately led to a deep recession in the late
1970s, as in most sub-Saharan African countries. The major
way out that was adopted was to implement adjustment programmes
International Monetary Fund). But the policies contained in these programmes were
designed and dictated by these institutions, so that they could not take account of the realities
social in the country. Moreover, these programmes brought about only a stabilization
devaluation of the CFA franc, that a sustained recovery in GDP growth is observed. The
in the country so that they can be steered towards continuous economic growth. It is
in this spirit that this study was carried out, to contribute to the search
for the factors explaining variations in aggregate income per worker in the Senegalese economy.
The analyses consisted of successively estimating a simple growth model
with a Solow residual, a growth model with human capital and a model of
data relate to the period 1971-1997, which takes into account all the
The results of the other models indeed show that neither human capital nor physical
capital has been well exploited in the economy, for lack of an adequate working environment and
adequate motivation. The climatic changes that caused severe drought have
State consumption have induced attitudes favourable to growth. Trade
liberalization policies emerge as measures that compel both the
products.
These analyses point to important actions for the State and the other actors of the economy
production must be free to adjust flexibly and efficiently
in order to seize the new opportunities offered by the international environment. The
private sector, non-governmental organizations and economic agents must
show concern for collective well-being. They must therefore work to
It should be noted that this study has many limitations which, although they do not
compromise the validity of the results, made it impossible to capture every
possible aspect of the growth factors. The measures of human capital did not
capture its real effect; in particular, the available information does not account for
production used did not make it possible to test separately the role played by investment
sense, it allowed neither an examination of the growth factors specific to the main
is desirable that studies of economic growth in Senegal be
oriented much more towards these micro aspects, which would imply policies of
each region or to the potential of each branch of the economy. It would also be
desirable to conduct studies on the economies of countries affected by social conflict in order
to identify the effect of such events on economic growth. It is within these limits
that the following recommendations, based on the study's results, are formulated.
RECOMMENDATIONS
Policy makers and the monetary authorities should continue to seek the
economic. Monetary policy should aim to maintain low inflation in
reduces spending but makes it efficient; in doing so, it will seek to minimize the
should be conducive to reducing trade balance deficits. Moreover,
both the State and producing firms (private or public) should define
a borrowing policy such that their level of activity does not compromise their
on measures that promote efficiency and improve the productivity of both capital
inefficient and move towards an administration in which the emphasis is placed on better
quality in the delivery of the services needed for development, and towards an incentive framework
at the following three levels: institutional adjustment, coherence between the structures of
As regards institutional adjustment, the objective should be a State that is more
legitimate, transparent and highly accountable, and that can fulfil the following three
roles:
new balance between the respective roles of the public and private sectors, in which the
political interference in economic management and eliminate all other unproductive
practices.
effective participation of the population in decision-making processes, their access to
information.
decentralization of power to local authorities. This makes it possible to create a
grassroots development capable of reducing internal inequalities in the country. It is necessary
in its own favour government policies affecting the well-being of its
members. Pressure from public opinion can indeed lead to greater
the efficiency of the service, as perceived by the beneficiaries. This aims to avoid the
throughout the productive system. To that end, the planning, monitoring and evaluation
of individual work programmes should be instruments used within
administrative units.
should be placed on improving the efficiency and the capacity of the public
administration to actually carry out the most important policies. The main
private sector and non-governmental organizations (NGOs). This concerted action
will make it possible to avoid policies that benefit only a few classes of individuals
and instead to take account of everyone's interests, since all strata will be
favourable to the activities of the public and private sectors. Even if these measures must, by their
nature, take time to implement, it is desirable that those
concerning the fundamental functions of the State be carried out. In this sense, a reform
of intervention are:
quality of public investment programmes is high and that projects
are subjected to a number of economic tests, because a poorly designed or
poorly executed project can be very costly. Priority should go to public investments that
complement market-determined activities rather than those that compete
with them.
b) Finance the operation and maintenance of capital goods. A share of current
expenditure on goods and services must be earmarked for operation (supplies and
insufficient, efficiency levels are likely to be low in areas such as
c) Address the causes of low productivity in public administrations.
d) A cost-effective spending policy: the lack of resources
makes it more urgent to adopt a spending policy that is cost-effective
and that makes it possible to achieve objectives such as: income redistribution,
self-sufficiency. For example, the general granting of subsidies on food
prices is not necessarily the most effective way of improving
the nutritional status of the poor and could advantageously be replaced by
least productive elements. The public sector will then be able to contribute to saving
State action in favour of a sound allocation of investment resources must
be followed by a subsequent strengthening of the workforce for better
- Continue policies of compulsory primary education for all, in order to reduce
university level. The point is not to show those with only a
low level of education that they have no place in the economic sphere, but to
show them that they would be more useful if they pursued their studies further;
Encourage private initiative so that jobs multiply in the country,
That parents, as far as their means allow, invest in the education of
their children. In doing so, they should be aware that they are investing for
In this initial phase, the objective is to ensure "Education for all throughout
In a second stage, international institutions and the State will have to implement
strategies to reduce inequalities in access to knowledge and to remedy information
problems.
International institutions could intervene in two ways: provide goods
knowledge. Indeed:
Many forms of knowledge are public goods, and no country is prepared to
invest alone in creating goods of this type that would benefit the rest of the world.
That is why international institutions and the NGOs acting on behalf
International institutions should thus play an important role. But it is the action
of the State and of the economy's agents that will determine how efficiently they use
this knowledge. Openness to knowledge existing abroad is an important aspect that
must be ensured by the State and private firms. To that end, it is desirable to take
firms to be efficient and to have products that meet international standards.
They are thereby led to make greater use of new knowledge. Moreover,
through their activities in Senegal, multinational companies, always at the forefront of progress
moreover, in Senegal it would be easier to exploit a foreign technology under licence
than to invent a new production technology. The State should therefore encourage
policies for acquiring and assimilating knowledge to be oriented towards these three
aspects.
In short, the State and private firms will have to ensure that technologies are widely
disseminated so that all strata of the population and all production
structures can access them. For example, steps will be taken to make it easier,
even for private individuals, to acquire a computer and to gain access to the telephone and the internet. These measures
productive.
incentive measures, creating sound financial markets and then letting the market
decide which products will succeed in export markets. The point is therefore not to
choose at the outset which products to export. Such a strategy would in fact fall to the
government, which is known to be institutionally incapable of defining the
kind of entrepreneurial behaviour needed to find and promote
products that can succeed in export markets. Sustained export
development can be based in the long run only on broad openness to trade
and to foreign investment. Prices must therefore be determined by the market.
It should also be noted that current distortions result either from market
failures, such as the monopolistic situations found in many sectors, or
producers (Banque Mondiale, 1997). Firms that sell their output on the
domestic market are relatively little affected, since they are somewhat sheltered
from international competition. By contrast, the impact of these distortions is greater
for exporters, since their selling prices are determined on the
directly to the distortions, it must be recognized that this is a process that will take
time. The government can mitigate certain market failures in the sector of
reform programme already launched in 1994. In addition, it will be necessary to implement a number
exploitation of the opportunities offered by the regional securities exchange of
competent;
The absolute priority given by the judicial system to resolving disputes related to
the environment.
raise public awareness of the need to multiply the growth opportunities linked to
the environment, in particular the conditions that favour good rainfall. The lines
of attitudes and behaviour;
environmental;
e) The drawing up of community plans that specify, within the framework of a local plan for
space, in particular those relating to the occupation and allocation of land according to its
We have seen that political and institutional factors have not always allowed
the various measures provided for in different economic reforms to be implemented in
regulations render even the best-intentioned policies ineffective. It therefore appears that the
success of the various measures contained in our recommendations depends on the existence
of means that credibly establish that the government will not go back on its
declared favourable promises. These means must share the same logic: putting in
place mechanisms that prevent any reversal of commitments already made. A
clean break is needed with the way economic policies have often been
implemented in the past. The public authorities should submit to the
restrictions imposed on them by appropriate policies and bear the costs they
of deregulation.
In short, measures to promote growth in Senegal call on the
State as well as international bodies, non-governmental organizations and
private agents. Although they are not exhaustive, owing to the many limitations
of this work, their implementation should be ensured if positive and sustained
growth rates are to be expected in the coming years.
Références bibliographiques
BIBLIOGRAPHIE
ALBERTO, A., (1997), The political economy of high and low growth, Paper presented
at the Annual World Bank Conference on Development Economics, 1997, Washington D.C.
AMABLE, B.; GUELLEC, D. : (1992), Croissance endogène : les principaux
mécanismes, Economie et prévision, n° 1016, Paris, Mai.
AMVOUNA, A. M. : (1999), Existe-t-il un taux de croissance seuil au-delà duquel la
contribution du capital humain devient nécessairement positive ? Communication,
Quatrième Journées Scientifiques, Ouagadougou, Janvier.
BANQUE MONDIALE : (1993), Région de l'Afrique : données internes, Banque
Mondiale, Washington, DC.
BANQUE MONDIALE : (1995), Rapport sur le développement dans le monde 1995 : Les
travailleurs dans un monde en mutation, New York, Oxford University Press for the World
Bank.
BANQUE MONDIALE, (1993), Sénégal : Stabilisation, Ajustement partiel et Stagnation,
Rapport n°11506 - SE.
BANQUE MONDIALE, (1997), Sénégal : le défi de l'intégration internationale,
Décembre.
BARRO, R. et SALA-I-MARTIN, X. : (1996), La croissance économique,
McGraw-Hill, Ediscience, Paris.
BARRO, R. J. : (1991), Economic growth in a cross-section of countries, Quarterly Journal
of Economics, March.
BARRO, R. J. et SALA-I-MARTIN, X. : (1992), Public finance in models of economic
growth, Review of Economic Studies, n° 59.
BARROS, A. R., (1993), Some implications of new growth theory for economic
development, in Journal of International Development, vol. 5, n°5, 531-558.
BEN HAMMOUDA, H. (1998), Les théories du post-ajustement : quelques pistes de
recherche pour les économies africaines, CODESRIA-Dakar, Série Etats de la littérature,
N° 1, Dakar, Sénégal.
BERTHELEMY, J-C; DESSUS, S. et VAROUDAKIS, A. : (1997), Capital humain,
Ouverture extérieure et Croissance : estimation sur données de panel d'un modèle à
coefficients variables, OCDE-document technique, n° 121, Janvier.
BOSTON CONSULTING GROUP, (1990), République du Sénégal : Impact de la
réforme de la politique industrielle, Dakar, Sénégal.
BRASSEUL, J. : (1993), Introduction à l'économie du développement, Cursus, Paris.
BROCHART, F. : (1984), Exportation et croissance économique : application aux pays
africains de la zone franc, Revue d'Economie Politique, 95ème année, N°4, pp. 469-483.
BURNSIDE, C., DOLLAR, D. : (1997), Aid, policies and growth, World Bank Working
Paper, N° 1777, Washington, DC.
Conseil Economique et Social (CES) du Sénégal, (1995), Etude sur l'impact de la
dévaluation du franc CFA, Novembre, Dakar, Sénégal.
DE MELO, J.; ROBINSON, S. : (1990), Productivity and externalities: Models of
export-led growth, WPS, n° 387, World Bank, Mars.
DIAGNE, A. ; KANE, K. ; DAFFE, G. ; NIANG, I.C. ; SALL, S.S. ; KASSOUM, S.,
(1998), Relance et durabilité de la croissance économique au Sénégal, Dakar, Sénégal.
DIAGNE, A., (1995), Evaluation des politiques macro-économiques du Sénégal avant et
après la dévaluation du franc CFA, Document de recherche du CREA, Dakar, Sénégal.
Direction de la Prévision et de la Statistique (DPS) du Sénégal, Base de données sur les
comptes économiques du Sénégal 1960-1997, Dakar.
DODARO, S. : (1991), Comparative advantage, trade and growth: export-led growth
revisited, World Development, vol. 19, n°9, pp. 1153-1165.
DURUFLE, G., (1988), L'ajustement structurel en Afrique, Karthala, Paris.
EASTERLY, W.; LEVINE, R. : (1997), Africa's growth tragedy: Policies and ethnic
divisions, Quarterly Journal of Economics, N°112, November, pp. 1203-1250.
ELLIOT, B., ALEXANDRIA, V. (1990), Ajustement ajourné : réforme de la politique
économique du Sénégal dans les années 80, Dakar, Sénégal.
FISCHER, S. : (1993), The role of macroeconomic factors in growth, Journal of
Monetary Economics, vol. 32, December, pp. 485-512.
FOSU, A. K. : (1990), Exports and economic growth: the African case, World
Krueger, NBER.
RAMON, L.; VINOD, T.; YANG, W., (1998), Addressing the Education puzzle, World
Bank Policy Research Working Paper, n°2031, December.
ROMER, D., (1997), Macroéconomie Approfondie, Traduit de l'américain par Fabrice
Mazerolle, Ediscience International, Paris.
SACERDOTI, E.; BRUNSCHWIG, S. et TANG, J. : (1998), The impact of human
capital on growth: Evidence from West Africa, IMF Working Paper.
SACHS, J. D. ; ANDREW, M. ; WARNER; (1996), Sources of slow growth in African
economies, Paper presented at the Annual World Bank Conference on Development
Economics 1996, Washington D.C.
SPIEGEL, M. M.; BENHABIB, J. : (1994), The role of human capital in economic
development: evidence from aggregate cross-country data, Journal of Monetary
Economics, n° 34, pp. 143-173.
TAKATOSHI, I., (1997), What can developing countries learn from East Asian economic
growth?, Paper presented at the Annual World Bank Conference on Development
Economics, 1997, Washington D.C.
TYBOUT, J. : (1992), Researching the trade/productivity link: new direction, World
ANNEXES
ANNEX 1: DATA USED IN THE STUDY
The data for the variables used in the analyses are given in
Tables 1, 2, 3 and 4. The labels and units of measurement of these variables are specified
below.
Table 1:
TXCH: US dollar to CFA franc (FCFA) exchange rate. This is the official market
exchange rate; it gives the number of CFA francs equivalent to 1 US dollar.
The data come from the "International Financial Statistics Yearbook,
1999" published by the statistical services of the International Monetary Fund.
PIB: Gross Domestic Product, measured in millions of FCFA at constant 1987 prices. The
data come from the national accounts database of Senegal's Direction de la Prévision et de la
Statistique.
Dettesdoll: Senegal's total external debt outstanding, expressed in millions
of US dollars. The data come from "World Bank World Tables,
1990" and then from the "World Development Indicators CD-Rom, World Bank".
dettescfa: total external debt outstanding, expressed in millions of CFA francs.
This series is obtained as the product of Dettesdoll and TXCH.
dette-ratio: the ratio of debt to GDP, that is, dettescfa divided by PIB,
expressed as a percentage.
Classement: the values of dette-ratio sorted in decreasing order. This
ordering is consistent with the evolution of the quality of debt policy over
time. For example, the largest value of dette-ratio corresponds to the year in which
debt policy was at its worst over the whole period 1970-1997; the
quality of debt policy therefore takes its lowest value (1) for that year.
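The definitions of dettescfa, dette-ratio and Classement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the three sample rows are made-up round numbers, not values from Table 1.

```python
from typing import Dict, Tuple


def debt_ratio_pct(dettesdoll: float, txch: float, pib: float) -> float:
    """Debt-to-GDP ratio in percent.

    dettesdoll: external debt, millions of US dollars
    txch:       FCFA per US dollar
    pib:        GDP, millions of FCFA
    """
    dettescfa = dettesdoll * txch  # debt in millions of FCFA
    return 100.0 * dettescfa / pib


def classement(ratios: Dict[int, float]) -> Dict[int, int]:
    """Rank years by decreasing debt ratio: rank 1 = largest ratio = lowest quality."""
    ordered = sorted(ratios, key=lambda year: ratios[year], reverse=True)
    return {year: rank for rank, year in enumerate(ordered, start=1)}


# Hypothetical illustrative inputs: (Dettesdoll, TXCH, PIB). NOT data from the study.
sample: Dict[int, Tuple[float, float, float]] = {
    1980: (1000.0, 200.0, 1_000_000.0),
    1985: (2500.0, 400.0, 1_250_000.0),
    1990: (3000.0, 300.0, 1_500_000.0),
}

ratios = {year: debt_ratio_pct(*values) for year, values in sample.items()}
ranks = classement(ratios)

for year in sorted(sample):
    print(year, round(ratios[year], 2), ranks[year])
```

With these made-up inputs the year with the highest ratio receives rank 1, which matches the convention stated above that the lowest quality value (1) goes to the year with the largest dette-ratio.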
Table 2:
Deficit-ratio: ratio of the budget deficit to GDP, expressed as a percentage. The data
come from both "World Bank World Tables, 1990" and "African Development
Indicators, 1998/1999".
Table 3:
The variables rangdette, rangsurplus budgétaire and rangM2 give, for each year, the
rank of the value taken by dette-ratio, deficit-ratio and TXM2-g respectively in that
year, within the ranking of the 28 values (1970-1997) by increasing quality of the
policy concerned.
The macroeconomic management quality index (indice qualmacro) is the simple
arithmetic mean of the variables rangdette, rangsurplus budgétaire and rangM2. The index
is therefore unitless.
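As a quick check of this definition, the sketch below recomputes the index for three years whose ranks and reported index values are taken from Table 3 of this annex (1970, 1984 and 1994). The function name `qualmacro` is a hypothetical label chosen for the illustration.

```python
def qualmacro(rangdette: int, rangsurplus: int, rang_m2: int) -> float:
    """Simple arithmetic mean of the three policy ranks (unitless)."""
    return (rangdette + rangsurplus + rang_m2) / 3


# (rangdette, rangsurplus budgétaire, rangM2, index reported in Table 3)
table3_rows = {
    1970: (28, 25, 9, 20.67),
    1984: (11, 1, 12, 8.00),
    1994: (1, 2, 1, 1.33),
}

computed = {
    year: round(qualmacro(d, s, m), 2)
    for year, (d, s, m, _reported) in table3_rows.items()
}

for year, (_d, _s, _m, reported) in table3_rows.items():
    print(year, computed[year], reported)
```

The recomputed values agree with the reported ones to two decimals, and 1994, the year of the CFA franc devaluation, receives the lowest index of the three.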
Table 4:
PIBTFR: France's GDP per capita, expressed in thousands of French francs (FF). It is calculated from
GDP per capita in FF at constant 1990 prices.
PIBTUSA: United States GDP per capita, expressed in thousands of US dollars. It was calculated
from GDP per capita in US dollars at constant 1990 prices.
SECHER: dummy variable marking the years of severe drought in
Senegal. It takes the value 1 for years of severe drought and the value 0
for other years.
HUMPRI (resp. HUMTOT): average number of years of schooling completed by a worker
in the primary cycle (resp. all cycles) of education. The unit is the year.
PIBT: GDP per capita. It is the ratio of GDP to the labour force; it is expressed in
thousands of CFA francs.
EXPORT: contribution of exports to GDP growth. For each year,
this contribution is obtained by multiplying the growth rate of exports during
the year by the share of exports in GDP during the previous year.
It is calculated from the DPS database.
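The EXPORT definition can be illustrated with a minimal sketch. The function name and the sample figures are hypothetical, and expressing the result in percentage points is an assumption, since the text does not state the unit.

```python
def export_contribution(exports_prev: float, exports_curr: float, gdp_prev: float) -> float:
    """Contribution of exports to GDP growth, in percentage points (assumed unit).

    Growth rate of exports in year t times the share of exports in GDP in year t-1.
    """
    growth_rate = (exports_curr - exports_prev) / exports_prev
    weight_prev = exports_prev / gdp_prev
    return 100.0 * growth_rate * weight_prev


# Hypothetical figures (same monetary unit for all three), NOT data from the study:
# exports grow by 10% and represented 20% of GDP in the previous year.
contribution = export_contribution(exports_prev=200.0, exports_curr=220.0, gdp_prev=1000.0)
print(round(contribution, 2))
```

Algebraically the product reduces to the change in exports divided by the previous year's GDP, which is why the measure reads as the part of GDP growth attributable to exports.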
GAPFRANCE and GAPUSA: gaps between the GDP per capita of France and of the United States
and that of Senegal. They are expressed in millions of FCFA and calculated from the series
PIBTFR, PIBTUSA, PIBT, TXCH and the conversion factor between the FCFA and the FF
(1 FF = 50 FCFA for the years before 1994 and 1 FF = 100 FCFA from 1994 onwards).
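One possible reading of this construction is sketched below. The text does not give the exact formula, so the sign convention (foreign GDP per capita minus Senegal's), the division by 1000 to move from thousands to millions of FCFA, the function names and the sample figures are all assumptions made for illustration.

```python
def ff_to_fcfa(year: int) -> float:
    """FCFA per French franc: 50 before 1994, 100 from 1994 onwards (as stated in the text)."""
    return 50.0 if year < 1994 else 100.0


def gap_france(year: int, pibtfr: float, pibt: float) -> float:
    """Assumed gap, in millions of FCFA.

    pibtfr: France's GDP per capita, thousands of FF
    pibt:   Senegal's GDP per capita, thousands of FCFA
    """
    return (pibtfr * ff_to_fcfa(year) - pibt) / 1000.0


def gap_usa(pibtusa: float, txch: float, pibt: float) -> float:
    """Assumed gap, in millions of FCFA.

    pibtusa: US GDP per capita, thousands of US dollars
    txch:    FCFA per US dollar
    """
    return (pibtusa * txch - pibt) / 1000.0


# Hypothetical round numbers, NOT data from the study.
gap_fr_1990 = gap_france(1990, pibtfr=100.0, pibt=200.0)
gap_fr_1994 = gap_france(1994, pibtfr=100.0, pibt=200.0)
gap_us = gap_usa(pibtusa=20.0, txch=300.0, pibt=200.0)
print(gap_fr_1990, gap_fr_1994, gap_us)
```

Under this reading, the 1994 change of parity mechanically widens GAPFRANCE when it is measured in FCFA, even if the French series in FF is unchanged.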
Calculation of the macroeconomic quality index
Table 1: Ranking of the years 1970-1997 by quality of debt
policy
Table 3: Results of the calculation of the macroeconomic quality index for the period
1970-1997 in Senegal.
Year   rangdette   rangsurplus budgétaire   rangM2   Indice qualmacro
1970   28          25                       9        20.67
1971   26          24                       17       22.33
1972   27          27                       14       22.67
1973   25          19                       5        16.33
1974   24          22                       2        16.00
1975   23          23                       15       20.33
1976   22          20                       7        16.33
1977   21          18                       6        15.00
1978   19          26                       3        16.00
1979   20          21                       27       22.67
1980   18          14                       8        13.33
1981   17          16                       4        12.33
1982   16          6                        18       13.33
1983   15          3                        21       13.00
1984   11          1                        12       8.00
1985   8           4                        22       11.33
1986   9           9                        10       9.33
1987   7           12                       26       15.00
1988   10          13                       23       15.33
1989   12          8                        11       10.33
1990   13          5                        28       15.33
1991   14          28                       13       18.33
1992   6           11                       19       12.00
1993   5           7                        25       12.33
1994   1           2                        1        1.33
1995   3           10                       20       11.00
1996   4           15                       16       11.67
1997   2           17                       24       14.33
Table 4: Data used in the econometric analyses.
Phillips-Perron Unit Root Test on TXPIBT
LS //Dependent Variable is TXPIBT
Date: 04/12/00 Time: 09:51
Sample(adjusted): 1971 1997
Included observations: 27 after adjusting endpoints
[Graphs not reproduced: time-series plots whose horizontal axes run from 1974 or 1978 to the mid-1990s; titles and plotted series are not recoverable.]