Wiley Royal Economic Society

The Role of Economic Theory in Modelling the Long Run
Author(s): M. Hashem Pesaran

Source: The Economic Journal, Vol. 107, No. 440 (Jan., 1997), pp. 178-191
Published by: Wiley on behalf of the Royal Economic Society
Stable URL: http://www.jstor.org/stable/2235280 .
The Economic Journal, 107 (January), 178-191. © Royal Economic Society 1997. Published by Blackwell
Publishers, 108 Cowley Road, Oxford OX4 1JF, UK and 238 Main Street, Cambridge, MA 02142, USA.
THE ROLE OF ECONOMIC THEORY IN MODELLING THE LONG RUN*
M. Hashem Pesaran
The notion of 'long-run' is inextricably linked with the concept of 'equilibrium'
in economics, although in much time series econometrics long-run analysis is
conducted without providing an explicit account of the type of equilibrium
theory that may underlie it. This 'atheoretical' approach to long-run modelling
has received a further impetus from cointegration analysis, in particular the
unrestricted VAR approach of Johansen (1988, 1991). In this paper I shall
argue against such a purely statistical approach to long-run modelling, and
discuss the alternative theory-based procedures that could be employed in
practice.
While we have learned a great deal from the theoretical literature on unit
roots and cointegration, empirical applications of this methodology have
focused on the statistical properties of the underlying economic time series,
often at the expense of theoretical insights and economic reasoning. What is
needed is a more satisfactory integration of the cointegration analysis with the
various structural economic modelling approaches that are extant in the
literature. Already important first steps have been made along these lines.1
Estimation of long-run relations can be approached at different levels,
depending on the extent to which short-run dynamics predicted (possibly) by
the economic theory under consideration are taken into account in the
specification of an economic model. Cointegration analysis, at least in the form
it has been implemented so far, does not take account of what theory may
predict concerning short-run dynamics, on the grounds that theory is concerned
with long-run behaviour only, and that such short-run effects are of lower order
of importance and in practice can be adequately modelled within an
unrestricted VAR framework. In contrast, rational expectations and RBC
models impose the short-run dynamics predicted by the theory.
The plan of the paper is as follows. Section I deals with the theoretical issues
involved in establishing links between the different notions of equilibrium in
economics and the long-run. Here I argue in favour of formulating long-run
relations as the steady state solutions of intertemporal optimisation problems
* In preparing this draft I have benefited from discussions with Michael Binder, Graham Elliott, Cheng
Hsiao, Yongcheol Shin, and Ron Smith, and comments by Terry Barker, Peter Boswijk, Paul Fisher, Brian
Henry, Huw Dixon and an anonymous referee. Partial financial support from the ESRC and the Isaac
Newton Trust Fund of Trinity College, Cambridge is gratefully acknowledged.
1 Hsiao (1995), Pesaran and Shin (1995a), and Wickens (1996) discuss the estimation and hypothesis
testing problems in the context of traditional structural models with unit-root forcing variables. Ogaki
(1992), Ogaki and Park (1995), Kashyap and Wilcox (1993), Clarida (1994), Croix and Urbain (1995)
consider the implications of the unit-root processes for estimation of long-run relations based on Euler
equations, obtained from intertemporal-rational expectations optimising models. Finally, Hercowitz and
Sampson (1991), King et al. (1991), Mellander et al. (1992), and Söderlind and Vredin (1995) examine
cointegration properties of some simple real business cycle models.
from economic theory. For empirical purposes the steady state solutions can
then be embedded within a suitable multivariate dynamic model, such as the
vector autoregressive models. Section II addresses the issues of estimation and
inference, and argues that the abandonment of traditional methods in favour
of the new unit-root-cointegration procedures might have been premature. It
shows how slight modifications of the traditional approach originally developed
for analysis of trend-stationary processes rendered them equally valid for the
analysis of difference-stationary processes. Section II also emphasises the
importance of a priori restrictions, obtainable from the steady state solution of
suitable intertemporal optimisation problems, in structural identification of
long-run or cointegrating relations. Section III raises the issue of aggregation
and its implications for estimation of theory-consistent long-run relations, and
discusses alternative methods of dealing with dynamic heterogeneities across
groups in estimation of long-run relations, using panel data. Section IV
provides some concluding remarks.
I. EQUILIBRIUM AND THE LONG-RUN

There is a variety of equilibrium concepts in economics, including the
Marshallian concept of the partial equilibrium, short-run equilibrium, and
temporary equilibrium. None of these notions is, however, appropriate for the
empirical analysis of the long run.2 A satisfactory equilibrium concept for dynamic
economic modelling must explicitly take account of the fact that current period
decisions of firms and households depend on their future expectations, and that
these expectations in turn depend on current and past decisions. Accordingly,
such a notion of equilibrium must be intertemporal, and refer to a state where
expectations and actual outcomes coincide in some suitable probabilistic sense.
A prominent example of such an equilibrium concept is the rational
expectations equilibrium (REE), under which all expectations are fulfilled, in
the sense that all decisions are based on 'correct' conditional probability
distributions of all relevant future events.3
The REE is, however, an extremely demanding concept and at best could
represent a state where all learning is complete, and agents in any period have
no incentive to alter their beliefs about their economic environment. The issues
of the existence and uniqueness of the REE further complicate the task of its
implementation in empirical contexts. Among the various structural modelling
approaches, it is only the RBC methodology that closely adheres to the REE,
and so far this has been achieved at the expense of the underlying model's
predictive performance.4 As far as the modelling of the long-run is concerned,
however, it is not necessary to assume the economy is at the REE at all times.
2 For a discussion of different notions of equilibrium in economics, see, for example, Hahn (1973) and
Milgate (1987).
3 The origins of the intertemporal equilibrium concept are discussed in Milgate (1979) and Phelps (1987).
An excellent early account of this equilibrium notion is given by Hicks (1939, chapter 10). The concept of
the rational expectations equilibrium is reviewed in Grossman (1981), Radner (1982), and Jordan and
Radner (1982).
4 Empirical evaluation of the RBC models is a controversial subject and has been debated in the
November 1995 issue of this JOURNAL. In particular, see Kim and Pagan (1995) and Wickens (1995).
But, as should be evident, long-run relations can be formulated in a theory-consistent
manner, within an intertemporal optimisation framework, without
having to assume that the REE holds in every period. Such long-run relations
will then simply be the steady state solution of the economic model under
consideration, assuming of course that the steady state solution exists, is stable
and unique. For the purpose of empirical analysis, such theory-consistent long-run
relations can then be embedded within a suitable multivariate dynamic
model, such as the vector autoregressive (VAR) model with unrestricted short-run
coefficients. Alternatively, the dynamics of the adjustment to long-run
equilibrium can be restricted by utilising the intertemporal nature of the
underlying optimisation problem as it is done under the rational expectations
hypothesis. Which one of these two approaches is followed very much depends
on how seriously the investigator takes the short-term predictions of the theory.
But, irrespective of how the short-run dynamics are modelled, equating long-
run relations with the long-run equilibrium (or the steady-state solution) of the
economic model has two main advantages: first, it ensures from the outset
that the econometric model embodies a set of theory-consistent steady-state
relations. Secondly, to the extent that the theory implies over-identifying
restrictions on the long-run relations, it allows the model's long-run properties
to be evaluated empirically. The transparency of the econometric model's long-run
properties is also an important useful feature of the approach, not always
shared by traditional econometric models.
The above long-run modelling approach is relatively straightforward to
implement in the case of intertemporal optimisation problems with linear
constraints and quadratic objective functions (namely the so-called 'LQ form'
problem). The resultant Euler equations can be written in the form of the
following canonical multivariate rational expectations model:5
$$A y_t = B y_{t-1} + C\,E(y_{t+1} \mid \Omega_t) + \Gamma x_t + u_t, \tag{1}$$
where $y_t$ is an $n \times 1$ vector of endogenous variables, $x_t$ is a $k \times 1$ vector of
observable exogenous or forcing variables, $u_t$ is an $n \times 1$ vector of unobservable
(to the econometrician) forcing variables, and $\Omega_t$ is the information set
available to the agents at time $t$. Typically, it is assumed that the $u_t$s are serially
uncorrelated with zero means, and that the $x_t$s follow finite-order VAR
processes, possibly with unit roots. The coefficient matrices $A$, $B$, $C$ and $\Gamma$
usually depend on a few 'deep structural' parameters of the underlying
optimisation problem.
The solution properties of (1), and the conditions under which it has a
unique stable solution, are discussed in Binder and Pesaran (1995a), and
depend on whether the quadratic determinantal equation
$$\det(C\lambda^{2} - A\lambda + B) = 0, \tag{2}$$
has pairs of solutions which satisfy the familiar regularity conditions, namely
5 For expositional simplicity I am considering a first-order system. But this does not involve any loss of
generality as all higher-order systems can always be reduced to the first-order canonical form given in (1).
See, for example, Binder and Pesaran (1995a, pp. 140-2), and Wickens (1995).
whether for each pair one root will fall inside the unit circle and the other
outside it. Assuming these conditions are satisfied and denoting the steady state
values by '*', the structural long-run relations associated with (1) will be given
by
$$(A - B - C)\,y^{*} = \Gamma x^{*} + u^{*}. \tag{3}$$
In the case where the process for $x_t$ has a unit root, the relations in (3) will
also be (structural) cointegrating relations in the sense that
$$(A - B - C)\,y_t - \Gamma x_t \sim I(0), \tag{4}$$
where $I(d)$ stands for an integrated process of order $d$. The number of
cointegrating relations is equal to $n$, namely the number of endogenous (or
decision) variables in the model.6 The fact that $A$, $B$, $C$ and $\Gamma$ usually depend
on a small number of 'deep structural' parameters also means that the long-run
relations, (4), in general will be subject to a large number of over-identifying
restrictions that as far as possible should be tested.
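
As a purely illustrative sketch, the root condition attached to the determinantal equation (2) and the long-run relation (3) can be checked numerically in the scalar case n = k = 1, where (2) reduces to a quadratic in a single root. The function names and the parameter values below (A = 1, B = C = 0.4, and a forcing-variable coefficient of 0.1) are assumptions made for the example and do not come from the paper.

```python
import cmath


def quadratic_roots(A, B, C):
    """Roots of C*lam**2 - A*lam + B = 0, the scalar version of equation (2)."""
    if C == 0:
        raise ValueError("C must be non-zero in a forward-looking model")
    disc = cmath.sqrt(A * A - 4.0 * B * C)
    return (A - disc) / (2.0 * C), (A + disc) / (2.0 * C)


def has_unique_stable_solution(A, B, C):
    """True when one root is strictly inside and the other strictly outside the unit circle."""
    moduli = [abs(r) for r in quadratic_roots(A, B, C)]
    inside = sum(m < 1.0 for m in moduli)
    outside = sum(m > 1.0 for m in moduli)
    return inside == 1 and outside == 1


def long_run_multiplier(A, B, C, Gamma):
    """Steady-state response of y* to x* implied by (A - B - C) y* = Gamma x*."""
    return Gamma / (A - B - C)


# Hypothetical parameter values, chosen only for illustration.
A, B, C, Gamma = 1.0, 0.4, 0.4, 0.1
print("roots:", quadratic_roots(A, B, C))
print("unique stable solution:", has_unique_stable_solution(A, B, C))
print("long-run multiplier:", long_run_multiplier(A, B, C, Gamma))
```

In the matrix case the same check would be applied to the full set of roots of (2), paired as described in the text; the scalar version is used here only to keep the example self-contained.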
Outside the LQ form intertemporal optimisation problems usually do not
have closed-form solutions and at best can be approximately solved by numerical
techniques. Consequently, the long-run relations associated with these models
will also be difficult to identify, which in general will depend on the second and
higher-order moments of the forcing variables. It is also difficult to obtain
conditions under which such long-run relations are unique or stable. The RBC
literature avoids most of these difficulties by replacing the original non-LQ
optimisation problem with one based on log-linear and quadratic form
approximations of the constraints and the objective function, respectively. The
result once again is the canonical multivariate rational expectations model, (1).
(For more details see Binder and Pesaran (1995a, section 4.1).) The degree of
approximation error involved depends on the size of the deviations around the
'steady state', and some research in this area suggests that for 'small'
deviations the approximation may be acceptable.7 In any case there are no
guarantees that the approximation procedure will work for large deviations
from the steady state, or in other more complicated settings. The issue of how
to deal empirically with the non-LQ stochastic optimisation problems is an
open one.8 In the rest of this paper I shall confine my discussion to the
optimisation problems with solutions that can be well approximated by linear
decision rules. Although some progress has been made in the area of non-linear
dynamic econometric modelling, the relationship of this literature to economic
theory is still rather tenuous. See, for example, Granger and Terasvirta (1993),
Pesaran and Potter (1993), and Granger (1995).
6 This result follows from the non-singularity of $A - B - C$ when the solution to (1) is unique and stable.
In the unstable case $(A - B - C)$ will be rank deficient and the number of cointegrating relations will be
determined by the rank of $(A - B - C)$. Notice also that in this case one or more elements of $y_t$ must be
integrated even if all the elements of $x_t$ are trend-stationary.
7 See, for example, Christiano (1990), Danthine et al. (1989), and Dotsey and Mao (1992).
8 For further discussion of this problem and an alternative approximation procedure based on different
ways of proxying for the unknown Lagrange Multipliers (or shadow prices) that enter the solution of the
optimisation problem see Pesaran and Smith (1995a). Also see Judd (1991), Marcet (1994) and the papers
in the recent Special Issue of the Journal of Applied Econometrics edited by Kapteyn et al. (1995).
II. ESTIMATION AND INFERENCE
Over the past decade, largely under the influence of the unit root literature,
views about how to estimate long-run relations and how to make inferences
about them, have undergone important changes. Before the emergence of the
unit root literature, the traditional approach was to model relationships
between $y_t$ (the decision or endogenous variables), and $x_t$ (the forcing or
exogenous variables) in the form of stationary distributed lags or autoregressive-
distributed lags (ARDL), and then use standard asymptotic maximum
likelihood theory for estimation and inference on the long-run relations implicit
in the ARDL model. (For a survey of this literature see, for example, Hendry
et al. (1984).) The cointegrating literature, pioneered by Granger (1986), and
Engle and Granger (1987), by focussing on the relationships between unit-root
processes, seems to suggest that in the presence of unit roots, this traditional
approach is no longer applicable and should not be followed. Instead, the long-run
analysis must be conducted either in the unrestricted VAR framework of
Johansen (1988, 1991, 1995) and Reinsel and Ahn (1992), or the semi-parametric,
triangular form advocated by Phillips (1991), and its various
modifications/extensions in Phillips and Hansen (1990), Phillips and Loretan
(1991), Phillips (1995) and Saikkonen (1991, 1993).9 This literature implicitly
suggests that the traditional methods of estimation and inference are wrong. In
this section I shall argue that as far as estimation and inference involving long-
run parameters are concerned, this conclusion is premature. The main
contribution of the cointegration literature, however, has been to focus
attention on the issue of testing for the existence of long-run relations, often
taken for granted in the traditional approach.
Given the complexity of the issues involved I shall advance the arguments in
steps; starting first with the case where there is only one long-run relation (i.e.
n = I in the notation of the previous section); distinguishing between the cases
where the forcing variables are determined exogenously and when they are not.
II.1. Case of a Single Long-Run Relation

For pedagogical reasons I shall illustrate the main issues using the following
simple bivariate model:
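$$\Delta y_t = a(1-\phi) - (1-\phi)\,y_{t-1} + \gamma x_{t-1} + u_t, \tag{5}$$
$$\Delta x_t = b(1-\rho) - (1-\rho)\,x_{t-1} + \delta y_{t-1} + e_t, \tag{6}$$
for $t = 1, 2, \ldots, T$, and assume that
$$\begin{pmatrix} u_t \\ e_t \end{pmatrix} \sim \text{iid}(0, \Sigma), \qquad \Sigma = \begin{pmatrix} \sigma_{uu} & \sigma_{ue} \\ \sigma_{ue} & \sigma_{ee} \end{pmatrix}. \tag{7}$$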
Suppose that there exists a single long-run relationship between yt and xt. (The
case of multiple long-run relations will be considered in Section II.2.) Then it
9 For further references and excellent reviews of the unit-root and cointegration literature see Campbell
and Perron (1991) and Watson (1994).
must be that either $|\phi| < 1$, $\gamma \neq 0$ and $\delta = 0$, or $|\rho| < 1$, $\delta \neq 0$ and $\gamma = 0$.
Otherwise, there will be two long-run relations to choose from, and the analysis
will be subject to an identification problem. Therefore, without loss of
generality I shall assume that $|\phi| < 1$, $\gamma \neq 0$ and $\delta = 0$. It is important to
recognise that the existence of a long-run relationship between $y_t$ and $x_t$ does not
rest on whether $x_t$ is I(1) (i.e. $\rho = 1$).10
When $\sigma_{ue}$ is non-zero, the derivation of the long-run relationship between $y_t$
and $x_t$ should allow for the contemporaneous feedback that exists between the
two variables. Under the asymptotically innocuous assumption that $(u_t, e_t)$ are
jointly normally distributed we have
$$u_t = (\sigma_{ue}/\sigma_{ee})\,e_t + \eta_t,$$
where $\sigma_{ue}/\sigma_{ee}$ represents the population coefficient of the regression of $u_t$ on $e_t$,
and $\eta_t$ is distributed independently of $e_t$. Using this result in (5) we now have:
$$\Delta y_t = c - (1-\phi)\,y_{t-1} + [\gamma + (\sigma_{ue}/\sigma_{ee})(1-\rho)]\,x_{t-1} + (\sigma_{ue}/\sigma_{ee})\,\Delta x_t + \eta_t, \tag{8}$$
where $c = a(1-\phi) - b(\sigma_{ue}/\sigma_{ee})(1-\rho)$, and $x_t$ is distributed independently of $\eta_t$.
When $\delta = 0$, equations (8) and (6) are 'observationally' equivalent to the
original bivariate model (5) and (6); the main difference being that in (8), $x_t$
can be treated as strictly exogenous, even if $\sigma_{ue} \neq 0$, but this is not true of $x_t$ in
(5). Irrespective of whether $x_t$ is strictly exogenous in (5), using (8) the long-run
relationship between $y_t$ and $x_t$ is given by
$$y_t = \alpha + \theta x_t + \nu_t, \tag{9}$$
where $\nu_t \sim I(0)$, and
$$\alpha = c/(1-\phi), \tag{10}$$
$$\theta = [\gamma + (\sigma_{ue}/\sigma_{ee})(1-\rho)]/(1-\phi). \tag{11}$$

The above long-run relationship has a number of important features. First,
it allows for the direct as well as the indirect effects of changes in xt on yt that
take place through the contemporaneous dependence between ut and et.
Secondly, the long-run coefficients, $\alpha$ and $\theta$, will be invariant to the parameters
of the $x_t$ process either when $\sigma_{ue} = 0$ (i.e. $x_t$s are strictly exogenous), or when
$\rho = 1$ (i.e. $x_t$s are first-difference stationary). In
the stationary case with $x_t$s not strictly exogenous, the long-run coefficients
depend on the parameters of the $x_t$ process and their estimation should be based
on (10) and (11), rather than on the familiar expressions obtained using (5),
namely $a$ and $\gamma/(1-\phi)$.
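
The dependence of the long-run coefficients on the parameters of the forcing process can be made concrete with a small calculation. The sketch below simply evaluates the long-run intercept and slope implied by the ARDL form (8) and shows that they collapse to the familiar expressions when the covariance between the two errors is zero or when the forcing variable has a unit root. The function name, argument names and numerical values are assumptions made for illustration only.

```python
def long_run_coefficients(a, b, phi, gamma, rho, sigma_ue, sigma_ee):
    """Long-run intercept and slope implied by the ARDL form (8) of the bivariate model.

    phi and rho are the autoregressive parameters of the y and x equations,
    gamma is the coefficient of the lagged x in the y equation, and
    sigma_ue / sigma_ee is the regression coefficient of u on e.
    """
    if abs(phi) >= 1.0:
        raise ValueError("a long-run relation requires |phi| < 1")
    w = sigma_ue / sigma_ee
    c = a * (1.0 - phi) - b * w * (1.0 - rho)   # intercept of (8)
    alpha = c / (1.0 - phi)                     # long-run intercept
    theta = (gamma + w * (1.0 - rho)) / (1.0 - phi)  # long-run slope
    return alpha, theta


# Hypothetical values: stationary x (rho = 0.8) with correlated errors.
print(long_run_coefficients(a=1.0, b=2.0, phi=0.5, gamma=0.5, rho=0.8,
                            sigma_ue=0.5, sigma_ee=1.0))
# Same model with uncorrelated errors: reduces to (a, gamma / (1 - phi)).
print(long_run_coefficients(a=1.0, b=2.0, phi=0.5, gamma=0.5, rho=0.8,
                            sigma_ue=0.0, sigma_ee=1.0))
# Same model with a unit root in x (rho = 1): again (a, gamma / (1 - phi)).
print(long_run_coefficients(a=1.0, b=2.0, phi=0.5, gamma=0.5, rho=1.0,
                            sigma_ue=0.5, sigma_ee=1.0))
```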
As far as estimation of the long-run coefficients, $\alpha$ and $\theta$, is concerned, in
a recent paper, Pesaran and Shin (1995b) have shown that valid asymptotic
inferences on the short-run and long-run parameters can be made, using the
least squares estimates of the ARDL model (8), once the order of the ARDL
model is appropriately augmented to allow for possible contemporaneous
correlations between $u_t$ and $e_t$. Therefore, the ARDL method continues to be
10 The hypothesis that there exists no long-run relationship between $y_t$ and $x_t$ can be tested by testing the
restrictions $\phi = 1$ and $\gamma = 0$, against the alternative that $|\phi| < 1$ and $\gamma \neq 0$. This is a non-standard testing
problem, irrespective of whether $x_t$ is I(0) or I(1). See Pesaran, Shin and Smith (1996) on how to test
$\phi = 1$ and $\gamma = 0$ when it is not known a priori whether $x_t$ is I(0) or I(1).

applicable even if $x_t$s are endogenous, irrespective of whether they are I(1) or
not.
The order-augmented traditional ARDL approach has the additional
advantage that it does not require pre-testing of the regressors for the presence
of unit roots, a problem that afflicts other approaches to estimation of long-run
relations, such as the fully modified OLS approach of Phillips and Hansen
(1990). The pre-testing is particularly problematic in the unit-root-cointegration
literature where the power of the unit-root tests is typically very
low, and there is a switch in the distribution function of the test statistics as one
or more roots of the xt process approach unity. It is useful to illustrate this point
in the context of the following simple bivariate model recently analysed by
Cavanagh et al. (1995):
$$y_t = a + \gamma x_{t-1} + u_t, \tag{12}$$
$$\Delta x_t = b - (1-\rho)\,x_{t-1} + \sum_{i=1}^{s-1} \lambda_i \Delta x_{t-i} + e_t. \tag{13}$$
The authors consider the problem of testing $\gamma = \gamma_0$ when it is not known
whether $\rho = 1$ or $\rho < 1$, and show that the usual two-step testing procedures
where in the first step $x_t$ is classified as I(1) or I(0) exhibit large size distortions
as $\rho$ is allowed to converge to its limiting value of unity with the sample size
($T \to \infty$).11 However, suppose that the primary focus of the analysis is not $\gamma$,
but the long-run effect of $x_t$ on $y_t$, denoted by $\theta$. In the present model where
$\phi = 0$, it is clear from (11) that $\theta$ is given by
$$\theta = \gamma + (1-\rho)\,\sigma_{ue}/\sigma_{ee},$$
and we have $\theta = \gamma$ only if $\rho = 1$ and/or $\sigma_{ue} = 0$. Therefore, if we do not know
whether $\rho = 1$ or not, and we wish to allow for the possibility that $\sigma_{ue} \neq 0$, the
appropriate long-run coefficient is $\theta$ and not $\gamma$, and the relevant estimating
equation is not (12), but is given by
$$y_t = a^{*} + \theta x_{t-1} + \sum_{i=0}^{s-1} \lambda_i^{*} \Delta x_{t-i} + \eta_t, \tag{14}$$
in which $x_t$ is strictly exogenous for $\theta$. Hence standard t- and F-tests applied to
the coefficients of (14) will be valid irrespective of whether $x_t$ is I(1) or not, and
the pre-testing problem does not arise12 if the parameter of interest is the long-run
coefficient, $\theta$. Clearly, the pretesting problem discussed in Cavanagh et al.
(1995) will continue to be applicable if the parameter of interest is $\gamma$.
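
The distinction between the two parameters of interest can be illustrated by a small simulation. The sketch below generates data from the bivariate model with no lagged differences in the forcing equation (s = 1), a stationary forcing variable and contemporaneously correlated errors, and then compares least squares applied to the unaugmented regression of the dependent variable on the lagged forcing variable with least squares applied to the regression augmented by the current first difference of the forcing variable. All numerical values, the seed, and the helper names are assumptions chosen for illustration; the estimator is plain OLS written with the Python standard library, not the procedure of any particular package.

```python
import math
import random


def solve_linear_system(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(M[i]) + [v[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


def simulate(T, a, gamma, b, rho, w, sd_eta, seed):
    """Simulate y_t = a + gamma*x_{t-1} + u_t and x_t = b + rho*x_{t-1} + e_t,
    with u_t = w*e_t + eta_t so that u and e are contemporaneously correlated."""
    rng = random.Random(seed)
    x_prev = 0.0
    ys, x_lags, dxs = [], [], []
    for _ in range(T):
        e = rng.gauss(0.0, 1.0)
        u = w * e + rng.gauss(0.0, sd_eta)
        x = b + rho * x_prev + e
        ys.append(a + gamma * x_prev + u)
        x_lags.append(x_prev)
        dxs.append(x - x_prev)
        x_prev = x
    return ys, x_lags, dxs


# Hypothetical design: gamma = 1, rho = 0.5, sigma_ue / sigma_ee = 0.5,
# so the long-run coefficient is theta = 1 + (1 - 0.5) * 0.5 = 1.25.
ys, x_lags, dxs = simulate(T=5000, a=1.0, gamma=1.0, b=0.0, rho=0.5,
                           w=0.5, sd_eta=math.sqrt(0.75), seed=0)
naive = ols([[1.0, xl] for xl in x_lags], ys)                       # unaugmented
augmented = ols([[1.0, xl, dx] for xl, dx in zip(x_lags, dxs)], ys)  # augmented
gamma_hat = naive[1]
theta_hat = augmented[1]
print("estimate from unaugmented regression:", gamma_hat)
print("estimate from augmented regression:  ", theta_hat)
```

Under this design the unaugmented regression should settle near the short-run coefficient and the augmented regression near the long-run coefficient; the exact figures depend on the seed and sample size.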
In using the ARDL approach to analyse the long-run, it is however
important that an appropriate choice of the order of the ARDL model is made.
Monte Carlo results reported in Pesaran and Shin (1995b) suggest that a two-step
strategy, whereby the lag order of the ARDL model is first selected using
either the Akaike Information Criterion (AIC) or the Schwarz Bayesian
Criterion (SC), and then the long-run coefficients are estimated on the basis of
the selected model, performs reasonably well in medium-sized samples (around
100 or more).13
11 This is basically due to the fact that when $E(u_t e_t) = \sigma_{ue} \neq 0$, the asymptotic distribution of the OLS
estimator of $\gamma$ will switch from being normal when $\rho < 1$ to having a mixture of normal and non-standard
Dickey-Fuller distributions when $\rho = 1$. The weight on the non-standard component of this distribution is
proportional to $\sigma_{ue}$ and therefore the non-standard component of the distribution vanishes when $\sigma_{ue} = 0$.
12 It is worth noting that in this particular example the t-test applied to the OLS estimator of $\theta$ in (14)
will be exact.
II.2. Case of Multiple Long-Run Relations

Similar considerations also apply when the number of long-run relations is
more than one. Except that, in this more general case, the problems of
estimation and inference are now further complicated due to the simultaneous
determination of the endogenous variables, and the identification problem that
must be resolved. In situations where long-run relations are obtained as the
steady state solution of an intertemporal optimisation problem, the identi-
fication of long-run coefficients is achieved by noting that in practice the
number of 'deep structural' coefficients (such as the parameters of taste and
technology) are usually much less than the number of elements in the long-run
coefficient matrices, $A - B - C$ and $\Gamma$. The necessary and sufficient conditions
for identification of the long-run relations are discussed in Pesaran and Shin
(1995a) and Boswijk (1995). In the case of the long-run relations, (3), there
must be n restrictions in each of the n long-run relations (including the
normalisation restriction), for the long-run relations to be exactly identified.
Assuming that long-run relations are identified, their estimation can be
carried out by incorporating them within the following stable vector
VARDL(p, q) model:
$$\Phi(L)\,y_t = a_0 + a_1 t + \Gamma x_t + \sum_{i=1}^{q-1} \Gamma_i \Delta x_{t-i} + u_t, \tag{15}$$
where $\Phi(L) = \Phi_0 - \Phi_1 L - \cdots - \Phi_p L^{p}$, all the roots of $\det[\Phi(z)] = 0$ fall
outside the unit circle, and $\Phi(1) = A - B - C$, and suppose that the $k \times 1$
vector, $x_t$, follows the VAR(s) process,
$$\Delta x_t = b_0 + (D b_1)\,t - D x_{t-1} + \sum_{i=1}^{s-1} R_i \Delta x_{t-i} + e_t, \qquad t = 1, 2, \ldots, T, \tag{16}$$
where $D$ and $R_i$, $i = 1, 2, \ldots, s-1$, are $k \times k$ matrices of fixed coefficients. I have
explicitly allowed for the presence of deterministic trends in the $(y_t, x_t)$ process,
although this does not necessarily mean that the long-run relationships
between $y_t$ and $x_t$ also must contain deterministic trends.
The two polar cases of trend-stationary and first-difference stationary
regressors correspond to the cases where $D$ is full rank, and when $D$ is rank
deficient, respectively.
In the case where the $x_t$s are strictly exogenous, standard econometric
methods from the dynamic simultaneous equation literature can be applied to
(15) irrespective of whether $x_t$s are I(1) or I(0). (On this see Hsiao (1995).)
13 Notice that the consistency property of the SC as a model selection criterion continues to hold even if
the underlying variables are I(1). For a proof of this result in the case of univariate models see Chan and Wei
(1988).

When the $x_t$s are not strictly exogenous, long-run relations are no longer
given by (3) and the indirect effects of $x_t$ on $y_t$ resulting from the non-zero
correlation between $u_t$ and $e_t$ must also be taken into account. Following a
similar procedure as in Section II.1 we obtain the following structural long-run
relations:14
$$(A - B - C)\,y^{*} = (\Gamma + \Sigma_{ue}\Sigma_{ee}^{-1} D)\,x^{*} + \eta^{*}, \tag{17}$$
where $\Sigma_{ue} = E(u_t e_t')$ and $\Sigma_{ee} = E(e_t e_t')$. When $\Sigma_{ue} \neq 0$, these relations reduce to
the theoretical specification given by (3) only if $x_t$s are I(1) and are not
themselves cointegrated, i.e. if $D = 0$. This case corresponds to identifying the
$x_t$s as the model's common stochastic trends (cf. Stock and Watson (1988)).
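In general, when $x_t$s are not strictly exogenous, the econometric analysis of
the long-run relations can be carried out by embedding (17) in a VAR model
that combines the two sets of equations given in (15) and (16) for $y_t$ and $x_t$. Let
$z_t = (y_t', x_t')'$, then the combined model can be written as:
$$\Delta z_t = d_0 + (\Pi d_1)\,t - \Pi z_{t-1} + \sum_{i=1}^{p-1} V_i \Delta z_{t-i} + v_t, \tag{18}$$
where $v_t = (u_t', e_t')'$, $V_i$ are $m \times m$ matrices of fixed coefficients, $m = n + k$, and
$$\underset{m \times m}{\Pi} = \begin{pmatrix} \underset{n \times n}{A - B - C} & \underset{n \times k}{-\Gamma} \\ \underset{k \times n}{0} & \underset{k \times k}{D} \end{pmatrix}, \tag{19}$$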
which is the VAR(p) specification that underlies the cointegration analysis of
Johansen.15 Notice that the long-run relationships implied by (18) are the same
as those given by (17).16 The estimation of (18) can now proceed by the ML
method, taking account of the long-run restrictions implied by the economic
theory on the elements of the coefficient matrices $A$, $B$, $C$, $\Gamma$, and the unit root
restrictions (if any) involving the elements of $D$. Pesaran and Shin's (1995a)
long-run structural modelling approach is directly applicable to (18), and is
made fully operational (with and without I(1) exogenous variables) in Microfit
4.0 (see Pesaran and Pesaran (1996)).
In the absence of any a priori restrictions on $\Pi$, the long-run relations are not
identified, and the best that can be done is to test rank restrictions on $\Pi$,
assuming that $z_t$s are known to be I(1). In the case where the underlying
theoretical model has a unique stable solution, $A - B - C$ will be non-singular
14 For expositional simplicity I have abstracted from the possible effects of the deterministic trends in the
long run relations.
15 In many applications of Johansen's procedure the trend coefficients in (18) are specified independently
of the long-run matrix coefficients, $\Pi$. However, as argued in Pesaran and Shin (1995a), the restrictions on
the coefficients of the trends implicit in formulation (18) that require the trend-coefficients to lie in the space
spanned by the columns of $\Pi$, ensure that the nature of the trends in the levels of $z_t$ remains invariant to the
rank of $\Pi$. But when the trend-coefficients are left unrestricted the model (18) has the unsatisfactory property
that $z_t$ will have linear deterministic trends when $\Pi$ is full rank, but $z_t$ will contain quadratic trends when
$\Pi$ is rank deficient.
16 Using (18), and ignoring the deterministic variables, the long-run relations are given by $\Pi z^{*} = v^{*}$,
where $u^{*} = \Sigma_{ue}\Sigma_{ee}^{-1} e^{*} + \eta^{*}$, with $E(\eta^{*} \mid x^{*}) = 0$. Hence, it readily follows that
$(A - B - C)\,E(y^{*} \mid x^{*}) = (\Gamma + \Sigma_{ue}\Sigma_{ee}^{-1} D)\,x^{*}$.

and rank($\Pi$) = $n$ + rank($D$). Therefore, under the theory, rank deficiency of $\Pi$
is directly caused by the rank deficiency of $D$, the long-run coefficient matrix
of the 'forcing variables' (see (16)). In the case where the forcing variables are
first-difference stationary, and are not themselves cointegrated (i.e. they can be
viewed as common stochastic trends), $D = 0$ and rank($\Pi$) = $n$. In this case the
$n$ structurally identified cointegrating vectors, in Johansen's notation, are given
by
$$\underset{n \times m}{\beta'} = \bigl(\underset{n \times n}{A - B - C},\; \underset{n \times k}{-\Gamma}\bigr). \tag{20}$$
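
The rank result stated above can be checked mechanically for small examples. The following sketch builds the block-triangular long-run matrix of the combined VAR from three ingredients (the non-singular long-run matrix of the decision variables, the coefficient matrix of the forcing variables, and the long-run matrix of the forcing-variable process) and computes its rank by Gaussian elimination. The helper names, the tolerance, and the numerical values with n = 1 and k = 2 are assumptions made for illustration.

```python
def matrix_rank(M, tol=1e-10):
    """Rank of a small dense matrix via Gaussian elimination with partial pivoting."""
    rows = [[float(v) for v in r] for r in M]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            f = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= f * rows[rank][c]
        rank += 1
    return rank


def build_pi(abc, gamma, d):
    """Stack the blocks [[A-B-C, -Gamma], [0, D]] into one (n+k) x (n+k) matrix."""
    n = len(abc)
    top = [list(abc[i]) + [-g for g in gamma[i]] for i in range(n)]
    bottom = [[0.0] * n + list(row) for row in d]
    return top + bottom


# Hypothetical example with n = 1 decision variable and k = 2 forcing variables.
abc = [[0.2]]            # A - B - C, non-singular
gamma = [[0.1, 0.3]]     # coefficients of the forcing variables
for d in ([[0.0, 0.0], [0.0, 0.0]],     # forcing variables I(1), not cointegrated
          [[0.5, 0.0], [0.0, 0.0]],     # rank(D) = 1
          [[0.5, 0.0], [0.0, 0.3]]):    # D full rank
    print(matrix_rank(d), matrix_rank(build_pi(abc, gamma, d)))
```

In each case the rank of the stacked matrix should equal the number of decision variables plus the rank of the forcing-variable block, which is the relationship exploited in the text.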
The main advantage of the above 'full' system approach over the traditional
simultaneous equations system approach lies in the fact that, in principle, the
'full' system approach allows one to test the theory's prediction that the
number of long-run relations is in fact equal to the number of the decision
variables, n. Unfortunately, the implementation of this test in practice has
encountered two major difficulties: First, to ensure that all the variables are
I(1) an important element of pre-testing will be involved. Secondly, when the
number of variables in the 'full' system exceeds 10, the cointegration test tends
to have rather poor small sample properties particularly for reasonable choices
for the order of the underlying VAR model.
III. ESTIMATION OF LONG-RUN RELATIONS USING PANELS
One of the difficulties in establishing links between economic theory and
empirical analysis is the 'aggregation' problem, which can arise in a variety of
forms; temporal aggregation across commodities, aggregation across firms,
households, or regions. While in empirical analysis, some degree of aggregation
is inevitable, the choice of the level of aggregation needs to be decided in the
light of both theoretical and practical considerations. The discussion in Section
I on the possible links between long-run relations and the steady state solutions
of an economic model assumes away the cross-sectional aggregation problem
by implicitly considering an optimisation problem faced by a single
'representative' agent. This approach ignores the often important differences
that exist across agents and can lead to inconsistent estimates and invalid
inferences. In situations where time series data on individuals or groups of
individuals are available it is possible to allow for some degree of heterogeneity
across different agents' decision rules when analysing the long-run. In what
follows I confine my attention to the relatively simple case where the decision
variable of the ith agent (or a representative agent of a given group) at time
t can be approximated by the following simple ARDL model:
$$ y_{it} = \alpha_i + \phi_i y_{i,t-1} + \gamma_i' x_{it} + \delta_i' z_t + u_{it}, \qquad (21) $$

for $i = 1, 2, \ldots, N$, $t = 1, 2, \ldots, T$, where $x_{it}$ is a $k \times 1$ vector of agent-specific
forcing variables and $z_t$ is a vector of 'common' or 'economy-wide' forcing
variables. The inclusion of the economy-wide variables in (21) can be
theoretically justified in the case of intertemporal optimisation problems
involving common forcing variables and/or social interactions.17 In what
follows I shall assume that $x_{it}$ and $z_t$ are strictly exogenous and follow general
VAR processes, possibly containing unit roots. The exogeneity assumption can
be relaxed along the lines discussed in Section II.1. Aggregation of (21) across
$i$ does not, in general, lead to a finite-order ARDL model in $y_t = \sum_{i=1}^{N} y_{it}$ and
$x_t = \sum_{i=1}^{N} x_{it}$, and estimates of the long-run relation based on a finite-order
ARDL model in the aggregates $y_t$, $x_t$ and $z_t$ will, in general, be inconsistent.18
The application of standard pooling techniques that ignore the heterogeneity
of the slope coefficients also leads to inconsistent estimates of the long-run
coefficients and is not to be recommended when there are important differences
across $\phi_i$ and $\gamma_i$.19 One possibility would be to estimate the equations in (21) for
each $i$ separately, and then consider the average estimates of the individual
long-run coefficients, $\gamma_i/(1-\phi_i)$ and $\delta_i/(1-\phi_i)$, as the estimator of the
coefficients in the 'average' long-run relation between $y_t$, $x_t$ and $z_t$. Pesaran
and Smith (1995b) refer to this as the 'mean group estimator' and show that
it is a consistent estimator of the average long-run coefficients defined by
$E[\gamma_i/(1-\phi_i)]$ and $E[\delta_i/(1-\phi_i)]$. This procedure, however, treats the decision
rule of each individual (or group) separately and does not exploit any
'common features' that may exist across different decision rules. In cases where
the theory predicts the same long-run relationship across groups, but does not
necessarily require the short-run adjustments to equilibrium to be the same, it
would be possible to take advantage of the extra power that pooling provides
without introducing inconsistencies that arise when heterogeneity of short-run
dynamics across groups is ignored. Once again standard normal asymptotic
theory can be employed to analyse long-run relations within dynamic panels.
The efficiency gains associated with pooling can be considerable. For example,
in the case where the forcing variables $x_{it}$ and $z_t$ are first-difference stationary,
such a pooled estimator of the long-run coefficients turns out to be $T\sqrt{N}$-
consistent. See Pesaran and Smith (1996) for further details.
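The mean group procedure described in this section can be sketched in a few lines. What follows is a minimal illustration, not the authors' code. It assumes a simplified version of (21) with a single agent-specific forcing variable that follows a random walk and with no common variables $z_t$. The function names, parameter ranges, sample sizes and seed are all hypothetical.

```python
import random
from statistics import fmean


def simulate_unit(alpha, phi, gamma, periods, rng):
    """Simulate y_t = alpha + phi*y_{t-1} + gamma*x_t + u_t, with x_t a random walk."""
    x, y = 0.0, alpha / (1.0 - phi)
    xs, ys = [], []
    for _ in range(periods):
        x += rng.gauss(0.0, 1.0)                                # I(1) forcing variable
        y = alpha + phi * y + gamma * x + rng.gauss(0.0, 1.0)   # unit-specific ARDL
        xs.append(x)
        ys.append(y)
    return xs, ys


def long_run_coefficient(xs, ys):
    """OLS of y_t on (1, y_{t-1}, x_t); returns gamma_hat / (1 - phi_hat)."""
    y_now, y_lag, x_now = ys[1:], ys[:-1], xs[1:]
    my, ml, mx = fmean(y_now), fmean(y_lag), fmean(x_now)
    dy = [v - my for v in y_now]   # demeaning absorbs the intercept
    dl = [v - ml for v in y_lag]
    dx = [v - mx for v in x_now]
    s_ll = sum(a * a for a in dl)
    s_xx = sum(a * a for a in dx)
    s_lx = sum(a * b for a, b in zip(dl, dx))
    s_ly = sum(a * b for a, b in zip(dl, dy))
    s_xy = sum(a * b for a, b in zip(dx, dy))
    det = s_ll * s_xx - s_lx * s_lx
    phi_hat = (s_ly * s_xx - s_xy * s_lx) / det    # solve the 2x2 normal equations
    gamma_hat = (s_xy * s_ll - s_ly * s_lx) / det
    return gamma_hat / (1.0 - phi_hat)


def mean_group_demo(num_units=50, periods=200, seed=0):
    """Return (average true long-run coefficient, mean group estimate)."""
    rng = random.Random(seed)
    true_lr, est_lr = [], []
    for _ in range(num_units):
        phi = rng.uniform(0.2, 0.6)     # heterogeneous slopes (assumed ranges)
        gamma = rng.uniform(0.5, 1.5)
        alpha = rng.uniform(-1.0, 1.0)
        xs, ys = simulate_unit(alpha, phi, gamma, periods, rng)
        true_lr.append(gamma / (1.0 - phi))
        est_lr.append(long_run_coefficient(xs, ys))
    return fmean(true_lr), fmean(est_lr)


if __name__ == "__main__":
    true_avg, mg_estimate = mean_group_demo()
    print(f"average true long-run coefficient: {true_avg:.3f}")
    print(f"mean group estimate:               {mg_estimate:.3f}")
```

Each unit is estimated separately and only the long-run ratios are averaged, so differences in the short-run coefficients across units do not contaminate the average.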
IV. CONCLUDING REMARKS

Modelling long-run relations in economics and empirical analysis of their
properties has been the focus of much attention, particularly since the
emergence of the unit-root-cointegration literature. In this paper I have argued
against the atheoretical use of cointegration analysis and have emphasised the
importance of using intertemporal optimisation techniques from economic
theory in the formulation and identification of the long-run relations in applied
17 Economic models involving social interactions have been investigated, for example, by Bernheim
(1994), Brock (1993), Brock and Durlauf (1995), and Cooper and John (1988). Also see Binder and Pesaran
(1995b) for an analysis of solution and empirical implementation of intertemporal models with social
interactions under a variety of information structures; in particular for the case where information is
disparate across agents.
18 Aggregation will not be a problem when the slope coefficients $\phi_i$ and $\gamma_i$ are the same across agents,
and/or agent-specific forcing variables, $x_{it}$, are perfectly correlated with some 'macro' variables other than
$z_t$. See, for example, Pesaran et al. (1989), Granger (1990), and Lippi and Forni (1990).
19 On this see Pesaran and Smith (1995b). The extent of the inconsistency in the estimates of the
long-run coefficients in small samples has been investigated by Monte Carlo techniques in a companion paper by
Pesaran, Smith et al. (1996).
economics. But, in empirical implementation of this approach special attention
needs to be paid to the problems of dynamic non-linearities in the case of non-LQ
intertemporal optimisation models, and aggregation in the case of non-
representative agent models.
As far as the issues of estimation and inference are concerned, it now appears
that the abandonment of the traditional econometric methods in favour of
cointegration techniques has been premature, particularly in the light of the
often serious pre-testing problems that surround the cointegration analysis.
Finally, it is important to recognise that long-run restrictions alone cannot
discriminate between theories. As Miron (1991) illustrates, there are many
theories with the same long-run properties but with very different short-run
features and policy implications. The usefulness of long-run analysis will also be
crucially dependent on the speed of convergence to the equilibrium (see
Pesaran and Shin (1996)). The slower the rate of convergence to equilibrium,
the more difficult will be the task of estimating the long-run relations, and the
less likely it is that the estimated relationships will be of much practical use in
policy analysis and decision making.
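One common way to put a number on the speed of convergence is the half-life of a deviation from equilibrium. The sketch below is an illustration rather than anything proposed in the paper. It assumes a first-order adjustment of the kind in (21), so that a deviation shrinks by the factor $\phi$ each period, with $0 < \phi < 1$. The function name is hypothetical.

```python
import math


def half_life(phi: float) -> float:
    """Periods needed for half of a deviation from equilibrium to disappear.

    Assumes the deviation decays geometrically as phi**h with 0 < phi < 1
    (an illustrative assumption, not a result from the paper).
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(phi)


if __name__ == "__main__":
    for phi in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"phi = {phi:4.2f}: half-life of about {half_life(phi):5.1f} periods")
```

The half-life rises sharply as $\phi$ approaches one, which is the situation in which the long-run relation is hardest to estimate and least informative for policy.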
Trinity College, Cambridge
REFERENCES
Bernheim, D. (1994). 'A theory of conformity.' Journal of Political Economy, vol. 102, pp. 841-77.
Binder, M. and Pesaran, M. H. (1995a). 'Multivariate rational expectations models and macroeconomic
modelling: a review and some new results.' In Handbook of Applied Econometrics: Macroeconomics (ed. M. H.
Pesaran and M. Wickens). Oxford: Basil Blackwell.
Binder, M. and Pesaran, M. H. (1995b). 'Decision-making in the presence of heterogeneous information and
social interactions.' DAE Working Paper No. 9537, Department of Applied Economics, University of
Cambridge.
Boswijk, H. P. (1992). Cointegration, Identification, and Exogeneity: Inference in Structural Error Correction
Models. Tinbergen Institute Research Series.
Boswijk, H. P. (1995). 'Identifiability of cointegrated systems.' Tinbergen Institute.
Brock, W. A. (1993). 'Pathways to randomness in economy: emergent nonlinearity and chaos in economics
and finance.' Estudios Economicos, vol. 8, pp. 3-55.
Brock, W. A. and Durlauf, S. N. (1995). 'Discrete choice with social interactions.' Mimeo, University of
Wisconsin.
Campbell, J. Y. and Perron, P. (1991). 'Pitfalls and opportunities: what macroeconomists should know
about unit roots.' In National Bureau of Economic Research, Macroeconomics Annual (ed. O. J. Blanchard and
S. Fischer). Cambridge, MA: M.I.T. Press.
Cavanagh, C., Elliott, G. and Stock, J. H. (1995). 'Inference in models with nearly integrated regressors.'
Econometric Theory, vol. 11, pp. 1131-47.
Chan, N. H. and Wei, C. Z. (1988). 'Limiting distributions of least squares estimates of unstable
autoregressive processes.' The Annals of Statistics, vol. 16, pp. 367-401.
Christiano, L. J. (1990). 'Linear-quadratic approximation and value-function iteration: a comparison.'
Journal of Business and Economic Statistics, vol. 8, pp. 99-113.
Clarida, R. (1994). 'Cointegration, aggregate consumption, and the demand for imports: a structural
econometric investigation.' American Economic Review, vol. 84, pp. 298-308.
Cooper, R. and John, A. (1988). 'Coordinating coordination failures in Keynesian models.' Quarterly Journal
of Economics, vol. 103, pp. 441-63.
de la Croix, D. and Urbain, J.-P. (1995). 'Intertemporal substitution in import demand and habit
formation.' Unpublished paper, University of Louvain.
Danthine, J.-P., Donaldson, J. B. and Mehra, R. (1989). 'On some computational aspects of equilibrium
business cycle theory.' Journal of Economic Dynamics and Control, vol. 13, pp. 449-70.
Dotsey, M. and Mao, C. S. (1992). 'How well do linear approximation methods work? Results for
suboptimal dynamic equilibria.' Journal of Monetary Economics, vol. 29, pp. 25-58.
Engle, R. F. and Granger, C. W. J. (1987). 'Co-integration and error correction: representation, estimation
and testing.' Econometrica, vol. 55, pp. 251-76.
Granger, C. W. J. (1986). 'Developments in the study of cointegrated economic variables.' Oxford Bulletin of
Economics and Statistics, vol. 48, pp. 213-28.
Granger, C. W. J. (1990). 'Aggregation of time series variables: a survey.' In Disaggregation in Econometric
Modelling (ed. T. Barker and M. H. Pesaran). London: Routledge.
Granger, C. W. J. (1995). 'Modelling non-linear relationship between extended memory variables.'
Econometrica, vol. 63, pp. 265-79.
Granger, C. W. J. and Terasvirta, T. (1993). Modelling Non-Linear Economic Relationships. Oxford: Oxford
University Press.
Grossman, S. J. (1981). 'An introduction to the theory of rational expectations under asymmetric
information.' Review of Economic Studies, vol. 54, pp. 541-60.
Hahn, F. H. (1973). On the Notion of Equilibrium in Economics. Cambridge: Cambridge University Press.
Hendry, D., Pagan, A. and Sargan, J. (1984). 'Dynamic specification.' Chapter 18 in Handbook of
Econometrics, vol. II (ed. Z. Griliches and M. Intriligator). Amsterdam: North Holland.
Hercowitz, Z. and Sampson, M. (1991). 'Output growth, the real wage, and employment fluctuations.'
American Economic Review, vol. 81, pp. 1215-37.
Hicks, J. R. (1939). Value and Capital. Oxford: Clarendon Press.
Hsiao, C. (1995). 'Cointegration and dynamic simultaneous equations model.' Unpublished manuscript,
University of Southern California.
Johansen, S. (1988). 'Statistical analysis of cointegration vectors.' Journal of Economic Dynamics and Control,
vol. 12, pp. 231-54.
Johansen, S. (1991). 'Estimation and hypothesis testing of cointegrating vectors in Gaussian vector
autoregressive models.' Econometrica, vol. 59, pp. 1551-80.
Johansen, S. (1995). Likelihood Based Inference on Cointegration in the Vector Autoregressive Model. Oxford: Oxford
University Press, forthcoming.
Jordan, J. S. and Radner, R. (1982). 'Rational expectations in microeconomic models: an overview.' Journal
of Economic Theory, vol. 26, pp. 201-23.
Judd, K. (1991). 'Numerical methods in economics.' Unpublished manuscript, Hoover Institution.
Kapteyn, A., Keifer, N. and Rust, J. (ed.) (1995). 'The microeconometrics of dynamic decision making.'
Journal of Applied Econometrics, Special Issue.
Kashyap, A. K. and Wilcox, D. W. (1993). 'Production and inventory control at the General Motors
Corporation during the 1920's and 1930's.' American Economic Review, vol. 83, pp. 383-401.
Kim, K. and Pagan, A. R. (1995). 'The econometric analysis of calibrated macroeconomic models.' In
Handbook of Applied Econometrics: Macroeconomics (ed. M. H. Pesaran and M. Wickens). Oxford: Basil
Blackwell.
King, R. G., Plosser, C. I., Stock, J. H. and Watson, M. W. (1991). 'Stochastic trends and economic
fluctuations.' American Economic Review, vol. 81, no. 4, pp. 819-40.
Lippi, M. and Forni, M. (1990). 'On the dynamic specification of aggregated models.' In Disaggregation in
Econometric Modelling (ed. T. S. Barker and M. H. Pesaran), pp. 35-72. London: Routledge.
Marcet, A. (1994). 'Simulation analysis of dynamic stochastic models: applications to theory and
estimation.' In Advances in Econometrics (ed. C. A. Sims). Sixth World Congress, vol. 2. Cambridge:
Cambridge University Press.
Mellander, E., Verdin, A. and Warne, A. (1992). 'Stochastic trends and economic fluctuations in a small
open economy.' Journal of Applied Econometrics, vol. 7, pp. 369-94.
Milgate, M. (1979). 'On the origin of the notion of intertemporal equilibrium.' Economica, vol. 46, pp. 1-10.
Milgate, M. (1987). 'Equilibrium: development of the concept.' In The New Palgrave: A Dictionary of
Economics, vol. 2 (ed. J. Eatwell, M. Milgate and P. Newman), pp. 179-82. London: Macmillan.
Miron, J. (1991). 'Comment' on 'Pitfalls and Opportunities: What Macroeconomists Should Know About
Unit Roots', by J. Y. Campbell and P. Perron in National Bureau of Economic Research, Macroeconomics
Annual (ed. O. J. Blanchard and S. Fischer). Cambridge, MA: M.I.T. Press.
Ogaki, M. (1992). 'Engel's Law and cointegration.' Journal of Political Economy, vol. 100, pp. 1027-46.
Ogaki, M. and Park, J. (1995). 'A cointegration approach to estimating preference parameters.'
Unpublished manuscript, Ohio State University.
Pesaran, M. H. and Pesaran, B. (1996). Working with Microfit 4.0: An Interactive Introduction to Econometrics.
Oxford University Press, forthcoming.
Pesaran, M. H., Pierse, R. G. and Kumar, M. (1989). 'Econometric analysis of aggregation in the context
of linear prediction models.' Econometrica, vol. 57, pp. 861-88.
Pesaran, M. H. and Potter, S. (ed.) (1993). Non Linear Dynamics, Chaos and Econometrics. Chichester: John
Wiley.
Pesaran, M. H. and Shin, Y. (1995a). 'Long run structural modelling.' Unpublished manuscript, University
of Cambridge.
Pesaran, M. H. and Shin, Y. (1995b). 'An autoregressive distributed lag modelling approach to cointegration
analysis.' DAE Working Paper no. 9514, Department of Applied Economics, University of Cambridge.
Pesaran, M. H. and Shin, Y. (1996). 'Cointegration and the speed of convergence to equilibrium.' Journal
of Econometrics, vol. 71, pp. 117-43.
Pesaran, M. H., Shin, Y. and Smith, R. J. (1996). 'Testing for the existence of a long-run relationship.'
Under preparation.
Pesaran, M. H. and Smith, R. P. (1995a). 'Estimating long-run relationships from dynamic heterogeneous
panels.' Journal of Econometrics, vol. 68, pp. 79-113.
Pesaran, M. H. and Smith, R. P. (1995b). 'The role of theory in econometrics.' Journal of Econometrics,
vol. 67, pp. 61-79.
Pesaran, M. H. and Smith, R. (1996). 'Pooled estimation of long-run relationships in dynamic heterogeneous
panels.' Under preparation.
Pesaran, M. H., Smith, R. P. and Im, K. S. (1996). 'Dynamic linear models for heterogenous panels.' In
Econometrics of Panel Data, 2nd ed. (ed. L. Mátyás and P. Sevestre). London: Kluwer Academic
Publishers.
Phelps, E. S. (1987). 'Equilibrium: an expectational concept.' In The New Palgrave: A Dictionary of Economics,
vol. 2 (ed. J. Eatwell, M. Milgate and P. Newman), pp. 177-9. London: Macmillan.
Phillips, P. C. B. (1991). 'Optimal inference in cointegrated systems.' Econometrica, vol. 59, pp. 283-306.
Phillips, P. C. B. (1995). 'Fully modified least squares and vector autoregressions.' Econometrica, vol. 63,
pp. 1023-78.
Phillips, P. C. B. and Hansen, B. (1990). 'Statistical inference in instrumental variables regression with I(1)
processes.' Review of Economic Studies, vol. 57, pp. 99-125.
Phillips, P. C. B. and Loretan, M. (1991). 'Estimating long run economic equilibria.' Review of Economic
Studies, vol. 58, pp. 407-36.
Radner, R. (1982). 'Equilibrium under uncertainty.' In Handbook of Mathematical Economics, vol. 2 (ed.
K. J. Arrow and M. D. Intriligator). Amsterdam: North-Holland.
Reinsel, G. C. and Ahn, S. K. (1992). 'Vector AR models with unit roots and reduced rank structure:
estimation, likelihood ratio test, and forecasting.' Journal of Time Series Analysis, vol. 13, pp. 353-75.
Saikkonen, P. (1991). 'Asymptotically efficient estimation of cointegration regressions.' Econometric Theory,
vol. 7, pp. 1-21.
Saikkonen, P. (1993). 'Estimation of cointegration vectors with linear restrictions.' Econometric Theory,
vol. 9, pp. 19-35.
Soderlind, P. and Vredin, A. (1995). 'Applied cointegration analysis in the mirror of macroeconomic
theory.' Unpublished manuscript, Stockholm School of Economics, Sweden.
Stock, J. H. and Watson, M. W. (1988). 'Testing for common trends.' Journal of the American Statistical
Association, vol. 83, pp. 1097-107.
Watson, M. (1994). 'Vector autoregressions in cointegration.' In Handbook of Econometrics, vol. IV (ed. R. F.
Engle and D. L. McFadden). Amsterdam: Elsevier.
Wickens, M. R. (1995). 'Real business cycle analysis: a needed revolution in macroeconometrics.' ECONOMIC
JOURNAL, vol. 105, pp. 1637-48.
Wickens, M. R. (1996). 'Interpreting co-integrating vectors and common stochastic trends.' Journal of
Econometrics, vol. 74, pp. 255-71.

Have Your Cake and Eat It Too? Cointegration
and Dynamic Inference from Autoregressive
Distributed Lag Models
Andrew Q. Philips University of Colorado at Boulder
Abstract: Although recent articles have stressed the importance of testing for unit roots and cointegration in time-series
analysis, practitioners have been left without a straightforward procedure to implement this advice. I propose using the
autoregressive distributed lag model and bounds cointegration test as an approach to dealing with some of the most commonly
encountered issues in time-series analysis. Through Monte Carlo experiments, I show that this procedure performs better
than existing cointegration tests under a variety of situations. I illustrate how to implement this strategy with two
step-by-step replication examples. To further aid users, I have designed software programs in order to test and dynamically model
the results from this approach.
Replication Materials: The data, code, and any additional materials required to replicate all analyses in this article
are available on the American Journal of Political Science Dataverse within the Harvard Dataverse Network, at:
https://doi.org/10.7910/DVN/MPQQC0.
Recent work in the time-series literature has stressed the importance of testing for unit roots as
well as the existence of long-run relationships (or cointegration) between variables.1 Since the presence
or absence of each of these characteristics ultimately determines the appropriate model, failure to perform such
pretesting makes spurious inferences more likely. Even with existing tools designed to identify unit roots and test
for cointegration, short series, the weak power of statistical tests, and the dangers of overfitting make pretesting
time-series data particularly problematic. Although recent articles have helped to identify these issues (Grant
and Lebo 2016; Keele, Linn, and Webb 2016), users have been left without a straightforward solution about how
to deal with such problems.2
I propose using the autoregressive distributed lag model and associated bounds testing procedure
(ARDL-bounds) developed by Pesaran, Shin, and Smith (2001) as a comprehensive approach to model
specification and cointegration testing. Depending on the results of the cointegration test, this strategy
absolves users from having to distinguish between stationary (henceforth I(0)) and first-order nonstationary
(I(1)) regressors. This is an advantage since unit root testing is difficult in short series and introduces
“a further degree of uncertainty into the analysis” (Pesaran, Shin, and Smith 2001, 289). The
ARDL-bounds procedure involves the following:
1. Ensuring the dependent variable is I(1).
2. Ensuring the independent variables are not explosive or higher orders of integration than I(1).
3. Estimating the ARDL model in error correction form, and ensuring there is no autocorrelation.
4. Performing the bounds test for cointegration. Three possibilities result: (a) all regressors are
I(1) and cointegrating, (b) all regressors are I(0) (by definition, they cannot cointegrate), or
(c) indeterminate. An indeterminate result may
Andrew Q. Philips is assistant professor, Department of Political Science, University of Colorado at Boulder, UCB 333, Boulder, CO
80309-0333 (andrew.philips@colorado.edu).
I would like to thank Lorena Barberia, Allyson Benton, Harold Clarke, Peter Enns, Nathan Favero, Eric Guntermann, Mark Pickup, Joe
Ura, B. Dan Wood, and participants of the Texas A&M methodology brownbag lunches. Special thanks go to Soren Jordan, Paul Kellstedt,
and Guy D. Whitten. Despite this helpful advice, any errors and omissions remain my own.
1 Covariance stationary series exhibit constant mean, variance, and covariance. A linear combination of two or more first-order nonstationary
series that yields a stationary series is said to be cointegrating.
2 Grant and Lebo (2016) provide two solutions, including the one discussed herein. However, their discussion is brief.
American Journal of Political Science, Vol. 00, No. 0, xxxx 2017, Pp. 1–15
© 2017, Midwest Political Science Association DOI: 10.1111/ajps.12318
still find cointegration among some of the independent variables, although further testing and
respecification (in Step 3) is required.
Surprisingly, while this method is popular in other fields (over 5,300 cites on Google Scholar as of
September 2016), it has been cited and implemented only twice among American Political Science Review,
American Journal of Political Science, Journal of Politics, and Political Analysis: Dickinson and Lebo
(2007) and Grant and Lebo (2016).
Four contributions stand out in this article. First, I discuss why an additional time-series procedure is
necessary, given recent debates about the role of error correction models (Esarey 2016; Grant and Lebo 2016;
Helgason 2016; Keele, Linn, and Webb 2016). Second, I use Monte Carlo experiments to compare the performance
of the ARDL-bounds cointegration test against existing alternatives, under a variety of scenarios that
practitioners typically encounter. I also examine how well the model recovers substantively interesting
effects, such as long-run multipliers or adjustment parameters. Third, I demonstrate the utility of the
ARDL-bounds approach and the merits of dynamic interpretation through two replications. Finally, I conclude
with guidelines for implementing this procedure and introduce software programs designed to help
practitioners with cointegration testing and exploring the substantive implications of their results.

Unit Roots and Cointegration in Time-Series

Consider a general autoregressive distributed lag ARDL($p$, $q$) model where a series, $y_t$, is a function of
a constant term, $\alpha_0$, past values of itself stretching back $p$ periods, contemporaneous and lagged values
of an independent variable, $x_t$, of lag order $q$, and an independent, identically distributed (i.i.d.)
error term:

$$ y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{j=0}^{q} \beta_j x_{t-j} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2). \qquad (1) $$

The data generation process for the dependent and independent variables determines how Equation (1) is
estimated. If variables on both the left- and right-hand sides are I(0), they will exhibit constant mean,
variance, and covariance, and the ARDL($p$, $q$) shown in Equation (1) may be used.3 Since additional lags may
induce multicollinearity, lag order restrictions are often imposed. A common restriction is the ARDL(1,1)
model:

$$ y_t = \alpha_0 + \alpha_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \epsilon_t. \qquad (2) $$

The contemporaneous effect of $x_t$ on $y_t$ is given by $\beta_0$. The magnitude of $\alpha_1$ informs us about the
“memory” of $y_t$ (De Boef and Keele 2008). Assuming $0 < \alpha_1 < 1$, larger values indicate that movements in
$y_t$ take longer to dissipate.4 The long-run effect (or long-run multiplier) is the total effect that a
change in $x_t$ has on $y_t$. It is given as $\kappa_1 = (\beta_0 + \beta_1)/(1 - \alpha_1)$, and its variance is typically
approximated using the delta method.
The generalized error correction model (GECM) may also be used if all variables are I(0); the most common
form is the one-step GECM:

$$ \Delta y_t = \alpha_0 + \alpha_1^* y_{t-1} + \beta_0 \Delta x_t + \beta_1^* x_{t-1} + \epsilon_t, \qquad (3) $$

where the first difference of $y_t$ is a function of a constant term, $\alpha_0$, its own lag, $y_{t-1}$, the first
difference of $x_t$ and its lag, $x_{t-1}$, and an i.i.d. error term, $\epsilon_t$. Although the GECM is algebraically
equivalent to the ARDL(1,1) model, interpretation changes. Contemporaneous effects of a change in $x_t$ on
$y_t$ are still given by $\beta_0$. The rate of adjustment, or the speed at which the total effect of a change of
$x_t$ accumulates in $y_t$, is given by $\alpha_1^*$. It is used in calculating the long-run multiplier,
$\kappa_1 = -\beta_1^*/\alpha_1^*$. Although obtaining variance estimates of the short-run effect is straightforward,
the variance around $\kappa_1$ must be approximated using the Bewley transformation or the delta method (De Boef
and Keele 2008).
The GECM is also ideal for when the dependent and independent variables are I(1) and cointegrating. In our
bivariate example, if there exists some linear combination of the two I(1) series that results in a
stationary series, they are said to be cointegrating. Testing is often performed using the Engle-Granger
“two-step” approach (Engle and Granger 1987), which involves regressing $y_t$ on $x_t$:

$$ y_t = \kappa_0 + \kappa_1 x_t + z_t. \qquad (4) $$

If both variables are I(1), there exists one cointegrating relationship if the residuals in Equation (4),
$z_t$, are stationary.5 More generally, a sufficient condition in which to use an error correction model is
if all variables are I(1) and cointegrating.6
Even if both series are I(1), there may not always be an underlying cointegrating relationship between them.
Practitioners often conflate re-equilibration with error correction and fail to test for cointegration
(Grant and Lebo 2016).7 Even if $x_t$ and $y_t$ are I(1), without cointegration, there cannot be a long-run
relationship between them since (rewriting Equation 4) the linear combination of the series,
$z_t = (y_{t-1} - \kappa_0 - \kappa_1 x_{t-1})$, will not be stationary. If all variables are I(1) but not cointegrating,
the series can only be analyzed in first differences since a short-run relationship may still exist.
The recommendations above are straightforward in theory. In practice, identifying the correct model is
nontrivial. For one, unit root tests often have size distortions and low power in small samples, making it
difficult to determine whether a variable is I(0) or I(1) (Choi 2015; Maddala and Kim 1998). This difficulty
is compounded since users must test each variable in order to use models such as the GECM. Series may be so
highly autoregressive (near-integrated) that testing procedures cannot distinguish them from an I(1) series
(De Boef and Granato 1997). Moreover, series may be fractionally integrated. While some scholars argue that
these are common in political science (Box-Steffensmeier and Smith 1998; Grant and Lebo 2016; Lebo, Walker,
and Clarke 2000), others remain skeptical (Keele, Linn, and Webb 2016; Pickup 2009).8 In other words, with
short series (less than 100), we are often at the mercy of our tests, and we risk choosing models that are
not reflective of the characteristics of our data.
As recent work has shown, many scholars have overlooked the crucial steps of testing for unit roots and
cointegration (Grant and Lebo 2016). Others find that complex model specifications tend to overfit and
perform poorly in small samples (Esarey 2016; Keele, Linn, and Webb 2016). While these important
contributions have identified potential problems, they leave users without a clear and easy-to-implement
solution. As I show in the next section, a procedure already exists that greatly eases unit root testing,
includes a test for cointegration, and is simple to estimate. Moreover, when combined with dynamic
simulations, these models can provide additional substantive interpretations.

A Comprehensive Approach to Time-Series Analysis

While the autoregressive distributed lag (ARDL) model and associated bounds test of Pesaran, Shin, and Smith
(2001) comprise an approach already popular in economics, it remains relatively unknown in political
science. It is ideal for four reasons. First, although we may suspect that all regressors are I(1), an
initial model can be estimated without having to rely on unit root testing to distinguish between I(0) or
I(1) regressors. Restrictions on the independent variables can then be imposed to avoid spurious conclusions
of cointegration. Second, the one-step procedure for the initial cointegration test is similar to the GECM,
making it easy to estimate. Third, the cointegration test is often straightforward to interpret. Fourth,
this framework provides a comprehensive approach for practitioners.
The ARDL-bounds approach is shown in schematic form in Figure 1.9 As shown in step a, users must first
establish whether the dependent variable is I(1). To mitigate difficulties with unit root testing, users
should employ a suite of unit root tests and account for the possibility of periodicity, drift, and
deterministic trends. If the dependent variable is stationary, then cointegration is not possible and any
I(1) regressors must be first differenced (step f). After ensuring that all independent variables are
stationary (step c), we must also check that no autocorrelation remains in the residuals (step i). As shown
by step h in Figure 1, if there is autocorrelation, we can incorporate lags of the dependent and independent
variables, or lagged first differences if a regressor is I(1). Lag structures are typically chosen based on
theoretical expectations about the data generation process, and by minimizing information criteria such as
the Akaike Information Criterion (AIC) and Schwarz-Bayesian Information Criterion (SBIC). If no
autocorrelation remains, the resulting ARDL model is one where all variables are I(0), as shown in step j, a
version of which was shown in Equation (1). There is no need to check for cointegration since all variables
are stationary.
If the dependent variable is I(1), there may be cointegration. As shown in step b in Figure 1, we do not
have to establish whether the regressors are I(0) or I(1); we of course suspect I(1), since we are testing
for cointegration. However, we must ensure that there are no explosive series, seasonal unit roots, or
series higher than I(1) in any of the variables. Violation of these conditions invalidates

3 The stationarity condition for $y_t$ is given as $|\sum_{i=1}^{p} \alpha_i| < 1$. Such variables are said to be covariance stationary.
4 Values of $\alpha_1$ greater than one suggest an explosive series or a model mis-specification. Values less than zero suggest the series is
overcorrecting or oscillating; this is rare in the social sciences.
5 This is true for any $k$ series, which can have up to $k - 1$ cointegrating relationships.
6 This condition is sufficient but not necessary; one could use other models (e.g., first differences). I focus on I(1) series since higher
orders of integration are rare in political science, although this excludes the possibility of multi-cointegration (Enders 2010, 380–82).
7 While cointegrating relationships can be estimated using GECMs, estimating GECMs does not necessarily mean two or more series
are cointegrated.
8 Helgason (2016) and Esarey (2016) investigate treating data as fractionally integrated versus I(1) through Monte Carlo simulations.
9 For brevity, I do not consider fractionally integrated relationships. I discuss strategies for handling these data in the supporting
information.
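The algebraic equivalence between the ARDL(1,1) model in Equation (2) and the one-step GECM in Equation (3) can be checked numerically. The sketch below is an illustration only and is not part of the author's software. All function names and coefficient values are hypothetical. It maps ARDL(1,1) coefficients into GECM coefficients, computes the long-run multiplier both ways, and compares the result with the total response of $y_t$ to a permanent one-unit increase in $x_t$ in a noise-free simulation.

```python
def ardl_to_gecm(alpha0, alpha1, beta0, beta1):
    """Map ARDL(1,1) coefficients to one-step GECM coefficients.

    Subtracting y_{t-1} from both sides of the ARDL(1,1) and adding and
    subtracting beta0 * x_{t-1} gives the GECM parameterisation.
    """
    return {
        "alpha0": alpha0,
        "alpha1_star": alpha1 - 1.0,   # rate of adjustment
        "beta0": beta0,                # contemporaneous effect (unchanged)
        "beta1_star": beta0 + beta1,   # coefficient on x_{t-1}
    }


def lrm_from_ardl(alpha1, beta0, beta1):
    """Long-run multiplier from the ARDL(1,1): (beta0 + beta1) / (1 - alpha1)."""
    return (beta0 + beta1) / (1.0 - alpha1)


def lrm_from_gecm(alpha1_star, beta1_star):
    """Long-run multiplier from the GECM: -beta1_star / alpha1_star."""
    return -beta1_star / alpha1_star


def step_response(alpha0, alpha1, beta0, beta1, periods=500):
    """Total change in y after a permanent one-unit increase in x (no noise)."""
    y_start = alpha0 / (1.0 - alpha1)   # equilibrium level of y when x = 0
    y_prev, x_prev = y_start, 0.0
    for _ in range(periods):
        x_now = 1.0                     # x steps from 0 to 1 and stays there
        y_prev = alpha0 + alpha1 * y_prev + beta0 * x_now + beta1 * x_prev
        x_prev = x_now
    return y_prev - y_start


if __name__ == "__main__":
    a0, a1, b0, b1 = 0.5, 0.7, 0.2, 0.1   # hypothetical coefficient values
    gecm = ardl_to_gecm(a0, a1, b0, b1)
    print("LRM from ARDL(1,1):", lrm_from_ardl(a1, b0, b1))
    print("LRM from GECM:     ", lrm_from_gecm(gecm["alpha1_star"], gecm["beta1_star"]))
    print("Simulated response:", step_response(a0, a1, b0, b1))
```

The two formulas agree because $\alpha_1^* = \alpha_1 - 1$ and $\beta_1^* = \beta_0 + \beta_1$, so the choice between the two parameterisations affects interpretation but not the implied long-run effect.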
FIGURE 1 The ARDL-Bounds Procedure’s Comprehensive Approach to Time-Series Analysis [flowchart of steps (a) through (r)]
the testing procedure. Independent variables that are nonstationary of higher orders than I(1) must be differenced (step d) before moving forward.¹⁰
¹⁰ This excludes the possibility of multi-cointegration (Enders 2010, 380–82).

Next, estimate the ARDL model in error correction form (step e). Recall that a cointegrating relationship between an I(1) dependent variable, $y_t$, and a weakly exogenous I(1) regressor, $x_t$, can be written as:¹¹
$$y_t = \kappa_0 + \kappa_1 x_t + z_t. \tag{5}$$
If the residuals, $z_t$, are stationary, there is evidence of cointegration.¹² In order to estimate this model, $z_{t-1}$ is included in the following GECM:
$$\Delta y_t = \alpha_0 - \alpha(z_{t-1}) + \beta_0 \Delta x_t + \epsilon_t. \tag{6}$$
¹¹ In the context of cointegration, a variable is weakly exogenous if it “does not respond to the discrepancy from the long-run equilibrium relationship” (Enders 2010, 407).
¹² If a deterministic trend was suspected in $y_t$, Equation (5) becomes: $y_t = \hat{\kappa}_0 + \hat{\gamma} T + \hat{\kappa}_1 x_t + z_t$. We could also exclude the drift term, $\hat{\kappa}_0$, or account for a deterministic trend in $x_t$.

Rewritten, it becomes:
$$\Delta y_t = \alpha_0 - \alpha(y_{t-1} - \hat{\kappa}_0 - \hat{\kappa}_1 x_{t-1}) + \beta_0 \Delta x_t + \epsilon_t. \tag{7}$$
The unrestricted error correction model referred to by Pesaran, Shin, and Smith (2001, 293) forms the basis of the ARDL-bounds procedure. It involves multiplying through by $-\alpha$ and collecting terms in Equation (7):
$$\Delta y_t = \alpha_0^* + \theta_0 y_{t-1} + \theta_1 x_{t-1} + \beta_0 \Delta x_t + \epsilon_t, \tag{8}$$
where $\alpha_0^* = (\alpha_0 + \alpha\hat{\kappa}_0)$ and $\theta_0 = -\alpha$. As with the GECM, the coefficient on the lagged value of $x_t$, $\theta_1 = \alpha\hat{\kappa}_1$, can be combined with the lagged dependent variable to extract the long-run multiplier. The contemporaneous effect is given by $\beta_0$. Since residual autocorrelation may be problematic, up to $q$ lags of the first difference of the independent variables, and up to $p$ lags of the first difference of the dependent variable, may be included in order to purge serial autocorrelation from $\epsilon_t$ (steps g and k; Pesaran, Shin, and Smith 2001, 299). Theory and information criteria should be used to specify lag structure, and autocorrelation tests used to ensure white noise residuals.
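Because $\theta_0 = -\alpha$ and $\theta_1 = \alpha\hat{\kappa}_1$ in Equation (8), the long-run multiplier can be recovered from the two lagged-level coefficients as $\hat{\kappa}_1 = -\theta_1/\theta_0$. The following is a minimal sketch of that arithmetic; the function name and the numeric values are illustrative assumptions, not estimates from the article.

```python
def long_run_multiplier(theta_0: float, theta_1: float) -> float:
    """Long-run multiplier implied by Equation (8): kappa_1 = -theta_1 / theta_0.

    theta_0 is the coefficient on the lagged dependent variable (equal to -alpha)
    and theta_1 is the coefficient on the lagged regressor (equal to alpha * kappa_1).
    """
    if theta_0 == 0:
        # With theta_0 = 0 there is no error correction, so the ratio is undefined.
        raise ValueError("theta_0 must be nonzero to compute a long-run multiplier")
    return -theta_1 / theta_0


# Hypothetical values chosen for illustration: alpha = 0.5 and kappa_1 = 2.0.
alpha, kappa_1 = 0.5, 2.0
theta_0, theta_1 = -alpha, alpha * kappa_1
print(long_run_multiplier(theta_0, theta_1))  # 2.0
```

The sketch returns only a point estimate; a standard error for this ratio of coefficients would have to be obtained separately.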
FIGURE 2 Bounds Test Statistics
The resulting model appears as:
$$\Delta y_t = \alpha_0^* + \theta_0 y_{t-1} + \theta_1 x_{t-1} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \sum_{j=0}^{q} \beta_j \Delta x_{t-j} + \epsilon_t. \tag{9}$$
After estimating the ARDL-bounds model in Equation (9) and ensuring white noise residuals (steps g and k), the next step is to conduct the bounds test (step n). It tests the null hypothesis of no cointegration between the dependent variable and any regressors included in the cointegrating equation (Pesaran, Shin, and Smith 2001, 294–95). Only regressors that enter into the equation in levels (e.g., $x_{t-1}$) in Equation (9) can (potentially) cointegrate with $y_t$. The bounds F-test consists of running a Wald test or F-test on the following restriction from Equation (9):
$$H_0: \theta_0 = \theta_1 = 0 \tag{10}$$
under the null hypothesis that no cointegrating relationship exists between $x_t$ and $y_t$. Rejecting $H_0$ indicates that there is a cointegrating relationship between the series.
In addition to the F-test, a one-sided t-test may be used to test the null hypothesis that the coefficient on the lagged dependent variable is equal to zero: $H_0: \theta_0 = 0$. The alternative hypothesis is that $\theta_0 < 0$, which suggests cointegration. This is known as the bounds t-test.
The critical value bounds for the F- and t-statistics are nonstandard and depend on the number of regressors appearing in levels, as well as the restrictions placed on the intercept and trend.¹³ Asymptotic critical values for the t- and F-statistics can be found in Pesaran, Shin, and Smith (2001, 300–304), and small-sample critical values for the F-statistic can be found in Narayan (2005, 1987–90). No small-sample critical values are currently available for the t-test, so in small samples it should only be used for confirmatory purposes. Interpretation of the bounds test is illustrated in Figure 2. Three possibilities result.
¹³ Dummy variables may be included without compromising the asymptotic properties of the tests, as long as they tend toward zero as the series increases (Pesaran, Shin, and Smith 2001, 307). The cointegration test does not account for the possibility of seasonal unit roots (Pesaran, Shin, and Smith 2001, 291) or other forms of periodicity, so these should be prewhitened out accordingly.

If the value of the F-statistic is lower than the stationary critical value, then we cannot reject the null hypothesis that there is no cointegrating relationship (step q in Figure 1); in fact, we can conclude that all independent variables appearing in levels are stationary, without having to conduct any further unit root testing. If this is the case, the final model specification is the first difference of the dependent variable regressed on up to $l$ lags of the independent variables appearing in levels, as well as up to $p$ and $q$ lags of the first differences of the dependent and independent variables necessary to remove autocorrelation (step r):
$$\Delta y_t = \alpha_0 + \sum_{k=0}^{l} \delta_k x_{t-k} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \sum_{j=0}^{q} \beta_j \Delta x_{t-j} + \epsilon_t. \tag{11}$$
If the value of the F-statistic is higher than the I(1) critical value, not only are all series I(1), but there also exists a cointegrating relationship between them. No further unit root testing of the regressors is required, as shown by step o in Figure 1. Evidence suggests that the resulting ARDL
model in error correction form is correctly specified, and that cointegration exists between the dependent variable and any independent variables appearing in levels.
If the F-statistic is between the stationary and I(1) critical values, the test is inconclusive. There could be a mix of stationary and I(1) regressors, and cointegration among the I(1) variables and the dependent variable may still exist. However, further testing is required. As shown by step m in Figure 1, the next step is to conduct unit root tests for each independent variable. Since I(0) variables cannot possibly have a cointegrating relationship with an I(1) dependent variable, they should only enter into the model in first-differenced form.¹⁴ After rerunning the ARDL model in error correction form (step e), conduct the bounds test for cointegration (step n) on the remaining I(1) regressors. If a conclusive result is reached, no further testing is required. If the test is still inconclusive, the next step is to start excluding combinations of I(1) regressors from the cointegrating equation (having a $\theta$ coefficient in Equation 9) and repeat steps e and n. If, after iterating through the possible combinations of independent variables, there is still no conclusive result from the bounds test, then we can conclude no cointegration. Since short-run effects between I(1) variables may still exist, the final model can be estimated in first differences.
¹⁴ Of course, I(0) series could still appear in levels in the final model specification without risking spurious regression.

Evaluating the t-statistic is exactly the opposite of the F-statistic; if the value of the t-statistic is lower than the I(1) critical value, then we can reject the null hypothesis of no cointegrating relationship. If the value of the t-statistic falls above the I(0) critical value, then we cannot reject the null hypothesis. Just as with the F-statistic, if the t-statistic falls between the bounds, the test is inconclusive, and more precise testing of the regressors is necessary. That is to say, we would next use unit root testing to isolate out only the I(1) variables and iterate through them as needed in order to conclude either cointegration (step o) or all I(0) regressors (step q).

Monte Carlo Evidence

The key component to the ARDL-bounds procedure is the cointegration test, since it ultimately determines our conclusions about the relationships between variables. How does its performance compare to existing approaches? To evaluate this, I present two Monte Carlo experiments. The first focuses on finding evidence of cointegration when it does not exist (Type I error), whereas the second investigates failing to detect cointegration when it exists (Type II error).
To evaluate the ability of the bounds cointegration test to avoid Type I error, I generated an I(1) dependent variable, $y_t$, for series of length T = 35, 50, 80.¹⁵ Next, four independent variables, $x_{kt}$ (where k = 1, 2, 3, 4), were generated. These were completely unrelated to $y_t$, or to one another:
$$y_t = y_{t-1} + \eta_t. \tag{12}$$
$$x_{kt} = \phi_k x_{kt-1} + \nu_{kt}. \tag{13}$$
The stochastic components $\eta_t$ and $\nu_{kt}$ are i.i.d. and independent from each other. As discussed earlier, detection of stationary variables is difficult in short series. To see the consequences of erroneously including an I(0) regressor when all other variables are I(1), I allow the autoregressive parameter for $x_{1t}$, $\phi_1$, to vary from 0.0 to 1.0 by increments of 0.10. All other independent variables are I(1) (i.e., $\phi_k = 1 \;\forall\, k \neq 1$). Next, I ran the ARDL-bounds model:
$$\Delta y_t = \alpha_0 + \theta_0 y_{t-1} + \theta_1 x_{1t-1} + \cdots + \theta_k x_{kt-1} + \sum_{i=1}^{p} \alpha_i \Delta y_{t-i} + \sum_{j=0}^{q_1} \beta_{1j} \Delta x_{1t-j} + \cdots + \sum_{j=0}^{q_k} \beta_{kj} \Delta x_{kt-j} + \epsilon_t. \tag{14}$$
The number of lagged first differences of $y_t$ and each $x_{kt}$ to include in Equation (14) was determined via SBIC for each of the 500 simulations conducted for all combinations of T, k, and $\phi_1$.¹⁶ After estimating Equation (14), an F-test of the null hypothesis that $\theta_0 = \theta_1 = \cdots = \theta_k = 0$ was conducted for each simulation. The resulting statistic was compared against the associated critical values of the bounds test from Narayan (2005, 1988). Since these series were independently generated, evidence of cointegration (an F-statistic greater than the I(1) critical value) is an incorrect rejection of the null hypothesis and thus a form of Type I error.¹⁷
¹⁵ To mitigate issues involving initial conditions (Balke and Fomby 1997), I created a burn-in period of T = 100 for all simulations.
¹⁶ A restriction of $p, q_k \leq 3$ was placed on the maximum number of lag lengths in Equation (14) for T = 35, and 4 for T = 50, 80. This restriction appeared to be an ideal trade-off between overfitting and ensuring white noise residuals; I discuss issues regarding overfitting in the supporting information.
¹⁷ F-statistics between the I(0) and I(1) bound, or below the I(0) bound, were treated as avoiding Type I error. Treating them as Type I error does not change the substantive results, as shown in the supporting information.
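The data-generating process in Equations (12) and (13) can be sketched in a few lines. This is an illustration rather than the article's simulation code: the function name, the standard normal shocks, the zero starting values, and the `seed` argument are all assumptions (the text states only that the shocks are i.i.d. and that a burn-in of 100 periods was used).

```python
import random


def simulate_type1_dgp(T, k, phi_1, burn_in=100, seed=None):
    """Simulate one draw from Equations (12)-(13).

    y_t is a random walk; x_1t is AR(1) with parameter phi_1; the remaining
    k - 1 regressors are random walks (phi_k = 1). All series are independent.
    Shocks are standard normal and series start at zero (both assumptions).
    Returns (y, xs) where y has length T and xs is a list of k series of length T.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    phis = [phi_1] + [1.0] * (k - 1)
    y_prev = 0.0
    x_prev = [0.0] * k
    y_path = []
    x_paths = [[] for _ in range(k)]
    for t in range(burn_in + T):
        y_prev = y_prev + rng.gauss(0.0, 1.0)  # Equation (12)
        x_prev = [phi * x + rng.gauss(0.0, 1.0)  # Equation (13)
                  for phi, x in zip(phis, x_prev)]
        if t >= burn_in:  # discard the burn-in period
            y_path.append(y_prev)
            for path, x in zip(x_paths, x_prev):
                path.append(x)
    return y_path, x_paths


# One draw with T = 35, four regressors, and phi_1 = 0.5.
y, xs = simulate_type1_dgp(T=35, k=4, phi_1=0.5, seed=1)
print(len(y), len(xs))  # 35 4
```

Each simulated draw would then be passed to the estimation and testing steps described above (Equation 14 and the bounds F-test), which this sketch does not implement.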
FIGURE 3 Proportion of Monte Carlo Simulations (Falsely) Detecting Cointegration
[Figure: twelve panels, with rows for T = 35, 50, and 80 and columns for 1, 2, 3, and 4 X variables. Horizontal axis: value of $\phi_1$ (0 to 1). Vertical axis: proportion of cointegrating relationships (0 to 1). Series: Bounds Test, Johansen BIC, Johansen Rank, Engle-Granger.]
Note: Each plot shows the proportion of simulations finding (at p < .05) evidence of one cointegrating relationship with up to k regressors and different numbers of observations across varying amounts of autoregression in $x_{1t}$, using each of the four cointegration testing procedures.
I compare the performance of the bounds test to two other procedures. I included the Engle-Granger two-step procedure by implementing an augmented Dickey-Fuller unit root test on the residual series, $z_t$, from the cointegrating equation: $y_t = \kappa_0 + \kappa_1 x_{1t} + \cdots + \kappa_k x_{kt} + z_t$.¹⁸ I also used the Johansen procedure for cointegration to test for the existence of a single cointegrating relationship, using both the multiple trace testing procedure as well as the number of cointegrating ranks as chosen by minimizing SBIC (Johansen 1995).¹⁹ Although cointegration tests are only supposed to be run on all-I(1) series, the purpose of this Monte Carlo experiment is to evaluate test performance when a stationary regressor is erroneously included, given that in small series an autoregressive I(0) variable may be indistinguishable from an I(1) series.
¹⁸ The same lag restrictions were placed on the additional augmenting lags of $y_{t-i}$ needed to remove autocorrelation, as determined by minimizing SBIC. Critical values are from MacKinnon (1994).
¹⁹ Lag-order selection was the same as the Engle-Granger procedure. Results of $r \neq 1$ were recorded as no evidence of Type I error.

The results from the first Monte Carlo experiment are shown in Figure 3. The level of autoregression, $\phi_1$, in the single stationary series, $x_{1t}$, is on the horizontal axis. The proportion of simulations finding evidence of cointegration is on the vertical axis; higher values indicate Type I error. When there are only 35 observations, it is clear that the bounds test is the only cointegration procedure that comes close to the conventional 5% rejection rate (shown by the thin black line). As the number of independent variables increases (each column shows the number of k regressors), all tests tend to have increased Type I error. For instance, when there are four regressors, we find spurious evidence of cointegration about 60% of the time when using the Engle-Granger test; surprisingly, its high rate of Type I error does not change as T increases. This finding underscores recent work on overfitting in short time-series (Helgason 2016; Keele,
FIGURE 4 Proportion of Monte Carlo Simulations (Correctly) Detecting Cointegration
[Figure: three panels, for 35, 50, and 80 observations. Vertical axis: number of X variables (1 to 4). Horizontal axis: proportion finding evidence of cointegration at the < 0.05 level (0 to 1). Bars: Johansen BIC, Johansen Rank, Bounds, Engle-Granger.]
Linn, and Webb 2016). Despite this, the bounds test excels at successfully failing to reject the null hypothesis of no cointegration under all scenarios. Only the Johansen test appears to have the same low rate of Type I error, but only when the level of autoregression in $x_{1t}$ approaches a unit root process.
The performance of the bounds test is notable in a number of ways. Not surprisingly, I find evidence that it, along with other cointegration tests, performs poorly in small samples. However, this is only when the length of the series is small and the number of regressors large. Even then, the rate of Type I error using the bounds test is often half that of the other cointegration tests, and it remains robust to erroneously including an I(0) regressor. Only the Johansen-BIC test has a similar level of Type I error, but only when all variables are at or near I(1). The fact that the performance of the bounds test is barely affected by autoregression indicates that it is a good test for cointegration in small samples; this is exactly when we might erroneously include an I(0) variable. Finally, while the Engle-Granger procedure is robust to autoregression in a single regressor, it has much larger Type I error as the number of regressors increases. Taken together, this evidence suggests that the bounds cointegration test has lower Type I error than other tests, and it remains robust to short series, multiple regressors, and erroneously including stationary regressors.
I next explore the likelihood that the bounds test fails to detect cointegration when it exists (Type II error). As before, I vary the number of regressors and the number of observations. However, now the independent variables cointegrate with the dependent variable:²⁰
$$x_{kt} = x_{kt-1} + \nu_{kt}. \tag{15}$$
$$u_t = 0.75u_{t-1} + \eta_t. \tag{16}$$
$$y_t = 0.25x_{1t} + \cdots + 0.25x_{kt} + u_t. \tag{17}$$
The errors $\nu_{kt}$ and $\eta_t$ are independent. This data generation process yields an adjustment parameter of −0.25 and a long-run multiplier of 0.25 for each of the k independent variables. The cointegration tests are the same as in the previous experiment, and conducted on 1,000 simulations across each combination of observations and regressors.
²⁰ A proof of this is in the supporting information.

The results of the second experiment are shown in Figure 4. Each bar depicts the proportion of cointegrating relationships for a particular cointegration test, across each combination of observations and regressors. Higher values correspond with a lower rate of Type II error. For all tests, as the length of the series increases, Type II error decreases. In addition, as the number of cointegrating regressors increases, the Engle-Granger test correctly identifies cointegration at a greater rate than other tests. The bounds test has the largest Type II error rate when T = 35, although this improves sharply as the series lengthen. In addition, the proportion of simulations correctly identifying cointegration varies significantly across tests; the Engle-Granger procedure has between one-third and one-half the rate of Type II error of the bounds test, and the bounds test has about one-half the Type II error of the Johansen tests.
A number of important findings stand out from these two experiments on cointegration. The bounds test has the lowest Type I error across all scenarios; moderate Type I error (20%) occurs only when there are four regressors and 50 observations or fewer. While the bounds test is largely unaffected, the Johansen test tends to experience a rapid increase in Type I error rates when an I(0) regressor is included. Although the Engle-Granger test has the lowest Type II error rates, the bounds test tends to perform better than the Johansen tests in all scenarios, except for a single regressor or short series.
In the supporting information, I conduct eight additional Monte Carlo experiments. These include varying the adjustment parameter and long-run multiplier, using fractionally (co)integrated series, and examining the percentage of time a given cointegration test correctly or incorrectly diverges from the other three cointegration tests. I also examine the ability of the GECM and ARDL-bounds models to recover substantively interesting effects (e.g., short- and long-run effects, or the adjustment parameter). Many of the findings are consistent with those above; interested readers are directed to the brief summary in Table 1 in the supporting information.
Taken together, the Monte Carlo results suggest that the bounds test offers an ideal compromise between Type I and Type II error. Given calls for more conservative cointegration tests (Grant and Lebo 2016), the bounds test seems the prudent choice since it strongly avoids spurious cointegration, yet can still identify true cointegrating relationships, at least for weakly exogenous regressors.²¹ I show two applications of this approach below.
²¹ Were the regressors endogenous, methods such as the Johansen approach should be used.

Application I: Kelly and Enns (2010)

Kelly and Enns (2010) examine how income inequality affects public mood liberalism and support for welfare policy. The authors find that in the long run, increases in inequality are associated with the public becoming more conservative and less supportive of welfare. They find no evidence that policy liberalism, income inequality, unemployment, or inflation has any effect on public mood in the short run. There are two reasons to believe these results may be suspect. First, the number of observations is small. Second, although Kelly and Enns perform unit root testing on the dependent variable, the authors make no mention of testing the regressors.
I replicated their model of public support for welfare policy.²² Results from their GECM are shown in Table 1, Model 1. First, I ensured that the dependent variable is I(1) (see step a in the Figure 1 schematic). Results from five unit root tests are shown in Table 2. While we can reject the null hypothesis of an I(1) series using the augmented Dickey-Fuller test, more powerful ones such as the Dickey-Fuller Generalized Least Squares (DF-GLS) and Elliott-Rothenberg-Stock (ERS) tests find evidence of a unit root process.²³ Although the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test also provides mixed evidence, we can tentatively confirm that the data-generating process of the dependent variable is I(1).²⁴
²² See Table 1, Model 4, on page 864 in their article.
²³ The augmented Dickey-Fuller and Phillips-Perron tests suffer from size distortions and weak power, and they are often outperformed by the ERS and DF-GLS tests (Choi 2015, 37–54; Enders 2010, 234–37; Maddala and Kim 1998, 98–103).
²⁴ I examine the consequences of concluding stationarity in the supporting information. Although the final model differs, the substantive results remain unchanged.

After ensuring that all regressors are first-order nonstationary or less (step b in Figure 1), I then estimated the ARDL model in error correction form (step e).²⁵ Using SBIC, I found that the lag structure in the original model used by Kelly and Enns (2010) was optimal, given the data. This specification produced white noise residuals, as evidenced by a battery of post-estimation diagnostics. Thus, the ARDL-bounds model shown in Model 2 in Table 1 is identical to the original ECM in Model 1.
²⁵ Unit root tests of the first difference of policy liberalism and inequality rejected the I(2) null hypothesis; results are in the supporting information.

Since the model appears to be dynamically stable, we next use the bounds test to identify whether a cointegrating relationship exists between support for welfare policy, policy liberalism, and income inequality (step n in Figure 1). An F-test that the parameters on the variables appearing in lagged levels (Welfare$_{t-1}$, Policy Liberalism$_{t-1}$, and Income Inequality$_{t-1}$) are jointly equal to zero yields an F-statistic of 4.15. Although Narayan (2005) provides the small-sample critical values necessary to evaluate this statistic, these are also available in Stata and R using the programs pssbounds and pss,
TABLE 1 Results of the ARDL-Bounds Model for Welfare Policy Mood (Kelly and Enns 2010)
Models: (1) Original GECM; (2) ARDL-Bounds; (3) ARDL-Bounds Excluding Policy Liberalism_{t−1}; (4) ARDL-Bounds Excluding Income Inequality_{t−1}; (5) Final Model in First Differences.

Variable | (1) | (2) | (3) | (4) | (5)
Welfare_{t−1} | −0.55** (0.16) | −0.55** (0.16) | −0.23 (0.12) | −0.37* (0.14) |
ΔPolicy Liberalism_t | −0.24 (0.42) | −0.24 (0.42) | −0.33 (0.46) | −0.05 (0.46) | −0.43 (0.45)
ΔPolicy Liberalism_{t−1} | | | −0.76 (0.45) | | −0.73 (0.45)
Policy Liberalism_{t−1} | −0.65** (0.22) | −0.65** (0.22) | | −0.21 (0.11) |
ΔIncome Inequality_t | −175.70 (122.56) | −175.70 (122.56) | −136.99 (132.23) | −183.79 (134.45) | −155.62 (133.85)
ΔIncome Inequality_{t−1} | | | | −202.02 (137.35) |
Income Inequality_{t−1} | −152.17* (65.71) | −152.17* (65.71) | 7.89 (33.50) | |
Constant | 80.03* (29.92) | 80.03* (29.92) | 3.18 (12.92) | 13.11* (4.79) | −0.54 (1.20)
Observations | 33 | 33 | 33 | 33 | 33
Adjusted R² | 0.26 | 0.26 | 0.12 | 0.17 | 0.07
Breusch-Godfrey χ² of AR(1) | 1.41 | 1.41 | 0.06 | 1.10 | 0.63
Breusch-Godfrey χ² of AR(2) | 1.43 | 1.43 | 1.09 | 1.64 | 2.05
Breusch-Godfrey χ² of AR(3) | 1.50 | 1.50 | 1.37 | 2.10 | 2.37
Durbin's Alternative χ² of AR(1) | 1.16 | 1.16 | 0.05 | 0.90 | 0.55
Durbin's Alternative χ² of AR(2) | 1.13 | 1.13 | 0.85 | 1.31 | 1.79
Durbin's Alternative χ² of AR(3) | 1.14 | 1.14 | 1.04 | 1.63 | 2.02
Cumby-Huizinga χ² of AR(1)–AR(3) | 1.06 | 1.06 | 1.48 | 1.26 | 1.76
Shapiro-Wilk z | −0.44 | −0.44 | 1.66 | 0.78 | 2.70

Note: Dependent variable is welfare policy mood. Model 1 shows results from Kelly and Enns (2010), Model 2 shows results using ARDL-bounds procedure, Models 3 and 4 show the results when testing down using ARDL-bounds, and Model 5 is the final model in first differences. Lag structures are determined by SBIC. Standard errors are in parentheses. * p < .05, ** p < .01.
respectively (Jordan and Philips 2016; Philips 2016b). The critical values for 33 observations and two regressors are a lower stationary bound of 4.183 and an upper I(1) bound of 5.333. Strictly speaking, the F-statistic is below the stationary lower bound, so we might conclude that all regressors are stationary (step q in Figure 1). However, given that the test result was so close to the I(0) lower bound of the test, we may want to treat the result as inconclusive, which means that further testing is needed.²⁶
²⁶ Moreover, the one-sided bounds t-test on the significance of the lagged dependent variable, −3.46, falls between the asymptotic upper I(0) and lower I(1) critical bounds of −2.86 and −3.53, respectively; this supports the “inconclusive” decision.

Although the results of the cointegration test were borderline inconclusive with both policy liberalism and income inequality, a single regressor may still cointegrate with welfare policy mood. The next step is to test that the regressors are I(1), since any I(0) regressor can easily be excluded from the cointegrating equation (step m in Figure 1). Unit root testing (available in the supporting information) indicated that both policy liberalism and income inequality are I(1).
Since unit root testing did not narrow down which series should not appear in the cointegrating equation, I estimated two different models (step n). In Model 3, I test to see whether only income inequality has a cointegrating relationship with public mood toward welfare.
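The three-way decision rule of the bounds test can be written compactly. The sketch below is illustrative: the function names and return labels are hypothetical, and treating a statistic exactly equal to a bound as inconclusive is an assumption. The numbers are the F-statistic, t-statistic, and critical bounds reported above for the Kelly and Enns replication.

```python
def bounds_f_decision(f_stat, lower_i0, upper_i1):
    """Classify a bounds F-statistic against the I(0) and I(1) critical values."""
    if f_stat < lower_i0:
        return "below I(0) bound: no cointegration, regressors in levels are I(0)"
    if f_stat > upper_i1:
        return "above I(1) bound: cointegration"
    return "inconclusive"


def bounds_t_decision(t_stat, upper_i0, lower_i1):
    """Classify a bounds t-statistic; the rule mirrors the F-test with signs reversed.

    Both bounds are negative, with lower_i1 < upper_i0.
    """
    if t_stat < lower_i1:
        return "below I(1) bound: cointegration"
    if t_stat > upper_i0:
        return "above I(0) bound: no cointegration"
    return "inconclusive"


# Values reported in the text for the Kelly and Enns (2010) replication.
print(bounds_f_decision(4.15, lower_i0=4.183, upper_i1=5.333))
# below I(0) bound: no cointegration, regressors in levels are I(0)
print(bounds_t_decision(-3.46, upper_i0=-2.86, lower_i1=-3.53))
# inconclusive
```

A mechanical rule of this kind does not capture the judgment applied in the text, where an F-statistic just under the lower bound is treated as inconclusive rather than as evidence of stationarity.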
TABLE 2 Public Mood Toward Welfare Is I(1) (Kelly and Enns 2010)

Unit Root Test | Welfare
Augmented Dickey-Fuller (with drift) | −2.05*
Phillips-Perron | −1.94
Dickey-Fuller GLS (with trend) | −2.55
Elliott-Rothenberg-Stock | −2.55
Kwiatkowski-Phillips-Schmidt-Shin (H0 = stationary) | 0.49* (no lag); 0.29 (1 lag)
Conclusion | I(1)

Note: Thirty-three observations with 1-year lag are included for all tests unless otherwise noted. H0 = series contains a unit root for all tests except KPSS. * p < .05.

Therefore, policy liberalism does not appear in levels in Model 3. In order to produce white noise residuals (steps g and k), the lagged first difference of policy liberalism was included. Because Model 3 reflects a data-generating process where only income inequality is cointegrating, evidence of cointegration in Model 3 would indicate that income inequality, not policy liberalism, is cointegrating with public mood toward welfare. An F-test of the significance of the lagged variables in Model 3 yields an F-statistic of 1.72. Since this is below the critical value of 5.290 for the I(0) lower bound and 6.175 for the I(1) upper bound, we can conclude that income inequality and public mood toward welfare are not cointegrating.
Next, I test to see whether only policy liberalism has a cointegrating relationship with public mood toward welfare. Therefore, in Model 4, income inequality does not appear in levels. To produce white noise residuals, one lag of the first difference of income inequality was included. For Model 4, a rejection of the null hypothesis using the bounds test would suggest that policy liberalism, not income inequality, is cointegrating with public mood toward welfare. An F-test of the significance of the lagged variables yields an F-statistic of 3.57. Since this falls below the I(0) critical value of 5.290 (as well as the upper I(1) critical value, 6.175), we can conclude that policy liberalism and public mood toward welfare are not cointegrating.
Since neither income inequality nor policy liberalism on its own appears to have a cointegrating relationship with welfare policy mood, nor do the three variables all together, as found in Model 2, we can conclude that there is no cointegration (step q). Since the two independent variables may still affect public mood toward welfare in the short run, we may run a model of first differences (step r). This is shown in Model 5 in Table 1. The results indicate that income inequality and policy liberalism do not have a statistically significant effect on the public's feelings toward welfare policy in the short run, a similar conclusion to what Kelly and Enns (2010) find.²⁷
²⁷ What differs is that the authors find evidence of a long-run effect, whereas the ARDL-bounds approach does not.

This replication is informative since it shows how one should proceed, given an inconclusive bounds test result. After finding that all regressors were I(1), I proceeded to iterate through two different models, excluding one of the regressors from the cointegrating equation in Models 3 and 4. Since there was no evidence for cointegration when isolating out income inequality and policy liberalism, the final model was one of first differences since the error correction framework is no longer appropriate.
While suggestive, this replication does not completely overturn the findings of Kelly and Enns (2010). Short series introduce a large amount of uncertainty into cointegration tests, so it seems reasonable that different researchers might come to different conclusions.²⁸ Overall, given the best available methods, there appear to be null findings in their model of public mood toward welfare.²⁹
²⁸ The Monte Carlo results show that while the bounds test tends to avoid spurious conclusions of cointegration in small samples, it also tends to have a high rate of false negatives; thus, it is hard to ascertain whether their result holds.
²⁹ However, I find evidence of cointegration using this same approach when examining Kelly and Enns's other dependent variable, public mood liberalism, as detailed in the supporting information.

Application II: Volscho and Kelly (2012)

Volscho and Kelly (2012) use a GECM to probe the determinants of the rise in top income shares in the United States from 1949 to 2008. I examine their power resource model, which investigates whether the share of income of the top 1% is determined by political and institutional factors. Results from their original model are shown in Table 3, Model 1. As Volscho and Kelly find, increases in Democratic strength in Congress, union membership, and the presence of divided government tend to decrease the share of income held by the superrich, but only in the long run. In contrast, Democratic presidents have no effect.
To implement the ARDL-bounds procedure, I first ensured that the dependent variable, Top 1% Share, was I(1), as shown in Table 4 (step a in Figure 1). After confirming that the regressors are I(1) or less (step b), I used
TABLE 3 Results of the ARDL-Bounds Model (Volscho and Kelly 2012)

Variable | (1) Original GECM | (2) ARDL-Bounds
Top 1% Share_{t−1} | −0.36** (0.09) | −0.30** (0.07)
ΔDemocratic President_t | | 1.47** (0.53)
ΔDemocratic President_{t−1} | 0.14 (0.56) |
Democratic President_{t−1} | −0.20 (0.36) | 0.11 (0.34)
Δ% Congressional Democrat_t | | −0.03 (0.04)
Δ% Congressional Democrat_{t−1} | 0.05 (0.04) |
% Congressional Democrat_{t−1} | −0.12** (0.04) | −0.12** (0.03)
ΔDivided Government_t | | 0.37 (0.46)
ΔDivided Government_{t−1} | −0.11 (0.50) |
Divided Government_{t−1} | −0.93* (0.42) | −0.83* (0.37)
ΔUnion Membership_t | 0.29 (0.28) | 0.04 (0.28)
Union Membership_{t−1} | −0.11** (0.03) | −0.09** (0.02)
Constant | 15.05** (3.83) | 13.30** (2.81)
Observations | 60 | 61
Adjusted R² | 0.20 | 0.29
Breusch-Godfrey χ² of AR(1) | 1.39 | 3.19
Breusch-Godfrey χ² of AR(2) | 1.39 | 3.21
Breusch-Godfrey χ² of AR(3) | 2.79 | 5.03
Durbin's Alternative χ² of AR(1) | 1.16 | 2.76
Durbin's Alternative χ² of AR(2) | 1.14 | 2.72
Durbin's Alternative χ² of AR(3) | 2.29 | 4.31
Cumby-Huizinga χ² of AR(1)–AR(3) | 4.41 | 5.09
Shapiro-Wilk z | 0.17 | 0.99

Note: Dependent variable is the share of income of the top 1%. Model 1 shows results from Volscho and Kelly (2012), and Model 2 shows results using ARDL-bounds procedure, with lag structure determined by minimizing SBIC. Standard errors are in parentheses. * p < .05, ** p < .01.
TABLE 4 Top 1% Share Is I(1) (Volscho and Kelly 2012)

Unit Root Test | Top 1% Share
Augmented Dickey-Fuller (with drift) | 0.02
Phillips-Perron | −0.21
Dickey-Fuller GLS (with trend) | −1.35
Elliott-Rothenberg-Stock | −1.35
Kwiatkowski-Phillips-Schmidt-Shin (H0 = stationary) | 2.20*
Conclusion | I(1)

Note: T = 60 with 1-year lag included for all tests. H0 = series contains a unit root for all tests except KPSS. * p < .05.

SBIC to assist in lag selection for the ARDL model in error correction form, the result of which is shown in Model 2 (step e). Although the authors may have had theoretical reasons to use the “dead-start” GECM, I find instead that a model of contemporaneous short-run effects has a lower SBIC. While theory should always guide model specification, users must ensure that the residuals are white noise in order to run the bounds test; in this example, both the dead-start and standard GECM yielded white noise residuals.³⁰
³⁰ Therefore, one could use the bounds test on either model.

Since Model 2 contains white noise residuals, we can move on to cointegration testing using the bounds test (step n in Figure 1). An F-test of the joint significance of the five lagged variables (the four regressors plus the dependent variable) yields an F-statistic of 5.02. Critical values for 61 observations and four regressors are 3.068 and 4.274 for the lower and upper bounds, respectively. Since the F-statistic is greater than the I(1) upper bound, we can conclude that there is a cointegrating relationship (step o). As further confirmation, we can use the bounds t-test; the t-statistic on the lagged dependent variable is −4.01, which is below the critical value of the I(1) lower
bound (−3.99). Thus, there is strong evidence that all four top 1%. In the supporting information, I also replicate
regressors are cointegrating with the dependent variable. Ura (2014) and find evidence of cointegration.
The largest difference between Volscho and Kelly’s Although the examples above are representative of
(2012) original model and the ARDL-bounds model is most situations practitioners are likely to encounter, I
the significance of the short-run effect of a Democratic briefly review how users should proceed, given their own
president. To see whether this leads to different conclu- theoretically specified model:
sions than the ones made by the authors, in the support-
1. Unit root testing of the dependent variable. If
ing information I use dynamic simulations to help inter-
the dependent variable is I(1), proceed with the
pret how changes in one regressor affect the dependent
ARDL in error correction form.32
variable over time. Model-based dynamic simulations are
2. Ensure that no independent variables are of an
growing in popularity in political science (King, Tomz,
order of integration higher than I(1). The main
and Wittenberg 2000; Williams and Whitten 2012), and
advantage of the bounds approach is that users
they are especially valuable for examining complex model
do not have to make difficult decisions between
specifications such as autoregressive relationships with
I(0) and I(1) regressors; the results of the bounds
interactions (Williams and Whitten 2011) or dynamic
test inform us of these characteristics. However,
compositional dependent variables (Philips, Rutherford,
users must ensure that no variables are integrated
and Whitten 2015, 2016). The ARDL-bounds procedure’s
more than I(1), are explosive, or contain seasonal
lag structure makes it a prime candidate for dynamic sim-
unit roots.33
ulations. Using the program dynpss to create dynamic
3. Estimate the ARDL in error correction form. Since
simulations of the ARDL-bounds model (Philips 2016a),
the bounds testing procedure relies on white
I find that in the short run, moving from a Republican
noise residuals, add lags of the first differences of
to a Democratic president increases the income concen-
the dependent variable and regressors as needed.
tration of the top 1%. However, this effect loses statistical
Use theory and information criteria to aid in lag
significance after 4 years, it is not statistically significantly
specification. Ensure that the residuals are white
different from the predictions using Volscho and Kelly’s
noise.
(2012) GECM, and the long-run effect is nearly zero.31
4. Test the joint significance of all lagged variables
These results are available in the supporting information.
appearing in levels using a Wald/F-test. Use small-
In summary, I find evidence for cointegration in the
sample critical values of the bounds test in
power resources model of Volscho and Kelly (2012). While
Narayan (2005). As an auxiliary test, use the one-
the ARDL-bounds model had slight specification differ-
sided t-test of the lagged dependent variable us-
ences, the substantive findings do not change, as evi-
ing asymptotic critical values in Pesaran, Shin,
denced by dynamic simulations. Institutional and politi-
and Smith (2001).
cal factors may affect the income share of the top 1%, but
5. If the results of the bounds test.
only in the long-run.
(a) Suggest cointegration: All variables appearing
in levels appear to be I(1) and have a cointe-
grating relationship with the dependent vari-
Discussion and Conclusion able.
(b) Suggest stationarity: All regressors appearing
The two examples above represent a variety of situations in levels are I(0) and cannot possibly be in
that the ARDL-bounds approach is designed to handle. a cointegrating relationship. A model of first
For the Kelly and Enns (2010) replication, I find no differences must be estimated since the vari-
evidence of cointegration. Using the steps outlined in ables may still affect the dependent variable
Figure 1, I find no evidence that policy liberalism and in- in the short run.
come inequality affect welfare policy mood in the long- or (c) Are inconclusive: Each regressor should be
short-run. For the Volscho and Kelly (2012) replication, I tested for a unit root. Only I(1) variables can
find evidence of cointegration; these findings support the
32
authors’ conclusions about the long-run effect of institu- If the dependent variable is I(0), it is not first differenced, leading
tions and politics on the concentration of income of the to a lagged dependent variable model as shown in the Figure 1
schematic.
31 33
This is confirmed analytically by calculating the long-run mul- While the test statistics can be adjusted to account for determin-
tiplier, which is 0.36 and is not statistically significantly different istic trends in the dependent variable, it is advisable to identify and
from zero. detrend instead.
appear in levels in the error correction model. Stata and R designed to help users test for cointegration
Stationary variables may still appear in first and create dynamic simulations.36
differences.34 Repeat Steps 3 and 4. If the re- This article was motivated by a series of recent ar-
sulting statistic is still inconclusive, combi- ticles in the time-series literature that stress the impor-
nations of variables appearing in levels may tance of careful unit root and cointegration testing. To
need to be tested. Continue testing until (5a) achieve this, I have advocated for the autoregressive dis-
or (5b) is reached. tributed lag bounds approach. I have shown that the
6. Interpretation. Use dynamic simulations and an- ARDL-bounds procedure starts with a theoretically spec-
alytical calculations for hypothesis testing. ified model and moves step-by-step to arrive at an in-
formed conclusion. Through careful testing and model
While the ARDL-bounds procedure provides a com- specification, the ARDL-bounds procedure is a power-
prehensive approach to modeling time-series and testing ful approach to a difficult problem in applied time-series
for cointegration, it is not a remedy for all problems. First, analysis.
like all time-series models, it tends to perform poorly in
small samples. As a precaution against overfitting, Keele,
Linn, and Webb (2016, 40) suggest a minimum of be-
tween 10 and 20 observations per parameter.35 However,
References
as shown by Monte Carlo simulations, the bounds coin-
Balke, Nathan S., and Thomas B. Fomby. 1997. “Threshold
tegration test tends to perform at least as well as other Cointegration.” International Economic Review 38(3): 627–
cointegration tests in small samples. Second, this single- 45.
equation model imposes a causal ordering and assumes Box-Steffensmeier, Janet M., and Renee M. Smith. 1998. “In-
weak exogeneity of the regressors (Pesaran, Shin, and vestigating Political Dynamics Using Fractional Integration
Smith 2001, 293), a disadvantage shared with GECMs. Methods.” American Journal of Political Science 42(2): 661–
Users unwilling to impose a causal ordering should con- 89.
sider alternative methods such as vector error correction Choi, In. 2015. Almost all about unit roots: Foundations, develop-
ments, and applications. Cambridge: Cambridge University
models, which can account for multiple cointegrating re- Press.
lationships. Third, the cointegration test serves as a sub- De Boef, Suzanna, and Jim Granato. 1997. “Near-Integrated
stitute for unit root testing to distinguish between I(0) Data and the Analysis of Political Relationships.” American
and I(1) regressors only when the test results fall outside Journal of Political Science 41(2): 619–40.
of the critical bounds. Given an inconclusive test result, De Boef, Suzanna, and Luke Keele. 2008. “Taking Time Seri-
users must use unit root tests on all regressors and identify ously.” American Journal of Political Science 52(1): 184–200.
the stationary, I(1), and I(1)-and-cointegrating variables Dickinson, Matthew J., and Matthew J. Lebo. 2007. “Reexamin-
through an iterative process, as shown in the Kelly and ing the Growth of the Institutional Presidency, 1940–2000.”
Journal of Politics 69(1): 206–19.
Enns (2010) replication. Last, this procedure still requires
Enders, Walter. 2010. Applied econometric time series. 3rd ed.
balanced equations (Grant and Lebo 2016; Keele, Linn, New York: John Wiley and Sons.
and Webb 2016); although stationary regressors can ap- Engle, Robert F., and Clive W. J. Granger. 1987. “Co-integration
pear in levels in the final model, I(1) regressors that are not and Error Correction: Representation, Estimation, and Test-
cointegrating cannot appear in levels in the final model ing.” Econometrica 55(2): 251–76.
without risk of spurious regression. Esarey, Justin. 2016. “Fractionally Integrated Data and the Au-
To aid in the use of this approach, this article has todistributed Lag Model: Results from a Simulation Study.”
Political Analysis 24(1): 42–49.
provided a step-by-step guide for practitioners that can
be used with any software package that contains unit root, Grant, Taylor, and Matthew J. Lebo. 2016. “Error Correction
Methods with Political Time Series.” Political Analysis 24(1):
autocorrelation, and the F- and t-tests necessary for the 3–30.
bounds test (e.g., R, Stata, or EViews). In addition, in the Helgason, Agnar Freyr. 2016. “Fractional Integration Methods
supporting information, I discuss software programs in and Short Time Series: Evidence from a Simulation Study.”
Political Analysis 24(1): 59–68.
34
I(0) variables could appear in levels in the final model without 36
risking spurious regression. In Stata, these are pssbounds for displaying critical values of
the bounds test and dynpss for creating dynamic simulations of
35
I address concerns about overfitting in the supporting the ARDL-bounds model (Philips 2016a, 2016b). The pss package
information. implements these commands in R (Jordan and Philips 2016).
Johansen, Soren. 1995. Likelihood-based inference in cointegrated vector autoregressive models. Oxford: Oxford University Press.
Jordan, Soren, and Andrew Q. Philips. 2016. “pss: R Package to Perform the Bounds Test for Cointegration and Create Dynamic Simulations.” https://github.com/andyphilips/pss. R package version 1.3.9.
Keele, Luke, Suzanna Linn, and Clayton M. Webb. 2016. “Treating Time with All Due Seriousness.” Political Analysis 24(1): 31–41.
Kelly, Nathan J., and Peter K. Enns. 2010. “Inequality and the Dynamics of Public Opinion: The Self-Reinforcing Link between Economic Inequality and Mass Preferences.” American Journal of Political Science 54(4): 855–70.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44: 347–61.
Lebo, Matthew J., Robert W. Walker, and Harold D. Clarke. 2000. “You Must Remember This: Dealing with Long Memory in Political Analyses.” Electoral Studies 19(1): 31–48.
MacKinnon, James G. 1994. “Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests.” Journal of Business and Economic Statistics 12(2): 167–76.
Maddala, Gangadharrao S., and In-Moo Kim. 1998. Unit roots, cointegration, and structural change. Cambridge: Cambridge University Press.
Narayan, Paresh Kumar. 2005. “The Saving and Investment Nexus for China: Evidence from Cointegration Tests.” Applied Economics 37(17): 1979–90.
Pesaran, M. Hashem, Yongcheol Shin, and Richard J. Smith. 2001. “Bounds Testing Approaches to the Analysis of Level Relationships.” Journal of Applied Econometrics 16(3): 289–326.
Philips, Andrew Q. 2016a. “dynpss: Stata Module to Dynamically Simulate Autoregressive Distributed Lag (ARDL) Models.” https://andyphilips.github.io/dynpss/.
Philips, Andrew Q. 2016b. “pssbounds: Stata Module to Conduct the Pesaran, Shin, and Smith (2001) Bounds Test for Cointegration.” http://andyphilips.github.io/pssbounds/.
Philips, Andrew Q., Amanda Rutherford, and Guy D. Whitten. 2015. “The Dynamic Battle for Pieces Of Pie—Modeling Party Support in Multi-Party Nations.” Electoral Studies 39: 264–74.
Philips, Andrew Q., Amanda Rutherford, and Guy D. Whitten. 2016. “Dynamic Pie: A Strategy for Modeling Trade-Offs in Compositional Variables over Time.” American Journal of Political Science 60(1): 268–83.
Pickup, Mark. 2009. “Testing for Fractional Integration in Public Opinion in the Presence of Structural Breaks: A Comment on Lebo and Young.” Journal of Elections, Public Opinion and Parties 19(1): 105–16.
Ura, Joseph Daniel. 2014. “Backlash and Legitimation: Macro Political Responses to Supreme Court Decisions.” American Journal of Political Science 58(1): 110–26.
Volscho, Thomas W., and Nathan J. Kelly. 2012. “The Rise of the Super-Rich: Power Resources, Taxes, Financial Markets, and the Dynamics of the Top 1 Percent, 1949 to 2008.” American Sociological Review 77(5): 679–99.
Williams, Laron K., and Guy D. Whitten. 2011. “Dynamic Simulations of Autoregressive Relationships.” Stata Journal 11(4): 577–88.
Williams, Laron K., and Guy D. Whitten. 2012. “But Wait, There’s More! Maximizing Substantive Inferences from TSCS Models.” Journal of Politics 74(3): 685–93.

Supporting Information

Additional Supporting Information may be found in the online version of this article at the publisher’s website:

1. Programs to Assist in Implementing the Pesaran, Shin and Smith (2001) ARDL Procedure
2. Summary of Monte Carlo Results
3. Additional Monte Carlo Results
4. Proof of the Equivalence of the Triangular Error-Correction Representation to the Standard Representation
5. Three Replications
The relationship between Trade, FDI and Economic growth in Tunisia: An
application of autoregressive distributed lag model
Dr. Mounir BELLOUMI
Address: Faculty of Economics and Management of Sousse, University of Sousse
City Erriadh 4023 Sousse Tunisia.
E-mail: mounir.balloumi@gmail.com / mounir.belloumi@fdseps.rnu.tn
Phone: +216 73 30 18 09
Fax: +216 73 30 18 88
Abstract:
This paper examines the dynamic causal relationships between foreign direct investment
(FDI), trade and economic growth in Tunisia by applying the bounds testing (ARDL)
approach to cointegration for the period from 1970 to 2008. The bounds tests suggest that the
variables of interest are bound together in the long-run when foreign direct investment is the
dependent variable. The associated equilibrium correction term was also significant, confirming the
existence of a long-run relationship. The results also indicate that there is no significant
Granger causality from FDI to economic growth, from economic growth to FDI, from trade to
economic growth and from economic growth to trade in the short run.
Key words: FDI, trade, economic growth, ARDL cointegration, Tunisia.
JEL classification: C22, F13, F21.
1. Introduction
Trade and FDI inflows are well known as very important factors in the economic growth
process. Trade plays the role of upgrading skills through the importation and adoption of
superior production technology and innovation. Exporters acquire innovative and advanced
production technology either by acting as subcontractors to foreign enterprises or through
competition in international markets. Producers of import substitutes face competition from
foreign firms and are pushed to adopt more capital-intensive production facilities to withstand
the strong competition in developing countries, where products are usually capital-intensive
(Frankel and Romer, 1999). The impact of trade openness on economic growth can be
positive and significant due mainly to the accumulation of physical capital and technological
transfer.
Inward FDI can play an important role by augmenting the supply of funds for
domestic investment in the host country. This can be done through the production chain when
foreign investors buy locally made inputs and sell intermediate inputs to local enterprises.
Furthermore, inward FDI can increase the host country’s export capacity, allowing the
developing country to increase its foreign exchange earnings. FDI can also encourage the
creation of new jobs, enhance technology transfer and boost overall economic growth in
host countries.
The majority of past empirical studies have dealt with either trade and FDI interaction on
economic growth (Balasubramanyam et al., 1996; Karbasi et al., 2005), or the relationship
between FDI and economic growth (Lipsey, 2000) or the relationship between trade and
economic growth (Pahlavani, et al., 2005). All these studies have concluded that both FDI
inflows and trade promote economic growth. However, the studies have failed to provide a
conclusive result on the relation in general and the direction of the causality in particular in
many developing countries. The growth-enhancing effects of FDI inflows and trade vary
from country to country and over time. For some countries, FDI and trade can even negatively
affect economic growth (Balasubramanyam et al., 1996; Borensztein et al., 1998; Lipsey,
2000; De Mello, 1999; Xu, 2000).
Some past studies on this subject suffer from two limitations. The first limitation is that these
studies used cointegration techniques based on either the Engle and Granger (1987)
cointegration test or the maximum likelihood test based on Johansen (1988) and Johansen and
Juselius (1990). However, these cointegration techniques may not be appropriate when the sample
size is too small (Odhiambo, 2009). Odhiambo (2009) uses the bounds testing cointegration
approach developed by Pesaran et al. (2001), which is more robust in small samples. The
second limitation is that, by using cross-sectional data, some studies do not address
country-specific issues (Odhiambo, 2009; Ghirmay, 2004; Casselli et al., 1996).
The current study investigates the dynamic causal relationship between trade, FDI and
economic growth in Tunisia by implementing the newly developed ARDL-Bounds testing
approach to cointegration. Trade and FDI are expressed as a ratio of GDP. The proxy of
economic growth is real GDP per capita. Labour and capital investments are also considered
in the model. The Granger procedure is used to test the direction of causality within the
Vector Error Correction Model (VECM). If a set of variables is cointegrated, they must have
an error correction representation wherein an error correction term (ECT) must be
incorporated in the model (Engle and Granger, 1987). The advantage of VECM is the
reintroduction of the information lost by differencing time series. This step is essential for
investigating the short-run dynamics and the long-run equilibrium.
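As a hedged illustration of the error correction representation mentioned above, an unrestricted error correction form of an ARDL model with real GDP per capita (Y) as the dependent variable could be written as follows. The lag orders p and q_1, ..., q_4, the coefficient symbols, the abbreviation TR for trade, and the choice of dependent variable are assumptions made for illustration; this is not necessarily the specification estimated in the paper, whose abstract reports cointegration when FDI is the dependent variable.

```latex
\Delta Y_t = \alpha_0
  + \sum_{i=1}^{p}   \beta_i  \,\Delta Y_{t-i}
  + \sum_{i=0}^{q_1} \gamma_i \,\Delta FDI_{t-i}
  + \sum_{i=0}^{q_2} \delta_i \,\Delta TR_{t-i}
  + \sum_{i=0}^{q_3} \theta_i \,\Delta L_{t-i}
  + \sum_{i=0}^{q_4} \phi_i   \,\Delta K_{t-i}
  + \lambda_1 Y_{t-1} + \lambda_2 FDI_{t-1} + \lambda_3 TR_{t-1}
  + \lambda_4 L_{t-1} + \lambda_5 K_{t-1} + \varepsilon_t

H_0:\ \lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = \lambda_5 = 0
\qquad \text{(no long-run level relationship)}
```

In this form the bounds F-test is the joint test of the lagged-level coefficients, and the differenced terms capture the short-run dynamics.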
Despite the abundant literature on FDI, trade and economic growth in many emerging and
developing countries, there is little empirical work on this subject in Tunisia. By assessing
the role of FDI inflows, we can draw important lessons and guidelines for policy makers
in their pursuit of a more effective scheme to promote economic growth in Tunisia, which is
suffering from a high rate of unemployment. What role can FDI and trade play in the
new Tunisia in meeting the challenges that the revolution spawned? This study will add valuable
knowledge to the existing literature on Tunisia. The study is relevant because the twin policy
targets of FDI attraction and trade liberalisation have been an integral preoccupation of Tunisia
since the IMF Structural Adjustment Programme of 1986 and continue to be so after the
revolution of 14th January 2011.
The rest of the paper is structured as follows: Section 2 presents a brief literature review.
Section 3 gives an overview of Tunisia’s foreign direct investment and regional trade
agreements. Section 4 describes the data used, while Section 5 deals with the estimation
technique and the empirical analysis of the results. Section 6 concludes the paper.
2. A brief literature review
The literature studying the impacts of FDI and trade on economic growth is very large. The
effect of each one of the two variables of FDI and trade on economic growth has generally
been studied for many countries using various sample periods and econometric approaches
and methods. The results of some papers studying the effects of trade (or exports) and FDI on
economic growth in developing countries are promising (Balassa, 1985; Sengupta and
Espana, 1996). There is evidence for the export-led growth hypothesis (ELGH) and FDI-led
growth hypothesis (FLGH). These hypotheses, which are supported, are based on the idea that
exports and FDI variables are the main drivers of economic growth.
Ghirmay et al. (2001) studied the relationship between exports and economic growth in
nineteen developing countries. Their results supported a long-run relationship between the
two variables only in twelve of the developing countries and the promotion of exports
attracted investment and increased GDP in these countries. By using a bivariate technique,
Mamun and Nath (2003) found a long-run unidirectional causality from exports to economic
growth in Bangladesh. Narayan et al. (2007) examined the export-led growth hypothesis for
Fiji and Papua New Guinea. Their results support the ELGH in the long-run for Fiji, while for
Papua New Guinea there is evidence of ELGH in the short-run.
Empirical studies of the FLGH have found that FDI promotion can greatly
benefit host countries through the introduction of new technologies and skills, the creation of new
jobs, stronger domestic competition and expanded access to international marketing networks.
According to Blomstrom et al. (1992), FDI promotes economic growth when the host
economy is a developed one. Boyd and Smith (1992) find that FDI may negatively affect
growth through the misallocation of resources in the presence of pre-existing trade, price
and other distortions. Borensztein et al. (1998) studied the effect of FDI on
economic growth using a cross-country regression approach. According to their findings, FDI can
be an important channel for the transfer of modern technology, but its effectiveness
depends on the stock of human capital in the host country. According to Nair-Reichert and
Weinhold (2001), the causal relationship between foreign and domestic investment
and economic growth in developing countries is heterogeneous. The authors justify these
results by the homogeneity of assumptions imposed across countries. Using new statistical
techniques and two new databases to reassess the relationship between economic growth and
FDI, Carkovic and Levine (2002) found no evidence for the FLGH.
According to Anthukorala (2003), FDI had a positive effect on GDP, and there was unidirectional
causality running from GDP to FDI, in Sri Lanka. Baliamoune-Lutz (2004) finds
that the impact of FDI on economic growth is positive and that there is a bidirectional relationship
between exports and FDI in Morocco. This result implies that FDI can also promote exports
and vice versa. Also, some authors have studied the relationship between regional integration
and FDI. Darrat et al. (2005) investigated the impact of FDI on economic growth in Central
and Eastern Europe (CEE) and the Middle East and North Africa (MENA) regions. They
found that FDI inflows stimulate economic growth in EU accession countries, while the
impact of FDI on economic growth in MENA and in non-EU accession countries is either
non-existent or negative. Like Darrat et al. (2005), Hisarciklilar et al. (2006)
do not find causality between FDI and GDP for most of the following Mediterranean countries:
Algeria, Cyprus, Egypt, Israel, Jordan, Morocco, Syria, Tunisia and Turkey, for the period
1979-2000. These countries could create an environment that attracts FDI and leads to the
transfer of technology and skills and to increased production, job creation and exports.
Research examining the impacts of exports and FDI on GDP within the same model has also
produced ambiguous results. For example, according to Alia and Dcal (2003), there is
evidence of ELGH for Turkey but not of FLGH, because the spillover effects from FDI to GDP
are not present. In the Latin American countries (Argentina, Brazil, and Mexico), Alguacil et
al. (2000) found that the FLGH is confirmed but not ELGH. The authors found that FDI
promotes economic growth and trade. Dritsaki and Adamopoulos (2004) found a
unidirectional causal relationship from FDI to economic growth and a bidirectional causal
relationship between exports and economic growth for Greece. According to Yao (2006),
there is a strong relationship between exports, FDI and economic growth for China. Rahman
(2007) re-examined the effects of exports, FDI and expatriates’ remittances on real GDP of
some Asian countries (Bangladesh, India, Pakistan and Sri Lanka) using the ARDL technique
for cointegration for the period 1976-2006. The ARDL technique confirmed a cointegrating
relationship among the variables in these three countries. The short-run net effects of exports on
the real GDP of Bangladesh are more visible than those of FDI. The same applies to India,
with some minor exceptions for relatively stronger short-run effects. In the case of Pakistan,
FDI was found to exert net restrictive effects on its real GDP, though not highly significant.
For Sri Lanka, FDI was found to have consistently restrictive effects on real GDP.
Alalaya (2008) investigated the relationship between economic growth, trade and FDI for
Jordan for the period 1990-2008 by applying the ARDL model for cointegration. He
found a unidirectional causal effect from trade and FDI to economic growth. The speed of
adjustment in the model was found to be 0.587, which appears relatively high and is significant.
3. Tunisia’s foreign direct investment and regional trade agreements
During the last decades, the Tunisian government has adopted many measures to attract
FDI inflows, in the belief that these inflows will introduce modern technology, enhance
productivity and stimulate export-led growth. Tunisia’s structural adjustment plan was set in
1986. It encouraged standard fiscal and monetary policy reforms and liberalization of the
financial sector. This programme has marked the progress of Tunisia’s
economic development. A policy of gradual trade liberalization was pursued, first by
implementing current account convertibility, followed by accession to the GATT agreements
and by a free trade association with the European Union in 1995, which went into effect on
January 1, 2008. The objective of the agreement is to eliminate customs tariffs and other trade
barriers on a wide range of goods and services. However, the most important aspect of the
association agreement may well be that it has served to anchor Tunisia’s commitment to
reforms.
“Tunisia provided a wide range of incentives such as a tax relief up to 35 percent on
reinvested revenues and profits (30 percent starting from 2007), exemptions from customs
duties and a 10 percent reduction of VAT for imported capital goods having no Tunisian
manufacturing equivalent, a suspension of VAT and sales tax on locally produced equipment
at company start-up and an optional depreciation scheduling for capital equipment older than
seven years. Additional incentives are provided to off-shore industries or totally exporting
industries such as full exemption on corporate profits earned on export for the first ten years
and 50 percent reduction thereafter (granted also to partially exporting firms), full tax
exemption on reinvested profits and income, total exemption from customs duties on imported
capital goods, raw materials, semi finished goods and services necessary for business” (Ghali
and Rezgui, 2007).
According to Ghali and Rezgui (2007), net FDI flows reached 2.2% of GDP in 1990.
Until the first half of the 1990s, about 80 percent of FDI went to the petroleum and gas sector.
Due to the privatization program, the share of total FDI in the petroleum
and gas sector decreased to 58 percent in 1998. FDI has shifted toward the
manufacturing sector.
The largest foreign investor in Tunisia is the European Union (EU). Its FDI is mainly oriented
to the development of the infrastructure network and the textiles and clothing sectors.
Trade openness is important as a vehicle for technological spillovers. In order to benefit from
trade openness, Tunisia needs to have trade partners that are capable of providing it with
technology embodied in products, machines and equipment that the country has in short
supply. So, by importing capital equipment and intermediate products from developed
countries that have a larger stock of knowledge, Tunisia can improve its own stock of
knowledge.
Tunisia has been a member of the WTO since March 1995. In order to benefit from trade
openness, Tunisia signed a Euro-Mediterranean Association Agreement (AA) with the
European Union in July 1995. It was the first country to sign an AA with the EU among the
South Mediterranean countries which are engaged in the Barcelona Process. However, this
agreement was ratified and entered into force in March 1998. The main objective of the AA is
liberalisation and facilitation of the exchange of goods, services and capital. Tunisia
completed the dismantling of tariffs on industrial products in 2008.
The EU is Tunisia’s main trading partner. Tunisia’s main exports to the EU are
manufactured products, raw energy and phosphate, and agricultural products. The EU accounted for
about 80% of Tunisia’s exports in 2008, and these exports experienced a growth rate of more
than 9% from 2003 to 2008. Tunisia’s main imports from the EU are machinery and transport
equipment, textiles, chemicals and refined energy. These imports from EU countries accounted for
nearly 65% of Tunisia’s needs in goods and grew at an estimated average annual rate of 7.2%
(Boughzala, 2010).
Tunisia also has trade relations with several Arab countries.
It signed a bilateral agreement with Libya, which entered into force in 2002. It signed
the Agadir agreement with Morocco, Egypt and Jordan on 25 February 2004. This committed
all partners to removing substantially all tariffs on trade between them and to harmonizing
their legislation with regard to standards and customs procedures. Although this agreement
entered into force in July 2006, its effective implementation started only in April 2007.
Tunisia also signed a free trade agreement with Turkey in
November 2004. This agreement replaced the old one, which was signed in 1992, and entered
into force in July 2005.
Tunisia's Euro-Med agreement with the EU can increase the openness of the Tunisian
economy and hence increase FDI inflows to Tunisia. The aim of the Mediterranean countries
was to create an environment that attracts FDI, which could in turn lead to technology
transfer, higher production, job creation and more exports. This objective is our main
motivation for investigating the relationship between FDI and economic growth in Tunisia. In
this study we examine whether FDI has beneficial effects on employment, trade and economic
growth in Tunisia.
4. Data sources and description of variables
Annual time series data on economic growth, FDI, trade, labour and capital stock, which
cover the 1970-2008 period, have been used in this study. The data has been obtained from
different sources, including Tunisia Central Bank annual reports, quarterly bulletins, etc. In
addition, different volumes of the International Financial Statistics (IFS) Yearbook, published
by the International Monetary Fund, and World Development Indicators 2009 edition
published online by the World Bank have been used to supplement the local data.
The economic growth variable, measured by real GDP per capita, is denoted by Y. FDI (F) is
the ratio of real gross foreign direct investment inflows to GDP; trade openness (T) is the
sum of exports and imports divided by GDP; labour (L) is measured as the total labour force;
and the capital stock (K) is measured by the real value of gross fixed capital formation
(GFCF).
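The variable definitions above can be sketched in a few lines of Python. This is only an illustration of how the five log series would be built for a single year: the function name, argument names and input figures are hypothetical and are not taken from the paper's data set.

```python
import math

def build_variables(gdp_real, population, fdi_inflows, exports, imports,
                    labour_force, gfcf_real):
    """Return the five log series of the model for one year (illustrative only)."""
    y = gdp_real / population             # real GDP per capita (Y)
    f = fdi_inflows / gdp_real            # FDI inflows as a ratio to GDP (F)
    t = (exports + imports) / gdp_real    # trade openness (T)
    return {
        "lnY": math.log(y),
        "lnF": math.log(f),
        "lnT": math.log(t),
        "lnL": math.log(labour_force),    # total labour force (L)
        "lnK": math.log(gfcf_real),       # gross fixed capital formation (K)
    }

# Hypothetical inputs in arbitrary units, not actual Tunisian data.
example = build_variables(gdp_real=100.0, population=10.0, fdi_inflows=3.0,
                          exports=40.0, imports=50.0, labour_force=3.5,
                          gfcf_real=25.0)
print(example)
```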
5. Econometric methodology and empirical results
5.1. Unit roots tests
In time series analysis, the variables must be tested for stationarity before running causality
tests. For this purpose we use the conventional augmented Dickey-Fuller (ADF) test, the
Phillips-Perron (PP) test of Phillips and Perron (1988), and the Dickey-Fuller generalised
least squares (DF-GLS) de-trending test proposed by Elliott et al. (1996).
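To make the logic of these tests concrete, the simplest Dickey-Fuller regression, D(y_t) = alpha + rho * y_{t-1} + e_t, can be sketched as follows. This is a simplified illustration, not the procedure used for Tables 1 and 2: it includes a constant but no trend and no lagged differences, and the two simulated series are hypothetical. The t-statistic on rho is compared with a Dickey-Fuller critical value (the tables below use -2.94 at 5% for the model without trend), not with the usual Student t value.

```python
import math
import random

def dickey_fuller_tstat(y):
    """t-statistic on rho in D(y_t) = alpha + rho * y_{t-1} + e_t (no lagged differences)."""
    x = y[:-1]                                    # y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]   # D(y_t)
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    rho = sxy / sxx                               # OLS slope
    alpha = dy_bar - rho * x_bar                  # OLS intercept
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)       # standard error of the slope
    return rho / se_rho

# Simulated series (hypothetical data): a random walk and a stationary AR(1).
rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]
random_walk, stationary = [0.0], [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)       # unit root: true rho = 0
    stationary.append(0.2 * stationary[-1] + e)   # AR coefficient 0.2: true rho = -0.8

print(round(dickey_fuller_tstat(random_walk), 2))
print(round(dickey_fuller_tstat(stationary), 2))
```

A strongly negative statistic, as for the stationary series, rejects the unit-root null; the ADF test adds lagged differences to the same regression to absorb serial correlation.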
The ARDL bounds test is based on the assumption that the variables are I(0) or I(1). So,
before applying this test, we determine the order of integration of all variables using the unit
root tests. The objective is to ensure that no variable is I(2), so as to avoid spurious
results: in the presence of variables integrated of order two, the critical values for the
F-statistic provided by Pesaran et al. (2001) are no longer valid.
The results of the stationarity tests, given in Table 1, show that the variables are
non-stationary in levels. The one exception is the DF-GLS statistic for Ln(F) (-2.73 against a
5% critical value of -1.95), which rejects the unit root, whereas the ADF and PP tests for the
same series do not. The ADF, Phillips-Perron and DF-GLS tests applied to the first
differences of the series reject the null hypothesis of non-stationarity for all the variables
used in this study (Table 2). We therefore conclude that the variables are integrated of order
one.
Table 1. ADF, DF-GLS and PP unit root tests on log levels of variables
            ADF test                         DF-GLS test                      PP test
Variable    SIC lag  t-Stat     5% crit.     SIC lag  t-Stat     5% crit.     t-Stat     5% crit.
Ln(Y)       0        -2.40***   -3.53        0        -1.75***   -3.19        -2.55***   -3.53
Ln(K)       1        -3.14***   -3.53        1        -2.52***   -3.19        -2.01**    -2.94
Ln(L)       0        -2.18**    -2.94        0        -0.92***   -3.19        -2.20**    -2.94
Ln(F)       0        -2.79**    -2.94        0        -2.73**    -1.95        -2.76**    -2.94
Ln(T)       1        -3.17***   -3.53        1        -2.58***   -3.19        -2.53***   -3.53
*model without constant and trend, **model without trend, ***model with constant and trend
Table 2. ADF, DF-GLS and PP unit root tests on first differences of log levels of variables
            ADF test                         DF-GLS test                      PP test
Variable    SIC lag  t-Stat     5% crit.     SIC lag  t-Stat     5% crit.     t-Stat     5% crit.
Ln(Y)       0        -6.21**    -2.94        1        -3.07**    -1.95        -6.26**    -2.94
Ln(K)       0        -3.73**    -2.94        0        -3.62**    -1.95        -3.24*     -1.95
Ln(L)       0        -5.58**    -2.94        0        -5.62**    -1.95        -5.59**    -2.94
Ln(F)       0        -7.82*     -1.95        0        -7.66**    -1.95        -10.80*    -1.95
Ln(T)       0        -4.87**    -2.94        0        -4.89**    -1.95        -4.47*     -1.95
*model without constant and trend, **model without trend, ***model with constant and trend
5.2. ARDL Bounds tests for cointegration
In order to analyse empirically the long-run relationships and short-run dynamic interactions
among the variables of interest (trade, FDI, labour, capital investment and economic growth),
we apply the autoregressive distributed lag (ARDL) cointegration technique, formulated as a
general vector autoregressive (VAR) model of order p in Zt, where Zt is a column vector
composed of the five variables: Zt = (Yt Kt Lt Ft Tt)’. The ARDL cointegration approach was
developed by Pesaran and Shin (1999) and Pesaran et al. (2001). It has three advantages over
earlier cointegration methods. First, the ARDL approach does not require all the variables
under study to be integrated of the same order: it can be applied when the underlying
variables are integrated of order one, order zero or fractionally integrated. Second, the ARDL
test is relatively more efficient in small, finite samples. Third, the ARDL technique yields
unbiased estimates of the long-run model (Harris and Sollis, 2003). The ARDL model used in
this study is expressed as follows:
p
D(ln(Yt ))  a 01  b11 ln(Yt 1 )  b21 ln( K t 1 )  b31 ln( Lt 1 )  b41 ln( Ft 1 )  b51 ln(Tt 1 )   a1i D(ln(Yt i ))
i 1
q q q q
  a 2i D(ln( K t i ))   a3i D(ln( Lt i ))   a 4i D(ln( Ft i ))   a5i D(ln(Tt i ))   1t (1)
i 1 i 1 i 1 i 1
p
D(ln( K t ))  a02  b12 ln(Yt 1 )  b22 ln( K t 1 )  b32 ln( Lt 1 )  b42 ln( Ft 1 )  b52 ln(Tt 1 )   a1i D(ln( K t i ))
i 1
q q q q
  a 2i D(ln(Yt i ))   a 3i D(ln( Lt i ))   a 4i D(ln( Ft i ))   a5i D(ln(Tt i ))   2t ( 2)
i 1 i 1 i 1 i 1
p
D(ln( Lt ))  a 03  b13 ln(Yt 1 )  b23 ln( K t 1 )  b33 ln( Lt 1 )  b43 ln( Ft 1 )  b53 ln(Tt 1 )   a1i D(ln( Lt i ))
i 1
q q q q
  a 2i D(ln( K t i ))   a3i D(ln(Yt i ))   a 4i D(ln( Ft i ))   a5i D(ln(Tt i ))   3t (3)
i 1 i 1 i 1 i 1
p
D(ln( Ft ))  a04  b14 ln(Yt 1 )  b24 ln( K t 1 )  b34 ln( Lt 1 )  b44 ln( Ft 1 )  b54 ln(Tt 1 )   a1i D(ln( Ft i ))
i 1
q q q q
  a 2i D(ln( K t i ))   a3i D(ln( Lt i ))   a 4i D(ln(Yt i ))   a5i D(ln(Tt i ))   4t ( 4)
i 1 i 1 i 1 i 1
10
p
D(ln(Tt ))  a 0  b15 ln(Yt 1 )  b25 ln( K t 1 )  b35 ln( Lt 1 )  b45 ln( Ft 1 )  b55 ln(Tt 1 )   a1i D(ln(Tt i ))
i 1
q q q q
  a 2i D(ln( K t i ))   a3i D(ln( Lt i ))   a 4i D(ln( Ft i ))   a5i D(ln(Yt i ))   5t (5)
i 1 i 1 i 1 i 1
Where all variables are as previously defined in section 4, ln(.) is the logarithm operator, D is
the first difference, and εt are the error terms.
The bounds test is based on the joint F-statistic, whose asymptotic distribution is
non-standard under the null hypothesis of no cointegration. The first step in the ARDL bounds
approach is to estimate the five equations (1)-(5) by ordinary least squares (OLS). For each
equation, the existence of a long-run relationship among the variables is tested with an
F-test for the joint significance of the coefficients on the lagged levels of the variables,
i.e. H0: b1i = b2i = b3i = b4i = b5i = 0 against the alternative H1: at least one of b1i, b2i,
b3i, b4i, b5i is non-zero, for i = 1, 2, 3, 4, 5. We denote the F-statistic of the test that
normalizes on Y by FY (Y\ K, L, F, T). Two sets of critical values for a given significance
level can be determined (Pesaran et al., 2001). The lower bound is calculated on the
assumption that all variables included in the ARDL model are integrated of order zero, while
the upper bound is calculated on the assumption that the variables are integrated of order
one. The null hypothesis of no cointegration is rejected when the value of the test statistic
exceeds the upper critical bound, and it cannot be rejected if the F-statistic is below the
lower bound. Otherwise, the cointegration test is inconclusive.
The use of this approach is guided by the short data span. We set a maximum lag order of 2
for the conditional ARDL vector error correction model and select the lag length with the
Akaike information criterion (AIC). The calculated F-statistics, with each variable in turn
taken as the dependent (normalized) variable in the ARDL-OLS regressions, are reported in
Table 3. Their values are: for equation (1), FY (Y\L, K, F, T) = 1.992; for equation (2),
FK (K\Y, L, F, T) = 2.552; for equation (3), FL (L\Y, K, F, T) = 0.762; for equation (4),
FF (F\Y, K, L, T) = 6.701; and for equation (5), FT (T\Y, K, F, L) = 2.736. These results
indicate a long-run relationship among the variables when FDI is the dependent variable,
because its F-statistic (6.701) is higher than the upper-bound critical value (4.15) reported
in Table 3. The null hypothesis of no cointegration among the variables in equation (4) is
therefore rejected. For the other equations, (1), (2), (3) and (5), the F-statistics lie below
the lower-bound critical value (3.06), so the null hypothesis of no cointegration cannot be
rejected.
Table 3. Results from bounds tests
Dependent variable     AIC lags   F-statistic   Decision
FF (F\Y, K, L, T)      2          6.701         Cointegration
FY (Y\L, K, F, T)      2          2.365         No cointegration
FL (L\Y, K, F, T)      1          0.762         No cointegration
FT (T\Y, K, F, L)      1          2.736         No cointegration
FK (K\Y, L, F, T)      1          2.552         No cointegration
Lower-bound critical value at 1%: 3.06
Upper-bound critical value at 1%: 4.15
Lower- and upper-bound critical values are taken from Pesaran et al. (2001), Table CI(ii) Case II.
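The decision rule of the bounds test can be sketched as follows. The F-statistics and the two bounds are those reported in Table 3; the function names and the figures passed to `joint_f_statistic` are hypothetical and serve only to show how the joint F-statistic is formed from restricted and unrestricted residual sums of squares.

```python
def joint_f_statistic(rss_restricted, rss_unrestricted, n_restrictions, n_obs, n_params):
    """F-statistic for H0: all coefficients on the lagged levels are zero."""
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - n_params)
    return numerator / denominator

def bounds_decision(f_stat, lower, upper):
    """Three-way outcome of the bounds test."""
    if f_stat > upper:
        return "cointegration"
    if f_stat < lower:
        return "no cointegration"
    return "inconclusive"

LOWER, UPPER = 3.06, 4.15   # bounds reported in Table 3
table3 = {"F": 6.701, "Y": 2.365, "L": 0.762, "T": 2.736, "K": 2.552}
decisions = {name: bounds_decision(f, LOWER, UPPER) for name, f in table3.items()}
print(decisions)

# Hypothetical residual sums of squares, for illustration only.
print(joint_f_statistic(rss_restricted=120.0, rss_unrestricted=100.0,
                        n_restrictions=5, n_obs=36, n_params=16))
```

The "inconclusive" branch matters in practice: a statistic between the two bounds cannot be interpreted without knowing the order of integration of the regressors.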
5.3. Granger short run and long run causality tests
Once cointegration is established, the conditional ARDL (p, q1, q2, q3, q4) long-run model for
ln(Ft) can be estimated as:
$$
\ln F_t = a_0 + \sum_{i=1}^{p} a_{1i} \ln F_{t-i} + \sum_{i=0}^{q_1} a_{2i} \ln K_{t-i} + \sum_{i=0}^{q_2} a_{3i} \ln L_{t-i} + \sum_{i=0}^{q_3} a_{4i} \ln Y_{t-i} + \sum_{i=0}^{q_4} a_{5i} \ln T_{t-i} + \varepsilon_t \tag{6}
$$
where all variables are as previously defined. The orders of the ARDL (p, q1, q2, q3, q4)
model in the five variables are selected using the AIC, which yields an ARDL (1, 0, 0, 0, 0)
specification for equation (6). The long-run results obtained by normalizing on FDI are
reported in Table 4.
Table 4. Estimated long-run coefficients using the ARDL approach
Variable   Coefficient   t-statistic   Probability
C          -14.57        -1.44         0.15
Ln(Y)      0.93          0.89          0.37
Ln(L)      -1.82         -2.45         0.01
Ln(K)      1.87          2.50          0.01
Ln(T)      -1.17         -1.26         0.21
The estimated coefficients of the long-run relationship are significant for capital and labour
but not for trade and economic growth. Capital investment has a positive and significant
impact on FDI at the 5% level. The labour force variable is negatively signed and significant
at the 5% level, which may reflect the growing unemployment problem and the low productivity
of labour in Tunisia. Trade openness has a negative impact on FDI, but it is insignificant at
the 5% level. Economic growth has a positive impact on FDI, but it is also insignificant at
the 5% level.
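Long-run coefficients such as those in Table 4 are derived from the level equation (6) by normalizing on the dependent variable. A minimal sketch of that mapping, assuming the standard ARDL formula in which the long-run coefficient of a regressor equals the sum of its level coefficients divided by one minus the sum of the autoregressive coefficients; the function name and the numbers used are hypothetical, not estimates from the paper.

```python
def long_run_coefficient(betas, phis):
    """Long-run effect of a regressor in an ARDL model.

    betas: coefficients on the regressor's current and lagged levels
    phis:  coefficients on the lags of the dependent variable
    """
    persistence = sum(phis)
    if abs(1.0 - persistence) < 1e-12:
        raise ValueError("sum of autoregressive coefficients equals 1: no long-run solution")
    return sum(betas) / (1.0 - persistence)

# Hypothetical ARDL(1, 0) example: y_t = a0 + 0.5 * y_{t-1} + 0.4 * x_t + e_t
theta = long_run_coefficient(betas=[0.4], phis=[0.5])
print(theta)   # long-run response of y to a permanent unit change in x
```

In an ARDL(1, 0, 0, 0, 0) specification each regressor has a single level coefficient, so every long-run coefficient is that coefficient scaled by the same factor 1 / (1 - phi).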
Following Odhiambo (2009) and Narayan and Smyth (2008), we obtain the short-run dynamic
parameters by estimating an error correction model associated with the long-run estimates.
The existence of a long-run relationship between the variables implies that there is Granger
causality in at least one direction, which is determined by the F-statistic and the lagged
error-correction term. The short-run causal effect is represented by the F-statistic on the
explanatory variables, while the t-statistic on the coefficient of the lagged error-correction
term represents the long-run causal relationship (Odhiambo, 2009; Narayan and Smyth, 2006).
Only the equation in which the null hypothesis of no cointegration is rejected is estimated
with an error-correction term (Narayan and Smyth, 2006; Morley, 2006).
The vector error correction model is specified as follows:
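$$
D(\ln F_t) = a_0 + \sum_{i=1}^{p} a_{1i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \lambda\, ECT_{t-1} + \varepsilon_t \tag{7}
$$
$$
D(\ln Y_t) = a_0 + \sum_{i=1}^{p} a_{1i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_t \tag{8}
$$
$$
D(\ln K_t) = a_0 + \sum_{i=1}^{p} a_{1i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_t \tag{9}
$$
$$
D(\ln L_t) = a_0 + \sum_{i=1}^{p} a_{1i} D(\ln Y_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln T_{t-i}) + \varepsilon_t \tag{10}
$$
$$
D(\ln T_t) = a_0 + \sum_{i=1}^{p} a_{1i} D(\ln T_{t-i}) + \sum_{i=1}^{q} a_{2i} D(\ln K_{t-i}) + \sum_{i=1}^{q} a_{3i} D(\ln L_{t-i}) + \sum_{i=1}^{q} a_{4i} D(\ln F_{t-i}) + \sum_{i=1}^{q} a_{5i} D(\ln Y_{t-i}) + \varepsilon_t \tag{11}
$$
where a1i, a2i, a3i, a4i and a5i are the short-run dynamic coefficients of the model’s
convergence to equilibrium and λ, the coefficient on the lagged error-correction term
ECTt-1 in equation (7), is the speed of adjustment.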
Equations (7)-(11) are estimated separately by OLS. The short-run dynamic coefficients
associated with the long-run relationship, obtained from equation (7), are given in Table 5.
Beginning with the long run, the coefficient on the lagged error-correction term is
significant at the 1% level and has the expected negative sign, which confirms the result of
the bounds test for cointegration. Its estimated value is -0.69, which implies that the speed
of adjustment to equilibrium after a shock is high: approximately 69% of the disequilibrium
caused by the previous year’s shock is corrected in the current year. In the long run, real
GDP per capita, labour, capital and trade Granger-cause FDI; that is, causality runs through
the error-correction term from these variables to FDI. In the short run, only capital
investment is significant at the 5% level and has a sizeable positive impact on FDI. Economic
growth and trade have negative but insignificant impacts, and the impact of labour is positive
but insignificant.
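The adjustment speed can be turned into a simple worked calculation. The sketch below takes the estimated error-correction coefficient of -0.69 from Table 5 and assumes, as a simplification, that a deviation from equilibrium decays geometrically and that short-run dynamics are ignored; the function names are illustrative.

```python
import math

def remaining_disequilibrium(ect_coefficient, years):
    """Share of an initial deviation still present after the given number of years."""
    return (1.0 + ect_coefficient) ** years

def half_life(ect_coefficient):
    """Years needed for half of a deviation to be corrected (geometric decay assumed)."""
    return math.log(0.5) / math.log(1.0 + ect_coefficient)

ECT = -0.69   # coefficient on ECT(-1) reported in Table 5
for k in (1, 2, 3):
    print(k, round(remaining_disequilibrium(ECT, k), 4))
print(round(half_life(ECT), 2))
```

Under this assumption about 31% of a deviation remains after one year and less than 10% after two, which is what "high speed of adjustment" means in practice.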
The regression for the underlying ARDL equation (7) fits reasonably well, and the model is
globally significant at the 1% level. It also passes the diagnostic tests for serial
correlation (Durbin-Watson and Breusch-Godfrey tests), heteroscedasticity (White test) and
normality of errors (Jarque-Bera test). The Ramsey RESET test also suggests that the model is
well specified. The results of these tests are shown in Table 6.
The stability of the long-run coefficients is assessed together with the short-run dynamics.
Once the ECM given by equation (7) has been estimated, the cumulative sum of recursive
residuals (CUSUM) and the CUSUM of squares (CUSUMSQ) tests are applied to assess parameter
stability (Pesaran and Pesaran, 1997). Graphs 1 and 2 plot the results of the CUSUM and
CUSUMSQ tests. The results indicate no instability of the coefficients, because the plots of
the CUSUM and CUSUMSQ statistics stay inside the critical bands at the 5% significance level.
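The CUSUM check can be sketched as follows. The code takes a list of recursive residuals as given (computing them requires re-estimating the regression on expanding samples, which is omitted here), scales their running sum by an estimate of the residual standard deviation, and compares it with the usual pair of straight 5% significance lines. The constant 0.948, the scaling by the root mean square of the recursive residuals and the function names are assumptions of this sketch; statistical packages may differ in detail.

```python
import math

def cusum_path(recursive_residuals, a=0.948):
    """Return (cusum, lower_bound, upper_bound) for each recursive residual."""
    m = len(recursive_residuals)                  # T - k usable observations
    sigma = math.sqrt(sum(w * w for w in recursive_residuals) / m)
    path, total = [], 0.0
    for j, w in enumerate(recursive_residuals, start=1):   # j = t - k
        total += w / sigma
        bound = a * (math.sqrt(m) + 2.0 * j / math.sqrt(m))
        path.append((total, -bound, bound))
    return path

def is_stable(path):
    """True if the CUSUM never leaves the significance band."""
    return all(lower <= value <= upper for value, lower, upper in path)

# Hypothetical residuals: one pattern with no drift, one with a persistent drift.
print(is_stable(cusum_path([1.0, -1.0] * 10)))
print(is_stable(cusum_path([1.0] * 20)))
```

A path that drifts steadily in one direction eventually crosses a bound, which is the signature of a shift in the regression coefficients.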
The Chow breakpoint and Chow forecast tests are used to check for a significant structural
break in the data in 1995 and over the post-Barcelona period of 1995-2008. The pre-Barcelona
period is 1970-1995. We choose 1995 as the breakpoint because in July 1995 Tunisia became
the first of the South Mediterranean countries engaged in the Barcelona Process to sign an
association agreement with the EU. The F-statistics and the log likelihood ratios do not
indicate any structural break (Table 7).
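For reference, the F-statistic behind the Chow breakpoint test compares the residual sum of squares of the regression on the full sample with the sum of those from the two sub-samples. A minimal sketch, with hypothetical residual sums of squares and sample sizes that are not taken from the paper:

```python
def chow_breakpoint_f(rss_pooled, rss_before, rss_after, n_obs, n_params):
    """Chow breakpoint F-statistic for a single break date."""
    rss_split = rss_before + rss_after
    numerator = (rss_pooled - rss_split) / n_params
    denominator = rss_split / (n_obs - 2 * n_params)
    return numerator / denominator

# Hypothetical values, for illustration only.
f_stat = chow_breakpoint_f(rss_pooled=130.0, rss_before=50.0, rss_after=50.0,
                           n_obs=38, n_params=6)
print(f_stat)
```

A small statistic, as reported in Table 7, means that splitting the sample at the break date does not reduce the residual sum of squares by much, so the hypothesis of stable coefficients is not rejected.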
Table 5. Results of equation (7), ARDL (1, 0, 0, 0, 0) selected based on AIC
Variable       Coefficient   t-statistic   Probability
C              -0.05         -0.48         0.63
D(Ln(Y))       -0.14         -0.05         0.95
D(Ln(T))       -1.47         -1.31         0.19
D(Ln(L))       0.92          0.41          0.68
D(Ln(K))       2.60          2.32          0.02
ECT(-1)        -0.69         -4.09         0.0003
R-squared      0.43
F-statistic    4.98                        0.001
DW-statistic   1.98
Table 6. Results of diagnostic tests
Test                                          χ² statistic   Probability
Breusch-Godfrey Serial Correlation test       0.04           0.82
White Heteroskedasticity test                 7.86           0.64
Jarque-Bera test                              1.06           0.58
Ramsey RESET Test (log likelihood ratio)      15.49          0.11
Table 7. Statistical output for stability tests
Test                    Forecast period /   F-statistic   Probability of   Log likelihood   Probability of log
                        Breakpoint                        F-statistic      ratio            likelihood ratio
Chow Forecast Test      1995-2008           0.87          0.59             19.68            0.14
Chow Breakpoint Test    1995                0.76          0.60             6.20             0.40
Graph 1. Plot of CUSUM Test for equation (7)
[Figure: CUSUM statistic plotted with its 5% significance bands, 1980-2005.]
Graph 2. Plot of CUSUMSQ Test for equation (7)
[Figure: CUSUM of squares statistic plotted with its 5% significance bands, 1980-2005.]
The results of the short-run Granger causality tests are shown in Table 8. In the short run,
the F-statistics on the explanatory variables suggest that, at the 10% level or better, there
is bidirectional Granger causality between capital investment and economic growth and between
capital investment and trade, and unidirectional Granger causality running from capital
investment to FDI and from FDI to trade. There is no Granger causality from trade to FDI.
Hisarciklilar et al. (2006) found no Granger causality between FDI and trade in either
direction for Tunisia. The Granger causality test results for the relationship between FDI
and real GDP per capita are interesting: they indicate no significant Granger causality from
FDI to GDP or from GDP to FDI, which is consistent with Hisarciklilar et al. (2006). Turning
to real GDP per capita and trade openness, there is also no significant Granger causality in
either direction, whereas Hisarciklilar et al. (2006) found that causality runs from economic
growth to trade. Our results support the idea that FDI will only be growth enhancing if it
affects technology permanently and positively.
We can conclude that it is domestic investment that promotes trade, FDI and economic growth
in the short run in Tunisia. Domestic investment is the main catalyst of economic growth in
Tunisia.
Table 8. Results of short-run Granger causality (F-statistics)
Dependent variable   D(Ln(Y))   D(Ln(T))   D(Ln(L))   D(Ln(K))   D(Ln(F))   Direction of causality
D(Ln(Y))             -          0.36       1.05       3.89*      1.88       K → Y
D(Ln(T))             0.40       -          0.98       4.76*      2.81**     K → T; F → T
D(Ln(L))             0.39       0.67       -          0.009      0.92       -
D(Ln(K))             4.99*      5.19*      0.19       -          0.90       Y → K; T → K
D(Ln(F))             0.002      1.72       0.16       5.39*      -          K → F
(*) and (**) denote statistical significance at the 5% and 10% levels respectively.
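The "Direction of causality" column follows mechanically from the significance markers in Table 8. The short sketch below transcribes the table and derives the causal links and the bidirectional pairs; the data structure and function name are illustrative, and arrows are written as `->`.

```python
# F-statistics with their significance markers, transcribed from Table 8.
# Outer key: dependent variable; inner key: explanatory variable.
table8 = {
    "Y": {"T": "0.36", "L": "1.05", "K": "3.89*", "F": "1.88"},
    "T": {"Y": "0.40", "L": "0.98", "K": "4.76*", "F": "2.81**"},
    "L": {"Y": "0.39", "T": "0.67", "K": "0.009", "F": "0.92"},
    "K": {"Y": "4.99*", "T": "5.19*", "L": "0.19", "F": "0.90"},
    "F": {"Y": "0.002", "T": "1.72", "L": "0.16", "K": "5.39*"},
}

def causal_links(table):
    """List 'cause->effect' for every cell flagged as significant (10% or better)."""
    links = []
    for dependent, row in table.items():
        for cause, cell in row.items():
            if cell.endswith("*"):
                links.append(f"{cause}->{dependent}")
    return links

links = causal_links(table8)
pairs = (link.split("->") for link in links)
bidirectional = sorted({tuple(sorted((a, b))) for a, b in pairs if f"{b}->{a}" in links})
print(links)
print(bidirectional)
```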
6. Conclusion
The paper examines the dynamic causal relationships among economic growth, foreign direct
investment, trade, labour and capital investment in Tunisia over the period 1970-2008. It
applies the ARDL approach to cointegration to investigate the existence of a long-run
relation among these series, and Granger causality tests within a VECM to determine the
direction of causality between the variables. The topic is important because of the possible
interrelations among the series and their implications for economic growth. The results show
that there is cointegration among the variables specified in the model when FDI is the
dependent variable. Trade openness and economic growth promote foreign direct investment in
Tunisia in the long run. The results indicate that there is no significant Granger causality
from FDI to economic growth or from economic growth to FDI in the short run. Likewise, there
is no significant Granger causality from trade to economic growth or from economic growth to
trade in the short run.
Domestic capital investment is the catalyst of economic growth in Tunisia. This finding has
important implications for policy makers in Tunisia. The results suggest that, for FDI to
bring the anticipated positive impacts on economic growth, the Tunisian government will need
to undertake serious reforms with clear objectives and strong commitments.
References
Adamopoulos, A., Dritsaki, Ch., and Dritsaki, M. 2005. “A Causal Relationship between
Trade, Foreign Direct Investment, and Economic Growth for Greece.” American Journal of
Applied Sciences 1: 230-235.
Alalaya M.M. 2008. “ARDL Models Applied for Jordan Trade, FDI and GDP Series (1990-
2008)”. European Journal of Social Sciences – Volume 13, Number 4, 605-616.
Alia, A.A. and Ucal, M.S. 2003. “Foreign direct investment, exports and output growth of
Turkey: Causality Analysis”, Paper presented at the European Trade Study Group (ETSG)
fifth annual conference, Madrid, 11-13, Sept.
Alguacil, M.T., Cuadros A. and Orts, V. 2000. “Openness and Growth: Re-Examining
Foreign Direct Investment, Trade, and Output Linkages in Latin America.” University Jaume
I of Castellon, Spain.
Athukorala, P.P.A.W. 2003. “The Impact of Foreign Direct Investment for Economic Growth:
A Case Study in Sri Lanka.” International Conference on Sri Lanka Studies,
http://www.freewebs.com/slageconf/9thics/spprslfulp092.pdf
Balassa, B. 1985. “Exports, Policy Choices and Economic Growth in Developing Countries
after the 1973 Oil Shock.” Journal of Development Economics 18 (1): 23-35.
Balasubramanyam, V.N., Salisu M.A. and Sapsford, D. 1996. “Foreign direct investment and
growth in EP and IS countries”. The Economic Journal, 106: 92-105.
Baliamoune-Lutz, M. 2004. “Does FDI Contribute to Economic Growth? Knowledge about
the Effects FDI Improves Negotiating Positions and Reduce Risk for Firms Investing in
Developing Countries”. Business Economics April: 49-55.
Blomstrom, M., Lipsey R. and Zejan M. 1992. “What explains Developing Country
Growth?”, NBER Working Paper Series, No. 4132.
Borensztein, E., Gregorio, J.D. and Lee, J.W. 1998. “How does foreign direct investment
affect economic growth?” Journal of International Economics, 45: 115-35.
Boughzala, M., 2010. “The Tunisia-European Union Free Trade Area Fourteen Years”.
http://www.iemed.org/anuari/2010/aarticles/Boughazala_Tunisia_EU_en.pdf
Boyd, J.H. and Smith, B.D. 1992. “Intermediation and the Equilibrium Allocation of
Investment Capital: Implications for Economics Development”, Journal of Monetary
Economics, Vol. 30, pp. 409-32.
Carkovic, M. and Levine, R. 2002. “Does Foreign Direct Investment Accelerate Economic
Growth?”, in Does Foreign Direct Investment Promote Development? Moran T.H., Graham
E.M. and Blomstrom M. (eds.), Institute for International Economics.
Casselli, F., Esquivel, G. and Lefort, F. 1996. “Reopening the convergence debate: A new
look at cross-country growth empirics”. Journal of Economic Growth 1(3).
Darrat A.F., Kherfi S. and Soliman M. 2005. “FDI and Economic Growth in CEE and
MENA Countries: A Tale of Two Regions”. 12th Economic Research Forum’s Annual
Conference, Cairo, Egypt.
De Mello, L.R., Jr., 1999. “Foreign direct investment-led growth evidence from time series
and panel data”. Oxford Economic Papers, 51: 133-151.
Dritsaki, M., C. Dritsaki and A. Adamopoulos, 2004. “A Causal Relationship between Trade,
Foreign Direct Investment and Economic Growth for Greece”. American Journal of Applied
Science, 1: 230-235.
Elliott, G., T.J. Rothenberg and J.H. Stock, 1996. “Efficient tests for an autoregressive unit
root”. Econometrica, 64: 813-36.
Engle, R.F. and Granger, C.W.J. 1987. “Co-integration and Error Correction: Representation,
Estimation and Testing”, Econometrica 55, 251-76.
Frankel, J.A. and D. Romer, 1999. “Does trade cause growth?” American Economic Review,
89: 379-99.
Ghali S., Rezgui S., 2007. “FDI Contribution to Technical Efficiency in The Tunisian
Manufacturing Sector: A combined empirical approach”. 14th Economic Research Forum’s
Annual Conference, Cairo, Egypt.
Ghirmay, T., Grabowski, R., and Sharma, S. 2001. “Exports, Investment, Efficiency, and
Economic Growth in LDCs an empirical investigation.” Applied Economics 33 (6),
Department of Economics, Southern Illinois University, Carbondale, IL.
Harris, R. and Sollis, R. 2003. “Applied Time Series Modelling and Forecasting”. Wiley,
West Sussex.
Hisarciklilar, M., Kayam, S.S. Kayalica, M.O. and Ozkale. N.L. 2006. “Foreign direct
investment and growth in Mediterranean countries”.
Johansen, S. 1988. “Statistical Analysis of Cointegration Vectors”, Journal of Economic
Dynamics and Control, 12: 231-54.
Johansen, S. and Juselius, K. 1990. “Maximum likelihood estimation and inference on
cointegration-with application to the demand for money”. Oxford Bulletin of Economics and
Statistics, 52: 169-210.
Karbasi, A., Mahamadi E. and Ghofrani, S. 2005. “Impact of foreign direct investment on
economic growth”. 12th Economic Research Forum’s Annual Conference, Cairo, Egypt.
Levin, A., Lin, C.F. and Chu, C. 2002. “Unit root tests in panel data: Asymptotic and finite-
sample properties”. Journal of Econometrics, 108: 1–24.
Lipsey, R.E., 2000. “Inward FDI and economic growth in developing countries”.
Transnational Corporations, 9: 61-95.
Mansouri, B., 2005. “The interactive impact of FDI and trade openness on economic growth:
Evidence from Morocco”. 12th Economic Research Forum’s Annual Conference, Cairo,
Egypt.
Morley, B. 2006. “Causality Between Economic Growth and Migration: An ARDL Bounds
Testing Approach”, Economics Letters 90, 72-76.
Nair-Reichert U. and Weinhold D. 2001. “Causality Tests for Cross-Country Panels: A New
Look at FDI and Economic Growth in Developing Countries”, Oxford Bulletin of Economic
and Statistics, Vol. 63, pp. 153-171.
Narayan, P.K. and Smyth, R. 2006. “Higher Education, Real Income and Real Investment in
China: Evidence From Granger Causality Tests”, Education Economics 14, 107-125.
Narayan, P. K., Narayan, S., Prasad, B. C., Prasad, A. 2007. "Export-led growth hypothesis:
evidence from Papua New Guinea and Fiji", Journal of Economic Studies, 34: (4), 341 -351.
Narayan, P.K. and Smyth, R. 2008. “Energy Consumption and Real GDP in G7 Countries:
New Evidence From Panel Cointegration With Structural Breaks”, Energy Economics 30,
2331-2341.
Odhiambo, N.M. 2009. “Energy Consumption and Economic Growth in Tanzania: An
ARDL Bounds Testing Approach”, Energy Policy, 37: (2).
Pahlavani, M., Wilson, E., and Worthington, A.C. 2005. “Trade-GDP nexus in Iran: An
application of autoregressive distributed lag (ARDL) model”. American Journal of Applied
Science, 2: 1158-1165.
Pesaran, M. and Shin, Y. (1999), “An Autoregressive Distributed Lag Modeling Approach to
Cointegration Analysis” in S. Strom, (ed) Econometrics and Economic Theory in the 20th
Century: The Ragnar Frisch centennial Symposium, Cambridge University Press, Cambridge.
Pesaran, M.H., Shin, Y. and Smith, R.J. 2001. “Bounds testing approaches to the analysis of
level relationships.” Journal of Applied Econometrics 16: 289-326.
Phillips, P.C.B. and Perron, P. 1988. “Testing for a Unit root in Time Series Regression”,
Biometrika 75: 335-346.
Rahman, M. 2007. “Contributions of Exports, FDI and Expatriates’ Remittances to Real GDP
Of Bangladesh, India, Pakistan and Sri Lanka”. Southwestern Economic Review, 141-154.
Sengupta, J.K. and Espana, J.R. 1996. “Exports and economic growth in Asian NICs: An
Econometric analysis for Korea”. Applied Economics, 26.
Xu, B., 2000. “Multinational enterprises, technology diffusion and host country productivity
growth”. Journal of Development Economics, 62: 477-93.
Yao, S. 2006. “On Economic Growth, FDI, and Exports in China”. Applied Economics 38
(3): 339-351.
ardl: Estimating autoregressive distributed lag and equilibrium correction models
Sebastian Kripfganz (1), Daniel C. Schneider (2)
(1) University of Exeter Business School, Department of Economics, Exeter, UK
(2) Max Planck Institute for Demographic Research, Rostock, Germany
London Stata Conference, September 7, 2018
ssc install ardl
net install ardl, from(http://www.kripfganz.de/stata/)
ARDL: autoregressive distributed lag model
The autoregressive distributed lag (ARDL) model, also commonly abbreviated as ADL, has been
used for decades to model the relationship between (economic) variables in a single-equation
time series setup.
Its popularity also stems from the fact that cointegration of
nonstationary variables is equivalent to an error correction
(EC) process, and the ARDL model has a reparameterization
in EC form (Engle and Granger, 1987; Hassler and Wolters, 2006).
The existence of a long-run / cointegrating relationship can
be tested based on the EC representation. A bounds testing
procedure is available to draw conclusive inference without
knowing whether the variables are integrated of order zero or
one, I(0) or I(1), respectively (Pesaran, Shin, and Smith, 2001).
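For intuition, the reparameterization can be written out in the simplest case. The following is a sketch for an ARDL(1,1) model with a single regressor and no trend; the symbols φ, β0, β1 and θ are illustrative notation chosen for this example, and the general case adds further lagged differences.

```latex
y_t = c_0 + \phi\, y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t
\quad\Longleftrightarrow\quad
\Delta y_t = c_0 - (1-\phi)\,\bigl(y_{t-1} - \theta\, x_{t-1}\bigr) + \beta_0\, \Delta x_t + u_t,
\qquad
\theta = \frac{\beta_0 + \beta_1}{1-\phi}.
```

Expanding the right-hand form and collecting terms in y_{t-1}, x_t and x_{t-1} recovers the ARDL equation, so the two are the same model: 1 - φ plays the role of the speed of adjustment and θ is the long-run coefficient.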
Analyzing long-run relationships
The ARDL / EC model is useful for forecasting and to disentangle long-run relationships from
short-run dynamics.
Long-run relationship: Some time series are bound together
due to equilibrium forces even though the individual time
series might move considerably.
[Figure: time series of log consumption, log income and log investment, 1960-1980.]
Data: National accounts, West Germany, seasonally adjusted, quarterly, billion DM, Lütkepohl (1993, Table E.1).
ARDL model
ARDL(p, q, . . . , q) model:
$$
y_t = c_0 + c_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=0}^{q} \beta_i' x_{t-i} + u_t ,
$$
p ≥ 1, q ≥ 0, for simplicity assuming that the lag order q is the same for all variables in
the K × 1 vector x_t.
ardl depvar [indepvars ] [if ] [in ] [, options ]
ardl options for the lag order selection:
Fixed lag order for some or all variables: lags(numlist )
Optimally with the Akaike information criterion: aic
Optimally with the Bayesian information criterion:2 bic
Maximum lag order for selection criteria: maxlags(numlist )
Store information criteria in a matrix: matcrit(name )
Default: lags(.) bic maxlags(4)
2
The BIC is also known as the Schwarz or Schwarz-Bayesian information criterion.
Reproducible example: ARDL lag specification

. webuse lutkepohl2
(Quarterly SA West German macro data, Bil DM, from Lutkepohl 1993 Table E.1)
. ardl ln_consump ln_inc ln_inv, lags(. . 0) aic maxlags(. 2 .) matcrit(lagcombs)
ARDL(4,1,0) regression
Sample: 1961q1 - 1982q4 Number of obs = 88

F( 7, 80) = 49993.34
Prob > F = 0.0000
R-squared = 0.9998
Adj R-squared = 0.9998
Log likelihood = 304.37474 Root MSE = 0.0080
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .4568483 .1064085 4.29 0.000 .2450887 .6686079
L2. | .3250994 .1127767 2.88 0.005 .1006666 .5495322
L3. | .1048324 .1092992 0.96 0.340 -.11268 .3223449
L4. | -.1632413 .0853844 -1.91 0.059 -.3331616 .0066791
|
ln_inc |
--. | .4629184 .078421 5.90 0.000 .3068557 .6189812
L1. | -.202756 .0965775 -2.10 0.039 -.3949513 -.0105607
|
ln_inv | .0080284 .0118391 0.68 0.500 -.0155322 .0315889
_cons | .0373585 .0143755 2.60 0.011 .0087504 .0659667
------------------------------------------------------------------------------
Example (continued): Information criteria
. matrix list lagcombs
lagcombs[12,4]
ln_consump ln_inc ln_inv aic
r1 1 0 0 -585.22447
r2 1 1 0 -585.39189
r3 1 2 0 -583.88179
r4 2 0 0 -590.66282
r5 2 1 0 -592.6904
r6 2 2 0 -591.62792
r7 3 0 0 -588.69069
r8 3 1 0 -590.83183
r9 3 2 0 -589.67101
r10 4 0 0 -590.03466
r11 4 1 0 -592.73282
r12 4 2 0 -592.15636
. estat ic
Akaike’s information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
Model | Obs ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
. | 88 -64.51057 304.3747 8 -592.7495 -572.9308
-----------------------------------------------------------------------------
Note: N=Obs used in calculating BIC; see [R] BIC note.
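The numbers in the two listings above can be cross-checked with a few lines of Python. This sketch recomputes the information criteria reported by `estat ic` from the log likelihood, the number of estimated parameters (8) and the number of observations (88), and then picks the lag combination with the smallest AIC from the `lagcombs` matrix; the function names are illustrative, and the values are transcribed from the output.

```python
import math

def aic(log_likelihood, n_params):
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    return -2.0 * log_likelihood + math.log(n_obs) * n_params

print(round(aic(304.37474, 8), 4))
print(round(bic(304.37474, 8, 88), 4))

# Rows of the lagcombs matrix: (lags of ln_consump, ln_inc, ln_inv), AIC.
lagcombs = [
    ((1, 0, 0), -585.22447), ((1, 1, 0), -585.39189), ((1, 2, 0), -583.88179),
    ((2, 0, 0), -590.66282), ((2, 1, 0), -592.6904),  ((2, 2, 0), -591.62792),
    ((3, 0, 0), -588.69069), ((3, 1, 0), -590.83183), ((3, 2, 0), -589.67101),
    ((4, 0, 0), -590.03466), ((4, 1, 0), -592.73282), ((4, 2, 0), -592.15636),
]
best_lags, best_aic = min(lagcombs, key=lambda row: row[1])
print(best_lags, best_aic)
```

The minimizing row is the ARDL(4,1,0) model shown in the regression output.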
Example (continued): Fast automatic lag selection

. timer on 1
. ardl ln_consump ln_inc ln_inv, aic dots noheader
Optimal lag selection, % complete:

----+---20%---+---40%---+---60%---+---80%---+-100%
..................................................
AIC optimized over 100 lag combinations
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 1
. timer list 1
1: 0.01 / 1 = 0.0150
Example (continued): Slow automatic lag selection

. timer on 2
. ardl ln_consump ln_inc ln_inv, aic dots noheader nofast

----+---20%---+---40%---+---60%---+---80%---+-100%
..................................................
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 2
. timer list 2
2: 0.75 / 1 = 0.7520
Example (continued): Sample depends on lag selection

. ardl ln_consump ln_inc ln_inv, aic maxlags(8 8 4)

F( 8, 75) = 56976.90
Prob > F = 0.0000
R-squared = 0.9998
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .30383 .0942165 3.22 0.002 .1161411 .491519
L2. | .3195318 .0776321 4.12 0.000 .1648808 .4741828
|
ln_inc | .3767587 .0389267 9.68 0.000 .2992128 .4543046
|
ln_inv |
--. | .0581759 .0170736 3.41 0.001 .0241635 .0921884
L1. | -.0185484 .0214624 -0.86 0.390 -.0613036 .0242068
L2. | .01012 .021505 0.47 0.639 -.0327202 .0529602
L3. | -.0146641 .0213098 -0.69 0.493 -.0571154 .0277872
L4. | -.0488136 .0174121 -2.80 0.006 -.0835003 -.0141269
|
_cons | .0416317 .0107782 3.86 0.000 .0201603 .063103
------------------------------------------------------------------------------
ARDL model: Optimal lag selection

The optimal model is the one with the smallest value (most
negative value) of the AIC or BIC. The BIC tends to select
more parsimonious models.
The information criteria are only comparable when the sample
is held constant. This can lead to different estimates even
with the same lag orders if the maximum lag order is varied.
ardl uses a fast Mata-based algorithm to obtain the optimal
lag order. This comes at the cost of minor numerical
differences in the values of the criteria compared to estat ic
but the ranking of the models is unaffected. The option
nofast avoids this problem but it uses a substantially slower
algorithm based on Stata’s regress command.
For very large models, it might be necessary to increase the
admissible maximum number of lag combinations with the
option maxcombs(#).
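The size of the search grid can be worked out in advance, which helps in judging whether maxcombs() is needed. The Python sketch below is an illustration only, not the package's implementation. It assumes that the dependent variable enters with at least one lag while each regressor may enter with zero lags, and that the default maximum is 4 lags per variable. Both assumptions are inferred from the example output (12 rows in lagcombs, 100 combinations in the automatic search).

```python
from itertools import product

def lag_grid(maxlags, fixed=None):
    """Enumerate candidate lag orders.

    maxlags: maximum lag per variable, dependent variable first.
    fixed:   optional list of the same length; None leaves the lag order
             free, an integer pins it (like lags(. . 0) in the example).
    Assumption: the dependent variable needs at least one lag, whereas
    regressors may enter with zero lags.
    """
    fixed = fixed or [None] * len(maxlags)
    ranges = []
    for position, (upper, pinned) in enumerate(zip(maxlags, fixed)):
        if pinned is not None:
            ranges.append([pinned])
        else:
            lowest = 1 if position == 0 else 0
            ranges.append(list(range(lowest, upper + 1)))
    return list(product(*ranges))

# Three variables with up to 4 lags each (assumed default).
default_grid = lag_grid([4, 4, 4])
# lags(. . 0) with maxlags(. 2 .): ln_inv pinned at 0, ln_inc capped at 2.
restricted_grid = lag_grid([4, 2, 4], fixed=[None, None, 0])
print(len(default_grid), len(restricted_grid))
```

The grid grows multiplicatively with the number of variables, which is why large models can exceed the admissible number of combinations.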
EC representation
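Reparameterization in conditional EC form (ardl option ec):

$$\Delta y_t = c_0 + c_1 t - \alpha (y_{t-1} - \theta x_t) + \sum_{i=1}^{p-1} \psi_{yi} \Delta y_{t-i} + \sum_{i=0}^{q-1} \psi_{xi}' \Delta x_{t-i} + u_t,$$

with the speed-of-adjustment coefficient $\alpha = 1 - \sum_{j=1}^{p} \phi_j$ and
the long-run coefficients $\theta = \frac{\sum_{j=0}^{q} \beta_j}{\alpha}$.

Alternative EC parameterization (ardl option ec1):

$$\Delta y_t = c_0 + c_1 t - \alpha (y_{t-1} - \theta x_{t-1}) + \sum_{i=1}^{p-1} \psi_{yi} \Delta y_{t-i} + \omega' \Delta x_t + \sum_{i=1}^{q-1} \psi_{xi}' \Delta x_{t-i} + u_t.$$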
Example (continued): EC representation
. ardl ln_consump ln_inc ln_inv, aic ec noheader
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
ln_inv | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inv |
D1. | .080464 .0187106 4.30 0.000 .0432214 .1177066
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
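The EC coefficients are a pure reparameterization of the level coefficients, so they can be recovered by hand. The following Python sketch (function and variable names are illustrative) maps the ARDL(2,0,4) level estimates selected by the AIC earlier into the speed-of-adjustment, long-run, and short-run coefficients of the ec form. It uses the identity that each short-run coefficient equals minus the sum of the higher-order level coefficients. Standard errors are not reproduced here.

```python
def ardl_to_ec(phi, beta):
    """Map ARDL level coefficients to the `ec` parameterization.

    phi:  [phi_1, ..., phi_p], coefficients on lagged y.
    beta: [beta_0, ..., beta_q], coefficients on x_t, ..., x_{t-q}.
    Returns (alpha, theta, psi_y, psi_x).
    """
    alpha = 1.0 - sum(phi)
    theta = sum(beta) / alpha
    # psi_{y,i} = -(phi_{i+1} + ... + phi_p) for i = 1, ..., p-1
    psi_y = [-sum(phi[i:]) for i in range(1, len(phi))]
    # psi_{x,i} = -(beta_{i+1} + ... + beta_q) for i = 0, ..., q-1
    psi_x = [-sum(beta[i + 1:]) for i in range(len(beta) - 1)]
    return alpha, theta, psi_y, psi_x

# Level estimates from the AIC-selected ARDL(2,0,4) output.
phi = [0.3068554, 0.325385]
beta_inc = [0.3682844]
beta_inv = [0.0656722, -0.0375288, 0.0228142, -0.0129321, -0.0528173]

alpha, theta_inc, psi_y, psi_inc = ardl_to_ec(phi, beta_inc)
_, theta_inv, _, psi_inv = ardl_to_ec(phi, beta_inv)

print("ADJ:", round(-alpha, 7))
print("LR :", round(theta_inc, 6), round(theta_inv, 7))
print("SR :", [round(v, 7) for v in psi_y], [round(v, 7) for v in psi_inv])
```

The results match the ADJ, LR, and SR sections of the ec output up to rounding. In the ec1 form the level term uses x_{t-1}, so the coefficient on the contemporaneous difference becomes beta_0 instead.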
Example (continued): Alternative EC representation

. ardl ln_consump ln_inc ln_inv, aic ec1 noheader
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc |
L1. | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
|
ln_inv |
L1. | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inc |
D1. | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
D1. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
Example (continued): Attaching exogenous variables
. ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) aic ec noheader
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3788728 .0420886 -9.00 0.000 -.4626481 -.2950975
-------------+----------------------------------------------------------------
LR |
ln_inc | .9669152 .0039557 244.44 0.000 .9590416 .9747889
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.346926 .0806726 -4.30 0.000 -.5075007 -.1863512
L2D. | -.1074193 .0790118 -1.36 0.178 -.2646883 .0498497
|
ln_inv |
D1. | .0758713 .0176989 4.29 0.000 .0406425 .1111002
LD. | .0422224 .0191523 2.20 0.030 .0041008 .080344
L2D. | .0678568 .0185208 3.66 0.000 .030992 .1047216
L3D. | .0485441 .0179609 2.70 0.008 .0127938 .0842944
|
_cons | .0504873 .0114518 4.41 0.000 .027693 .0732816
------------------------------------------------------------------------------
EC representation: Interpretation
The long-run coefficients θ are reported in the output section
LR. They represent the equilibrium effects of the independent
variables on the dependent variable. In the presence of
cointegration, they correspond to the negative cointegration
coefficients after normalizing the coefficient of the dependent
variable to unity. The latter is not explicitly displayed.
The negative speed-of-adjustment coefficient −α is reported
in the output section ADJ. It measures how strongly the
dependent variable reacts to a deviation from the equilibrium
relationship in one period or, in other words, how quickly such
an equilibrium distortion is corrected.
The short-run coefficients ψ_yi, ψ_xi (and ω) are reported in the
output section SR. They account for short-run fluctuations not
due to deviations from the long-run equilibrium.
EC representation: Integration order
The independent variables are allowed to be individually I(0)
or I(1).
The independent variables must be long-run forcing (weakly
exogenous) for the dependent variable, i.e. there can be at
most one cointegrating relationship involving the dependent
variable. (There might be further cointegrating relationships
among the independent variables themselves.)
By default, each independent variable is included in the
long-run relationship. I(0) variables that should affect only the
short-run dynamics can be specified with the option
exog(varlist). No automatic lag selection or
first-difference transformation is performed for the latter.
Testing the existence of a long-run relationship
Pesaran, Shin, and Smith (2001) bounds test:
1 Use the F-statistic to test the joint null hypothesis
$H_0^F: (\alpha = 0) \cap \left(\sum_{j=0}^{q} \beta_j = 0\right)$ versus the alternative
hypothesis $H_1^F: (\alpha \neq 0) \cup \left(\sum_{j=0}^{q} \beta_j \neq 0\right)$.³
2 If $H_0^F$ is rejected, use the t-statistic to test the single
hypothesis $H_0^t: \alpha = 0$ versus $H_1^t: \alpha \neq 0$.
3 If $H_0^t$ is rejected, use conventional z-tests (or Wald tests) to
test whether the elements of θ are individually (or jointly)
statistically significantly different from zero.
There is statistical evidence for the existence of a long-run /
cointegrating relationship if the null hypothesis is rejected in
all three steps.
³ The test is not directly performed on the long-run coefficients $\theta = \sum_{j=0}^{q} \beta_j / \alpha$.
The distributions of the test statistics in steps 1 and 2 are
nonstandard and depend on the integration order of the
independent variables.
Kripfganz and Schneider (2018) use response surface
regressions to obtain finite-sample and asymptotic critical
values, as well as approximate p-values, for the lower and
upper bound of all independent variables being purely I(0) or
purely I(1) (and not mutually cointegrated), respectively.
These critical values supersede the near-asymptotic critical
values provided by Pesaran, Shin, and Smith (2001) and the
finite-sample critical values by Narayan (2005), among others.
The critical values depend on the number of independent
variables, their integration order, the number of short-run
coefficients,⁴ and the inclusion of an intercept or time trend.
ardl options for the deterministic model components:
1 No intercept, no trend: noconstant
2 Restricted intercept, no trend: restricted
3 Unrestricted intercept, no trend: the default
4 Unrestricted intercept, restricted trend: trend(varname) and restricted
5 Unrestricted intercept, unrestricted trend: trend(varname)
⁴ The number of short-run coefficients affects only the finite-sample, not the asymptotic, critical values
(Cheung and Lai, 1995; Kripfganz and Schneider, 2018). The elements of ω in the ec1 parameterization for
variables that have 0 lags in the ARDL model do not count towards this number.
Test decisions:
Do not reject $H_0^F$ or $H_0^t$, respectively, if the test statistic is
closer to zero than the lower bound of the critical values.
Reject $H_0^F$ or $H_0^t$, respectively, if the test statistic is more
extreme than the upper bound of the critical values.
The test is inconclusive if the test statistic falls between the
two bounds.
The first two steps of the bounds test are implemented in the
ardl postestimation command estat ectest.
By default, finite-sample critical values for the 1%, 5%, and
10% significance levels are provided. Asymptotic critical values
are displayed with option asymptotic. Alternative significance
levels can be specified with option siglevels(numlist).
The test statistics in step 3 have the usual asymptotic
standard normal (or χ²) distributions irrespective of the
integration order of the independent variables.⁵
⁵ The OLS estimator for the long-run coefficients θ of I(1) independent variables is “super-consistent” with
convergence rate T instead of √T (Pesaran and Shin, 1998; Hassler and Wolters, 2006).
Example (continued): Bounds test
. estat ectest
Pesaran, Shin, and Smith (2001) bounds test
H0: no level relationship F = 40.952

Case 3 t = -9.002
Finite sample (1 variables, 88 observations, 6 short-run coefficients)
Kripfganz and Schneider (2018) critical values and approximate p-values
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.032 4.831 | 4.958 5.843 | 7.070 8.119 | 0.000 0.000
t | -2.550 -2.899 | -2.861 -3.225 | -3.470 -3.854 | 0.000 0.000
do not reject H0 if
both F and t are closer to zero than critical values for I(0) variables
(if p-values > desired level for I(0) variables)
reject H0 if
both F and t are more extreme than critical values for I(1) variables
(if p-values < desired level for I(1) variables)
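The decision rule printed under the table can be written as a small function. The Python sketch below is illustrative only and is not how estat ectest is implemented. It treats the F-test as upper-tailed and the t-test as lower-tailed, and labels a statistic lying between the I(0) and I(1) critical values as inconclusive, following the usual reading of Pesaran, Shin, and Smith (2001).

```python
def bounds_decision(stat, bound_i0, bound_i1, kind):
    """Three-way bounds-test decision at one significance level.

    bound_i0 / bound_i1: critical values for purely I(0) / purely I(1)
    regressors. kind is "F" (upper-tailed) or "t" (lower-tailed).
    """
    if kind == "t":
        # Flip signs so that "larger" means "more extreme" for both tests.
        stat, bound_i0, bound_i1 = -stat, -bound_i0, -bound_i1
    elif kind != "F":
        raise ValueError("kind must be 'F' or 't'")
    if stat < bound_i0:
        return "do not reject H0"
    if stat > bound_i1:
        return "reject H0"
    return "inconclusive"

# Test statistics and 5% critical values from the estat ectest output.
f_result = bounds_decision(40.952, 4.958, 5.843, kind="F")
t_result = bounds_decision(-9.002, -2.861, -3.225, kind="t")
print(f_result, "|", t_result)
```

With the statistics from the example, both tests reject the null hypothesis of no level relationship at the 5% level, in line with the reported p-values of 0.000.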
Example (continued): EC model with restricted trend
. ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic ec restricted noheader
------------------------------------------------------------------------------
D.ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.341178 .0431316 -7.91 0.000 -.4270464 -.2553096
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.14358 .0782318 14.62 0.000 .9878321 1.299327
qtr | -.0036516 .0016171 -2.26 0.027 -.006871 -.0004322
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.4362663 .0851 -5.13 0.000 -.6056874 -.2668452
L2D. | -.1899566 .0825977 -2.30 0.024 -.354396 -.0255172
|
ln_inv |
D1. | .0842961 .0173889 4.85 0.000 .0496775 .1189146
LD. | .0517241 .0188448 2.74 0.008 .0142069 .0892412
L2D. | .0726232 .017972 4.04 0.000 .0368437 .1084027
L3D. | .0482872 .0173383 2.79 0.007 .0137693 .0828051
|
_cons | -.3188651 .1422961 -2.24 0.028 -.602155 -.0355753
------------------------------------------------------------------------------
Example (continued): Bounds test with restricted trend
. estat ectest

Case 4 t = -7.910
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.066 4.582 | 4.784 5.351 | 6.396 7.057 | 0.000 0.000
t | -3.107 -3.384 | -3.412 -3.704 | -4.014 -4.327 | 0.000 0.000
do not reject H0 if
reject H0 if
Further information on the bounds test
The validity of the bounds test relies on normally distributed
error terms that are homoskedastic and serially uncorrelated,
as well as stability of the coefficients over time.
If in doubt about remaining serial error correlation, increase
the lag order for testing purposes (e.g. use the AIC instead of
the BIC to obtain the optimal lag order).
A more parsimonious model for interpretation and forecasting
purposes can be estimated after the testing procedure.
If the bounds test does not reject the null hypothesis of no
long-run relationship, an ARDL model purely in first differences
(without an equilibrium correction term) might be estimated.
Postestimation commands
Besides estat ectest, the ardl command supports
standard Stata postestimation commands such as estat ic,
estimates, lincom, nlcom, test, testnl, and lrtest.
predict provides fitted values (option xb) and
residuals (option residuals) in the usual way. In addition,
the option ec generates the equilibrium correction term:
$\widehat{ec}_t = y_{t-1} - \hat{\theta} x_t$ after ardl, ec
$\widehat{ec}_t = y_{t-1} - \hat{\theta} x_{t-1}$ after ardl, ec1
The diagnostic commands sktest, qnorm, and pnorm are
also helpful for detecting nonnormality of the residuals.
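The two definitions of the equilibrium correction term differ only in the timing of the regressor. A minimal Python sketch for a single regressor follows. The series are hypothetical toy numbers, not the lutkepohl2 data, and the function is an illustration of the formulas, not of predict itself.

```python
def ec_term(y, x, theta, lagged_x=False):
    """Equilibrium correction term for t = 1, ..., T-1 (single regressor).

    lagged_x=False: ec_t = y_{t-1} - theta * x_t      (as after ardl, ec)
    lagged_x=True:  ec_t = y_{t-1} - theta * x_{t-1}  (as after ardl, ec1)
    """
    shift = 1 if lagged_x else 0
    return [y[t - 1] - theta * x[t - shift] for t in range(1, len(y))]

# Hypothetical toy series, for illustration only.
y = [1.0, 1.5, 2.5]
x = [1.0, 2.0, 3.0]

ec_values = ec_term(y, x, theta=0.5)
ec1_values = ec_term(y, x, theta=0.5, lagged_x=True)
print(ec_values, ec1_values)
```

The first observation is lost in both cases because the term depends on the lagged dependent variable.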
The final ardl estimation results are internally obtained with
the regress command. These underlying regress estimates
can be stored with the ardl option regstore(name) and
restored with estimates restore name.
Subsequently, all the familiar regress postestimation
commands are available, in particular:
estat hettest and estat imtest for heteroskedasticity and
normality testing,
estat bgodfrey and estat durbinalt for serial-correlation
testing,⁶
estat sbcusum, estat sbknown, and estat sbsingle for
structural-breaks testing.
⁶ estat dwatson is not valid for ARDL / EC models because the lagged dependent variable is not strictly
exogenous by construction.
Example (continued): Serial-correlation testing

. quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic ec regstore(ardlreg)
. estimates restore ardlreg
(results ardlreg are active now)
. estat bgodfrey, lags(1/4) small
Breusch-Godfrey LM test for autocorrelation

---------------------------------------------------------------------------
lags(p) | F df Prob > F
-------------+-------------------------------------------------------------
1 | 0.116 ( 1, 77 ) 0.7341
2 | 0.068 ( 2, 76 ) 0.9340
3 | 0.364 ( 3, 75 ) 0.7791
4 | 0.453 ( 4, 74 ) 0.7702
---------------------------------------------------------------------------
H0: no serial correlation
. estat durbinalt, lags(1/4) small
Durbin’s alternative test for autocorrelation

---------------------------------------------------------------------------
lags(p) | F df Prob > F
-------------+-------------------------------------------------------------
1 | 0.102 ( 1, 77 ) 0.7505
2 | 0.059 ( 2, 76 ) 0.9426
3 | 0.314 ( 3, 75 ) 0.8150
4 | 0.389 ( 4, 74 ) 0.8162
---------------------------------------------------------------------------
Example (continued): Heteroskedasticity testing
. estat hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity

Ho: Constant variance
Variables: fitted values of D.ln_consump
chi2(1) = 0.26
Prob > chi2 = 0.6067
. estat imtest, white
White’s test for Ho: homoskedasticity

against Ha: unrestricted heteroskedasticity
chi2(54) = 52.03
Prob > chi2 = 0.5508
Cameron & Trivedi’s decomposition of IM-test
---------------------------------------------------
Source | chi2 df p
---------------------+-----------------------------
Heteroskedasticity | 52.03 54 0.5508
Skewness | 12.24 9 0.2000
Kurtosis | 0.02 1 0.8967
---------------------+-----------------------------
Total | 64.29 64 0.4664
---------------------------------------------------
Example (continued): Normality testing

. predict resid, residuals
(4 missing values generated)
. sktest resid
Skewness/Kurtosis tests for Normality

------ joint ------
Variable | Obs Pr(Skewness) Pr(Kurtosis) adj chi2(2) Prob>chi2
-------------+---------------------------------------------------------------
resid | 88 0.3270 0.8107 1.04 0.5939
. qnorm resid
. pnorm resid
[Figures: normal quantile plot (qnorm) and standardized normal probability plot (pnorm) of resid]
Example (continued): Structural-breaks testing

. estat sbcusum
Cumulative sum test for parameter stability

Ho: No structural break
1% Critical 5% Critical 10% Critical

Statistic Test Statistic Value Value Value
------------------------------------------------------------------------------
recursive 1.4690 1.1430 0.9479 0.850
------------------------------------------------------------------------------
[Figure: Recursive cusum plot of D.ln_consump with 95% confidence bands around the null, 1961 to 1981]

. estat sbcusum, ols


------------------------------------------------------------------------------
ols 0.6793 1.6276 1.3581 1.224
------------------------------------------------------------------------------
[Figure: OLS cusum plot of D.ln_consump, 1961 to 1981]
. estat sbsingle, all

----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
..........
Test for a structural break: Unknown break date
Number of obs = 88
Full sample: 1961q1 - 1982q4

Trimmed sample: 1964q3 - 1979q3
Test Statistic p-value

-----------------------------------------------
swald 20.1088 0.3040
awald 13.9245 0.1019
ewald 7.9897 0.1939
slr 22.7977 0.1605
alr 16.3306 0.0330
elr 9.3047 0.0886
-----------------------------------------------
Exogenous variables: L.ln_consump ln_inc LD.ln_consump L2D.ln_consump D.ln_inv LD.ln_inv
L2D.ln_inv L3D.ln_inv qtr
Coefficients included in test: L.ln_consump ln_inc LD.ln_consump L2D.ln_consump D.ln_inv LD.ln_inv
L2D.ln_inv L3D.ln_inv qtr _cons
. estat sbsingle, breakvars(L.ln_consump ln_inc) all

----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
..........
Number of obs = 88


-----------------------------------------------
swald 8.9039 0.1457
awald 2.5060 0.2608
ewald 2.0321 0.1738
slr 9.7492 0.1046
alr 2.8269 0.2027
elr 2.3571 0.1225
-----------------------------------------------
Coefficients included in test: L.ln_consump ln_inc
Note: This is a test for a structural break in the speed-of-adjustment and long-run coefficients.
Further topics
The ardl command can estimate autoregressive models
without independent variables. In this case, the bounds test
collapses to the familiar augmented Dickey-Fuller unit root
test. The Kripfganz and Schneider (2018) critical values cover
this special case, too.
The forecast command suite can be used for model
forecasting after ardl.
ardl does not compute robust standard errors. Yet, once the
optimal lag order is obtained, the final model can be
reestimated with the newey command to obtain Newey-West
standard errors.
Example (continued): Augmented Dickey-Fuller regression
. ardl dln_inv, aic ec restricted
ARDL(4) regression

R-squared = 0.6462
------------------------------------------------------------------------------
D.dln_inv | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ |
dln_inv |
L1. | -.755277 .2295731 -3.29 0.001 -1.211971 -.2985831
-------------+----------------------------------------------------------------
LR |
_cons | .015006 .0060544 2.48 0.015 .0029618 .0270501
-------------+----------------------------------------------------------------
SR |
dln_inv |
LD. | -.4633003 .2005284 -2.31 0.023 -.8622152 -.0643855
L2D. | -.4938993 .1577325 -3.13 0.002 -.8076796 -.180119
L3D. | -.3133117 .1029967 -3.04 0.003 -.5182049 -.1084184
------------------------------------------------------------------------------
Note: The aim is to test whether dln_inv, the first difference of ln_inv, is nonstationary.
Example (continued): Augmented Dickey-Fuller test
. estat ectest

Case 2 t = -3.290
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 3.823 3.812 | 4.677 4.659 | 6.644 6.601 | 0.026 0.025
t | -2.565 -2.569 | -2.869 -2.874 | -3.463 -3.472 | 0.017 0.017
do not reject H0 if
reject H0 if
Note: The null hypothesis is that dln_inv follows a unit root process (without drift).
. dfuller dln_inv if e(sample), lags(3) regress
Augmented Dickey-Fuller test for unit root Number of obs = 87
---------- Interpolated Dickey-Fuller ---------

Test 1% Critical 5% Critical 10% Critical
Statistic Value Value Value
------------------------------------------------------------------------------
Z(t) -3.290 -3.528 -2.900 -2.585
------------------------------------------------------------------------------
MacKinnon approximate p-value for Z(t) = 0.0153
------------------------------------------------------------------------------
D.dln_inv | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dln_inv |
L1. | -.755277 .2295731 -3.29 0.001 -1.211971 -.2985831
LD. | -.4633003 .2005284 -2.31 0.023 -.8622152 -.0643855
L2D. | -.4938993 .1577325 -3.13 0.002 -.8076796 -.180119
L3D. | -.3133117 .1029967 -3.04 0.003 -.5182049 -.1084184
|
_cons | .0113337 .0060208 1.88 0.063 -.0006437 .023311
------------------------------------------------------------------------------
Example (continued): Forecasting

. quietly ardl ln_consump ln_inc ln_inv if qtr < tq(1981q1), trend(qtr)
. estimates store ardl
. forecast create ardl
Forecast model ardl started.
. forecast estimates ardl, predict(xb)

Added estimation results from ardl.
Forecast model ardl now contains 1 endogenous variable.
. forecast exogenous ln_inc ln_inv qtr

Forecast model ardl now contains 3 declared exogenous variables.
. forecast solve, begin(tq(1981q1))
Computing dynamic forecasts for model ardl.

-------------------------------------------
Starting period: 1981q1
Ending period: 1982q4
Forecast prefix: f_
1981q1: ...........
1981q2: ...........
1981q3: ...........
1981q4: ...........
1982q1: ...........
1982q2: ..........
1982q3: ..........
1982q4: ...........
Forecast 1 variable spanning 8 periods.

---------------------------------------
Example (continued): Forecast versus actual data

. twoway (tsline f_ln_consump if qtr>=tq(1979q1)) (tsline ln_consump if qtr>=tq(1979q1)), tline(1981q1)
[Figure: forecast (log consumption, ardl f_) versus actual log consumption, 1979 to 1982, with a vertical line at 1981q1]
Note: The forecast period (1981q1 – 1982q4) is excluded from the estimation period (1961q1 – 1980q4).
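The forecasts above are dynamic: from the second forecast period onwards, lagged values of the dependent variable are themselves forecasts, while the regressors are taken as given. The recursion can be sketched in Python as follows. This is a simplified illustration with hypothetical coefficients and data, a single regressor, and no trend term. It is not the forecast suite's implementation.

```python
def dynamic_forecast(const, phi, beta, y_history, x_path, horizon):
    """Dynamic forecasts from
    y_t = const + sum_i phi_i * y_{t-i} + sum_j beta_j * x_{t-j}.

    y_history: observed y up to the forecast origin
               (needs at least max(p, q) observations).
    x_path:    x over the history AND the forecast period, aligned with y.
    Earlier forecasts replace observed y in later periods.
    """
    y = list(y_history)
    for t in range(len(y_history), len(y_history) + horizon):
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)))
        dl_part = sum(beta[j] * x_path[t - j] for j in range(len(beta)))
        y.append(const + ar_part + dl_part)
    return y[len(y_history):]

# Hypothetical ARDL(1,0) example: y_t = 1 + 0.5*y_{t-1} + 2*x_t.
forecasts = dynamic_forecast(
    const=1.0, phi=[0.5], beta=[2.0],
    y_history=[4.0], x_path=[0.0, 1.0, 1.0], horizon=2,
)
print(forecasts)
```

Declaring the regressors as exogenous, as in forecast exogenous above, corresponds to supplying x_path over the whole forecast period.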
Example (continued): Newey-West standard errors

. quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic regstore(ardlreg)
. quietly estimates restore ardlreg
. local cmdline `"`e(cmdline)'"'
. gettoken cmd cmdline : cmdline
. newey `cmdline' lag(4)
Regression with Newey-West standard errors Number of obs = 88

maximum lag: 4 F( 9, 78) = 62645.21
Prob > F = 0.0000
------------------------------------------------------------------------------
| Newey-West
ln_consump | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_consump |
L1. | .2225557 .0931767 2.39 0.019 .0370552 .4080562
L2. | .2463097 .1003579 2.45 0.016 .0465125 .4461068
L3. | .1899566 .1013927 1.87 0.065 -.0119008 .3918141
|
ln_inc | .3901642 .0400174 9.75 0.000 .3104956 .4698327
|
ln_inv |
D1. | .0842961 .0258047 3.27 0.002 .0329229 .1356693
LD. | .0517241 .0158053 3.27 0.002 .0202582 .08319
L2D. | .0726232 .0156803 4.63 0.000 .0414061 .1038404
L3D. | .0482872 .017342 2.78 0.007 .013762 .0828124
|
qtr | -.0012458 .000383 -3.25 0.002 -.0020083 -.0004833
_cons | -.3188651 .1104624 -2.89 0.005 -.5387789 -.0989513
------------------------------------------------------------------------------
Example (continued): Long-run coefficient
. nlcom _b[ln_inc] / (1 - _b[L.ln_consump] - _b[L2.ln_consump] - _b[L3.ln_consump])
_nl_1: _b[ln_inc] / (1 - _b[L.ln_consump] - _b[L2.ln_consump] - _b[L3.ln_consump])
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_nl_1 | 1.14358 .0691576 16.54 0.000 1.008033 1.279126
------------------------------------------------------------------------------
Note: This is the same long-run coefficient as earlier but with Newey-West standard errors.
Summary: The ardl package for Stata
The ardl command estimates an ARDL model with optimal
or prespecified lag orders, possibly reparameterized in EC form.
The bounds test for the existence of a long-run /
cointegrating relationship is implemented as the
postestimation command estat ectest.
Asymptotic and finite-sample critical value bounds are
available (Kripfganz and Schneider, 2018).
The augmented Dickey-Fuller unit root test is a special case in
the absence of independent variables.
The usual regress postestimation commands can be applied.
ssc install ardl
help ardl
help ardl postestimation
References
Cheung, Y.-W., and K. S. Lai (1995). Lag order and critical values of the augmented Dickey-Fuller test.
Journal of Business & Economic Statistics 13(3): 277–280.
Engle, R. F., and C. W. J. Granger (1987). Co-integration and error correction: representation, estimation,
and testing. Econometrica 55(2): 251–276.
Hassler, U., and J. Wolters (2006). Autoregressive distributed lag models and cointegration. Allgemeines
Statistisches Archiv 90(1): 59–74.
Kripfganz, S., and D. C. Schneider (2018). Response surface regressions for critical value bounds and
approximate p-values in equilibrium correction models. Manuscript, University of Exeter and Max Planck
Institute for Demographic Research, www.kripfganz.de.
Lütkepohl, H. (1993). Introduction to Multiple Time Series Analysis (2nd edition), Berlin, New York:
Springer.
Narayan, P. K. (2005). The saving and investment nexus for China: evidence from cointegration tests.
Applied Economics 37(17): 1979–1990.
Pesaran, M. H., and Y. Shin (1998). An autoregressive distributed-lag modelling approach to cointegration
analysis. In Econometrics and Economic Theory in the 20th Century. The Ragnar Frisch Centennial
Symposium, ed. S. Strøm, chap. 11, 371–413. Cambridge: Cambridge University Press.
Pesaran, M. H., Y. Shin, and R. Smith (2001). Bounds testing approaches to the analysis of level
relationships. Journal of Applied Econometrics 16(3): 289–326.
Frontiers in African Business Research
Almas Heshmati Editor
Studies on Economic
Development and
Growth in Selected
African Countries
Series editor
Almas Heshmati, Jönköping International Business School,
Jönköping, Sweden
This book series publishes monographs and edited volumes devoted to studies on
entrepreneurship, innovation, as well as business development and management-related
issues in Africa. Volumes cover in-depth analyses of individual countries,
regions, cases, and comparative studies. They include both a specific and a general
focus on the latest advances of the various aspects of entrepreneurship, innovation,
business development, management and the policies that set the business environment.
It provides a platform for researchers globally to carry out rigorous analyses,
to promote, share, and discuss issues, findings and perspectives in various areas
of business development, management, finance, human resources, technology, and
the implementation of policies and strategies of the African continent. Frontiers in
African Business Research allows for a deeper appreciation of the various issues
around African business development with high quality and peer reviewed contributions.
Volumes published in the series are important reading for academicians,
consultants, business professionals, entrepreneurs, managers, as well as policy
makers, interested in the private sector development of the African continent.
More information about this series at http://www.springer.com/series/13889

Editor
Almas Heshmati
Jönköping International Business School
Jönköping University
Jönköping
Sweden
and
Department of Economics
Sogang University
Seoul
South Korea
ISSN 2367-1033 ISSN 2367-1041 (electronic)

ISBN 978-981-10-4450-2 ISBN 978-981-10-4451-9 (eBook)
DOI 10.1007/978-981-10-4451-9
Library of Congress Control Number: 2017936352
© Springer Nature Singapore Pte Ltd. 2017

This Springer imprint is published by Springer Nature

The registered company is Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Contents
1 Introduction to Studies on Economic Development and Growth in Selected African Countries
Almas Heshmati
Part I Women’s Empowerment and Demand for Healthcare
2 Measuring Women’s Empowerment in Rwanda
Abdou Musonera and Almas Heshmati
3 Determinants of Demand for Outpatient Health Care in Rwanda
Charles Mulindabigwi Ruhara and Urbanus Mutuku Kioko
Part II The Impact of Institutions, Aid, Inflation and FDI on Economic Growth
4 The Impact of Institutions on Economic Growth in Sub-Saharan Africa: Evidence from a Panel Data Approach
Kokeb G. Giorgis
5 Fiscal Effects of Aid in Rwanda
Thomas Bwire, Caleb Tamwesigire and Pascal Munyankindi
6 Relationship Between Inflation and Real Economic Growth in Rwanda
Ferdinand Nkikabahizi, Joseph Ndagijimana and Edouard Musabanganji
7 Macroeconomic, Political, and Institutional Determinants of FDI Inflows to Ethiopia: An ARDL Approach
Addis Yimer
Part III Capital Structure and Bank Loan Growth Effects
8 Firm-Specific Determinants of Insurance Companies’ Capital Structure in Ethiopia
Yitbarek Takele and Daniel Beshir
9 Income Distribution and Economic Growth
Atnafu Gebremeskel
Part IV Trade, Mineral Exports and Exchange Rate
10 Determinants of Trade with Sub-Saharan Africa: The Secret of German Companies’ Success
Johannes O. Bockmann
11 An Assessment of the Contribution of Mineral Exports to Rwanda’s Total Exports
Emmanuel Mushimiyimana
12 Testing the Balassa Hypothesis in Low- and Middle-Income Countries
Fentahun Baylie
Part V Growth, Productivity and Efficiency in Various Industries
13 Agricultural Tax Responsiveness and Economic Growth in Ethiopia
Hassen Azime, Gollagari Ramakrishna and Melesse Asfaw
14 Improving Agricultural Productivity Growth in Sub-Saharan Africa
Olaide Rufai Akande, Hephzibah Onyeje Obekpa and Djomo-Raoul Fani
15 Determinants of Service Sector Firms’ Growth in Rwanda
Eric Uwitonze and Almas Heshmati
16 Labor-Use Efficiency in Kenyan Manufacturing and Service Industries
Masoomeh Rashidghalam
Author Index
Subject Index
Contributors
Olaide Rufai Akande Department of Agricultural Economics, University of
Agriculture, Makurdi, Nigeria
Melesse Asfaw School of Graduate Studies, Ethiopian Civil Service University,
Addis Ababa, Ethiopia
Hassen Azime Institute of Tax and Customs Administration, Department of Public
Finance, Ethiopian Civil Service University, Addis Ababa, Ethiopia
Fentahun Baylie Department of Economics, Addis Ababa University, Addis
Ababa, Ethiopia
Daniel Beshir Libya Oil Ethiopia Ltd, Addis Ababa, Ethiopia
Johannes O. Bockmann Department of Economics and Logistics, International
School of Management Hamburg, Hamburg, Germany
Thomas Bwire Bank of Uganda, Kampala, Uganda
Djomo-Raoul Fani Department of Agricultural Economics, University of
Agriculture, Makurdi, Nigeria
Atnafu Gebremeskel Department of Economics, Addis Ababa University, Addis
Ababa, Ethiopia
Kokeb G. Giorgis Department of Economics, Addis Ababa University, Addis
Ababa, Ethiopia
Almas Heshmati Jönköping International Business School (JIBS), Jönköping
University, Jönköping, Sweden; Department of Economics, Sogang University,
Seoul, South Korea
Urbanus Mutuku Kioko University of Nairobi, Nairobi, Kenya
Pascal Munyankindi National Bank of Rwanda, Kigali, Rwanda
Edouard Musabanganji Economy and Rural Development Unit, Gembloux
Agro-Bio Tech, University of Liège, Liège, Belgium
Emmanuel Mushimiyimana Department of Political Science and International
Relations, College of Arts and Social Sciences, University of Rwanda, Butare,
Rwanda
Abdou Musonera MIFOTRA-SPIU, Kigali, Rwanda
Joseph Ndagijimana College of Business and Economics, University of Rwanda,
Butare, Rwanda
Ferdinand Nkikabahizi College of Business and Economics, University of
Rwanda, Butare, Rwanda
Hephzibah Onyeje Obekpa Department of Agricultural Economics, University of
Agriculture, Makurdi, Nigeria
Gollagari Ramakrishna School of Graduate Studies, Ethiopian Civil Service
University, Addis Ababa, Ethiopia
Masoomeh Rashidghalam Department of Agricultural Economics, University of
Tabriz, Tabriz, Iran
Charles Mulindabigwi Ruhara University of Rwanda, Butare, Rwanda
Yitbarek Takele College of Business and Economics, Addis Ababa University,
Caleb Tamwesigire University of Kigali, Kigali, Rwanda
Eric Uwitonze Ministry of Gender and Family Promotion (MIGEPROF),
MIGEPROF, Single Project Implementation Unit, Kigali, Rwanda
Addis Yimer Department of Economics, Addis Ababa University, Addis Ababa,
Ethiopia
Abbreviations
AD Aggregate demand
ADF Augmented Dickey-Fuller
AfDB African Development Bank
AIC Akaike information criterion
ANOVA Analysis of variance
AR Autoregressive
ARDL Autoregressive distributed lag
AS Aggregate supply
ATM Automatic teller machines
AVC Agriculture value chains
CBHIS Community-based health insurance schemes
CD Cobb–Douglas function
CEO Chief executive officer
CLRM Classical linear regression models
COPIMAR Mining Cooperative of Artisan Miners
CORR Control of corruption
CPI Consumer Price Index
CS Capital structure
CUSUM Cumulative sum
CUSUMQ Cumulative sum of squares
DC Developing countries
DHS Demographic and Health Survey
EAC East African Countries
ECM Error correction model
EDPRS Economic Development and Poverty Reduction Strategy
EIA Ethiopian Investment Authority
EICV Integrated Household Living Conditions Survey
ELH Ethno-linguistic heterogeneity
EP Export performance
EPRDF Ethiopian People’s Revolutionary Democratic Front
EU European Union
FAO Food and Agriculture Organization
FDI Foreign direct investment
GDP Gross domestic product
GLR Great Lakes region
GMM Generalized method of moments
GOVEFFE Government effectiveness
HAC Heteroskedasticity and auto-correlation consistent covariance
ICMM International Council on Mining and Metals
ICT Information and communication technologies
IMF International Monetary Fund
KES Kenyan shilling
KEU Kenya Economic Update
LDE Logistic diffusion equation
LDEFO Logistic differential equation of first order
LIC Low-income countries
LM Lagrangian multiplier
LSDV Least squares dummy variable
MDG Millennium development goals
MIC Middle-income countries
MLE Maximum likelihood estimation
MNC Multinational corporation
MOFED Ministry of Finance and Economic Development
NBE National Bank of Ethiopia
NGO Non-Governmental Organization
NISR National Institute of Statistics for Rwanda
NRG New Resolutions Geophysics
OECD Organisation for Economic Co-operation and Development
OLS Ordinary least squares
OOPE Out-of-pocket healthcare expenditures
PMG Pooled mean group
POLS Pooled OLS
POLSTAB Political stability
PPP Purchasing power parity
PTA Prospective target areas
PWT Penn World Tables
R&D Research and Development
RA Representative Agent
RoL Rule of Law
RQ Regulatory Quality
SAP Structural adjustment programs
SIDA Swedish International Development Cooperation Agency
SME Small- and medium-sized enterprises
SSA Sub-Saharan African
TFP Total factor productivity
UK United Kingdom
UN United Nations
UNCTAD United Nations Conference on Trade and Development
UNDP United Nations Development Program
UNECA United Nations Economic Commission for Africa
USD United States Dollar
VAR Vector Auto-Regression
VAT Value-added tax
VECM Vector error correction model
VOIACC Voice and accountability
WB World Bank
WDI World development indicators
WGI World governance indicators
Chapter 1
Introduction to Studies on Economic
Development and Growth in Selected
African Countries
Almas Heshmati
Abstract A major policy challenge facing Africa is how to sustain a high rate of
economic growth that is both socially inclusive and environmentally sustainable.
Growth and its sustainability influence many other challenges facing the continent.
This volume is a collection of selected empirical studies on economic development
and growth in Africa. The papers were presented at the second conference on
Recent Trends in Economic Development, Finance and Management Research in
Eastern Africa, Kigali, Rwanda, June 20–22, 2016. The studies are grouped into
domains influencing economic development and growth in Africa.

Keywords Economic Development · Economic Growth · Sustainable Growth ·
Determinants of Growth · Governance and Institutions · African Countries
1.1 Background
A major policy challenge facing Africa is how to sustain a high rate of economic
growth that is both socially inclusive and environmentally sustainable. Other
challenges include population aging, population growth, rapid urbanization,
providing infrastructure for services and for expanding production, reversing the
decline in economic growth after the 2008 global financial crisis, corruption,
inefficiency, and responding to climate change. Against this background,
Jönköping International Business School and the University of Rwanda organize
a conference on economic development in the region every year.
This volume is a collection of selected empirical studies on economic development
and growth in Africa. The papers were selected from a set of more than 90 papers
presented at the second conference on Recent Trends in Economic Development,
Finance and Management Research in Eastern Africa, Kigali, Rwanda, June
20–22, 2016. Following a process of review and revisions, 15 papers were accepted
for publication in this edited volume on economic development and growth.
A. Heshmati (✉)
Jönköping International Business School, Jönköping University, Jönköping, Sweden
e-mail: almas.heshmati@gmail.com
A. Heshmati
Department of Economics, Sogang University, Seoul, South Korea
© Springer Nature Singapore Pte Ltd. 2017
A. Heshmati (ed.), Studies on Economic Development and Growth in Selected
African Countries, Frontiers in African Business Research,
DOI 10.1007/978-981-10-4451-9_1
The studies are grouped into domains influencing economic development and
growth in Africa. The core argument for using a multiple approach perspective is
the need to account for different approaches for enhancing growth and develop-
ment. The aim is not to identify specific determinants of growth and development
and to apply them to a set of countries assuming that every country is affected in the
same way and by the same factors. This volume recognizes that countries have
different initial conditions and factor endowments and, as such, respond
differently to development and growth policies. Together, the chapters included in
the volume provide a comprehensive picture of the state of development and growth
and their country-specific determinants and policies. They consider the
heterogeneity of countries, efficient policies and practices for growth, and its
distribution, both on the African continent as a whole and in selected countries,
mainly in Eastern Africa. Development and growth represent a major challenge for
governments and organizations whose aim is development and alleviating poverty.
This volume contains a collection of empirical studies on the level of devel-
opment and growth, and their variations and determinants in Africa. The first
chapter is an introduction/summary written by the volume’s Editor. The remaining
15 chapters are inter-related studies that are grouped into five domains which
influence the level, variations, and developments on the African continent as a
whole and also in individual countries. The results can have strong implications for
the development and policies in Africa.
1.2 Summary of Individual Studies
This edited volume is a collection of studies on economic development and growth
in selected African countries. The volume consists of 16 chapters including an
introduction/summary and 15 inter-related empirical studies. The studies are largely
grouped into five research areas: women’s empowerment and demand for health
care; the impact of institutions, aid, inflation, and FDI on economic growth; capital
structure and bank loan growth effects; trade, mineral exports, and exchange rate;
and growth, productivity, and efficiency in various industries. The studies provide a
comprehensive picture of the state of economic development and growth in most
parts of the African continent. Though several studies cover major parts of the
continent, the main focus of the edited volume is on economic development and
growth in Ethiopia and Rwanda—two countries on the path of rapid economic and
social development.
Africa is developing rapidly. Among the aspects of development in the region
are regional groupings of countries cooperating on economic
integration, currency unions, trade zones, and sustainable growth. The
chapters investigate in single and comparative cases factors such as gender equality,
health care, the quality of the institutions, and their effectiveness along with the
effect of institutions on growth and development in the nations on the continent.
The chapters also investigate other factors of importance for the industry, service,
and agricultural sectors. Among the factors that are explored are those that influence
production and flows of factors including inflation rate, foreign direct investment,
sources of finance, and trade and exchange rate. These issues have not been well
researched so far. With contributions from African professionals in the field, this
book attempts to shed light on the importance and effects of various determinants of
economic development and growth. Hence, it will help fill existing gaps in the
region-specific literature and also provide necessary policy tools for decision
makers.
Part A. Women’s Empowerment and Demand for Health care
Part A covers two important development areas—women’s empowerment and
demand for health care—in Rwanda.
The first study (Chap. 2), Measuring Women’s Empowerment in Rwanda, by
Abdou MUSONERA and Almas HESHMATI, examines the determinants of
women’s empowerment in Rwanda using demographic and health survey data. It
uses a regression analysis to investigate the association between women’s
empowerment and its covariates. The study also uses a multinomial logistic
regression to assess what determines households’ decision making and attitudes
toward physical abuse of spouses. It finds that education and media exposure are
positively associated, while residence and the age at first marriage are negatively
associated with women’s empowerment. Strengthening regulations and public
support improves women’s empowerment.
The second study (Chap. 3), Determinants of Demand for Outpatient Health
care in Rwanda, by Charles M. RUHARA and Urbanus M. KIOKO, examines the
factors that influence the demand for outpatient care in Rwanda using the household
living conditions survey. It estimates a structural model of demand for health care to
measure the healthcare demand effects of covariates. The findings indicate that
health insurance is a significant determinant of outpatient medical care. In addition,
the price of health care and household income are among the main drivers of the
utilization of health care. The study recommends that the government should reduce
out-of-pocket health care expenditure through subsidies and reduce the premium for
community-based health insurance schemes to increase coverage rates.
Part B. The Impact of Institutions, Aid, Inflation, and FDI on Economic
Growth
This part analyses the impact of institutional quality, provision of aid, inflation, and
foreign direct investment on economic growth in a large number of African
countries.
The first study (Chap. 4), Economic Growth and the Impact of Institutions, by
Kokeb G. GIORGIS, discusses the effect that institutional variables have on eco-
nomic growth. It empirically investigates the impact of institutional quality proxied
by control of corruption, government effectiveness, and protection of property
rights on economic growth in 21 sub-Saharan African countries during the sample
period 1996–2012. The results indicate that improving institutional quality and
specifically protecting property rights will contribute positively to growth in output
per capita.
The second study (Chap. 5), Fiscal Effects of Aid in Rwanda, by Thomas
BWIRE, Caleb TAMWESIGIRE and Pascal MUNYANKINDI, analyses the
dynamic relationship between foreign aid and domestic fiscal variables in Rwanda.
The hypothesis of aid exogeneity is not statistically supported. The anticipated aid
appears to have been taken into account in budget planning. Aid is associated with
increased tax efforts and public spending and lower domestic borrowings. In terms
of policy, continued efforts by donors to coordinate aid delivery systems, making
aid more transparent, and supporting improvements in government fiscal statistics
are contributing to improving fiscal planning in Rwanda. Estimation results show
that aid has contributed to Rwanda’s improved fiscal performance.
The third study (Chap. 6), Exploring the Relationship between Inflation and
Real Economic Growth in Rwanda, by Ferdinand NKIKABAHIZI,
Joseph NDAGIJIMANA and Edouard MUSABANGANJI, examines the impact
that economic stability measures of inflation and unemployment rates have had on
real GDP in Rwanda. The study concludes that inflation and unemployment have a
long-run negative and significant relationship with real GDP. Real GDP increases
when inflation and unemployment decrease. The effect of the shock reduces by
19.32% in each of the four quarters, ending after a five-year period. The study also
finds a weak relationship between real GDP and inflation and unemployment rates.
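One way to read the adjustment figure quoted above is as a constant quarterly speed of adjustment, so that the remaining share of a shock shrinks geometrically. This reading is an assumption, not something the summary states, and the function name `remaining_share` is hypothetical. A minimal Python sketch under that assumption:

```python
def remaining_share(adjustment_rate: float, quarters: int) -> float:
    """Share of an initial shock still present after `quarters` periods.

    Assumes (hypothetically) that a constant fraction `adjustment_rate`
    of the remaining disequilibrium is corrected in every quarter.
    """
    if not 0.0 <= adjustment_rate <= 1.0:
        raise ValueError("adjustment_rate must lie in [0, 1]")
    if quarters < 0:
        raise ValueError("quarters must be non-negative")
    return (1.0 - adjustment_rate) ** quarters


if __name__ == "__main__":
    rate = 0.1932  # quarterly figure quoted in the chapter summary
    for q in (4, 8, 20):
        share = remaining_share(rate, q)
        print(f"after {q:2d} quarters: {share:.1%} of the shock remains")
```

Under this reading, roughly 42% of a shock remains after one year and under 2% after five years (20 quarters), which is consistent with the statement that the effect ends after about a five-year period.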
The fourth study (Chap. 7), Macroeconomic, Political and Institutional
Determinants of FDI Inflows to Ethiopia: An ARDL Approach, by Addis YIMER,
investigates the various determinants of FDI inflows to Ethiopia. Using time
series methods, it finds that political and institutional factors are crucial both in
the long and in the short run for FDI inflows to the country. On the one hand,
market size, availability of natural resources, openness to trade, and exchange
rate depreciation affect FDI inflows positively. On the other hand, macroeconomic
instability affects FDI inflows negatively. In addition, political stability, the absence
of violence, and the effectiveness of the government in formulating and imple-
menting sound development policies are found to affect FDI inflows positively.
Part C. Capital Structure and Bank Loan Growth Effects
Part C discusses insurance companies’ capital structure and the effects of
manufacturing firms’ access to bank loans and income distribution on productivity
growth in Ethiopia.
Using panel data methods, the first study (Chap. 8), Firm-specific
Determinants of Insurance Companies’ Capital Structures in Ethiopia, by
Yitbarek TAKELE and Daniel BESHIR, examines the impact that a firm’s char-
acteristics have on decisions about the capital structure in the Ethiopian insurance
industry. A number of tests are conducted to validate the results. It finds that
pecking order, static trade-off, and agency cost theories are the most important in
explaining decisions on the capital structure of insurance companies, though
pecking order appears to be dominant. Profitability, asset tangibility, growth, and
liquidity play a significant role in shaping the insurance industry’s financing
decisions, while business risk and size of the firm do not.
The second study (Chap. 9), Income Distribution and Economic Growth, by
Atnafu GEBREMESKEL, links access to bank loans and income distribution to
productivity growth of firms. Using Ethiopian manufacturing firm-level data, the
study examines how functional income distribution can influence the evolution of
productivity, thereby promoting economic growth. It employs an evolutionary
economic framework and an econometric approach for the analysis. The results show
a lack of strong evidence of intra-industry selection for fostering productivity growth
and structural change. The key policy lesson is that access to bank loans is of great
importance to firms for their structural transformation.
Part D. Trade, Mineral Exports, and Exchange Rate
Part D covers German SMEs’ trade with sub-Saharan Africa, contributions of
mineral exports to Rwanda’s trade, and the relationship between economic growth
and real exchange rate in low- and middle-income countries.
The first paper (Chap. 10), SME Trade with sub-Saharan Africa: The secret of
German companies’ success, by Johannes O. BOCKMANN, evaluates the degree
to which internal, micro- and macro-environmental variables explain how some
SMEs based in Germany export more successfully to sub-Saharan Africa than
others in the same category. Econometric methods are used to identify
the determinants of export performance. Estimation results indicate that
sub-Saharan Africa has specific requirements for successful exports. Knowledge
about these particular characteristics of the market enables managers and policy-
makers to improve trade relations. By focusing on the export performance of
German SMEs in SSA, this study fills a research gap since no previous study has
dealt with this specific aspect.
The second study (Chap. 11), An Assessment of the Contribution of Mineral
Exports to Rwanda’s Total Exports, by Emmanuel MUSHIMIYIMANA, is an
assessment of the mineral industry’s contribution to Rwanda’s growing mineral
exports. Mineral exports can be a means of increasing exports for agrarian and low-
and middle-income countries. The results, based on econometric methods,
show that mineral exports are the main contributor to the increase in Rwanda’s total
exports. This implies that the Government of Rwanda needs to introduce significant
reforms in the mining sector and take Botswana and Namibia as its role models in
developing its mineral industry which can play a role in the industrialization of the
country.
The third study (Chap. 12), Testing the External Balassa Hypothesis in Low- and
Middle-Income Countries, by Fentahun BAYLIE, analyzes the long-run relation-
ship between economic growth and the real exchange rate for 15 low- and
middle-income countries. It establishes a co-integration relationship between
growth and exchange rate by controlling for heterogeneity and cross-sectional
dependence. This implies that the productivity effect is estimated consistently and
without any bias. Moreover, the results indicate that the effect of the Balassa term
depends more on income levels than on the rate of economic growth. In general, the
power of the effect is stronger for higher income countries in the long run.
However, in the short run, fiscal policy and exchange rate volatility clearly explain
the variations in the real exchange rate.
Part E. Growth, Productivity, and Efficiency in Various Industries

Part E deals with tax-growth responsiveness, productivity, and efficiency in agri-
culture, manufacturing, and services in select African countries.
The first study (Chap. 13), Agricultural Tax Responsiveness and Economic
Growth in Ethiopia, by Hassen AZIME, Gollagari RAMAKRISHNA, and
Melesse ASFAW, looks at the pattern of tax revenues and its nexus with economic
growth in developing countries. Since tax revenue is one of the important sources of
government revenue, tax policy assumes significance as a viable and long-term
source of revenue and economic growth. Similarly, economic growth has aug-
menting effects on tax revenues. The relationship between the two is essential for
formulating fiscal policy. The study suggests policy interventions for improving tax
revenue structures.
The second study (Chap. 14), Improving Agricultural Productivity in
sub-Saharan Africa, by Olaide R. AKANDE, Hephzibah O. OBEKPA and
Djomo-Raoul FANI, looks at improving agricultural productivity, which is central to
achieving inclusive development, reducing poverty, and enhancing the living
standards of most people in sub-Saharan Africa. The study asks whether
agro-processing activities and exports of raw agricultural materials have
backward linkage effects on agricultural production activities. The
estimation results indicate that while agro-processing activities have a positive
effect on agricultural productivity, increased exports of agricultural raw materials
negatively influence productivity growth in agriculture.
The third study (Chap. 15), Determinants of Service Sector Firms’ Growth in
Rwanda, by Eric UWITONZE and Almas HESHMATI, views the service sector as
an avenue for economic transformation. It discusses the role that services can
play in the economic growth of African economies leading to their transformation
into service-based economies. Services are considered as an alternative to
manufacturing-led development. This research attempts to study the development
of the service sector and investigates the factors behind the development of this
sector in Rwanda. It specifies and estimates models to assess the factors con-
tributing to sales growth, innovation, and turnover of service firms in speeding up
the shift from a low-income to a middle-income development state.
The last study (Chap. 16), Labor-use Efficiency in Kenyan Manufacturing and
Service Industries, by Masoomeh RASHIDGHALAM, estimates the efficiency in
the use of labor in Kenyan manufacturing and service industries at the firm level.
The study provides evidence of efficiency in the use of labor in the country. It
identifies the determinants of labor-use efficiency and estimates their effects.
Labor-use efficiency is important for firms’ competitiveness in both the domestic
and international labor and goods markets. It makes a number of recommendations
for promoting higher labor-use efficiency at the firm level and also through public
labor market and industrial policies.
1.3 Final Words
The primary market for this edited book includes undergraduate and graduate
students, lecturers, researchers, public and private institutions, NGOs, international
aid agencies, and decision makers. This book can serve as complementary reading
to texts on economic growth, development, welfare, inequality, and poverty anal-
yses in Africa. The organizers of the annual conference on economic development
in East Africa will market the book at their annual East Africa conferences. There
are many books on growth and development in Africa, but they rarely offer such
diversity in approaches, country-specific detail, and policy
recommendations.
This edited book is authored by African experts in the field who employ diverse
up-to-date methods to provide robust empirical results based on representative
disaggregate data at the household and firm levels and aggregate data covering
individual or multiple countries on the continent. It contains a wealth of empirical
evidence, deep analyses, and sound recommendations for policymakers and
researchers for designing and implementing effective economic policies and
strategies to achieve rapid and higher levels of development. As such, the book is a
useful resource for policymakers and researchers involved in development- and
growth-related tasks. It will also appeal to a broader audience interested in eco-
nomic development, resources, policies, economic welfare, and inclusive growth.
The Editor is grateful to a host of dedicated authors and rigorous referees
who helped in assessing the submitted papers. Many were presenters at the 2016
conference at the University of Rwanda. Special thanks go to
Bideri Ishuheri Nyamulinda, Rama Rao, and Lars Hartvigson and the remaining
members of the Organization Committee for their efforts in organizing the con-
ference. The Editor would also like to thank William Achauer at Springer Singapore
for guidance and for assessing this manuscript for publication by Springer.
Financial support by the Swedish International Development Cooperation Agency
(SIDA) to organize the conference is gratefully acknowledged.
Part I
Women’s Empowerment and Demand for
Healthcare
Chapter 2
Measuring Women’s Empowerment
in Rwanda
Abdou Musonera and Almas Heshmati
Abstract This study examines the determinants of women’s empowerment in
Rwanda using the data obtained from the Demographic and Health Survey
(DHS) (2010). It uses a regression analysis to investigate the association between
women’s empowerment and its covariates. The study also uses a multinomial
logistic regression to assess what determines households’ decision-making and
attitudes toward physical abuse of spouses. It finds that variables capturing sources
of empowerment, such as education and media exposure, have a net positive
association with women’s empowerment, while other variables, such as residence and
the age at first marriage, are negatively associated with women’s empowerment.
A further analysis shows that education, age of the respondent, wealth,
and the number of children ever born remain strong factors that affect
households’ decision-making and attitudes about physical abuse. In general, it
seems that for women to fully realize their potential and rights, specific emphasis
should be put on variables that increase their access to resources and knowledge,
such as education, employment for cash, and media exposure, but variables that are
negatively associated with their empowerment, such as higher age at first marriage,
should also be taken into account.
Keywords Women’s empowerment · Physical abuses · Household decision-making · Rwanda
JEL Classification Codes D63 · D91 · I15 · I25 · J12
A. Musonera
MIFOTRA-SPIU, Kigali, Rwanda
e-mail: abdoumusonera@gmail.com
A. Heshmati (✉)
Jönköping International Business School (JIBS),
Jönköping University, Jönköping, Sweden
A. Heshmati
Department of Economics, Sogang University, Seoul, South Korea

DOI 10.1007/978-981-10-4451-9_2
2.1 Introduction
In recent years, a range of organizations have increasingly shown commitment to
women’s empowerment; they have also realized that empowering women is a
win-win situation that benefits both women and society. Golla et al. (2011) claim
that women’s economic empowerment is fast becoming a key instrument in pro-
moting their abilities to achieve their rights and well-being which subsequently
reduces household poverty and increases economic growth, productivity, and
efficiency.
There is a growing body of literature which recognizes the social and economic
importance of involving women in the development process. Some literature
focuses on spillover benefits resulting from allowing women to have greater control
over resources and the impact that this has on the health and education of their
children and on better well-being prospects for future generations (The World Bank
Poverty, Inequality and Gender Group 2012). Other literature pays particular
attention to the relationship between women’s empowerment and health outcomes
(see, for example, Abadian 1996; Bloom et al. 2001; Fotso et al. 2009; Larsen and
Hollos 2003; Lee-Rife 2010; Patrikar et al. 2014; Sado et al. 2014; Schuler et al.
1996, 1997; Upadhyay and Karasek 2012; Upadhyay et al. 2014).
A great deal of previous literature on women’s empowerment focused on two
indicators of their empowerment—household decision-making and self-esteem
(El-Halawany 2009; Ghuman et al. 2004; Kishor and Subaiya 2008; Mahmud and
Tasneem 2014; Mahmud et al. 2012; Malhotra and Mather 1997; Sado et al. 2014).
However, other studies have described the role of women’s access to finance and
labor force participation in the empowerment process (Ali et al. 2014; Allendorf
2007; Allsopp and Tallontire 2014; Faridi et al. 2009; Ganle et al. 2015; Naqvi and
Shahnaz 2002). Together, these studies provide evidence that measurement issues
still exist in the process of translating ‘evidence of empowerment’ and ‘access to
sources of empowerment’ into agency especially using cross-sectional survey data
(Kishor and Subaiya 2008) and thus highlight the need for going beyond structural
and merely simplistic factors (family, social, and economic) to be able to measure
women’s empowerment in a comprehensive way (Malhotra and Mather 1997). In
the same vein, Ghuman et al. (2004) argue that difficulties in measuring women’s
empowerment call for an in-depth understanding of gender relations by spending
enough time in the community and doing pre-testing.
There is also evidence from around the world of the positive effects of women’s
empowerment, and internationally recognized knowledge about the channels of
empowerment and their effects. For example, the World Bank Poverty and Gender
Group Report (2012) shows that women’s control over resources creates spillover
benefits that have a significant positive impact on the health and education of
children, thus leading to better well-being prospects for future generations.
Similarly, Golla et al. (2011) highlight women’s empowerment as one of the key
drivers in promoting their abilities, rights, and well-being which subsequently
reduce poverty and increase economic growth, productivity, and efficiency.
However, very few empirical studies use Rwandan data; examples are Ali et al.
(2014), in their study on the environmental and gender impact of land tenure
regularization in Africa, and Mukashimana and Sapsford (2013), in their study on
marital conflicts in Rwanda.
In this study, we investigate the determinants of women’s empowerment in
Rwanda, especially what determines household decision-making and self-esteem.
We address two questions: whether variables of sources of empowerment (education,
employment for cash, regular media exposure, and wealth) have a significant
positive association with women’s empowerment, and whether some variables of
‘setting’ (age of the respondent and children ever born) are positively related to
women’s empowerment while others, such as residence and the age at first marriage,
are negatively associated with it.
Data used in the current study are from the Demographic and Health Survey
(DHS) conducted in 2010 by the National Institute of Statistics for Rwanda (NISR
2010a, 2013). Respondents were married women aged between 15 and 49.
A multiple regression analysis was used to empirically analyze the determinants of
women’s empowerment in Rwanda. A multinomial logistic regression was also
used to examine the relationship between household decision-making, justifications
about wife beating, and women’s empowerment covariates.
We found evidence that women’s empowerment can be achieved through pro-
viding education, media exposure, labor force participation, shifting negative tra-
ditional cultural norms (such as giving respect to women with more children,
marrying girls at an earlier age), and by focusing on integrated development.
The rest of this paper is organized as follows: The next section reviews literature
on the relationship between women’s empowerment and health outcomes, labor
force participation, access to finance and cultural norms. Section 2.3 describes the
empirical strategy. After an overview of the findings in Sect. 2.4, these are dis-
cussed in Sect. 2.5. The last section gives a conclusion.
2.2 Literature Review
We review literature from three perspectives: The first is concerned with the defi-
nitions of women’s empowerment. The second pertains to the determinants of
women’s empowerment and the association between their empowerment and dif-
ferent health outcomes, cultural norms and the influence of labor force participation
and women’s access to finance on their empowerment. The third strand relates to
the conceptual framework.
2.2.1 Definitions of Women’s Empowerment
Several attempts have been made by authors to improve upon definitions of
women’s empowerment. Empowerment is a continuous, phased, and relational
process that occurs across scales and pathways (Goldman and Little 2014). Allsopp
and Tallontire (2014) define empowerment as a dynamic process that follows a
series of sequential steps in which ownership of one type of power increases the
likelihood and the ability to exercise other forms of power thus creating a positive
‘power spiral.’
Kabeer (2005) views the empowerment concept as revolving around the idea of
power to make a choice and conceptualizes disempowerment as the denial of the
possibility of making a choice by people who deserve to make the choice. Put
differently, empowerment can be conceptualized as a dynamic process by which
people who were previously deprived of the ability to make a choice gain such an
ability. For this to happen and the choice to be successful, there should be the
capacity or possibility to choose otherwise.
Empowerment is a person’s potential to make functional choices, that is, the
ability to translate choices into desired outcomes and actions (Alsop and Heinsohn
2005). Kishor and Subaiya (2008) define empowerment as a process that enables
powerless people to have control over the circumstances of their lives. The idea
behind this is not power to dominate over others but power to achieve goals and
ends, and this process appears to be affected by different social, cultural and eco-
nomic factors (Upadhyay et al. 2014).
Empowerment is a process which results from two milestones—agency and
opportunity structure. Agency is defined as the potential to make effective choices,
and opportunity structure is conceptualized as the environment/context in which
individuals exercise agency or pursue their interests including institutional, political
and social contexts, and societal informal rules and norms (Samman and Santos
2009).
However, three main concepts should be analyzed cautiously while defining and
measuring empowerment—the existence of choice (whether a choice exists), use of
choice (whether individuals use a chance to choose), and the achievement of choice
(whether the choice generates desired outcomes/results) (Samman and Santos
2009).
Choices can be either first-order choices, that is, ‘strategic life choices’ (choice of
livelihood, choice of residence, choice of a partner, whether to have children or not
and the number of children to have, who has rights over the children, freedom of
movement, and the choice of friends), or second-order choices, which are choices that
are not strategic to life (Kabeer 1999a, b). The potential to make strategic life choices can
be conceptualized in the form of three dimensions or ‘moments’—resources
(pre-conditions to empowerment), agency (process), and achievements (outcomes).
According to Kabeer (2005), agency can be passive (action taken when the
choice is limited), active (meaningful and purposeful choice), of greater effectiveness
(carrying out their roles and responsibilities), or transformative (capacity
to act on the restrictive aspects of roles and responsibilities and being able to
challenge them).
2.2.2 Some Major Theories on Women’s Empowerment
In the new global economy, women’s empowerment has become a central issue for
countries to achieve development goals such as economic growth, poverty reduc-
tion, health, education, and welfare (Golla et al. 2011). Of late there is a renewed
interest in the relationship between women’s empowerment and health outcomes.
Some of these theories focus on women’s empowerment and health care use
(Bloom et al. 2001; Fotso et al. 2009; Lee-Rife 2010; Sado et al. 2014). Women’s
empowerment has been identified as a driving force in ensuring improved maternal
health care (Sado et al. 2014). The place of delivery is mainly influenced by wealth,
education, and demographic and health covariates, while autonomy,
decision-making and freedom of movement are found to have little influence on the
place of delivery (Fotso et al. 2009).
Women’s involvement in decision-making and their attitudes toward negative
cultural norms such as domestic violence have been highlighted as the main
determinants in the use of maternal healthcare services (Sado et al. 2014).
Overall, these studies highlight the need for policy actions that focus not only on
education but also on other factors that are likely to enhance health status with the
aim of improving health outcomes for women and their families.
However, a majority of these maternal health studies mainly focus on women’s
individual-level variables such as age, education, and income or community-level
factors while little attention is paid to the effect of bargaining powers within
households. Thus, without an unbiased and accurate measurement of power,
decision-making processes and different paths through which they affect repro-
ductive health outcomes, our understanding of the covariates of maternal health and
child health is incomplete.
A large and growing body of literature has investigated the association between
women’s empowerment and fertility preferences (Abadian 1996; Al Riyami et al.
2004; Larsen and Hollos 2003; Patrikar et al. 2014; Schuler et al. 1996; Upadhyay
and Karasek 2012; Upadhyay et al. 2014). Fertility preferences are mainly influ-
enced by women’s resource control, freedom of movement, and freedom from
household domination. The most striking result to emerge from the data is that all
three variables exert little influence on contraceptive use (Schuler et al. 1996).
The results are not consistent with regard to the number of children because some of
the studies show a negative relationship between women’s empowerment and the
number of children, while others show that there is a positive connection between
women’s empowerment and fertility preferences (having children or not). A few
studies also show that there is no connection between empowerment and fertility
preferences (Upadhyay et al. 2014).
Women’s access to fundamental freedoms and increased access to and control
over resources not only improve their welfare but also contribute to a reduction in
fertility (Abadian 1996). Women’s autonomy, as measured by the level of educa-
tion, age at first marriage, and spousal age difference, is inversely associated with
fertility (Abadian 1996). Wealth is likely not only to increase access to health care
and reduce child mortality rates but also to increase access to education and
reduce child labor through increased chances for children to attend school
(Abadian 1996). Larsen and Hollos (2003) postulate that the progression from
having one child to the next declines owing to the status of women, especially free
partner choice, women’s education, and household wealth. Attitudes toward wife
beating have a negative relationship with a small ideal number of children while
household decision-making and positive attitudes toward violence are strongly
associated with a larger ideal number of children (Upadhyay and Karasek 2012).
However, these findings suggest the need for further research to determine the most
appropriate empowerment measures that are context-specific. These findings also
highlight the need to emphasize not only factors enhancing health outcomes but
also other factors that are driving forces for an improved quality of life.
A lot of previous research on women’s empowerment has mainly focused on the
determinants of women’s empowerment indicators, including household
decision-making and self-esteem (El-Halawany 2009; Ghuman et al. 2004; Kishor
and Subaiya 2008; Mahmud and Tasneem 2014; Mahmud et al. 2012; Malhotra and
Mather 1997; Sado et al. 2014; Trommlerova et al. 2015). Measuring a dynamic
process like women’s empowerment necessitates indicators that measure the end
result, that is, indicators that measure evidence of empowerment, the various
sources of empowerment, and the setting of empowerment. Potential sources of
empowerment are defined as those factors which provide a basis for empowerment,
including knowledge, media exposure, and access to and control over resources (as
explained by being employed for cash). Indicators of the setting for empowerment
are those conditions that reflect both the past and current environments of the
respondents, and these factors appear to condition the views and the chances
available for women (Kishor and Subaiya 2008).
Empowerment is largely determined by education, age, economic activity,
country of residence, and being a polygamous married male (see Trommlerova
et al. 2015). Kishor and Subaiya (2008) argue that social development indicators
such as education are positively associated both with taking decisions alone and
jointly. They further show that women’s empowerment is largely determined by
access to and control over resources, indicators of sources of empowerment (edu-
cational attainment, employment for cash and media exposure) and a setting of
empowerment including indicators such as a higher age at first marriage and smaller
spousal age difference.
A positive association has been found between household decision-making and
other factors related to women’s economic empowerment (Sado et al. 2014).
Household wealth is a strong determinant of resource control but it has a significant
negative association with women’s overall household decision-making, and the
association between covariates and different empowerment indicators is not
consistent (Mahmud et al. 2012). Factors associated with sources of empowerment
(employment, education, and wealth status) have higher explanatory powers than
factors related to the setting of empowerment (age and family structure) (see Sado
et al. 2014).
Mahmud et al. (2012) show that there is no association between women’s
freedom of mobility and household wealth. This is not surprising because freedom
of mobility is high for the poorest women who are always obliged to travel outside
their homes to participate in the labor force. They further state that women from
wealthier households are less likely to have a say in household decision-making;
instead, they tend to have the view that their voice is not relatively worthwhile but
there is a high likelihood of their having access to cash for spending. Conversely
and surprisingly, residing in an extended family increases the likelihood of a
woman having high decision-making powers and self-esteem (Sado et al. 2014).
However, there are variations and differences in the nature and determination of
financial, social, and organizational dimensions which imply that women’s control
over one of the family aspects does not necessarily imply control over other aspects.
For example, while education and employment are the main determinants of a
woman’s input in financial decision-making, these variables exert no influence on
social and organizational related household decision-making.
Three important themes emerge from studies on the determinants of women’s
empowerment discussed so far: (i) measurement issues still exist while translating
‘evidence of empowerment’ and ‘access to sources of empowerment’ into agency
especially using cross-sectional survey data; (ii) it is very important to go beyond
structural and merely simplistic factors (family, social, and economic) to measure
women’s empowerment in a comprehensive way; and (iii) these difficulties in
measuring women’s empowerment call for an in-depth understanding of gender
relations by spending enough time in the community and doing pre-testing.
2.2.3 Conceptual Framework
Women get empowered through two pathways (different ways of being and
experience sharing) that operate individually. However, it is also found that a
woman’s potential to attain positive outcomes is accelerated when she possesses
more than one pathway (Allsopp and Tallontire 2014). The level of empowerment
in a village depends on different pathways (personal, economic, and political) and
linkages across scale ranging from personal bodies and household relations to the
community (Goldman and Little 2014). Kabeer (1999a, b) points out that women’s
empowerment is conceptualized as a three-dimensional process that encompasses
resources or pre-conditions of empowerment, agency, or process and achievements
that measure outcomes. Kabeer further argues that women’s potential to exercise
strategic life choices is conceptualized in terms of three dimensions or moments for
the social change process to be completed:
Resources (pre-conditions) → agency (process) → achievements (outcomes)
Kabeer (2001) conceptualizes empowerment in terms of agency, resources, and
achievements. Kishor and Subaiya (2008) conceptualize the empowerment process
in terms of evidence and sources of empowerment but acknowledge that the extent
of translating evidence on empowerment and access to sources into agency and the
capacity to make a choice and act upon it is not yet measured. Samman and Santos
(2009) claim the importance of three indicators of empowerment: source, evidence,
and setting.
Measuring the empowerment process is conceptualized in different domains of
an actor’s life and at different levels (Alsop and Heinsohn
2005). These domains include the state in which people are civic actors, the market
in which persons are economic actors, and society in which they are social actors.
These domains also contain sub-domains which in turn comprise different levels.
For example, the market domain is composed of the sub-domains of credit, labor,
and production and consumption of goods. Society comprises family and
community. There also exist three levels at which empowerment is exercised: the local
level which is contiguous with people’s residence, the intermediate level which is
between the residential and national levels, and finally, the national level which is
thought to be the furthest from an individual.
Kabeer (2005) claims that the empowerment concept can be measured through
three interlinked dimensions—agency, resources, and achievements. Agency is
central to the concept of empowerment and is defined as the process by which a
choice is made and transformed into effect. Resources are conceptualized as a
medium through which agency is exercised and achievements are conceptualized as
outcomes of agency. Similarly, Rowlands (1997) and Samman and Santos (2009)
highlight that agency and empowerment are interrelated concepts, that is,
empowerment does not happen in a vacuum. In the categorization of power,
Rowlands classifies empowerment as a process by which people gain power over
(resistance to manipulation), power to (ability to create new possibilities), power
with (ability to be an actor in a group), and power from within (enhancing
self-respect and self-acceptance).
Alsop and Heinsohn (2005) postulate that the level of empowerment for a given
person is associated with his/her personal capacity to make meaningful and pur-
posive choices (agency) and the institutional environment in which the choices are
made (opportunity structure). Similarly, Samman and Santos (2009) argue that
empowerment occurs along different dimensions including economic,
social-cultural, legal, political, and psychological. They further find that agency is
exercised at different levels—the micro-level (household), meso-level (community),
and macro-level (state and the country). The empowerment model consists of five
stages: motivation for action, empowerment support, initial individual action,
empowerment program, and institutionalization and replication (Kar et al. 1999).
2.3 Empirical Strategy
Our study set out to assess what determines women’s empowerment in Rwanda
using household decision-making and self-esteem indicators. The results will
extend our knowledge of variables which are a source and setting of empowerment.
The data used are from the 2010 Demographic and Health Survey (DHS) by the
National Institute of Statistics for Rwanda (NISR 2010b, 2013). The respondents
were married women aged between 15 and 49. A multiple regression analysis was
used to empirically analyze the determinants of women’s empowerment in Rwanda.
A multinomial logistic regression was used to examine the relationship between
household decision-making, justifications for wife beating, and women’s empowerment
covariates.
2.3.1 Model Specification
2.3.1.1 Women’s Empowerment and Its Covariates
In order to provide a proper specification of the model and to conduct a sensitivity
analysis of the results, the baseline model was specified in three ways:
• CEI = f(Age, Educ, Wealth, EmpCash, Resid, Media, Children, AgeFM)
• DEC.IND = f(Age, Educ, Wealth, EmpCash, Resid, Media, Children, AgeFM)
• EST.IND = f(Age, Educ, Wealth, EmpCash, Resid, Media, Children, AgeFM)
where CEI is the cumulative empowerment index which is obtained by com-
bining the decision-making and self-esteem indices. DEC.IND is the
decision-making index. EST.IND is the self-esteem index. Age in age cohorts
represents the age of the respondents classified into four categories (15–19, 20–29,
30–39, and 40–49). Educ is a respondent’s education level (no education, primary
education, secondary education, and higher education). Wealth is a respondent’s
wealth that falls in five categories (poorest, poorer, middle, richer, and richest).
EmpCash is defined as a respondent’s employment status where the respondent can
either be employed for cash or not. Resid is the residence of a respondent (either in
an urban area or a rural area). Media is media exposure that is defined as either
regular media exposure or no-media exposure. The variable Children indicates
number of children ever born (none, 1 or 2, 3 or 4, and 5 and above). AgeFM
represents the age of a respondent at first marriage. This is classified into three
groups (less than 18, 18–24, and 25 years and above).
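As an illustration only, the grouping of the continuous covariates described above could be sketched as follows. The function names and the string labels are hypothetical and are not taken from the DHS files; the cut-offs are those stated in the text.

```python
def age_cohort(age):
    """Map a respondent's age in years to one of the four cohorts in the text."""
    if 15 <= age <= 19:
        return "15-19"
    if 20 <= age <= 29:
        return "20-29"
    if 30 <= age <= 39:
        return "30-39"
    if 40 <= age <= 49:
        return "40-49"
    # The sample described in the text covers women aged 15 to 49 only.
    raise ValueError("age outside the 15-49 range")


def children_group(n_children):
    """Group the number of children ever born as in the text."""
    if n_children < 0:
        raise ValueError("number of children cannot be negative")
    if n_children == 0:
        return "none"
    if n_children <= 2:
        return "1 or 2"
    if n_children <= 4:
        return "3 or 4"
    return "5 and above"


def age_first_marriage_group(age):
    """Group age at first marriage into the three classes in the text."""
    if age < 18:
        return "less than 18"
    if age <= 24:
        return "18-24"
    return "25 and above"
```

Each grouped variable would then enter the regressions as a set of category indicators, with one category omitted as the reference.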
2.3.1.2 Household Decision-Making and Attitudes Toward Physical Abuse
Questions on who had the final say on what to do with a respondent’s earnings,
respondent’s health care, large household purchases, and visits to family or relatives
were asked during the survey. Different responses for each question were labeled
as: others (0), joint decision (1), and decision alone (2). Then, each decision was
used as a dependent variable to determine the likelihood of that decision being
taken given different covariates of women’s empowerment using a multinomial
logistic regression.
Moreover, attitude toward physical abuse (in the survey labeled as wife beating)
was investigated using five questions that were asked to know the circumstances
under which wife beating was justified: going outside without permission,
neglecting children, arguing with husband, burning food and refusing to have sex
with her husband. Responses to the questions were labeled: Yes (1), No (2) and
others (0). Then, a multinomial logistic regression was used to regress each response
on different covariates of women’s empowerment to determine the odds
ratios. The covariates used were the same as those used in the previous model with
women’s empowerment, that is, age group, children ever born, education, media
exposure, employment for cash, residence, wealth and age at first marriage.
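For reference, a standard multinomial logit for a three-category outcome can be written as below. This is a textbook formulation, not a specification taken from the study: the choice of ‘others (0)’ as the base category is an assumption, and x_i denotes the vector of covariates listed above for respondent i.

```latex
\[
P(Y_i = j \mid x_i) = \frac{\exp(x_i'\beta_j)}{1 + \sum_{k=1}^{2} \exp(x_i'\beta_k)}, \quad j = 1, 2,
\qquad
P(Y_i = 0 \mid x_i) = \frac{1}{1 + \sum_{k=1}^{2} \exp(x_i'\beta_k)}
\]
\[
\frac{P(Y_i = j \mid x_i)}{P(Y_i = 0 \mid x_i)} = \exp(x_i'\beta_j)
\]
```

Under this formulation, the exponentiated coefficient on a covariate gives the multiplicative change in the odds of category j relative to the base category, which is the sense in which odds ratios are usually reported for such models.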
This baseline model is related to models used by Kishor and Subaiya
(2008), Sado et al. (2014), Mahmud et al. (2012), and Mahmud and Tasneem
(2014). Kishor and Subaiya (2008) point out that women’s empowerment is largely
determined by access to and control over resources, indicators of sources of
empowerment (educational attainment, employment for cash and media exposure)
and a setting of empowerment including indicators such as a higher age at first
marriage and smaller spousal age difference.
The main weakness of Kishor and Subaiya’s (2008) study is the paucity of data
on all indicators of women’s empowerment (only data on household
decision-making and attitudes toward wife beating was available) and some of the
covariates that were used in previous studies. Another weakness of their study is
that the results might have been affected by measuring women’s empowerment
using data which contained missing values.
2.3.1.3 Data and Variables
Data used in our study were obtained from the Demographic and Health Survey
(DHS 2010a). The respondents were married women aged between 15 and 49.
Women’s empowerment was investigated using two indicators—household
decision-making and attitudes toward gender roles.
A. Dependent variables
The dependent variables used in our study were the cumulative empowerment
index (the main component) and its constituents, that is, the decision-making index,
the self-esteem index, decision-making (alone and jointly) and agreeing with jus-
tifications for wife beating (yes or no).
The decision-making index
Respondents were asked different questions regarding who had the final say on
different household decisions such as respondent’s health care, visits to family and
relatives, large household purchases and decision on what to do with the money that
the husband earned. The responses were coded 1 if the decision was taken by the
respondent alone, 2 if the decision was jointly taken by the respondent and her
husband, 3 if the decision was taken by the respondent and another person, 4 if the
decision was taken by the husband/partner alone, 5 if the decision was taken by
someone else, and 6 for others.
The decision-making index was computed by assigning scores to different
responses. A (2) was assigned to every response where the decision was taken alone
by the respondent, (1) was assigned to every response where the decision was
jointly taken and (0) otherwise. Then, individual scores for the different decisions
were added to get total scores out of 10 (10 is the maximum score), that is, 2 (marks
maximum/decision) * 5 questions.
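The scoring rule above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and treating response code 3 (respondent and another person) as a joint decision is an assumption based on the labeling described later in the text. The number of decisions is taken from the input rather than fixed, and the sketch also returns the score rescaled to the 0–1 interval.

```python
# Response codes as described in the text:
# 1 = respondent alone, 2 = respondent and husband jointly,
# 3 = respondent and another person, 4 = husband/partner alone,
# 5 = someone else, 6 = others.
# Assumption: codes 2 and 3 both count as "jointly taken".
SCORE_BY_CODE = {1: 2, 2: 1, 3: 1}


def decision_making_index(codes):
    """Return (raw score, score rescaled to 0-1) for one respondent.

    codes: list of response codes, one per household decision.
    """
    if not codes:
        raise ValueError("at least one decision is required")
    # Any code not listed in SCORE_BY_CODE scores 0.
    raw = sum(SCORE_BY_CODE.get(code, 0) for code in codes)
    max_score = 2 * len(codes)  # 2 marks maximum per decision
    return raw, raw / max_score
```

For example, a respondent who decides alone on two questions, jointly on two and not at all on one would obtain 6 out of 10, or 0.6 after rescaling.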
The self-esteem index
Respondents were asked questions about their attitudes toward gender roles and
norms. They were also asked whether wife beating was justified under one of the
following circumstances:
• When she goes out without telling her husband.
• If she neglects the children.
• If she argues with her husband.
• If she refuses to have sex with her husband.
• If she burns the food.
Responses were coded (1) if the respondent said yes and (0) if the respondent
said no.
In our study, the scores assigned to different responses were: (1) for every
response where the respondent said no and (0) for every response where the
respondent answered yes. Finally, individual scores were added to get the total
scores out of five (maximum 1 mark *5 questions).
The value of either the decision-making index or the self-esteem index should
fall in the interval 0–1 or alternatively between 0 and 100%.
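A comparable sketch for the self-esteem index is given below. The function name and the use of the strings "yes" and "no" as inputs are assumptions made for illustration; the scoring (1 for rejecting a justification, 0 for accepting it) follows the text.

```python
def self_esteem_index(answers):
    """Return (raw score, score rescaled to 0-1) for one respondent.

    answers: list of "yes"/"no" strings, one per justification for
    wife beating ("yes" means the respondent agrees it is justified).
    """
    answers = list(answers)
    if not answers:
        raise ValueError("at least one answer is required")
    raw = 0
    for answer in answers:
        key = answer.strip().lower()
        if key not in ("yes", "no"):
            raise ValueError("answers must be 'yes' or 'no'")
        # Rejecting a justification scores 1; accepting it scores 0.
        if key == "no":
            raw += 1
    return raw, raw / len(answers)
```

A respondent who rejects three of the five justifications would score 3 out of 5, or 0.6 on the 0–1 scale.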
The cumulative empowerment index
While conducting DHS, the respondents were not asked to assign weights to
different indicators of women’s empowerment. Therefore, we assumed that all the
indicators had the same weight and then computed the cumulative empowerment
index using a nonparametric method as indicated by:
$$\mathrm{CEI} = \frac{W_1 \cdot \mathrm{Dec.Index} + W_2 \cdot \mathrm{S.Est.Index}}{2}$$
where W1 and W2 are the weights attached to the two women’s empowerment indices
in the aggregation.
Dec.Index is the decision-making index which was obtained by adding the
scores obtained from responses to different questions about household
decision-making.
S.Est.Index is the self-esteem index which was obtained by adding scores of
different responses about respondents’ attitudes toward justifications for wife
beating.
The same approach for computing women’s empowerment has been followed
in previous studies such as Lee-Rife (2010), Mahmud and Tasneem
(2014), Mahmud et al. (2012), Patrikar et al. (2014), Sado et al. (2014), Sultana and
Hossen (2013), and Upadhyay and Karasek (2012).
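The aggregation can be illustrated with a short sketch. It assumes that both indices have already been rescaled to the 0–1 interval and that equal weights correspond to W1 = W2 = 1; the function name and these defaults are illustrative and are not stated in the text.

```python
def cumulative_empowerment_index(dec_index, est_index, w1=1.0, w2=1.0):
    """Combine the two indices as CEI = (W1 * Dec.Index + W2 * S.Est.Index) / 2.

    dec_index, est_index: indices assumed to be rescaled to the 0-1 interval.
    w1, w2: weights; the defaults of 1.0 are an assumption for equal weighting.
    """
    for name, value in (("dec_index", dec_index), ("est_index", est_index)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(name + " must lie between 0 and 1")
    return (w1 * dec_index + w2 * est_index) / 2
```

With these defaults, a decision-making index of 0.5 and a self-esteem index of 1.0 give a cumulative index of 0.75, which is the simple average of the two.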
Decision-making (alone or jointly)
Different decisions were labeled according to who took the decision. Any
decision that was taken by the respondent herself was labeled (2). A decision that
was jointly taken by the respondent and her husband or by the respondent and
another person was labeled (1). Finally, other possible options mentioned earlier
were labeled (0).
Agreeing with justifications for wife beating
Agreement with any of five reasons was coded (1) while rejection of wife
beating for any of the five reasons was coded (2). Others were coded (0).
This type of computation is consistent with that used by Kishor and Gupta
(2004) and Kishor and Subaiya (2008).
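The two recodings used as outcomes in the multinomial models can be sketched as follows. The function names are hypothetical, the input codes for decisions are those listed for the decision-making index, and the use of "yes"/"no" strings for the attitude questions is an assumption made for illustration.

```python
def recode_decision(response_code):
    """Recode a decision response: alone -> 2, jointly -> 1, others -> 0.

    Codes 2 (with husband) and 3 (with another person) both count as joint.
    """
    if response_code == 1:
        return 2
    if response_code in (2, 3):
        return 1
    return 0


def recode_wife_beating(answer):
    """Recode an attitude response: agree -> 1, reject -> 2, others -> 0."""
    key = answer.strip().lower()
    if key == "yes":
        return 1
    if key == "no":
        return 2
    # Any other response (for example, "don't know") falls in the residual group.
    return 0
```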
B. Independent variables
Women’s empowerment covariates include variables at household and com-
munity levels. These variables include age in years, children ever born, regular
exposure to media, employment for cash, age at first marriage, residence in urban
area, spousal age difference, and household wealth. Some of these variables are
considered the potential sources of empowerment, specifically age, media exposure,
educational level, and employment for cash. Other variables are conceptualized as
aspects of a setting for empowerment (nuclear family and urban residence, wealth,
age at first marriage and spousal age difference) (Kishor and Subaiya 2008).
Age: A woman’s age is positively associated with her level of empowerment, as
believed by a majority of religions around the world, especially when women’s
empowerment is measured using indicators of household
decision-making. Nonetheless, when empowerment is measured using indicators of
attitudes toward gender equality, it is not clear whether empowerment is positively
associated with age.
Number of children ever born: More respect is accorded to women who have
children. Nonetheless, it is hard to predict the direction of causality between the
number of children ever born and attitudes to gender roles.
Education and media exposure: Education and media exposure equip women
with information and means that can allow them to effectively adapt to the changing
modern world thus increasing their level of empowerment. People with higher
education are exposed to new ideas and alternative behaviors and gender norms and
roles. Thus, education is a critical source of empowerment. For example, women
with higher education are less likely to accept wife beating for any reason and are
more likely to believe that it is a woman’s right to refuse sex with her husband.
Employment for cash: Earning cash is likely to increase women’s bargaining
power within households. It gives women a sense of personal achievement and
raises awareness that they, like men, can provide financial support for their
families. In addition, off-farm professional occupations potentially empower
women through financial autonomy, alternative sources of identity, and social
exposure to new structures of power free of kin networks (Kishor and Subaiya 2008).
Media exposure: Access to media (watching television on a regular basis,
reading newspapers, and listening to the radio frequently) is expected to work in
the same direction as education, as it too exposes women to new ideas and gender
roles and norms. The expectation is that women with frequent exposure to media
are less likely to accept that wife beating is justified for any reason and
more likely to accept that it is a woman’s right to refuse sex with her husband
when necessary.
Age at first marriage: A younger age at first marriage is negatively associated
with women’s empowerment as it puts to an end a woman’s chances to have access
to sources of empowerment like education (Kishor and Subaiya 2008). In addition,
a younger age at first marriage is associated with a high probability of a woman
agreeing that wife beating is justified for any reason.
Urban residence: Cities bring together people from different backgrounds who do a
variety of off-farm jobs and have access to a variety of services, including easy
access to education and regular media exposure. Hence, compared to rural women,
urban women are more likely to reject wife beating for any reason and to hold
the view that they have the right to refuse sex with their husbands.
Wealth: Wealth and gender equality do not go hand in hand easily. On the one
hand, household wealth is a source of empowerment as it brings education,
exposure to media and exposure to networks of intellectuals, but on the other hand,
wealthier households are more likely to be strongly attached to patriarchal gender
norms.
Husband’s education: A husband’s education level, especially secondary edu-
cation and above, is likely to have a positive association with women’s
empowerment.
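The covariates listed above are categorical, and Table 2.1 reports each of them against a reference category (for example, "None" for education). As a minimal sketch of how such coding could work, the Python snippet below expands one respondent's categories into 0/1 indicators. The function name, the dictionary layout, and the sample respondent are illustrative assumptions, not the authors' procedure.

```python
# Illustrative sketch only: category labels follow Table 2.1, but the
# function, data layout, and sample respondent are hypothetical.

# The first label in each list is treated as the reference category.
CATEGORIES = {
    "education": ["None", "Primary", "Secondary", "Higher"],
    "residence": ["Urban", "Rural"],
    "paid_work": ["No paid work", "Paid work"],
}


def dummy_code(respondent):
    """Return 0/1 indicators for every non-reference category."""
    indicators = {}
    for variable, levels in CATEGORIES.items():
        value = respondent[variable]
        if value not in levels:
            raise ValueError(f"unknown level {value!r} for {variable}")
        for level in levels[1:]:  # skip the reference category
            indicators[f"{variable}={level}"] = int(value == level)
    return indicators


example = {"education": "Secondary", "residence": "Rural", "paid_work": "No paid work"}
coded = dummy_code(example)
print(coded)
```

A respondent in the reference category of a variable gets zeros for all of that variable's indicators, which is why reference rows in the tables carry no coefficient.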
2.4 Empirical Results
The results of a linear regression analysis between women’s empowerment
(cumulative empowerment index, decision-making index, and self-esteem index) and
its covariates are presented in Table 2.1. The results of a multinomial logistic
regression analysis between women’s empowerment indicators (taking decisions
alone or jointly, and attitudes toward justifications for wife beating) and women’s
empowerment covariates are summarized in Tables 2.2, 2.3, and 2.4.
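For reference, the two specifications named above can be written in generic textbook form. The notation ($Y_i$, $D_i$, $x_i$, $\beta$, $\gamma_j$, $J$) is ours and is only a sketch; the chapter does not state its exact equations here.

```latex
% Linear regression for an empowerment index Y_i with covariate vector x_i
Y_i = \beta_0 + x_i^{\top}\beta + \varepsilon_i

% Multinomial logit for outcome category j = 1, ..., J (category 0 as base)
\Pr(D_i = j \mid x_i) =
  \frac{\exp(\alpha_j + x_i^{\top}\gamma_j)}
       {1 + \sum_{k=1}^{J} \exp(\alpha_k + x_i^{\top}\gamma_k)}

% Odds ratio for a one-unit change in covariate m, relative to the base category
\mathrm{OR}_{jm} = \exp(\gamma_{jm})
```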
2.4.1 Relationship Between Women’s Empowerment and Its Covariates
Table 2.1 depicts the relationship between women’s empowerment and its
covariates. In column 1, it gives the association between the cumulative empow-
erment index and its covariates. It is apparent from this column that there is a
significant positive correlation between women’s empowerment and some of its
covariates such as age, number of children ever born, education, employment for
cash, exposure to media, and wealth. Younger women in their twenties are less
likely to be empowered (0.0274) than older women aged 40–49 (0.0339). The results
show that women with more children (five and above) are more likely to be
empowered (0.160) than women with fewer children (one or two), whose coefficient
is only 0.114. The results also indicate that women with higher education are more
empowered (0.171) than those with primary education (0.0365). Similarly,
employment for cash and media exposure are positively associated with the
cumulative empowerment index (see Table 2.1, column 1). Women in wealthier families
are more likely to be empowered (0.0525) than those from poor families
(0.0190).
The same direction of association is observed for the
decision-making index (see Table 2.1, column 2). These results match those
observed in previous studies, in which women’s empowerment was found to be positively
associated with education levels, age, household wealth (income), and employment
status (for example, Sultana and Hossen 2013). Likewise, Khan and Noreen (2012)
found that women’s empowerment was mainly determined by age, husband’s
education, assets inherited from the father, number of children alive, and the
amount of microfinance.
In contrast, living in a rural area was negatively associated with the cumulative
empowerment index, and age at first marriage was negatively associated with both
the cumulative empowerment and decision-making indices. Moreover, the results
reveal a significant positive association between self-esteem and variables such
as education, wealth, and age of the respondent (see Table 2.1, column 3). Women
with higher education had higher levels of self-esteem (0.268) than those with
primary education (0.0527). Women from wealthier families had higher self-esteem
(0.080) than those from poor
Table 2.1 Women’s empowerment and its covariates
Columns: (1) Cumulative empowerment index; (2) Decision-making index; (3) Self-esteem index
Age groups
15–19 (Ref.)
20–29 0.0274*** 0.0525*** 0.00225
(5.22) (9.62) (0.26)
30–39 0.0455*** 0.0591*** 0.0320**
(6.45) (8.04) (2.77)
40–49 0.0339*** 0.0230* 0.0448**
(3.58) (2.33) (2.89)
Children categories
None (Ref.)
1 or 2 0.114*** 0.223*** 0.00424
(22.15) (41.77) (0.50)
3 or 4 0.134*** 0.276*** −0.00743
(21.43) (42.32) (−0.72)
5 and above 0.160*** 0.332*** −0.0116
(21.72) (43.25) (−0.96)
Education
None (Ref.)
Primary 0.0365*** 0.0203*** 0.0527***
(7.34) (3.92) (6.47)
Secondary 0.104*** 0.0193** 0.188***
(15.18) (2.71) (16.83)
Higher 0.171*** 0.0730*** 0.268***
(12.19) (5.01) (11.71)
Employment for cash
No paid work (Ref.)
Paid work 0.0202*** 0.0332*** 0.00734
(5.26) (8.28) (1.17)
Media exposure
No regular media
exposure (Ref.)
Regular media exposure 0.0159*** 0.0237*** 0.00820
(4.54) (6.47) (1.43)
Residence
Urban (Ref.)
Rural −0.0230*** −0.00642 −0.0396***
(−4.36) (−1.17) (−4.59)
Age at first marriage
Less than 18 years (Ref.)
18–24 years −0.0238** −0.0473*** −0.000
(−3.10) (−5.91) (−0.02)
25 years and above −0.0281** −0.0578*** 0.0014
(−2.79) (−5.49) (0.09)
Wealth index
Poorest (Ref.)
Poorer 0.0190*** 0.0103 0.0278**
(3.52) (1.82) (3.14)
Middle 0.0295*** 0.0102 0.0488***
(5.37) (1.78) (5.43)
Richer 0.0381*** 0.0194*** 0.0568***
(6.77) (3.31) (6.17)
Richest 0.0525*** 0.0250*** 0.0800***
(8.44) (3.86) (7.86)
Cons 0.265*** −0.0639*** 0.593***
(31.02) (−7.20) (42.48)
N 13,671 13,671 13,671
Note: t-statistics in parentheses
*p < 0.05, **p < 0.01, ***p < 0.001
families (0.0278). However, rural residence was negatively associated with
self-esteem, whereas age at first marriage showed no significant association with
it (see Table 2.1, column 3).
These results are in agreement with those obtained by Kishor and Subaiya
(2008) who found that women in urban areas were more likely to reject wife beating
as compared to women in rural areas and younger age at first marriage was asso-
ciated with a high likelihood of accepting justifications for wife beating.
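Table 2.1 reports t-statistics in parentheses and marks significance with stars. As a rough sketch of how the two relate, the snippet below converts a t-statistic into a two-sided p-value using the normal approximation (reasonable here only because N = 13,671 is large) and assigns stars at the thresholds given in the table note. The function names are hypothetical, and the authors' software may compute p-values differently.

```python
import math


def two_sided_p(t_stat):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


def stars(t_stat):
    """Significance stars using the thresholds in the note to Table 2.1."""
    p = two_sided_p(t_stat)
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# t-statistics taken from the 40-49 age-group row of Table 2.1
for label, t in [("cumulative", 3.58), ("decision-making", 2.33), ("self-esteem", 2.89)]:
    print(label, round(two_sided_p(t), 4), stars(t))
```

With these three inputs the assigned stars agree with those printed in the 40–49 row of Table 2.1.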
2.4.2 Determinants of Household Decision-Making
Tables 2.2 and 2.3 present odds ratios (using a multinomial logistic regression) for
respondents’ decision-making (jointly and alone) on five household decisions—
what to do with a respondent’s earnings, respondent’s health care, large household
purchases, visits to family or relatives, and what to do with the money that the
husband earns. Women in their twenties had higher odds of taking decisions
alone on all five aspects than older women. Table 2.2 shows that
women with more children (five and above) were more likely to take the five
Table 2.2 Odds ratios (using a multinomial logistic regression) for household decision-making (alone)
Columns: (1) What to do with respondent’s earnings; (2) Respondent’s health care; (3) Large household purchases; (4) Visits to family and relatives; (5) What to do with husband’s earnings
Age groups
15–19
20–29 1.902*** 1.919*** 2.039*** 2.117*** 1.985***
(9.81) (12.42) (13.56) (14.20) (13.58)
30–39 1.741*** 1.805*** 2.020*** 2.039*** 1.850***
(8.60) (10.94) (12.65) (12.79) (11.87)
40–49 1.304*** 1.480*** 1.678*** 1.482*** 1.376***
(5.96) (8.08) (9.53) (8.33) (7.98)
Children categories
None
1 or 2 2.318*** 2.405*** 2.354*** 2.524*** 2.412***
(23.88) (28.28) (29.94) (32.17) (30.50)
3 or 4 2.588*** 2.822*** 2.556*** 2.839*** 2.628***
(24.24) (29.22) (28.69) (31.30) (29.30)
5 and above 2.806*** 3.305*** 2.979*** 3.494*** 3.101***
(23.68) (30.22) (29.49) (33.28) (30.47)
Education
No education
Primary 0.0980 0.208*** 0.133* 0.151* 0.158**
(1.51) (3.33) (2.25) (2.45) (2.70)
Secondary 0.149 0.0228 0.0248 0.0289 0.0267
(1.45) (0.23) (0.27) (0.30) (0.29)
Higher 0.904*** 0.516** 0.733*** 0.707*** 0.613***
(4.56) (2.60) (4.00) (3.73) (3.38)
Employment for cash
No paid work
Paid work 1.186*** −0.0383 0.127* 0.190*** 0.0558
Exposure to media
No media exposure
Low media exposure 0.359*** 0.365*** 0.430*** 0.463*** 0.471***
(6.33) (6.75) (8.41) (8.71) (9.26)
High media exposure 0.344* 0.442** 0.273 0.362* 0.464**
(2.20) (2.89) (1.89) (2.43) (3.25)
Residence
Urban
Rural 0.138 0.0805 0.0677 0.0465 0.192**
(1.73) (1.06) (0.95) (0.63) (2.71)
Age at first marriage
Less than 18
18–24 years −0.246** −0.454*** −0.369*** −0.502*** −0.439***
(−2.61) (−4.92) (−4.25) (−5.43) (−5.10)
25 and above −0.311* −0.693*** −0.509*** −0.607*** −0.575***
(−2.45) (−5.65) (−4.44) (−5.03) (−5.05)
Wealth index
Poorest
Poorer 0.250** 0.289*** 0.140* 0.259*** 0.193**
(3.21) (3.89) (1.99) (3.54) (2.77)
Middle 0.311*** 0.291*** 0.224** 0.324*** 0.244***
(3.90) (3.80) (3.11) (4.31) (3.41)
Richer 0.404*** 0.560*** 0.379*** 0.527*** 0.440***
(4.97) (7.13) (5.13) (6.81) (5.98)
Richest 0.473*** 0.542*** 0.442*** 0.500*** 0.493***
(5.09) (6.07) (5.26) (5.71) (5.90)
Cons −6.478*** −5.230*** −5.141*** −5.317*** −5.242***
(−29.19) (−28.86) (−29.33) (−29.88) (−30.49)
N 13,671 13,671 13,671 13,671 13,671
*p < 0.05, **p < 0.01, ***p < 0.001
household decisions alone than women with fewer children. The results
also show that women with higher education had higher chances of taking decisions
alone than those with primary education. Media exposure was found to
increase a respondent’s likelihood of taking decisions alone for all five
questions. Likewise, women from wealthier families had higher odds of
taking decisions alone than those from poor families. Surprisingly,
women with a lower age at first marriage (18–24) were found to be more likely to take
decisions alone than those with a higher age at first marriage. However,
employment for cash influenced taking decisions alone for only some decisions, while
residence influenced decision-making alone only for decisions on the husband’s
earnings.
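Tables 2.2 and 2.3 are labelled as odds ratios, yet several entries are negative, which an odds ratio cannot be. One hedged reading, which is our assumption and not something the text states, is that those entries are log-odds coefficients. Under that assumption, the sketch below shows the standard conversion OR = exp(β), using two entries from column 1 of Table 2.2. The function names are hypothetical.

```python
import math


def odds_ratio(log_odds_coefficient):
    """Convert a logit coefficient to an odds ratio: OR = exp(beta)."""
    return math.exp(log_odds_coefficient)


def percent_change_in_odds(log_odds_coefficient):
    """Percentage change in the odds implied by the coefficient."""
    return (odds_ratio(log_odds_coefficient) - 1.0) * 100.0


# Entries from Table 2.2, column 1 (what to do with respondent's earnings),
# treated here as log-odds coefficients (an assumption, see the lead-in).
examples = {
    "Age at first marriage 18-24": -0.246,
    "Wealth index: Richest": 0.473,
}
for label, beta in examples.items():
    print(f"{label}: OR = {odds_ratio(beta):.3f}, "
          f"change in odds = {percent_change_in_odds(beta):+.1f}%")
```

A negative coefficient maps to an odds ratio below one (lower odds than the reference category), and a positive coefficient to an odds ratio above one.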
As shown in Table 2.3, the odds of joint decision-making for four of the five
questions were high among younger women as compared to older women.
Table 2.3 Odds ratios (using a multinomial logistic regression) for household decision-making (jointly)
Columns: (1) What to do with respondent’s earnings; (2) Respondent’s health care; (3) Large household purchases; (4) Visits to family or relatives; (5) What to do with husband’s earnings
Age groups
15–19
20–29 2.402*** 2.399*** 2.019*** 2.090*** 2.193**
(4.03) (5.66) (3.31) (5.25) (2.95)
30–39 2.613*** 2.585*** 2.576*** 2.184*** 2.461**
(4.33) (6.01) (4.16) (5.36) (3.24)
40–49 2.296*** 2.623*** 2.568*** 2.059*** 2.217**
(3.74) (5.95) (4.05) (4.88) (2.84)
Children categories
None
1 or 2 2.460*** 2.706*** 2.162*** 2.659*** 2.269***
(10.94) (15.54) (7.82) (13.56) (6.91)
3 or 4 2.845*** 3.280*** 2.610*** 3.253*** 2.535***
(12.13) (17.92) (9.11) (15.83) (7.28)
5 and above 3.171*** 3.588*** 2.949*** 3.824*** 3.113***
(12.92) (18.44) (9.87) (17.52) (8.54)
Education
No education
Primary 0.0189 0.175* 0.177 0.0860 0.0294
(0.19) (2.06) (1.47) (0.94) (0.19)
Secondary 0.0915 0.465*** 0.368 0.189 −0.005
(0.57) (3.49) (1.79) (1.21) (−0.02)
Higher 0.611* 1.155*** 0.799 0.710* 0.137
(2.04) (4.59) (1.93) (2.13) (0.27)
Employment for cash
No paid work
Paid work 1.298*** 0.553*** 0.531*** 0.513*** −0.000
(10.54) (6.11) (3.86) (5.20) (−0.00)
Exposure to media
No media exposure
Low media exposure −0.0250 0.241** −0.0912 0.277*** 0.093
(−0.28) (3.21) (−0.85) (3.38) (0.66)
High media exposure −0.0962 0.209 −0.00921 0.0677 0.477
(−0.38) (0.99) (−0.03) (0.26) (1.28)
Residence
Urban
Rural −0.612*** −0.140 −0.528*** −0.223 −0.234
(−5.39) (−1.36) (−3.55) (−1.87) (−1.28)
Age at first marriage
Less than 18
18–24 years −0.00261 −0.484*** −0.162 −0.447*** −0.219
(−0.02) (−4.06) (−0.99) (−3.40) (−1.03)
25 and above 0.0264 −0.556*** −0.0500 −0.482** −0.253
(0.15) (−3.65) (−0.24) (−2.89) (−0.91)
Wealth index
Poorest
Poorer −0.0432 −0.106 −0.238 0.0791 −0.291
(−0.36) (−1.05) (−1.74) (0.75) (−1.58)
Middle −0.117 −0.128 −0.514*** −0.0800 −0.745***
(−0.92) (−1.23) (−3.32) (−0.70) (−3.38)
Richer −0.236 −0.0575 −0.677*** −0.215 −0.371
(−1.76) (−0.53) (−4.02) (−1.75) (−1.78)
Richest 0.245 0.0635 −0.536** −0.287* 0.022
(1.77) (0.52) (−2.95) (−2.04) (0.11)
Cons −7.697*** −7.235*** −6.827*** −6.784*** −7.265***
(−12.60) (−16.40) (−11.02) (−16.11) (−9.58)
N 13,671 13,671 13,671 13,671 13,671
*p < 0.05, **p < 0.01, ***p < 0.001
Surprisingly, older women were more likely to take a decision jointly on their
health care than younger women. Joint decision-making was found to be
an increasing function of the number of children that a woman had. Employment
for cash increased the odds of joint decision-making on four of the five household
decisions (all except what to do with the husband’s earnings). However, variables
such as education, wealth, media exposure, and residence
influenced only a few of the decisions. For example, rural residence reduced
a respondent’s likelihood of deciding jointly about what to do with her earnings and
about large household purchases.
Table 2.4 Odds ratios (using a multinomial logistic regression): justifications for physically abusing a wife
Columns: beating justified if (1) she goes without telling her husband; (2) she neglects children; (3) wife argues with her husband; (4) wife refuses to have sex with her husband; (5) wife burns the food
Age group
15–19
20–29 0.130* −0.056 0.003 −0.026 0.007
(2.18) (−0.98) (0.05) (−0.44) (0.10)
30–39 −0.0866 −0.276*** −0.185* −0.0731 −0.159
(−1.09) (−3.59) (−2.25) (−0.91) (−1.61)
40–49 −0.229* −0.320** −0.299** −0.115 −0.310*
(−2.14) (−3.10) (−2.72) (−1.07) (−2.34)
Children categories
None
1 or 2 −0.040 −0.053 0.092 0.037 0.051
(−0.70) (−0.96) (1.54) (0.64) (0.71)
3 or 4 0.0267 0.0606 0.178* 0.0465 0.124
(0.38) (0.89) (2.48) (0.66) (1.44)
5 and above 0.0799 0.0889 0.203* 0.0501 0.127
(0.97) (1.11) (2.41) (0.61) (1.25)
Education
No education
Primary −0.225*** −0.230*** −0.315*** −0.320*** −0.367***
(−4.24) (−4.39) (−5.88) (−6.02) (−6.00)
Secondary −1.090*** −1.021*** −1.215*** −1.257*** −1.295***
(−13.35) (−13.34) (−14.32) (−15.23) (−12.03)
Higher −2.814*** −2.566*** −2.695*** −2.384*** −3.006***
(−7.62) (−8.64) (−7.28) (−7.99) (−5.09)
Employment for cash
No paid work
Paid work 0.0412 −0.0121 −0.115** −0.0948* −0.178***
(0.95) (−0.29) (−2.59) (−2.17) (−3.39)
Exposure to media
No media exposure
Low media exposure −0.111** −0.068 0.024 −0.0946* 0.018
(−2.61) (−1.66) (0.57) (−2.20) (0.36)
High media exposure −0.0934 0.106 0.0754 0.0459 0.206
(−0.73) (0.90) (0.57) (0.36) (1.29)
Residence
Urban
Rural 0.148* 0.331*** 0.194** 0.299*** 0.209*
(2.37) (5.56) (2.99) (4.71) (2.57)
Age at first marriage
Less than 18
18–24 years 0.007 −0.005 0.009 0.0331 0.128
(0.09) (−0.07) (0.11) (0.39) (1.24)
25 and above 0.0850 0.054 −0.0337 −0.084 −0.001
(0.75) (0.50) (−0.29) (−0.76) (−0.01)
Wealth index
Poorest
Poorer −0.178** −0.0917 −0.157** −0.137* −0.180**
(−3.08) (−1.61) (−2.67) (−2.35) (−2.65)
Middle −0.262*** −0.137* −0.284*** −0.310*** −0.272***
(−4.40) (−2.35) (−4.68) (−5.17) (−3.85)
Richer −0.328*** −0.218*** −0.302*** −0.288*** −0.344***
(−5.33) (−3.62) (−4.82) (−4.67) (−4.63)
Richest −0.433*** −0.361*** −0.486*** −0.474*** −0.516***
(−6.20) (−5.36) (−6.75) (−6.71) (−5.88)
Cons −0.124 0.111 −0.202* 0.0142 −0.875***
(−1.29) (1.19) (−2.06) (0.15) (−7.46)
N 13,671 13,671 13,671 13,671 13,671
*p < 0.05, **p < 0.01, ***p < 0.001
2.4.3 Determinants of Respondents’ Attitudes Toward Justifications for Wife Beating
Table 2.4 presents the odds ratios for respondents’ attitudes toward justifications for
wife beating. Women with higher education were less likely to agree with wife
beating (for all five reasons) than those with primary education. Women from
wealthier families were less likely to agree with wife beating for all five reasons
than those from poor families. Residing in rural areas was found to increase the
odds of agreeing with wife beating for all five reasons. However, variables such as
age, children ever born, media exposure, and paid work influenced only some of the
reasons. Contrary to our expectations, age at first marriage had no influence on
attitudes toward wife beating.
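To make the size of the education effect in Table 2.4 concrete, the sketch below turns log-odds into a predicted probability with the logistic function. It assumes, as our own simplification, that the entries are binary-logit coefficients and that "Cons" is the log-odds for a woman in every reference category. The helper name is hypothetical.

```python
import math


def logistic(log_odds):
    """Map log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Table 2.4, last column (beating justified if wife burns the food).
CONS = -0.875              # constant: all covariates at reference levels
HIGHER_EDUCATION = -3.006  # coefficient for higher education

p_reference = logistic(CONS)
p_higher_edu = logistic(CONS + HIGHER_EDUCATION)

print(f"reference profile:     {p_reference:.3f}")
print(f"with higher education: {p_higher_edu:.3f}")
```

Under these assumptions, moving from no education to higher education, with everything else at reference levels, lowers the predicted probability of agreeing with this justification substantially.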
2.5 Discussion of the Results
Our study was designed to measure women’s empowerment in Rwanda using
indicators of household decision-making and self-esteem. Kabeer (2001) and
Kishor (2008) conceptualize empowerment in terms of agency, resources, and
achievements.
It was hypothesized that variables of sources of empowerment (education,
employment for cash, media exposure, and wealth) had a positive association with
women’s empowerment while variables of the setting for empowerment (residence,
age, children, age at first marriage) had either a positive or a negative influence on
women’s empowerment. For example, a younger age at first marriage was expected
to be negatively associated with women’s empowerment, while a higher age at first
marriage was expected to be positively associated with it.
The results from our study show that older women were more likely to be
empowered (0.0339) than younger women (0.0274). Household decision-making was
found to be higher among older women than among young women (see
Table 2.1). Similarly, the results show that older respondents had higher self-esteem
(0.0448) than younger women (0.00225). A possible explanation for these
results is that marriage and child bearing are highly valued in most
societies, and this allows women to gain respect, rights, and freedom. These results
are consistent with those obtained by Kishor and Subaiya (2008) in a cross-country
women’s empowerment comparison using DHS data.
Women with more children (five and above) were found to be more empowered
than women with fewer children (one or two). Likewise, household decision-making
was higher among women with more children than among those with fewer children.
Surprisingly, no relationship was found between self-esteem and the number
of children ever born. A possible explanation for the positive relationship between
women’s empowerment, decision-making, and child bearing is that more empowerment
and status are accorded to women with children, and this goes hand in hand
with a woman’s age.
The findings also reveal that women’s educational levels were positively asso-
ciated with their levels of empowerment. Women with higher education were more
empowered than those with primary education. Similarly, women with higher
education had higher decision-making abilities than those with primary education;
this is consistent with the findings of Sado et al. (2014). Women with higher
education had higher self-esteem than those with primary education (see Table 2.1).
A possible explanation for this is that higher education exposes women to new
ideas and alternative gender norms and behaviors, which fosters a gender-egalitarian
view of the world. These results are in agreement with those obtained by Mahmud
et al. (2012). Employment for cash had a positive association with both the
cumulative empowerment index (0.0202) and the decision-making index (0.0332).
However, employment for cash had no association with the self-esteem index.
Regular media exposure was positively associated with both the cumulative
empowerment and decision-making indices. This can be attributed to the fact that
the media exposes women to a world outside their homes, including new ideas and
non-traditional roles for them. These results are consistent with Mahmud et al.’s
(2012) findings. Contrary to our expectations, no relationship was found between media
exposure and self-esteem. Rural residence was
negatively associated with the cumulative empowerment and self-esteem indices,
but it was unrelated to the household decision-making index (see Table 2.1).
Age at first marriage had a significant negative relationship with the cumulative
empowerment and decision-making indices (see Table 2.1). One possible expla-
nation for this is that an early age at first marriage limits the access that a woman
has to education. She also has less time for her development and maturity without
the interference of marriage and the responsibilities of raising children. Moreover,
being young she is less likely to be accorded much power and independence in her
parents’ home. These findings are similar to those by Kishor and Subaiya (2008).
However, unlike them, our study did not find any association between self-esteem
and age at first marriage.
Wealth was found to be positively associated with the cumulative empowerment
and self-esteem indices. Women from wealthier families were more empowered and
had higher self-esteem than those from poor families. However, wealth was
positively associated with household decision-making only for the richer and
richest categories; the poorer and middle categories did not differ significantly
from the poorest (see Table 2.1).
Younger women (20–29) were less likely to take decisions alone and jointly than
those in the 30–39 years age bracket, but women in the 40–49 years
age group were less likely to take four or five decisions alone and jointly than
women in their twenties (see Tables 2.2 and 2.3). Surprisingly, older women
were more likely to take decisions jointly about their health care than younger
women (see Tables 2.2 and 2.3). These results are in line with those of previous
studies such as those by Mahmud et al. (2012), whose findings revealed that young
and older women had lower decision-making powers while women in their
mid-twenties had high decision-making powers. This phenomenon can be
explained by the fact that there are chances that young women live in extended
families and old women are no longer involved in decision-making as most of them
rely on their adult sons.
Decision-making alone and jointly increased with the number of children for all
five decisions (see Tables 2.2 and 2.3). These results further support Kishor and
Subaiya’s (2008) finding that the proportion of women who take
decisions alone or jointly increases with the number of children.
As a potential source of empowerment, education was positively associated with
household decision-making, notably with decision-making alone. The odds of
women’s participation in decision-making increased with the level of education but
with variations in terms of type of participation and decisions. The results show that
compared to primary education, higher education was positively associated with
decision-making alone for all five decisions (see Tables 2.2 and 2.3). However, the
proportion of women with higher education who took decisions jointly was higher
for only three decisions (what to do with respondent’s earnings, respondent’s health
care, and visits to family or relatives). These results are in agreement with
El-Halawany’s (2009) findings which show that education was strongly associated
with women’s autonomy, empowerment, and gender equality through their par-
ticipation in household decision-making.
Employment for cash affected decision-making alone (positive association) for
only three decisions (what to do with respondent’s earnings, large household
purchases, and visits to family or relatives) (see Table 2.2). Contrary to our
expectations, employment for cash affected decision-making jointly for four decisions
(what to do with respondent’s earnings, respondent’s health care, large household
purchases, and visits to family or relatives) (see Table 2.3). These results match those
observed in earlier studies such as those by Mahmud and Tasmeen (2014) who
argue that the likelihood of spending one’s own income on clothes, health care,
investments in major assets, and having a bank account were higher among women
with formal employment outside the family than in other categories. Similarly,
Malhotra et al. (2009) found that innovations promoted women’s empowerment
through increased freedom, having a say in household decision-making, control
over household resources, and confidence to challenge gender inequalities.
The odds in favor of taking a decision alone increased with the level of media
exposure for all five decisions. However, exposure to media affected joint
decision-making for only two decisions (respondent’s health care and visits to
family or relatives). These findings further support Kishor and Subaiya’s (2008)
finding that women with regular exposure to the media tend to have more positive
attitudes toward gender equality than those who are not exposed to the media. They
further argue that women who live in communities that favor women’s exposure to
the media or allow them to benefit from social development levels have a higher
likelihood of taking decisions alone and a low likelihood of taking decisions jointly.
Age at first marriage had a significant negative association with decision-making
alone for all five questions (see Table 2.2), while it had a significant negative
association with decision-making jointly for only two decisions (respondent’s
health care and visits to family or relatives) (see Table 2.3). Contrary to
our expectations, rural residence increased the odds in favor of taking
decisions alone on what to do with the husband’s earnings (see Table 2.2), while
it reduced the likelihood of taking a decision jointly for only
two decisions (what to do with respondent’s earnings and large household
purchases) (see Table 2.3).
Wealth had a significant positive relationship with taking decisions alone for all
five questions with women from wealthier families having higher chances of taking
decisions alone compared to those from poor families (see Table 2.2). Wealth had a
statistically significant negative association with decision-making jointly for only two
decisions (large household purchases and visits to family or relatives). These results are
in accordance with recent studies which indicate that women from wealthier
households were less likely to have a say in household decision-making and that
they tended to have the view that their voices were relatively not worthwhile but
there was a high likelihood of their having access to cash to spend (Mahmud et al.
2012).
Older women were found to be less likely to agree with four of the five justi-
fications for wife beating. Education was negatively associated with agreeing with
justifications for wife beating for all five reasons (see Table 2.4). Women with
higher education were less likely to agree with wife beating for any of the five
reasons as compared to those with lower education levels (primary education).
These findings are in agreement with Kishor and Subaiya’s (2008) findings which
show that the higher the education level, the lower the likelihood of a woman
agreeing that wife beating was justified for any reason, and the higher the likelihood
of her agreeing with the fact that it was a woman’s right to refuse sex with her
husband.
Women with paid work were less likely to agree with justifications for wife
beating for three of the five reasons (see Table 2.4). Women with regular exposure
to the media were less likely to agree with wife beating for two of the five reasons.
Women residing in rural areas were more likely to agree with justifications for wife
beating for all the five reasons. Wealth reduced the odds in favor of saying yes to
justifications for wife beating for all the five reasons. Women from wealthier
families were less likely to agree with justifications for wife beating for all five
reasons as compared to women from poor families.
2.6 Conclusions
The most obvious finding of this study is that education, age of the respondent,
media exposure, employment for cash, and wealth had a positive relationship
with women’s empowerment. Our study also found that education, wealth, age, and
the number of children had higher explanatory power for women’s empowerment
than the other variables. Taken together, the findings suggest that women’s
empowerment can be advanced by providing education, promoting labor force
participation and media exposure, shifting negative traditional cultural norms, and
focusing on integrated development.
The main weakness of this study is the paucity of data on all indicators of
women’s empowerment (only data on household decision-making and attitudes
toward wife beating were available) and on some of the covariates that were used in
previous studies. Another weakness is that the results might have been affected by
missing values in the data used to measure women’s empowerment. As society is
evolving fast through education, technology, urbanization, and globalization,
continuous improvement in survey structures is required; there is also a need to
collect data on women’s empowerment indicators that have not been taken into
account in previous surveys.
More studies need to be carried out on the unexplored aspects of women’s
empowerment, especially the relationship between women’s empowerment and
variables such as fertility, health care, contraceptive use, and microfinance.
Women’s autonomy and their determination to participate in the labor force, as well
as their contribution to economic growth and well-being also need to be considered.
References
Abadian S (1996) Women’s autonomy and its impact on fertility. World Dev 24(12):1793–1809
Al Riyami A, Afifi M, Mabry RM (2004) Women’s autonomy, education and employment in
Oman and their influence on contraceptive use. Reprod Health Matters 12(23):144–154
Ali DA, Deininger K, Goldstein M (2014) Environmental and gender impacts of land tenure
regularisation in Africa: pilot evidence from Rwanda. J Dev Econ 110:262–275
Allendorf K (2007) Do women’s land rights promote empowerment and child health in Nepal?
World Dev 35(11):1975–1988
Allsopp MS, Tallontire A (2014) Pathways to empowerment? Dynamics of women’s participation
in global value chains. J Cleaner Prod 107:114–121
Alsop R, Heinsohn N (2005) Measuring empowerment in practice: structuring, analysing and
framing indicators. World Bank Policy Research Working Paper 3510
Bloom SS, Wypij D, Gupta MD (2001) Dimensions of women’s autonomy and the influence of
maternal health care utilization in a north Indian city. Demography 38(1):67–78
El-Halawany HS (2009) Higher education and some upper Egyptian women’s negotiation of
self-autonomy at work and home. Res Comp Int Educ 4(4):423–436
Faridi MZ, Chaudhry IS, Anwar M (2009) The social-economic and demographic determinants of
women’s work participation in Pakistan: evidence from Bahawalpur district. MPRA Paper
22831
Fotso JC, Ezeh AC, Essendi H (2009) Maternal health in resource-poor urban settings: How does
women’s autonomy influence the utilization of obstetric care services? Reprod Health, 16 June
2009
Ganle JK, Afriyie K, Segbefia AO (2015) Microcredit: empowerment and disempowerment of
rural women in Ghana. World Dev 66:335–345
Ghuman SJ, Lee HJ, Smith HL (2004) Measurement of women’s autonomy according to women
and their husbands: results from five Asian countries. Soc Sci Res 35:1–28
Goldman MJ, Little JS (2014) Innovative grassroots NGOs and the complex processes of women’s
empowerment: an empirical investigation from northern Tanzania. World Dev 66:762–777
Golla AM, Malhotra A, Nanda P, Mehra R (2011) Understanding and measuring women’s
economic empowerment: definitions, framework and indicators. International Center for
Research on Women (ICRW)
Kabeer N (1999a) Resources, agency, achievements: reflections on the measurement of women’s
empowerment. Dev Change 30:435–464
Kabeer N (1999b) The conditions and consequences of choice: reflections on the measurement of
women’s empowerment. UNRISD discussion paper 108
Kabeer N (2001) Conflicts over credit: re-evaluating the empowerment potential of loans to
women in rural Bangladesh. World Dev 29(1):63–84
Kabeer N (2005) Gender equality and women’s empowerment: a critical analysis of the third
millennium development goal 1. Gender Dev 13(1):13–24
Kar SB, Pascual CA, Chickering KL (1999) Empowerment of women for health promotion: a
meta-analysis. Soc Sci Med 49:1431–1460
Khan A, Noreen S (2012) Microfinance and women’s empowerment: a case study of Bahawalpur
district (Pakistan). Afr J Bus Manage 6(12):4514–4521
Kishor S, Gupta K (2004) Women’s empowerment in India and its states: evidence from the
NFHS. Econ Polit Wkly 39(7):694–712
Kishor S, Subaiya L (2008) Understanding women’s empowerment: a comparative analysis of
Demographic and Health Survey (DHS) data. USAID
Larsen U, Hollos M (2003) Women’s empowerment and fertility decline among the pare of
Kilimanjaro region, northern Tanzania. Soc Sci Med 57:1099–1115
Lee-Rife SM (2010) Women’s empowerment and reproductive experiences over the life course.
Soc Sci Med 71:634–642
Mahmud S, Tasneem S (2014) Measuring ‘empowerment’ using quantitative household survey
data. Women’s Stud Int Forum 45:90–97
Mahmud S, Shah NM, Becker S (2012) Measuring women’s empowerment in rural Bangladesh.
World Dev 40(3):610–619
Malhotra A, Mather M (1997) Do schooling and work empower women in developing countries?
Gender and domestic decisions in Sri Lanka. Sociol Forum 12(4):559–630
Malhotra A, Schulte J, Patel P, Petesch P (2009) Innovations for Women’s empowerment and
gender equality. International Center for Research on Women (ICRW)
Mukashema I, Sapsford R (2013) Marital conflicts in Rwanda: points of view of Rwandan
psycho-socio-medical professionals. Procedia-Soc Behav Sci 82:149–168
Naqvi ZF, Shahnaz L (2002) How do women decide to work in Pakistan? Pak Dev Rev
41(4):495–513
NISR (2010a) Demographic and Health Survey (DHS). National Institute of Statistics for Rwanda,
Kigali
NISR (2010b) Integrated Household Living Conditions (EICV3): Gender thematic report. National
Institute of Statistics for Rwanda
NISR (2013) Statistical year book. National Institute of Statistics for Rwanda, Kigali
Patrikar SR, Basannar DR, Sharma MS (2014) Women empowerment and use of contraception.
Med J Armed Forces INDIA 70:253–256
Rowlands J (1997) Questioning empowerment. Oxfam, Oxford
Sado L, Spaho A, Hotchkiss DR (2014) The influence of women’s empowerment on maternal
health care utilization: evidence from Albania. Soc Sci Med 114:169–177
Samman E, Santos ME (2009) Agency and empowerment: a review of concepts, indicators and
empirical evidence. Oxford Poverty and Human Development Initiative
Schuler SR, Hashemi SM, Riley AP, Akther S (1996) Credit programs, patriarchy and men’s
violence against women in rural Bangladesh. Soc Sci Med 43(12):1729–1742
Schuler SR, Hashemi SM, Riley AP (1997) The influence of women’s changing roles and status in
Bangladesh’s fertility transition: evidence from a study on credit programs and contraceptive
use. World Dev 25(4):563–575
Sultana A, Hossen S (2013) Role of employment in women empowerment: evidence from Khulna
city of Bangladesh. Int J Soc Sci Interdisc Res 2(7):117–125
The World Bank (2012) Gender inequality and development. World Development Report 2012.
The World Bank
Trommlerova SK, Klasen S, Leßmann O (2015) Determinants of empowerment in capability-based
poverty approach: evidence from the Gambia. World Dev 66:1–15
Upadhyay UD, Karasek D (2012) Women’s empowerment and the ideal family size: an
examination of DHS empowerment measures in sub-Saharan Africa. Int Perspect Sexual
Reprod Health 38(2):78–89
Upadhyay UD, Gipson JD, Withers M, Lewis S, Ciaraldi EJ, Fraser A, Huchko MJ, Prata N (2014)
Women empowerment and fertility: a review of literature. Soc Sci Med 115:110–120
Chapter 3
Determinants of Demand for Outpatient
Health Care in Rwanda
Charles Mulindabigwi Ruhara and Urbanus Mutuku Kioko
Abstract In the 2000s, the Government of Rwanda initiated health sector reforms
aimed at increasing access to health care. Despite these reforms, there has not been
a corresponding increase in demand for health services, as only about 30% of the
sick use modern care (NISR in Preliminary results of interim demographic and
health survey 2010. NISR, Kigali, 2011). The objective of this paper is to
examine the factors influencing the demand for outpatient care in Rwanda and
to suggest appropriate measures to improve utilization of health services. The data
are from the Integrated Household Living Conditions Survey (EICV2) conducted in
2005 by the National Institute of Statistics Rwanda (NISR). A structural model of
demand for health care is estimated to measure the demand effects of covariates.
The findings indicate that health insurance is a significant determinant of outpatient
medical care. In addition, the price of health care and household income are among
the main drivers of utilization of health care. Women are more likely to seek
outpatient health care as compared to men. Two main policy recommendations
emerge from these findings. First, the government should reduce out-of-pocket
healthcare expenditures (OOPE) through subsidies for public health facilities.
Second, the government should reduce the premiums for community-based health
insurance schemes (CBHIs) to increase coverage rates.

Keywords Outpatient · Health insurance · Endogeneity · User fees · Logit model
JEL Classification Codes I10 · I11 · I12 · I13 · D12
C.M. Ruhara (✉)
University of Rwanda, Butare, Rwanda
e-mail: ruharamch@yahoo.fr
U.M. Kioko
University of Nairobi, Nairobi, Kenya
e-mail: urbanusmutukukioko@gmail.com

DOI 10.1007/978-981-10-4451-9_3
3.1 Introduction
The theoretical model for analyzing human capital and health and its effect on
productivity, earnings, and labor supply was first developed by Grossman (1972).
The premise of his theory was that an increase in a person’s stock of health raises
his or her productivity in both market and non-market activities. Better health
brings large productivity and wage benefits. There is evidence to
show that sickness can have adverse effects on learning, and that these impacts can
later influence economic outcomes (Bhargava et al. 2001). Better health can make
workers more productive either through fewer days off or through increased pro-
ductivity while working. Improved nutrition and reduced diseases, particularly in
early childhood, lead to improved cognitive development, enhancing the ability to
learn. Healthy children also gain more from school because they are absent for
fewer days due to ill health.
While health is determined by many factors including medical care, food,
housing conditions, and exercising, it is accepted that medical care is one of the key
determinants in the health production function (McKeown 1976). Santerre and
Neun (2010) argue that as a firm uses various inputs such as capital and labor to
manufacture a product, an individual uses healthcare inputs to produce health.
When other factors are held constant, an individual’s health status indicates the
maximum amount of health that can be generated from the quantity of medical care
consumed.
Considering the importance of medical care, both policymakers and researchers
have directed much attention to the question of how broad access to health services
can be ensured (Lindelow 2002). Early policy and research initiatives focused on
the need to improve physical access through an expansion of the network of health
facilities. This consisted of improving healthcare delivery including healthcare
professionals, equipment, and buildings. A growing literature on health care,
however, points out that supply is not sufficient and this means that providing
maximum access to health care remains a challenge for governments in many
low-income countries.
In Rwanda, access to health care was identified as an important objective for
formulating public policies since good health is recognized as a necessary condition
for enjoying economic and social opportunities. The country has developed a
healthcare system that is open to all Rwandans regardless of
socioeconomic status. For instance, in the Rwanda Economic Development and
Poverty Reduction Strategy (EDPRS, 2008), access to health care is one of the
strategies for eradicating poverty. The strategy’s objective is promoting health care
among the entire population, increasing geographical accessibility, increasing the
availability and affordability of drugs, and improving the quality of services.
Increased accessibility to health care has several benefits particularly among the
poor segments of the population (The World Bank 2001). The millennium devel-
opment goals (MDGs) also recognize health as an essential ingredient in the social
and economic progress of any country. However, despite improvements in access to
health care through community-based health insurance schemes (CBHIs) and other
insurance providers, it is not known why healthcare utilization has remained low in
Rwanda.
To increase access to health services, the Government of Rwanda initiated a
number of health policies and other economic stimulus efforts, some of them tar-
geting the supply side of the market while other policies are aimed at increasing
service utilization. The policies include Vision 2020, the Economic Development
and Poverty Reduction Strategy (EDPRS) 2008–2012, One-Cow-One-Family, the
Social Security Policy 2009, and the Health Policy 2004 (Ministry of Health 2009).
These policies are meant to increase access to health services and hence improve the
health status of the population. The reforms are also meant to strengthen the
healthcare system and make it more accessible. Despite these reforms, fewer than two
out of five sick people sought formal health care in Rwanda (NISR 2011). The
ineffectiveness of previous policies aimed at increasing healthcare utilization is due
to their implementation without adequate evidence about the factors influencing
health service utilization in Rwanda. The aim of this study is to examine the factors
that influence demand for outpatient healthcare services in Rwanda.
Although economic theory offers potential factors that influence demand for
health care, there is a lack of quantitative assessment of their effects in Rwanda.
Evidence on these factors is needed for implementing policies designed to improve
health service utilization in the country. To our knowledge, no studies have been
done in Rwanda in recent years to determine the factors influencing healthcare
demand. The only available evidence on this is from studies by Jayaraman et al.
(2008) and Shimeles (2010), which focus on maternal health care and on effects of
CBHIs at the district level. In countries in which estimates of demand for health
care exist, research results provide conflicting evidence of the demand effects of
price, income, and insurance suggesting that more studies are needed.
Most studies on demand for health care do not address the problems of endo-
geneity (reverse causality) and heterogeneity (variations in the estimated effect size
due to unobservables). Failure to address these problems leads to biased estimates
(Kabubo-Mariara et al. 2009; Lawson 2004; Rosenzweig and Schultz 1982).
McCool et al. (1994) point out that differences in data, model specifications, and/or
empirical methods can contribute to diversity in demand estimates and hinder
clarity in healthcare financing policies. Our paper addresses these estimation
problems by providing rigorous evidence on outpatient healthcare demand deter-
minants in Rwanda that policymakers can use for improving health service uti-
lization across all the regions.
Healthcare services are demanded as an input into the production of health that is
part of an individual’s utility function together with other goods. Empirically, an
analysis of health services examines their determinants based on the microeconomic
theory of consumer behavior. These determinants include factors related to individuals,
households, and the community. Numerous studies have attempted to
quantify how much health care people consume, the type of health care that they
use, and the factors underlying the utilization of health care.
Several studies have documented the impact of insurance on demand for health
care and found that the effect of insurance on utilization varies across the population
and the level and type of coverage (see Barros and Machado 2008; Buchmueller
et al. 2005). Hahn’s (1994) study found that uninsured households had lower
average rates of utilization compared to persons with private or Medicaid coverage.
Those with Medicaid for the full year were found to have the highest rate of
healthcare utilization while uninsured persons were found to have the lowest mean
utilization for all types of services. In a similar study, Barros and Machado (2008)
estimated the effect of private health insurance coverage beyond a National Health
System on the demand for several health services in Portugal. Their study estimated
the impact of having additional coverage on demand for three different health
services: the number of visits, number of blood and urine tests, and the probability
of visiting a dentist. The results showed large positive effects of the coverage on the
number of visits and tests.
Similar findings were also reported by Jones et al. (2006), who found private
insurance to be positively associated with the probability of health visits in Ireland,
Italy, Portugal, Spain, and the UK. Another study by Shimeles (2010) examined the
effects of CBHIs on healthcare utilization at the district level in Rwanda. The study
used the matching estimator to address the endogeneity problem. As in Hahn
(1994), higher utilization of healthcare services was reported among insured as
compared to uninsured households. The results indicate that CBHIs had a strong
positive impact on access to health care. These results are consistent with the
findings of Saksena et al. (2010), Rashad and Markowitz (2009), and Jutting
(2003), who found that insurance was an important factor in explaining health
seeking behavior.
However, other studies have found that insurance may have little effect on
demand for health care depending on geographical locations (Buchmueller et al.
2005). Cunningham and Kemper (1998) document that in areas where a
well-functioning healthcare system exists, the lack or reduction of insurance cov-
erage may not imply a significant lack of access to care. The expansion of coverage
would then result in smaller changes in utilization than in locations where the
uninsured have fewer options for obtaining care. Mwabu et al. (2003) reported a
negative effect of insurance, suggesting that insured people made fewer visits to
health facilities relative to uninsured people. The explanation offered for this
unexpected result was that people with insurance may have better health endowments
and thus demand less health care relative to
uninsured people. However, none of these studies controlled for heterogeneity of
insurance. Since the effect of insurance on utilization may vary across the popu-
lation, geographical location, and the level and type of insurance coverage, research
on healthcare demand needs to handle the problem of heterogeneities to produce
reliable estimates.
There is an extensive literature in health economics that seeks to estimate the
income elasticity of demand for health services. Most of the literature shows that
demand for medical care is income inelastic, indicating that medical care is a
necessity (Mocan et al. 2004). The positive sign of the elasticity indicates that as
income increases, demand for health services also increases. However, the literature is
inconclusive, and income effects vary widely across studies, countries, and
regions. Ringel et al. (2002) report that income elasticities of demand estimated from
cross-sectional data range between 0 and 0.2. This magnitude suggests that
the effect of income on demand is relatively small. Estimates differ across time
frames because studies that use long time series data also capture the effects of
changes in medical technology (Ringel et al. 2002). Income elasticities
based on cross-sectional data or on time series data covering a relatively short
period assume that the level of available medical technology is constant. As real
incomes in the population increase, the aggregate demand for new medical tech-
nologies and new treatment approaches increases as well. Thus, from previous
studies on the effect of income, no consensus has emerged and the debate on
whether health care is a luxury or necessity continues (Blomqvist and Carter 1997).
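As a worked illustration of these magnitudes (the symbol $\eta$ and the 10% income change are introduced here for illustration only and are not taken from the studies cited), the income elasticity of demand relates the proportional change in utilization $Q$ to the proportional change in income $B$:

$$\eta = \frac{\%\Delta Q}{\%\Delta B}, \qquad \eta = 0.2,\ \%\Delta B = 10\% \;\Rightarrow\; \%\Delta Q = 0.2 \times 10\% = 2\%.$$

Under the usual classification, $0 < \eta < 1$ marks a necessity and $\eta > 1$ a luxury, so even an estimate at the top of the cross-sectional range implies that utilization grows much more slowly than income.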
To account for the price effect at different levels of visits rather than the average
effect obtained using ordinary least squares (OLS), Mwabu et al. (2003) used the
quantile regression method to analyze the effects of price on demand for health
services in Kenya. Fees were found to have a negative effect on demand for
health care, but the effect differed across quantiles. Their findings established that an
increase of 10 shillings reduced visits by 0.2%. The price elasticity of demand for
medical care was found to be small in magnitude and consistent with Akin et al.
(1986) and Sauerborn et al. (1994). The study did not, however, address the
endogeneity and heterogeneity problems, so its estimates may be biased. Given that
demand for treatment is not determined by an individual alone, several studies have
investigated household and community factors. Controlling for the unobserved
effects at the household and community levels that affect health seeking behavior,
Lépine and Nestour (2008) show that household economic status and quality of
health care are important determinants of the probability of seeking treatment from
a qualified provider. In addition, transportation costs were found to be an important
determinant of the likelihood of seeking care as an increase in the average transport
cost decreased the likelihood of seeking curative care by 25%.
Evidence from empirical studies on the relationship between demand for health
care and its main determinants differs in several ways. In addition, most of the
previous studies have assumed that insurance is exogenous and do not consider the
reverse causality that is likely to exist between demand for medical care and
health insurance. Our study provides new evidence on the factors that affect
demand for health care using data from Rwanda, and handles the endogeneity and
heterogeneity problems to ensure that the estimates are unbiased and consistent.
3.3 Methodology
Following Grossman (1972), individuals maximize their utility over health and
other goods subject to market and non-market factors. Health is one of the several
commodities over which individuals have well-defined preferences. Market factors
include availability of health inputs and their prices, insurance, and household
incomes. Non-market factors include household characteristics, location or dis-
tance, and individual characteristics such as age, education, health status, and the
perceptions that they have about the quality of health services (Ajakaiye and
Mwabu 2007; Appleton and Song 1999; Bategeka et al. 2009). Assuming that
health care is a consumption good, a consumer’s problem can be expressed as
$$\max\; U = U(H, Z, X, Y) \tag{3.1}$$
where U is the utility derived from consumption of different goods; Y represents
health-related goods that yield utility to the sick person and improve health status;
H is the health production function; Z stands for health inputs such as health care;
and X represents all other goods and services.
The utility function is maximized subject to the following constraints:
$$B = XP_x + YP_y + ZP_z \tag{3.2}$$
$$H = H(Z, I, S, C, A, h_s, P_h, NO) \tag{3.3}$$
where Z is defined as in Eq. 3.1; I is household characteristics including
insurance; S represents socio-demographic variables including age, sex, and education;
C is community characteristics including distance to health facility; A is
household assets; $h_s$ is the size of the household; $P_h$ is the price of health care; and NO
represents household non-observable characteristics. In the first constraint, B is
exogenous income and $P_x$, $P_y$, and $P_z$ are, respectively, the prices of health-neutral
goods X (such as clothing), health-related consumer goods Y (such as exercising), and
health investment goods Z (such as health care).
The maximization problem is then expressed as
$$\begin{aligned} \max\quad & U = U(H, Z, X, Y) \\ \text{given}\quad & H = H(Z, I, S, C, A, h_s, P_h, NO) \\ \text{s.t.}\quad & B = XP_x + YP_y + ZP_z \end{aligned} \tag{3.4}$$
Solving the maximization problem yields a demand function for health care
specified as
$$D_h = f(I, B, A, S, C, h_s, P_h, NO) \tag{3.5}$$
where $D_h$ refers to the demand for outpatient care; I is health insurance; B is the budget
or income; A is household assets; S represents socio-demographic variables;
C represents community characteristics including distance to health facility; $h_s$ is
household size; $P_h$ is the price of health care; and NO represents household
non-observable characteristics.
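The step from the constrained problem to Eq. 3.5 is not spelled out in the text. A minimal sketch, assuming an interior solution and differentiable $U$ and $H$ (both assumptions made here, with the multiplier $\lambda$ introduced only for illustration), substitutes the health production function into utility and forms the Lagrangian:

$$\mathcal{L} = U\big(H(Z, I, S, C, A, h_s, P_h, NO),\, Z,\, X,\, Y\big) + \lambda\,(B - XP_x - YP_y - ZP_z)$$

with first-order conditions

$$\frac{\partial U}{\partial X} = \lambda P_x, \qquad \frac{\partial U}{\partial Y} = \lambda P_y, \qquad \frac{\partial U}{\partial H}\frac{\partial H}{\partial Z} + \frac{\partial U}{\partial Z} = \lambda P_z.$$

Solving these conditions together with the budget constraint for $Z$ gives the demand for health care as a function of income, prices, and the shifters of the health production function, which is the general form written in Eq. 3.5.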
Equation 3.5 is a structural outpatient healthcare demand equation that includes
an endogenous variable among the independent variables. The endogenous variable
is health insurance because of reverse causality between demand for health care and
insurance while exogenous variables include the monetary price for health care,
income, age, gender, educational attainment of the individual, household size,
location, and region. In our study, the demand for outpatient care is discrete rather
than continuous because patients seek or do not seek health care. In Eqs. 3.1 and
3.2, a health investment good is purchased only for the purpose of improving health
so that it enters an individual’s utility function only through H.
In the outpatient demand model, insurance is assumed to improve access to
health services. In addition, the heterogeneity of health insurance due to a nonlinear
interaction of demand for health services with unobservable and omitted variables
could bias the estimates. Our study assumes that demand for health services has
only one endogenous variable, and demand for outpatient care refers to any curative
outpatient service provided by a physician or any other medical staff. Given the
dichotomous nature of outpatient care, the estimation adopts a binary discrete
model, where health care is either sought or not. Assuming that the errors are
distributed logistically, we adopt a logit regression method to estimate both outpatient
and inpatient healthcare demands. The dependent variable takes one of two
values: 1 if an individual uses outpatient health care and 0 if the individual
does not use any health services. The logit regression is also preferred
because most of the studies on demand for health services use a logit regression
(see Hahn 1994; Lépine and Nestour 2008). This relationship can be expressed as

$$Y_i = \begin{cases} 1 & \text{if the event takes place (the individual seeks outpatient service)} \\ 0 & \text{if the event has not taken place (the individual has not sought treatment)} \end{cases}$$
Equation 3.5 expressing the demand for health care can be rewritten as
$$y_i = x_i'\beta + \varepsilon_i \tag{3.6}$$
where $y_i$ is a latent variable capturing the propensity to seek medical care,
$x_i'$ is a vector of characteristics related to the individual, household, and
community, and $\varepsilon_i$ is the error term. The observed outcome is
$$Y_i = 1 \ \text{if}\ y_i > 0,\ \text{that is,}\ (x_i'\beta + \varepsilon_i) > 0,$$
$$\text{and}\quad Y_i = 0 \ \text{if}\ y_i < 0,\ \text{that is,}\ (x_i'\beta + \varepsilon_i) < 0.$$
The values zero and 1 are used because they allow the definition of probability
of occurrence of an event as the mathematical expectation of the variable Y. This
can be expressed as
$$E[Y_i] = \Pr(Y_i = 1) \times 1 + \Pr(Y_i = 0) \times 0 = \Pr(Y_i = 1) = p_i \tag{3.7}$$
Equation 3.7 shows that we need to compute the probability of occurrence
(Y = 1) relative to the probability of no occurrence (Y = 0). Assuming that the error term
has a logistic distribution, this can be done using the logit relation as shown
by
$$\Pr(Y_i = 1) = \frac{\exp(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki})}{1 + \exp(\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki})} \tag{3.8}$$
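The step from the latent-variable model in Eq. 3.6 to Eq. 3.8 can be sketched as follows, assuming (in line with the logistic-error assumption stated earlier in this section) that $\varepsilon_i$ follows a standard logistic distribution. The symbol $\Lambda$ for its distribution function is introduced here and does not appear in the original text:

$$\Pr(Y_i = 1) = \Pr(x_i'\beta + \varepsilon_i > 0) = \Pr(\varepsilon_i > -x_i'\beta) = 1 - \Lambda(-x_i'\beta) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)},$$

where the second-to-last equality uses the symmetry of the logistic distribution, $\Lambda(-t) = 1 - \Lambda(t)$. Writing $x_i'\beta = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}$ gives the expression in Eq. 3.8.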
In terms of log odds, Eq. 3.8 can be reformulated as
$$\ln\frac{\Pr(Y_i = 1)}{1 - \Pr(Y_i = 1)} = \ln\frac{\Pr(Y_i = 1)}{\Pr(Y_i = 0)} = \ln\frac{p_i}{1 - p_i} = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} = \operatorname{logit}(p_i) \tag{3.9}$$
which can be expressed as
$$\operatorname{logit}(p_i) = \beta_0 + \sum_{j=1}^{3} \beta_j X_{ji} = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i \tag{3.10}$$
where
$Y_i$ is an indicator of the choice of modern health care (outpatient) by the ith household member,
$X_{1i}$ is a vector of characteristics related to an individual such as age, education, and sex,
$X_{2i}$ is a vector of characteristics related to a household such as income and insurance, and
$X_{3i}$ is a vector of community-level characteristics such as the presence of a medical specialist and the distance from the household to the health facility.
If in Eq. 3.10, $\beta_j > 0$, then an increase in $X_{ji}$ (for instance, household income),
while all other exogenous variables remain unchanged, will increase the log odds
of individual i seeking health services. If $\beta_j < 0$, then an increase in $X_{ji}$ (for
example, user fee) will reduce the log odds. If $\beta_j = 0$, then the variable has no
effect.
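The sign interpretation above can be checked numerically. The following is a minimal sketch in Python, assuming purely hypothetical coefficient values (`b0`, `b_income`, `b_fee`) that are not estimates from this study; it evaluates the logit probability of Eq. 3.8 and the log odds of Eq. 3.9.

```python
import math

def logistic(index):
    """Pr(Y = 1) implied by a logit index b0 + b1*x1 + ... (Eq. 3.8)."""
    return 1.0 / (1.0 + math.exp(-index))

def log_odds(p):
    """Log odds ln(p / (1 - p)), the inverse of logistic() (Eq. 3.9)."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients chosen only for illustration (not study estimates).
b0, b_income, b_fee = -1.0, 0.5, -0.3

def prob_seek_care(income, fee):
    """Probability of seeking outpatient care for given income and user fee."""
    return logistic(b0 + b_income * income + b_fee * fee)

base = prob_seek_care(income=2.0, fee=1.0)
richer = prob_seek_care(income=3.0, fee=1.0)     # income up by one unit
costlier = prob_seek_care(income=2.0, fee=2.0)   # user fee up by one unit

# A one-unit rise in income shifts the log odds by exactly b_income,
# and multiplies the odds by exp(b_income).
shift = log_odds(richer) - log_odds(base)
odds_ratio = math.exp(b_income)

print(base, richer, costlier, shift, odds_ratio)
```

The coefficient fixes the change in the log odds, while the change in the probability itself depends on where the individual starts on the logistic curve.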
However, in the case of Eq. 3.10, the $\beta$s indicate changes in the logit index, with
the sign of $\beta$ indicating the direction of the eventual change in the probability of
seeking care from a given health facility. Equation 3.10 is the structural form of the
probabilistic healthcare demand function. In this equation, as in recent literature,
one of the independent variables—health insurance—is endogenous and the esti-
mation has to address this problem. Endogeneity is due to reverse causality between
health insurance and demand for health care. So, in order to obtain unbiased and
consistent estimates, instrumentation of the endogenous variable is required. The
instrumental variable should be correlated with the endogenous regressor but
unrelated directly to the dependent variable (Ajakaiye and Mwabu 2007).
Because health insurance in Eq. 3.10 is endogenous, estimating the equation
without accounting for this problem may produce simultaneity bias arising from
reverse causality between demand for health care and health insurance. Endogeneity of health insurance
arises because the decision to purchase health insurance and the utilization of health
services are intertwined. First, since insurance reduces the effective price of medical
care, insured people tend to consume more health services (Rashad and Markowitz
2009). Second, even if individuals cannot perfectly predict their future health needs,
they are likely to have information about their health status which could lead them
to anticipate higher use of health services and then decide to buy health insurance.
Thus, healthcare utilization not only depends on an individual’s health insurance
coverage, but the level of coverage may also be influenced by the anticipated
utilization of health services (Jutting 2004). Manning et al. (1987) argue that
treating insurance as exogenous in demand for healthcare models produces biased
results because people who anticipate consuming more health services have an
obvious incentive to obtain insurance cover by selecting a more generous
option at the place of employment, by working for an employer with a generous
insurance plan, or by purchasing generous coverage privately.
Existing literature suggests useful methods for dealing with the endogeneity
problem. Among the common approaches to this problem is the use of the
two-stage residual inclusion (2SRI) regression method, which is appropriate for
nonlinear models. The procedure is used to address problems relating to measurement
error, simultaneity, and omitted variables. This method requires identifying
an observable variable or instrument that is correlated with the endogenous
variable but uncorrelated with the error term (Ajakaiye and Mwabu 2007; Kioko
2008; Rosenzweig and Schultz 1982; Strauss and Thomas 1995; Wooldridge 2002).
The problem, however, is identifying an observable variable, $z_i$, that satisfies two
conditions. First, the selected variable must be uncorrelated with the error term. This
means that $\operatorname{cov}(z_i, \varepsilon) = 0$, that is, $z_i$ is exogenous in the structural
equation (see Wooldridge 2002; Behrman and Deolalikar 1988; Griliches and
Mairesse 1998; Ackerberg and Caves 2003). The second requirement involves the
relationship between the identified instrument, $z_i$, and the endogenous variable.
The identified variable should have an impact on health
insurance, that is, $z_i$ must be relevant. Checking this requires regressing health insurance
against all the exogenous variables, including the instrument (Greene 2007; Jowett
et al. 2004; Wooldridge 2002). In this first-stage regression, the instruments should have
significant coefficients when the choice variable is regressed on the identifying
variables together with all other exogenous variables (Ackerberg and Caves 2003).
In the first stage, we estimated the reduced form of health insurance on all
exogenous variables including instrumental variables. In the second stage, we
regressed demand for health care on all independent variables plus insurance and
insurance residuals obtained from the first-stage regression (Palmer et al. 2008;
Terza et al. 2008).
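The two stages described above can be sketched in code. The following is a minimal illustration in plain Python on simulated data, not the authors' implementation: the data-generating process, the single binary instrument `z`, the coefficient values, and all function names are assumptions made here, and the first stage is a linear projection fitted by least squares with no other covariates.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def gram(rows, w):
    """Weighted cross-product matrix X'WX."""
    k = len(rows[0])
    return [[sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(k)]
            for i in range(k)]

def moment(rows, v):
    """Cross-product vector X'v."""
    return [sum(r[i] * vi for r, vi in zip(rows, v)) for i in range(len(rows[0]))]

def ols(rows, y):
    """Least-squares coefficients (X'X)^{-1} X'y."""
    return solve(gram(rows, [1.0] * len(rows)), moment(rows, y))

def sigmoid(t):
    t = max(-30.0, min(30.0, t))  # clamp the index to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-t))

def logit_fit(rows, y, steps=25):
    """Logit maximum likelihood by Newton-Raphson, starting from zero."""
    beta = [0.0] * len(rows[0])
    for _ in range(steps):
        p = [sigmoid(sum(b * v for b, v in zip(beta, r))) for r in rows]
        step = solve(gram(rows, [q * (1.0 - q) for q in p]),
                     moment(rows, [t - q for t, q in zip(y, p)]))
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Simulated data (hypothetical): u is an unobserved health need that raises
# both insurance uptake and care seeking, which makes insurance endogenous.
random.seed(1)
rows1, ins, dem = [], [], []
for _ in range(2000):
    z = float(random.random() < 0.5)            # hypothetical instrument
    u = random.gauss(0.0, 1.0)                  # unobserved heterogeneity
    ins_i = float(-0.5 + 1.5 * z + u + random.gauss(0.0, 1.0) > 0)
    dem_i = float(random.random() < sigmoid(-0.5 + 1.0 * ins_i + 1.0 * u))
    rows1.append([1.0, z])
    ins.append(ins_i)
    dem.append(dem_i)

# Stage 1: linear projection of insurance on the exogenous variables.
gamma = ols(rows1, ins)
resid = [i - sum(g * v for g, v in zip(gamma, r)) for r, i in zip(rows1, ins)]

# Stage 2: logit of demand on insurance plus the first-stage residuals.
beta_2sri = logit_fit([[1.0, i, e] for i, e in zip(ins, resid)], dem)

# Naive logit that treats insurance as exogenous, for comparison.
beta_naive = logit_fit([[1.0, i] for i in ins], dem)

print("2SRI  [const, insurance, residual]:", beta_2sri)
print("naive [const, insurance]:", beta_naive)
```

In this setup the coefficient on the first-stage residual plays the role of the endogeneity check described in the text: a residual coefficient that differs significantly from zero points to endogenous insurance. Second-stage standard errors are not computed here, and because the residual is itself estimated they would generally need adjustment, for example by bootstrapping both stages.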
Following Ajakaiye and Mwabu (2007) and Kabubo-Mariara et al. (2009), we
can re-formulate the demand for health services in the form of simultaneous
equations as
$$D = \delta_d Z_1 + \beta_j I_j + \varepsilon_{ij}, \qquad j = 1, \ldots, 2 \tag{3.11}$$
$$I = \delta_j Z + \varepsilon_2 \tag{3.12}$$
where D and I are demand for health care and health insurance, respectively. Z is a
vector of independent variables consisting of the $Z_1$ covariates that belong to the
demand for health services function and a vector of instrumental variables that
affect insurance but have no direct impact on demand for health services. $\delta$ and $\beta$
are parameters to be estimated and $\varepsilon$ is a disturbance term. Equation 3.11 is the
structural equation to be estimated while Eq. 3.12 is the linear projection of the
potentially endogenous variable I on all the exogenous variables. The system of
equations assumes that there is only one endogenous regressor in the demand
equation.
A major challenge of the instrumental variable approach is obtaining a valid
instrument for identifying the effect of endogenous variables in a structural model.
Once the potential instrument is identified, it is important to test for its suitability by
assessing whether it has three properties: relevance, strength, and exogeneity of
instruments (Kabubo-Mariara et al. 2009; Stock 2010). An instrument satisfying all
three properties is said to be a strong and valid instrument. Following Meer and
Harvey (2004), and after testing for validity and strength, the variables employment
status and community health association membership were used as instruments
for insurance.
We tested for the endogeneity of insurance and the validity of instruments. First,
we carried out the test for endogeneity of health insurance. If insurance were
exogenous, there would be no justification for estimating the structural model of
demand for health care because the logit models would yield unbiased estimates. We
used the Durbin–Wu–Hausman test. The results showed that the Durbin–Wu–Hausman
statistic was significant at the 10% level.
We also conducted the Wald test of exogeneity of the insurance variable, which
was significant at the 1% level. We therefore rejected the null hypothesis of
exogenous insurance. Second, the coefficient on the insurance residual variable
was also significant at the 1% level in the demand equation for medical care
services. Third, we tested the direct impact of the instruments on the dependent
variable; it was found to be insignificant. Fourth, the strength of the instruments
was tested by considering the impact of the instruments on the endogenous variable.
As the coefficients of the instruments were large and significant at the 1% level,
the instruments were judged to be strong. In addition, we conducted an F-test to
check the role of the instruments in explaining the endogenous variable. An
F-statistic of at least 10 is recommended (Kioko 2008; Staiger and Stock 1997);
the minimum eigenvalue statistic for the F-test was 133.04, suggesting that the
null hypothesis of weak instruments had to be rejected.
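The rule of thumb that the first-stage F-statistic should be at least 10 can be illustrated with a small sketch. It assumes a single instrument and a first stage containing only a constant, in which case the F-statistic is the square of the slope's t-statistic; the data are simulated and the function name `first_stage_F` is hypothetical, so this is not the chapter's first-stage regression.

```python
import random

def first_stage_F(w, I):
    """F-statistic for H0: slope = 0 when I is regressed on a constant and w."""
    n = len(w)
    mw, mI = sum(w) / n, sum(I) / n
    sxx = sum((a - mw) ** 2 for a in w)
    sxy = sum((a - mw) * (b - mI) for a, b in zip(w, I))
    slope = sxy / sxx
    rss = sum((b - mI - slope * (a - mw)) ** 2 for a, b in zip(w, I))
    se = (rss / (n - 2) / sxx) ** 0.5
    return (slope / se) ** 2  # with one instrument, F equals t squared

random.seed(1)
n = 2000
w = [float(random.random() < 0.5) for _ in range(n)]       # instrument dummy
strong = [0.5 * a + random.gauss(0, 1) for a in w]          # instrument matters
weak = [0.02 * a + random.gauss(0, 1) for a in w]           # instrument barely matters

F_strong, F_weak = first_stage_F(w, strong), first_stage_F(w, weak)
print("strong instrument: F =", round(F_strong, 1), "| F > 10:", F_strong > 10)
print("weak instrument:   F =", round(F_weak, 1), "| F > 10:", F_weak > 10)
```

With several instruments or additional exogenous covariates, the corresponding statistic is the joint F-test on the excluded instruments, which this sketch does not cover.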
A second estimation issue is heterogeneity bias, which arises when unobserved
factors interact with the variable of interest and thus bias the results.
These factors include unobservable preferences and health endowments of individuals
that influence their demand for health care (Kabubo-Mariara et al. 2009; Schultz
2008). Even with valid instruments, in practice, it is not easy to separate the impact
of endogenous variables from the effect of unobservables in a structural model.
Failure to take heterogeneity into account could lead to unreliable estimates.
In our study, heterogeneity may arise from at least three sources. First, a risk
reduction effect, where the preferred level of utilization under the financial
certainty created by insurance is greater than utilization under uncertainty (Meza
1983). Second, an access effect, where insurance may extend an individual’s
opportunity set by giving access to health care that would otherwise not be
available. Nyman (2005) has argued that the pooling effect of insurance provides access
to expensive medical technologies that would otherwise not be affordable. Third, an
income transfer effect, where insurance creates an ex-post transfer of income from
the healthy to the ill, which may increase utilization through an income effect on
the demand for medical care (Nyman 2005). All three sources are channels through
which health insurance may affect demand for health services for reasons known to
the individual but not to the researcher.
To handle the problem of heterogeneity, we used the control function approach
(CFA) (Florens et al. 2008). This involved estimating the reduced form of insurance
to obtain the insurance residual (I*); including this residual in the demand
equation is identical to 2SRI using an instrument for insurance. Assuming that the
unobserved component was linear in the insurance residual (I*), we introduced an
interaction term between insurance and its residual (II*) as a second control
variable to eliminate endogeneity bias even in a case where the reduced-form
insurance was heteroscedastic (Card 2001).
Introducing the control function variables (insurance residual and interaction)
gives
$$D = b_0 + d_d Z_1 + s I^{*} + c\, I I^{*} + e_1 \qquad (3.13)$$
where I* are the fitted residuals from the reduced form of the insurance variable,
which is explained by Z1; all other variables are as defined earlier. sI* captures the
nonlinear indirect effects of insurance (I) on demand for health services (D),
because the fitted residuals serve as a control for unobservable variables which are
correlated with insurance. Inclusion of both I* and the interaction term II* controls
for the effects of unobservable factors and therefore purges the coefficients of the
effects of the unobservables (Ajakaiye and Mwabu 2007; Card 2001). If any
unobservable variable is linear in I*, it is only the intercept in Eq. 3.13 that is
affected by inclusion of the unobservable variable, and therefore, the 2SRI estimates
are efficient without the interaction term (II*). The 2SRI estimates will be unbiased
and consistent if at least one of two conditions holds: First, the expected value of
the interaction between insurance and its fitted residuals is zero. Second, the
expectation of the interaction between insurance and the fitted residuals is linear.
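A rough sketch of the control function regression, with both the residual I* and the interaction II* as controls, is given below. It is an illustration under stated assumptions rather than the chapter's model: the outcome is linear instead of logit, insurance is continuous, the effect of insurance varies with a normally distributed unobservable, and all coefficients are hypothetical. Under these assumptions the expected coefficient on I is the average effect (1.0) and the expected coefficient on the interaction is 0.25.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

random.seed(2)
n = 5000
data = []
for _ in range(n):
    z1 = random.gauss(0, 1)            # exogenous covariate
    w = float(random.random() < 0.5)   # instrument
    u = random.gauss(0, 1)             # unobservable known to the individual only
    I = 0.3 * z1 + 2.0 * w + u + random.gauss(0, 1)
    # Heterogeneous effect: the impact of I on D depends on the unobservable u.
    D = 1.0 + 0.5 * z1 + (1.0 + 0.5 * u) * I + 2.0 * u + random.gauss(0, 1)
    data.append((z1, w, I, D))
z1s, ws, Is, Ds = map(list, zip(*data))

# Reduced form for I, then the fitted residual I*.
X1 = [[1.0, a, b] for a, b in zip(z1s, ws)]
g = ols(X1, Is)
I_star = [i - sum(c * x for c, x in zip(g, row)) for row, i in zip(X1, Is)]

# Control function regression: D on Z1, I, I*, and the interaction I x I*.
cf = ols([[1.0, a, i, r, i * r] for a, i, r in zip(z1s, Is, I_star)], Ds)
print("coefficient on I:      ", round(cf[2], 3))
print("coefficient on I*:     ", round(cf[3], 3))
print("coefficient on I x I*: ", round(cf[4], 3))
```

A clearly nonzero interaction coefficient is the signal of heterogeneity in this sketch, since the effect of I differs across units with different unobservables.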
The data used in this paper are drawn from the Integrated Household Living
Conditions Survey (EICV2) conducted in 2005 by the National Institute of
Statistics of Rwanda (NISR). This nationally representative survey collected data
from 7620 households and 34,819 individuals. Data were collected at the household
and individual levels. EICV2 aimed at enabling the government to assess the impact
of its different policies and programs which had been implemented for improving
the living conditions of the population in general.
The survey covered all the 30 districts in Rwanda and collected data on a wide
spectrum of socioeconomic indicators—labor, housing, health, agriculture, debt,
livestock and expenditure and consumption in different areas, regions, and locations
in the country. Household level information included consumption expenditures on
health and OOPE (consultation, laboratory tests, hospitalization, and medication
costs). Individual level information included socioeconomic indicators and insur-
ance status. There were also a number of community variables such as distance to
the nearest health facility. To improve the reliability of the data, the recall period for
the use of health services was two weeks prior to the survey. In this paper, demand
for healthcare services was estimated for a single visit because the survey did not
capture multiple visits to health facilities. Hence, the demand for outpatient care
was limited to the last consultation or admission.
3.4 Results and Discussion
In Table 3.1, the Wald chi2 tests measuring goodness of fit indicate that the
estimated models give an adequate description of the data: the statistics are highly
significant, implying that the models’ parameters are jointly different from zero. The
2SRI results are reported in columns 4–5 of Table 3.1 while the first-stage
regression estimates are given in Table 3.3 in Annexure 1. Columns 6–7 in
Table 3.1 present the results of demand for outpatient care after correcting for
heterogeneity of insurance. Due to the inclusion of insurance residuals and inter-
action between insurance residuals and insurance, the results remain close to the
2SRI results in terms of signs of coefficients although they are different in mag-
nitude. The significance of the coefficient on insurance residuals suggests that
insurance is endogenous to outpatient medical care. The coefficient on the
interaction between the insurance residuals and insurance is significant at the 1%
level, indicating the presence of heterogeneity arising from an interaction of
insurance with unobserved determinants of demand for outpatient care. For
comparison purposes, the baseline model (logit) estimates are also presented in columns
2–3. They appear to be weaker than the 2SRI results since the coefficient on health
insurance increases from 0.49 to 0.92 across model specifications (moving from logit
to 2SRI) while remaining statistically significant. This shows that treating
insurance as exogenous substantially understates its impact on demand for outpatient
medical care.

Table 3.1 Logistic demand estimates for outpatient care: Dependent variable is probability of an
outpatient visit

| Explanatory variables | Baseline estimates | z-statistics | 2SRI estimates | z-statistics | Control function estimates | z-statistics |
|---|---|---|---|---|---|---|
| Household income | 0.00030 | 3.50*** | 0.0004 | 3.60*** | 0.003 | 3.40*** |
| User fees | −1.108 | −26.74*** | −0.980 | −15.40** | −1.43 | −18.9*** |
| Quality of health care (=1) | −0.011 | −0.27 | −0.010 | −0.41 | −0.004 | −0.11 |
| Health insurance (=1) | 0.492 | 13.26*** | 0.921 | 1.87* | 4.106 | 29.29*** |
| Distance to the health facility | −0.434 | −8.00*** | −0.072 | −5.2*** | −0.239 | −4.29*** |
| Household size | −0.019 | −2.52** | 0.004 | 1.79* | −0.017 | −2.31** |
| Age | 0.013 | 2.57** | 0.056 | 1.91* | −0.0008 | −0.74 |
| Square age | −0.001 | −2.90** | −0.0051 | −2.79** | −0.0002 | −1.80* |
| Primary (=1) | 0.006 | 1.89* | 0.021 | 3.2** | 0.018 | 2.4** |
| Secondary (=1) | 0.030 | 2.90* | 0.040 | 1.95* | 0.028 | 1.99* |
| Tertiary (=1) | 0.002 | 5.8*** | 0.008 | 4.12*** | 0.067 | 2.02** |
| Male (=1) | −0.163 | −4.44*** | −0.023 | −3.66*** | −0.148 | −3.85*** |
| Urban (=1) | −0.311 | −4.19*** | −0.340 | −5.15*** | −0.164 | −2.14** |
| Kigali region (=1) | −0.035 | −0.45 | −0.070 | −1.43 | −0.024 | −0.26 |
| Southern region (=1) | −0.066 | 1.23 | −0.204 | −2.67** | −0.063 | −1.18 |
| Western region (=1) | 0.027 | 0.53 | 0.024 | 2.40** | 0.035 | 0.68 |
| Northern region (=1) | 0.195 | 3.25*** | 0.17 | 3.54*** | 0.164 | 2.73** |
| Insurance residuals | – | – | −1.3 | −4.7*** | −2.869 | 19.05*** |
| Interaction of insurance and insurance residuals | – | – | – | – | −1.269 | −6.88*** |
| Constant | −2.644 | −24.56 | −1.789 | −5.67 | −2.411 | −25.62 |
| Number of observations | 5040 | | 5040 | | 5040 | |
| LR chi2(19) | 5880.20*** | | 5889.70*** | | 5897.44*** | |
| Log likelihood | −3020.4388 | | −3016.3138 | | −3006.2254 | |

Durbin–Wu–Hausman chi-sq: 0.054*
F(1, 5040) = 133.88
Source Researcher’s construction
Note ***, **, and * = significant at the 1, 5, and 10% levels, respectively
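One way to read the change in the insurance coefficient is on the odds scale, since a logit coefficient is a change in log-odds and its exponential is an odds ratio. The short sketch below applies that conversion to the two insurance coefficients reported in Table 3.1; the conversion is a standard property of the logit model and not a calculation reported by the chapter.

```python
import math

# Coefficients on health insurance as reported in Table 3.1.
coef = {"baseline logit": 0.492, "2SRI": 0.921}

# exp(coefficient) gives the multiplicative change in the odds of an outpatient visit.
odds_ratios = {name: math.exp(b) for name, b in coef.items()}
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```

On this reading, the odds of an outpatient visit for the insured relative to the uninsured are noticeably larger once endogeneity is addressed, which is the same comparison the text makes on the coefficient scale.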
On average, higher user fees reduce the probability of using outpatient health
services. This finding is similar to the results reported by Diop et al. (1995), Litvack
and Bodart (1993), Manji et al. (1992) and Ridde (2003) who report negative effects
of user fees on health service uptake. In particular, Manji et al. (1992) show that
uptake of treatment in Kenyan schools fell from 75 to 19% after fees were intro-
duced. This suggests that the introduction of cost sharing was responsible for a
major part of the reduction in uptake. Similarly, De Bethune et al. (1989) and Yoder
(1989) found the price of health care to be a significant hindrance to demand for
medical services in Swaziland. However, their studies confirmed the results of other
cross-sectional studies that demand for health care is inelastic to price. Oxaal and
Cook (1998) show that the relationship between price and health is inelastic
because of a failure to disaggregate the price effect from the income effect.
The coefficients on education indicate a positive association with demand for
outpatient health services in Rwanda. This result is consistent with Katz et al.
(2001), who show that the more educated individuals become, the more they
come into contact with other educated individuals who have a high demand for health
workplace and leads to the adoption of health-improving behaviors, including
health service utilization. The evidence from Rwanda is also in line with Elo (1992)
and Blunch (2004) who observed a strong positive association between education
and the use of health services.
Insurance was found to be an important determinant of demand for outpatient
medical services in Rwanda. Insurance reduced the price of health care which made
the service more affordable than it would be without insurance. The result on
insurance finds support in findings from previous studies which addressed the
endogeneity problem when estimating the demand effect of insurance (see, for
example, Rashad and Markowitz 2009; Shimeles 2010; Meer and Harvey 2004).
Similar results were reported by Phelps and Newhouse (1974) who used data on
co-insurance plans in the USA, Canada, and the UK. The results were such that the
level of sensitivity of demand depended on the co-insurance rate.
The evidence presented in our paper reveals that gender is an important factor
affecting the use of outpatient health services in Rwanda where females are more
likely to use outpatient services as compared to men. The results are in line with
those reported by Miller (1994) who argued that females demanded more health
care than males because of their role in childbearing. Miller (1994) adds that some
illnesses such as cardiovascular diseases, osteoporosis, immunologic diseases, and
Alzheimer’s disease are more prevalent in women than men. In line with this,
Ahmad (2001) further adds that gender differences in healthcare utilization for
women were related to specific diseases such as cardiovascular and chronic
illnesses.
Some research has shown that women use less outpatient health care than men
because of the time they spend taking care of the elderly and other people with
disabilities. Caregivers, especially women elderly caregivers, were found to neglect
their own health in order to fulfill this responsibility (Fredman et al. 2008). These
responsibilities made it difficult for severely disadvantaged women to take steps to
improve their living situations and health behaviors, so they consumed fewer health
services than men. Similarly, Oxaal and Cook (1998) show that constraints on
access made poor women and girls less likely to have access to appropriate
care and to seek adequate treatment. Their paper notes that the factors
limiting access for women include the socioeconomic status of the household, time
constraints, composition of households, intra-household resource allocation and
decision-making, less education and employment, legal or social constraints on
access to care, heavy work burdens, and the opportunity costs of time in seeking
care.
Given these results, a number of recommendations emerge. Since user fees are
an impediment in using health care in Rwanda, the government should reduce user
fees in health facilities through increased budget allocations for all health facilities,
particularly in the public sector, where the poor go for medical care. From 2003,
OOPE increased gradually to reach 32.2% of the total health expenditure in 2010.
High OOPE has a variety of negative consequences, including household impov-
erishment. Subsidies on user fees should target vulnerable groups such as children
and women or low-income households. The government should also consider
subsidizing private health facilities to increase access to high-quality services by
low-income households. The subsidies will help reduce the effect of income
inequalities on healthcare utilization.
Health insurance is an important determinant of healthcare seeking behavior in
Rwanda. Thus, policies that increase health insurance coverage will substantially
increase health service utilization. The 2013 health insurance coverage rate in
Rwanda was 73%, the highest in the East African Community, but the high
premiums associated with this coverage are not sustainable. The government should
subsidize health insurance to make it accessible to the most disadvantaged people.
The current level of premium (of $4.5) for CBHIs per year, per person should be
reduced. The premium rate more than doubled in 2011 from $1.7 to $4.5, and this
reduced the coverage rate from 91 to 73%. In addition, whereas under the earlier
premium healthcare expenditure represented 10% of total household
expenditure, holding other factors constant, under the new premium it
represents 26% of household expenditure.
This will cause households to incur catastrophic expenditures and push them into
poverty. Further, with an average household size of 6.6 persons, this level of
premium per individual does not seem to be sustainable given that 44.9% of the
population lives on less than $1 per day.
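As a rough worked illustration of the household-level burden implied by these figures, and assuming (the text does not say so) that every member of an average household is enrolled, the annual premium bill before and after the 2011 increase is:

```latex
\[
6.6 \times \$1.7 = \$11.22
\qquad\text{versus}\qquad
6.6 \times \$4.5 = \$29.70,
\qquad
\frac{4.5}{1.7} \approx 2.65 .
\]
```

The ratio of about 2.65 is consistent with the statement that the premium more than doubled.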
Annexure 1
See Tables 3.2 and 3.3.
Table 3.2 Marginal effects for the determinants of outpatient care

| Explanatory variables | Baseline model marginal effects | z-statistics | 2SRI marginal effects | z-statistics |
|---|---|---|---|---|
| Household income | 0.0004 | 3.46*** | 0.00083 | 3.09*** |
| User fees | −0.0810 | −11.47*** | −0.170 | −21.46*** |
| Quality of health care (=1) | −0.0002 | −0.27 | −0.008 | −0.20 |
| Health insurance (=1) | 0.0130 | 10.20*** | 0.942 | 1.99* |
| Distance to the health facility | −0.0120 | −6.13*** | −0.535 | −7.43*** |
| Household size | −0.0004 | −2.51** | 0.011 | 0.77 |
| Age | 0.0003 | 2.56** | 0.005 | 2.13** |
| Square age | −0.0002 | −2.13** | −0.00004 | −2.40* |
| Male (=1) | −0.0030 | −4.38*** | 0.149 | 3.89*** |
| Urban (=1) | −0.0060 | −4.71*** | −0.391 | −4.65*** |
| Kigali region (=1) | −0.0008 | −0.46 | −0.370 | −1.25 |
| Southern region (=1) | −0.0010 | −1.27 | −0.280 | −2.67** |
| Western region (=1) | 0.0006 | 0.52 | 0.140 | 2.01** |
| Northern region (=1) | 0.0050 | 2.76** | 0.317 | 3.94*** |
| Primary (=1) | 0.0001 | 1.96* | 0.001 | 1.98* |
| Secondary (=1) | 0.0004 | 2.50** | 0.023 | 2.10* |
| Tertiary (=1) | 0.0006 | 2.67* | 0.006 | 0.90 |
| Insurance residuals | – | – | 0.0054 | 2.31** |
Table 3.3 Determinants of demand for health insurance, first-stage regression (demand for
outpatient care model)

| Explanatory variables | Estimates | Standard errors | z-statistics |
|---|---|---|---|
| Employment status (=1) | 0.0510 | 0.0064 | 7.90*** |
| Household income | 0.0034 | 0.0004 | 8.50*** |
| User fees | −0.0278 | 0.0231 | −1.20 |
| Quality of health care (=1) | 0.0033 | 0.0069 | 0.47 |
| Distance to the health facility | −0.0483 | 0.0108 | −4.47*** |
| Household size | −0.0132 | 0.0013 | −10.58*** |
| Age | 0.0072 | 0.0008 | 9.20*** |
| Age squared | −0.0001 | 0.00001 | −6.00*** |
| Primary (=1) | 0.0023 | 0.0045 | 5.10*** |
| Secondary (=1) | 0.0052 | 0.0085 | 0.61 |
| Tertiary (=1) | 0.0023 | 0.0087 | 0.26 |
| Male (=1) | 0.0068 | 0.0058 | 1.17 |
| Urban (=1) | 0.0847 | 0.0138 | 6.13*** |
| Kigali (=1) | −0.0385 | 0.0129 | −2.98*** |
| Southern (=1) | −0.0624 | 0.0088 | −7.04*** |
| Western (=1) | 0.0555 | 0.0087 | 6.32*** |
| Northern (=1) | 0.0582 | 0.0099 | 5.87*** |
| Constant | 0.3250 | 0.0174 | 18.62*** |

Number of observations 5040
F(18, 27934) = 56.19***
References
Ackerberg DA, Caves K (2003) Structural identification of production functions: an application to
the timing of input choice. UCLA, Department of Economics, Los Angeles, CA
Ahmad F (2001) Rural physicians perspectives on cervical and breast cancer screening: a
Gender-based analysis. J Women’s Health Gender-Based Med 5(2):201–208
Ajakaiye O, Mwabu G (2007) The demand for reproductive health services: an application of
control function approach. AERC, Nairobi
Akin JS, Griffin DK, Popkin BM (1986) The demand for primary health services in the bicol
region of the Philippines. Econ Dev Cult Change 34(4):755–782
Appleton S, Song L (1999) Income and human development at the household level, evidence from
six countries. Oxford University, Mimeo
Barros P, Machado P (2008) Moral hazard and the demand for health services: a matching
estimator approach. J Health Econ 27(4):1006–1025
Bategeka LO, Asekeny L, Musiime JA (2009) The determinants of birth weight in Uganda.
AERC, Nairobi
Behrman JR, Deolalikar AB (1988) Health and nutrition. In: Chenery H, Srinivasan TN
(eds) Handbook of development economics. Elsevier Science Publishers BV, North Holland
Bhargava A, Jamison D, Lau L, Murray C (2001) Modeling the effects of health on economic
growth. J Health Econ 20:423–440
Blomqvist AG, Carter RAL (1997) Is healthcare really a luxury? J Health Econ 16(2):207–229
Blunch N (2004) Maternal literacy and numeracy skills and child health in Ghana. Paper presented
at the Northeast Universities Development Consortium Conference, HEC Montreal
Buchmueller TC, Grumbach K, Kronick R, Kahn JG (2005) The effect of health insurance on
medical care utilization and implications for insurance expansion: a review of the literature.
Med Care Res Rev 62(1):3–10
Card D (2001) Estimating the return to schooling: progress on some persistent econometric
problems. Econometrica 69(5):1127–1160
Cunningham P, Kemper P (1998) Ability to obtain medical care for the uninsured: How much does
it vary across communities? JAMA 280(10):921–927
De Bethune X, Alfani S, Lahaye JP (1989) The influence of an abrupt price increase on health
service utilization: evidence from Zaire. Health Policy Plan 4(1):76–81
Diop F, Yazbeck A, Bitrán R (1995) The impact of alternative cost recovery schemes on access
and equity in Niger. Health Policy Plan 10:223–240
Elo I (1992) Utilization of maternal health-care services in Peru; the role of women’s education.
Population Studies Center, University of Pennsylvania
Florens JP, Heckman MC, Vytlacil E (2008) Identification of treatment effects using control
functions in models with continuous, endogenous treatment and heterogeneous effects.
Econometrica 76(5):1191–1206
Fredman L, Cauley JA, Satterfield S (2008) Caregiving, mortality, and mobility decline: the health,
aging, and body composition (Health ABC) study. Arch Intern Med 168:2154–2162
Greene WL (2007) Econometric analysis. Macmillan Publishing Company, New York
Griliches Z, Mairesse J (1998) Production functions: the search for identification. National Bureau
of Economic Research, Working paper no. 5067
Grossman M (1972) On the concept of health capital and the demand for health. J Polit Econ 80
(2):223–235
Hahn B (1994) Healthcare utilization: the effect of extending insurance to adults on medicaid or
uninsured. Med Care 32(3):227–239
Jayaraman AS, Chandrasekhar T, Gebreselassie T (2008) Factors affecting maternal healthcare
seeking in Rwanda. USAID, Working paper
Jones AM, Koolman X, Doorslaer EV (2006) The impact of having supplementary private health
insurance on the use of specialists in European Countries. Anales d’Economie et de Statistique
83(84):251–275
Jowett M, Deolalikar A, Martinsson P (2004) Health insurance and treatment seeking behaviour:
evidence from a low-income country. Health Econ 13:845–857
Jütting JP (2003) Do community-based health insurance schemes improve poor people’s access to
healthcare? Evidence from rural Senegal. World Dev 32:273–288
Katz L, Kling J, Liebman J (2001) Moving to opportunity in Boston: early results of a randomized
mobility experiment. Q J Econ 116(2):607–654
Kioko MU (2008) The economic burden of malaria in Kenya: a household level investigation. PhD
thesis, University of Nairobi
Lawson D (2004) A microeconomic analysis of health, healthcare and chronic poverty, The
University of Nottingham (unpublished)
Lépine A, Nestour A (2008) Healthcare utilization in rural Senegal: the factors before the
extension of health insurance to farmers. International Labor Office, Research paper no. 2
Lindelow M (2002) Healthcare demand in rural Mozambique: evidence from 1996/97 household
survey. International Food Policy Research Institute (IFPRI), FCND discussion paper no. 126
Litvack JI, Bodart C (1993) User fees plus quality equals improved access to health care: results of
a field experiment in Cameroon. Soc Sci Med 37:369–383
Manji JE, Moses SF, Bradley NJ, Nagelkerke MA, Plummer FA (1992) Impact of user fees on
attendance at a referral centre for sexually transmitted diseases in Kenya. Lancet 340:463–466
Manning WG, Newhouse JP, Duan N, Keeler EB, Leibowitz A (1987) Health insurance and the
demand for medical care: evidence from a randomized experiment. Am Econ Assoc Rev 77
(3):251–277
McCool JH, Kiker BF, Ng YC (1994) Estimates of the demand for medical care under different
functional forms. J Appl Econometrics 9(2):201–218
McKeown T (1976) The role of medicine: dream, mirage or nemesis? Basil Blackwell, Oxford
Meer J, Harvey SR (2004) Insurance and the utilization of medical services. Soc Sci Med
58:1623–1632
Meza D (1983) Health insurance and the demand for healthcare. J Health Econ 2:47–54
Miller L (1994) Medical schools put women in curricula. Wall Street J B1
Ministry of Health (MoH), Republic of Rwanda (2009) Health sector strategic plan (unpublished)
Ministry of Health Rwanda (2009) Rwanda health financing policy review of Rwanda—options
for universal coverage. World Health Organization
Mocan NH, Tekin E, Jeffrey SZ (2004) The demand for medical care in urban China. World Dev
32(2):289–304
Mwabu KMJD, Nd’enge GK (2009) The consequences of fertility for child health in Kenya:
endogeneity, heterogeneity and the control function approach. AERC, Nairobi
Mwabu GJ, Wang’ombe B, Nganda B (2003) The demand for medical care in Kenya. African
Development Bank, Oxford
National Institute of Statistics of Rwanda (NISR) (2011) Preliminary results of interim
demographic and health survey 2010. NISR, Kigali
Nyman JA (2005) Health insurance theory: the case of the missing welfare gain, University of
Minnesota (unpublished)
Oxaal Z, Cook S (1998) Health and poverty gender analysis. Swedish International Development
Co-operation (unpublished)
Palmer T, Thompson J, Tobin M, Sheehan N, Burton P (2008) Adjusting for bias and unmeasured
confounding in Mendelian randomization studies with binary responses. Int J Epidemiol
37(5):1161–1168
Phelps CE, Newhouse JP (1974) Coinsurance, the price of time, and the demand or medical
service. Rev Econ Stat 56:334–342
Rashad IK, Markowitz S (2009) Incentives in obesity and health insurance. Inquiry 46:418–432
Ridde V (2003) Fees-for-services, cost recovery, and equity in a district of Burkina Faso operating
the Bamako Initiative. Bull World Health Organ 81:532–538
Ringel JS, Hosek SD, Ben AV, Mahnovski S (2002) The elasticity of demand for healthcare.
A review of the literature and its application to the military health system. National Defense
Research Institute (unpublished)
Rosenzweig MR, Schultz TP (1982) The behavior of mothers as inputs to child health: the
determinants of birth weight, gestation, and the rate of fetal growth. In: Fuchs VR
(ed) Economic aspects of health. The University of Chicago Press, Chicago, pp 53–92
Saksena P, Xu K, Elovaino R, Perrot J (2010) Health services utilization and out-of-pocket
expenditure at public and private facilities in low-income countries. World Health
Organization, Background paper no. 20, Geneva
Santerre RE, Neun SP (2010) Health economics: theories, insights and industry studies, 5th ed.
Cengage Learning
Sauerborn R, Nougtara A, Latimer E (1994) The elasticity of demand for healthcare in Burkina
Faso: differences across age and income groups. Health Policy Plan 9(2):186–192
Schultz TP (2008) Population policies, fertility, women’s human capital. In Schultz TP, Strauss J
(eds) Handbook of development economics, vol 4, Chap. 52. Elsevier, Amsterdam
Shimeles A (2010) Community based health insurance schemes in Africa: the case of Rwanda.
African Development Bank group, Working paper no. 120
Staiger D, Stock JH (1997) Instrumental variables regression with weak instruments.

Stock JH (2010) The other transformation in econometric practice: tools for inference. J Econ
Perspect 24(2):83–94
Strauss J, Thomas D (1995) Human resources: empirical modeling of household and family
decisions. In: Behrman J, Srinivasan TN (eds) Handbook of development economics, vol 3A.
Elsevier Science, Amsterdam, pp 1883–2023
Terza J, Basu A, Rathouz P (2008) Two-stage residual inclusion estimation: addressing
endogeneity in health econometric modeling. J Health Econ 27(3):531–543
The World Bank (2001) World Bank Development Report 2000–2001: attacking poverty. Oxford
University Press, Washington, DC
Wooldridge JM (2002) Econometric analysis of cross-section and panel data. MIT Press,
Cambridge, MA
Yoder R (1989) Are people willing and able to pay for health services? Soc Sci Med 29(1):35–42
Part II
The Impact of Institutions, Aid, Inflation
and FDI on Economic Growth
Chapter 4
The Impact of Institutions on Economic
Growth in Sub-Saharan Africa: Evidence
from a Panel Data Approach
Kokeb G. Giorgis
Abstract This study sheds light on the effect of institutional variables on economic
growth in sub-Saharan African countries. It empirically analyzes the impact of
institutional quality, proxied by control of corruption, government effectiveness,
and an index of protection of property rights, among others, on economic growth in
21 sub-Saharan African countries during the sample period 1996–2012. The
methodology is based on the first-differenced GMM estimator proposed by Arellano
and Bond (Rev Econ Stud 58(2):277–297, 1991) for dynamic panel data, which is
robust to individual fixed effects, heteroskedasticity, and autocorrelation in the
presence of endogenous covariates. The results indicate that improving institutional
quality, specifically protecting property rights, on average made a positive
contribution to growth in output per capita in the sampled countries, though its effect
was small. Institutional variables such as control of corruption and
government effectiveness also had a positive effect on growth, though they were
statistically insignificant. These findings agree with some of the studies conducted
so far on the effect of institutions on growth.
Keywords Economic growth · Institutions · Panel data · GMM · Sub-Saharan Africa
4.1 Introduction
After many African countries gained independence in the 1950s and 1960s, there was
a widely held expectation that poor countries in Africa would ‘catch up,’ that is,
converge in per capita income terms with developed countries. However, this
proved to be an unrealistic expectation: more than half a century after
independence, the continent is still the poorest in the world by any standard, and more
than a quarter of its population is estimated to be food-insecure. Achieving high and
persistent economic growth is a prerequisite for reducing widespread poverty, yet
for many years after independence most African countries failed to
achieve even moderate economic growth rates. It is only in the last eight years or so
that African countries have started recording moderate growth rates. So the puzzle
is why the African continent and other poor countries in the world are still poor
whereas others are at the top of the per capita income ladder.

K.G. Giorgis
Department of Economics, Addis Ababa University, Addis Ababa, Ethiopia
e-mail: kokebgg@gmail.com
DOI 10.1007/978-981-10-4451-9_4
Because macroeconomic and institutional data for sub-Saharan Africa are scarce and
incomplete, the determinants of Africa’s poor economic growth record remain
worth investigating. Since the 1990s, with increasing data availability,
cross-country regression analyses have indicated that the ‘classical determinants of
growth’ such as level of technology, international trade, availability of natural
resources or population have not fully explained the poor performance of growth in
many poor countries. During the last 20 years or so, growth economists have
increasingly referred to institutions as answers to the long-standing question con-
cerning what determines economic growth.
There are two rationales for our study. First, studies which assess the effect of
institutions on economic growth in developing countries and particularly in
sub-Saharan Africa are limited and inadequate even though theoretically the
importance of institutions affecting growth is getting more emphasis. A paper worth
mentioning here is a study by Naude (2004) in sub-Saharan Africa which inves-
tigates the effect of policy, institutions, and geography on economic growth in the
continent. It used panel data from 1970 to 1990 using data from 44 African
countries. This study is justified, among other things, by the fact that it used not
one but three major indicators of institutions to achieve its objectives. Second,
although a few studies have been done so far on the effect of institutions on growth in developing
and emerging countries, there is no consensus on which specific indicators of an
institution matter the most for growth. Thus, our study also investigated institu-
tions’ indicators that are important in affecting growth in the context of Africa.
The literature offers two broad classifications of institutions for determining how
institutions affect growth: informal ones, represented by social capital or culture
(such as the work culture of a society), and formal ones, such as laws or
regulations. Our paper is based on the formal classification, where, according to
North (1990), institutions are defined as follows:
the rules of the game in a society, or more formally, the humanly devised constraints that
shape human interaction … they structure incentives in human exchange, whether political,
social, or economic.
By including proxy variables for institutional quality in traditional growth
equations such as the Solow-Swan growth model (the neo-classical growth equation),
the effect of institutions on economic growth can be assessed.
As far as a proxy for institutional quality is concerned, our paper uses ‘protection
of property rights,’ ‘corruption and graft,’ and ‘government effectiveness.’ Thus,
our paper tries to investigate the determinants of Africa’s poor economic growth
4 The Impact of Institutions on Economic Growth in Sub-Saharan … 65
record taking into account the effects of institutions using the Arellano-Bond GMM
estimator. The regression is based on panel data from 21 sub-Saharan African
countries covering the period 1996–2012.
This paper is organized as follows: Sect. 4.2 provides a brief review of the
theoretical and empirical literature; Sect. 4.3 presents descriptive statistics of
growth and institutional patterns in sub-Saharan Africa during the sample period;
Sect. 4.4 describes the empirical methodology; Sect. 4.5 presents the results; and
Sect. 4.6 concludes.
4.2.1 Theoretical Review of Institutions Versus Growth
Growth literature uses three major theories to explain the difference in output per
capita among nations. First, the neo-classical and endogenous growth theories have
long recognized that differences in output per capita are intimately related to
differences in the amount of human capital, physical capital, and technology that
workers and firms in a country have access to. For instance, the Solow model
emphasizes capital accumulation as a major driver of growth (Solow 1956), while
Grossman and Helpman’s (1991) theoretical model highlights the quality of the capital
stock as a means of boosting growth. Second, the geographic theory explains how
essential the geographic location of a country is in affecting its growth; this is
linked to market access and climatic conditions. Theoretical and empirical research
has so far found strong causality between geographic location and the level of income
in a country. Third, the most recent theory takes an institutional approach and
emphasizes the importance of institutions in affecting growth.
Institutions are often seen as providing the ‘rules of the game’ that set up the
baseline conditions for human interactions and consequently have an impact on social,
economic, and political relationships in a society. Institutions include the moral,
ethical, and behavioral norms of a society, and as such they matter for growth and
development (Nelson and Sampat 2001).
To empirically analyze the effect of institutions on economic growth, it is
important to identify which types of institutions are more important in affecting
economic growth. Different researchers and international organizations including
the Heritage Foundation have different classifications of institutions depending on
their respective objectives. According to literature, there are at least three types of
institutions: political, economic, and financial. The quality of each of these types of
institutions is measured through different variables. For example, the main variables
of economic institutions are protection of property rights; regulation and the
business freedom index; freedom in doing business; financial freedom; investment
freedom; and the quality of the regulation system. The main variables for political
66 K.G. Giorgis
institutions are the rule of law that contains the rule of law index, controlling
corruption and corruption freedom, and other variables.
Our study used the main economic and political institutional indicators which
are expected to have an impact on economic growth in the context of Africa. With
this objective, the three indicators used are ‘protection of property rights,’ ‘control
of corruption,’ and ‘government effectiveness.’
When it comes to the extent to which institutional aspects such as property
rights, incentive structures, and transaction costs affect economic growth, North
(1981) was a pioneer who developed the contract and predatory theories by
extending the neo-classical theory to include institutional variables. The contract
theory states that if contracts are well enforced, they contribute to the efficiency
of business and society. If a state provides a legal framework that reduces
transaction costs, productivity and innovation increase. On the other side, the
predatory theory treats the state as a vehicle for collecting monopolistic rents and
transferring resources among different groups in order to maximize incomes.
4.2.2 Empirical Review of Institutions Versus Growth
The question of how important institutions are in promoting growth in developing and
emerging economies has attracted renewed interest in recent years. As a result, a
growing literature seeks to determine the extent to which institutions (economic
and/or political) affect growth. However, the dearth and limitations of both
institutional and macroeconomic data for many developing countries, including
those in sub-Saharan Africa, prevent robust policy interpretations on a
country-by-country basis.
A study by Hall and Jones (1999) focused on explaining the enormous differ-
ences in per capita incomes among countries. Their empirical findings suggest that
differences in capital accumulation, productivity, and ultimately in per capita
income are due to differences in institutions and government policies. The authors
also argue that, after controlling for the endogeneity of institutions and government
policies, long-run economic performance is primarily determined by social
infrastructure, which drives differences in capital accumulation and productivity.
Rodrik et al. (2004) empirically investigated the contribution of institutions,
geography, and trade on differences in per capita incomes across countries. Their
study found that the effect of institutions was higher compared to the effects of
geography and trade in explaining differences in per capita incomes across countries.
Redek and Sušjan (2005) used panel data from 1995 to 2002 for 24 transition
economies, the former socialist economies of Eastern Europe, to examine the effect of
institutional quality proxied by private property protection, the legal system,
regulation, government intervention, and international relations, all drawn from the
Heritage Foundation index. Their study confirmed that the better the protection and
regulation of property rights, the lower the fiscal burden and the higher the growth.
That is, as institutional quality increased by 1%, the government’s fiscal burden
decreased by 0.03%. Similarly, Naude (2004) pursued the same objective, but using data
from 44 African countries and employing both single-year cross-section data and panel
data covering the period 1970–90. For comparative purposes, the study used different
econometric estimation methods including a dynamic Arellano-Bond GMM estimator.
Moreover, the study used three proxies for institutional quality: ethno-linguistic
heterogeneity (ELH), corruption and graft, and the incidence of revolutions and coups.
Based on the GMM estimator, the author concluded that none of these had a significant
impact on growth but supported Acemoglu et al.’s (2001) ‘reversal of fortune’ thesis,
namely that settler mortality (instrumenting for the quality of institutions) is
inversely related to economic growth.
Likewise, a study by Valeriani and Peluso (2011) analyzed the impact of
institutions on economic growth and examined whether the eventual impact differed
depending on the level of development in a country. They used panel data from
1950 to 2009 for 181 countries (both developing and developed) through a pooled
regression model and a fixed effects model. They employed institutional indicators
of civil liberties, number of veto players, and quality of government and found that
institutional quality impacted economic growth in a positive way. This was true for
all three institutional indicators that were examined. The only difference between
how developing and developed countries were affected by institutional quality was
the size of the impact and not in the direction of it. On a more specific level, out of
the three institutional indicators, improved civil liberties had a greater effect on
economic growth in developing countries, whereas the number of veto players
assumed more importance for developed countries’ economies.
With a similar objective, a study by Dushko et al. (2011) used cross-country data
from 212 groups of countries and geographic regions and applied different
econometric models (OLS, G2SLS, 2SLS). It used the rule of law, revolutions, and
Freedom House ratings as well as war casualties as indicators of institutional
quality. Their study found that in all the models used, institutional quality had a
positive and significant effect in enhancing GDP per capita on average for the
sampled countries during the study period.
Acemoglu and Robinson (2010) investigated whether political or economic
institutions should be given primacy. Even though their study emphasized that
differences in prosperity across countries were due to differences in economic
institutions, it also underscored that without building strong political institutions it
was not possible to build strong economic institutions which could facilitate growth
because economic institutions are the outcome of a political process. Hence, the
study deduced that solving the problem of development entailed understanding what
instruments can be used to push a society from a bad to a good political equilibrium.
Unlike Acemoglu and Robinson’s (2010) study, Glaeser et al.’s (2004) study
had the objective of exploring the causal link between institutions and growth. It
confirmed that rather than political institutions, human capital had a causal effect on
economic growth. Importantly, in that framework, institutions did not directly affect
growth.
In general, the empirical literature discussed here indicates positive relationships
between the different indicators of institutional quality and cross-country income
differences. This means that better institutions foster long-run economic growth,
and countries with better institutions have higher per capita income levels. But it is
also important to stress that not all indicators of institutional quality are equally
important for countries at different levels of development. Moreover, it is also clear
from the reviewed literature that there is a dearth of studies on the effect of
institutions on growth in general and in sub-Saharan Africa in particular. There is
also a lack of consensus on which economic or political indicators of institutions are
important in affecting growth and hence per capita income differences among
countries.
4.3 Descriptive Statistics of Economic Growth and Institutional Quality in Sub-Saharan Africa
Our sample consists of 21 sub-Saharan African countries. It excludes countries such
as Somalia, Eritrea, and others due to missing or incomplete data on one or more of
the variables of interest. In particular, data on institutional variables such as
protection of property rights, controlling corruption, and government effectiveness
were incomplete for a number of countries in the region. The sampled countries are
Burkina Faso, Botswana, Cameroon, Ethiopia, Ghana, Guinea, Kenya, Lesotho,
Mozambique, Mauritania, Mauritius, Malawi, Namibia, Niger, Nigeria, Rwanda,
Senegal, Chad, Togo, Uganda, and South Africa.
Period average growth in GDP per capita was near 2.6%, with substantial differences
across countries; these differences were due to internal or external factors
(Table 4.1). During 1996–2012, the sampled countries experienced an average real GDP
growth rate of 5.16%, and growth intensified in the last five years. Table 4.1 also
shows that gross capital formation as a percentage of GDP was around 21%.
Table 4.1 Summary of explanatory variables (1996–2012)

Variable                               Mean      Std. dev.  Min.      Max.
GDP (million USD)                      2938.85   5010.85    1062.11   307,313
GDP growth rate (%)                    5.16      4.14       −9.52     33.74
GDP per capita                         1291.57   1757.25    128.24    6683.66
GDP per capita growth rate (%)         2.55      3.95       −12.18    30.344
Gross capital formation as % of GDP    21.14     8.37       5.46      74.82
Government spending % GDP              15.22     6.90       3.86      39.50
FDI_GDP as % of GDP                    0.23      0.20       0.004     1.31
Gross enrollment rate/schooling        36.0      25.0       2.22      101.89

Source Author’s calculations based on World Bank data
Table 4.2 Descriptive statistics for measures of institutional quality

Variable                        Component  Mean    Std. dev.  Min.    Max.
Control of corruption           Overall    38.91   22.34      1.46    85.85
                                Between            21.48      10.03   78.48
                                Within             7.65       13.98   68.39
Protection of property rights   Overall    40.57   14.91      10      75
                                Between            13.09      23.52   70.29
                                Within             7.66       19.40   65.2
Government effectiveness        Overall    37.36   20.71      2.42    79.02
                                Between            19.78      7.82    72.1
                                Within             7.42       15.98   62.9

Source Author’s calculations based on World Bank data
Table 4.2 shows that the average measures of institutional quality for the study
period were no greater than about 40%, implying that sub-Saharan African countries had
poor quality institutions. One can also see that there were large differences between
the sampled countries. For example, for controlling corruption the lowest country
average is about 10% while the highest is around 78%, which shows that there was a
clear difference among countries in the region; this was also true for the other two
variables.
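The overall, between, and within rows of Table 4.2 can be illustrated with a small sketch. It assumes the decomposition follows the convention used by panel summary commands such as Stata's xtsum (between = dispersion of country means, within = dispersion of deviations from each country's own mean, re-centred on the grand mean); the chapter does not state this explicitly, and the two-country panel below is purely hypothetical.

```python
from statistics import mean, stdev


def panel_sd_decomposition(panel):
    """Overall, between and within standard deviations of a panel variable.

    panel maps each country name to its list of yearly index values.
    """
    all_values = [v for series in panel.values() for v in series]
    grand_mean = mean(all_values)
    unit_means = {c: mean(series) for c, series in panel.items()}
    # Within values: deviations from the country mean, re-centred on the grand mean.
    within_values = [v - unit_means[c] + grand_mean
                     for c, series in panel.items() for v in series]
    return {
        "overall": stdev(all_values),
        "between": stdev(list(unit_means.values())),
        "within": stdev(within_values),
    }


if __name__ == "__main__":
    # Hypothetical index values for two made-up countries, three years each.
    toy_panel = {"A": [10, 20, 30], "B": [40, 50, 60]}
    print(panel_sd_decomposition(toy_panel))
```

In this toy panel the between component exceeds the within component, which is the same pattern Table 4.2 reports: most of the variation in institutional quality is across countries rather than over time.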
Figure 4.1 shows the index for control of corruption from the Worldwide Governance
Indicators (WGI) of the World Bank, which could serve as a proxy for a country’s
level of institutional development. It indicates the degree of corruption within a
given political system by taking into consideration financial corruption (import and
export licenses, exchange controls, tax assessments, or police protection), as well
as the following forms of corruption: patronage, nepotism, job reservations,
‘favor-for-favors,’ and secret party funding. On average, for the sample period,
corruption was worst in Nigeria, Niger, Chad, Cameroon, and Kenya, while Botswana,
South Africa, Mauritius, and Namibia had relatively better control over corruption.
Fig. 4.1 Rank of sub-Saharan African countries by average control of corruption
(1996–2012). Source Author’s calculations based on WGI data, the World Bank
Fig. 4.2 Average rank of sub-Saharan African countries by government effectiveness
(1996–2012). Source Author’s calculations based on WGI data, the World Bank
Another proxy for institutional development is the quality of government poli-
cies which is analyzed through WGI’s government effectiveness index. This is a
multi-dimensional index which reflects both the quality of public services and of
civil services. It accounts for the quality of policies formulated and implemented,
for political pressures and also for the government’s credibility. The country with
the lowest level of government effectiveness was Togo, followed by Chad and
Nigeria. South Africa and Botswana registered the highest average on the gov-
ernment effectiveness index during 1996–2012 (Fig. 4.2).
Figure 4.3 shows the average percentage of protection of property rights for the
sampled countries in the study period. Botswana and Mauritius were relatively
better in the protection of property rights. Rwanda, Chad, and Togo showed poor
performance.
4.4 Data and Methodology
Following North (1981) our paper assesses the effect of institutions on economic
growth. For this, one can incorporate a proxy for institutions in the neo-classical
growth model. To do so we started with the aggregate production function which
describes how inputs (labor, physical and human capital, and technology) are
combined to produce the output:
Fig. 4.3 Average rank of protection of property rights for sub-Saharan African countries
(1996–2012) (%). Source Author’s calculations based on WGI data, the World Bank
\[ Y_{it} = A_t K_t^{\theta} H_t^{\beta} L_t^{1-\theta-\beta} \qquad (4.1) \]
where Y is output, K is physical capital, H is human capital, L is labor, and the
parameter A represents the level of technology in the economy. Human capital is the
knowledge, skills, and abilities of people who are or who may be involved in the
production process, while the labor force is the number of people who are able to
work. Rewriting Eq. 4.1 in per capita form yields:
\[ y_{it} = A_t k_t^{\theta} h_t^{\beta} \qquad (4.2) \]
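The step from Eq. 4.1 to Eq. 4.2 can be written out explicitly. The sketch below assumes, as is standard but not stated in the text, that lower-case letters denote per-worker quantities (y = Y/L, k = K/L, h = H/L) and that the exponents in Eq. 4.1 are θ, β, and 1 − θ − β:

```latex
\begin{aligned}
y_{it} = \frac{Y_{it}}{L_t}
  &= A_t \, K_t^{\theta} H_t^{\beta} L_t^{1-\theta-\beta} \, L_t^{-1} \\
  &= A_t \left(\frac{K_t}{L_t}\right)^{\theta} \left(\frac{H_t}{L_t}\right)^{\beta}
   = A_t \, k_t^{\theta} h_t^{\beta}
\end{aligned}
```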
Traditional macroeconomic growth models implicitly assumed an underlying set of good
institutions. Hence, they did not take into account the influence of institutional
quality as a factor of economic growth. However, because institutions play an
important role in the growth process, economists have sought to incorporate
institutional quality into growth models. Thus,
\[ A_t = A_0 \, k_t^{\phi_1 (I - I^*)} \, h_t^{\phi_2 (I - I^*)} \qquad (4.3) \]
where A0 represents the basic level of technology, I* and I denote the best-quality
institutions and the country’s current level of institutional quality, respectively. The
traditional growth model considers that economies function close to best-quality
institutions hence in these models I = I*. This reduces the effect of institutional
quality to zero. However, since North (1981) more recent growth theories recognize
the importance of institutions. Accordingly, the mathematical statement, I − I*,
measures the degree to which a country’s institutions fall short of the best
conditions.
Therefore, substituting Eq. 4.3 in the equation on the production function per
worker, and rewriting it, gives the following:
\[ y_{it} = A_0 \, k_t^{\theta + \phi_1 (I - I^*)} \, h_t^{\beta + \phi_2 (I - I^*)} \qquad (4.4) \]
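A minimal numerical sketch of the production function with institutions may help fix ideas. It assumes the exponents are θ + φ1(I − I*) on physical capital and β + φ2(I − I*) on human capital; every parameter value and input below is hypothetical and chosen only for illustration, not taken from the chapter's estimates.

```python
def output_per_worker(k, h, inst, inst_best, a0=1.0,
                      theta=0.3, beta=0.3, phi1=0.002, phi2=0.002):
    """Output per worker when institutions enter the capital exponents.

    y = A0 * k**(theta + phi1*(I - I*)) * h**(beta + phi2*(I - I*))
    """
    gap = inst - inst_best  # I - I*: zero at best practice, negative otherwise
    return a0 * k ** (theta + phi1 * gap) * h ** (beta + phi2 * gap)


def baseline_output(k, h, a0=1.0, theta=0.3, beta=0.3):
    """Traditional model: institutions at best practice, so A_t = A0."""
    return a0 * k ** theta * h ** beta


if __name__ == "__main__":
    k, h = 50.0, 10.0  # hypothetical capital stocks per worker
    print(baseline_output(k, h))
    print(output_per_worker(k, h, inst=100, inst_best=100))  # same as baseline
    print(output_per_worker(k, h, inst=40, inst_best=100))   # lower output
```

With I = I* the institutional terms vanish and the function collapses to the traditional specification; with I below I* (and k, h greater than one) the returns to both kinds of capital, and hence output, are lower.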
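To study the dynamics of output per capita, taking the log of Eq. 4.4 and its
derivative with respect to time (t) and rearranging gives the following:
\[ \frac{\Delta y_t}{y_t} = \frac{\Delta A_0}{A_0} + \left[ (\theta - \phi_1 I^*) + \phi_1 I \right] \frac{\Delta k_t}{k_t} + \left[ (\beta - \phi_2 I^*) + \phi_2 I \right] \frac{\Delta h_t}{h_t} \qquad (4.5) \]
Let \(\pi_0 = \frac{\Delta A_0}{A_0}\); \(\pi_1 = \theta - \phi_1 I^*\); \(\pi_2 = \beta - \phi_2 I^*\); \(\pi_3 = \phi_1 \frac{\Delta k_t}{k_t} + \phi_2 \frac{\Delta h_t}{h_t}\),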
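and adding an error term \(\varepsilon\) gives the growth rate of output per capita as follows:
\[ \frac{\Delta y_t}{y_t} = \pi_0 + \pi_1 \frac{\Delta k_t}{k_t} + \pi_2 \frac{\Delta h_t}{h_t} + \pi_3 \, \Delta I + \varepsilon \qquad (4.6) \]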
The coefficient estimates for \(\pi_1\) and \(\pi_2\) measure the return to physical and
human capital investments, while coefficient \(\pi_3\) measures an increasing return to
physical and human capital investments as the country’s institutional quality improves.
Therefore, Eq. 4.6 is used to test the impact of institutions on growth, where \(\pi_3\)
measures the effect of a change in institutional quality on growth through a change
in the productivity of both human and physical capital.
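The log-differentiation that leads from the production function with institutions to the growth equation can be written as a short worked step. This sketch assumes that the institutional gap I − I* is held fixed while differentiating and that time derivatives are approximated by discrete changes Δ; neither assumption is stated explicitly in the text.

```latex
\begin{aligned}
\ln y_t &= \ln A_0 + \left[\theta + \phi_1 (I - I^*)\right] \ln k_t
          + \left[\beta + \phi_2 (I - I^*)\right] \ln h_t \\
\frac{\Delta y_t}{y_t} &\approx \frac{\Delta A_0}{A_0}
          + \left[(\theta - \phi_1 I^*) + \phi_1 I\right] \frac{\Delta k_t}{k_t}
          + \left[(\beta - \phi_2 I^*) + \phi_2 I\right] \frac{\Delta h_t}{h_t}
\end{aligned}
```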
To investigate the impact of institutions on economic growth, we used the
first-differenced GMM estimator proposed by Arellano and Bond (1991) for
dynamic panel data. Thus, Eq. 4.6 can further be rewritten in dynamic panel
specification as follows:
\[ \ln GDP_{it} = \mu_0 + \mu_1 \ln GDP_{i,t-1} + \mu_2 I_{i,t} + \mu_3 \ln X_{i,t} + \eta_i + \varepsilon_{i,t} \qquad (4.7) \]
where \(y_t = \ln GDP_{it}\) represents the natural logarithm of real GDP per capita
expressed in constant 2000 US$ for country i at time t, and hence \(\Delta y_t / y_t\) is the
growth rate of GDP per capita as discussed earlier. \(I_{i,t}\) stands for the institutional
variables for country i at time t (controlling of corruption, government effectiveness,
and protection of property rights). \(X_{i,t}\) represents both physical and human capital
variables as discussed earlier and other macroeconomic control variables. \(\eta_i\)
signifies the individual fixed effects specific to each country, which are constant
over time. \(\varepsilon_{i,t} \sim N(0, \sigma^2)\) is a random disturbance term.
Estimating Eq. 4.7 by OLS raises several concerns. First, the presence of the lagged
dependent variable \(\ln GDP_{i,t-1}\), which is correlated with the fixed effects
\(\eta_i\), gives rise to a dynamic panel bias (Nickell 1981). The coefficient estimate
for lagged ln GDP is inflated because it is credited with predictive power that actually
belongs to the country’s fixed effects. Moreover, estimating a panel data model with a
lagged dependent variable leads to biased results, at least in samples with a short
time dimension (Judson and Owen 1999).
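The dynamic panel bias described above can be made concrete with a small simulation. The sketch below is illustrative only: it uses the standard library, a hypothetical data-generating process (500 units, 6 periods, a true lag coefficient of 0.5, unit-variance fixed effects and shocks), and no other regressors. It compares pooled OLS, which ignores the fixed effects, with the within (fixed-effects) estimator.

```python
import random


def simulate_panel(n_units, n_periods, mu1, seed=0, burn_in=50):
    """Simulate y_it = mu1 * y_i,t-1 + eta_i + eps_it (hypothetical process)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        eta = rng.gauss(0.0, 1.0)      # unit-specific fixed effect
        y = eta / (1.0 - mu1)          # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = mu1 * y + eta + rng.gauss(0.0, 1.0)
            if t >= burn_in:           # discard the burn-in periods
                series.append(y)
        panel.append(series)
    return panel


def ols_slope(xs, ys):
    """Slope of a simple regression of ys on xs (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pooled_and_within(panel):
    """Return the pooled OLS and within estimates of the lag coefficient."""
    pooled_x, pooled_y, within_x, within_y = [], [], [], []
    for series in panel:
        lagged, current = series[:-1], series[1:]
        pooled_x += lagged
        pooled_y += current
        mean_lag = sum(lagged) / len(lagged)
        mean_cur = sum(current) / len(current)
        # Demeaning by unit removes eta_i but correlates the regressor with the error.
        within_x += [v - mean_lag for v in lagged]
        within_y += [v - mean_cur for v in current]
    return ols_slope(pooled_x, pooled_y), ols_slope(within_x, within_y)


if __name__ == "__main__":
    pooled, within = pooled_and_within(simulate_panel(500, 6, 0.5, seed=1))
    print("true 0.5, pooled OLS", round(pooled, 3), "within", round(within, 3))
```

Under these assumptions the pooled estimate lies above the true value, while the within estimate lies below it when the time dimension is short; this is the situation that motivates an instrumental-variable approach.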
Therefore, the alternative solution is to use the generalized method of moments
(GMM) estimator developed by Arellano and Bond (1991). It is an efficient estimator
for dynamic panels. It is popular in empirical growth research as it allows some of
the OLS assumptions to be relaxed. The Arellano and Bond estimator corrects for the
endogeneity of the lagged dependent variable and provides consistent estimates.
Moreover, it accommodates autocorrelation and heteroskedasticity, among others
(Roodman 2006).
The first step of the GMM procedure is to first-difference Eq. 4.7 to remove the
individual effects \(\eta_i\), which gives the following:
\[ \Delta \ln GDP_{it} = \mu_1 \Delta \ln GDP_{i,t-1} + \mu_2 \Delta I_{i,t} + \mu_3 \Delta \ln X_{i,t} + \Delta \varepsilon_{i,t} \qquad (4.8) \]
In the differenced Eq. 4.8, there is still a correlation between \(\Delta \varepsilon_{i,t}\) and
\(\Delta \ln GDP_{i,t-1}\), which can be addressed by instrumenting \(\Delta \ln GDP_{i,t-1}\). Finding a
valid external instrument is very difficult; hence, GMM draws instruments from
within the dataset, that is, lagged values of the dependent and independent variables
in case of endogeneity. Thus, the GMM procedure gains efficiency compared to
OLS by exploiting additional moment restrictions.
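The idea of drawing instruments from within the dataset can be sketched with the simplest member of this family. The code below is not the Arellano-Bond GMM estimator used in this chapter; it is the simpler, just-identified Anderson-Hsiao instrumental-variable estimator, which first-differences the model and instruments the lagged difference with the level lagged twice. The data-generating process and all parameter values are hypothetical.

```python
import random


def simulate_panel(n_units, n_periods, mu1, seed=0, burn_in=50):
    """Simulate y_it = mu1 * y_i,t-1 + eta_i + eps_it (hypothetical process)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        eta = rng.gauss(0.0, 1.0)      # unit-specific fixed effect
        y = eta / (1.0 - mu1)          # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = mu1 * y + eta + rng.gauss(0.0, 1.0)
            if t >= burn_in:           # discard the burn-in periods
                series.append(y)
        panel.append(series)
    return panel


def anderson_hsiao(panel):
    """IV estimate of the lag coefficient using y_i,t-2 as the instrument."""
    num = 0.0
    den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            d_y = y[t] - y[t - 1]          # differencing removes eta_i
            d_y_lag = y[t - 1] - y[t - 2]  # endogenous: correlated with d_eps
            z = y[t - 2]                   # instrument: dated before d_eps
            num += z * d_y
            den += z * d_y_lag
    return num / den


if __name__ == "__main__":
    estimate = anderson_hsiao(simulate_panel(2000, 6, 0.5, seed=1))
    print("true 0.5, Anderson-Hsiao estimate", round(estimate, 3))
```

The Arellano-Bond estimator extends this idea by using all available deeper lags as instruments and weighting the resulting moment conditions, which is where the efficiency gain comes from.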
The regression outputs from Eq. 4.7 are short-term estimates in the context of
economic growth. Since the effect of different factors should be evaluated in the
long run, it is also vital to compute the long-run coefficients. Hence, transforming
Eq. 4.7 yields the following:
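\[ \Delta \ln GDP_{it} = \omega \left( \ln GDP_{i,t-1} - \rho_2 I_{i,t} - \rho_3 \ln X_{i,t} \right) + \eta_i + \varepsilon_{i,t} \qquad (4.9) \]
where \(\omega = -(1 - \mu_1)\) and \(\rho_j = -\mu_j / \omega = \mu_j / (1 - \mu_1)\), \(j = 2, 3\).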
According to Neuhaus (2006), the term in brackets in Eq. 4.9 is the long-term
relationship among the variables, and the \(\rho_j\) are the long-term coefficients of
the model. \(\omega\) is the error correction coefficient, which gives the speed of
adjustment of the system of variables to the state of long-run equilibrium
(Neuhaus 2006).
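The transformation from short-run to long-run coefficients is simple arithmetic and can be sketched as follows. The sketch assumes that the error correction coefficient equals the coefficient on lagged GDP minus one and that each long-run coefficient is the short-run coefficient divided by one minus the coefficient on lagged GDP; the function name is hypothetical, and the inputs are the Model 1 short-run point estimates reported in Table 4.4.

```python
def long_run_coefficients(mu1, short_run):
    """Return the error correction coefficient and long-run coefficients.

    mu1       -- short-run coefficient on lagged ln GDP per capita
    short_run -- dict of short-run coefficients on the other regressors
    """
    omega = mu1 - 1.0  # speed of adjustment; negative when mu1 < 1
    long_run = {name: coef / (1.0 - mu1) for name, coef in short_run.items()}
    return omega, long_run


if __name__ == "__main__":
    # Short-run point estimates of Model 1 as reported in Table 4.4.
    omega, long_run = long_run_coefficients(
        0.156, {"lnGFCF": -0.067, "lnTrade": 0.093})
    print(round(omega, 3), {k: round(v, 3) for k, v in long_run.items()})
```

With these inputs the sketch reproduces, up to rounding, the Model 1 entries of Table 4.5: a convergence coefficient of −0.844 and long-run coefficients of −0.079 for lnGFCF and 0.110 for lnTrade.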
As a general estimation strategy, we first estimated a baseline equation (Model 1)
containing the lagged GDP level and the classical growth determinants: gross fixed
capital formation (as a percentage of GDP) and trade openness (expressed in terms
of exports as a percentage of GDP), which are expected to have a significant positive
contribution to growth. The second model adds the institutional variables (protection
of property rights, controlling of corruption, and government effectiveness) and gross
enrollment as a proxy for human capital, which is expected to have a positive
contribution to growth. Finally, Model 3 tests for robustness by introducing one
control variable, the general government final consumption expenditure as a
percentage of GDP.
Table 4.3 Variables used in the regression analysis

Variable    Description                                                   Source
GDP         Real GDP per capita in constant 2000 US$                      The World Bank
GFCF        Gross fixed capital formation as % of GDP                     The World Bank
Trade       Trade openness as % of GDP                                    The World Bank
Corrupt     Controlling of corruption                                     WGI
Govef       Government effectiveness                                      WGI
Pright      Protection of property rights                                 WGI
Schooling   Gross enrollment rate                                         The World Bank
Gcons       General government final consumption expenditure (% of GDP)   The World Bank
First, Eq. 4.8 is estimated using the Arellano-Bond first-difference GMM estimator
to get the short-run coefficients. Second, the long-run coefficients and the error
correction term are computed and tested for significance using the Wald test. The
short-run equations correspond to Model 1 to Model 3 in Table 4.4, while the
corresponding long-run equations (Model 1 to Model 3) are given in Table 4.5.
Finally, the consistency of the GMM estimator depends on the validity of the moment
conditions, which can be checked using two specification tests: the Hansen test of
over-identifying restrictions, whose joint null hypothesis is that the instruments
are valid, and the Arellano-Bond test for no second-order serial correlation in the
error term. Both tests are applied to ascertain the consistency of the estimator.
Table 4.3 represents the various macroeconomic variables and national accounts
data. To capture institutional quality, we used some of the vital indicators from the
WGI database: controlling of corruption, government effectiveness, and protection
of property rights. The dependent variable is represented by real GDP per capita in
21 sub-Saharan African countries. The analyzed period is 1996–2012, covering a
series of financial and economic crises.
As can be seen from short-run estimates in Table 4.4, Model 1 is the baseline
equation where besides the lagged level of GDP, we also introduce classical growth
determinants such as gross fixed capital formation and trade openness as a per-
centage of GDP (export to GDP ratio). Both the lagged levels of GDP and exports
to the GDP ratio have the expected signs and are significant while gross fixed
capital formation has an unexpected negative sign.
An increase in trade openness (exports as a percentage of GDP) by 1% will raise
GDP per capita by 0.093%. Gross fixed capital formation (as a percentage of GDP) is
negatively related to GDP per capita because of the crowding-out effect; in this
case, domestic investments are much more important than public investments.
Similarly, the long-run estimates (Table 4.5, Model 1) show that an increase in trade
openness by 1% will raise GDP per capita by 0.11%, while a 1% increase in gross fixed
capital formation will reduce GDP per capita by 0.08%. Further, the catch-up term has
the expected negative sign, and it is statistically significant.

Table 4.4 Institutions and economic growth: short-run estimations. Dependent variable: Real per
capita GDP (logarithm)

Regressors                             Model 1      Model 2      Model 3
L.lngdpc                               0.156***     0.132**      0.106**
                                       (0.059)      (0.065)      (0.058)
lnGFCF                                 −0.067***    −0.07***     −0.063***
                                       (0.018)      (0.018)      (0.023)
lnTrade                                0.093**      0.107***     0.113***
                                       (0.027)      (0.027)      (0.027)
lnschooling                                         0.003        −0.003
                                                    (0.016)      (0.014)
Government effectiveness                            0.001        0.0004
                                                    (0.001)      (0.002)
Controlling of corruption                           −0.0001      −0.0002
                                                    (0.001)      (0.0008)
Property rights                                     0.003***     0.002***
                                                    (0.001)      (0.001)
lnGovernmentConsumption                                          −0.089**
                                                                 (0.051)
N                                      315          315          315
No. of instruments                     75           79           80
Hansen J statistic (p value)           0.122        0.476        0.357
Serial correlation test AR2 (p value)  0.839        0.901        0.563

Note
Robust standard errors in brackets
*, ** and *** denote significance levels of 10, 5 and 1%
N represents the number of panel observations
Method used is Arellano and Bond’s (1991) first-difference GMM
Instruments, Arellano-Bond type: the dependent variable from lags 2 to 5. Standard instruments:
the level of all other regressors
The Hansen test reports the validity of the instrumental variables. The null hypothesis is that
the instruments are not correlated with the residuals (for robust estimations Stata reports the
Hansen J statistic instead of the Sargan test)
For the Arellano-Bond test, the null hypothesis is that of no serial correlation between residuals
In Table 4.4, Model 2, the proxies for institutional quality (the index for
controlling of corruption, the government effectiveness index, and the index for
protection of property rights) have been added to the classical growth determinants
to see the effect of institutions on economic growth. The results show that the index
for protection of property rights has a positive though negligible impact on growth in
GDP per capita. However, the government effectiveness and controlling of corruption
indices are not statistically significant although they have the expected sign.
Similarly, the gross enrollment rate (schooling), which is a proxy for human capital,
has the expected positive sign in the presence of institutions even though it is
insignificant. As expected, gross fixed capital formation and trade openness remain
highly significant, and the impact of gross fixed capital formation on growth even
increases slightly in the presence of institutions.

Table 4.5 Institutions and economic growth: long-run estimations. Dependent variable: Real per
capita GDP (logarithm)

Regressors                             Model 1      Model 2      Model 3
L.lngdpc                               −0.844***    −0.868***    −0.894***
(Convergence coefficient)              (0.059)      (0.065)      (0.058)
lnGFCF                                 −0.079**     −0.081**     −0.071**
                                       (0.036)      (0.041)      (0.034)
lnTrade                                0.110**      0.123***     0.126***
                                       (0.053)      (0.027)      (0.035)
lnschooling                                         0.004        −0.003
                                                    (0.02)       (0.014)
Government effectiveness                            0.002        0.0004
                                                    (0.01)       (0.002)
Controlling of corruption                           0.002        −0.0002
                                                    (0.001)      (0.001)
Property rights                                     0.004***     0.002***
                                                    (0.001)      (0.001)
lnGovernmentConsumption                                          −0.090**
                                                                 (0.056)

Note
Standard errors in brackets
*, ** and *** denote significance levels of 10, 5 and 1%
From Table 4.5, Model 2, the long-term effect of the institutional variable
‘protection of property rights’ is slightly higher than its short-run estimate: a 1%
increase in the quality of protection of property rights raises GDP per capita by
0.004% rather than 0.003%, indicating that even in the long run its effects are
negligible. Besides, the introduction of institutional variables slightly raises the
speed of convergence to the steady state from 0.844 to 0.868.
To test the robustness of the models, we introduced one control variable, the
general government final consumption expenditure (as a percentage of GDP), in
Model 3 (Tables 4.4 and 4.5). The impact of institutions (protection of property
rights) on growth remained significant after the introduction of this macroeconomic
policy variable. The impact of corruption and government effectiveness on economic
growth remained insignificant.
Two major concerns in using GMM estimators are the validity of the instruments and
serial correlation in the residuals. The p values obtained (see Table 4.4) using the
Hansen test do not reject the exogeneity of the instruments used, that is, the
instrument sets were orthogonal to the error term and were therefore valid for
estimation. Similarly, to address serial correlation in the residuals, we tested for
autocorrelation of second order in the errors. As can be seen from Table 4.4, the
Arellano and Bond test did not reject the null hypothesis of no second-order
autocorrelation.
4.6 Conclusion
Understanding the determinants of poor growth performance in poor countries like
those in Africa is vital. To understand how important institutions are in determining
the growth performance of sub-Saharan African countries, this paper empirically
analyzed the impact of institutional quality, proxied by controlling of corruption,
government effectiveness, and the protection of property rights index among others,
on economic growth in sub-Saharan African countries during the sample period
1996–2012.
The study was based on 21 sub-Saharan African countries. The methodology
was based on the first-differenced GMM estimator proposed by Arellano and Bond
(1991) for dynamic panel data, which accounts for individual fixed effects,
autocorrelation, and heteroskedasticity in the presence of endogenous covariates.
Our study indicates that lagged GDP per capita and trade openness had a sig-
nificantly positive effect on the growth of real per capita GDP, while gross fixed
capital formation and government consumption had negative and significant effects
both in the short and long run. While human capital, represented by schooling, had
the expected sign, it was not significant. Our study also shows that, of the three
institutional variables, protection of property rights had a positive and significant
effect on growth both in the short and long term: an increase in protection of
property rights by 1% increased output per capita by 0.004% in the long term, an
effect significant at the 1% level. While institutional variables such as control of
corruption and government effectiveness had a positive effect on growth, they were
statistically insignificant.
Hence, this preliminary study indicates that improving institutional quality by
enhancing the protection of property rights made, on average, a positive
contribution to growth in output per capita in the sampled countries, though its
magnitude was very small. This result agrees with some of the studies conducted so far
on the effect of institutions on growth. However, like other empirical research
on the relationship between institutions and economic growth, this study faces the
difficulty of obtaining good indicators of institutional quality.
References
Acemoglu D, Robinson JA (2010) Why Africa is poor? Econ Hist Dev Reg 25:21–50
Acemoglu D, Johnson S, Robinson JA (2001) The colonial origins of comparative development.
Am Econ Rev 91(5):1369–1401
Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and
an application to employment equations. Rev Econ Stud 58(2):277–297
Dushko J, Darko L, Risto F, Cane K (2011) Institutions and growth revisited: OLS, 2SLS, G2SLS,
Random Effects IV regression and panel fixed (within) regression with cross Country data
Glaeser E, Porta RL, Lopez-de-Silanes F, Shleifer A (2004) Do institutions cause growth? J Econ Growth
9:271–303
Grossman GM, Helpman E (1991) Comparative advantage and long-run growth. Am Econ Rev 80
(4):796–815
Hall RE, Jones CI (1999) Why do some countries produce so much more output per worker than
others? Quart J Econ 114:83–116
Judson R, Owen A (1999) Estimating dynamic panel data models: a guide for macroeconomists.
Econ Lett 65:9–15
Naude WA (2004) The effects of policy, institutions and geography on economic growth in Africa: an
econometric study based on cross-section and panel data. J Int Dev 16:821–849
Nelson RR, Sampat BN (2001) Making sense of institutions as a factor shaping economic
performance. J Econ Behav Organ 44:31–54
Neuhaus M (2006) The impact of FDI on economic growth. Physica-Verlag, Wurzburg
Nickell S (1981) Biases in dynamic models using fixed effects. Econometrica 49:1417–1426
North DC (1981) Structure and change in economic history. Norton, New York
North DC (1990) Institutions, institutional change and economic performance. Cambridge University Press, New
York
Redek T, Sušjan A (2005) The impact of institutions on economic growth: the case of transition
economies. J Econ Issues 39(4):995–1027
Rodrik D, Subramanian A, Trebbi F (2004) Institutions rule: the primacy of institutions over geography and
integration in economic development. J Econ Growth 9:131–165
Roodman D (2006) How to do xtabond2: an introduction to difference and system GMM in
STATA. Center for Global Development, Working paper no. 103
Solow RM (1956) A contribution to the theory of economic growth. Quart J Econ 70:65–94
Valeriani E, Pelso S (2011) The impact of institutional quality on economic growth and development: an
empirical investigation. J Knowl Manage 6:1–25
Chapter 5
Fiscal Effects of Aid in Rwanda
Thomas Bwire, Caleb Tamwesigire and Pascal Munyankindi
Abstract This paper analyzes the dynamic relationship between foreign aid and
domestic fiscal variables in Rwanda using a co-integrated vector auto-regressive model
for quarterly data over the period 1990Q1–2015Q4. The results show that aid and fiscal
variables form a long-run stationary relationship, that aid is a significant element of
the long-run fiscal equilibrium, and that the hypothesis of aid exogeneity is not statistically
supported; anticipated aid appears to have been taken into account in budget planning.
Aid is associated with increased tax efforts, increased public spending, and lower domestic
borrowings. Aid has contributed to improved fiscal performance in Rwanda, although
slow growth in tax revenue and regular aid shortfalls have prevented the government from sustaining a balanced
budget inclusive of aid. In terms of policy, continued efforts by donors to coordinate aid
delivery systems, make aid more transparent, and support improvements in government
fiscal statistics will all contribute to improving fiscal planning. Recipients need to know
how much aid is available to finance spending and how this is delivered, that is, whether
through donor projects or government budgets.
Keywords Domestic fiscal variables · Aid · CVAR · Rwanda

JEL Classification C32 · F35 · O23 · O55
The views expressed in this paper are those of the authors. They do not necessarily represent the
views of the Bank of Uganda, University of Kigali and the National Bank of Rwanda or their
affiliated organizations.
T. Bwire (✉)
Bank of Uganda, Kampala, Uganda
e-mail: tbwire@bou.or.ug
C. Tamwesigire
University of Kigali, Kigali, Rwanda
e-mail: ctamwesigire@yahoo.com
P. Munyankindi
National Bank of Rwanda, Kigali, Rwanda
e-mail: pmunyankindi@bnr.rw

DOI 10.1007/978-981-10-4451-9_5
5.1 Introduction
The underlying economic rationale for foreign aid to developing countries can be
traced back to Chenery and Strout’s (1966) two-gap model. In their model,
investments are the cornerstone of growth, but they require domestic savings and, at
least initially, imported capital goods. Low-income countries are constrained by two
gaps: insufficient domestic savings to provide the resources needed for financing the
level of investments required to achieve their target growth rates and insufficient
foreign exchange earnings (as they are unlikely to have sufficient export earnings)
to finance capital imports. As these savings and foreign exchange gaps constrain
growth, capital flows (of which foreign aid is one form) are an important source of
development finance (Franco-Rodriguez et al. 1998; McGillivray and Morrissey
2000) as they relax savings and foreign exchange constraints.
Aid is premised on different development constraints. However, the fact that
most of the aid that is spent in a country goes to (or through) the government or
finances the provision of public goods and services that would otherwise place
demands on the budget (Franco-Rodriguez et al. 1998; McGillivray 1994, 2001)
makes understanding its effects on central government fiscal behavior a necessary
condition for its effective and successful deployment.
Fiscal response models (hereafter FRMs) offer important insights into how foreign
aid donors can expect their efforts to affect the fiscal behavior of a recipient
government. The new incentives and conditions created by adding
foreign aid to the resources of the state alter how the state
uses the fiscal tools of tax revenue, expenditure, and public debt, but in
uncertain ways. Aid packages come with strong pressures to spend, so there is an
expectation that aid will increase spending (O’Connell et al. 2008). Moreover,
reforms linked to aid conditionalities are expected to increase tax revenues,
either through their influence on tax efforts or because they affect tax rates or the
tax base (Morrissey 2015). Because donors’ conditionality often requires
recipient governments to reduce budget deficits (Adam and O’Connell 1999;
McGillivray and Morrissey 2000), aid is also expected to lower domestic
borrowings. However, these are general expectations and may not always
hold true.
In the 10 years before 2008, total official development assistance (ODA) as
a share of GDP averaged 29.7% (The World Bank 2008). Over the same period, a
World Bank report puts foreign direct investment and domestic savings as shares
of GDP at a dismal 0.23 and −1.4%, respectively, on average.
With the new G-8 initiative on debt forgiveness and donors’ increased focus on
the poorest countries, the level of support to Rwanda was scaled up until 2012 when
the country suffered aid suspension. During FY 2013–14, Rwanda’s budget and
sector support as well as project financing, grants, and loans accounted for 11.6% of
GDP and 40% of government spending. This is illuminating and clearly highlights
the importance of ODA in sustaining Rwanda’s broad growth prospects, making it
an interesting case study for the effects of foreign aid.
Studies that have investigated the effect of aid on the fiscal behavior of recipient
countries are reviewed and discussed in Morrissey (2015). As echoed in Riddell
(2007), the debate suggests that country-based evidence provides the only reliable
backdrop for exploring aid–fiscal behavior dynamics as experiences between
countries vary due to their different institutional foundations. Our paper investigates
the fiscal effects of foreign aid in Rwanda using a quarterly dataset for
1990Q1–2015Q4. The advantage of these data is that aid is measured by the
Ministry of Finance and Economic Planning and should therefore be closer than donors’
measures to the aid actually recorded in the budget. A potential disadvantage is that quarterly data
may not correspond fully with the annual budget planning cycle. Nonetheless, as
shown in Bwire et al. (2016), quarterly data give qualitative results similar to those
obtained from annual data, and these are in general consistent with what is
known about the fiscal effects of aid.
The rest of the paper is organized as follows. Section 5.2 provides a brief
literature review, while the data, econometric methodology, and aid-related
hypotheses of interest are presented in Sect. 5.3. Section 5.4 discusses the empir-
ical results. Section 5.5 gives the conclusion and policy recommendations.
5.2 Literature Review of the Fiscal Effects of Aid
There is significant empirical literature on the impact of aid on the fiscal behavior of
aid recipients. A detailed review of this literature is provided in McGillivray and
Morrissey (2004) and Morrissey (2015). An important distinction is made between
fungibility and fiscal response studies.
Fungibility studies analyze the effects of foreign aid on the composition of
government spending. Aid is said to be fungible if the recipients fail to use it in the
manner intended by the donor. As presented in World Bank (1998), the underlying
assumption is that donors grant aid to finance public investments as increments to
the capital stock, which are the principal determinants of growth; fungibility arises
when recipients divert the aid to finance government consumption spending. This is
undesirable because such a diversion reduces the effectiveness of aid. However, to
the extent that consumption spending is a necessary complement to investment
spending (recurrent spending, such as on nurses and medicines for a healthcare
center, is required to operate investments), the assumption that fungibility diminishes the
effectiveness of aid may be misleading.
Analogously, fungibility is said to occur if aid intended to finance a particular
sector, such as health or education services that would otherwise be funded by tax
revenues, releases domestic resources for spending in other sectors of the
economy. In this case, fungibility arises because donors and recipients have differing
expenditure allocation preferences. Evidence as to whether aid has been
fungible and whether fungibility limits aid effectiveness is imprecise, largely
due to data limitations. Morrissey (2015) details the practical difficulties of directly
linking aid, donor intentions, and sector spending, given the need to distinguish
between on-budget and off-budget aid and the problematic classification of
spending. As this is not the focus of our study, readers are referred to a more
detailed discussion on this in McGillivray and Morrissey (2004).
FRMs adopt a broader approach, allowing for the dynamic effect of aid on
expenditure (current and capital spending), tax revenue, and domestic borrowings.
The traditional framework is based on the assumption that a government maximizes
utility based on a quadratic loss function subject to targets for each revenue and
expenditure category (Franco-Rodriguez et al. 1998: 1242–1243). However,
empirical applications of FRMs are limited on several grounds. For example,
McGillivray and Morrissey (2004) show that they are notoriously difficult to estimate
and highly sensitive to data, often yielding inconsistent estimates of core
parameters. Moreover, the theoretical framework does not provide a thorough
representation of government behavior (e.g., there is no explanation of how the
targets are determined) and does not generate specific testable hypotheses of the
effect of aid on fiscal behavior (Osei et al. 2005).
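As a rough illustration of the traditional framework, a quadratic loss function of this kind can be written in a generic form such as the one below. The symbols are assumptions introduced here for illustration and are not the notation of Franco-Rodriguez et al. (1998): $F_k$ denotes a fiscal category (a type of spending, revenue, or borrowing), $F_k^{*}$ its target, and $\alpha_k > 0$ the weight attached to missing that target.

```latex
U = \alpha_0 - \sum_{k} \frac{\alpha_k}{2} \left( F_k - F_k^{*} \right)^2 ,
\qquad \alpha_k > 0 .
```

In this sketch utility is highest when every category hits its target, and the government maximizes it subject to its budget constraints. The form itself says nothing about how the targets $F_k^{*}$ are set, which is the limitation noted above.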
In an effort to overcome these difficulties, there is now a growing body of
empirical literature estimating FRMs within a co-integrated vector auto-regressive
(CVAR) framework. The advantage of CVAR estimation is that its tractable
framework allows a number of different hypotheses on
causal links between aid and domestic fiscal variables to be formulated and tested. The technique takes into
account interactions between variables over time and distinguishes
long-run (equilibrium) from short-run (adjustment to the equilibrium) relations.
There is one equation for each variable, so all variables in the
system are treated as potentially endogenous and each variable is explained by its
own lags and lagged values of the other variables. Assumptions about exogeneity
are tested directly, avoiding the need for strong a priori assumptions; by design, the
econometric model allows the data to identify the statistical relationship between
variables. It is also an atheoretical approach: one does not have to maintain
the existence of, estimate, or test specific theoretical formulations of budgetary
planning targets, nor is it necessary to estimate structural parameters. Rather,
economic theory is invoked to choose the variables to include in the analysis,
select the appropriate normalizations and restrictions to identify particular effects,
and interpret the results.
Surveys and discussions in the literature on country-specific fiscal effects of aid
using a CVAR approach are provided in Morrissey (2015). These include the first
CVAR studies: Osei et al. (2005) for Ghana; Morrissey et al. (2007) for Kenya;
Martins (2010) and Mascagni and Timmis (2014) for Ethiopia; and very recently
Bwire et al. (2013, 2016) for Uganda. It is clear that the impact of aid is country
specific, but this should not be surprising as governments differ in their fiscal
behavior. Osei et al. (2005) find that for Ghana, aid is weakly exogenous to do-
mestic fiscal variables (i.e., donors do not respond to fiscal imbalances in deter-
mining how much aid to allocate), but aid has effects on spending, domestic
borrowings, and domestic tax revenues. Specifically, aid was associated with
reduced domestic borrowings and increased tax revenues. They also found that
recurrent spending increased more than investment spending following an increase
in aid and this was not because aid was fungible but because investment spending
was linked to borrowing and declined as borrowing was reduced, whereas recurrent
spending was linked to tax and this increased as revenues increased.
Morrissey et al. (2007) extended this approach with official Kenyan data for
1964–2004 and estimated two relationships: the fiscal effects of aid grants and
loans, and the impact of aid on growth. They found that aid grants were associated
with increased spending, while loans were a response to unanticipated deficits; that
is, if spending exceeded revenues (tax and grants), the government sought loans to
finance the deficit. Aid grants were positively associated with growth through
financing government spending, and loans were negatively associated with growth
perhaps because they were associated with deficits. There was no evidence that aid
affected tax revenue or that tax had an effect on growth (except indirectly via
financing spending).
Martins (2010) provides a comprehensive application of the CVAR method
using quarterly data for Ethiopia over 1993–2008. He finds evidence of a long-run
positive relationship between aid and development spending, but not between aid
and recurrent spending (hence, no evidence that aid is fungible). Domestic
borrowings increased in response to shortfalls in revenue (tax and grants), and there
was no evidence that aid reduced tax efforts. Further, aid grants adjusted to the level
of development spending.
Bwire et al. (2013, 2016) formulated a set of testable hypotheses for the fiscal
effects of aid (budgetary constraints, a balanced budget, aid additionality/illusion,
tax revenue displacement, and aid-domestic borrowing substitution) in Uganda
within the CVAR framework on both annual and quarterly fiscal data. They found
that aid was a significant element in the long-run fiscal equilibrium and did not find
evidence supporting the assumption that aid was exogenous in the fiscal equilib-
rium. Aid was associated with increased tax efforts, lower domestic borrowings,
and increased public spending. Further investigation of the long-run relation among
the fiscal variables revealed support for the existence of a budget constraint and a
non-balanced budget excluding aid. Mascagni and Timmis (2014) applied a CVAR
analysis to Ethiopian government data over 1960–2009: Aid (grants and loans) was
positively related to tax revenue; tax did not adjust to aid but aid was an adjusting
variable, implying that donors rewarded Ethiopia when tax revenues were
increasing. Table 5.1 presents the results of selected country-specific studies on the
dynamic effect of aid.
Our study uses a CVAR model to evaluate hypotheses of interest relating to the
interaction of aid with domestic fiscal aggregates in Rwanda based on quarterly
time series data for the period 1990Q1–2015Q4. In particular, it evaluates whether
there exists a fiscal equilibrium among the fiscal variables, including aid; whether aid
forms part of this fiscal equilibrium relation; whether donor governments react to
fiscal disequilibrium; whether donors’ aid allocation is influenced by past fiscal
conditions in Rwanda; and whether aid influences the fiscal conditions in Rwanda.
It also estimates the long-run impact of aid on domestic fiscal aggregates.
Table 5.1 Results of selected studies on the dynamic impact of aid

Columns: Study | Sample | Aid measure | Aid exogeneity | Incremental impact of aid on: current spending; capital spending; total spending; domestic revenue; dom. borrowing

Bwire et al. (2016) | Uganda, annual data (1972–2008) | ODA from OECD | Not supported | n.r; n.r; ++; ++; –
Bwire et al. (2016) | Uganda, quarterly data (1997–2014) | ODA from MoFPED | Not supported | ++; ++; –
Bwire et al. (2013) | Uganda | ODA from OECD | Not supported | n.r; n.r; ++; ++; –
Mascagni and Timmis (2014) | Ethiopia, annual data (1960–2009) | Grants | Aid treated as an additional source of revenue | ++; ++; +; n.r
Mascagni and Timmis (2014) | Ethiopia, annual data (1960–2009) | Loans | Aid treated as an additional source of revenue | +; +; n.r
Martins (2010) | Ethiopia (1993Q3–2008Q2) | Grants | Not supported | n.r; ++; n.r; ++; ?
Martins (2010) | Ethiopia (1993Q3–2008Q2) | Loans | Not supported | n.r; n.r; n.r; n.r; n.r
Osei et al. (2005) | Ghana (1966–1998) | ODA | Not supported | ++; +; ++; ++; –
Morrissey et al. (2007) | Kenya (1964–2004) | Grants | Aid treated as an additional source of revenue | n.r; n.r; +; n.r; n.r
Morrissey et al. (2007) | Kenya (1964–2004) | Loans | Aid treated as an additional source of revenue | n.r; n.r; +; –; n.r

Note
(i) ++ (strongly positive), + (moderately positive), – (strongly negative), - (moderately negative), (insignificant), ? (ambiguous), n.r (not reported or cannot be
inferred)
(ii) Due to differences in the measurement of aid, results are not directly comparable across the table
5.3 Data and Econometric Methodology
5.3.1 Data
Our study uses quarterly time series data (1990Q1–2015Q4) in Rwandan francs
reported at constant 2011 prices. Fiscal data on foreign aid, tax revenue, domestic
borrowings from the banking system, and recurrent and capital government
spending are from Rwanda’s Ministry of Finance and Economic Planning. The
non-tax revenue component of domestic revenue and other forms of borrowing are
omitted from the system as we are not estimating an identity. Aid data capture total
net disbursements from all donors as recorded by the government and comprise
capital and budgetary grants. As these data are from the fiscal authorities, they are assumed to
fairly measure the actual aid known to the fiscal authorities, which should be capable
of affecting budget planning. Nonetheless, while this is true for all on-budget or
program support, caution is needed because the appropriate treatment of capital
grants is more complicated. Some of these grants may be on-budget, such as
sector projects that are known to the government, especially if matching funds are
required; some may be known and influence spending allocations, such as health
projects that permit the government to reduce its own health spending; and some
may be genuinely off-budget, such as technical assistance in an area that the government
would not otherwise fund, which is spent either within the donor country
or under the control of the donors, or project aid over which the donors retain control.
Some previous applications (Martins 2010; Morrissey 2001) have disaggregated
aid into grants and loans because, in principle, they may have different effects (governments
prefer grants because they do not have to be repaid; loans may encourage
fiscal planning for future servicing and repayment costs), so that there could be an aid
aggregation bias. However, as argued in McGillivray and Morrissey (2001) and
Bwire et al. (2013), in practice, such a bias is likely to be minor. Aid loans are long
term, and governments currently in power are unlikely to be around when repayments
are due, so loans could be treated as grants. Indeed, the share of aid loans/GDP fell from
4.7% through the 1990s to 3.9% in the last 15 years. Over the same time, the share of
aid grants/GDP rose sharply from an average of 1.6% to 6.3%. Thus, capital grants are
similar to budgetary grants and are treated as grants, or aid, in this study.
Raw data are reported in Fig. 5.1. A visual inspection of the data reveals two
important features. First, levels were low and relatively persistent until the start of
the 2000s, after which spending and revenue followed a clear upward trend while aid
followed only a slight, irregular upward trend. Aid was generally low during the 1990s,
turning negative in 1994 (perhaps reflecting the genocide), but increased dramatically
between 2000 and 2010. It increased erratically until 2015, dropping sharply
during 2012 when the country suffered aid suspension. As a share of spending, aid
was equivalent to 28.1% through the 1990s, increased steadily through the 2000s to
43.7%, and averaged 30.7% over the last five years. Within years, aid tended to be
highest in the fourth quarter (or sometimes the second) and this was also the case,
Fig. 5.1 Series in levels: tax revenue, aid, domestic financing, recurrent spending, and capital spending, in Rwandan francs (billions, 2011 prices), by quarter. Source Ministry of Finance and Economic Planning, Rwanda
but less pronounced, for tax revenue. Domestic borrowings were negative
throughout most of the mid-1990s and late 2000s.
Second, all variables typically trended over time, suggesting a multiplicative
rather than additive model specification, which a log transformation brings
back into additive form. However, as argued in Bwire et al. (2013) and Juselius
et al. (2011), such a transformation is innocuous if and only if the data
points in a series are strictly positive, or at least not too close to zero. In our sample,
log transformation of the domestic borrowing series and of some data points in the aid
series is problematic, with serious consequences for estimation.
First, it generates lost observations, shortening an
already small sample. This alone weakens the power of the tests, making the
CVAR analysis less reliable. Second, the omission of non-positive observations
will be nonrandom, leading to a selection bias. Third, the trending in the data
begins in the early 2000s, a shift that might be lost with log transformation.
Given this, all series are left in non-log specifications. Although the
choice between log and non-log specifications might matter, our
analysis gives results that are consistent with what is known about the fiscal impact
of aid from previous country-specific applications, particularly those that
used log transformations because of trends in the variables. This in itself
suggests that there is little to be gained from log over non-log specifications.
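The first of these problems, lost observations, is easy to see in a small sketch. The series below is made up for illustration (it is not the Rwandan data), and the helper name `log_transform_losses` is hypothetical.

```python
import math

def log_transform_losses(series):
    """Return (logged values, indices dropped because the value is <= 0)."""
    logged, dropped = [], []
    for idx, value in enumerate(series):
        if value > 0:
            logged.append(math.log(value))
        else:
            # log is undefined for zero and negative values
            dropped.append(idx)
    return logged, dropped

# Hypothetical quarterly net domestic borrowing (billions of francs).
BORROWING = [3.2, -1.5, 0.0, 4.1, -0.7, 2.6, 5.0, -2.2]

logged, dropped = log_transform_losses(BORROWING)
share_lost = len(dropped) / len(BORROWING)
print(dropped, share_lost)
```

In this toy series half of the quarters would be dropped, and they are exactly the quarters with net repayment, which is why the omission is nonrandom.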
5.3.2 The Co-integrated VAR (CVAR) Model
Following Johansen’s (1988) work on the estimation and testing of multivariate
relationships among nonstationary data, vector auto-regressive (VAR) methods
have become the “tool of choice” in much of time series macro-econometrics. As a
reduced form representation of a large class of dynamic structural models
(Hamilton 1994: 326–327), the VAR offers both empirical tractability and a link
between data and theory in economics.
The principal purpose of our econometric exercise is to investigate the role of aid
in fiscal response in Rwanda. Therefore, the parallel between the economics and
econometrics of fiscal response models is useful in assessing how aid is used in the
budget. From an economic standpoint, aid can be used in the process of budget
planning and/or in relaxing budget constraints. Where aid forms part of the process
of budgetary planning, it may be viewed as having a long-run role, the recipient
directly incorporating the level of aid in budgetary planning. In contrast, aid may
simply relax budget constraints when it is received. This economic distinction
corresponds to the econometric notions of the long and short run, in that the process
of budgetary planning defines an equilibrium relation among the fiscal variables (of
which aid may be one element), whereas a relaxation of fiscal constraints is transitory.
Accordingly, in our application where the fiscal aggregates are likely to be
nonstationary and co-integrated, it is convenient to couch the empirical analysis in a
CVAR framework. To facilitate interpretation of the potentially complex dynamic
interactions among the fiscal variables, it is convenient to express the CVAR in its error
correction form (VECM), which shows how the variables adjust over time. This is given by:

$$\Delta x_t = \Pi x_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta x_{t-i} + \Psi d_t + \varepsilon_t \qquad (5.1)$$

where $x_t$ is an $(n \times 1)$ vector of jointly determined variables at most integrated of
order 1, I(1), $\Gamma_i$ is an $(n \times n)$ matrix of short-run adjustment coefficients, $i = 1, \ldots, p-1$
indexes the lags included in the system, $\Delta$ is the first difference operator,
$d_t$ is a $(q \times 1)$ vector of deterministic terms (such as constants, linear trends, and
dummies), and $\varepsilon_t$ is an $(n \times 1)$ vector of errors with standard properties. Each of the
$(n \times n)$ matrices $\Gamma_i = -(\Pi_{i+1} + \cdots + \Pi_p)$ and $\Pi = -(I - \Pi_1 - \cdots - \Pi_p)$, where
$\Pi_1, \ldots, \Pi_p$ are the coefficient matrices of the underlying VAR in levels, comprises
coefficients estimated by Johansen’s (1988) maximum likelihood procedure using a
$(t = 1, \ldots, T)$ sample of data.
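The mapping from the coefficient matrices of a VAR in levels to the VECM matrices can be sketched in a few lines. This is an illustration only: the matrices are made-up numbers for a two-variable VAR(2), deterministic terms and errors are left out, and the helper names are hypothetical.

```python
def mat_add(a, b):
    """Element-wise sum of two matrices stored as lists of rows."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_neg(a):
    return [[-x for x in row] for row in a]

def mat_vec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def var_to_vecm(var_coefs):
    """Map [Pi_1, ..., Pi_p] to (Pi, [Gamma_1, ..., Gamma_{p-1}])."""
    n = len(var_coefs[0])
    total = [[0.0] * n for _ in range(n)]
    for coef in var_coefs:
        total = mat_add(total, coef)
    pi = mat_add(total, mat_neg(identity(n)))   # Pi = -(I - Pi_1 - ... - Pi_p)
    gammas = []
    for i in range(1, len(var_coefs)):
        tail = [[0.0] * n for _ in range(n)]
        for coef in var_coefs[i:]:
            tail = mat_add(tail, coef)
        gammas.append(mat_neg(tail))            # Gamma_i = -(Pi_{i+1} + ... + Pi_p)
    return pi, gammas

# Hypothetical VAR(2): x_t = PI_1 x_{t-1} + PI_2 x_{t-2}
PI_1 = [[0.6, 0.2], [0.1, 0.7]]
PI_2 = [[0.2, -0.1], [0.0, 0.2]]
pi, gammas = var_to_vecm([PI_1, PI_2])

# Both forms should imply the same x_t for any lagged values.
x_lag1, x_lag2 = [2.0, 1.0], [1.5, 0.5]
var_pred = [a + b for a, b in zip(mat_vec(PI_1, x_lag1), mat_vec(PI_2, x_lag2))]
dx_lag1 = [a - b for a, b in zip(x_lag1, x_lag2)]
vecm_dx = [a + b for a, b in zip(mat_vec(pi, x_lag1), mat_vec(gammas[0], dx_lag1))]
vecm_pred = [a + b for a, b in zip(x_lag1, vecm_dx)]
print(var_pred, vecm_pred)
```

The two predictions agree up to rounding, which confirms that the error correction form is a reparameterization of the levels VAR and not a different model.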
Provided that, as expected, some of the variables in $x_t$ are nonstationary, $\Pi$ has
reduced rank. This allows the hypothesis of co-integration to be formulated as:

$$\Pi = \alpha \beta' \qquad (5.2)$$

where $\alpha$ and $\beta$ are both $(n \times r)$, and $r$ is the rank of $\Pi$, corresponding to the number
of linearly independent relationships among the variables in $x_t$. The fiscal equilibrium,
thought of as the statistical analogue of the budgetary equilibrium in fiscal
response models, is defined by the parameters in $\beta$. It follows then that $\beta' x_{t-1}$
measures the extent to which the budget is out of equilibrium and $\alpha$ measures the
long-run rate at which each of the variables adjusts to restore the equilibrium.
Coefficients in the $\Gamma_i$ matrices allow short-run adjustment in each of the variables to
differ from that given by their long-run rates (defined by the coefficients in $\alpha$) and
hence, potentially at least, accommodate a wide range of dynamic responses.
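A minimal numerical sketch may help fix ideas about the roles of the adjustment coefficients and the co-integrating vector. It assumes a two-variable system with a single co-integrating relation, no short-run terms, no deterministic terms, and no shocks; the values of `ALPHA` and `BETA` are hypothetical and are not estimates from this chapter.

```python
ALPHA = [-0.5, 0.1]   # adjustment coefficients (n x 1, so r = 1)
BETA = [1.0, -1.0]    # co-integrating vector: equilibrium is x1 = x2

def pi_matrix(alpha, beta):
    """Pi = alpha * beta' as an n x n list of rows."""
    return [[a * b for b in beta] for a in alpha]

def disequilibrium(beta, x):
    """beta' x: how far the system is from equilibrium."""
    return sum(b * v for b, v in zip(beta, x))

def step(alpha, beta, x):
    """One shock-free period: x_t = x_{t-1} + alpha * (beta' x_{t-1})."""
    z = disequilibrium(beta, x)
    return [v + a * z for v, a in zip(x, alpha)]

pi = pi_matrix(ALPHA, BETA)
# A zero determinant confirms that this 2 x 2 Pi has reduced rank.
det_pi = pi[0][0] * pi[1][1] - pi[0][1] * pi[1][0]

x = [10.0, 4.0]                       # start 6 units out of equilibrium
path = [disequilibrium(BETA, x)]
for _ in range(5):
    x = step(ALPHA, BETA, x)
    path.append(disequilibrium(BETA, x))
print(det_pi, path)
```

With these values the disequilibrium shrinks by the factor 1 + β'α = 0.4 each period, with the first variable doing most of the adjusting; setting an element of `ALPHA` to zero would make that variable unresponsive to the disequilibrium.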
5.3.3 Model Specification and Hypothesis Testing
The VECM in Eq. (5.1) is particularly attractive in the current context, since it
provides a natural framework in which parallels between the economics and
econometrics of fiscal response models can be exploited. Specifically, the frame-
work not only facilitates a statistical investigation of the role of aid in the budgets of
recipient countries but also shows whether fiscal conditions in recipient countries
affect aid-allocation behavior in donor countries. Because these economic
hypotheses of interest represent parameter restrictions within VECM, they can be
evaluated formally. In what follows, these economic issues of interest are set out as
a number of key propositions.
As discussed earlier, insofar as aid represents an injection of foreign finance, it
relaxes budget constraints. Aid allocated for financing debt or domestic con-
sumption is unlikely to achieve longer term effects on the budget, in which case the
impact of aid will be confined to the short run. In contrast, where aid is used as a
source of investment for development projects such as health care or infrastructure,
there may be more long-term effects on the budget as such investments spawn
further spending (aid illusion) or increased tax revenues. Since development pro-
jects of this sort are likely to have come about as a result of aid’s incorporation into
the process of budgetary planning, it is convenient to think of the aid’s long-run
effects and its incorporation into budgetary planning synonymously. Clearly,
whether aid is anticipated or not has a decisive bearing on the uses to which it is put
and thus the (short and/or long run) effects that it has.
The economic distinction between short and long run ties in neatly with the
VECM’s econometric formulation which in turn offers insights into the role of aid
in an empirical setting. The correspondence between the economics and econo-
metrics of aid in fiscal response is central to our paper, since it provides the basis for
the empirical testing of a range of economic hypotheses relating to the effects that
aid has in developing countries.
5.3.3.1 Formulating Aid Hypotheses
As can be deduced from the discussion earlier, the co-integrating relation is the
statistical analogue of the budgetary equilibrium in fiscal response models. Hence,
the fiscal response theory predicts the presence of a single co-integrating relation
(i.e., a stationary linear combination of the variables in $x_t$) such that $\beta$ is an $(n \times 1)$
vector, the coefficients of which quantify the budgetary equilibrium. Of course, this
presupposes that all variables in $x_t$ are integrated of order 1, I(1). Where a variable
is I(0), it will form a stationary linear combination with itself, so that there can exist
at most $n$ of these stationary linear combinations; $r = n$ implies that all variables are
I(0). As Johansen (1992) demonstrates, each of the $r$ columns of $\alpha$ corresponds to
one of the $r$ rows of $\beta'$, so that inference on the number of co-integrating vectors (nonzero
rows in $\beta'$) can be evaluated by hypothesis testing on the adjustment coefficients
(nonzero columns in $\alpha$) using likelihood ratio methods. Specifically, standard tests
for co-integration are equivalent to testing that the $\alpha_r$ are insignificantly small for
$r = 1, \ldots, n$. This leads us to the first set of co-integration hypothesis tests, which
amount to zero restrictions on each of the $n$ columns of $\alpha$ in Eq. (5.2): $H_c(r)$:
$\alpha_r = 0$, where $r = 1, \ldots, n$.
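To assist the exposition, consider a VAR (5.2) in VECM form with unrestricted
constant partitioned conformably as mentioned earlier:
$$
\begin{bmatrix} \Delta x_{1t} \\ \Delta x_{2t} \end{bmatrix}
= \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix}
\begin{bmatrix} \beta'_{11} & \beta'_{12} \\ \beta'_{21} & \beta'_{22} \end{bmatrix}
\begin{bmatrix} x_{1t-1} \\ x_{2t-1} \end{bmatrix}
+ \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}
\begin{bmatrix} \Delta x_{1t-1} \\ \Delta x_{2t-1} \end{bmatrix}
+ \Psi d_t
+ \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}
\tag{5.3}
$$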
$\alpha$ and $\beta$ are partitioned by a co-integrating vector such that $\beta'$ is
divided by row into two subsets of co-integrating vectors $\beta'_1$ and $\beta'_2$,
which are themselves partitioned by a variable in the same way as $x_t$. Thus,
$\alpha_1$ and $\alpha_2$ load each of the subsets of co-integrating vectors into each
equation for correction. We assume that $\beta'_1$ represents the budgetary equilibrium
so that $\beta'_2$ (and $\alpha_2 = [\alpha_{12} : \alpha_{22}]$) will be a null
matrix unless aid (or other variable(s)) is I(0). Where aid is found to be I(0), it
cannot belong to the fiscal equilibrium relationship, and thus, its principal role is to
relax budget constraints and may be indicative of countries where aid is too small to
be included in the process of budget planning or the case that aid is diverted away
from investment purposes for other reasons.
Proposition I (The existence of a fiscal equilibrium) Evaluation of Proposition I is
by $H_c(1)$ and is confirmed by $\alpha_{11} = \alpha_{21} \neq 0$ and
$\alpha_{12} = \alpha_{22} = 0$ using co-integration tests. All variables in $x_t$ are
tested for the order of integration, but within a multivariate framework after
estimation of Eq. (5.3). Where the result from testing $H_c(r)$ suggests two (or more)
stationary linear combinations of the data, the stationarity of variables in $x_t$
(such as aid) may account for it. As trivial stationary linear combinations, they are
of no economic interest, and we confine them to $\beta'_2$ before removing them from
the model. This is achieved by transferring any stationary variables from $x_t$ to
$d_t$, so that the former contains only those variables germane to the long run.
Adjusting the dimensions of $x_t$, $\Delta x_t$, and $d_t$ accordingly yields:1
1
Where variables are found to be I(0) Rahbek and Mosconi (1999) suggest a tractable modification
to ensure that the limiting distributions of the co-integration test statistics are invariant to the
presence of the stationary regressors included in dt.
90 T. Bwire et al.

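$$
\begin{bmatrix} \Delta x_{1t} \\ \Delta x_{2t} \end{bmatrix}
= \begin{bmatrix} \alpha_{11} \\ \alpha_{21} \end{bmatrix}
\begin{bmatrix} \beta'_{11} & \beta'_{12} \end{bmatrix}
\begin{bmatrix} x_{1t-1} \\ x_{2t-1} \end{bmatrix}
+ \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}
\begin{bmatrix} \Delta x_{1t-1} \\ \Delta x_{2t-1} \end{bmatrix}
+ \Psi d_t
+ \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}
\tag{5.4}
$$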
Having established the budgetary equilibrium within $x_t$, the next step is to
establish the variables that each contains. We proceed on the assumption that $r = 1$
having dealt with the multiple co-integrating vector case earlier.2 Testing the
statistical significance of each variable in the co-integrating relation requires
long-run exclusion tests, which amount to testing zero restrictions on each
coefficient in $\beta'_1$, namely $H_e(j)$: $\beta'_{1j} = 0$, $j = 1, \ldots, n$.
As with the co-integration tests, long-run exclusion tests have economic and
econometric implications. Since the limiting distributions of co-integration test
statistics depend on the number of nonstationary variables in xt , any that are
redundant to the long-run relation can at most have a short-run impact and thus
enter the model in differenced form via dt as for the stationary variables discussed
earlier. Of particular interest is the significance of the aid variable which gives
rise to:
Proposition II (Aid forms part of the fiscal equilibrium relation) It is evaluated by
$H_e(f_t)$: $\beta'_{11} = 0$ in Eq. (5.4), while other $\beta$ coefficients are unrestricted.
Where aid is found to be I(1), but unimportant in the long run, it implies that it
has not had any significant long-run impact on the fiscal variables in the country
and could suggest that institutional factors prevented it from playing a role in the
fiscal equilibrium (e.g., aid leakage where corrupt government officials diverted aid
for private purposes).
Further, in investigating the way in which aid impacts the budgets of recipient
countries, attention naturally focuses on the causal mechanisms that exist between
aid and the other components of the budget. Specifically, we want to establish
whether aid is treated as given in the budget or whether its allocation actually
reflects the state of the budget in some way. This can be accomplished econo-
metrically by applying Johansen’s (1992) long-run weak exogeneity test and leads
to:
Proposition III (Donor governments do not react to fiscal disequilibrium) This is
tested by $H_{we}(x_{1t})$: $\alpha_{11} = 0$ in Eq. (5.4), while other $\alpha$
coefficients are unrestricted.
Where rejected, donors’ aid allocations react to past fiscal imbalances in the
recipient country. Conversely, a non-rejection implies that the aid is weakly
exogenous to the long-run relation so that departures from the recipient’s
budgetary equilibrium do not influence the donor’s aid allocation. In effect, we
establish whether aid in Rwanda’s fiscal planning is treated as given or whether its
allocation actually reflects the state of the budget in some way. Similar tests were
applied to individual coefficients within $\alpha_{21}$ to establish which, if any,
components were weakly exogenous.
2
In practice, long run exclusion tests are applied in conjunction with co-integration tests to
determine whether multiple co-integrating vectors were indeed due to the presence of stationary
variables in $x_t$ or multiple co-integrating relations among I(1) variables in $x_t$. Since the
latter case is implausible from an economic viewpoint it is ruled out in the following development.
Moreover, as the VECM distinguishes the short-run relationships from long-run
relationships among the data, we are able to evaluate whether variables such as
aid are exogenous to both short- and long-run behaviors, which leads into Granger
non-causality testing (Granger 1969) and gives rise to:
Proposition IV (Donors’ aid allocation is not influenced by past fiscal conditions
in the recipient country) This can be expressed in terms of Eq. (5.4) as the null
hypothesis that $x_{2t}$ does not Granger-cause $x_{1t}$, $H_G(x_2 \rightarrow x_1)$:
$\alpha_{11}\beta'_{12} = 0$ and $\Gamma_{12} = 0$.
Where they are upheld, these restrictions ensure that past values of the fiscal
variables do not influence current values of aid, whether in terms of long- or
short-run behaviors. Since the weak exogeneity of $x_{1t}$ (i.e., $\alpha_{11} = 0$)
ensures that $\alpha_{11}\beta'_{12} = 0$, then $x_{2t}$ does not Granger-cause $x_{1t}$
provided lagged changes in $x_{2t}$ do not influence $x_{1t}$. Where this is so,
$x_{1t}$ is described as being strongly exogenous (Engle et al. 1983) and is evaluated
using block exogeneity Wald tests.
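Writing out the first block row of Eq. (5.4) may help to show where the restrictions in
Propositions III and IV bite. The symbol $\Psi_1$, standing for the rows of $\Psi$ that
belong to the $x_{1t}$ equation, is notation introduced here purely for illustration:
$$
\Delta x_{1t} = \alpha_{11}\left(\beta'_{11} x_{1t-1} + \beta'_{12} x_{2t-1}\right)
+ \Gamma_{11} \Delta x_{1t-1} + \Gamma_{12} \Delta x_{2t-1} + \Psi_1 d_t + \varepsilon_{1t}
$$
Setting $\alpha_{11} = 0$ removes the whole error-correction term, so lagged levels of
$x_{2t}$ no longer enter; adding $\Gamma_{12} = 0$ removes the lagged changes of $x_{2t}$
as well, so that no past value of $x_{2t}$ appears in the equation for $x_{1t}$.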
It is also of interest to evaluate whether aid is Granger non-causal for the
domestic budget (i.e., domestic fiscal variables). Where this hypothesis is upheld,
aid is unlikely to be effective; however, in practice, it may result when aid is
numerically small rather than statistically insignificant. This gives rise to the most
fundamental of the economic hypotheses:
Proposition V (Aid does not influence fiscal conditions in Rwanda) This proposition
amounts to the null hypothesis that aid is Granger non-causal for the budget
in the recipient country (i.e., $x_{1t}$ does not Granger-cause $x_{2t}$) and is
evaluated in Eq. (5.4) by:
$H_G(x_1 \rightarrow x_2)$: $\alpha_{21}\beta'_{11} = 0$ and $\Gamma_{21} = 0$, in an
analogous manner to that given earlier.
5.4.1 Preliminaries
The unrestricted model was estimated with a restricted trend and an unrestricted
constant—implying no quadratic growth in the data (Bwire et al. 2016; Juselius
2006). The lag-length was determined as the minimum number of lags that met the
crucial assumption of time independence of the residuals based on a Lagrange
multiplier (LM) test, starting with k = 5—this being quarterly frequency data.
The Schwarz Bayesian Criterion (SC) suggests two lags, while both the Hannan-Quinn
(HQ) criterion and the Akaike Information Criterion favor five lags. With two lags, the
LM test does not reject the null hypothesis of no serial correlation in the residuals,
suggesting, inter alia, that the underlying CVAR model should be estimated using
two lags. In addition, this captures many more dynamics of the system. VAR model
residuals are finally subjected to a battery of residual misspecification tests
(Godfrey 1988), but as shown in Annexure 5.1, the histograms portray reasonably
normal distribution behavior.
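The lag-length rule described above (take the smallest lag at which the LM test no
longer rejects serially uncorrelated residuals) can be sketched as follows. This is a
minimal illustration only: the function name and the p-values are hypothetical and are
not taken from the chapter's estimation output.

```python
from typing import Mapping


def smallest_adequate_lag(lm_pvalues: Mapping[int, float], level: float = 0.05) -> int:
    """Return the smallest lag length whose LM test does not reject the null
    of no residual serial correlation at the given significance level."""
    for k in sorted(lm_pvalues):
        if lm_pvalues[k] > level:
            return k
    raise ValueError("no candidate lag length passes the LM test")


# Hypothetical LM-test p-values for k = 1, ..., 5 (invented for illustration,
# NOT results from the chapter).
example_pvalues = {1: 0.01, 2: 0.21, 3: 0.35, 4: 0.40, 5: 0.52}
chosen = smallest_adequate_lag(example_pvalues)
print(chosen)
```

With these made-up inputs the rule picks two lags, since one lag fails the LM test and
two lags is the first length that passes.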
5.4.2 Evaluating the Behavior and the Long-Run Fiscal

Impact of Foreign Aid in Rwanda
5.4.2.1 The Existence of a Fiscal Equilibrium
Theoretical predictions suggest the existence of a budgetary equilibrium among the

fiscal variables, especially allowing for a complete fiscal representation. On
determining the appropriate specification of the data generating process, the
co-integration rank was evaluated using Johansen’s trace statistic—a top-to-bottom
sequential procedure which is known to be asymptotically more correct than the
bottom-to-top Max-Eigen statistic (Juselius 2006: 131–134). However, the test
(trace) has been shown to have a finite sample bias with the implication that it often
indicates too many co-integrating relations so that the test is oversized (Juselius
2006: 140–142). Thus, with a sample of 87 observations, though relatively large in
the context of a developing country’s time series, a small sample bias is corrected
by using the Bartlett correction, which ensures a correct test size (Johansen 2002).
Test results in Table 5.2, corrected for small sample bias, support the presence of one
equilibrium (stationary) relationship at the conventional 5% level of
significance. Moreover, roots of the companion matrix (Annexure 5.1) and graphs
of the potential co-integrating relations (available on request) both suggest that a
rank of one (r = 1) is well supported by the data.
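The sequential logic of the trace test can be sketched in a few lines, using the
statistics and critical values reported in Table 5.2. The function name is an assumption
made for this illustration; the decision rule is the usual one of stopping at the first
rank whose null is not rejected.

```python
def select_rank(trace_stats, critical_values):
    """Sequential trace test: starting at r = 0, return the first r whose
    statistic falls below its critical value (null of rank r not rejected)."""
    for r, (stat, cv) in enumerate(zip(trace_stats, critical_values)):
        if stat < cv:
            return r
    return len(trace_stats)  # every null rejected: full rank


# Values as reported in Table 5.2 for r = 0, ..., 4.
trace = [129.669, 78.471, 44.125, 25.185, 6.890]          # uncorrected
trace_bartlett = [92.685, 57.177, 33.068, 10.323, 3.055]  # small sample corrected
frac95 = [88.554, 63.659, 42.770, 25.731, 12.448]         # 5% critical values

rank_uncorrected = select_rank(trace, frac95)
rank_corrected = select_rank(trace_bartlett, frac95)
print(rank_uncorrected, rank_corrected)
```

On these numbers the corrected statistic stops at r = 1, whereas the uncorrected one
would stop at r = 3, which is in line with the oversizing of the uncorrected test noted
in the text.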
Following the confirmation of the co-integrating rank, the presence of unit roots
is tested, but within the multivariate framework. Here, a stationarity test of a
variable $y_i$ takes the following form:
$$
H_0: \beta = (\beta_1^{0}, \beta_2), \tag{5.5}
$$
where $\beta_1^{0} = e_i$ and $\beta_2$ is a $(p \times (r - 1))$ dimensional matrix of
unrestricted coefficients (Dennis 2006: 73).
In the test, the null hypothesis that a series is stationary against the alternative of
a unit root is conditional on the co-integrating rank, $r(\Pi)$ (Dennis 2006: 11–12),
which here is 1, and is a $\chi^2(4)$ test. In Table 5.3, the stationarity of each
variable by itself in the system is rejected in all cases, suggesting that the series
are unit root nonstationary.
Table 5.2 Johansen’s co-integration trace test results
p−r   r   Eig. value   Trace     Trace^a   Frac 95
5     0   0.404        129.669   92.685    88.554
4     1   0.293        78.471    57.177    63.659
3     2   0.174        44.125    33.068    42.770
2     3   0.169        25.185    10.323    25.731
1     4   0.067        6.890     3.055     12.448
Note Trend assumption: linear deterministic trend restricted
^a The small sample corrected test statistic (Dennis 2006: 159–160); Frac 95: the 5%
critical value of the test of H(r) against H(p). The critical value is approximated by
the Gamma ($\Gamma$) distribution (Doornik 1998)
Table 5.3 Test of stationarity
Domfin    C_spending   K_spending   Aid       Tax_Rev
16.312    37.054       37.238       30.465    37.206
[0.003]   [0.000]      [0.000]      [0.000]   [0.000]
Note Restricted trend included in the co-integrating relation(s);
5% C.V = 9.488; p-values in brackets
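The p-values in Table 5.3 can be checked by hand, because the upper tail of a
chi-square distribution with 4 degrees of freedom has a simple closed form. The sketch
below assumes only that the reported statistics are $\chi^2(4)$ distributed, as stated
in the text; the function name is chosen here for illustration.

```python
import math


def chi2_sf_df4(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 4 degrees of
    freedom, using the closed form exp(-x/2) * (1 + x/2)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# Test statistics as reported in Table 5.3.
stats = {
    "Domfin": 16.312,
    "C_spending": 37.054,
    "K_spending": 37.238,
    "Aid": 30.465,
    "Tax_Rev": 37.206,
}
p_values = {name: chi2_sf_df4(value) for name, value in stats.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")

# The quoted 5% critical value should leave about 5% in the upper tail.
print(f"tail beyond 9.488: {chi2_sf_df4(9.488):.4f}")
```

Every statistic lies well beyond the 5% critical value of 9.488, so stationarity is
rejected for each series, as the table reports.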
Table 5.4 reports the long-run b parameters of the equilibrium relationship

normalized on domestic borrowings (as this is a residual incorporated to identify the
fiscal balance) and the associated adjustment coefficients (a). Estimates of the
long-run coefficients are signed in accordance with fiscal equilibrium and suggest,
ceteris paribus, that domestic borrowing is positively related to current and capital
spending and negatively related to aid and tax revenue. The coefficients on tax
revenue are larger, suggesting that in the long run, the budget is driven by tax
revenue (or domestic revenue in general) more than aid. This implies that the fiscal
variables are more strongly related to the known level of tax which reduces the risk
of fiscal vulnerability associated with aid, which is both unpredictable and volatile
(Bulir and Hamman 2003). Aid and tax revenue coefficients have the same sign,
implying that in the long run, aid or associated reforms have increased tax revenues.
Domestic borrowings are the main financing item in the system for a primary
budget deficit net of aid. An increase in aid is associated with lower domestic
borrowings (consistent with a lower deficit to finance or, when borrowings are
negative, with enhanced ability to repay) and the net long-run effect of aid in
Rwanda has, in part, been a reduction in domestic borrowings (aid may have been
used to offset domestic borrowings).
The estimated coefficient for the effect of aid on capital spending (0.61) is higher
than that on recurrent spending (0.44). This suggests that in Rwanda, more aid has
been used to finance public investments, but with a reasonable share of it diverted to
consumption spending. While this resembles aid being fungible to the extent that
consumption spending is a necessary complement to investment spending (recur-
rent spending is required to operate an investment such as teachers’ salaries for
schools and nurses, medicines and ambulances for a healthcare center), the
assumption that fungibility diminishes the effectiveness of aid may be misleading.
Table 5.4 Estimates of long-run relationships for different normalizations of the fiscal
equilibrium
Domfin     C_spending   K_spending   Aid        Tax_Rev
Coefficients of co-integrating relationship ($\beta$)
−1.000     1.886        1.359        −0.835     −2.533
(.NA)      (7.029)      (4.826)      (−5.946)   (−6.473)
0.530      −1.000       −0.721       0.443      1.343
(7.506)    (.NA)        (−5.101)     (6.636)    (9.660)
0.736      −1.387       −1.000       0.614      1.863
(5.745)    (−5.686)     (.NA)        (4.411)    (10.739)
−1.198     2.259        1.629        −1.000     −3.034
(−8.314)   (8.690)      (5.182)      (.NA)      (−7.596)
−0.395     0.745        0.537        −0.330     −1.000
(−6.147)   (8.591)      (8.568)      (−5.159)   (.NA)
Adjustment coefficients ($\alpha$)
−0.378     −0.079       0.067        0.215      0.097
(−5.467)   (−2.768)     (3.417)      (6.146)    (6.695)
Note The rows of ($\beta$) represent different normalizations of the only uncovered
co-integrating relationship (t-ratios in parentheses). The adjustment coefficients
($\alpha$) are those obtained from normalizing the co-integrating vector on Domfin;
p-values in brackets
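Because the rows of Table 5.4 are rescalings of a single co-integrating vector, each
row can be recovered from the Domfin normalization by dividing through by minus the
coefficient of the variable being normalized on. The sketch below illustrates this with
the reported coefficients; the helper name is an assumption, and small differences in
the third decimal are expected because the inputs are already rounded.

```python
names = ["Domfin", "C_spending", "K_spending", "Aid", "Tax_Rev"]

# Co-integrating vector normalized on Domfin, as reported in Table 5.4.
beta_domfin = [-1.000, 1.886, 1.359, -0.835, -2.533]


def renormalize(beta, index):
    """Rescale beta so that the coefficient at position `index` equals -1."""
    scale = -beta[index]
    return [b / scale for b in beta]


rows = {name: renormalize(beta_domfin, i) for i, name in enumerate(names)}
for name, row in rows.items():
    print(f"normalized on {name}:", [round(b, 3) for b in row])
```

For instance, normalizing on C_spending or K_spending gives aid coefficients of about
0.44 and 0.61, the figures discussed in the text.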
Overall, more than three-fourths of the aid contributed to spending, which is plausible
and is consistent with aid being fully additional. Note that our measure of aid
included project grants and not all of these are included directly as government
spending, so there is no implication that aid has not been additional.
Relative to the coefficient on aid, the coefficient on tax revenue is larger: 1.34 for
recurrent spending and 1.86 for capital spending, suggesting that spending
over-responds to tax revenue. One interpretation is overoptimism regarding the
sustainability of tax increases: The government commits to spending expected
revenues, and if this is not realized, it resorts to some other deficit financing. This,
however, is reflective of poor budget management. The a coefficients suggest that
current spending and domestic financing adjust quite quickly to disequilibrium.
The results for the co-integrating relations in Table 5.4 imply that all variables
are significant, so to provide an empirical content to the structural analysis
underlying the causal link between aid and domestic fiscal variables, we now focus
on two types of long-run parameter restrictions described in Propositions II and III
earlier.
5.4.2.2 Aid Forms Part of the Fiscal Equilibrium and Donor

Governments Do Not React to Fiscal Disequilibrium
Table 5.5 gives the results of the test for Proposition II, that is, long-run variable
exclusion (zero restrictions on each $\beta_i$), and Proposition III, that is, weak
exogeneity (zero restrictions on each $\alpha_i$) for r = 1, based on the likelihood
ratio (LR) test distributed as $\chi^2(r)$. Consistent with the results in Table 5.4,
the null hypothesis of the exclusion of the long-run variable is rejected for all
variables (robust to small sample bias correction). Of particular interest is that aid
is a significant element of a long-run fiscal equilibrium, so it supports spending,
just like tax revenue and domestic borrowings.
Table 5.5 Structural analysis
Test of              Domfin    C_spending   K_spending   Aid       Tax_Rev
Long-run exclusion   15.475    8.481        11.794       7.362     10.811
                     [0.000]   [0.004]      [0.001]      [0.007]   [0.001]
Weak exogeneity      10.612    3.078        7.426        14.846    16.789
                     [0.001]   [0.079]      [0.006]      [0.000]   [0.000]
Note Null hypotheses are that a variable can be excluded from the co-integrating relations
(long-run exclusion) and that a variable is weakly exogenous (weak exogeneity); Obs: number of
observations = 87; p-values in brackets
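With r = 1 the LR statistics in Table 5.5 are referred to a chi-square distribution
with one degree of freedom, whose tail probability can be written with the
complementary error function. The following sketch recomputes the weak exogeneity
p-values and flags the decision at the 5% and 10% levels; the function name and the
choice of the two levels are assumptions made for illustration.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of
    freedom: P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))


# Weak exogeneity LR statistics as reported in Table 5.5.
weak_exogeneity = {
    "Domfin": 10.612,
    "C_spending": 3.078,
    "K_spending": 7.426,
    "Aid": 14.846,
    "Tax_Rev": 16.789,
}
p_values = {name: chi2_sf_df1(stat) for name, stat in weak_exogeneity.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}, reject at 5%: {p < 0.05}, reject at 10%: {p < 0.10}")
```

On the reported statistics, weak exogeneity is rejected at the 5% level for every
variable except current spending, for which the rejection holds only at the 10% level.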
Long-run weak exogeneity is also rejected for all variables in the system and
importantly for aid at conventional levels. As in Franco-Rodriguez et al. (1998), this
is consistent with fiscal planners having a target for aid revenue that is taken into
account while forming the budget. Bwire et al. (2013) report a similar result for
Uganda. As in Uganda, donors incorporate government spending
in deciding how much aid to allocate to Rwanda (Bwire et al. 2013; Foster and
Killick 2006: 19). Fiscal planners in Rwanda have a forward-looking view and have
achieved reasonable success in getting more aid allocated as budget support and
released early in the budget year. In Rwanda, the rejection of weak exogeneity of aid
suggests that aid has been responsive to within-year budget planning.
In Rwanda, endogeneity of both current and capital spending as suggested by the
results appears counterintuitive as spending is very difficult to reverse once
implemented (especially if it involves increases in public payrolls or statutory
expenditures). However, it implies that government spending is planned based on
expected revenues, whereas the allocation is affected when the revenue outcome is
realized; that is, spending allocations respond to revenue outturn. While it is
surprising that weak exogeneity of domestic borrowings also cannot be rejected,
both trend developments in Fig. 5.1 and estimates of the long-run relation in
Table 5.4 suggest that it is determined by factors other than domestic fiscal vari-
ables—that is, it depends on aid outturn but not tax revenue.
5.4.2.3 Donors’ Aid Allocation is Not Influenced by Past Fiscal

Conditions and It Does Not Influence the Fiscal Conditions
in Rwanda
Turning to the direction of causality, two issues are of interest: (1) whether past
values of the fiscal variables do not influence current values of aid, whether in terms
of long or short-run behaviors; and (2) whether aid is Granger non-causal for the
domestic budget. Results of block exogeneity, given in Table 5.6, suggest that
domestic fiscal variables influence current values of aid, allowing for the possibility
in particular, that government sets spending targets according to its development
objectives and then tries to find aid resources to finance these ambitions, albeit with
some level of unpredictability. This, however, should be interpreted with caution as
it does not imply that the authorities have control over aid allocations by donors
(aid commitments). Instead, as in Eifert and Gelb (2005), the disbursements could
be a reaction to the government’s ability to meet a donor’s administrative
requirements and/or other policy preconditions. As has been the case elsewhere, it
may also reflect exercising incentive clauses by donors in response to events over
which the Rwandan government has some direct control in the context of an
ongoing aid relationship (O’Connell et al. 2008).
Table 5.6 Granger non-causality/block exogeneity Wald tests
Excluded ↓ / Dependent variable →   Aid       C_spending   Domfin    K_spending   Tax_rev
Aid: $\chi^2(2)$                    –         2.662        5.041     4.515        1.756
                                              (0.265)      (0.080)   (0.105)      (0.416)
All: $\chi^2(8)$                    39.681    56.601       21.322    27.544       44.225
                                    (0.000)   (0.000)      (0.006)   (0.001)      (0.000)
Note p-values in brackets
The hypothesis that aid is Granger non-causal for the domestic budget is rejected
for domestic financing and for capital spending, although it is weakly significant.
This is consistent with estimates of the long-run parameters and implies in part that
the level of domestic debt is hugely influenced by the level of aid outturn such that
the higher the level of aid outturn, the lower the fiscal deficit to finance or that aid
enhances the authorities’ ability to repay domestic debt. Elsewhere, the weak results
arise because, over time, aid as a share of the budget has become numerically small as
the country strives to become self-reliant.
5.5 Conclusion and Policy Implications
This paper assessed the dynamic relationship between foreign aid and domestic
fiscal variables in Rwanda over 1990Q1 to 2015Q4 using a CVAR model. An
investigation of the long-run relation between the fiscal variables provided inter-
esting insights into the fiscal dynamics in Rwanda.
Aid and fiscal variables form a long-run stationary relationship. Aid is a sig-
nificant element in the long-run fiscal equilibrium, and the hypothesis of aid exo-
geneity is not statistically supported; that is, anticipated aid appears to have been
taken into account in budget planning. Rwandan budget planners may have had a
target for aid revenue or the donors incorporated government spending in deciding
how much aid to allocate to Rwanda or a combination of both. This implies that the
government set its spending targets according to its own development objectives
and then tried to find resources to finance these ambitions in a priority order of
domestic revenue, aid, and domestic borrowings. As improved public finance
management and reduced domestic borrowings are common policy conditions
attached to aid, the results suggest that aid was either associated with or caused
beneficial policy responses in Rwanda.
Aid was associated with increased tax efforts, lower domestic borrowings, and
increased public spending. Although the results suggest that spending was less than
proportional to incremental aid, this was most probably because our measure of aid
included project grants and not all of these are included directly as government
spending, so this is consistent with aid being fully additional. It is evident that
spending was higher than it could have been in the absence of aid. As tax revenue
share of GDP relative to sub-Saharan African standards remained small over the
period, the government was unable to maintain a budget balance including aid, so
domestic borrowings remained frequent (with repayments in years of high aid).
These results suggest some policy implications. Corroborations from the trend
analysis and estimates of the long-run coefficients suggest that domestic borrowings
remain responsive to the uncertainties associated with aid inflows. Spending targets
appear to have been formed according to anticipated aid and shortfalls in aid
outturns induced domestic borrowings. If donors ensured that aid disbursements
were more reliable and predictable, the Rwandan authorities could improve fiscal
planning and reduce the instability associated with unanticipated deficits and the
need to resort to costly domestic borrowings. Of course, some of the aid volatility
arises because of absorption problems or failure to comply with conditionalities, so
the Rwandan authorities also have a significant role to play in ensuring a stable aid
relationship.
A comprehensive analysis of the relationship between aid and government
spending requires reliable data on aid received by the government. This is a
deficiency in almost all government statistics, including in Rwanda, and it should be
addressed. Project grants related to donor-operated projects cannot increase recorded
spending as they do not go through the budget. If the government is aware of
donor projects, this could reduce government spending in that area. Therefore,
continued efforts by donors to coordinate aid delivery systems, make aid more
transparent, and support improvement in government fiscal statistics will contribute
to improving fiscal planning. Recipients need to know how much aid is available to
finance spending and how this is delivered, that is, whether through donor projects
or government budgets.
Acknowledgements The authors are grateful to an anonymous referee for constructive comments
and participants of the EABEW 2016 conference held in Kigali on 20–22 June who heard evolving
versions. The usual disclaimers apply. The data used in the analysis are available on request.
Annexure 5.1
1. Residual plots
Figure 5.2 is a panel containing four plots for each error correction model
equation: (a) actual and fitted values (top left); (b) standardized residuals (bot-
tom left); (c) auto-correlations (top right); and (d) histogram (bottom right).
Overlaid on the histogram is the estimated density function of the standardized
residuals (appears as a dotted line in print) and the density of the standard
normal distribution. It also contains some statistics: the univariate normality test
by Doornik and Hansen (DH) (2008) and Kolmogorov–Smirnov
(KS) (Lilliefors 1967) test for normality, and the Jarque-Bera test computed by
the RATS’ statistics instruction (Dennis 2006).
The actual and fitted residuals show an outlying observation in about 2013 in
virtually all residuals, except for tax revenue and domestic financing. This
notwithstanding, the histograms portray reasonably normal distribution
behavior.
[Figure: one panel per equation (DTAX_REV, DAID, DK_SPEND, DCURRENT_SPEND, DDOMFIN),
each with four plots: Actual and Fitted, Standardized Residuals, Autocorrelations (by
lag), and Histogram; time axis 1990 to 2014. Normality statistics printed on the
histograms:
DTAX_REV: SB-DH: ChiSqr(2) = 79.51 [0.00]; K-S = 0.92 [5% C.V. = 0.09]; J-B: ChiSqr(2) = 191.29 [0.00]
DAID: K-S = 0.92 [5% C.V. = 0.09]; J-B: ChiSqr(2) = 588.97 [0.00]
DK_SPEND: K-S = 0.87 [5% C.V. = 0.09]; J-B: ChiSqr(2) = 43.61 [0.00]
DCURRENT_SPEND: K-S = 0.96 [5% C.V. = 0.09]; J-B: ChiSqr(2) = 2843.89 [0.00]
DDOMFIN: K-S = 0.92 [5% C.V. = 0.09]; J-B: ChiSqr(2) = 340.04 [0.00]]
Fig. 5.2 Actual, fitted, and standardized residuals, auto-correlations, and histograms
2. Roots of companion matrix

The roots of the companion matrix are equal to the inverse of the roots of the
characteristic equation (Juselius 2006). $y_t$ is stationary when the roots of the
characteristic equation are all outside the unit circle or, equivalently, when the roots
of the companion matrix are all inside the unit circle. In practice, we need to
choose the rank so that the largest unrestricted root is far from a unit root; that is,
it has modulus lower than 1. The model here is defined for $p = 5$, $k = 1$,
implying $p \times k = 5$ roots in the characteristic polynomial (i.e., we assume full
rank of the $\Pi$ matrix). These are shown in Fig. 5.3, and as expected, all roots are
inside the unit circle.
[Figure: roots of the companion matrix plotted against the unit circle, Rank(PI) = 1;
both axes run from -1.0 to 1.0]
Fig. 5.3 Roots of the companion matrix
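The link between the co-integration rank and the roots of the companion matrix can be
illustrated with a small made-up system. The sketch below assumes a bivariate model
with one lag in levels, $x_t = (I + \alpha\beta') x_{t-1} + \varepsilon_t$, and
hypothetical values for $\alpha$ and $\beta$ (not estimates from the chapter); imposing
rank 1 on a two-variable system leaves one root at unity and one inside the unit circle.

```python
import cmath


def roots_2x2(a):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial
    lambda^2 - trace * lambda + det = 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return [(trace + disc) / 2.0, (trace - disc) / 2.0]


# Hypothetical adjustment coefficients and co-integrating vector
# (illustrative values only, NOT taken from Table 5.4).
alpha = [-0.5, 0.2]
beta = [1.0, -1.0]

# Companion matrix of the one-lag system: A = I + alpha * beta'.
A = [[(1.0 if i == j else 0.0) + alpha[i] * beta[j] for j in range(2)]
     for i in range(2)]

moduli = sorted(abs(root) for root in roots_2x2(A))
print(moduli)
```

Here the stationary root equals $1 + \beta'\alpha$, so its distance from unity reflects
how quickly the illustrative system corrects deviations from equilibrium.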
References
Adam C, O’Connell S (1999) Aid, taxation, and development in Sub-Saharan Africa. Econ Polit
11:225–253
Bulir A, Javier Hamman A (2003) Aid volatility: an empirical assessment. IMF Staff Papers 50
(1):64–89
Bwire TM, Morrissey O, Lloyd T (2013) A time series analysis of the impact of foreign aid on
Central Government’s fiscal budget in Uganda. WIDER Working Paper No. 2013/101
Bwire TM, Morrissey O, Lloyd T (2016) Fiscal reforms and the fiscal effects of aid in Uganda.
J Dev Stud, forthcoming
Chenery H, Strout W (1966) Foreign assistance and economic development. Am Econ Rev
56:679–753
Dennis GJ (2006) CATS in RATS: cointegration analysis of time series. Version 2, Estima,
Evanston, Illinois, USA
Doornik JA (1998) Approximations to the asymptotic distribution of cointegration tests. J Econ
Surv 12:573–593
Doornik JA, Hansen H (2008) An omnibus test for univariate and multivariate normality. Oxford
Bull Econ Stat 70:927–939
Eifert B, Gelb A (2005) Improving the dynamics of aid: towards more predictable budget support.
The World Bank, Washington, DC, Policy Research Working Paper 3732
Engle RF, Hendry DF, Richard JF (1983) Exogeneity. Econometrica 51:277–304
Foster M, Killick T (2006) What would doubling aid do for macroeconomic management in
Africa? Overseas Development Institute, London, ODI Working Paper 264
Franco-Rodriguez S, McGillivray M, Morrissey O (1998) Aid and the public sector in Pakistan:
evidence with endogenous Aid. World Dev 26:1241–1250
Godfrey LG (1988) Misspecification tests in econometrics. Cambridge University Press,
Cambridge
Granger CWJ (1969) Investigating causal relations by econometric models and cross-spectral
methods. Econometrica 37:424–438
Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton
Johansen S (1988) Statistical analysis of cointegration vectors. J Econ Dyn Control
12(2–3):231–254
Johansen S (1992) Testing weak exogeneity and the order of integration in UK money demand.
J Policy Model 14:313–334
Johansen S (2002) A small sample correction for tests of hypotheses on the cointegrating vectors.
J Econ 111(2):195–221
Juselius K (2006) The cointegrated VAR model: methodology and application. Oxford University
Press, Oxford
Juselius K, Møller FN, Tarp F (2011) The long-run impact of foreign aid in 36 African countries:
insights from multivariate time series analysis. WIDER Working Paper 2011/51.
UNU-WIDER, Helsinki
Lilliefors HW (1967) On the Kolmogorov-Smirnov test for normality with mean and variance
unknown. Am Stat Assoc J 62:399–402
McGillivray M (1994) The impact of foreign aid on the fiscal behaviour of Asian LDC
governments: a comment on Khan and Hoshino (1992). World Dev 22(12):2015–2017
McGillivray M, Morrissey O (2000) Aid fungibility in assessing aid: red herring or true concern? J
Int Dev 12:413–428
McGillivray M, Morrissey O (2001) Aid illusion and public sector fiscal behaviour. J Dev Stud
37:188–236
McGillivray M, Morrissey O (2004) Fiscal effects of aid. In: Addison T, Roe A (eds) Fiscal policy
for development. Palgrave for UNU-WIDER, Basingstoke, pp 72–96
Martins P (2010) Fiscal dynamics in Ethiopia: a cointegrated VAR model with quarterly data.
University of Nottingham, School of Economics: CREDIT Research Paper 10/05
Mascagni G, Timmis E (2014) Fiscal effects of aid in Ethiopia: evidence from CVAR applications.
University of Nottingham, School of Economics: CREDIT Research Paper 14/06
Morrissey O (2001) Does aid increase growth? Prog Dev Stud 1(1):37–50
Morrissey O (2015) Aid and government fiscal behaviour: what does the evidence say? World Dev
69:98–105
Morrissey O, M’Amanja D, Lloyd T (2007) Aid and growth in Kenya: a time series approach. In:
Lahiri S (ed) Theory and practice of foreign aid. Elsevier, Amsterdam, pp 313–332
O’Connell S, Adam C, Buffie EF (2008) Aid and fiscal instability. Centre for the Study of African
Economics, Oxford, CSAE WPS 18
Osei R, Morrissey O, Lloyd T (2005) The fiscal effects of aid in Ghana. J Int Dev 17:1–17
Rahbek A, Mosconi R (1999) Cointegration rank inference with stationary regressors in VAR
models. Econ J 2:76–91
Riddell R (2007) Does foreign aid really work?. Oxford University Press, Oxford
The World Bank (1998) Assessing aid: what works, what doesn’t and why. Oxford University
Press for the World Bank, Washington, DC
The World Bank (2008) Rwanda—toward sustained growth and competitiveness. CEM 1
Chapter 6
Relationship Between Inflation
and Real Economic Growth in Rwanda
Ferdinand Nkikabahizi, Joseph Ndagijimana and Edouard Musabanganji
Abstract This study examines the impact of economic stability measures (inflation
and unemployment rates) on real gross domestic product (GDP) in Rwanda. It uses
quarterly data for the period 2000Q1–2015Q4 collected from the Ministry of
Finance and Economic Planning, the Central Bank of Rwanda, and the National
Institute of Statistics of Rwanda (NISR). The study concludes that inflation and
unemployment have a negative and significant long-run relationship with real gross
domestic product. Not all coefficients are significant at the 5% level, however:
only the inflation coefficient and the error correction term are. Real gross
domestic product increases when inflation falls (p-value of 0.00266) and when
unemployment falls (p-value of 0.09882). The coefficient from the error correction
model (0.0483) means that the effect of a shock is reduced by 4.83% each quarter,
that is, by 19.32% over four quarters. This further means that the effect dies out
after about 20 quarters, that is, after a five-year period. It has to be highlighted
that the relationship between real gross domestic product and both inflation and
unemployment rates is weak.
Keywords Real gross domestic product · Inflation · Unemployment · Co-integration · Vector error correction model
JEL Classification E4 · E5 · E6
F. Nkikabahizi (✉) · J. Ndagijimana
College of Business and Economics, University of Rwanda, Butare, Rwanda
e-mail: fnkikabahizi@gmail.com
E. Musabanganji
Economy and Rural Development Unit, Gembloux Agro-Bio Tech,
University of Liège, Liège, Belgium

DOI 10.1007/978-981-10-4451-9_6
6.1 Introduction
For all countries, both developed and developing, one of the fundamental objectives
of macroeconomic policy is economic stability. Economic stability refers to an
economy that experiences constant growth and low inflation. Advantages of having
a stable economy include increased productivity, improved efficiencies, and low
unemployment. The common signs of instability are extended time in a recession or
crisis, rising inflation, and volatility in currency exchange rates. An unstable
economy leads to a decline in consumer confidence, stunted economic growth, and
reduced international investments. The main goals of any government usually
include economic growth, price stability, and low unemployment. The most
important means of moving toward these goals are detailed tax policies, spending,
regulation, and government management. However, the macroeconomic levers of
the fiscal stance and monetary policy also play a part. Attaining sustainable eco-
nomic growth coupled with price stability continues to be the central objective of
macroeconomic policies for most countries in the world today. Among others, the
emphasis on price stability in conducting monetary policy is with a view to pro-
moting sustainable economic growth as well as strengthening the purchasing power
of the domestic currency (Umaru and Zubairu 2012).
The question of whether or not inflation is harmful to economic growth has
recently been a subject of intense debate among policymakers and macroeconomists.
Several studies have estimated a negative relationship between inflation
and economic growth. Studies that base their arguments on real business cycle
theories should also ground them in country-level evidence (Pradana and Rathnayaka
2013).
Luppu (2009) established a positive relationship between inflation and GDP
growth in Romania in the short run. This implies that as inflation increases, GDP
also tends to increase in the short run and, conversely, that when inflation
decreases, GDP tends to decrease. Drukker et al. (2005) found that if the inflation
rate is below 19.16%, increases in inflation do not have a statistically significant
effect on growth, but when inflation is above 19.16%, a further increase in inflation
decreases long-run growth.
Mallik and Chowdury (2001) indicate a long-run positive relationship between
the GDP growth rate and inflation in four South Asian countries. Specifically,
the bone of contention is whether inflation is necessary for economic growth or
detrimental to it.
World economic growth and inflation rates have been fluctuating. Inflation rates
have also generally been higher than growth rates over many years; hence, the
relationship between inflation and economic growth has continued
to be one of the most significant macroeconomic problems (Madhukar and
Nagarjuna 2011). Similarly, Ahmed (2010) maintains that this relationship has been
debated in the economic literature, and that the arguments differ with
the condition of the world economic order. In accordance with these policies,
increases in total demand have led to increases in production and inflation too.
In the 1970s, countries with high inflation, especially Latin American countries,
started experiencing a decrease in growth rates, which led to the emergence of the
view that inflation had negative rather than positive effects on economic
growth. Evidence on the relationship between inflation and economic growth
from Asian countries such as India shows that India's GDP growth
increased from 3.5% in the 1970s to 5.5% in the 1980s, while the inflation rate
accelerated steadily from an annual average of 1.7% during the 1950s to 6.4% in
the 1960s and further to 9.0% in the 1970s before easing marginally to 8.0% in the
1980s (Prasanna and Gopakumar 2010). Similarly, Xiao (2009) shows that from
1961 to 1977, China’s real GDP growth and real GDP per capita growth averaged
4.84 and 2.68%, respectively. Since 1978, China’s economy has grown steadily,
although the growth rate has fluctuated from year to year, and from 1978 to 2007, the
growth rates of China’s real GDP and real GDP per capita were recorded at 9.992
and 8.69%, respectively.
A study by Stein (2010) shows that in East African countries, Kenya had five
years of very positive economic development with four consecutive years of above
4% growth. The same study shows that Uganda was one of the fastest growing
economies in Africa with sustained growth averaging 7.8% since 2000 with the
annual inflation rate decreasing from 5.1% in 2006 to 3.5% in 2009. The average
annual real GDP growth rate for Rwanda in 1990–99 was −0.1%, but from 2006 to
2009, the country had an annual average growth rate of 7.3%.
Since the late 1970s, the Tanzanian economy has experienced many internal and
external shocks. Kilindo (1997) documents the issues and maintains that all sectors
of the economy were affected by shocks, whose manifestations included large
budget deficits and an imbalance between productive and non-productive activities.
He also argues that the symptoms closely associated with these were high rates of
inflation, large balance of payments (BOP) deficits, declining domestic savings,
growing government expenditure, falling agricultural output, and decreased
utilization of industrial capacity, which in turn hindered economic growth.
Macroeconomists, central bankers, and policymakers have often emphasized the
costs associated with high and variable inflation. Inflation imposes negative
externalities on the economy when it interferes with its efficiency. Examples of
these inefficiencies are not hard to find, at least at the theoretical level. Inflation can
lead to uncertainty about future profitability of investment projects (especially when
high inflation is also associated with increased price variability). This leads to more
conservative investment strategies than would otherwise be the case, ultimately
leading to lower levels of investments and economic growth. Inflation may also
reduce a country’s international competitiveness by making its exports relatively
more expensive, thus impacting its balance of payments (Gokal and Hanif 2004).
The conventional view in macroeconomics holds that permanent and predictable
changes in inflation rates are neutral, and they do not affect real activity in the long
run. However, a substantial body of evidence suggests that sustained high inflation
rates can have adverse consequences for real economic growth even in the long run.
Nowadays, a consensus among economists seems to be that high rates of inflation
cause ‘problems’ not just for some individuals but for aggregate economic
performance. However, there is much less agreement about the precise relationship
between inflation and economic performance and the mechanism by which inflation
affects economic activity. The effects of permanent increases in the inflation rate on
long-run activity seem to be quite complicated.
The consensus about the adverse effects of inflation on real economic growth
reveals only a small part of the whole picture. Recently, intensive research has
focused on the nonlinear relationship between these two variables. That is, at lower
rates of inflation, the relationship is not significant or even positive, but at higher
rates, inflation has a significantly negative effect on growth. Bruno and Easterly
(1998) demonstrate that a number of economies have experienced sustained
inflation of 20–30% without suffering any apparently major adverse consequences.
However, once the rate of inflation exceeds some critical level (estimated at 40%),
significant declines occur in the level of real activity. The relationship between
inflation and economic growth is one of the most important economic controversies
among economists, policymakers, and monetary authorities. In particular, the core
of the argument is whether inflation is necessary for economic growth or
harmful to it. Although the relationship between inflation and
economic growth has been widely examined and investigated, it remains
debated in the economic literature.
This section reviews the theoretical literature and different empirical studies on
the relationship between inflation and economic growth. Previous studies were
concerned not only with finding a simple relationship between inflation and
economic growth, but also with whether the relationship holds in the long run or is
just a short-run phenomenon, with the causal direction of the relationship, and with
whether the relationship is linear or nonlinear.
6.2.1 Theoretical Literature
Adam Smith founded the classical theory. He recognized three factors of production:
land, labor, and capital. His production function can be expressed as Y = f(L, K, T),
where Y is output, L is labor, K is capital, and T is land. Smith considered saving as the
most important factor affecting the growth rate. In classical theories, there is no direct
explanation of inflation and its tax effect on profit levels and output. But the
relationship between the two variables is implicitly negative, operating through a
reduction in firms’ profit levels and savings caused by higher wage costs (Gokal and Hanif 2004).
In 1936, John Maynard Keynes wrote The General Theory of Employment,
Interest and Money, which established the foundation of Keynesianism. Keynesians
believe that the government has to intervene to reach full production. They believe
that intervention by the government in the economy through expansionary
economic policies will boost investment and promote demand to reach full pro-
duction. The Keynesian model is based on aggregate demand (AD) and aggregate
supply (AS) curves. In this model, the AS curve is upward sloping in the short run,
so that a change in the demand side of the economy affects both price and output
(Dornbusch et al. 1996).
Dornbusch et al. (1996) have also argued that AD and AS yield an adjustment
path which shows an initial positive relationship between inflation and economic
growth but which eventually turns negative toward the latter part of the adjustment
path. The initial positive relationship between inflation and economic growth is due
to the time inconsistency problem. Producers feel that only the prices of their own
products have increased, while other producers are operating at the same price
level. In reality, however, overall prices have increased. The producers therefore
continue to expand output. Moreover, according to Blanchard and
Kiyotaki (1987), inflation and economic growth are positively related because
firms have agreed to supply at an agreed price, so a firm has to produce even after
prices have increased. Later on, the relationship becomes negative. This describes the
phenomenon of stagflation, that is, output decreases or remains the same when
prices increase (Gokal and Hanif 2004).
‘Stagflation’ is a phenomenon that combines high inflation with low growth or
high unemployment; it dominated almost all developed countries in the middle of
the 1970s.
Monetarism was proposed by Milton Friedman. For this school of
thought, money supply is the only factor that determines price levels in an economy.
Monetarists argue that the government should manage the growth rate of money
supply to harmonize it with the growth rate of output in the long run. Monetarists
also argue that inflation will occur when money supply increases faster than the rate
of growth of national income. But the effect of money supply is different in the long
run and short run. In the short run, money supply has the dominant influence on real
variables (real GDP and employment) and price levels. But in the long run, the
influence of the variations in the money supply is primarily on price levels and on
other nominal variables but not on real variables such as real output and employ-
ment (Richard 1998).
Monetarism treats the concept of anticipation by distinguishing between the
short-run and the long-run Phillips curve (Gokal and Hanif 2004).
In this theory, the Phillips curve holds in the short run but not in the long run. In
the long run, anticipated inflation will be consistent with actual inflation, so
inflation will not influence unemployment, output, and other real economic
variables. This concept is called the neutrality of money. Gokal and Hanif (2004)
explain the concepts of neutrality and super-neutrality as follows: neutrality holds
if the equilibrium values of real variables, including the level of GDP, are
independent of the level of the money supply in the long run, and super-neutrality
holds when real variables, including the rate of growth of GDP, are independent of
the rate of growth of the money supply in the long run. Inflation will be harmless
in the case of neutrality and super-neutrality. But this may not be true in reality:
inflation can harm the economy because it affects capital accumulation,
investments, and exports and hence output.
The neoclassical growth theory started an era in which economists tried to
generate long-run equilibrium models to formulate economic growth and its
determinants. Solow and Swan are two pioneers who put forward their growth
models under the framework of neoclassical economic theory. The Solow
growth model assumes ‘diminishing returns to labor and capital separately and
constant returns to both factors jointly’ (Gokal and Hanif 2004). One of the features
of this model is that the saving rate, the population growth rate, and technological
progress are defined to be exogenous. The capital level will move to and stabilize at
the steady state, at which output remains constant for given exogenous variables.
Once this balance is broken by a change in the exogenous variables, a new steady state
will be achieved. Although the growth accounting method tells us the channels
through which variables influence economic growth, it still offers no direct
explanation of the relationship between inflation and economic growth. Mundell
(1963) and Tobin (1965) explained the effect of inflation on
economic growth on the basis of the neoclassical growth theory. They argue that
the increase in nominal interest rates caused by inflation will make people opt for
investments instead of consumption. This will result in increasing capital
accumulation, which will stimulate economic growth. This is the Mundell–Tobin
effect. Mundell (1963) and Tobin (1965) thus depict a positive relationship between
inflation and economic growth. Sidrauski (1967) incorporates monetary factors into
the neoclassical growth model with the assumption of neutrality of money. He
examines how the model reacts to a change in the growth rate of money supply.
Although he does not give a distinct path for how the new steady state is achieved
after the change in the growth rate of money supply, his conclusion is that inflation
has no relation to the output growth rate in the long run. This finding supports the
super-neutrality of money.
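The Solow steady state described earlier in this paragraph can be written compactly. The notation below is standard textbook notation and is an assumption, not taken from the chapter: $k$ is capital per effective worker, $f(k)$ is output per effective worker, $s$ is the saving rate, $n$ is population growth, $g$ is the rate of technological progress, and $\delta$ is a depreciation rate that the text does not mention.

$$\dot{k} = s\,f(k) - (n + g + \delta)\,k, \qquad \text{steady state: } s\,f(k^{*}) = (n + g + \delta)\,k^{*}$$

With $s$, $n$, and $g$ exogenous, $k$ converges to $k^{*}$; a change in any of them shifts $k^{*}$, and the economy moves to the new steady state.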
Contrary to the conclusion of the Mundell–Tobin effect, Stockman (1981)
developed a long-run equilibrium growth model with the assumption of a
‘cash-in-advance constraint.’ In Mundell (1963) and Tobin’s (1965) models, real
money balances and investments are substitutes. But in Stockman’s (1981)
model, the two variables are complements, as returns on investments
are also received by individuals in the form of money in the future. Inflation will
therefore reduce both real money balances and investments, and so it will
negatively influence growth. Generally, a theoretical review of the neoclassical growth
theory demonstrates mixed results regarding the relationship between inflation and
economic growth.
The new growth theory is also termed the endogenous growth theory, as it
treats technological progress as endogenous, in contrast to the neoclassical
growth theory, which assumes an exogenous saving rate,
population growth, and technological progress. Also, the new growth theory
assumes that the marginal product of capital is constant, whereas in the neoclassical
growth theory capital is assumed to exhibit diminishing returns. When the
new growth model is discussed within the framework of a monetary economy, the
relationship between inflation and the rate of return on capital depends on the
relationship between real money balances and investment. As discussed above for
the neoclassical theory, in Mundell (1963) and Tobin’s (1965) models, and in
Haslag (1997) and Stockman (1981), if real money balances substitute for investment,
inflation will decrease the return on real money balances, but the return on
investment will increase. But if real money balances complement investment,
inflation will have a negative effect on growth.
6.2.2 Empirical Literature—The Relationship Between Inflation and Economic Growth
Like theoretical models, existing empirical studies too reflect different views on the
relationship between inflation and output growth. Their findings differ depending
on the data period and countries, suggesting that the association between inflation
and growth is not stable. Economists now widely accept the existence of a
nonlinear and concave relationship between these two variables. The traditional
point of view, by contrast, did not consider inflation an important factor in the growth
equation. This is reflected in the studies of Dorrance (1963) and Johanson (1967),
who did not find any significant impact of inflation on growth in the 1960s.
Nevertheless, the traditional point of view changed when high and chronic inflation
was present in many countries in the 1970s; as a result, different researchers showed
that inflation had a negative impact on output growth.
Fischer (1993) and De Gregorio (1992, 1996) investigated the link between
inflation and growth in time series, cross-sectional and panel datasets for a large
number of countries. The main result of these works is that there is a negative
impact of inflation on growth. Fischer (1993) argues that inflation hampers the
efficient allocation of resources due to harmful changes in relative prices. At
the same time, relative prices appear to be one of the most important channels in the
process of efficient decision making.
Barro (1987) studied the relationship between inflation and economic growth.
He used 30 years of data for 100 countries, from 1960 to 1990, and included other
determinants of economic growth besides inflation. To analyze the data, he used a
system of regression equations. The regression results indicated that an increase in
average inflation of 10 percentage points per year led to a reduction in the growth
rate of real per capita GDP of 0.2–0.3 percentage points per year and a decrease in
the ratio of investment to GDP of 0.4–0.6 percentage points. But the result is
statistically significant only when high-inflation experiences are included in the sample.
Investigations into the existence and nature of the link between inflation and
growth have had a long history. Although economists now widely accept that
inflation has a negative effect on economic growth, researchers did not detect this
effect in data in the 1950s and the 1960s. A series of studies in the IMF Staff Papers
around 1960 found no evidence of damage from inflation (Bhatia 1960; Dorrance
1963, 1966; Wai 1959). Johanson (1967) found no conclusive empirical evidence
for either a positive or a negative association between the two variables. Therefore,
a popular view in the 1960s was that the effect of inflation on growth was not
particularly important.
Motley (1994) includes inflation in his model to examine its effect
on the real GDP growth rate. He extended the model developed by Mankiw et al.
(1992), which was based on the Solow growth model, by allowing for the possibility
that inflation tends to reduce the rate of technical change. The result indicates a
negative relationship between inflation and the growth rate of real GDP. Khan and
Senhadji (2001) analyzed the relationship between inflation and economic growth
separately for industrial and developing countries. They used econometric
techniques initially developed by Chan and Tsay (1998) and Hansen (2000) to
show the existence of threshold effects in the relationship between inflation and
economic growth. The authors used an unbalanced panel dataset covering 140
countries for the period 1960–98. The estimated threshold was 1–3% for industrial
countries and 11–12% for developing countries, so the threshold for industrial
countries was lower than that for developing countries. Inflation below the
threshold had no effect on growth, whereas inflation above the threshold had a
significant negative effect on growth.
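The threshold idea in the paragraph above can be sketched as a grid search over candidate thresholds. This is a minimal illustration on synthetic data, not Khan and Senhadji's estimator: the kinked specification growth = b0 + b1·π + b2·max(π − k, 0), the function name `fit_kink_threshold`, and all numbers are assumptions made for the example.

```python
import numpy as np

def fit_kink_threshold(inflation, growth, candidates):
    """Grid-search one inflation threshold k (hypothetical helper).

    For each candidate k, fit by OLS
        growth = b0 + b1 * inflation + b2 * max(inflation - k, 0) + e
    and keep the k with the smallest sum of squared residuals.
    """
    inflation = np.asarray(inflation, dtype=float)
    growth = np.asarray(growth, dtype=float)
    best = None
    for k in candidates:
        X = np.column_stack([
            np.ones_like(inflation),           # intercept
            inflation,                         # slope below the threshold
            np.maximum(inflation - k, 0.0),    # extra slope above the threshold
        ])
        beta, *_ = np.linalg.lstsq(X, growth, rcond=None)
        ssr = float(np.sum((growth - X @ beta) ** 2))
        if best is None or ssr < best[0]:
            best = (ssr, float(k), beta)
    ssr, k, beta = best
    return {
        "threshold": k,
        "slope_below": float(beta[1]),
        "slope_above": float(beta[1] + beta[2]),
        "ssr": ssr,
    }

# Synthetic data (assumption): mildly positive effect below 10% inflation,
# clearly negative effect above it.
rng = np.random.default_rng(0)
pi = rng.uniform(0.0, 30.0, size=400)
g = 3.0 + 0.05 * pi - 0.30 * np.maximum(pi - 10.0, 0.0) + rng.normal(0.0, 0.3, size=400)

result = fit_kink_threshold(pi, g, candidates=np.arange(2.0, 25.0, 0.5))
print(result)
```

Published threshold studies add inference on the threshold itself (for example, bootstrap tests), which this sketch omits.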
Mubarik (2005) also estimated the threshold level of inflation for Pakistan. He
found a threshold of 9%: inflation above this level affected
economic growth negatively, whereas inflation below it was conducive
to economic growth.
Some other studies have shown that the link between inflation and growth is
significant only for certain levels of inflation. For instance, Bruno and Easterly
(1995) studied the inflation–growth relationship for 26 countries over 1961–92.
They found a negative relationship between inflation and growth when the level of
inflation exceeded some threshold. At the same time, they showed that the impact of
low and moderate inflation on growth was quite ambiguous. They argue that in this
case inflation and growth were influenced jointly by different demand and supply
shocks, and thus no stable pattern existed.
Numerous empirical studies have found that the inflation–growth interaction is
nonlinear and concave. Fischer (1993) was the first to investigate this nonlinear
relationship. He used cross-sectional data covering 93 countries and used the
growth accounting framework to detect the channels through which inflation
impacted growth. He found that inflation influenced growth by
decreasing productivity growth and investment. Moreover, he also showed that the
effect of inflation was nonlinear, with breaks at 15 and 40%. Sarel (1995) found
evidence of a structural break in the interaction between inflation and growth. He
used the fixed effect technique on a panel data sample covering 87
countries over 21 years (1970–90). His main result is an estimated threshold
of 8%, above which inflation had a negative, powerful, and robust impact
on growth.
6.2.3 Empirical Literature—Causality Between Inflation and Economic Growth
Mubarik (2005) analyzed the causal relationship between inflation and economic
growth. His test results indicated that causality between the two variables was
unidirectional, that is, inflation caused GDP growth but not vice versa. Chimobi
(2010) studied inflation and economic growth in Nigeria and found unidirectional
causality from inflation to growth. Erbaykal and Okuyan (2008) analyzed the causal
relationship between inflation and economic growth in the framework of a
causality test. Their results indicated no causal relationship running from economic
growth to inflation, whereas there was a causal relationship from inflation to
economic growth.
In addition to unidirectional causality from inflation to economic growth and
bilateral causality, there are also studies which indicate unidirectional causality
from growth to inflation. Gokal and Hanif (2004) studied inflation and economic
growth in Fiji. They concluded that Granger causality runs one way, from growth to
inflation but not from inflation to growth. Datta
and Kumar (2011) examined the relationship between inflation and economic
growth in Malaysia with data from 1971 to 2007. Their findings show that
short-run causality exists between the variables, running
from inflation to economic growth, whereas in the long run economic growth
Granger-causes inflation.
Finally, there are also studies which indicate no causality relationship between
inflation and economic growth. Kigume (2011) studied inflation and economic
growth in Kenya from 1963 to 2000. The Granger causality test of his study showed
no causality relation between these two variables.
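The causality results reviewed in this subsection rest on Granger-type tests. As a rough sketch of the mechanics, the test compares a regression of one series on its own lags with a regression that also includes lags of the other series. The example below uses synthetic data in which x drives y but not the reverse; the function name `granger_f_stat` and the data-generating process are assumptions made for illustration.

```python
import numpy as np

def granger_f_stat(y, x, lags=1):
    """F statistic for H0: lags of x do not help predict y, given lags of y.

    Hypothetical helper: compares the restricted model (constant + lags of y)
    with the unrestricted model (constant + lags of y + lags of x).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y) - lags
    target = y[lags:]
    # Column j holds the series lagged by j periods, aligned with `target`.
    y_lags = np.column_stack([y[lags - j: len(y) - j] for j in range(1, lags + 1)])
    x_lags = np.column_stack([x[lags - j: len(x) - j] for j in range(1, lags + 1)])
    const = np.ones((n, 1))
    X_restricted = np.hstack([const, y_lags])
    X_unrestricted = np.hstack([const, y_lags, x_lags])

    def ssr(X):
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(X_restricted), ssr(X_unrestricted)
    df_den = n - X_unrestricted.shape[1]
    return ((ssr_r - ssr_u) / lags) / (ssr_u / df_den)

# Synthetic data (assumption): x Granger-causes y, but y does not cause x.
rng = np.random.default_rng(1)
T = 300
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

f_x_to_y = granger_f_stat(y, x, lags=1)
f_y_to_x = granger_f_stat(x, y, lags=1)
print(f_x_to_y, f_y_to_x)
```

A large statistic in one direction and a small one in the other corresponds to the unidirectional causality reported in several of the studies above. In practice the statistic is compared with an F critical value, and the series must be stationary or modelled in a co-integrated framework.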
6.3 Critical Review and Identification of Gaps
Authors such as Luppu (2009) and Mallik and Chowdury (2001), who carried
out research on related subjects, found that inflation and real economic growth
were positively related (in the short run and in the long run, respectively), while
Pradana and Rathnayaka (2013) show
the existence of a negative relationship between inflation and economic growth.
Drukker et al. (2005) and Bruno and Easterly (1998) indicate that whether the
relationship between inflation and economic growth is positive or negative depends
on the inflation rate. Empirical and theoretical evidence thus suggests that the
relationship between inflation and economic growth may be positive, negative, or
absent, which leads to ambiguity about the exact relationship.
In Rwanda, the inflation rate is likely to be stable and is low in the short run,
which does not stop the economy from improving. Existing studies of the
relationship between these variables and economic growth, however, do not focus
on the Rwandan economy, and they focus on the long-run relationship rather than
the short-run relationship. Our study, which examines the relationship between
inflation and economic growth in Rwanda, will help other scholars,
macroeconomists, and authorities to understand the relationship between inflation
and real economic growth in Rwanda, and help macroeconomic policymakers to set
strategies leading to economic stability in Rwanda.
6.4 Rationale, Objectives, and Research Questions
Our paper examines the relationship between inflation and economic growth and
analyzes the causal relationship between the two. The research findings are
significant for monetary policy authorities, business owners, and investors. They are
also important for policymakers, who can use the link between inflation
and GDP to decide on and set strategies concerning these variables,
taking into account the fact that all these variables have an impact on a country’s
well-being. For researchers, apart from contributing to knowledge about
Rwandan society, inflation, and GDP, our study provides an opportunity
to learn about the correlation between inflation and economic growth in the
world, and particularly in Rwanda, and its effect on investment decisions and business
performance in Rwanda.
The purpose of our study is to investigate the relationship between inflation and
economic growth in Rwanda and determine whether there is a turning point or a
threshold level of inflation at which the inflation effect on economic growth
switches from positive or insignificant to negative. For the purpose of economic
stability, unemployment rates are also taken into account in our study.
Our study seeks to answer the following questions: (i) Is there a significant
relationship between inflation and unemployment, on the one hand, and economic
growth, on the other? If so, is the relationship positive or negative? (ii) Is the
causal relationship between inflation and unemployment and economic growth
bidirectional, unidirectional (either from inflation to economic growth or from
economic growth to inflation), or absent? (iii) Is the Rwandan economy stable?
6.5 Formulation of the Empirical Model
We believe Granger’s (1969) model is simple and also appropriate for capturing the
specific effect of inflation on economic growth in Rwanda. We therefore
formulate this model in detail so that it is consistent with the hypotheses of the
study, assuming that an increase in the inflation rate has a negative effect on economic
growth, the dependent variable. As a measure of economic stability, the
unemployment rate is added to the model. The empirical model used for testing the
relationship between real GDP and the inflation and unemployment rates can be
specified simply as
$$\mathrm{RGDP}_t = f(\mathrm{INFR}_t, \mathrm{UNER}_t) \qquad (6.1)$$
where $\mathrm{RGDP}_t$ is the Rwandan real gross domestic product, $\mathrm{INFR}_t$ is the inflation rate,
and $\mathrm{UNER}_t$ is the unemployment rate.
Next, we estimate the following co-integration equations by VAR:
$$\mathrm{RGDP}_t = a_0 + a_1\,\mathrm{INFR}_t + a_2\,\mathrm{UNER}_t + e_t \qquad (6.2)$$
The coefficients $a_i$ ($i = 0, 1, 2$) of Eq. (6.2) are to be estimated: $a_0$ is the
intercept, and $a_1$ and $a_2$ are the parameters associated with the inflation and
unemployment rates, respectively. Taking the logarithm of the dependent variable
reduces its scale and allows us to interpret the results in terms of elasticity.
Writing Eq. (6.2) in a log-linear format, we obtain the following
long-run equation of Rwandan real GDP:
LOGRGDPt ¼ a0 þ a1 INFRt þ a2 UNERt þ et ð6:3Þ
Both long- and short-term relationships were tested using the Johansen
co-integration test and ECM, respectively. VAR was used to estimate all the
parameters.
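A short numerical sketch may help with reading the coefficients of a specification such as Eq. (6.3), where only the dependent variable is in logarithms. Under that form, a one-unit change in a regressor changes real GDP by approximately 100 times the coefficient in percent, and exactly by 100 × (exp(coefficient) − 1) percent. The function name `semi_log_effect` and the coefficient value below are illustrative assumptions, not estimates from this study.

```python
import math


def semi_log_effect(coef: float, delta: float = 1.0) -> float:
    """Exact percentage change in Y when X rises by `delta` units,
    for a model of the form log(Y) = a0 + coef * X + error."""
    return 100.0 * (math.exp(coef * delta) - 1.0)


# Hypothetical coefficient, chosen only for illustration.
a1 = -0.02
approx = 100.0 * a1           # small-coefficient approximation
exact = semi_log_effect(a1)   # exact effect implied by the log-level form
print(f"approximate: {approx:.2f}%  exact: {exact:.2f}%")
```

The two numbers are close when the coefficient is small and drift apart as it grows, so the approximation is best kept for small coefficients.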
The data used for this study are time series data covering the period
2000–15. The two macroeconomic variables included in this study are inflation rate
and unemployment rate as independent variables and the real gross domestic product
at market prices as an indicator to measure economic growth. Data was sourced from
the Central Bank of Rwanda (BNR), the National Institute of Statistics in Rwanda
(NISR), a World Bank report, and the Ministry of Finance and Economic Planning.
Our methodology proceeds as follows: lag selection, an analysis of
the stationarity of the series, the Johansen co-integration test, the Granger causality
test, the Chow test for structural breaks, and the specification of the short-run
relationship by ECM. We then give an economic interpretation of the
co-integration relation between the variables. We used GRETL to perform the
econometric analysis, and VAR was adopted for estimating the parameters. The unit
root test was performed first to establish the stationarity properties of each time
series, using the augmented Dickey–Fuller (ADF) test. If a variable was not
stationary at level, stationarity was tested on its first difference. If the variables
were stationary at their first difference, their long-run association was tested using
the co-integration technique: the Johansen co-integration test was used to confirm
the existence of long-run relationships (Bourbonnais 2007) if the series were not
stationary. As detailed in Bourbonnais (2007), this methodology is applied in three
steps: an analysis of the stationarity of the series under study to check for the
presence of unit roots in the series or their order of integration; the co-integration
test, which reveals the number of co-integrating vectors for a long-run relationship;
and lastly, the estimation of both long- and short-run relationships between the
series through the vector error correction model (VECM). The following
dynamic relationship (the short-term or error correction model)
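\[ \Delta Y_t = b_1 \Delta X_{1,t-1} + b_2 \Delta X_{2,t-2} + \cdots + b_p \Delta X_{p,t-p} + c_1 e_{t-1} + e_t \qquad (6.4) \]
was estimated following that of the long-run relationship
\[ Y_t = \hat{e}_0 + \hat{e}_1 X_{1,t-1} + \hat{e}_2 X_{2,t-2} + \cdots + \hat{e}_k X_{k,t-k} + l_t \qquad (6.5) \]
using the ordinary least squares (OLS) method.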
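The two-step procedure described above (the long-run regression first, then the short-run model with the lagged residual as error correction term) can be sketched on simulated data. This is a minimal illustration in plain Python, not the GRETL workflow used in the study: the data-generating process, the single regressor, the seed, and the helper names `ols_simple` and `ols_two_no_const` are all assumptions made for the example.

```python
import random


def ols_simple(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def ols_two_no_const(z1, z2, y):
    """OLS of y on z1 and z2 without a constant (solves the 2x2 normal equations)."""
    s11 = sum(a * a for a in z1)
    s22 = sum(b * b for b in z2)
    s12 = sum(a * b for a, b in zip(z1, z2))
    s1y = sum(a * c for a, c in zip(z1, y))
    s2y = sum(b * c for b, c in zip(z2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated data (an assumption for illustration): x is a random walk and
# y = 2 + 3x + u with stationary AR(1) noise u, so x and y are co-integrated.
random.seed(0)
T = 300
x, u = [0.0], [0.0]
for _ in range(T - 1):
    x.append(x[-1] + random.gauss(0, 1))
    u.append(0.5 * u[-1] + random.gauss(0, 1))
y = [2 + 3 * xi + ui for xi, ui in zip(x, u)]

# Step 1: long-run relation by OLS; the residuals e measure the disequilibrium.
const, beta_hat, e = ols_simple(x, y)

# Step 2: short-run model, regressing dY_t on dX_{t-1} and e_{t-1}.
dy = [y[t] - y[t - 1] for t in range(2, T)]
dx_lag = [x[t - 1] - x[t - 2] for t in range(2, T)]
e_lag = [e[t - 1] for t in range(2, T)]
b1, c1 = ols_two_no_const(dx_lag, e_lag, dy)
print(f"long-run slope = {beta_hat:.3f}, error-correction coefficient = {c1:.3f}")
```

A negative coefficient on the lagged residual is what signals adjustment back toward the long-run relation in this kind of model.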
6.7 Theory and Prior Signs (Expected)
6.7.1 Inflation and Real GDP
Investigations into the existence and nature of the link between inflation and growth
have a long history. Although economists now widely accept that
inflation has a negative effect on economic growth, researchers did not detect this
effect in data in the 1950s and the 1960s. A series of studies in the IMF Staff Papers
around the 1960s found no evidence of damage from inflation (Bhatia 1960; Dorrance
1963, 1966; Wai 1959). Johanson (1967) quoted in Ferdous and Shahid (2013) found
no conclusive empirical evidence for either a positive or a negative association
between the two variables. Therefore, a popular view in the 1960s was that the effect
of inflation on growth was not particularly important. Most subsequent empirical
findings, however, have established an inverse relationship between inflation and the
GDP growth rate. A persistent increase in the general prices of goods and services over
time impedes efficient resource allocation by obscuring the signaling role of relative
price changes, which is an important guide to effective decision making (Fischer 1993
quoted in Enu et al. 2013). Inflation also makes an economy's exports relatively
expensive, which affects the balance of payments (BOP) negatively and reduces a
country's international competitiveness.
6.7.2 Unemployment and Real GDP
In the short run, the relationship between economic growth and unemployment rate
may be a loose one. It is not unusual for the unemployment rate to show a sustained
decline sometime after other broad measures of economic activity have turned
positive. Hence, it is commonly referred to as a lagging economic indicator. Over
an extended period of time, there is a negative relationship between changes in the
rates of real GDP growth and unemployment. This long-run relationship between
the two economic variables was most famously pointed out in the early 1960s by
economist Arthur Okun. ‘Okun’s law’ has been included in a list of ‘core ideas’ that
are widely accepted in the economics profession. Okun’s law, which economists
have expanded upon since it was first articulated, states that real GDP growth about
equal to the rate of potential output growth is usually required to maintain a stable
unemployment rate (Levine 2013). Ernst and Berg (2009) as cited in Mosikari
(2013) explain that high growth is associated with a high degree of employment
intensity which is a necessary condition for the reduction of poverty. See Table 6.1.

Table 6.1 Summary of expected signs

Variable | Definition | Expected sign
RGDP_t | The real gross domestic product (the value of final goods and services evaluated at base year prices) for each year. RGDP_t = Q_t P_t | Dependent variable
INFR_t | ‘Too much money in circulation causes the money to lose value’: this is the true meaning of inflation. In economics, inflation is an increase in the general level of prices of goods and services in an economy over a period of time (Ferdous and Shahid 2013) | (−) (+)
UNER_t | Unemployment is a macroeconomic problem that affects individuals differently and severely. The loss of employment means a reduced standard of living and psychological stress. Levinsohn (2008) explains that unemployment is associated with social problems such as poverty, crime, violence, a loss of morale and degradation. UNER = (Unemployed people / Labor force) × 100 | (−)

Source Authors’ interpretation
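Okun's law as described in Sect. 6.7.2 is often written in a simple difference form. The version below is one common textbook statement, given here only as a sketch; the symbols $u_t$, $g_t$, $g^{*}$ and $\beta$ are notation introduced for this illustration and do not appear in the study.

```latex
\[
\Delta u_t = -\beta \,\bigl(g_t - g^{*}\bigr), \qquad \beta > 0
\]
```

Here $u_t$ is the unemployment rate, $g_t$ is real GDP growth, and $g^{*}$ is the growth rate of potential output. When $g_t = g^{*}$ the unemployment rate is unchanged, which matches the statement that growth about equal to potential is needed to keep unemployment stable. Growth above potential lowers unemployment, which is in line with the negative sign expected for UNER in Table 6.1.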
6.8 Results, Findings, and Economic Interpretations of the Results
6.8.1 Lag Selection and Unit Root Test
The unit root test was used to examine the stationarity of the datasets. This enabled
us to avoid the problems of spurious results that are associated with non-stationary
time series models. We used the specific unit root test to check the stationarity of
variables, that is, augmented Dickey–Fuller (ADF). The ADF test is based on the
following regression:
\[ \Delta Y_t = a + d\,Y_{t-1} + l_t \qquad (6.6) \]
where a is constant, d is slope coefficient, t is a linear time trend, and l is the error
term (Granger 1969 as cited in Iqbal et al. 2012).
The Dickey–Fuller test may suffer from a problem of auto-correlation in the error
term. To tackle the auto-correlation problem, Dickey and Fuller developed a
test called the ADF test, which can be specified in three forms:
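\[ \Delta Y_t = b_1 + Z\,Y_{t-1} + e_i \qquad \text{(Model 1: intercept only)} \qquad (6.7) \]
\[ \Delta Y_t = b_1 + b_2 t + Z\,Y_{t-1} + e_i \qquad \text{(Model 2: trend and intercept)} \qquad (6.8) \]
\[ \Delta Y_t = Z\,Y_{t-1} + e_i \qquad \text{(Model 3: no trend and no intercept)} \qquad (6.9) \]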
Null hypothesis (H0): the variable has a unit root and is not stationary.
Alternative hypothesis (H1): the variable does not have a unit root and is stationary.
To make a non-stationary variable stationary, we take first differences if the series
is I(1), or second differences if the series is I(2), that is, if it has two unit roots.
The series is stationary when the p-value is below 5%, in which case H0 is rejected.
The same rule applies when the ADF statistic exceeds the ADF critical value in
absolute value.
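The decision rule above can be illustrated with the simplest Dickey–Fuller regression, in the intercept-only form and without lagged differences. The sketch below is plain Python on simulated series; the seed, the sample size, and the asymptotic 5% critical value of −2.86 are assumptions of the example (the study's own critical values are those reported in Table 6.2), and the function name `df_tstat` is hypothetical.

```python
import math
import random


def df_tstat(y):
    """t-ratio on d in the regression dY_t = a + d * Y_{t-1} + error."""
    ylag = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    mx, my = sum(ylag) / n, sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in ylag)
    d = sum((v - mx) * (w - my) for v, w in zip(ylag, dy)) / sxx
    a = my - d * mx
    rss = sum((w - a - d * v) ** 2 for v, w in zip(ylag, dy))
    se_d = math.sqrt(rss / (n - 2) / sxx)
    return d / se_d


random.seed(1)
noise = [random.gauss(0, 1) for _ in range(200)]  # stationary by construction
walk = [0.0]
for eps in noise[1:]:
    walk.append(walk[-1] + eps)                    # unit root by construction

CRIT_5PCT = -2.86  # asymptotic 5% Dickey-Fuller critical value, intercept-only case
for name, series in (("noise", noise), ("random walk", walk)):
    stat = df_tstat(series)
    verdict = "reject H0 (stationary)" if stat < CRIT_5PCT else "cannot reject H0 (unit root)"
    print(f"{name}: t = {stat:.2f} -> {verdict}")
```

The augmented version adds lagged differences of the series to the regression to absorb auto-correlation; the comparison of the statistic with the critical value works the same way.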
The lag length was selected from the vector auto-regression estimates: we chose the
lag with the lowest AIC value for the whole model, since the lower the AIC value, the
better the model. The lag value selected was therefore 10, which had the lowest AIC
value compared to the others.
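The selection rule can be sketched as follows. This is a single-equation simplification using the Gaussian form of the AIC up to an additive constant; the residual sums of squares, the sample size, and the parameter count per lag are hypothetical numbers chosen for illustration and are not taken from the study.

```python
import math


def aic(rss: float, n: int, k: int) -> float:
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    return n * math.log(rss / n) + 2 * k


# Hypothetical residual sums of squares for lag orders 1 to 4 (not from the study).
n_obs = 60
rss_by_lag = {1: 12.0, 2: 9.5, 3: 9.3, 4: 9.25}
params_per_lag = 3  # assumed: one coefficient per variable per lag, plus a constant

scores = {p: aic(rss, n_obs, 1 + params_per_lag * p) for p, rss in rss_by_lag.items()}
best = min(scores, key=scores.get)
for p, score in scores.items():
    print(f"lag {p}: AIC = {score:.2f}")
print("selected lag:", best)
```

Adding lags always lowers the residual sum of squares, so the 2k penalty is what stops the criterion from simply choosing the longest lag.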
6.8.2 Auto-correlation Analysis
The Durbin–Watson statistic (0.044) is less than 1, which is evidence of positive
auto-correlation. In a regression analysis using
time series data, with multiple interrelated data series, auto-correlation in variables
of interest is typically modeled with the vector auto-regression (VAR).
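For reference, the Durbin–Watson statistic is the sum of squared changes in the residuals divided by the sum of squared residuals, and it is approximately 2(1 − r), where r is the first-order auto-correlation of the residuals. The sketch below shows the calculation on two toy residual series; the function name and the toy values are illustrative assumptions.

```python
def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den


# Toy residuals: identical values (extreme positive auto-correlation) and
# alternating signs (extreme negative auto-correlation).
print(durbin_watson([1, 1, 1, 1]))    # close to 0 -> positive auto-correlation
print(durbin_watson([1, -1, 1, -1]))  # well above 2 -> negative auto-correlation

# Implied first-order auto-correlation for the value reported in the text.
dw_reported = 0.044
print(f"implied r = {1 - dw_reported / 2:.3f}")
```

Values near 2 indicate no first-order auto-correlation, so a value as low as the one reported points to residuals that are very persistent.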
6.8.3 An Analysis of the Stationarity of the Series
The ADF test shows that once LRGDP is transformed into its first difference, the null
hypothesis is rejected and the series becomes stationary, so LRGDP is integrated of
order one, I(1). INFR and UNER are stationary at level, that is, I(0).
All the results from the ADF test are given in Table 6.2.
Table 6.2 Stationarity tests: augmented Dickey–Fuller (ADF) unit root tests

Variable | Crit. val. (5%) | ADF stat. | p-value | Decision
LRGDP    | −2.909          | −8.71     | 0.0000  | Stationarity at first difference, I(1)
INFR     | −2.909          | −4.432    | 0.0007  | Stationarity at level, I(0)
UNER     | −1.946          | −2.874    | 0.0048  | Stationarity at level, I(0)
As the time series variables are stationary, there is no need to test for
co-integration using the Engle–Granger and Johansen tests, because the
co-integration test is equivalent to examining whether the residuals of a regression
between two non-stationary series are stationary (Gujarati 2004).
6.8.4 Granger Causality Test Among Variables
The Granger (1969) approach to the question of whether x causes y is to see
whether the current y can be explained by past values of y and then to see whether
adding lagged values of x can improve the explanation. y is said to be
Granger-caused by x if x helps in the prediction of y, or equivalently if the
coefficients on the lagged x’s are statistically significant. It is important to note that
the statement ‘x Granger causes y’ does not imply that y is the effect or the result
of x. Granger causality measures precedence and information content but does not
by itself indicate causality in the more common use of the term. For the first null
hypothesis, ‘INFR does not Granger cause LRGDP’, the p-value is 0.9909, which is
greater than the 5% significance level, so we cannot reject the null hypothesis that
INFR does not Granger cause LRGDP. For the second null hypothesis, ‘LRGDP does
not Granger cause INFR’, the p-value is 0.1668, which is greater than 5%, so we
cannot reject the hypothesis that LRGDP does not Granger cause INFR. The third null
hypothesis, ‘UNER does not Granger cause LRGDP’, gives a p-value of 0.9717, which
is greater than 5%, so we cannot reject the hypothesis that UNER does not Granger
cause LRGDP. The fourth hypothesis, ‘LRGDP does not Granger cause UNER’, has a
p-value of 0.5299, which is greater than 5%; as a result, we cannot reject the
hypothesis that LRGDP does not Granger cause UNER. The fifth hypothesis, ‘UNER
does not Granger cause INFR’, has a p-value of 0.0847, which is greater than 5%, so
we cannot reject the hypothesis that UNER does not Granger cause INFR. The sixth
hypothesis, ‘INFR does not Granger cause UNER’, has a p-value of 0.5156, which is
also greater than 5% and gives the same result: INFR does not Granger cause UNER.
The outcome of the whole Granger causality test indicates
that there is no causality between the series.
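The six decisions above follow one rule, which the sketch below applies to the p-values reported in the text. It also shows the F statistic that usually underlies such a test, comparing a regression of y on its own lags (restricted) with one that adds the lags of x (unrestricted). The function name `granger_f` and its argument names are hypothetical, and no regression output from the study is reproduced here.

```python
ALPHA = 0.05

# p-values as reported in the text for each null "cause does not Granger cause effect".
p_values = {
    ("INFR", "LRGDP"): 0.9909,
    ("LRGDP", "INFR"): 0.1668,
    ("UNER", "LRGDP"): 0.9717,
    ("LRGDP", "UNER"): 0.5299,
    ("UNER", "INFR"): 0.0847,
    ("INFR", "UNER"): 0.5156,
}


def granger_f(rss_restricted, rss_unrestricted, q, n, k):
    """F statistic for H0: the q lagged-x coefficients are jointly zero.
    n is the number of observations, k the parameters in the unrestricted model."""
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - k))


rejected = [pair for pair, p in p_values.items() if p < ALPHA]
for (cause, effect), p in p_values.items():
    decision = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{cause} -> {effect}: p = {p:.4f}, {decision}")
print("nulls rejected at 5%:", rejected)
```

With these p-values no null hypothesis is rejected at the 5% level, which is the basis for the conclusion of no Granger causality between the series.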
6.8.5 Empirical Estimation of the Long-Run Relationship: VAR Model
Having confirmed the existence of a co-integrating relationship, we estimated the
long-run VAR (1) model, using the OLS method. The co-integrating equation
relating GDP, unemployment, and inflation is estimated as
\[ \mathrm{LRGDP}_{t-1} = 14.97 - 0.67\,\mathrm{INFR}_{t-1} - 7.05\,\mathrm{UNER}_{t-1} + e_t \qquad (6.10) \]
Standard errors: (0.14065) for $\mathrm{INFR}_{t-1}$ and (3.83228) for $\mathrm{UNER}_{t-1}$.
The values in brackets represent the standard errors associated with the estimated
coefficients of Eq. (6.10).
Economic interpretations
All variables in the co-integrating equation have the expected signs. The inflation
rate, which is a measure of macroeconomic instability, has a negative sign. This implies
that as inflation increases by 1%, RGDP falls by 0.67%: inflation discourages
investment and therefore leads to a contraction in real economic activity. Similarly,
the unemployment rate also has a negative sign, which means that when the
unemployment rate increases by 1%, real GDP declines by 7.05%. However, the
direction of causality may not necessarily run from unemployment to RGDP, since
unemployment tends to be high during recessions because firms often lay off some
workers. The error correction model (ECM) is therefore the appropriate method for
assessing the real impact of all independent variables on LRGDP.
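The ratio of each coefficient to its standard error gives a quick check on the long-run estimates in Eq. (6.10). The sketch below uses the values reported there; the pairing of the two standard errors with INFR and UNER follows their printed order, and the 1.96 cut-off is the large-sample two-sided 5% normal critical value. Both are assumptions of this illustration rather than part of the authors' procedure.

```python
# (coefficient, standard error) pairs as reported for Eq. (6.10).
coefs = {"INFR": (-0.67, 0.14065), "UNER": (-7.05, 3.83228)}

CRIT_5PCT = 1.96  # assumed large-sample two-sided 5% critical value

t_ratios = {name: b / se for name, (b, se) in coefs.items()}
for name, t in t_ratios.items():
    flag = "beyond" if abs(t) > CRIT_5PCT else "within"
    print(f"{name}: t = {t:.2f} ({flag} the 5% cut-off)")
```

Under these assumptions the inflation coefficient is more than four standard errors from zero, while the unemployment coefficient is less than two standard errors from zero.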
6.8.6 Empirical Estimation of the Short-Run Relationship
The existence of a long-run equilibrium model means that there is also a short-run
relationship, which explains the short-run disequilibrium and shows how this
disequilibrium is corrected in order to converge to the long-run equilibrium. The ECM
is also estimated using the OLS method as shown in Eq. (6.11). The error correction
term ($e_{t-1}$) has the expected negative sign and is statistically significant at 5%.
Other variables are also significant at 5%.
\[ \Delta\mathrm{LRGDP}_t = -0.127763\,\Delta\mathrm{LRGDP}_{t-1} - 0.000719\,\Delta\mathrm{INFR}_{t-1} - 0.045020\,\Delta\mathrm{UNER}_{t-1} - 0.000483\,e_{t-1} + e_t \qquad (6.11) \]
Values in parentheses, in the order of the terms: (0.13183) (0.00266) (0.09882) (0.00259)
Economic interpretations
Just like in the long-run model, the variables in the ECM have the expected signs. The
probability of $\Delta\mathrm{INFR}_{t-1}$ (0.00266) is less than 5%, meaning that
$\Delta\mathrm{INFR}_{t-1}$ is significantly negatively related to LRGDP: an increase
of 1% in INFR reduces LRGDP by 0.000719%, keeping other factors constant. As
expected, $\Delta\mathrm{UNER}_{t-1}$ also has a negative sign, but its probability
(0.09882) is greater than 5%, so its coefficient is significant at the 10% level but not
at the 5% level. A 1% increase in unemployment leads to a 0.045020% reduction in
real economic activities. RGDP (−1) has a negative sign and a standard error
(0.13183), which means that its coefficient is insignificant at the 5% level of
significance. The error correction term has a probability of 0.00259, which is less than
5%. Therefore, its coefficient is significantly different from zero. The coefficient
(−0.000483) means that for each quarter, the short-run disequilibrium will reduce
by 0.000483%, meaning that the effect of the shock will reduce by 19.32% for each
4 quarters. This further means that it will end in the 20th quarter (fifth year). The R2
values are small for all co-integrating equations (0.022675, 0.498962, and 0.373759),
which means that actually none of the variables are significant in the short run. Also,
the co-integrating equation explains dynamics in real GDP; in other words, it is a
growth model.
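One common reading of an error-correction coefficient γ is that a share |γ| of the previous period's disequilibrium is removed each period, so that a share 1 − (1 + γ)^n has been removed after n periods. The sketch below applies that reading to the coefficient reported in Eq. (6.11). It is an illustrative calculation under that assumption, and the function name `share_closed` is hypothetical.

```python
import math

gamma = -0.000483  # error-correction coefficient reported in Eq. (6.11)


def share_closed(gamma: float, periods: int) -> float:
    """Share of an initial disequilibrium removed after `periods` periods,
    assuming a fraction |gamma| of the remaining gap is closed each period."""
    return 1.0 - (1.0 + gamma) ** periods


# Number of periods needed to remove half of an initial disequilibrium.
half_life = math.log(0.5) / math.log(1.0 + gamma)

print(f"closed after 4 periods: {100 * share_closed(gamma, 4):.3f}%")
print(f"half-life: {half_life:.0f} periods")
```

The same two lines can be rerun with any other coefficient to compare adjustment speeds across specifications.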
6.8.7 Chow Test and an Analysis of the Structural Stability of the Reduced ECM and Link Between Findings and Prior Signs
The AR roots graph helps test whether the inverse roots of the AR characteristic
polynomial are inside the unit circle. As shown in Fig. 6.1, the AR roots graph
confirms that the estimated VAR model was stable over the period of the study
(also see Table 6.3). We note that the residuals are normally distributed, since the
p-value of the normality test (0.3539) is greater than 5%.
Fig. 6.1 Stability test

Table 6.3 Expected and obtained signs

Variables | Expected              | Obtained | Decision
LRGDP     |                       |          |
INFR      | Negative and positive | Negative | Confirmed
UNER      | Negative              | Negative | Confirmed
6.9 Summary and Conclusion
Our research estimated a VAR model to trace the impact of economic stability
measures (inflation rate and unemployment) on Rwandan real economic growth
(RGDP). The outcome of the research shows that there is a long-run negative and
significant relationship between inflation, unemployment, and Rwandan real
economic growth (RGDP). For Rwanda, a short-run negative relationship was also
found between real economic growth and both inflation and unemployment. In the
long run, the related standard error for each coefficient was greater than 5%; thus,
the coefficient was not significant. In the short run, only the
coefficient of unemployment was not significant.
Countries like Rwanda, which are characterized by relatively high economic
growth and stable macroeconomic conditions, do not suffer from an inflation
impact; otherwise, inflation and unemployment influence RGDP and thereby have a
long-term negative impact on economic growth. Policymaking bodies should
therefore aim at macroeconomic policies which provide cost efficiency and a
route for steady and sustainable growth. Overall, the Rwandan economy was
stable over the period of study.
References
Ahmed S (2010) An empirical study on inflation and economic growth in Bangladesh. OIDA Int J
Sustain Dev 2(3):41–48
Barro R (1987) Determinants of economic growth, a cross country empirical study. MIT Press,
Cambridge, Mass
Bhatia RJ (1960) Inflation, deflation, and economic development. Int Monetary Fund 8(1):
101–114
Blanchard OJ, Kiyotaki N (1987) Monopolistic competition and the effects of aggregate demand.
Am Econ Rev 77(4):647–666
Bourbonnais R (2007) Econométrie, 6ème édn. Dunod, Paris
Bruno M, Easterly W (1995) Inflation crises and long-run growth. NBER Working Papers
No. 5209, National Bureau of Economic Research. Available at: http://ideas.repec.org/p/nbr/
nberwo/5209.html
Bruno M, Easterly W (1998) Inflation crisis and long-run growth. J Monetary Econ 41:3–26
Chan KS, Tsay RS (1998) Limiting properties of the least square estimator of a continuous
threshold autoregressive model. Biometrika 85:413–426
Chimobi OP (2010) Inflation and economic growth in Nigeria. J Sustain Dev 3(2):159–166
Datta K, Kumar C (2011) Relationship between inflation and economic growth in Malaysia.
International conference on economics and finance research IPEDR Vol. 4, No. 2, pp 415–416
De Gregorio J (1992) The effect of inflation on economic growth. Eur Econ Rev 36(2–3):417–425
De Gregorio J (1996) Inflation, growth and Central Banks: theory and evidence. The World Bank,
Policy Research Working Paper No. 1575
Dornbusch R, Fischer S, Kearney C (1996) Macroeconomics. The McGraw-Hill Companies Inc.,
Sydney
Dorrance GS (1966) Inflation and growth. Int Monetary Fund 13(1):82–102
Dorrance S (1963) The effect of inflation on economic development. Int Monetary Fund 10(1):
1–47
Drukker D, Hernandez-Verme P, Gomis-Porgueras P (2005) Threshold effects in the relationship
between Inflation and Growth: a new Panel-Data Approach. Working paper presented at the
11th International conference on panel data, Texas A&M University, College Station, Texas
Enu P, Attah-Obeng P, Hagan E (2013) The relationship between GDP growth rate and
inflationary rate in Ghana: an elementary statistical approach. Acad Res Int 4(5):310–318
Erbaykal E, Okuyan H (2008) Does inflation depress economic growth? Evidence from Turkey.
Int Res J Finance and Econ 13(17):40–48
Ernst C, Berg J (2009) The role of employment and labour markets in the fight against poverty. In:
Promoting Pro-Poor Growth, Employment. OECD. http://www.oecd.org/dac/povertyreduction/
43514554.pdf Accessed 15 Apr 2017
Ferdous M, Mahbuba Shahid E (2013) Study on nature of inflation and its relationship with GDP
growth rate: a Case Study on Bangladesh. IOSR J Econ Finance 1(3):40–49
Fischer S (1993) The role of macroeconomic factors in growth. J Monetary Econ 32(3):485–511
Gokal V, Hanif S (2004) Relationship between Inflation and Economic Growth in Fiji. Reserve
Bank of Fiji Suva, Fiji, Economics Department. Working Paper No. 4
Granger CWJ (1969) Investigating causal relationships by econometric models and cross-spectral
methods. Econometrica 37(3):424–438
Gujarati DN (2004) Basic econometrics, 4th edn. Tata McGraw Hill
Hansen BE (2000) Sample splitting and threshold estimation. Econometrica 68:575–603
Haslag JH (1997) Output, growth, welfare, and inflation: a survey. Econ Rev Second Q, Int
Monetary Fund 8(11–12):1011–1014
Iqbal N, Din M, Ghani E (2012) Fiscal decentralisation and economic growth: role of democratic
institutions. Pak Develop Rev 52(3):176–196
Johanson HG (1967) Is inflation a retarding factor in economic growth? In fiscal and monetary
problems in developing states. Proceedings of the third Rehoroth conference. Preager, New
York, pp 121–130
Khan MS, Senhadji SA (2001) Threshold effects in the relationship between inflation and growth.
Int Monetary Fund 48(1):1–21
Kigume RW (2011) The relationship between inflation and economic growth in Kenya. Int J Bus
Soc Sci 3(10). Available at: http://ir-library.ku.ac.ke/handle/123456789/2124
Kilindo A (1997) Fiscal operations, money supply and inflation in Tanzania. Afr Econ Res
Consortium 65(3):1–7
Levine L (2013) Economic growth and the unemployment rate. Congressional research service,
7-5700, R 42063, CRS report for congress, pp 1–10
Levinsohn J (2008) Two policies to alleviate unemployment in South Africa. Center for
International Development, at Harvard University, CID Working Paper No. 166
Luppu DV (2009) The correlation between inflation and economic growth in Romania. Luccrari
Stiintifice, p 53
Madhukar S, Nagarjuna B (2011) Inflation and growth rates in India and China: a perspective of
transition economies. Int Conf Econ Finance Res 4(97):489–490
Mallik G, Chowdury A (2001) Inflation and economic growth: evidence from four South Asian
countries. Asia-Pac Dev J 8(1):123–135
Mankiw NG, Romer D, Weil DN (1992) A contribution to the empirics of economic growth.
Quart J Econ 107(2):407–437
Mosikari TJ (2013) The effect of unemployment rate on gross domestic product: case of South
Africa. Mediterr J Soc Sci 4(6):429–434
Motley B (1994) Growth and inflation: a cross-country study. Center for economic policy research,
Stanford University, CEPR Publication No. 395, pp 15–28
Mubarik YA (2005) Inflation and growth: an estimate of the threshold level of INFLATION in
Pakistan. SBP-Res Bull 1(1):35–43
Mundell R (1963) Inflation and real interest. J Polit Econ 71(3):280–283
Pradana MBJ, Rathnayaka MKTR (2013) Testing the link between inflation and economic growth:
evidence from Asia. Mod Econ 4:87–92
Prasanna S, Gopakumar K (2010) An empirical analysis of inflation and economic growth in India.
Int J Sustain Dev 15(2):4–5
Richard T (1998) Macroeconomics theories and policies, 6th edn. University of North Carolina at
Chapel Hill
Sarel M (1995) Nonlinear effects of inflation on economic growth. Int Monetary Fund 43(1):
199–215
Sidrauski M (1967) Inflation and economic growth. J Polit Econ 75(6):796–810
Stein P (2010) The economics of Tanzania, Kenya, Uganda, Rwanda and Burundi. Report
prepared for Swed Fund International AB, pp 12–32
Stockman AC (1981) Anticipated inflation and the capital stock in a cash-in-advance economy.
J Monetary Econ 8:387–393
Tobin J (1965) Money and economic growth. Econometrica 33(4):671–684
Umaru A, Zubairu J (2012) The effect of inflation on the growth and development of the Nigerian
economy: an empirical analysis. Int J Bus Soc Sci 3(10):187–188
Wai UT (1959) The relationship between inflation and economic development: a statistical
inductive study. Int Monetary Fund 7(2):302–317
Xiao J (2009) The relationship between inflation and economic growth of China: empirical study
from 1978–2007. Lund University, Sweden, pp 1–56
Chapter 7
Macroeconomic, Political,
and Institutional Determinants of FDI
Inflows to Ethiopia: An ARDL Approach
Addis Yimer
Abstract Based on the eclectic theoretical framework of foreign direct
investment (FDI) flows, this study investigates the macroeconomic, political, and
institutional determinants of FDI inflows to Ethiopia for the period 1970–2013.
Using the ARDL modeling approach, it finds that political and institutional factors
are crucial both in the long run and the short run in FDI inflows to the country. On
the macroeconomic side, the market size of the country, availability of natural
resources, openness to trade, and depreciation in the nominal exchange rate are
found to positively affect FDI inflows to the country. On the other hand,
macroeconomic instability is found to affect FDI inflows negatively. In addition, better
political stability, government effectiveness and regulatory quality, and better
performance of the rule of law are found to positively affect FDI inflows to the
country. A careful liberalization of the foreign exchange market and that of external
trade, sustaining the current growth momentum of the economy, improving insti-
tutional quality, and strengthening the political stability of the country, among
others, are fundamental areas that the government could work on to strengthen
Ethiopia’s position in FDI inflows on the continent.

Keywords ARDL · Determinants · Ethiopia · FDI · Macroeconomic stability · Political · Institutional
7.1 Introduction
Foreign direct investment (FDI) plays an important role in the growth process of
poor nations (UNCTAD 2013). Not only does it provide the much needed capital
for filling the saving-investment and foreign exchange gaps in these countries, but it
is also important for generating employment opportunities and transferring
technology and managerial know-how. In addition, by providing access to foreign
markets and building capacity through the transfer of technology, FDI improves the
integration of the host country into the global economy thus fostering growth.

A. Yimer (✉)
e-mail: addisyimer@gmail.com
DOI 10.1007/978-981-10-4451-9_7

The Ethiopian economy has to grow at least at an annual growth rate of 11% for
more than two decades so that it can attain the per capita income levels that have
been achieved today by most sub-Saharan African (SSA) countries (UNDP 2011).
However, the country’s domestic sources of finance are limited and cannot help it
achieve such a level of growth. In 2013, its gross domestic capital formation as a
share of GDP was around 33%, with gross domestic savings lagging behind at
around 6%. One alternative for filling this savings gap is through loans and
development assistance from multilateral agencies such as the World Bank and
IMF. However, as noted by Astatike and Assefa (2005) such a source of foreign
finance is unstable in nature.
Acknowledging this fact, the current Ethiopian government has opened several
economic sectors to foreign investors so that they fill the desired saving-investment
gap. The government has issued several investment incentives, including tax hol-
idays, duty-free imports of capital goods, and export tax exemptions to encourage
FDI. Further, the Ethiopian Investment Authority (EIA) has been established to
service investors and streamline investment procedures. In addition to liberalizing
investments, other areas of the external sector have also been liberalized through
unilateral, multilateral, and regional liberalization.
However, despite all these efforts, Ethiopia is not a major recipient of FDI
inflows. The country’s average share of global FDI inflows was only 0.01% in
2000–2013. In the same period, its annual average share in FDI inflows to the SSA
region was only 2%. The central question, therefore, is: why does Ethiopia not
attract much FDI?
There exists a very large body of literature on the determinants of FDI flows.
While most of them are cross-country studies in the developing world in general,
little has been done to investigate the determinants of FDI flows to Ethiopia
specifically. While cross-country studies are able to identify the factors that drive
FDI and examine its impact across countries, they fail to provide in-depth analyses
and country specific factors that are crucial in attracting FDI. Even the few studies
done on Ethiopia (which are by and large unpublished Masters’ theses) deal with
the economic determinants of FDI flows and ignore the role of political, gover-
nance, and institutional determinants of FDI flows to the host country. To the best
of our knowledge, ours is among the first studies that try to capture the effects of a
wide range of political and institutional quality indicators in the host country for
attracting FDI inflows. Among other things, most studies also share the problem of
a short series of data and omission of relevant macroeconomic variables in their
models. They are not theoretically and empirically systematic either. Our study
attempts to address these gaps.
The rest of the paper is organized as follows. Section 7.2 presents the trends in
FDI inflows to Ethiopia. Section 7.3 gives a review of the theoretical and empirical
literature on the determinants of FDI inflows to a host country. Our study’s
empirical methodology is discussed in Sect. 7.4 while Sect. 7.5 discusses the
results of the empirical exercise. Finally, Sect. 7.6 gives a conclusion and some
policy recommendations.
7.2 FDI Inflows to Ethiopia
Net FDI inflows to Ethiopia were at a mere US$3.9 million in 1970, representing a
very negligible share in global investment flows. This figure increased substantially
to US$953 million in 2013, although its share in global FDI flows was still a
small fraction of a percent. This increase in FDI inflows to the country may be explained by factors
that characterized the economic and political landscape that prevailed over the
period under study. This period mainly witnessed two distinct political regimes.
The first period, 1974–1991 related to the Derg regime, where the socialist ideology
of a centralized command economic system controlled the sphere of socioeconomic
policy making in the country. As noted by Geda (2008), this regime was mainly
characterized by a deliberate repression of market forces and socialization of the
production and distribution process and adoption of a ‘hard control’ regime. In this
period, the country’s economic performance was highly irregular due to its
dependence on the agricultural sector (which is vulnerable to the vagaries of nature)
and the intense conflict that characterized the period (see Geda 2008). The second
period, post-1991 to the present, started with the coming to power of the Ethiopian
People’s Revolutionary Democratic Front (EPRDF) in 1991, after the demise of the
Derg. In terms of socioeconomic policies, there was a significant move away from
the doctrines of the command system in favor of a free market.
The regime has adopted structural adjustment policies of market liberalization
with the support of the World Bank and IMF (see Geda 2008). Economic perfor-
mance during this period has substantially improved not only by the Derg’s stan-
dards but also by African standards. The improvements in economic performance in
this period appear to be a combined result of the reforms, favorable weather con-
ditions, and better political stability and relative peace that have prevailed (see Geda
2008). Likewise, FDI inflows to the country have also registered a significant
increase in this period. They increased from a period’s average of US$5.9 million
during the Derg regime to around US$270 million in the EPRDF regime
(UNCTAD 2013). Despite the ups and downs (due to the global financial crisis in
2008 and deteriorating peace as a result of the war with Eritrea in 1998–2000,
among other things), net FDI inflows reached a level of nearly US$1 billion by
2013 (Fig. 7.1). As argued in a report of the Ethiopian Investment Commission
(2014), this was mainly due to the various liberalization policies, better economic
performance, and a stable political sphere that characterized the period.
Fig. 7.1 FDI inflows to Ethiopia (1970–2013) (in million US$). Source Author’s computation
based on World Development Indicators (2015b) and UNCTAD (2013)

Fig. 7.2 Ethiopia’s FDI inflows as percentage of gross fixed capital formation. Source Author’s
computation based on World Development Indicators (2015b) and UNCTAD (2013)

Total FDI inflows as a percentage of gross fixed capital formation in the country
were around 0.7% in 1990. This reached a little over 6% in 2013, despite the ups
and downs over the years. However, this is still a modest increase (see Fig. 7.2).
Looking at the distribution of FDI inflows by sector, manufacturing led the list
(with a 70.6% share of the total FDI inflows) followed by the service sector (10.7%)
and agriculture (8.7%) (Ethiopian Investment Commission 2014).
7.3 Review of Related Literature
7.3.1 Theoretical Literature
The early neo-classical approach, summarized in MacDougall (1960), hypothesized
that capital flows across countries were governed by differential rates of return. It
argued that such capital inflows were welfare enhancing for both the parties
engaged in the capital’s movement. The MacDougall model assumes perfect
competition, risk-free capital movement, mobility in factors of production, and no
risk of default. The portfolio approach to FDI, presented in a reaction to the
MacDougall model, emphasizes not only return differentials but also risk (Agarwal
1980). In line with this, Ohlin (1933) was one of the first to address the issue of
determinants of FDI. According to Ohlin (1933), FDI was motivated mainly by the
possibility of high profitability in growing markets, along with the possibility of
financing these investments at relatively low rates of interest in the host country.
Other determinants were the necessity to overcome trade barriers and to secure
sources of raw materials. This is strengthened by a theory which emphasizes the
positive relationship between FDI and output (sales in host country) along the lines
of Jorgenson’s (1963) model (see Agarwal 1980).
A major criticism of these theories relates to the question of perfection in
markets. Hymer (1976) and Kindleberger (1969) argue that if foreign firms are to
compete and succeed in the host country, then they must be in possession of a
specific and transferable competitive advantage both over local firms and other
potential entrants into the local market. Building on Hymer’s (1960) analysis,
Kindleberger (1969) posited that instead of multinational firms’ behavior deter-
mining the market structure, it is the market structure (monopolistic competition)
that determines a firm’s conduct by internalizing its production. Caves (1971) has
supported such an analysis and has further argued that FDI is also related to trade
barriers and could be taken as a way of avoiding uncertainties in supplies, or as a
way of imposing barriers to new firms in the external market. This analysis also
focuses on the micro-foundations of FDI by moving from a simple capital
movement/portfolio theory to a broader production and industrial organizational theory.
This school of thought has formed the basis of a whole strand of literature.
According to this line of thinking, some advantages of competitive foreign firms
include cheaper sources of financing; the use of brand names and patent rights;
technological, marketing, and managerial skills; economies of scale; and entry and
exit barriers (Agarwal 1980; Kindleberger 1969).
A related micro-based theory of FDI has also emerged with the development of
Vernon’s product cycle theory (Vernon 1966). The product cycle theory is an
advance over previous theories in that it incorporates an analysis of oligopoly and
strategic market considerations. Based on Vernon’s theory of ‘product cycle,’ and
the existence of ‘new’ and ‘old’ goods, Krugman (1979) developed this theoretical
avenue further for explaining FDI flows. Specifically, he extended the analysis to a
North–South framework with innovation (in the ‘North’) and technology transfer
(to the ‘South’) representing its crucial aspects. Krugman (1979) notes that tech-
nological progress raises the marginal product of capital and provides an incentive
for FDI. On the other hand, this process may be reversed through technology
transfer. Mainstream trade theories usually underlie this type of analysis. Recent
theories of trade such as that of the ‘economies of specialization’ which emphasize
the existence of intra-industry (as well as intra-firm) trade, also provide scope for an
analysis of FDI (see, for instance, Ocampo’s 1986 survey).
Notwithstanding Vernon’s contribution, and building on Hymer’s original
contribution, a second wave of refinements to the neo-classical capital movement/portfolio
theory of FDI has also come into being with the emergence of explanations based
on the ideas of the ‘international firm’ and ‘industrial organization.’ The fact that
decision making about FDI takes place within the context of oligopolistic firm
structures and that such an investment includes a package of other inputs such as
intermediate imports and capital flows has led to the development of alternative
explanations grounded in the theory of industrial organization (see Agarwal 1980;
Dunning 1993; Helleiner 1989). In this approach as set out by Hymer, foreign firms
are seen as having an advantage over local ones. The foreign firms’ pursuit of FDI
is explained by the theory of internalization. This is characterized by the desire to
minimize transaction costs, à la Coase (1937), to tackle risks and uncertainties,
increase control and market power, achieve economies of scale, and ensure
advantageous transfer pricing (Buckley and Casson 1976; Hymer 1976). In this
approach, oligopoly is seen as mitigating, rather than creating, market imperfections
(Helleiner 1989).
Dunning’s (1993) work, which he terms the ‘eclectic paradigm,’ represents a
culmination of this trend toward a refinement of FDI theories. Without departing
much from the Heckscher–Ohlin–Samuelson theory of trade for explaining the
spatial distribution of multinational firms, Dunning’s paradigm summarizes this
strand of theory under an ‘ownership-specific, location and internalization’
(OLI) framework (see Dunning 1993). Framed in a micro-macroeconomic frame-
work, Dunning’s (1981, 1988, 1993) approach provides a flexible and popular
framework where he argues that FDI is determined by three sets of advantages
which direct investments should have over the other institutional mechanisms
available for a firm in satisfying the needs of its customers at home and abroad. The
first of the advantages is an ownership (O)-specific one which includes the
advantage that a firm has over its rivals in terms of its brand name, patent, or
knowledge of technology and marketing. This allows the firm to compete with other
firms in the markets that it serves regardless of the disadvantages of being foreign.
The second is the set of location (L)-specific advantages, that is, the features of
the host country that make it a more attractive site for FDI than other countries.
The third is the internalization (I) advantage, which relates to the preference for
a ‘bundled’ FDI approach over ‘unbundled’ product licensing, capital lending, or
technical assistance (Wheeler and Mody 1992). This refers to the superior
commercial benefits for firms resulting from the exploitation of ownership and
location-specific advantages by investing in foreign affiliates that they control,
rather than through transactions with unrelated firms located abroad. Helleiner
(1989) notes that ‘this “eclectic” theory of direct investment drawing on firm-
specific attributes, location advantages and internalization advantages—is widely
accepted.’ There also exists an international trade version of FDI determination
(termed the macro-approach), which is associated with Kojima’s (1973) work. The
Kojima model argues that FDI may be explained by the ‘comparative disadvantage’
of industry in the investing countries. According to Kojima’s theory, this may be
mitigated by investing in a foreign industry, which may be able to achieve com-
parative advantages in the production of a particular product and potentially even
export back to the home country. Naturally, this type of FDI will also have the
effect of increasing trade volumes (Kojima 1973).
In sum, theories of the determinants of FDI cover a range of explanations: the
pure capital movement, product cycle, industrial organization, the stagnation thesis,
and other political considerations. In the African context, the pure capital theory
does not work since the assumptions do not hold. Neither is Krugman’s hypothesis
workable since it is more relevant for countries with a good industrial base and
infrastructure. On the other hand, the concentration of multinational corporations in
the mining sectors in most African countries and, to a good degree, the importance
of the colonial history in determining their spatial pattern (see Geda 2002) might be
taken as lending support to the importance of the ‘eclectic’ approach. This theo-
retical insight is used in identifying FDI determinants in the empirical analysis and
construction of our model.
7.3.2 Empirical Literature: Empirical Regularity in Africa
The empirical literature on the determinants of FDI in developing countries is
voluminous and is based on both country case and cross-sectional analyses.
However, in the discussion that follows, we focus on evidence found in African
studies which offer some insights about the empirical analysis conducted in our
study. In general, the findings of these studies reveal that labor costs, country size,
economic openness, exchange rate regime, return on investment, human capital,
and political factors are among the most important factors explaining FDI flows to
the region.
Most studies on Africa report that FDI to Africa is largely motivated by natural
resource endowments of the countries on the continent (Asiedu 2002, 2003; Asiedu
and Gyimah-Brempong 2008; Basu and Srinivasan 2002; Morisset 2000; among
others). Based on a survey conducted in 29 African countries using both panel and
cross-sectional analysis, Morisset (2000) reported a high correlation between FDI
inflows and total value of natural resources in each country. He further reported that
economic growth and trade openness had a large impact on the level of FDI inflows
that a given country received. Basu and Srinivasan (2002) found that almost 40% of
the FDI in their African study found its way to the primary sector, particularly in the
oil and mineral extraction business. Countries such as Angola, Botswana, Namibia,
130 A. Yimer
and Nigeria received foreign investments targeted at the oil and minerals sectors of
their economies (Basu and Srinivasan 2002). Though natural resource abundance is
a common factor which explains much of the FDI inflows, a few successful African
countries have also managed to attract FDI by creating favorable economic, social,
and political environments (Basu and Srinivasan 2002; UNCTAD 1998). For
instance, countries such as Mauritius and Seychelles have managed to attract FDI
by tailoring their FDI policies through liberalization, export orientation, tax, and
other investment incentives. Moreover, some countries such as Lesotho and
Swaziland have attracted FDI because they are near South Africa and investors
wanting to serve the large market in South Africa have located their subsidiaries in
these countries (Basu and Srinivasan 2002; UNCTAD 1998).
Asiedu (2002) analyzed 34 countries in sub-Saharan Africa over 1980–2000.
Using a panel data analysis, she found that openness to trade, higher incomes and
better growth prospects, and better institutional frameworks and infrastructure were
‘rewarded’ with more investments. Later studies by Asiedu (2003, 2006) show the
significant role of a country’s market size and natural resource endowment in
enhancing FDI. Lower inflation, good infrastructure, an educated population,
openness, less corruption, political stability, and a reliable legal system were also
found to have similar positive effects on FDI flows into the continent in these
studies. Asiedu and Gyimah-Brempong (2008) validated these findings to a large
extent and noted that countries that were small or lacked natural resources could
attract FDI by improving their institutions and policy environments.
Based on a co-integration analysis for 1970–2000 using data from 19 SSA
countries, Bende-Nabende (2002) found market growth, export-oriented policies,
and liberalization as the most dominant long-run determinants of FDI in Africa. In
line with Bende-Nabende (2002), focusing on manufactured goods, primary com-
modities, and services, Kandiero and Chitiga (2003) analyzed the impact of
openness on FDI flows to Africa in 51 African countries. Their findings indicate
that FDI responds significantly to increased openness in the whole economy in
general and in the service sector in particular.
Using fixed and random effects models on a panel dataset for 29 African
countries over the period 1975–1999, Onyeiwu and Shrestha (2004) identified
economic growth, inflation, openness of the economy, international reserves, and
natural resource availability as important determinants of FDI to Africa. Contrary to
conventional wisdom, political rights and infrastructure were found to be unim-
portant in their study. Krugell (2005) also empirically tested the significance of a
number of hypothesized determinants of FDI in sub-Saharan Africa. The pooled
cross-country and time-series estimation covered the period 1980–1999 in 17
countries. Krugell’s results are in line with the findings mentioned earlier, partic-
ularly with respect to economic growth and openness.
Abdoul (2012) estimated a model of FDI determination using five-year panel
data with the system-GMM technique over 1970–2009 for 53 African countries. He
found that larger countries attracted more FDI. However, regardless of their size,
more open and politically stable countries that offered higher returns to investments
also attracted FDI. FDI inflows were also found to be persistent in the sense that
countries that manage to attract FDI today are likely to attract more FDI in the
future. Using cross-country data for 53 African countries for the period 1996–2008,
Anyanwu (2012) found market size (whose proxy is urban population as percentage
of total population and GDP per capita of the host country), openness to trade, the
rule of law, foreign aid, natural resources, and past FDI inflows (increased
agglomeration) to have a positive effect on FDI inflows. He also found domestic
financial development to have a negative effect on FDI inflows. Further, he found
that East and Southern African sub-regions appeared positively disposed to
obtaining higher levels of inward FDI.
Among the most recent FDI studies on Africa, Geda and Yimer (2015) have
estimated a model of FDI determination for Africa based on a new analytical
country classification of African economies as ‘Fragile, Factor, and Investment
driven’ economies. Using a panel co-integration approach over 1996–2012 they
found market size, availability of natural resources, openness to international trade,
a stable macroeconomic environment, better infrastructure, and an effective
bureaucracy to have a strong positive impact on attracting FDI to the continent. On
the other hand, they also found that political and macroeconomic instability and
high financial and transfer risks had a negative effect on attracting FDI to the
continent. However, the effect of these factors varied significantly across the ana-
lytical country classification that they developed (Geda and Yimer 2015). Among
all determinants of FDI, only government effectiveness and natural resource
abundance were found to be important across all countries. They stress the
importance of emphasizing different policies in different countries or country groups.
Country case studies on Africa, which invariably use time series analyses, have
reported results that are similar to those in recent cross section-based studies
reviewed earlier. Among these, Astatike and Assefa (2005) examined determinants
of FDI in Ethiopia over 1974–2001 using a time series analysis. Their empirical
analysis shows that economic growth, export orientation (openness), and liberal-
ization had a significant positive impact on FDI, while macroeconomic instability
(measured by inflation) and a low level of physical infrastructure (measured by
telephone lines per 1000 people) had a negative impact. Similarly, using a time
series analysis for Cameroon, Sunday and Lydie (2006) show that the level of
infrastructure development (increased electricity production and the ratio of paved
roads) was the most significant determinant of FDI in the country. Market size
(GDP per capita), openness, human capital development, and the rate of economic
growth were also important but were found to be less significant. Exchange rate,
political risk, the rate of inflation, debt burden, agglomeration effect, and the cre-
ation of an export-processing zone did not have any influence on FDI in Cameroon.
Seetanah and Rojid (2011) examined the determinants of FDI in Mauritius using
reduced-form demand for the inward FDI function. In their study, openness, wages,
and the quality of labor in the host country were important. Size of the market was
reported to have a relatively lesser impact on FDI; this is probably related to the
limited size of the population and the good export opportunities from Mauritius to
other African countries, especially in the SADC/COMESA regions. The significant
coefficient of the lagged dependent variable in their model suggests the presence of
dynamism in the system. Finally, Okpara (2012), using Granger causality and an
error correction model, investigated the determinants of FDI flows to Nigeria during
1970–2009. He found that natural resource abundance, fiscal incentives, favorable
government policies, exchange rate, and infrastructural development had a positive
and statistically significant effect on FDI flows to Nigeria. Though statistically
insignificant, market size and trade openness were found to have a positive sign
while political risk was found to have a negative sign. Further, the statistically
significant error correction term revealed that past foreign investment flows could
significantly stimulate current investment inflows.
In sum, both the theoretical discussion in the previous section and the brief
review of empirical studies in this section show that market size, openness of the
economy, natural resource endowments, and political and macroeconomic stability
are important determinants of FDI flows to Africa. These are important factors that
any model about determinants of FDI flows to Africa needs to consider. However,
when examined in light of FDI theoretical literature, none of these African studies
formulate their empirical models by explicitly following one or the other strand of
literature. The variables used in their models, however, suggest the use of
Dunning’s eclectic paradigm without stating which variable is used as a proxy for
which theoretical concept. This is partly a result of missing theoretical discussions
and formulations in almost all these studies.
7.4 The Empirical Methodology
7.4.1 Auto-regressive Distributed Lag (ARDL) Approach to Co-integration
In the economic literature, a number of co-integration techniques have been used,
such as those of Engle and Granger (1987), Johansen (1988), Johansen and Juselius
(1990), Gregory and Hansen (1996), and Saikkonen and Lütkepohl (2000), as well as
Pesaran et al.’s (2001) ARDL approach.
The ARDL approach developed by Pesaran et al. (1996, 2001) and Pesaran and
Shin (1999) has become popular in recent years. The ARDL model has some
advantages over other co-integration approaches. Firstly, this technique is
comparatively more robust in small or finite samples consisting of 30–80 observations
(Pattichis 1999). Secondly, it can be utilized irrespective of whether the regressors are
I(0), I(1), or mutually co-integrated, though there is still a prerequisite that none of
the variables is I(2) or of a higher order; that is, the ARDL bounds procedure is
not valid in the presence of I(2) or higher order series. Thirdly, the ARDL
model applies a general-to-specific modeling framework by taking a sufficient
number of lags to capture the data-generating process.
Further, traditional co-integration methods may also experience the problems of
endogeneity, whereas the ARDL method can distinguish between dependent and
explanatory variables and remove the problems that may arise due to the presence
of auto-correlation and endogeneity. The ARDL co-integration estimates short-run
and long-run relationships simultaneously and provides unbiased and efficient
estimates. The appropriateness of using the ARDL model is that it is based on a
single equation framework. The ARDL model takes sufficient numbers of lags and
directs the data-generating process in a general to specific modeling framework
(Harvey 1981). Unlike other multivariate co-integration techniques such as
Johansen and Juselius (1990), the ARDL model permits the co-integration rela-
tionship to be estimated by OLS once the lag order of the model is identified. The
error correction model (ECM) can also be drawn by using the ARDL approach
(Pesaran and Shin 1999). ECM allows drawing outcomes for long-run estimates
while other traditional co-integration techniques do not provide such types of
inferences. As noted by Pesaran and Shin (1999), ECM joins together short-run
adjustments with long-run equilibrium without losing long-run information.
These advantages of the ARDL technique over other standard co-integration
techniques justify the application of ARDL approach in our study to analyze the
relationship among the FDI model’s variables.
7.4.2 The Empirical Model in the ARDL Framework
In order to examine the long-run relationship and the dynamic interaction between
FDI and institutions, our study employs an ARDL modeling approach. According
to Pesaran et al. (2001) the ARDL approach requires three steps:
The first step is estimating the long-run relationship among the variables. This is
done by testing the significance of the lagged levels of the variables in the error
correction form of the underlying ARDL model. Following Pesaran et al. (2001),
our ARDL model can be written as:
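$$
\begin{aligned}
\Delta LFDI_t ={}& \alpha_0 + \beta_1 LFDI_{t-1} + \beta_2 LRGDP_{t-1} + \beta_3 LRES_{t-1} + \beta_4 LINF_{t-1} \\
&+ \beta_5 LDEBGDP_{t-1} + \beta_6 LOPNES_{t-1} + \beta_7 LNER_{t-1} + \beta_8 Polinst_{t-1} \\
&+ \sum_{i=1}^{p} \delta_1 \Delta LFDI_{t-i} + \sum_{i=0}^{p} \delta_2 \Delta LRGDP_{t-i} + \sum_{i=0}^{p} \delta_3 \Delta LRES_{t-i} + \sum_{i=0}^{p} \delta_4 \Delta LINF_{t-i} \\
&+ \sum_{i=0}^{p} \delta_5 \Delta LDEBGDP_{t-i} + \sum_{i=0}^{p} \delta_6 \Delta LOPNES_{t-i} + \sum_{i=0}^{p} \delta_7 \Delta LNER_{t-i} + \sum_{i=0}^{p} \delta_8 \Delta Polinst_{t-i} + \varepsilon_t
\end{aligned}
$$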
where LFDI is the log of FDI, LRGDP is the log of real GDP, LRES is the log of natural
resource abundance, LINF is the log of the domestic annual inflation rate, LDEBGDP is
the log of the external debt to GDP ratio, LOPNES is the log of openness, LNER is the
log of the nominal exchange rate, and Polinst is an indicator of political stability and
the quality of institutions in the host country. As there is a high degree of
multi-collinearity among the six political and institutional indicators, we used each of
the political and institutional indicators separately. Hence, in all three steps the
variable Polinst denotes a single political and institutional indicator that enters the
model alongside the macroeconomic variables. The selection of the optimum lag
orders of the ARDL models is based on the Schwarz Bayesian Criterion (SBC). In
order to test for co-integration among the variables, the Wald F-statistic for the
joint hypothesis has to be compared with the critical values tabulated by Pesaran
et al. (2001).
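The SBC-based lag selection mentioned above can be illustrated with a minimal sketch. It assumes the common residual-sum-of-squares form of the criterion, n ln(RSS/n) + k ln(n), where smaller values are preferred; the function names and the fitted values are hypothetical and are not taken from the study. All candidate models are assumed to be estimated on the same sample.

```python
import math

def sbc(rss, n_obs, n_params):
    """Schwarz Bayesian Criterion in residual-sum-of-squares form (smaller is better)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)

def select_lag(candidates, n_obs):
    """Return the lag order with the smallest SBC.

    candidates maps a lag order to (residual sum of squares, number of parameters).
    """
    return min(candidates, key=lambda p: sbc(candidates[p][0], n_obs, candidates[p][1]))

# Hypothetical fits on 40 observations: lag order -> (RSS, parameters estimated)
fits = {1: (12.0, 10), 2: (10.5, 18), 3: (10.1, 26)}
best_lag = select_lag(fits, n_obs=40)
```

In this illustration the small reduction in RSS from adding lags does not offset the penalty on the extra parameters, which is why the SBC tends to favor parsimonious specifications in short samples.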
The joint hypotheses to be tested are as follows:
$$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0$$
$$H_1: \beta_i \neq 0, \quad i = 1, 2, \ldots, 8$$
If the F-statistic is higher than the upper bound critical value, the null hypothesis
($H_0$) is rejected, indicating that there is a long-run relationship between the lagged
level variables in the model. In contrast, if the F-statistic falls below the lower
bound, then $H_0$ cannot be rejected and no long-run relationship exists. However, if
the F-statistic falls between the lower and upper bound critical values, the
inference is inconclusive. In this case, the order of integration of each variable
should be determined before any inference can be made.
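The three-way decision rule of the bounds test can be written down as a small helper. This is only a sketch: the function name is hypothetical, and the bounds used in the example are placeholder numbers, not values from the tables in Pesaran et al. (2001), which depend on the number of regressors, the deterministic terms, and the significance level.

```python
def bounds_test_decision(f_stat, lower, upper):
    """Classify a bounds-test F-statistic against lower and upper critical bounds."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper:
        return "reject H0: long-run relationship"
    if f_stat < lower:
        return "cannot reject H0: no long-run relationship"
    return "inconclusive"

# Placeholder bounds for illustration only (not tabulated critical values)
LOWER, UPPER = 2.5, 3.8
decision_high = bounds_test_decision(5.1, LOWER, UPPER)
decision_mid = bounds_test_decision(3.0, LOWER, UPPER)
decision_low = bounds_test_decision(1.9, LOWER, UPPER)
```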
In the second step, once the co-integration is established, the conditional ARDL
(p, q, r, s, t, u, v, w) long-run model of the determinants of LFDIt can be estimated
as follows:
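$$
\begin{aligned}
LFDI_t ={}& \alpha_0 + \sum_{i=1}^{p} \beta_1 LFDI_{t-i} + \sum_{i=0}^{q} \beta_2 LRGDP_{t-i} + \sum_{i=0}^{r} \beta_3 LRES_{t-i} + \sum_{i=0}^{s} \beta_4 LINF_{t-i} \\
&+ \sum_{i=0}^{t} \beta_5 LDEBGDP_{t-i} + \sum_{i=0}^{u} \beta_6 LOPNES_{t-i} \\
&+ \sum_{i=0}^{v} \beta_7 LNER_{t-i} + \sum_{i=0}^{w} \beta_8 Polinst_{t-i} + \varepsilon_t
\end{aligned}
$$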
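In an ARDL model in levels such as the one above, the implied long-run effect of a regressor is commonly computed as the sum of its lag coefficients divided by one minus the sum of the coefficients on the lagged dependent variable. The sketch below illustrates this arithmetic; the function name and the coefficient values are hypothetical and do not come from the study's estimates.

```python
def long_run_coefficient(own_lag_coefs, regressor_lag_coefs):
    """Long-run effect of a regressor implied by ARDL coefficients in levels.

    own_lag_coefs: coefficients on lags 1..p of the dependent variable.
    regressor_lag_coefs: coefficients on lags 0..q of one regressor.
    """
    denominator = 1.0 - sum(own_lag_coefs)
    if abs(denominator) < 1e-12:
        raise ValueError("own-lag coefficients sum to one; long-run effect is undefined")
    return sum(regressor_lag_coefs) / denominator

# Hypothetical ARDL(1, 1) coefficients: 0.5 on the lagged dependent variable,
# 0.2 and 0.1 on the current and lagged regressor
lr_effect = long_run_coefficient([0.5], [0.2, 0.1])  # approximately 0.6
```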
In the final step, we obtain the short-run dynamic parameters by estimating an
error correction model (ECM) associated with the long-run estimates. This is
specified as follows:
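$$
\begin{aligned}
\Delta LFDI_t ={}& \alpha_0 + \sum_{i=1}^{p} \delta_1 \Delta LFDI_{t-i} + \sum_{i=0}^{q} \delta_2 \Delta LRGDP_{t-i} + \sum_{i=0}^{r} \delta_3 \Delta LRES_{t-i} \\
&+ \sum_{i=0}^{s} \delta_4 \Delta LINF_{t-i} + \sum_{i=0}^{t} \delta_5 \Delta LDEBGDP_{t-i} + \sum_{i=0}^{u} \delta_6 \Delta LOPNES_{t-i} \\
&+ \sum_{i=0}^{v} \delta_7 \Delta LNER_{t-i} + \sum_{i=0}^{w} \delta_8 \Delta Polinst_{t-i} + \theta ECM_{t-1} + \varepsilon_t
\end{aligned}
$$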
where $\delta_1, \delta_2, \ldots, \delta_8$ are the short-run dynamic coefficients of the
model’s convergence to equilibrium and $\theta$ is the speed of adjustment.
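To make the speed of adjustment concrete, the sketch below converts a coefficient on the lagged error correction term into the share of a disequilibrium that remains after a number of periods, and into a half-life. It assumes a simple monotonic adjustment with a coefficient between -1 and 0 and ignores the other short-run terms; the function names and the example values are hypothetical.

```python
import math

def remaining_gap(theta, periods):
    """Share of an initial disequilibrium still present after `periods` periods."""
    return (1.0 + theta) ** periods

def half_life(theta):
    """Periods needed to close half of a disequilibrium, for -1 < theta < 0."""
    if not -1.0 < theta < 0.0:
        raise ValueError("expects -1 < theta < 0 for monotonic convergence")
    return math.log(0.5) / math.log(1.0 + theta)

# Hypothetical adjustment coefficient: 25% of the gap is corrected each year
gap_after_two_years = remaining_gap(-0.25, 2)  # 0.5625
years_to_halve = half_life(-0.25)              # a little under 2.5 years
```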
In specifying the equation of our FDI model, we used the theoretical lines of
Dunning’s (1981, 1988, 1993) ‘eclectic theory’ of OLI advantages as determinants
of FDI flows to Africa. In addition to location advantages, Dunning’s ownership
and internalization (OI) advantages that may attract FDI to Ethiopia could be
proxied by market size, natural endowments, and a stable macroeconomic and
political environment as African empirical literature in the previous section shows.
Thus, we used these variables which are now briefly described as part of our
empirical model.
The FDI data (the dependent variable) series is taken from the African
Development Indicators (2015) and the World Development Indicators (2015) of
The World Bank (2015a, b).
7.4.3 Macroeconomic Variables
RGDP: Real GDP is a measure of the size of the host market, which also represents
the host country’s economic conditions and the potential demand for output.
Following the literature, real GDP is used to proxy for market size. Since this
variable is used as an indicator of the market potential for products of foreign
investors, the expected sign is positive.
RES: Natural resource availability. The availability of natural resources might be
a major determinant of FDI to the host country. FDI takes place when a country
richly endowed with natural resources lacks the amount of capital or technical skills
needed to extract and/or sell to the world market. Foreign firms embark on vertical
FDI in the host country to produce raw materials and/or inputs for their production
processes at home. This means that certain FDI may be less related to profitability
or market size of the host country than natural resources which are unavailable to
the domestic economy of foreign firms. As posited by the eclectic theory, all else
being equal, countries that are endowed with natural resources receive more FDI.
As noted by Asiedu (2002) very few studies on the determinants of FDI control for
natural resource availability (except Morisset 2000; Geda and Yimer 2015). The
omission of natural resources from estimations, especially for African countries,
may cause the estimates to be biased (Asiedu 2002). Given the absence of fuel and
other petroleum related resources in the country, the share of mining and quarrying
value added (current US$) is used to capture the availability of natural resource
endowments. This variable is considered acknowledging the fact that a good share
of FDI inflows to the country found its way to this sector.
OPNES: Trade openness as measured by total trade as a percentage of GDP. In the
literature, the degree of liberalization of the trade regime in the host country is
regarded as a very important factor that promotes FDI inflows. This proxy is
important for foreign direct investors who are motivated by the export market. More
open economies usually follow ‘appropriate’ trade and exchange rate policies and
espouse a relatively liberal investment regime (Geda and Yimer 2015).
DEBGDP: External debt as a percentage of GDP. External debt is considered a
component of financial risk, influencing FDI inflows negatively (Nonnenberg and
Mendonca 2004). In addition, heavily indebted countries represent higher transfer
risks—the risk of potential restrictions on the ability to transfer funds across
national boundaries. Transfer risks are an important component of country risks and
a variable closely monitored by foreign investors. Higher transfer risks may cause
foreign capital to move out of a country and new FDI flows to be re-routed to safer
locations. The sign associated with DEBGDP is expected to be negative.
INF: Annual inflation rate. This is another important variable of macroeconomic
stability indicators which may affect FDI. It represents changes in the general price
level or inflationary conditions in the economy. In our study, the impact of inflation
rates on FDI is expected to be negative.
NER: The nominal exchange rate. The effect of changes in exchange rates on
FDI flows is ambiguous. Elbadawi and Mwega (1997), among others, used the real
exchange rate as an indicator of a country’s international competitiveness,
hypothesizing that a real depreciation would attract larger FDI flows. However, it
may be argued that unless the purpose of FDI flows to a country is to build an
export platform, overvalued exchange rates should not represent a considerable
hurdle to foreign investors. On the contrary, depreciation increases the costs of
imported inputs and reduces the foreign currency value of profit remittances, both
of which have adverse effects on the profitability of FDI projects. This effect will
dominate if FDI is undertaken primarily to serve the domestic market. Thus, if we
assume that a prospective investor uses the previous year’s change in the exchange
rate as a guide to its evolution in the near future, we would expect a negative sign
on the variable ΔNER (since an increase in the index represents a depreciation).
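As an illustration of how the ratio variables described above might be built from raw series, the sketch below computes openness and external debt as percentages of GDP, takes logs, and forms first differences. It assumes that total trade means exports plus imports; the function names and the figures are hypothetical and are not the study's data.

```python
import math

def pct_of_gdp(value, gdp):
    """Express a value as a percentage of GDP."""
    return 100.0 * value / gdp

def log_openness(exports, imports, gdp):
    """Log of total trade (assumed here to be exports plus imports) as a percentage of GDP."""
    return math.log(pct_of_gdp(exports + imports, gdp))

def log_debt_ratio(external_debt, gdp):
    """Log of external debt as a percentage of GDP."""
    return math.log(pct_of_gdp(external_debt, gdp))

def first_difference(series):
    """Period-to-period changes, as used for the differenced terms in the model."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical figures in billions of US$
lopnes = log_openness(exports=3.0, imports=7.0, gdp=40.0)   # log of 25 percent
ldebgdp = log_debt_ratio(external_debt=12.0, gdp=40.0)      # log of 30 percent
changes = first_difference([1.0, 1.5, 1.25])
```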
7.4.4 Political and Institutional Variables (Polinst)
As noted by Schneider and Frey (1985) political instability and the frequent
occurrence of disorder ‘create an unfavorable business climate which seriously
erodes the risk-averse foreign investors’ confidence in the local investment climate
and thereby repels FDI away.’ Political stability, as argued by Asiedu (2002), is a
significant factor in location decisions of multinational corporations (MNCs),
especially in their decisions to invest in African states.
Our study used the Worldwide Governance Indicators (WGI) research dataset of
the Political Risk Services (2015) to capture the effect of political instability and
quality of institutions in attracting FDI inflows to the host country. This dataset
summarizes the views on the quality of governance provided by a large number of
enterprises, citizens, and expert survey respondents in industrial and developing
countries. This data was gathered from a number of survey institutes, think tanks,
non-governmental organizations, international organizations, and private sector
firms.
WGI projects constructs of aggregate indicators of six broad dimensions of
governance: Voice and accountability; political stability and absence of violence/
terrorism; government effectiveness; regulatory quality; the rule of law; and control
of corruption. The six aggregate indicators are based on 31 underlying data sources
reporting the perceptions of governance of a large number of survey respondents
and expert assessments worldwide.1
Voice and accountability (VOIACC): Reflects perceptions about the extent to
which a country’s citizens are able to participate in selecting their government, as
well as freedom of expression, freedom of association, and a free media.
Political stability and absence of violence/terrorism (POLSTAB): Reflects per-
ceptions about the likelihood that the government will be destabilized or over-
thrown by unconstitutional or violent means including politically-motivated
violence and terrorism.
Government effectiveness (GOVEFFE): Reflects perceptions about the quality
of public services, the quality of the civil service and the degree of its independence
from political pressures, the quality of forming and implementing policies and the
credibility of the government’s commitment to such policies.
Regulatory quality (RQ): Reflects perceptions about the government’s ability to
formulate and implement sound policies and regulations that permit and promote
private sector development.
Rule of law (RoL): Reflects perceptions about the extent to which agents have
confidence in and abide by the rules of society, in particular, the quality of contract
enforcement, property rights, the police and the courts, as well as the likelihood of
crime and violence.
Control of corruption (CORR): Reflects perceptions about the extent to which
public power is exercised for private gain, including both petty and grand forms of
corruption as well as the ‘capture’ of the state by elites and private interests.
Political and institutional risk rating, as provided by the International Country
Risk Guide of Political Risk Services (2015), awards the highest value to the lowest
1
Details on the underlying data sources, the aggregation method, and the interpretation of the
indicators, can be found in Kaufmann et al.’s (2010) WGI methodology paper.
138 A. Yimer
risk and the lowest value to the highest risk and provides a means for assessing the
political and institutional framework of countries. The expected signs for all the
institutional variables are positive, which indicates that better quality institutions
will stimulate more foreign investments.
As there is a high correlation among the political and institutional indicators and
the possibility of a high degree of multi-collinearity among them, we used each of
the political and institutional indicators separately and hence estimated six separate
models (see Annexure 1 for the correlation matrix).
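The screening logic behind this choice can be illustrated with a minimal sketch. It uses the pairwise correlations reported in Annexure 1 and flags pairs above a rule-of-thumb cutoff of 0.8 in absolute value. The cutoff and the function name `highly_correlated` are illustrative assumptions, not part of the original analysis.

```python
# Pairwise correlations transcribed from Annexure 1 (lower triangle).
CORRELATIONS = {
    ("RoL", "POLSTAB"): -0.84,
    ("RoL", "GOVEFFE"): 0.71,
    ("RoL", "CORR"): 0.77,
    ("RoL", "RQ"): 0.71,
    ("RoL", "VOIACC"): -0.67,
    ("POLSTAB", "GOVEFFE"): -0.90,
    ("POLSTAB", "CORR"): -0.69,
    ("POLSTAB", "RQ"): -0.89,
    ("POLSTAB", "VOIACC"): 0.88,
    ("GOVEFFE", "CORR"): 0.75,
    ("GOVEFFE", "RQ"): 0.96,
    ("GOVEFFE", "VOIACC"): -0.84,
    ("CORR", "RQ"): 0.69,
    ("CORR", "VOIACC"): -0.65,
    ("RQ", "VOIACC"): -0.77,
}


def highly_correlated(correlations, threshold=0.8):
    """Return the indicator pairs whose absolute correlation meets the cutoff.

    The 0.8 default is an assumed rule of thumb, not a value from the chapter.
    """
    return sorted(pair for pair, r in correlations.items() if abs(r) >= threshold)


flagged = highly_correlated(CORRELATIONS)
for pair in flagged:
    print(pair, CORRELATIONS[pair])
```

Under this assumed cutoff, six of the fifteen pairs are flagged, which is consistent with entering the indicators one at a time rather than jointly.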
7.5 Discussion of Results
In an econometric analysis, a test for stationarity2 of the variables in the model is
undertaken before carrying out any estimation. We found that some of the
variables were integrated of order one, I(1), while others were integrated of
order zero, I(0) (see Table 7.1).
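As a rough illustration of the testing logic, the sketch below computes the simple Dickey-Fuller t-statistic (intercept, no lagged differences) for a simulated series. It is a simplification of the Augmented Dickey-Fuller procedure used in this study: the simulated data, the seed and the function name `dickey_fuller_t` are assumptions, and −2.86 is the approximate large-sample 5% critical value for the intercept-only case.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic on gamma in: dy_t = alpha + gamma * y_{t-1} + e_t.

    A unit root corresponds to gamma = 0; a large negative t-statistic
    is evidence against the unit root (i.e. in favour of stationarity).
    """
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    gamma = sxy / sxx
    alpha = mean_d - gamma * mean_x
    rss = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


# Simulated stationary AR(1) series with coefficient 0.2 (illustrative only).
random.seed(0)
stationary = [0.0]
for _ in range(200):
    stationary.append(0.2 * stationary[-1] + random.gauss(0.0, 1.0))

CRITICAL_5PCT = -2.86  # approximate asymptotic value, intercept-only case
t_stationary = dickey_fuller_t(stationary)
is_stationary = t_stationary < CRITICAL_5PCT
print(round(t_stationary, 2), is_stationary)
```

A series classified as I(1) would fail to reject in levels but reject after first differencing, which is the pattern reported for most variables in Table 7.1.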
Once the unit root tests are complete, the next step in the bounds test approach for
co-integration is to estimate the ARDL model using the appropriate lag length. One
of the most important issues in applying ARDL is choosing the order of the
distributed lag functions. Pesaran et al. (2001) have shown that the Schwarz Bayesian
Criterion (SBC) should be used in preference to other model specification criteria
because it tends to select more parsimonious specifications; the small data sample in
our current study further reinforces this point. Since we had 43 annual observations,
we chose two as the maximum lag length in the ARDL model.
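The parsimony argument can be seen in a small numerical sketch. It compares the SBC and the Akaike criterion (AIC) in their "smaller is better" form for two candidate lag lengths with 43 observations. The residual sums of squares and parameter counts are hypothetical values chosen for illustration, and the sign convention is an assumption (some software reports these criteria so that larger is better).

```python
import math


def sbc(rss, n_obs, n_params):
    """Schwarz Bayesian Criterion, smaller-is-better form."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


def aic(rss, n_obs, n_params):
    """Akaike Information Criterion, smaller-is-better form."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


N_OBS = 43  # annual observations, as in this study

# Hypothetical candidates: lag -> (residual sum of squares, number of parameters).
candidates = {1: (10.0, 9), 2: (6.0, 17)}

best_sbc = min(candidates, key=lambda lag: sbc(candidates[lag][0], N_OBS, candidates[lag][1]))
best_aic = min(candidates, key=lambda lag: aic(candidates[lag][0], N_OBS, candidates[lag][1]))

# With 43 observations each extra parameter costs ln(43), about 3.76, under SBC
# but only 2 under AIC, so SBC needs a larger improvement in fit to add lags.
print("SBC picks lag", best_sbc, "| AIC picks lag", best_aic)
```

In this made-up case the AIC accepts the longer lag while the SBC keeps the shorter one, which is the sense in which the SBC favours parsimonious specifications in small samples.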
For all the models, the bounds test for co-integration rejects the null hypothesis of
no long-run relationship among the variables, as the F-statistic is greater than the
upper bound critical value even at the one percent significance level. This indicates
the presence of a long-run relationship among the variables of interest in each of
the models estimated (Table 7.2).
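The decision rule of the bounds test can be written out as a short sketch using the F-statistics and critical value bounds reported in Table 7.2. The function name `bounds_decision` and the wording of the three outcomes are illustrative assumptions.

```python
# Critical value bounds (I0, I1) and F-statistics as reported in Table 7.2.
BOUNDS = {"10%": (1.99, 2.94), "1%": (2.88, 3.99)}
F_STATS = {
    "Model 1": 5.08,
    "Model 2": 4.09,
    "Model 3": 4.91,
    "Model 4": 4.10,
    "Model 5": 4.19,
    "Model 6": 4.20,
}


def bounds_decision(f_stat, lower, upper):
    """Classify an F-statistic against the I(0) and I(1) critical bounds."""
    if f_stat > upper:
        return "reject null: long-run relationship"
    if f_stat < lower:
        return "do not reject null"
    return "inconclusive"


decisions = {
    model: bounds_decision(f_stat, *BOUNDS["1%"])
    for model, f_stat in F_STATS.items()
}
for model, decision in decisions.items():
    print(model, decision)
```

An F-statistic falling between the two bounds would leave the test inconclusive, which does not occur for any of the six models here.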
In the standard least squares model, the coefficient variance-covariance matrix is
derived under the key assumption that the error terms are conditionally homoskedastic
and serially uncorrelated (White 1980). When this assumption fails because of
heteroskedasticity or auto-correlation, the conventional expression for the
covariance matrix no longer applies and inferences based on it will be misleading
(Roecker 1991; White 1980; Wooldridge 2000, among others).
Given that heteroskedasticity and serial correlation are common problems in time
series analysis, it is necessary to estimate the coefficient covariance under the
assumption that the residuals are conditionally heteroskedastic and
auto-correlated (Newey and West 1987). The coefficient
2
In this study, the Augmented Dickey-Fuller unit root testing procedure (which does not take into
account a structural break in the data) and the Lumsdaine and Papell (1997) unit root test (which
captures two structural breaks in a series) are used. Though the latter is not reported here, both tests
are in conformity.
Table 7.1 Unit root test results

Variables  At level                          At first difference               Conclusion
           Intercept   Intercept and trend   Intercept   Intercept and trend
LFDI       −0.82       −2.69                 −9.49       −9.42                 I(1)
           (0.80)      (0.24)                (0.00)      (0.00)
LRGDP      −0.51       −1.01                 −3.67       −3.72                 I(1)
           (0.88)      (0.93)                (0.00)      (0.03)
LRES       −1.61       −2.98                 −6.88       −6.84                 I(1)
           (0.47)      (0.15)                (0.00)      (0.00)
LINF       −5.55       −5.57                 −8.28       −8.16                 I(0)
           (0.00)      (0.00)                (0.00)      (0.00)
LDEBGDP    −1.94       −1.05                 −4.45       −4.81                 I(1)
           (0.31)      (0.93)                (0.00)      (0.00)
LOPNES     −1.33       −1.89                 −6.32       −6.24                 I(1)
           (0.61)      (0.64)                (0.00)      (0.00)
POLSTAB    −2.72       −3.01                 −4.41       −3.77                 I(0)
           (0.08)      (0.04)                (0.00)      (0.03)
GOVEFFE    −0.21       −1.71                 −7.64       −7.24                 I(1)
           (0.93)      (0.73)                (0.00)      (0.00)
CORR       −2.25       −3.51                 −6.69       −6.63                 I(1)
           (0.19)      (0.05)                (0.00)      (0.00)
RoL        −3.54       −3.28                 −7.49       −6.29                 I(0)
           (0.01)      (0.08)                (0.00)      (0.00)
RQ         −2.32       −1.80                 −3.37       −3.54                 I(1)
           (0.17)      (0.68)                (0.00)      (0.00)
VOIACC     −1.26       −2.48                 −6.60       −6.23                 I(1)
           (0.64)      (0.33)                (0.00)      (0.00)
Note p values in parenthesis
Table 7.2 Bound test for co-integration

Model     F-test statistic   Critical value bounds by level of significance
                             10%                   1%
                             I0 bound   I1 bound   I0 bound   I1 bound
Model 1   5.08               1.99       2.94       2.88       3.99
Model 2   4.09               1.99       2.94       2.88       3.99
Model 3   4.91               1.99       2.94       2.88       3.99
Model 4   4.10               1.99       2.94       2.88       3.99
Model 5   4.19               1.99       2.94       2.88       3.99
Model 6   4.20               1.99       2.94       2.88       3.99
covariance estimator under this assumption is termed the Heteroskedasticity and
Auto-correlation Consistent Covariance (HAC) or Newey-West estimator. Note
that such robust covariance estimators change the coefficient standard errors of an
equation, but not the point estimates. Newey and West (1987) proposed a general
covariance estimator that is consistent in the presence of both heteroskedasticity
and auto-correlation of unknown form. This procedure is followed in our study.
Tables 7.3 and 7.4 present the long-run and short-run determinants of FDI inflows
to Ethiopia based on the ARDL approach.
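To make the idea of a HAC correction concrete, the sketch below computes a Bartlett-weighted (Newey-West type) long-run variance for a single series. It is a scalar simplification: the full estimator applies the same weights to the cross-products of regressors and residuals. The function name `newey_west_lrv`, the lag truncation and the toy series are assumptions chosen for illustration.

```python
def newey_west_lrv(u, lags):
    """Bartlett-kernel long-run variance of the series u.

    With lags = 0 this reduces to the ordinary (divide-by-n) variance;
    higher lags add weighted autocovariances so that serial correlation
    is reflected in the variance estimate.
    """
    n = len(u)
    mean = sum(u) / n
    dev = [x - mean for x in u]

    def autocov(j):
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    lrv = autocov(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)  # Bartlett weights keep the estimate non-negative
        lrv += 2.0 * weight * autocov(j)
    return lrv


# Toy residual series: one positively and one negatively autocorrelated.
persistent = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

naive = newey_west_lrv(persistent, 0)
hac_persistent = newey_west_lrv(persistent, 1)
hac_alternating = newey_west_lrv(alternating, 1)
print(naive, hac_persistent, hac_alternating)
```

Both toy series have the same naive variance, yet the corrected variance is larger for the positively autocorrelated series and smaller for the alternating one. Only the measure of uncertainty changes, which mirrors the point that a HAC correction alters standard errors and leaves point estimates untouched.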
(A) The long-run model
In line with previous empirical studies on Africa, most of the explanatory
variables have their expected signs in the long run. Market size (as proxied by
GDP), trade openness (as proxied by trade as a percentage of GDP), resource
abundance and depreciation in the official exchange rate are found to have a
significant positive impact on FDI inflows in the long run.
The significant positive long-run coefficient on the GDP variable is in line with
theory and suggests the presence of market-seeking FDI inflows to the country.
Given that Ethiopia is home to more than 90 million people and a rising
middle-class population, this may not be surprising.
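Because both FDI and GDP per capita enter the model in logs, the long-run coefficient can be read as an elasticity. The short sketch below works through the arithmetic for the Model 1 estimate of 0.71 in Table 7.3, holding the other variables fixed. The 10% increase and the function name `pct_change_in_fdi` are illustrative assumptions.

```python
def pct_change_in_fdi(elasticity, pct_change_in_x):
    """Percentage change in FDI implied by a log-log coefficient.

    Uses the exact relation y1 / y0 = (x1 / x0) ** elasticity rather than
    the small-change approximation elasticity * pct_change_in_x.
    """
    return ((1.0 + pct_change_in_x / 100.0) ** elasticity - 1.0) * 100.0


GDP_ELASTICITY_MODEL_1 = 0.71  # long-run coefficient, Table 7.3, Model 1

effect = pct_change_in_fdi(GDP_ELASTICITY_MODEL_1, 10.0)
approximation = GDP_ELASTICITY_MODEL_1 * 10.0
print(round(effect, 2), round(approximation, 2))
```

On this reading, a 10% rise in real GDP per capita is associated with a rise of roughly 7% in net FDI inflows in the long run.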
The positive sign of the resource abundance indicator variable, as proxied by the
mining and quarrying value added, indicates the presence of resource seeking FDI
inflows to the country. This is not surprising given that a good share of FDI inflows
to the country found their way to this sector.
The significant positive coefficient on the exchange rate variable may indicate, as
noted by Elbadawi and Mwega (1997) among others, that depreciation in Ethiopia’s
exchange rate is affecting the inflows of FDI positively.
On the other hand, macroeconomic instability as proxied by the inflation rate
was found to affect FDI inflows negatively. The significant negative coefficient of
the inflation variable in the long run implies that foreign investors prefer investing
their money in countries where they perceive better macroeconomic stability.
Similarly, the significant positive coefficient of the trade openness variable suggests
that liberalization in the external trade sector of the country has encouraged FDI
inflows; this also supports the proposition that foreign investors are more likely to
invest in countries that have opened up to the outside world (see Onyeiwu and
Shrestha 2004; Asiedu 2006; Anyanwu 2012; among others).
In addition, better political stability and absence of violence/terrorism, govern-
ment effectiveness in forming and implementing quality policies and the credibility
of the government’s commitment to such policies, regulatory quality with regard to
the ability of the government to formulate and implement sound policies and
regulations that permit and promote private sector development, and better per-
formance of the rule of law affect FDI inflows into the country positively.
(B) The short-run model
In line with previous empirical studies on Africa, most of the macroeconomic
determinants of FDI inflows have their theoretical expected signs in all the models
in the short run. Market size, natural resource abundance, and trade openness were
found to affect FDI inflows in a significant positive way. The positive sign of the
natural resource availability variable as proxied by the mining and quarrying value
Table 7.3 The long-run model’s results

Dependent variable: log of net FDI inflows
Sample: 1970–2013; no. of observations: 43
ARDL orders:
Model 1: ARDL (1, 1, 0, 0, 0, 0, 2, 0)
Model 2: ARDL (1, 1, 0, 0, 0, 0, 1, 0)
Model 3: ARDL (1, 0, 0, 1, 0, 0, 1, 0)
Model 4: ARDL (1, 1, 0, 0, 0, 0, 1, 0)
Model 5: ARDL (1, 0, 0, 1, 0, 0, 1, 0)
Model 6: ARDL (1, 1, 0, 1, 0, 0, 1, 0)
Entries are estimated coefficients.

Variables                           Model 1   Model 2   Model 3   Model 4   Model 5   Model 6
Log of real GDP per capita          0.71**    0.98**    1.14**    0.37*     0.98**    0.16*
Log of natural resource abundance   2.47**    3.16**    2.00*     2.99**    2.45*     1.96*
Log of inflation                    −1.93*    −2.49**   −1.9      −1.98*    −2.28*    −1.59
Log of external debt to GDP ratio   −0.27     −0.22     −0.09     −0.44     −0.19     −0.58**
Log of openness                     0.18**    0.19**    0.34      0.23**    0.33      −0.33
Log of nominal exchange rate        4.53***   4.41***   3.61***   4.77**    3.92***   4.10***
Rule of law                         4.60*
Political stability                           2.19**
Government effectiveness                                2.93*
Control of corruption                                             −1.51
Regulatory quality                                                          5.09**
Voice and accountability                                                              2.29
Constant                            23.35**   21.61**   33.47**   9.23      31.05***  3.05
Note ***, ** and * indicate 1, 5 and 10% level of significance respectively
added indicates the presence of resource seeking FDI flows to the country. This is
not surprising given that a good share of FDI inflows to the country found their way
to this sector.
The consistent negative coefficient of the inflation variable in all the models in
the short run implies that foreign investors prefer investing their money in countries
where they perceive better macroeconomic stability. Similarly, the significant
positive coefficient of the trade openness variable suggests that liberalization in the
Table 7.4 The short-run model: Error correction model’s (ECM) results

Dependent variable: D(log of net FDI inflows)
Sample: 1970–2013; no. of observations: 43
ARDL orders:
Model 1: ARDL (1, 1, 0, 0, 0, 0, 2, 0)
Model 2: ARDL (1, 1, 0, 0, 0, 0, 1, 0)
Model 3: ARDL (1, 0, 0, 1, 0, 0, 1, 0)
Model 4: ARDL (1, 1, 0, 0, 0, 0, 1, 0)
Model 5: ARDL (1, 0, 0, 1, 0, 0, 1, 0)
Model 6: ARDL (1, 1, 0, 1, 0, 0, 1, 0)
Entries are estimated coefficients.

Variables                             Model 1   Model 2   Model 3   Model 4   Model 5   Model 6
D(Log of real GDP per capita)         4.06***   3.66***   1.57*     3.93***   1.22*     2.7**
D(Log of natural resource abundance)  2.36**    2.50**    1.33*     2.27**    1.78*     1.46
D(Log of inflation)                   −1.28**   −2.06*    −1.75     −1.81     −2.24*    −1.38
D(Log of external debt to GDP ratio)  −0.29     −0.61     −0.19     −0.65     −0.23     −0.72
D(Log of openness)                    0.19*     0.18*     0.02      0.16      0.04      0.04
D(Log of nominal exchange rate)       −3.36**   −0.98     −0.06     −1.52     0.44      −1.08
D(Rule of law)                        4.38**
D(Political stability)                          −0.26
D(Government effectiveness)                               4.19***
D(Control of corruption)                                            0.66
D(Regulatory quality)                                                         5.09**
D(Voice and accountability)                                                             4.02**
ECMt−1                                −0.92***  −0.88***  −0.84***  −0.79***  −0.90***  −0.78***
Note ***, ** and * indicate 1, 5 and 10% level of significance respectively
external trade sector of the country has encouraged FDI inflows and also supports
the proposition that foreign investors are more likely to invest in countries that have
opened up to the outside world (see Onyeiwu and Shrestha 2004; Asiedu 2006;
Anyanwu 2012; Geda and Yimer 2015; among others).
In addition, except for control of corruption and political stability, all the other
political and institutional indicators have their a priori expected significant positive
signs. Among the political and institutional indicators, better regulatory quality,
better performance of the rule of law, and government effectiveness have a
significant positive effect on FDI inflows to the country.
As Table 7.4 shows, the error correction term (ECM) has the expected negative sign
and is highly significant in all the models estimated. The ECM coefficient shows how
quickly or slowly the relationship returns to its equilibrium path; the estimates
range from −0.78 to −0.92, suggesting that between 78 and 92% of a deviation from
the long-run trajectory is corrected within a year. As noted by Banerjee et al.
(1998), a highly significant error correction term is further evidence of the
existence of a stable long-term relationship.
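The speed of adjustment implied by the error correction coefficients in Table 7.4 can be worked out with a short sketch. It assumes a simple first-order adjustment, so that a deviation from equilibrium shrinks by the same proportion each year; the function names and the half-life measure are illustrative additions rather than statistics reported in this study.

```python
import math

# Error correction coefficients as reported in Table 7.4.
ECM_COEFFICIENTS = {
    "Model 1": -0.92,
    "Model 2": -0.88,
    "Model 3": -0.84,
    "Model 4": -0.79,
    "Model 5": -0.90,
    "Model 6": -0.78,
}


def share_remaining(ecm_coef, years=1):
    """Share of an initial deviation still present after the given years."""
    return (1.0 + ecm_coef) ** years


def half_life(ecm_coef):
    """Years needed for half of a deviation to be corrected.

    Valid for -1 < ecm_coef < 0 under the assumed geometric decay.
    """
    return math.log(0.5) / math.log(1.0 + ecm_coef)


for model, coef in ECM_COEFFICIENTS.items():
    print(model, round(share_remaining(coef), 2), round(half_life(coef), 2))
```

For Model 1, only about 8% of a shock remains after one year, and in every model the implied half-life is well under a year.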
(C) Diagnostic and stability tests
As shown in Table 7.5 all the estimated models had a good fit. In addition, the
models passed the post-estimation diagnostic tests, which included tests for
normality, heteroskedasticity, serial correlation, model specification and stability
(the one borderline case is the Ramsey RESET p value of 0.04 for Model 1). In
analyzing the stability of the long-run coefficients together with short-run
dynamics, the cumulative sum (CUSUM) and the cumulative sum of squares (CUSUMSQ)
tests were applied (see Annexure 2 for the results). Following Pesaran et al.
(2001), the stability of the regression coefficients was evaluated by these tests
as they can show whether or not the regression equation is stable over time. This
stability test is appropriate in time series data, especially when we are uncertain
about when structural change might have taken place.
As can be seen in the graphs in Annexure 2, the plots of both the CUSUM and
CUSUMSQ statistics remained within the critical bounds at the 5% significance
level and did not cross the lower or upper critical limits. This implies that the
estimated coefficients were stable and there was no structural break.
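The mechanics of the CUSUM test can be sketched for the simplest possible case, a model with an intercept only (k = 1). This is a deliberate simplification of the multi-regressor ARDL models estimated here: the toy series and function names are assumptions, and 0.948 is the standard 5% constant for the CUSUM bounds.

```python
import math


def cusum_path(y, a=0.948):
    """Return (cusum, bound) pairs for an intercept-only model (k = 1).

    Each recursive residual compares an observation with the mean of all
    earlier observations, scaled by its forecast standard error factor.
    """
    T, k = len(y), 1
    recursive_residuals = []
    running_sum = y[0]
    for t in range(1, T):  # t earlier observations are available at index t
        previous_mean = running_sum / t
        scale = math.sqrt(1.0 + 1.0 / t)
        recursive_residuals.append((y[t] - previous_mean) / scale)
        running_sum += y[t]
    sigma = math.sqrt(sum(w ** 2 for w in recursive_residuals) / (T - k))
    path, total = [], 0.0
    for i, w in enumerate(recursive_residuals, start=1):
        total += w
        # 5% bounds widen linearly from a*sqrt(T-k) to 3*a*sqrt(T-k).
        bound = a * (math.sqrt(T - k) + 2.0 * i / math.sqrt(T - k))
        path.append((total / sigma, bound))
    return path


def stays_within_bounds(y):
    """True if the CUSUM never crosses the 5% critical lines."""
    return all(abs(cusum) <= bound for cusum, bound in cusum_path(y))


stable = [1.0, -1.0] * 20             # constant mean throughout
shifted = [0.0] * 20 + [10.0] * 20    # level shift halfway through

print(stays_within_bounds(stable), stays_within_bounds(shifted))
```

The series with a constant mean stays inside the bounds, while the series with a level shift crosses them, which is the pattern the plots in Annexure 2 are checked for.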
Table 7.5 Diagnostic and stability tests

Tests                                         Model 1  Model 2  Model 3  Model 4  Model 5  Model 6
R-squared                                     0.90     0.89     0.89     0.89     0.89     0.89
Adjusted R-squared                            0.87     0.85     0.85     0.85     0.85     0.84
F-statistic                                   23.82    24.23    23.00    22.4     23.09    20.77
Prob(F-statistic)                             0.00     0.00     0.00     0.00     0.00     0.00
Jarque–Bera                                   0.78     0.10     1.66     0.25     1.19     0.79
Prob(Jarque–Bera)                             0.67     0.95     0.43     0.88     0.55     0.67
Breusch–Godfrey serial correlation LM test*   0.36     0.36     0.26     0.48     0.53     0.46
Heteroskedasticity test: ARCH*                0.72     0.65     0.81     0.88     0.56     0.99
Ramsey RESET test*                            0.04     0.10     0.47     0.18     0.76     0.12
Note *p value is reported
7.6 Conclusion
Based on the ARDL modeling approach along the lines of Dunning’s (1981, 1988)
‘eclectic theory,’ this study identified the main determinants of FDI flows to
Ethiopia for the period 1970–2013. The results of the empirical modeling exercise
in this study conclusively support the hypothesis that FDI in Africa is conditional
on prudent macro-policies and enabling business environments manifested through
better political stability and institutional quality. Better macroeconomic conditions,
political stability, institutional quality, and resource availability affect FDI flows to
Ethiopia positively. Depreciation in the exchange rate was also found to affect
FDI inflows positively.
Prudent fiscal and monetary policies to tackle the negative impact of inflationary
pressures on FDI inflows and a move toward a careful liberalization of the foreign
exchange market and of external trade are important policy options that the gov-
ernment could work on to boost FDI inflows to the country. In addition, sustaining
the current growth momentum of the economy and further strengthening political
stability in the country, taking sincere steps to increase transparency, controlling
corruption and improving the regulatory quality of the country’s institutions are
fundamental areas that the government could work on to strengthen the country’s
position in the FDI inflows to the continent.
Further, regarding institutional and political factors, foreign investors are
attracted to those African countries that are more democratic. To attract foreign
investors, the country needs to improve its political and social situation and elevate
its democracy from a mere electoral level to a more liberal one. What is needed,
therefore, is deep introspection and political reforms of the various institutions and
political parties seeking to govern so as to promote a sustained commitment to
democracy that will guarantee equal citizenship, political pluralism, freedom,
human rights, general respect for others, and socio-political cum economic
inclusion.
Annexure 1: Correlation Matrix of the Political and Institutional Indicators (Polinst)

Covariance analysis: ordinary
Sample: 1970–2013
Included observations: 43

Correlation*  RoL       POLSTAB   GOVEFFE   CORR      RQ        VOIACC
RoL           1.00
              –
POLSTAB       −0.84     1.00
              (0.00)    –
GOVEFFE       0.71      −0.90     1.00
              (0.00)    (0.00)    –
CORR          0.77      −0.69     0.75      1.00
              (0.00)    (0.00)    (0.00)    –
RQ            0.71      −0.89     0.96      0.69      1.00
              (0.00)    (0.00)    (0.00)    (0.00)    –
VOIACC        −0.67     0.88      −0.84     −0.65     −0.77     1.00
              (0.00)    (0.00)    (0.00)    (0.00)    (0.00)    –
Note *p values in parenthesis
where
RoL      Rule of law
POLSTAB  Political stability and absence of violence/terrorism
GOVEFFE  Government effectiveness
CORR     Control of corruption
RQ       Regulatory quality
VOIACC   Voice and accountability
Annexure 2: Parameter Stability Tests
[Figures: CUSUM and CUSUM of Squares (CUSUMSQ) plots with 5% significance bounds, one
pair of panels for each of Models 1 to 6. Horizontal axes: years 1984 to 2012.
Vertical axes: CUSUM from −16 to 16; CUSUMSQ from −0.4 to 1.4.]
References
Abdoul GM (2012) What drives foreign direct investments in Africa? An empirical investigation
with panel data. African Center for Economic Transformation (ACET), Accra
Agarwal JP (1980) Determinants of foreign direct investment: a survey. Rev World Econ
116(4):739–773
Anyanwu JC (2012) Why does foreign direct investment go where it goes? New evidence from
African countries. Ann Econ Fin 13(2):433–470
Asiedu E (2002) On the determinants of foreign direct investment to developing countries: Is
Africa different? World Dev 30(1):107–118
Asiedu E (2003) Foreign direct investment in Africa: the role of government policy, institutions
and political instability. KS working paper No. 23, University of Kansas
Asiedu E (2006) Foreign direct investment in Africa: the role of natural resources, market size,
government policy, institutions and political instability. World Econ 29(1):63–77
Asiedu E, Gyimah-Brempong K (2008) The effect of the liberalization of investment policies on
employment and investment of multinational corporations in Africa. African Dev
Rev 20(1):49–66
Astatike G, Assefa H (2005). Determinants of foreign direct investment in Ethiopia: a time series
analysis. In: Paper prepared at the 4th international conference on the Ethiopian economy,
June, Addis Ababa, Ethiopia
Banerjee AJ, Dolado J, Mestre R (1998) Error-correction mechanism tests for co-integration in
single-equation framework. J Time Ser Anal 19:267–283
Basu A, Srinivasan K (2002). Foreign direct investment in Africa—some case studies. IMF
Working Paper No. 61
Bende-Nabende A (2002) Foreign direct investment determinants in sub-Saharan Africa: a
co-integration analysis. Econ Bull 6(4):1–19
Buckley P, Casson M (1976) The future of the multinational enterprises. Macmillan, London
Caves RE (1971) Industrial corporations: the industrial economics of foreign investment.
Economica 38:1–27
Coase RH (1937) The nature of the firm. Econ New Ser 4(16):386–405
Dunning JH (1981) Explaining the international direct investment position of countries toward a
dynamic or development approach. Weltwirtschaftliches Archiv 117:30–64
Dunning JH (1988) The eclectic paradigm of international production: a restatement and some
possible extensions. J Int Bus Stud 19(1):1–31
Dunning JH (1993) Multinational enterprises and the global economy. Wesley, Wokingham
Elbadawi I, Mwega F (1997) Regional integration, trade, and foreign direct investment in
sub-Saharan Africa. In: Iqbal Z, Khan M (eds) Trade reform and regional integration in Africa.
IMF, Washington, DC
Engle RF, Granger CWJ (1987) Co-integration and error correction: representation, estimation and
testing. Econometrica 55:251–276
Ethiopian Investment Commission (2014) An investment guide to Ethiopia. Addis Ababa, Ethiopia
Geda A (2002) Finance and trade in Africa: macroeconomic response in the world economy
context. Palgrave, New York
Geda A (2008) The political economy of growth in Ethiopia. In: Ndulu B, O’Connell SA, Azam JP,
Bates RH, Fosu AK, Gunning JW, Njinkeu D (eds) The political economy of growth in Africa:
1960–2000. Cambridge University Press, Cambridge
Geda A, Yimer A (2015) Determinants of foreign direct investment inflows to Africa: a panel
co-integration evidence using new analytical country classification. AAU Dep economics
working paper No. 4. Addis Ababa University
Gregory AW, Hansen BE (1996) Residual-based tests for cointegration in models with regime
shifts. J Econom 70:99–126
Harvey AC (1981) Time series models. Phillip allan and atlantic highlands, Humanities Press,
Oxford, NJ
Helleiner GK (1989) Transnational corporations and direct foreign investment (Ch. 27). In:
Chenery H, Srinivasan TN (eds) Handbook of development economics, vol. III. Elsevier,
Amsterdam
Hymer SH (1960) The international operations of national firms: a study of direct foreign
investment. PhD dissertation. The MIT Press, Cambridge, MA (Published posthumously)
Hymer SH (1976) The international operation of national firms: a study of direct foreign
investment. MIT Press, Cambridge
Johansen S (1988) Statistical analysis of cointegration vectors. J Econ Dyn Control 12(2–3):
231–254
Johansen S, Juselius K (1990) Maximum likelihood estimation and inference on cointegration—
with applications to the demand for money. Oxf Bull Econ Stat 52(2):169–210
Jorgenson DW (1963) Capital theory and investment behavior. Am Econ Rev 53:247–259
Kandiero T, Chitiga M (2003) Trade openness and foreign direct investment in Africa. In: Paper
prepared for the economic society of Southern Africa annual conference. Department of
Economics, University of Pretoria
Kaufmann D, Kraay A, Mastruzzi M (2010) The worldwide governance indicators: a summary of
methodology, data and analytical issues. World Bank policy research working paper no.
5430
Kindleberger CP (1969) American business abroad: six lectures on foreign direct investment. Yale
University Press, New Haven
Kojima K (1973) Macroeconomic versus international business approach to direct foreign
investment. Hitotsubashi J Econ 23:1–19
Krugell H (2005) The determinants of foreign direct investment in Africa in multinational
enterprises. In: Gries T, Naude WA (eds) Foreign direct investment and growth in Africa,
South African perspectives. Springer, Berlin, pp 49–71
Krugman PR (1979) Increasing returns, monopolistic competition and international trade. J Int
Econ 9:469–479
Lumsdaine RL, Papell DH (1997) Multiple trend breaks and the unit root hypothesis. Rev Econ
Stat 79(2):212–218
MacDougall GDA (1960) The benefits and costs of private investment from abroad: a theoretical
approach. Econ Rec 36:13–35
Morisset J (2000) Foreign direct investment in Africa: policies also matter. Transnatl Corp
9(2):107–125
Newey W, West K (1987) A simple positive semi-definite, heteroskedasticity and autocorrelation
consistent covariance matrix. Econometrica 55:703–708
Nonnenberg M, Mendonca M (2004) The determinants of direct foreign investment in developing
countries. Institute of applied economic research working paper
Ocampo JA (1986) New developments in trade theory and LDCs. J Dev Econ 22:129–170
Ohlin B (1933) Interregional and international trade. Harvard University Press, Cambridge, MA
Okpara GC (2012) An error correction model analysis of the determinant of foreign direct
investment: evidence from Nigeria, MPRA paper no. 36676
Onyeiwu S, Shrestha H (2004) Determinants of foreign direct investment in Africa. J Dev Soc 20
(1/2):89–106
Pattichis CA (1999) Price and income elasticities of disaggregated import demand: results from
UECMs and application. J Appl Econom 31:1061–1071
Pesaran MH, Shin Y (1999) An autoregressive distributed-lag modeling approach to co-integration
analysis. Econom Soc Monographs 31:313–371
Pesaran MH, Shin Y, Smith RJ (1996) Testing for the existence of a long-run relationship, DAE
working paper no. 9622
Pesaran MH, Shin Y, Smith RJ (2001) Bounds testing approaches to the analysis of level
relationships. J Appl Econom 16:289–326
Political Risk Services (2015) International country risk guide. The Political Risk Services Group
Inc., East Syracuse
Roecker EB (1991) Prediction error and its estimation for subset-selection models. Technometrics
33:459–469
Saikkonen P, Lütkepohl H (2000) Testing for the cointegrating rank of a VAR process with
structural shifts. J Bus Econ Stat 18:451–464
Seetanah B, Rojid S (2011) The determinants of FDI in Mauritius: a dynamic time series
investigation. Afr J Econ Manag Stud 2(1):24–41
Schneider F, Frey BS (1985) Economic and political determinants of foreign direct investment.
World Dev 13(2):161–175
Sunday AK, Lydie TB (2006) An analysis of foreign direct investment flows to Cameroon. In:
IbiAjayi S (ed) Foreign direct investment in sub-Saharan Africa: origins, targets, impact and
potential. African Economic Research Consortium, Nairobi (Chapter 5)
The World Bank (2015a) African development indicators. The World Bank, Washington DC
The World Bank (2015b) World development indicators. The World Bank, Washington DC
UNCTAD (1998) World investment report: trends and determinants. In: United Nations
conference on trade and development. United Nations, New York and Geneva
UNCTAD (2013) World investment report: trends and determinants. In: United Nations
conference on trade and development. United Nations, New York
UNDP (2011) Illicit financial flows from the least developed countries: 1990–2008, UNDP
discussion paper
Vernon R (1966) International investment and international trade in the product cycle. Quart J
Econ 80:190–207
Wheeler D, Mody A (1992) International investment location decisions: the case of US firms. J Int
Econ 33:57–76
White H (1980) A heteroskedasticity-consistent covariance matrix and a direct test for
heteroskedasticity. Econometrica 48:817–838
Wooldridge JM (2000) Introductory econometrics: a modern approach. South-Western College
Publishing, Cincinnati, OH
Part III
Capital Structure and Bank Loan Growth Effects
Chapter 8
Firm-Specific Determinants of Insurance Companies’ Capital Structure in Ethiopia
Yitbarek Takele and Daniel Beshir
Abstract This study examines the impact of firm-specific characteristics on the capital
structure (CS) decisions of the Ethiopian insurance industry. The study estimated two
panel fixed-effects regression models with robust standard errors, the DEBT model and
the DE model, using the financial statements of eight insurance companies covering the
period from 2005 to 2014. To validate the results, we conducted normality,
multicollinearity, heteroskedasticity, autocorrelation, and robustness tests. We found
pecking order, static trade-off, and agency cost theories as the most important in
explaining CS decisions of insurance companies in Ethiopia though the pecking
order theory appeared to be dominant. The empirical findings of the models indicate
that profitability and liquidity are significant in determining Ethiopian insurance
companies’ financing decisions, while business risk and size of the firm are
insignificant in shaping their behavior. On the other hand, firms’ asset tangibility
and growth opportunities had a significant impact on the total debt ratio, while these
factors were insignificant for the debt–equity ratio.
Keywords Ethiopia · Capital structure · Firm-specific · Insurance · Leverage
8.1 Introduction
Capital structure (CS) is a mix of long-term debt, specific short-term debt, common
equity, and preferred equity. It shows how a firm finances its overall operations and
growth by using different sources of funds. While looking at what constitutes CS,
debt comes in the form of bond issues or long-term notes payable and equity as
common stock, preferred stock, or retained earnings. It is in insurance companies’
Y. Takele (✉)
College of Business and Economics, Addis Ababa University,
e-mail: yitbarekt87@gmail.com
D. Beshir
Libya Oil Ethiopia Ltd, Addis Ababa, Ethiopia

DOI 10.1007/978-981-10-4451-9_8
156 Y. Takele and D. Beshir
interest to know about their CS patterns as they need funds to settle claims or pay
damages at the time of loss. This helps insurance companies to be sustainable
because of the nature of risks involved in their businesses and the inherent
impracticality of retaining all risks that they face during operations.
The paper is structured as follows. Section 8.2 gives a brief overview of the
Ethiopian insurance sector. Section 8.3 discusses major theoretical underpinnings
of the subject. The next section addresses the link between theoretical lenses and the
variables chosen along with empirical reviews and the conceptual framework.
Section 8.5 explains the relationship among the variables, the methodology, and
data, while Sect. 8.6 analyzes the empirical results. Section 8.7 gives a conclusion.
The determinants of CS have been debated for many years and still represent one
of the unresolved issues in the corporate finance literature. Though a few of the
theories that have been developed have been empirically tested, their findings have
led to different, anomalous, and sometimes conflicting results and conclusions. This
also suggests that the different theories are not mutually exclusive making the
debates on CS more exciting (Rajan and Zingales 1995). Moreover, Morri and
Beretta (2008) emphasize the lack of a fully supported and commonly accepted
theory of CS decisions and the unfolding nature of its determinant factors.
The different studies have made immense contributions to the theory of CS.
However, these studies are inclined toward the developed economies, and less
developed countries have received little attention. This has raised concerns about
the generalizability of such works to settings where capital markets are
underdeveloped. Consequently, research designs, methodologies,
and theoretical frameworks that best fit such contexts are worth developing. In
previous studies, antecedent variables, commonly regarded as determinants of CS
decisions, include profitability, age, agency cost, business risk, asset tangibility,
growth, non-debt tax shields, liquidity, political risks, and size. These variables,
among others, are related to firm value and risk exposure in one way or another.
Our study, therefore, investigates the determinants of decisions about CS in the
insurance industry in Ethiopia during 2005–2014. Our research identified six
hypotheses (Hai):
Ha1: There is a negative relationship between leverage and profitability in Ethiopian
insurance companies.
Ha2: There is a positive relationship between leverage and asset tangibility in
Ethiopian insurance companies.
Ha3: There is a positive relationship between leverage and growth in Ethiopian
insurance companies.
Ha4: There is a negative relationship between leverage and business risk in
Ethiopian insurance companies.
Ha5: There is a positive relationship between leverage and size of the firm in
Ethiopian insurance companies.
Ha6: There is a negative relationship between leverage and liquidity in Ethiopian
insurance companies.
8 Firm-Specific Determinants of Insurance Companies’ … 157
8.2 An Overview of the Insurance Industry in Ethiopia
The emergence of modern insurance in Ethiopia can be traced back to the estab-
lishment of the Bank of Abyssinia in 1905. The bank acted as an agent for foreign
insurance companies to underwrite fire and marine policies. The first domestic
private insurance company was established in 1951 with a share capital of Eth Br
1,000,000, and in the 1960s, the number of domestic private companies started
increasing (Zeleke 2007).
At present, 15 insurance companies are operational in Ethiopia. All of them
provide general insurance services, except one, which provides life insurance. One
of the insurance companies, the Ethiopian Insurance Corporation (EIC), is
state-owned, while the rest are private. Ethiopian insurance companies’ investment
activities are heavily constrained by the restrictions imposed by the National Bank
of Ethiopia’s investment proclamation which requires them to invest a majority of
their funds in government securities and bank deposits at negative real interest rates.
Moreover, lack of infrastructure, especially a stock market, has constrained
investment activities of Ethiopian insurance companies (Mezgebe 2010). Following
this, competition has become stiff in the industry and some of the private insurance
companies that want to increase their sales volumes have been granting unfair and
huge discounts to attract clients, thus attaining sales targets. This aggressive pricing
policy has led to an unhealthy spiral of premium cutting which significantly
undermines the growth and prospects of the insurance industry in Ethiopia.
8.3 Theoretical Underpinnings
Since the publication of Modigliani and Miller’s (1958) ‘irrelevance theory of
capital structure,’ the theory of corporate CS has been a subject of interest for finance
economists. The researchers of this study believe in the relevance of CS arguments and
theories that take into account market imperfections, as witnessed in the 2008
financial crisis. The researchers also hold the assumption that it is possible to find an
‘optimal’ CS after accounting for market imperfections such as taxes, bankruptcy,
and agency costs. In their later work, Modigliani and Miller (1963) considered
some of the criticisms and deficiencies of their theory and relaxed the assumption
that neglected corporate taxes.
Major theories of CS have emerged that diverge from the assumption of perfect
capital markets on which the ‘irrelevance model’ is based. The first of these
theories is the trade-off theory. The original version of the trade-off theory
grew out of a debate over the Modigliani–Miller theorem. When corporate income
tax was added to the original irrelevance theory, it validated the use of debt as it
provides a tax shield. It proposes that optimal CS is achieved when the marginal
present value of the tax shield on additional debt is equal to the marginal present
value of the financial distress cost on additional debt (Myers 1984).
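The optimality condition described above can be written compactly. The following is only a sketch in notation that does not appear in the original chapter: $V_L$ and $V_U$ (levered and unlevered firm value), $D$ (the debt level), $D^*$ (the optimal debt level), and $PV_{TS}$ and $PV_{FD}$ (present values of the tax shield and of financial distress costs) are labels assumed here for illustration.

```latex
V_L(D) = V_U + PV_{TS}(D) - PV_{FD}(D),
\qquad
\left.\frac{\partial PV_{TS}(D)}{\partial D}\right|_{D = D^*}
=
\left.\frac{\partial PV_{FD}(D)}{\partial D}\right|_{D = D^*}
```

Under this reading, debt is added as long as the marginal tax shield exceeds the marginal distress cost, and the optimum is reached where the two are equal.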
The dynamic trade-off theory, on the other hand, recognizes the role of time, which
requires specifying a number of aspects that are typically ignored in a single-period
model. Of particular importance are the roles of expectations and adjustment costs.
In a dynamic model, the correct financing decision typically depends on the
financing margin that a firm anticipates in the next period (Goldstein et al. 2001).
Thus, an optimal financial choice today depends on what is expected to be optimal
in the next period.
Agency cost theory, in turn, predicts that CS choice depends on
agency costs. It advocates an investigation of the conflicting interests of managers
and equity and debt holders and their impact on CS decisions. It argues that managers
who are well placed to access superior information as compared to both debt and
equity holders, mainly due to ex-post asymmetric information (Jensen and
Meckling 1976; Jensen 1986), may make CS decisions that maximize their interests
but destroy the firm’s value.
Yet another interesting theory is the pecking order theory developed by Myers
and Majluf (1984) which states that CS is driven by a firm’s desire to finance new
investments, first internally and then with low-risk debt and finally, if all fails, with
equity. Its main thesis is an association of asymmetric information and signaling
problems with external financing.
Finally, Baker and Wurgler (2002) have suggested another theory of CS: the
‘market timing theory of CS.’ Market timing implies that firms issue new shares
when they perceive they are overvalued and that firms repurchase their own shares
when they consider these to be undervalued.
What we can deduce from these theories is that they are not mutually exclusive
and do not stand on their own; rather, there exists a thread connecting them:
information asymmetry. The exception to this could be the trade-off theory which
mainly bases itself on tax shield advantages and bankruptcy costs.
8.4 Empirical Review and Conceptual Framework
Based on a review of previous studies, profitability, tangibility, growth, risk, size, and
liquidity of assets were selected and included as explanatory variables in our study,
and a firm’s CS (leverage) was used as the dependent variable. Though there are
different measures of leverage, our paper used two ratios as proxies for leverage.
The first was the debt ratio (total debt to total assets), and the second was the debt–
equity ratio (debt to equity). In both these, total debt was calculated as the sum of
short-term and long-term liabilities.
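The two leverage measures defined above can be illustrated with a minimal Python sketch. The function name, argument names, and the balance sheet figures are hypothetical and serve only to show the arithmetic; they are not taken from the study's data.

```python
def leverage_ratios(short_term_liabilities, long_term_liabilities,
                    total_assets, total_equity):
    """Return the two leverage proxies described in the text.

    Total debt is the sum of short-term and long-term liabilities.
    """
    total_debt = short_term_liabilities + long_term_liabilities
    return {
        "debt_ratio": total_debt / total_assets,   # TD/TA
        "debt_equity": total_debt / total_equity,  # D/E
    }


# Hypothetical balance sheet figures (any currency unit).
example = leverage_ratios(
    short_term_liabilities=400.0,
    long_term_liabilities=260.0,
    total_assets=1000.0,
    total_equity=340.0,
)
print(example)
```

With these illustrative figures, total debt is 660, so the debt ratio is 0.66 and the debt–equity ratio is 660/340.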
The pecking order theory (Myers 1984) argues that profitable firms with access
to retained profits can rely on them as opposed to outside sources such as debt. On
the other hand, the static trade-off theory (Myers 1984; Myers and Majluf 1984)
provides a contradictory view and argues that profitable firms have greater needs to
shield income from corporate tax to increase profits and should borrow more as
compared to less profitable firms. In contrast to Myers and Majluf (1984) and
Myers’ (1984) views, empirical evidence from financial and non-financial firms
(Ahmed et al. 2010; Gill et al. 2009; Najjar and Petrov 2011; Rajan and Zingales
1995; Sharif et al. 2012; Teker et al. 2009) found that profitable firms used less debt
financing in line with the pecking order theory, while studies by Kumar et al. (2012)
and Sayeed (2011) found that profitable firms used more debt finance. As a proxy
for the measure of profitability, our study used the ratio of operating income to total
assets (return on assets) used by Booth et al. (2001), Cassar and Holmes (2003),
Mohammed Amidu (2007), and Adesola (2009).
According to Jensen and Meckling’s (1976) agency cost theory, there is a
conflict between lenders and shareholders due to the possibility of moral hazard on
the part of borrowers. This conflict creates incentives for shareholders to invest in a
suboptimal way, and lenders require tangible assets as collateral to protect themselves.
The agency cost of debt increases when firms cannot collateralize their
debts. A sizeable proportion of a firm’s assets can be used as collateral to fulfill
lenders’ requirements. In the trade-off theory, Modigliani and Miller (1963) argue for a
reduction in financial distress costs for those firms with more tangible assets
because of a better chance to get debt financing. Empirical studies by Najjar and
Petrov (2011); Noulas and Genimaks (2011); Rajan and Zingales (1995); and
Titman and Wessels (1988) found that firms with a higher proportion of tangible assets
raised more debt using these assets as collateral. As indicated in the studies by
Mohammed Amidu (2007) and Adesola (2009), our study also used the ratio of
fixed assets over total assets as a proxy measure of tangibility.
The pecking order theory argues that firms prefer debt financing over equity due
to its riskiness, and hence, a positive relationship between leverage and growth is
expected. However, in the static trade-off theory, growing firms face financial
distress and prefer to use equity financing. Empirical studies by Ahmed et al.
(2010); Noulas and Genimaks (2011); Kumar et al. (2012); and Sharif et al. (2012)
have found that growing firms used more debt to finance their businesses. Contrary
to this, studies by Rajan and Zingales (1995); Shah and Khan (2007); and Titman
and Wessels (1988) show that growing firms used equity financing instead of debt.
In our study, following Dawood et al. (2011) and Onaolapo
and Kajola (2010), growth was measured as the annual percentage change in total
assets.
The static trade-off theory (Myers 1984) argues that risky firms can borrow less
as compared to less risky firms because the costs of financial distress offset the tax
shields of debt. The riskier a firm, the greater the chance of defaulting and being
exposed to such costs. That is, firms with highly volatile earnings face a risk of the earnings
level dropping below their debt servicing commitments, thereby incurring higher
costs of financial distress. Hence, such firms should reduce their leverage levels to
avoid the risk of bankruptcy. As indicated in Song (2005), income variability is a
measure of business risk. In our study, it is measured as the ratio of the standard
deviation of operating income over total assets.
Theoretically, the static trade-off theory states that for large companies the risk of
bankruptcy is minimized due to the economies of scale. The assets of a company
will be financed more through debt, as optimality of CS can be reached by
balancing the benefits and costs of debt (Modigliani and Miller 1958). The
empirical results of Ahmed et al. (2010); Kumar et al. (2012); and Najjar and
Petrov (2011) support the argument that the size of a firm and its leverage
are positively related. According to the pecking order theory, however, informational
asymmetry for large firms is smaller, and as a result, they prefer to be
financed by equity instead of debt (Myers and Majluf 1984) because this reduces
the chances of undervaluation of the newly issued equity and thus encourages
large firms to use equity financing. In our study, as in Booth et al. (2001) and
Cassar and Holmes (2003), the natural log of total assets is used to measure the size
of the firm.
There are two different opinions about the association between liquidity and CS.
The first view, as explained in the trade-off theory, argues that firms with more
liquidity tend to use more external borrowings because of their ability to pay off
their liabilities. On the contrary, the pecking order theory holds that firms with
financial slack will prefer internal sources to debt or equity to finance future
investments (Myers 1984). Most previous studies confirm the negative relation.
Harris and Raviv (1991); Najjar and Petrov (2011); and Sharif et al. (2012) found
that firms with high liquidity ratios or more liquid assets preferred using these assets
to finance their investments and were discouraged from raising external funds (either equity
or debt). But Bayeh found an insignificant effect of liquidity on leverage usage
by insurance companies. As in Dawood et al. (2011), our study
used the ratio of current assets to current liabilities to capture liquidity
(see Table 8.1).
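The proxies described in this section can be sketched in a few lines of Python. This is an illustration only: the function and argument names are hypothetical, the input figures are invented, and the risk proxy follows one possible reading of the text (the standard deviation of a firm's operating income divided by its total assets).

```python
import math
import statistics


def firm_year_proxies(operating_income, total_assets, prev_total_assets,
                      fixed_assets, current_assets, current_liabilities):
    """Compute the firm-year explanatory variables named in the text."""
    return {
        "PF": operating_income / total_assets,                      # profitability
        "TN": fixed_assets / total_assets,                          # tangibility
        "GR": (total_assets - prev_total_assets) / prev_total_assets,  # growth
        "SZ": math.log(total_assets),                               # size
        "LQ": current_assets / current_liabilities,                 # liquidity
    }


def business_risk(operating_incomes, total_assets):
    """RK under the assumed reading: std. dev. of operating income over total assets."""
    return statistics.stdev(operating_incomes) / total_assets


# Hypothetical figures for a single firm-year.
proxies = firm_year_proxies(
    operating_income=80.0, total_assets=1000.0, prev_total_assets=800.0,
    fixed_assets=200.0, current_assets=500.0, current_liabilities=490.0,
)
risk = business_risk([70.0, 80.0, 90.0], total_assets=1000.0)
print(proxies, risk)
```

Other readings of the risk measure (for example, the standard deviation of the ratio of operating income to total assets) would need a small change to `business_risk`.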
Our study used the quantitative research approach to construct an empirical model.
Multiple regression analyses were used to measure the effects of the determinants
on the output variable and to examine the associative relationships between vari-
ables in terms of the relative importance of the independent variables and predicted
values of the dependent variables.
Our study used secondary data from annual reports of insurance companies and
the National Bank of Ethiopia (NBE). As per NBE’s current information, 15 in-
surance companies are operating in the country. Since there are only a few insur-
ance companies, there was no need to take a sample from them. Accordingly, based
on the years of service, audited financial data of those insurance companies which
were operational in 2005–2014 were included in our study. The reason behind
selecting the stated period was to obtain strongly balanced data for the analysis. In
order to make the panel data model structured and balanced, the same regular
frequency of the cross-sectional data with the same start and end dates was
maintained. Six insurance companies did not have the required data for the period
and were excluded from the sampling frame. Moreover, one insurance company is
government-owned and so was excluded as it was not possible to obtain complete
audited financial statements for the whole period. Finally, 10 consecutive years’
information and data from eight insurance companies for 2005–2014 were used in
our study.

Table 8.1 Measurement of independent variables and expected relationships

Variable                     Measurement proxy used for this study    Theoretical relationship with leverage   Theory                     Expected relationship with leverage
Profitability (PF)           Operating income/total assets            (+)                                      Static trade-off theory    (−)
                                                                      (−)                                      Pecking order theory
Tangibility of assets (TN)   Fixed asset/total assets                 (−)                                      Agency cost theory         (+)
                                                                      (+)                                      Static trade-off theory
Growth (GR)                  Annual change in total assets            (+)                                      Pecking order theory       (+)
                                                                      (−)                                      Static trade-off theory
Risk (RK)                    Standard deviation of operating income   (−)                                      Static trade-off theory    (−)
                                                                                                               Pecking order theory
Firm size (SZ)               Natural logarithm of total assets        (+)                                      Static trade-off theory    (+)
                                                                      (−)                                      Pecking order theory
Liquidity (LQ)               Current asset/current liability          (+)                                      Static trade-off theory    (−)
                                                                      (−)                                      Pecking order theory
Source Own summary
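The sample construction described above (keeping only firms with complete data for every year of 2005–2014 and dropping firms excluded for other reasons) can be sketched as a simple filter. The function name and the firm labels "A", "B", and "C" are hypothetical; the actual study ended with eight firms and 10 years, that is, 80 firm-year observations.

```python
def balanced_panel_firms(records, start_year, end_year, excluded=()):
    """Return firms observed in every year of the window and not excluded.

    records: iterable of (firm, year) pairs for which audited data exist.
    """
    required_years = set(range(start_year, end_year + 1))
    years_by_firm = {}
    for firm, year in records:
        years_by_firm.setdefault(firm, set()).add(year)
    return sorted(
        firm for firm, years in years_by_firm.items()
        if required_years <= years and firm not in excluded
    )


# Hypothetical availability: A is complete, B starts late, C is complete but excluded.
records = (
    [("A", y) for y in range(2005, 2015)]
    + [("B", y) for y in range(2008, 2015)]
    + [("C", y) for y in range(2005, 2015)]
)
kept = balanced_panel_firms(records, 2005, 2014, excluded={"C"})
n_observations = len(kept) * (2014 - 2005 + 1)
print(kept, n_observations)
```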
The general model for this study is presented as follows:
$$Y_{i,t} = \beta_0 + \beta X_{i,t} + \varepsilon_{i,t}$$
The subscript $i$ represents the cross-sectional dimension, and $t$ denotes the time
series dimension. The left-hand side in the equation, $Y_{i,t}$, represents the dependent
variable in the model, which is a firm’s leverage. On the right side, $X_{i,t}$ represents
the set of independent variables in the estimated model. Therefore, the expanded
forms of both models built in line with the hypotheses of the study are as follows:
DEBT model: debt ratio (total debt/total asset) as the dependent variable
$$(TD/TA)_{it} = \beta_0 + \beta_1 PF_{it} + \beta_2 TN_{it} + \beta_3 GR_{it} + \beta_4 RK_{it} + \beta_5 SZ_{it} + \beta_6 LQ_{it} + \varepsilon_{it} \qquad (1)$$
DE model: debt–equity ratio as the dependent variable
$$(D/E)_{it} = \beta_0 + \beta_1 PF_{it} + \beta_2 TN_{it} + \beta_3 GR_{it} + \beta_4 RK_{it} + \beta_5 SZ_{it} + \beta_6 LQ_{it} + \varepsilon_{it} \qquad (2)$$
where
TD/TA Total debt to total assets
D/E Debt to equity
PF Profitability
TN Tangibility
GR Growth
RK Risk
SZ Size of the firm
LQ Liquidity
$\varepsilon$ Error term
The models were tested for the classical linear regression model’s (CLRM)
assumptions. Accordingly, Shapiro–Wilk, the correlation matrix, and Breusch–Pagan
tests were conducted to test normality, multi-collinearity, and heteroskedasticity,
respectively. We found no multi-collinearity problem, which would exist if the correlation
between any two independent variables was more than 0.75 (Malhotra 2008).
Moreover, Shapiro–Wilk showed that normality had been established. See
Annexure 2 for diagnostic tests.
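The correlation-based screen for multi-collinearity mentioned above can be sketched as follows. This is an illustrative implementation, not the authors' STATA procedure: the function names and the toy data are assumptions, and only the 0.75 threshold comes from the text.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.75):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy data: TN is an exact multiple of PF, LQ is only weakly related to both.
toy = {
    "PF": [1.0, 2.0, 3.0, 4.0, 5.0],
    "TN": [2.0, 4.0, 6.0, 8.0, 10.0],
    "LQ": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = collinear_pairs(toy)
print(flagged)
```

In the toy data only the PF and TN pair is flagged; in the study no pair of explanatory variables exceeded the threshold.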
We used the regression models and applied different tests (Breusch and Pagan
Lagrangian multiplier (LM) test, Hausman test) to choose the best model for the
panel data under the study:
• Pooled OLS (POLS) model regression,
• Pooled OLS with dummy variable (least square dummy variable: LSDV) model
regression or fixed effects regression model, and
• Random effects GLS (generalized least squares) model regression.
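To illustrate how the fixed effects (within) estimator differs from pooled OLS, the sketch below uses a single regressor and a toy two-firm panel. It is a simplified illustration, not the six-regressor model estimated in the study; the function names and data are hypothetical.

```python
def pooled_slope(panel):
    """OLS slope ignoring firm identity (pooled OLS with one regressor)."""
    points = [p for obs in panel.values() for p in obs]
    x_bar = sum(x for x, _ in points) / len(points)
    y_bar = sum(y for _, y in points) / len(points)
    num = sum((x - x_bar) * (y - y_bar) for x, y in points)
    den = sum((x - x_bar) ** 2 for x, _ in points)
    return num / den


def within_slope(panel):
    """Fixed effects slope: demean x and y by firm, then run OLS."""
    num = den = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


# Toy panel generated as y = a_i - 0.5 * x with firm effects a_A = 1 and a_B = 3.
toy_panel = {
    "A": [(1.0, 0.5), (2.0, 0.0), (3.0, -0.5)],
    "B": [(4.0, 1.0), (5.0, 0.5), (6.0, 0.0)],
}
print(pooled_slope(toy_panel), within_slope(toy_panel))
```

Because the firm effects in the toy data are correlated with the regressor, the pooled slope is slightly positive while the within slope recovers the value of −0.5 used to generate the data. This is the kind of situation in which a fixed effects specification is preferred.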
Before explaining the results of the regression analysis, the results of the descriptive
statistics and Pearson’s correlation coefficient matrix are briefly explained.
The mean debt ratio (total debt to total assets) of the 80 observations was
66.8% with a standard deviation of 8.3%, indicating that more than 66% of the
balance sheets of insurance companies in Ethiopia were debt-financed, while the
mean debt ratio in the USA and in the UK is 58 and 54%, respectively (Rajan and
Zingales 1995) (Table 8.2). Though theoretically it is argued that firms in developed
countries are more highly levered than their developing country counterparts,
mainly due to their well-developed bond markets, the findings of our study show
otherwise. This could be related to the absence of stock markets in developing
countries, which makes equity financing less attractive. What is interesting about
the descriptive statistics of our results is the presence of high variability in the
growth, tangibility, size, and liquidity of insurance companies in Ethiopia, which
may stress the need to consolidate the sector through mergers and acquisitions.

Table 8.2 Descriptive summary statistics

Variable   Obs.   Mean     Std. dev.   Min      Max
TD/TA      80     0.668    0.083       0.453    0.822
D/E        80     0.755    0.405       −0.189   1.669
gro        80     0.231    0.157       −0.066   0.670
tang       80     0.194    0.110       0.026    0.542
pr         80     0.082    0.049       −0.047   0.182
risk       80     0.141    0.099       0.025    0.432
size       80     18.914   0.843       16.965   20.294
lq         80     1.022    0.264       0.543    2.306
Source Structured review of annual financial report (generated from STATA)
8.6.1 Model Selection
Annexure 1 presents all model selection tests including the results for the POLS
model regression, the fixed effects (or LSDV) regression model, and the random
effects model regression. We used the Breusch and Pagan Lagrangian multiplier
(LM) test to decide between random effects and POLS and the Hausman test to
decide between random effects and fixed effects models.
The results of the Breusch and Pagan LM test for the DEBT model revealed that
there was very strong evidence (p-value 0.0006) at the 1% level of significance
against the null hypothesis that POLS is appropriate. This result favors the random
effects model’s estimation over the pooled OLS model. The same LM test for the
DE model showed indifference between POLS and the random effects model’s
estimations. Moreover, the results of the Hausman test showed very strong evidence
(p-value 0.0085 for the DEBT model and p-value 0.0012 for the DE model) against
the null hypothesis at the 1% level of significance suggesting fixed effects estimates
rather than random effects estimates. Accordingly, the analysis and discussion of
results are based on the fixed effects estimates.
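The selection logic described above can be summarized as a small decision rule. The sketch below is a simplification under stated assumptions: the function name is hypothetical, and the rule gives priority to the Hausman test, which is how the text arrives at fixed effects for both models.

```python
def choose_panel_estimator(lm_p_value, hausman_p_value, alpha=0.01):
    """Pick an estimator from the Breusch-Pagan LM and Hausman p-values.

    Hausman rejects  -> fixed effects (random effects estimates inconsistent).
    Else LM rejects  -> random effects (pooled OLS rejected).
    Otherwise        -> pooled OLS.
    """
    if hausman_p_value < alpha:
        return "fixed effects"
    if lm_p_value < alpha:
        return "random effects"
    return "pooled OLS"


# p-values reported in the text for the DEBT model.
debt_choice = choose_panel_estimator(lm_p_value=0.0006, hausman_p_value=0.0085)
print(debt_choice)
```

For the DE model the text reports only the Hausman p-value (0.0012), which under this rule also leads to fixed effects regardless of the LM result.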
In order to make the fixed effects estimation results robust, the modified Wald
group-wise heteroskedasticity test in the fixed effects regression model was
undertaken. The results for both the DEBT and DE models revealed very strong
evidence (p-value 0.0000) against the null hypothesis of homoskedasticity. Hence,
there was group-wise heteroskedasticity in fixed effects regression in both the
models. Accordingly, a robust standard error estimation in the fixed effects model
was used to tackle the group-wise heteroskedasticity problem of the fixed effects
estimates in both the models.
8.6.2 Estimation Results of the DEBT Model with a Robust Standard Error in Fixed Effects
The results of the fixed effects model with a robust standard error regression for the
DEBT model are presented in Table 8.3. The results show that asset tangibility,
profitability, risk, and liquidity had a negative relation with debt ratio, while growth
and firm size had a positive association with leverage. The results also indicate that
growth and tangibility were statistically significant at 5%. Moreover, profitability
and liquidity were significant at 1%, while risk and firm size were insignificant. In
addition, the value of R²-within was 0.7165 and adjusted R² was 0.6931 for the DEBT
model. Hence, 69.31% of the variability in leverage is explained by selected
firm-specific factors.

Table 8.3 Fixed effects estimates with a robust standard error for the DEBT model’s regression

Fixed effects (within) regression: DEBT MODEL    Number of obs.          80
Group variable: ID                               Number of groups        8
R²: Within    0.7165                             Obs. per group: min     10
    Between   0.8782                                             avg     10.0
    Overall   0.7918                                             max     10
                                                 F(6,7)                  1792.72
corr(u_i, xb) 0.4602                             Prob > F                0.0000
(Std. err. adjusted for 8 clusters in ID)

lev      Coeff.   Robust std. err.   t       p > |t|   [95% conf. interval]
gro      0.076    0.022              3.44    0.011     0.024     0.128
tang     −0.137   0.045              −3.04   0.019     −0.243    −0.030
pr       −0.583   0.100              −5.80   0.001     −0.821    −0.345
risk     −0.319   0.198              −1.61   0.151     −0.787    0.148
size     0.016    0.024              0.68    0.521     −0.014    0.074
lq       −0.120   0.016              −7.61   0.000     −0.157    −0.083
_cons    0.582    0.490              1.19    0.274     −0.577    1.741
sigma_u  0.032
sigma_e  0.029
rho      0.554 (fraction of variance due to u_i)
Source Structured review of annual financial report (generated using STATA)
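The t statistics and confidence intervals in Table 8.3 can be approximately reproduced from the coefficients and robust standard errors. The sketch below assumes a Student's t distribution with 7 degrees of freedom (the number of clusters minus one, as suggested by the F(6,7) statistic in the table); the critical value 2.365 and the function name are assumptions made for illustration.

```python
T_CRIT_DF7 = 2.365  # two-sided 5% critical value of Student's t with 7 d.f.


def t_stat_and_ci(coef, std_err, t_crit=T_CRIT_DF7):
    """Return the t statistic and the 95% confidence interval for one estimate."""
    t_stat = coef / std_err
    half_width = t_crit * std_err
    return t_stat, (coef - half_width, coef + half_width)


# Profitability row of Table 8.3: coefficient -0.583, robust std. err. 0.100.
t_pr, ci_pr = t_stat_and_ci(-0.583, 0.100)
print(round(t_pr, 2), tuple(round(b, 3) for b in ci_pr))
```

Small differences from the printed table (for example, −5.83 here against −5.80) are expected because the published coefficients and standard errors are rounded to three decimals.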
8.6.3 Estimation Results of DE Model with a Robust Standard Error in Fixed Effects
The results of the fixed effects model with a robust standard error regression for the
DE model are given in Table 8.4. The results show that profitability, risk, and
liquidity had a negative relation with the debt–equity ratio, while asset tangibility,
growth, and firm size had a positive association with the debt–equity ratio. The
results also indicate that only profitability and liquidity were statistically significant
at 5%. The other explanatory variables were insignificant. In this model, the value
of R²-within was 0.5199 and adjusted R² was 0.4804. This shows that only 48% of
the variability in the debt–equity ratio is explained by selected firm-specific factors.
Table 8.4 Fixed effect estimates with a robust standard error for the DE model regression

Fixed effects (within) regression: DE MODEL      Number of obs.          80
Group variable: ID                               Number of groups        8
R²: Within    0.5199                             Obs. per group: min     10
    Between   0.7077                                             avg     10.0
    Overall   0.6022                                             max     10
                                                 F(6,7)                  74.13
corr(u_i, xb) 0.2470                             Prob > F                0.0000
(Std. err. adjusted for 8 clusters in ID)

lev      Coeff.   Robust std. err.   t       p > |t|   [95% conf. interval]
gro      0.283    0.189              1.49    0.179     −0.165    0.731
tang     0.061    0.731              0.08    0.936     −1.669    1.791
pr       −2.128   0.569              −3.74   0.007     −3.475    −0.781
risk     −0.802   0.803              −1.00   0.351     −2.700    1.096
size     0.220    0.114              1.93    0.095     −0.050    0.490
lq       −0.345   0.116              −2.98   0.020     −0.619    −0.072
_cons    −2.848   2.192              −1.30   0.235     −8.032    2.336
sigma_u  0.179
sigma_e  0.215
rho      0.410 (fraction of variance due to u_i)
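The adjusted R² values quoted for the two models are close to what the textbook adjustment gives when applied to the within R². The chapter does not state how the adjusted values were obtained, so the formula, the function name, and the choice of n = 80 observations and k = 6 regressors below are assumptions.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Textbook adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


adj_debt = adjusted_r2(0.7165, n_obs=80, n_regressors=6)  # DEBT model
adj_de = adjusted_r2(0.5199, n_obs=80, n_regressors=6)    # DE model
print(round(adj_debt, 4), round(adj_de, 4))
```

Under these assumptions the DE model value matches the reported 0.4804 to four decimals, and the DEBT model value is within 0.0001 of the reported 0.6931.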
8.7 Discussion of Results
8.7.1 Profitability and Leverage

Ha1: There is a negative relationship between leverage and profitability in Ethiopian
insurance companies.
The results of the fixed effects model with a robust standard error for both models
indicated that profitability had a negative and highly significant relationship with
leverage (p-value = 0.001 for the DEBT model and p-value = 0.007 for the DE
model). Thus, the null hypothesis is rejected, and the alternative is supported. The
results are consistent with the pecking order theory, which argues that profitable
firms with access to retained profits can rely on internal sources instead of external
ones. Moreover, this negative association between profitability and leverage is in
line with agency theory as well. It also supports the findings of
Rajan and Zingales (1995) and Cassar and Holmes (2003) but contradicts the static
trade-off theory (Myers 1984; Myers and Majluf 1984) which argues that profitable
firms have greater needs to shield their incomes from corporate tax to increase their
profits and should borrow more as compared to less profitable firms.
8.7.2 Asset Tangibility and Leverage

Ha2: There is a positive relationship between leverage and asset tangibility in
Ethiopian insurance companies.
A positive relationship between tangibility and leverage was hypothesized a priori.
The results of the DE model show that tangibility had a positive but
insignificant impact on leverage. The results indicate that the Ethiopian insurance
sector holds fewer fixed assets and relies less on debt financing. Nonetheless, the
positive correlation is in line with the static trade-off and pecking order theories.
On the other hand, the DEBT model’s results showed that tangibility had a
negative and significant (p-value = 0.019) relationship with leverage.
Consistent with the findings of previous studies (Ebru 2011), the relationship
between tangibility and short-term debt was negative and significant. With respect
to short-term debt, it is generally expected that firms tend to match the maturity of
their debts with assets. This means that firms with more fixed assets rely more on
long-term debt, while those with more current assets depend more on
short-term financing (Abor 2005).
The negative relationship between tangibility and leverage in our study conforms
with the agency cost theory though it is not consistent with the findings of Hassan
(2011); Najjar and Petrov (2011); Noulas and Genimaks (2011); Rajan and Zingales
(1995); and Titman and Wessels (1988), who found that firms with a higher proportion
of tangible assets raised more debt by using them as collateral.
8.7.3 Growth and Leverage

Ha3: There is a positive relationship between leverage and growth in Ethiopian
insurance companies.
The results of the relationship between growth and leverage for both the DEBT and
DE models’ regressions show a positive association. The positive
association could arise because growing insurance firms rely more on
external borrowings to seize market opportunities. This argument is supported by
the pecking order theory.
Growth opportunities for insurance companies exhibit a significant (p-value =
0.011 for the DEBT model) impact on the debt ratio. The probable reason could be
that growing insurance companies need to expand their branches to reach additional
customers, prompting them to absorb more debt. This finding is in conformity with
the studies of Ahmed et al. (2010); Kumar et al. (2012); Noulas and Genimaks (2011); and
Sharif et al. (2012), which found that growing firms were mainly financed by debt.
However, the results obtained from the DE model regression show that there exists
no significant relationship (p-value = 0.179 for the DE model) between expected
growth and the debt-to-equity ratio. This finding is in conformity with studies by
Hassen (2011); Najjar and Petrov (2010); Olayinka (2011); Rajan and Zingales
(1995); Shah and Khan (2007); and Titman and Wessels (1988) which showed that
growing firms were financed more by equity instead of debt. This positive but insignificant
result indicates that growth is not considered a proper explanatory variable of
leverage in the Ethiopian insurance sector. One possible explanation could be that the
measure used in our study, the percentage change in total assets, did not reflect future
growth possibilities enough. Thus, other more significant results might be obtained
by using another measure (proxy) for growth, for instance annual change in sales or
the market-to-book ratio. In addition, the adjusted R2 for the DE model’s regression
revealed that only 48% of the variability in the debt–equity ratio was explained by the
selected firm-specific variables in our study.
8.7.4 Risks and Leverage

Ha4: There is a negative relationship between leverage and business risk in
Ethiopian insurance companies.
Business risk is insignificant for both the DEBT model (p-value = 0.151) and the
DE model (p-value = 0.351) in explaining CS decisions of Ethiopian insurance
companies. This result contradicts Kindie (2011) and Solomon’s (2012) studies.
However, it is in line with the argument of the trade-off theory which suggests that
less risky insurance firms can take more debt as their ability to pay interest pay-
ments without delay is reliable. The results of both the models are also in line with
the pecking order theory, which predicts a negative relationship between leverage
and the earning volatility of a firm.
8.7.5 Size of the Firm and Leverage

Ha5: There is a positive relationship between leverage and the size of a firm in
Ethiopian insurance companies.
The size of the insurance firms is insignificant in explaining capital structure
decisions for both the DEBT model (p-value = 0.521) and the DE model
(p-value = 0.095) at the 5% significance level. The reason could be that lending
organizations give less emphasis to the size of the firm while performing a credit
risk analysis. However, the results of both the models confirm that the size of an
Ethiopian insurance company positively affected leverage even if it was insignifi-
cant. This is in line with the trade-off and agency theories and is similar to Rajan
and Zingales (1995) and Kindie (2011).
8.7.6 Liquidity and Leverage

Ha6: There is a negative relationship between leverage and liquidity in Ethiopian
insurance companies.
For both models, liquidity had a negative relationship with leverage that was
significant at the 5% level (p-value = 0.000 for the DEBT model and p-value = 0.020
for the DE model). This strong negative relationship
implies that Ethiopian insurance firms with liquid assets such as cash and marketable
securities prefer internal sources to debt or equity to finance future
investments, which is consistent with the pecking order theory. The results,
however, contradict the trade-off theory, which argues that firms with more liquidity
tend to use more external borrowings because of their ability to pay off their
liabilities. The results also deviate from Kindie’s (2011) empirical study.
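The way the six hypotheses are assessed in this section can be summarized as a simple rule that compares the expected sign with the estimated sign and checks significance at the 5% level. The sketch below is illustrative: the function name and verdict labels are hypothetical, the p-values come from Table 8.3, and the growth and tangibility coefficients are taken from the fixed effects column of Annexure 1.

```python
def assess(expected_sign, coef, p_value, alpha=0.05):
    """Classify one hypothesis from the expected sign, estimate, and p-value."""
    same_sign = (coef > 0) == (expected_sign > 0)
    if p_value >= alpha:
        return ("expected sign, insignificant" if same_sign
                else "opposite sign, insignificant")
    return "supported" if same_sign else "contradicted"


# (expected sign, coefficient, p-value) for the DEBT model.
DEBT_MODEL = {
    "Ha1 profitability": (-1, -0.583, 0.001),
    "Ha2 tangibility": (+1, -0.137, 0.019),
    "Ha3 growth": (+1, 0.076, 0.011),
    "Ha4 risk": (-1, -0.319, 0.151),
    "Ha5 size": (+1, 0.016, 0.521),
    "Ha6 liquidity": (-1, -0.120, 0.000),
}
verdicts = {name: assess(*row) for name, row in DEBT_MODEL.items()}
for name, verdict in verdicts.items():
    print(name, "->", verdict)
```

For the DEBT model this rule gives support for Ha1, Ha3, and Ha6, a significant result of the opposite sign for Ha2, and estimates with the expected sign but no significance for Ha4 and Ha5, in line with the discussion above.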
8.8 Conclusion and Future Research Direction
The empirical findings of both the models indicate that profitability and liquidity
were significant in determining Ethiopian insurance companies’ financing deci-
sions, while business risk and size of a firm were found to be insignificant in
shaping the behavior of the firm. On the other hand, asset tangibility and growth
opportunities for firms had a significant impact on the total debt ratio. However,
these factors were insignificant for the debt–equity ratio. Insurance companies in
Ethiopia rely on short-term debt due to the absence of a stock market in the country.
They also depend more on external borrowings to expand their markets.
Based on previous studies and an extensive literature review, the major theories
of CS including the static trade-off theory, the pecking order theory, and the agency
theory were selected and an attempt was made to identify the theory that best
explained the financial decision behavior of insurance companies in Ethiopia. The
results revealed that pecking order, information asymmetry, and the static trade-off
theories were all important in explaining the CS of insurance companies in
Ethiopia, even if the pecking order theory appeared to be dominant.
Considering the current growth opportunities for insurance companies in
Ethiopia, internal sources of funding might not be enough. Therefore, it is advisable
not to depend only on internal sources of funds. Having a reasonable proportion of
long-term debt in CS is considered a priority for growth in developing countries as
this helps them utilize available market opportunities. Moreover, the industry
should keep in touch with the trade-off theory since it has strong practical appeal; it
rationalizes moderate debt ratios and sets a target debt-to-equity ratio.
Future Research Direction
Macroeconomic factors (such as inflation, GDP, and interest rates), qualitative
factors (such as the management quality, policies, and procedures of each insurance
company), and the ownership structures of the companies might all have an impact
on CS choice. These factors, together with the effect of regulation on the solvency
and CS of insurance companies, are recommended as areas for further research.
Moreover, there is a need to thoroughly
study why pecking order happens to be the dominant theory in explaining the
financing behavior of insurance companies in Ethiopia.
Annexure 1: Model Selection
Results of the POLS, fixed effects (or LSDV), and random effects regressions for
the DEBT model
Variable   POLS         LSDV         Fixed effects   Random effects
gro        0.119***     0.076**      0.076**         0.089***
tang       −0.280***    −0.137*      −0.137*         −0.204***
pr         −0.755***    −0.583***    −0.583***       −0.645***
Risk       −0.409***    −0.319**     −0.319**        −0.328***
size       0.003        0.014        0.014           0.015
lq         −0.183***    −0.120***    −0.120***       −0.147***
ID
2                       −0.015
3                       −0.65*
4                       0.014
5                       −0.019
6                       −0.489***
7                       −0.073***
8                       −0.0566**
_cons      0.949***     0.615*       0.582           0.651**
N          80           80           80              80
Note: *p < 0.05; **p < 0.01; ***p < 0.001
Results of the POLS, fixed effects (or LSDV), and random effects regressions for
the DE model
Variable   POLS         LSDV         Fixed effects   Random effects
gro        0.496*       0.283        0.283           0.496*
tang       −0.833**     0.061        0.061           −0.833**
pr         −3.220***    −2.128**     −2.128**        −3.220***
Risk       −1.530**     −0.802       −0.802          −1.530**
size       0.119        0.220        0.220           0.119
lq         −0.674***    −0.345*      −0.345*         −0.674***
ID
2                       −0.157
3                       −0.335
4                       −0.099
5                       0.093
6                       −0.189
7                       −0.467***
8                       −0.244
_cons      −0.273       −2.673       −2.673          −0.273
N          80           80           80              80
Legend: *p < 0.05; **p < 0.01; ***p < 0.001
Breusch and Pagan LM test for DEBT model
Breusch and Pagan Lagrangian multiplier test for random effects: DEBT model
lev[ID, t] = xb + u[ID] + e[ID, t]
Estimated results:
        Var       sd = sqrt(Var)
lev     0.0069    0.0832
e       0.0008    0.0291
u       0.0003    0.0180
Test: Var(u) = 0
Chi2(01) = 10.63
Prob > Chi2 = 0.0006
Breusch and Pagan LM test for DE Model
Breusch and Pagan Lagrangian multiplier test for random effects: DE model
lev[ID, t] = xb + u[ID] + e[ID, t]
Estimated results:
        Var       sd = sqrt(Var)
lev     0.164     0.405
e       0.046     0.215
u       0         0
Test: Var(u) = 0
Chi2(01) = 0.00
Prob > Chi2 = 1.0000
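The p-values of the two Breusch and Pagan statistics reported above can be checked by hand. The sketch below is illustrative only and rests on one assumption that the annexure does not state: that the "(01)" label denotes a 50:50 mixture of a point mass at zero and a chi-squared distribution with one degree of freedom, so that the p-value is half the usual chi-squared(1) tail. The function names are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chibar2_01_pvalue(stat):
    """p-value under a 50:50 mixture of a point mass at zero and chi-squared(1).

    Assumption: this mixture is the reference distribution for the one-sided
    test of var(u) = 0. A statistic of zero sits on the point mass, so p = 1.
    """
    if stat <= 0.0:
        return 1.0
    return 0.5 * chi2_sf_1df(stat)


p_debt = chibar2_01_pvalue(10.63)  # DEBT model statistic from the annexure
p_de = chibar2_01_pvalue(0.00)     # DE model statistic from the annexure

if __name__ == "__main__":
    print(f"DEBT model: p = {p_debt:.4f}")
    print(f"DE model:   p = {p_de:.4f}")
```

Under this assumption the computed values agree with the reported 0.0006 and 1.0000. A p-value below 0.05 for the DEBT model points toward random effects over pooled OLS, while the DE model shows no evidence of a firm-level variance component, which would also explain why its POLS and random effects columns coincide.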
Hausman test for DEBT model
           Coefficients
           (b)             (B)              (b − B)       Sqrt(diag(v_b − v_B))
Variable   Fixed effects   Random effects   Difference    S.E.
gro        0.076           0.089            −0.013        .
tang       −0.137          −0.206           0.069         0.370
pr         −0.583          −0.645           0.062         0.014
risk       −0.319          −0.329           0.087         0.067
size       0.016           0.015            0.001         0.009
lq         −0.120          −0.147           0.027         0.012
b = consistent under H0 and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under H0; obtained from xtreg
Test: H0: difference in coefficients not systematic
Chi2(6) = (b − B)′[(v_b − v_B)^(−1)](b − B)
        = 17.21
Prob > Chi2 = 0.0085
Hausman test for DE model
           Coefficients
           (b)             (B)              (b − B)       Sqrt(diag(v_b − v_B))
Variable   Fixed effects   Random effects   Difference    S.E.
gro        0.283           0.496            −0.213        .
tang       0.061           −0.834           0.893         0.358
pr         −2.128          −3.222           1.094         0.329
risk       −0.802          −1.530           0.728         0.639
size       0.220           0.119            0.102         0.0903
lq         −0.345          −0.674           0.329         0.115
b = consistent under H0 and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under H0; obtained from xtreg
Test: H0: difference in coefficients not systematic
Chi2(6) = (b − B)′[(v_b − v_B)^(−1)](b − B)
        = 22.10
Prob > Chi2 = 0.0012
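The p-values attached to the two Hausman statistics above follow directly from the chi-squared distribution with six degrees of freedom. The sketch below is a minimal check using only the reported statistics. It does not recompute the quadratic form (b − B)′[(v_b − v_B)^(−1)](b − B), because the covariance matrices are not given in the annexure. The function name is hypothetical, and the closed form used applies to even degrees of freedom only.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for chi-squared with an even number of degrees of freedom.

    For df = 2k the tail has the closed form exp(-x/2) * sum_{j<k} (x/2)**j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


# Statistics and degrees of freedom as reported in the annexure.
p_debt = chi2_sf_even_df(17.21, 6)
p_de = chi2_sf_even_df(22.10, 6)

if __name__ == "__main__":
    print(f"DEBT model: p = {p_debt:.4f}")
    print(f"DE model:   p = {p_de:.4f}")
```

Both values round to the reported 0.0085 and 0.0012. Since both are below conventional significance levels, the null hypothesis that the differences in coefficients are not systematic is rejected, which under the usual reading of the test favors the fixed effects specification.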
Modified Wald test for group-wise heteroscedasticity in fixed effects regression:
DEBT model
H0: sigma(i)^2 = sigma^2 for all i
Chi2(8) = 49.00
Prob > Chi2 = 0.000
Modified Wald test for group-wise heteroscedasticity in fixed effects regression:
DE model
H0: sigma(i)^2 = sigma^2 for all i
Chi2(8) = 1129.25
Prob > Chi2 = 0.000
Annexure 2: Diagnostic Tests
Test of normality for DEBT model: Shapiro–Wilk Test
H0 : The distribution is normal

Variable Obs. W V z Prob > z
lev 80 0.980 1.339 0.640 0.261
Test of normality for DE model: Shapiro–Wilk test
H0 : The distribution is normal

Variable Obs. W V z Prob > z
lev 80 0.990 0.682 −0.838 0.799
Tests of multi-collinearity: correlation matrix between explanatory variables
        gro       tang      pr        risk      size      lq
gro     1.000
tang    −0.246    1.000
pr      0.328     −0.100    1.000
risk    −0.101    0.043     −0.367    1.000
size    0.027     −0.213    0.449     −0.731    1.000
lq      0.306     −0.243    0.1429    0.238     −0.289    1.000
References
Abor J (2005) The effect of capital structure on profitability: empirical analysis of listed firms in
Ghana. J Risk Financ 6:438–445
Adesola WA (2009) Testing static tradeoff theory against pecking order models of capital structure
in Nigerian quoted firms. Glob J Soc Sci 8(1):51
Ahmed N, Ahmed Z, Ahmed I (2010) Determinants of capital structure: a case of life insurance
sector of Pakistan. Eur J Econ Financ Adm Sci 24:7–12
Amidu M (2007) Determinants of capital structure of banks in Ghana: an empirical approach.
Balt J Manag 2(1):67–79
Baker M, Wurgler J (2002) Market timing and capital structure. J Finance 57(1):1–32
Booth L, Aivazian V, Demirguc-Kunt A, Maksimovic V (2001) Capital structures in developing
countries. J Finance 56(1):87–130
Cassar G, Holmes S (2003) Capital structure and financing of SMEs: Australian evidence. Account
Finance 43(2):123–147
Dawood MHAK, Moustafa ESI, El-Hennawi M (2011) The determinants of capital structure in
listed Egyptian corporations. Middle East Finance Econ 9:83–99
Ebru Ç (2011) An empirical investigation on the determinants of capital structures of Turkish
firms. J Finance Econ 9:35–42
Gill A, Biger N, Pai C, Bhutani S (2009) The determinants of capital structure in the service
industry: evidence from United States. Open Bus J 2:48–53
Goldstein R, Ju N, Leland H (2001) An EBIT-based model of dynamic capital structure. J Bus 74
(4):483–512
Harris M, Raviv A (1991) The theory of capital structure. J Finance 46(1):297–355
Jensen MC (1986) Agency costs of free cash flow, corporate finance, and takeovers. Am Econ
Rev 76(2):323–329
Jensen MC, Meckling WH (1976) Theory of the firm: managerial behavior, agency costs and
ownership structure. J Financ Econ 3(4):305–360
Kindie AB (2011) Determinants of capital structure on Ethiopian insurance companies. Thesis
Addis Ababa University, Addis Ababa
Kumar MS, Dhanasekaran M, Sandhya S, Saravanan R (2012) Determination of financial capital
structure on the insurance sector firms in India. Eur J Soc Sci 29(2):288–294
Malhotra NK (2008) Marketing research: an applied orientation, 5th edn. Pearson Education India
Mezgebe M (2010) Assessment of the reinsurance business in Developing Countries: Case of
Ethiopia. MA thesis, Graduate School of Business, University of South Africa
Modigliani F, Miller MH (1958) The cost of capital, corporation finance and the theory of
investment. Am Econ Rev 48(3):261–297
Modigliani F, Miller MH (1963) Corporate income taxes and the cost of capital: a correction. Am
Econ Rev 53(3):433–443
Morri G, Beretta C (2008) The capital structure determinants of REITs. Is it a peculiar industry?
J Eur Real Estate Res 1(1):6–57
Myers SC (1984) The capital structure puzzle. J Finance 39(3):574–592
Najjar N, Petrov K (2011) Capital structure of insurance companies in Bahrain. Int J Bus Manag 6
(11):138
Noulas A, Genimakis G (2011) The determinants of capital structure choice: evidence from Greek
listed companies. Appl Financ Econ 21(6):379–387
Onaolapo AA, Kajola SO (2010) Capital structure and firm performance: evidence from Nigeria.
Eur J Econ Finance Adm Sci 25:70–82
Rajan RG, Zingales L (1995) What do we know about capital structure? Some evidence from
international data. J Finance 50(5):1421–1460
Sayeed MA (2011) The determinants of capital structure for selected Bangladeshi listed
companies. Int Rev Bus Res Pap 7(2):21–36
Shah A, Khan S (2007) Determinants of capital structure: evidence from Pakistani panel data. Int
Rev Bus Res Pap 3(4):265–282
Sharif B, Naeem MA, Khan AJ (2012) Firm’s characteristics and capital structure: a panel data
analysis of Pakistan’s insurance sector. Afr J Bus Manage 6(14):4939
Solomon MA (2012) Characteristics and capital structure: a panel data analysis from Ethiopian
insurance industry. Int J Commer Manag 3(12):21–27
Song HS (2005) Capital structure determinants: an empirical study of Swedish companies. CESIS
Electronic Working Paper Series 2005–25
Teker D, Tasseven O, Tukel A (2009) Determinants of capital structure for Turkish firms: a panel
data analysis. Int Res J Finance Econ 29:179–187
Titman S, Wessels R (1988) The determinants of capital structure choice. J Finance 43(1):1–19
Zeleke H (2007) Insurance in Ethiopia: historical development, present status and future
challenges. Master Printing Press, Addis Ababa, Ethiopia
Chapter 9
Income Distribution and Economic
Growth
Atnafu Gebremeskel
Abstract This paper links access to bank loans and income distribution to pro-
ductivity growth. Its main focus is on examining how functional income distribu-
tion can influence the evolution of productivity and thereby promote economic
growth. We obtained key variables and their evolution from the Ethiopian Central
Statistical Agency dataset on medium and large scale manufacturing firms. The
paper uses the evolutionary economics framework together with its evolutionary
econometric approach, which sees economic growth as an open-ended process. The
major finding of this paper is the lack of strong evidence that evolution
(intra-industry selection) fosters productivity growth and reallocation
(structural change). The employment share of each firm within an industry
entered the model with a negative and significant coefficient. In
economic terms, the positive and negative coefficients of labor share within a firm
and employment share of each firm within the industry give us important infor-
mation about structural changes within the manufacturing sector. The key policy
lesson is that access to bank loans is of great importance to firms. This is partic-
ularly so for industries such as spinning, tanning and publishing in which all firms
that had access to bank loans revealed movements in their employment shares. This
is evidence of structural transformation. It is desired that future research includes
economy-wide modeling, estimation and more formalization of evolutionary eco-
nomic models to study the link between access to bank loans and its effects on
income distribution and inclusive economic growth.

Keywords Income distribution · Evolutionary economics · Evolutionary econometrics · Productivity · Growth
A. Gebremeskel (✉)
e-mail: atnafuga@gmail.com

DOI 10.1007/978-981-10-4451-9_9
9.1 Introduction
Income distribution remains one of the few unanswered questions in economics.
Mincer (1958) observes that economists have long theorized about the nature and
causes of inequalities in personal incomes, whereas the vigorous development of
empirical research in the field of personal income distribution is of recent origin.
For nearly 200 years, Anglophone economics followed Ricardo (1815) and
conceived of distribution as referring to a functional role in economic production.
The functional approach to income distribution has survived a marginal revo-
lution in economics, an industrial revolution, the development of welfare eco-
nomics, the great depression, the advent of macroeconomics, the creation of a
welfare state, the mathematizing of neo-classical economics and several generations
of prominent economists arguing that economics should rightly be concerned with
the distribution of well-being across individuals and the erosion of the sharp class
divisions that gave Ricardo his distribution theory (Goldfarb and Leonard 2005).
While ‘who gets what’ refers to the personal distribution of income across
individuals, functional distribution concerns the distribution of income across
suppliers of productive factors. The distributive consequences and their wider
implications are more important than the causes.
Moreover, the emphasis of contemporary research has almost completely shifted
from a study of the causes of inequalities to the study of the facts and of their
consequences for various aspects of economic activities. Two such aspects are
productivity growth and economic growth.
The question of how inequalities are generated and how they evolve over time
has been a major concern of economics for more than a century. Yet, the rela-
tionship between inequalities and the process of economic development is far from
being an agreed area of research. In developing economies, it is a challenge for both
academic and policy circles. There is demand for academicians to investigate this
and it is an issue that also needs to be dealt with by policymakers.
Thus, income distribution should be studied not for its own sake but for its
wider implications for economic performance. Economic growth is one aspect of
that performance, and the growth-inequality linkage is both important and
controversial.
It is important because policymakers need to understand the way in which an
increase in output will be shared among different groups within an economy and the
constraints that this sharing may put on future growth. Its controversial aspects arise
from the fact that it has been difficult to reconcile the different theories, especially
since empirical evidence has been largely inconclusive (Cecilia 2010). For example,
Barro (1990) and Persson and Tabellini (1994) argue that moderate redistribution
promotes growth whereas a high degree of redistribution will have a negative
impact on growth.
The conventional textbook approach on the effect of inequality on growth is that
inequality is good for incentives and therefore good for growth, even though
incentive and growth considerations might be traded off against equity goals. On
the other hand, development economists have long expressed counter-arguments.
For example, Todaro (1997) provides four general arguments why greater
equality in developing countries may in fact be a condition for self-sustaining
economic growth: (a) dissaving and/or unproductive investments by the rich;
(b) lower levels of human capital held by the poor; (c) demand pattern of the poor
being more biased toward local goods; and (d) political rejection by the masses.
Overall, the view that inequality is necessary for accumulation and that redis-
tribution harms growth has faced challenges from many fronts. For example,
Alesina and Rodrik (1994) and Persson and Tabellini (1994) combine political
economy arguments with the traditional negative incentive effect of redistribution.
These authors maintain that inequality affects taxation through the political
process when individuals are allowed to vote in order to choose the tax rate (or,
equivalently, vote to elect a government whose programs include a certain redis-
tributive policy). If inequalities determine the extent of redistribution, then this will
have an indirect effect on the rate of growth of the economy.
In their paper ‘Social Conflict, Growth and Income Distribution,’ Benhabib and
Rustichini (1996) explore the effect of social conflict arising due to income dis-
tribution on both short-run and long-run economic growth rates. According to them,
despite the predictions of the neo-classical theory of economic growth, poor
countries have been observed to invest at lower rates and have not grown faster
than rich countries. They study how the level of wealth and the degree of inequality
affect growth and show how lower wealth can lead to lower growth and even to
stagnation when the incentives to domestic accumulation are weakened by redis-
tributive considerations.
Perotti (1996) contends that equality has a positive impact on growth while
Rehme (2006) argues that redistributing governments may have a relatively
stronger interest in technological advances or high economic integration. He
observes a positive association between redistribution and growth across countries.
While there is a vast literature on income inequalities and economic growth
similar to the studies mentioned earlier, it excludes the role of firms and the
mechanisms through which they create and shape the links between income
distribution and economic growth. However, the existence of firms and their
actions are recognized in economic theory.
Thus, our introduction of firms in such an analysis is not arbitrary. Firms play a
central role in shaping the path of economic theory and act as sources of growth in
the process of economic evolution. This argument is theoretically consistent with one
of the classic questions in economics, namely why firms exist (Coase 1937). Thus,
any analysis which omits the role of firms in the creation and evolution of income
distribution in the growth process cannot give a complete description. More
specifically, empirical evidence
on how firms’ financial structures can influence their productivity and thereby drive
economic growth is scarce. This study bridges this gap.
Two crucial questions with policy relevance arise. The first is whether
inequality is a prerequisite for growth. The second concerns the effects of
growth-promoting policies on inequalities, and in particular the circumstances
under which a conflict between the two objectives may emerge.
Thus, our paper takes firms as a hub for generating macroeconomic regularities.
Firms generate a link between sources and uses of funds, productivity, income
distribution and structural transformation in the market process. We explore the
dependence of macroeconomic productivity growth on firm-level productivities.
We examine how firms’ access to bank loans can influence an aggregate rate of
growth. Growth in productivity, output and employment is determined mutually
and endogenously. More specifically, this paper answers the following questions:
(a) How do firm-level sources and use of funds (investments from bank loans)
influence economic growth?
(b) Does access to bank loans affect intra- and inter-firm reallocation of labor?
(c) Can we find evidence of structural change, that is, reallocation of labor from
less productive to more productive industries?
(d) Can we draw some theoretical results and what policy lessons can we draw
from this?
The rest of the paper is organized as follows. Section 9.2 discusses economic
growth theories. Section 9.3 deals with evolutionary economics and economic
growth from an evolutionary perspective. Section 9.4 discusses econometric
modeling in the presence of evolutionary change; it also presents empirical evi-
dence and is followed by Sect. 9.5 which presents empirical results from Ethiopia.
Section 9.6 gives a conclusion.
9.2 Theory of Economic Growth
Economic growth is a dominant area of theoretical and empirical research in
economics in general and in macroeconomics in particular. For example, Nelson
(1996: 7) points out that from the beginning of modern economics as a field of
study, economic growth has often, though intermittently, been the central area of
inquiry. During the early decades, Hahn and Matthews (1964) presented the most
comprehensive survey of the contributions that had been made to the theory of
economic growth beginning with Harrod’s article in 1939. Salavadori (2003)
emphasizes that an interest in the study of economic growth has experienced
remarkable ups and downs in the history of economics. It was the central issue in
classical political economy from Adam Smith to David Ricardo, and then in the
critique by Karl Marx (Nelson 1996; Salavadori 2003).
Then, the growth theory waned (Nelson 1996) and moved to the periphery
during the so-called marginal revolution (Salavadori 2003). Undoubtedly, one of
the reasons for this was that formal theory had developed which focused on market
equilibria. The concern was with what lay behind demand and supply curves and
how these jointly determined the observed configuration of outputs, inputs and
prices. The troubled economic times after World War I, in particular the great
depression, also pulled the attention of economists toward analyzing shorter-run
phenomena such as balance of payment disequilibria, inflation and unemployment.
There was a renaissance of interest in long-run economic growth after World
War II. One reason for this was that new national product data became available,
first for the USA and later for other advanced industrial nations. This for the first
time allowed
economists to measure economic growth at the national level (Nelson 1996).
In modern times, the starting point for any study of economic growth is the
neo-classical growth model which emphasizes the role of capital accumulation.
This model, first constructed by Solow (1956) and Swan (1956), shows how eco-
nomic policy can raise an economy’s growth rate by inducing people to save more.
But the model also predicts that such an increase in growth cannot last indefinitely.
In the long run, a country’s growth rate will revert to the rate of technological
progress, which neo-classical theory takes as being exogenous. Underlying this
long-run result is the principle of diminishing marginal productivity which puts an
upper limit on how much output a person can produce simply by working with
more and more capital given the state of technology. Aghion and Howitt (1992,
1998) provide a presentation on this.
9.2.1 The Neo-Classical Growth Theory
In the neo-classical framework, the notion of growth as increased stocks of capital
goods was codified as the Solow–Swan growth model, which involves a series of
equations that show the relationship between output, labor time, capital and
investment. This was the first attempt to model long-run growth analytically.
According to this theory, the role of technological changes was crucial and even
more important than the accumulation of capital.
This theory assumes that countries use their resources efficiently and that there
are diminishing returns to capital and labor. From these two premises, the
neo-classical model makes three important predictions: first, increasing capital
relative to labor creates economic growth since people can be more productive
given more capital. Second, poor countries with less capital per person grow faster
because each investment in capital produces a higher return than in rich countries
with ample capital. Third, because of diminishing returns to capital, economies
eventually reach a point where any increase in capital no longer creates economic
growth.
The model also notes that countries can overcome this steady state and continue
growing by inventing new technologies. In the long run, output per capita depends
on the rate of saving, but the rate of output growth should be the same for any
saving rate. In this model, the process by which countries continue growing despite
diminishing returns is ‘exogenous’ and represents the creation of new technology
that allows production with fewer resources. As technology improves, the steady
state level of capital increases and the country invests and grows.
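The steady-state argument above can be illustrated with a minimal numerical sketch. It assumes a Cobb-Douglas production function with output per worker y = k^alpha, a constant saving rate s, population growth n and depreciation delta. The functional form, the parameter values and the function names are all assumptions made for illustration and do not come from this chapter. Technological progress is switched off so that the effect of diminishing returns is visible on its own.

```python
"""Minimal Solow-Swan sketch with Cobb-Douglas technology (illustrative values only)."""


def steady_state_capital(s, n, delta, alpha):
    """Capital per worker k* at which investment s*k**alpha equals (n + delta)*k."""
    return (s / (n + delta)) ** (1.0 / (1.0 - alpha))


def simulate_capital(k0, s, n, delta, alpha, periods):
    """Iterate k_{t+1} = k_t + s*k_t**alpha - (n + delta)*k_t and return the path."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append(k + s * k ** alpha - (n + delta) * k)
    return path


# Hypothetical parameter values, chosen only for illustration.
ALPHA, N, DELTA = 0.3, 0.01, 0.05

low_saving = simulate_capital(k0=1.0, s=0.2, n=N, delta=DELTA, alpha=ALPHA, periods=1000)
high_saving = simulate_capital(k0=1.0, s=0.3, n=N, delta=DELTA, alpha=ALPHA, periods=1000)

# Output per worker y = k**alpha settles at a higher level when saving is higher ...
y_low, y_high = low_saving[-1] ** ALPHA, high_saving[-1] ** ALPHA

# ... but growth of capital per worker dies out in both cases without technical progress.
growth_low = low_saving[-1] / low_saving[-2] - 1.0
growth_high = high_saving[-1] / high_saving[-2] - 1.0

if __name__ == "__main__":
    print(f"steady-state output per worker: s=0.2 -> {y_low:.3f}, s=0.3 -> {y_high:.3f}")
    print(f"late-period growth rates: {growth_low:.2e}, {growth_high:.2e}")
```

In this sketch a higher saving rate raises the long-run level of output per worker but not its long-run growth rate, which falls to zero in both runs. Sustained growth in output per worker would require an exogenous technology term, which is the role the neo-classical model assigns to technological progress.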
The strengths of the neo-classical approach for economic growth are consider-
able. The neo-classical theory has provided a way of thinking about the factors
behind long-run economic growth in individual sectors and in the economy as a
whole. The theoretical structure has called attention to historical changes in factor
proportions and has focused analysis on the relationship between those changes
and factor prices. These key insights and the language and formalism associated
with them have served to effectively guide and to give coherence to research that
has been done by many different economists around the globe. The weakness of the
theoretical structure is that it provides a grossly inadequate vehicle for analyzing
technical change.
The fundamental problems with neo-classical explanations of economic growth
are: (1) despite much empirical effort devoted to the neo-classical production
function, the model still faces problems in explaining considerable inter-plant and
international differences in productivity as well as differences between developed
economies.
Even more striking is evidence for single industries, showing big sectoral pro-
ductivity gaps between different countries (Hodgson 1996); and (2) increasing
capital creates a growing burden of depreciation. It is also noted that the economic
life of capital assets has been declining. In particular, the orthodox formulation
offers no possibility of reconciling analyses of growth undertaken at the level of the
economy or the sector with what is known about the processes of technical changes
at the microeconomic level. Hodgson (1996) has a detailed account of this and
similar arguments.
9.2.2 The Endogenous Growth Theory
In response to some of the problems in the standard neo-classical growth theory, the
idea of an endogenous growth theory emerged in the works of Romer (1986, 1987,
1990, 1994), Lucas (1988) and a second generation variant pioneered by Aghion
and Howitt (1992, 1998). They developed the endogenous growth theory which
includes a mathematical explanation of technological advancement.
This broke from the preceding neo-classical thinking by encompassing learning
by doing and knowledge spillover effects. In these models, cumulative divergence
of national output and productivity becomes more likely than convergence and thus
seems to correspond more adequately to available data.
However, the amended aggregate production function is still at the conceptual
foundation of the endogenous growth models, typically embodying features such as
increasing marginal productivity of knowledge but diminishing returns in the
production of knowledge (Hodgson 1996).
Therefore, overall, there are constant returns to capital and economies never
reach a steady state. Growth does not slow as capital accumulates, but the rate of
growth depends on the type of capital that a country invests in. Research done in
this area has focused on what increases human capital (for example, education) or
technological change (for example, innovation).
9.3 Economics as an Evolutionary Science and Economic
Growth from an Evolutionary Perspective
9.3.1 Why an Evolutionary Approach in Economics?
The basic paradigm in mainstream economic theory, namely that individuals take
decisions in isolation using only the information received through some general
market signals such as prices, is built on the general equilibrium model. However,
as is well known, this model guarantees neither stability nor uniqueness of
equilibrium. Since the latter is essential for macroeconomists who wish to use
comparative statics, they have had to avoid this fundamental problem by resorting to
what has become the standard paradigm in modern macroeconomics, that is, the
representative agent (RA) framework.
The basic assumption is that the behavior of the aggregate can be treated as the
behavior of an average individual. The use of such an approach has been frequently
contested and has several obvious disadvantages. Firstly, it means that one has to
ignore communication and direct interaction among agents and ultimately defines
away the problem of coordination (Hahn and Solow 1995; Leijonhufvud 1992). In
this setting, interaction and coordination occur only through prices. The role of
prices is undoubtedly important, but the price mechanism alone can work only if
information is complete; in such a case, one can ignore the influence of other
coordination and interaction mechanisms. Here, again, these difficulties can be
sidestepped by assuming that a sector of the economy can be described by an RA.
There is no simple, direct, correspondence between individual and aggregate
regularities. It may be that in some cases, aggregate choices correspond to those that
can be generated by an individual. However, even in such exceptional cases, the
individual in question cannot be thought of as maximizing anything meaningful
from the point of view of society’s welfare. Our approach is exactly the opposite
of the representative individual approach. Instead of trying to impose restrictions
on aggregate behavior, by using, for example, the first-order conditions obtained
from the maximization program of the representative individual, the claim is that
the structure of aggregate behavior (macro) actually emerges from the interaction
between the agents (micro). In other words, statistical regularities emerge as a
self-organized process at the aggregate level: complex patterns of interacting
individual behavior may generate a certain regularity at the aggregate level. The
idea of representing a society by one exemplar denies the fact that the organiza-
tional features of the economy play a crucial role in explaining what happens at the
aggregate level.
In the RA framework, the way in which markets are organized is assumed to have
no influence on aggregate outcomes. Thus, aggregate behavior, unlike that of
biological or physical systems, can be reduced to that of a glorified individual. Such
an idea has, as a corollary, the notion that collective and individual rationality are
similar. What we suggest is that collective outcomes be thought of as a result of an
interaction between agents who may have rather simple rules of behavior and who
may adapt rather than optimize. Once one allows for direct interaction among
agents, macrobehavior cannot, in general, be thought of as reflecting the behavior of
a ‘typical’ or ‘average’ individual.
The key assumption behind the construction of the aggregate production func-
tion is that all factor markets are perfect in the sense that individuals can buy or sell
as much as they want at a given price. With perfect factor markets (and no risk), the
market must allocate the available supply of inputs to maximize total output
(discussed extensively in Gatti et al. 2007 and the literature cited there).
Evolutionary theory in economics is as old as economics itself. It was pioneered
by Veblen (1898) when he asked, ‘Why is economics not an evolutionary science?’
and suggested that the only rational approach for economists was to assume that
economies evolve. Otherwise, he argued, we can describe an economy but have no
effective theory of change and development.
Veblen started his argument by asserting that all modern sciences are evolu-
tionary sciences (1898: 374) while Alchian (1950) brought out the evolutionary
approach as an alternative framework in economics. He started by suggesting a
modification of economic analyses to incorporate incomplete
information and uncertain foresight as axioms. In the words of Alchian, this
approach dispensed with ‘profit maximization’ and it did not rely on predictable
individual behavior that is usually assumed as a first approximation in standard
textbook treatment.
The suggested approach embodies the principles of biological evolution and
natural selection by interpreting economic systems as an adaptive mechanism which
chooses among exploratory actions generated by the adaptive pursuit of ‘success’ or
‘profit.’
Krugman (1996) argues that economics is about what individuals do: not
classes, not ‘correlations of forces’ but individual actors. This is not to deny the
relevance of higher levels of analyses, but they must be grounded in individual
behavior. Methodological individualism is of the essence. He further notes that
individuals are self-interested. He extends his argument by saying that there was
nothing in economics that inherently prevented us from allowing people to derive
satisfaction from others’ consumption, but the predictive power of economic theory
came from the presumption that normally people care about themselves.
Individuals are intelligent; they do not neglect obvious opportunities for gain. It
is often asserted that economic theory draws its inspiration from physics, and that it
should become more like biology. If that is what you think, you should do two
things. First, read a text on evolutionary theory, like John Maynard Smith’s
Evolutionary Genetics. You will be startled at how much it looks like a textbook on
microeconomics. Second, try to explain a simple economic concept, like supply and
demand, to a physicist. You will discover that our whole style of thinking, of
building up aggregative stories from individual decisions, is not at all the way they
think (Krugman 1996). Veblen and Krugman’s suggestion is that ‘evolutionary
economics is the only rational proposition’ (Boulton 2010).
The renaissance in evolutionary economics over the past two decades has brought
with it many theoretical developments and interdisciplinary imports (Dopfer
and Potts 2004).
Inspired by Veblen’s theory, evolutionary economics has become one alternative
approach to economic analysis involving complex economic interactions.
Contributions include Nelson’s (1974) Neo-classical vs Evolutionary Theories of
Economic Growth: Critique and Prospectus. More importantly, Richard Nelson
and Sidney Winter’s seminal work An Evolutionary Theory of Economic Change
(1982), Dopfer’s The Evolutionary Foundations of Economics (2005) and
Beinhocker’s The Origin of Wealth: Evolution, Complexity and the Radical
Remaking of Economics (2006) are advancements in the theory of evolutionary
economics.
The questions to be answered before using an evolutionary theoretical frame-
work to understand how economies grow are: What is evolutionary economics?
Why evolutionary economics? What are the theoretical foundations of evolutionary
economics? Where do economies come from? (Beinhocker 2006). How do the
behaviors, relationships, institutions and ideas that underpin an economy form, and
how do they evolve over time?
Beinhocker has argued that questions about origins play a prominent role in most
sciences. Just as it would be difficult to imagine modern cosmology without the
Big Bang or biology without evolution, it would be hard to believe that economics
could ever truly succeed as a science if it were not able to answer the question
‘Where do economies come from?’
Yet, the question about the origin of economies has not played a central role in
traditional economics which has tended to focus on how an economy’s output is
allocated rather than how it got there in the first place. The process of economy
formation presents us with a first-class scientific puzzle and one of the sharpest
distinctions between traditional economics and what is described as Complexity
Economics (Beinhocker 2006).
But what is evolution in economic science? A relatively narrow definition of
evolution is change in the mean characteristics of a population (Andersen 2004).
Economic growth, that is, the aggregate change in real output per person, is a
consequence of increasing the productivity of the factors of production and of
technological changes in a very wide sense. For a constant participation rate, it can
be modeled as a change in firm-level mean real output per employee weighted by
the firm’s employment share in the total number of firms in the economy. In Holm
(2014) this is referred to as the evolution of labor productivity.
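The weighted-mean idea in the paragraph above can be sketched in a few lines. The firm records below are hypothetical, and the weights are taken to be each firm's share of total employment, which is an assumption about how the weighting is meant to work rather than the exact procedure of Holm (2014).

```python
def aggregate_productivity(firms):
    """Employment-share-weighted mean of firm-level output per employee.

    firms: list of (real_output, employees) tuples for one period.
    """
    total_emp = sum(emp for _, emp in firms)
    return sum((emp / total_emp) * (out / emp) for out, emp in firms)


# Hypothetical data for three firms in two consecutive periods.
period_0 = [(100.0, 10), (300.0, 20), (90.0, 30)]
period_1 = [(120.0, 10), (400.0, 25), (75.0, 25)]

z0 = aggregate_productivity(period_0)
z1 = aggregate_productivity(period_1)

# Evolution of labor productivity: change in the population mean.
growth_rate = (z1 - z0) / z0
print(round(z0, 4), round(z1, 4), round(growth_rate, 4))
```

With employment-share weights the weighted mean equals total output divided by total employment, so the change in the mean can come either from firms becoming more productive or from employment moving toward more productive firms.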
The key ideas of evolutionary theory are that firms at any time are viewed as
possessing various capabilities, procedures and decision rules that determine what
they do given external conditions. They also engage in various ‘search’ operations
whereby they discover, consider and evaluate possible changes in their ways of
doing things. Firms whose decision rules are profitable, given the market
environment, expand; those whose rules are unprofitable contract. The market
environment surrounding individual firms may be in part endogenous to the behavioral
186 A. Gebremeskel
system taken as a whole; for example, product and factor prices may be influenced
by the output of the industry and the demand for inputs (Nelson and Winter 1982).
According to Holm (2014), economic evolution is an open-ended process of
novelty generation and the reallocation of resources. Selection is the sorting of a
population of agents (firms) that is implicit in their differential growth rates. Firms
innovate and develop knowledge in attempts to gain decisive competitive
advantages over competitors, but firms are intentionally rational agents with
limited information, so innovation and, more generally, learning may also lead to
decreased productivity. Firms prosper or decline as a result of the interaction
between their own learning activities, the learning activities of competitors and the
external factors that set the premises for the interaction. More on this can be found in
Dosi and Nelson (2010) and Metcalfe (1998). Safarzyńska (2010) also provides an
excellent survey.
Holm (2014) explores how the evolution of productivity, or of any other
characteristic in a population of firms, can be described. According to him, evolution
can be understood as the sum of two effects, which are referred to by different names
in the literature: the inter-firm (reallocation or selection) effect and the intra-firm
(learning or innovation) effect. To these, the effects of entry and exit are added, but
insofar as entry is the introduction of new knowledge by entrepreneurs and exit is
the disappearance of an inferior firm, these effects are also learning and selection.
As a stylized depiction, Holm (2014) expresses economic evolution as the total effect
of selection, learning, entry and exit.
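A decomposition of this kind can be illustrated with a small sketch. The formula used below is one exact accounting identity chosen for illustration; it is not claimed to be the specific expression in Holm (2014), and the firm identifiers, shares and productivity levels are hypothetical.

```python
def decompose(before, after):
    """Split the change in share-weighted mean productivity into four parts.

    before, after: dict firm_id -> (share, productivity); shares sum to 1
    within each period. Firms present only in `after` are entrants, firms
    present only in `before` are exits.
    """
    z0 = sum(s * z for s, z in before.values())
    z1 = sum(s * z for s, z in after.values())
    stayers = before.keys() & after.keys()

    # Selection: share changes of continuing firms, valued at their
    # initial productivity relative to the initial mean.
    selection = sum(
        (after[i][0] - before[i][0]) * (before[i][1] - z0) for i in stayers
    )
    # Learning: within-firm productivity changes, weighted by new shares.
    learning = sum(after[i][0] * (after[i][1] - before[i][1]) for i in stayers)
    # Entry and exit, both measured against the initial mean.
    entry = sum(s * (z - z0) for i, (s, z) in after.items() if i not in before)
    exit_ = -sum(s * (z - z0) for i, (s, z) in before.items() if i not in after)

    return {"total": z1 - z0, "selection": selection, "learning": learning,
            "entry": entry, "exit": exit_}


before = {"A": (0.5, 10.0), "B": (0.3, 6.0), "C": (0.2, 4.0)}
after = {"A": (0.6, 11.0), "B": (0.2, 6.5), "D": (0.2, 9.0)}
parts = decompose(before, after)
print({name: round(value, 4) for name, value in parts.items()})
```

In this example the exit of the low-productivity firm C raises the mean, which is the sense in which exit can be read as a form of selection.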
Whereas inter-firm selection is driven by the process of competition,
inter-industry selection is driven by the process of structural change, which is
somewhat different. Productivity understood as physical efficiency is important in
competition among firms that produce homogeneous products, for example,
within industries. This is less the case with heterogeneous outputs, for which
computing physical efficiency does not make sense. Moreover, as the composition of
demand changes over time, not least as a consequence of economic growth itself,
relative prices change as well, and this affects inter-industry selection (Holm 2014).
Holm has emphasized the importance of indicating the basic differences between
standard growth theories and growth theories in evolutionary economics.
Evolutionary economists (for example, Richard Nelson, Eric Beinhocker, Geoffrey
Hodgson and John Foster) strongly argue that an evolutionary framework is more
encompassing than standard approaches. Carlsson and Eliasson (2003) note that
economic growth can be described at the macrolevel but never explained at that
level. In their view, economic growth is basically a result of experimental project
creation and selection in dynamic markets and in hierarchies, and of the capacity of
the economic system to capture winners and losers. Castellacci (2007) reviews the
evolution of evolutionary theories in economics; his contrast with new growth
theories is presented in Table 9.1.
Metcalfe et al. (2006) explored an evolutionary theory of adaptive growth. They
viewed economic growth as a product of structural change and economic
self-transformation, based on processes that are closely connected with, but not
reducible to, the growth of knowledge.
Table 9.1 Contrast between new growth theories and evolutionary growth

| Issues | New growth theories | Evolutionary theories |
|---|---|---|
| What is the main level of aggregation? | Aggregate models based on neo-classical micro-foundations (methodological individualism) | Toward a co-evolution between micro-levels and macrolevels of analysis (‘non-reductionism’) |
| Representative agent or heterogeneous individuals? | Representative agent and typological thinking | Heterogeneous agents and population thinking |
| What is the mechanism of creation of innovation? | Learning by doing and searching activity by: the R&D sector; radical innovations; and general purpose technologies | Combination of various forms of learning with radical technological and organizational innovations |
| What is the dynamics of the growth process? How is history conceived? | History is a uniform-speed transitional dynamics | Toward a combination of gradualist and dynamics: history is a process of qualitative change and transformation |
| Is the growth process deterministic or unpredictable? | ‘Weak uncertainty’ (computable risk): stochastic but predictable process | ‘Strong’ uncertainty: non-deterministic and unpredictable process |
| Toward equilibrium or never ending | Toward the steady state | Never ending and ever changing |

The dominant connecting theme is enterprise, the innovative variations it generates
and the multiple connections between investment, innovation, demand and
structural transformation in the market process. Metcalfe and Foster (1998)
explored the dependence of macroeconomic productivity growth on the diversity of
technical progress functions and income elasticities of demand at the industry level
and the resolution of this diversity into patterns of economic change through market
processes. They show how industry growth rates are constrained by higher-order
processes of emergence that convert an ensemble of industry growth rates into an
aggregate rate of growth. The growth in productivity, output and employment is
determined mutually and endogenously, and its value depends on variations in the
primary causal influences in the system.
9.3.2 Econometric Modeling in the Evolutionary Economic Framework
Evolutionary economics in general and evolutionary econometrics in particular are
not an arbitrary choice. They are both relevant and have theoretical foundations.
The theoretical basis for such modeling is drawn from a self-organization
approach and analyzed using the logistic diffusion growth model.
Evolutionary economics and the subsequent developments of its estimation
techniques have enabled researchers to explore the advantages of evolutionary
economics. This methodology is offered to construct an econometric model in the
presence of a structural change of an evolutionary type. In its various approaches,
evolutionary economics has been concerned with economic processes that arise
from systems which are subject to on-going structural changes in historical time.
Foster and Wild (1999a) identified three characteristics that all evolutionary rep-
resentations of economic processes seem to share:
1. A system that is undergoing a cumulative process of structure building, which
results in increasing organization and complexity, cannot easily reverse its
structure;
2. In the face of this time irreversibility, structure can change in non-linear and
discontinuous ways in the face of exogenous shocks, particularly when the
relevant evolutionary niche is filled; and
3. An evolutionary process of on-going structural changes introduces an increasing
degree of fundamental uncertainty. Thus, a great deal of structure building
involves the installation of protective repair and maintenance sub-systems.
Based on this discussion of evolutionary economics and the underlying theory
of functional income distribution and its implications for economic outcomes such
as productivity growth, our study tests whether there is any indication of structural
transformation. This is achieved by investigating the evolution of key variables, that
is, the evolution of employment share, market share and output share at the industry
level and the evolution of productivity growth. This is done in two ways: first, by
developing and estimating an evolutionary econometric model to learn whether
there is an indication of evolution and, second, by conducting a graphical
simulation.
Based on this background, we use the logistic diffusion equation (LDE) offered by
Foster and Wild (1999b) as a theory of historical process. Mathematically, it is rooted
in the Bernoulli differential equation of the type shown in Annexure 1. The last line
of that equation is a logistic differential equation of first order (LDEFO). Based on
it, Foster and Wild (1999b) developed an econometric model in the presence of
evolutionary change as:

$$\frac{dX}{dt} = bX\left(1 - \frac{X}{K}\right) \tag{9.1}$$
In Eq. 9.1, b is the net firm entry-exit rate or diffusion coefficient (net in the sense
that it allows for deterioration or deaths), and K is the carrying capacity of the
environment, for example, the total market size, employment or output of an industry
or economy, of which each firm competes to capture as much as it can. K is a
constraint: for example, if K is the total sales of an industry and X is a firm’s sales,
then X/K is the firm’s market share.
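As a worked illustration, assume Eq. 9.1 takes the standard logistic form dX/dt = bX(1 − X/K) with b and K constant, and let X_0 be an initial value between 0 and K (X_0 is notation introduced here, not in the source). Under these assumptions the equation has the closed-form solution

$$X(t) = \frac{K}{1 + \dfrac{K - X_0}{X_0}\, e^{-bt}}, \qquad \frac{X(t)}{K} \to 1 \ \text{as} \ t \to \infty .$$

The share X/K then follows an S-shaped path: growth is close to exponential at rate b while X is small relative to K, is fastest at X = K/2 and fades out as X approaches the carrying capacity.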
Two points must be raised about Eq. 9.1. First, X/K can be understood as any
share. If we are to work at the macrolevel, we may interpret X/K as the ratio of GDP
to capital stock. This ratio is less than 1 because at any point in time the total
national output is some fraction of inputs, the magnitude of the fraction depending
on the productivity of the economy.
Equation 9.1 can be expanded to employ the existing econometric framework
for estimation. Foster and Wild (1999b) acknowledge that the application of
an LDE of this type has been common in the literature on the economics of innovation,
following Griliches’s (1957) pioneering work. However, economists have tended to
view the LDE in terms of disequilibrium adjustment from one stable equilibrium state
to another rather than in terms of evolutionary growth theory.
As it stands, Eq. 9.1 depicts a smooth process that approaches its limit only as time
tends to infinity. Only in a discrete-interval version of the LDE can we generate the
kinds of discontinuities that we see in historical data. However, discrete-interval
dynamics are not pronounced features of most aggregated economic data. Thus, it is
unlikely that we can generate a discontinuity endogenously in most cases.
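The contrast between the smooth continuous-time path and a discrete-interval version can be sketched numerically. The code below iterates the difference equation X_t = X_{t-1} + b X_{t-1}(1 − X_{t-1}/K); the parameter values (b = 0.5 and b = 2.8, K = 1, starting value 0.05) are arbitrary choices for illustration and are not taken from the source.

```python
def simulate(b, K=1.0, x0=0.05, steps=60):
    """Iterate X_t = X_{t-1} + b * X_{t-1} * (1 - X_{t-1} / K)."""
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        path.append(x + b * x * (1.0 - x / K))
    return path


def is_monotone(path):
    """True if the path never decreases from one period to the next."""
    return all(later >= earlier for earlier, later in zip(path, path[1:]))


smooth = simulate(b=0.5)   # small b: S-shaped path rising toward K
erratic = simulate(b=2.8)  # large b: the path overshoots K and keeps jumping

print(is_monotone(smooth), round(smooth[-1], 6))
print(is_monotone(erratic), round(max(erratic), 3), round(min(erratic), 3))
```

For a small diffusion coefficient the discrete path looks like the continuous logistic curve, while for a large one the same equation produces irregular jumps around K with no exogenous shock at all.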
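Now, it is convenient for the purposes of an econometric investigation to rearrange
the discrete-time version of Eq. 9.1 to obtain the Mansfield (1981) variant
employed in many such studies. Dividing both sides by X_{t-1} and approximating
the proportional change in X by the difference in logarithms, we arrive at:

$$
\begin{aligned}
X_t - X_{t-1} &= bX_{t-1}\left(1 - \frac{X_{t-1}}{K}\right) + u_t \\
\ln X_t - \ln X_{t-1} &= b - \frac{bX_{t-1}}{K} + e_t, \quad \text{where } e_t = \frac{u_t}{X_{t-1}}
\end{aligned}
\tag{9.2}
$$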
The logarithmic approximation in the second line of Eq. 9.2 allows the logistic
equation to be estimated linearly, and the rescaled error term corrects for the bias
caused by the upward drift in the mean of the X-series.
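A minimal sketch of this linear estimation follows, using the log-difference form ln X_t − ln X_{t-1} = b − (b/K) X_{t-1} + e_t. The series is synthetic, generated from that form with assumed values b = 0.3 and K = 100 and no disturbance term, so the regression recovers the parameters almost exactly; with real data the estimates would of course be noisy. The helper `ols_line` is a hypothetical name introduced here.

```python
import math


def ols_line(x, y):
    """Closed-form OLS for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic series built from the log-difference form (assumed b and K).
b_true, K_true = 0.3, 100.0
X = [5.0]
for _ in range(40):
    X.append(X[-1] * math.exp(b_true - b_true * X[-1] / K_true))

growth = [math.log(X[t]) - math.log(X[t - 1]) for t in range(1, len(X))]
lagged_level = X[:-1]

intercept, slope = ols_line(lagged_level, growth)
b_hat = intercept            # the intercept estimates b
K_hat = -intercept / slope   # the slope estimates -b/K, so K = -b / slope
print(round(b_hat, 4), round(K_hat, 2))
```

The carrying capacity is not estimated directly: it is backed out from the ratio of the intercept to the slope, so its precision depends on both coefficients.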
Equation 9.2 offers a representation of the endogenous growth of a
self-organizing system subject to time irreversibility and constrained by boundary
limits. To come up with the complete econometric model, Foster and Wild qualified
their argument in the following ways:
(a) Regulation in the economic system can restrict economic agents and their
organizations to particular market niches. This means, again, that the principle
of competitive exclusion is significantly weakened. For example, governments
restrict the issue of bank licenses, which preserves a niche which non-bank
financial institutions have difficulty entering. Typically, competition in the
economic sphere is overlaid by ‘public interest’ regulations that attempt to limit
competition;
(b) Economic sub-systems rely on an interaction with the wider economic system
in order to engage in trade. Thus, the K limit for a particular system will tend to
rise continually in line with the general expansion of economic activity; and
(c) Increasing politicization of an economic system will lead to more predator–prey
type interactions. This will tend to occur in saturation phases of LD growth.
Thus, we do not always witness smooth transitions from one LD growth path
to another but, instead, Schumpeterian ‘creative destruction’, dominated by
conflict and discontinuous dissipation of an accumulated structure (that is, a
rapid fall in K).
Taking into account these qualifications, we arrived at the following LDE which
is suitable for application in economics:

$$\ln X_t - \ln X_{t-1} = [b()]\left[1 - \frac{X_{t-1}}{K()} - a()\right] + e_t \tag{9.3}$$
Thus, b and K are now themselves functions of other variables. The function b()
allows for factors that affect the diffusion coefficient, rendering it non-constant over
time, and K() takes into account the factors in the greater system that expand or
contract the capacity limit faced by the system in question. The resource
competition term, a(), is now a more general functional relationship than a simple
mechanism and may contain, for example, relative prices, existing demand for a
particular product and the general economic conditions in the environment.
A potential problem with Eq. 9.3 is that as X tends to its limit, growth in X will
tend to zero so that the impact of factors in b() will also tend to zero. This is
unlikely to be the case, so it is more appropriate to allow exogenous variables that
affect the diffusion rate to influence the rate of growth of X with the same strength at
all points on the logistic diffusion:

$$\ln X_t - \ln X_{t-1} = [b()]\left[1 - \frac{X_{t-1}}{K()} - a()\right] + b() + e_t \tag{9.4}$$
As it stands, Eq. 9.4 could be viewed as a disequilibrium process tending to an
equilibrium defined in terms of K() and a(). However, such an equilibrium
interpretation differs from that in conventional usage. The non-stationary process
modeled by Eq. 9.4 represents neither a mean reversion process in the presence of a
deterministic trend, nor a co-integrated association between X and variables in K()
and a(), in the presence of a stochastic trend.
The stationary state to which the logistic trajectory tends is the limit of a
cumulative, endogenous process, not a stable equilibrium outcome of an unspecified
disequilibrium mechanism following an exogenous shock. The functions K()
and a() allow for measurable shocks to the capacity limit, and b() encompasses the
effect of exogenous shocks which alter the diffusion rate.
One final development is necessary. Although an equilibrium correction
mechanism is inappropriate in this type of a model, homeostasis will occur in the
short period around what can be viewed as a moving equilibrium.
Equation 9.4 relates to the momentum of a process and, as such, some path
dependence is likely to exist in the sense that the system in question will still have a
(decelerating) velocity even if all endogenous and exogenous forces impinging on
the system cease to have an effect.
This is likely to be stronger the more non-stationary the variable in question is
and the shorter the observation interval. Imposing a simple AR(1) process, we get:

$$\ln X_t - \ln X_{t-1} = [b()]\left[1 - \frac{X_{t-1}}{K()} - a()\right] + b() + c\left(\ln X_t - \ln X_{t-1}\right)_{t-1} + e_t \tag{9.5}$$
In conventional treatments of path dependence in time-series data, constructs
like the ‘partial adjustment hypothesis,’ concerning the presumed disequilibrium
movements of levels of variables, are used to rationalize the use of lagged
dependent variables. Inclusion of a lagged dependent variable requires upward
revision of the estimated coefficients on explanatory variables in order to obtain
their ‘equilibrium’ values. Here, the interpretation is different, but related. Instead of
viewing a lagged dependent variable as evidence of sluggishness, we view its
presence in our growth specification as evidence of momentum in the process
(Foster and Wild 1999b). In Eq. 9.5, we can note that the left hand side is
equivalent to the growth rate of series X. In our paper, it could be the growth rate of
productivity.
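The rescaling of coefficients mentioned above can be made explicit with a short sketch. Write g_t for the left-hand side of Eq. 9.5 and f_t for all right-hand-side terms other than the lagged growth term and the error (g_t and f_t are shorthand introduced here, not the source's notation), so that Eq. 9.5 reads g_t = f_t + c g_{t-1} + e_t. If f_t is held at a constant value f and |c| < 1, growth settles at

$$g^{*} = \frac{f}{1 - c},$$

so a coefficient θ on an explanatory variable inside f_t has a long-run effect of θ/(1 − c) on the growth rate. With 0 < c < 1 this exceeds θ, which is the upward revision referred to in the conventional treatment.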
9.3.3 Empirical Evidence of Evolutionary Econometrics
Empirical literature on evolutionary economics is scarce. However, there are some
works which focus on the macrolevel, for example, Foster (1992, 1994) and
Hodgson (1996).
Foster (1992) offered a new perspective on the determination of sterling M3
using econometric modeling in the presence of evolutionary change. First, he
obtained a logistic diffusion model from the first-order differential equation. Next,
he modeled the evolution of M3 in a log-linear specification in the form of
evolutionary econometrics. He noted ordinary least squares (OLS) and recursive least
squares (RLS) as the favored estimation methods in such a setting. He estimated the
model on data for 1963–1988 obtained from the UK monetary authority. He concluded
that it was possible to understand the determination of M3 by viewing it as a money
supply, rather than a money demand, magnitude that is the outcome of a historical
process. Such a process has been modeled as institutionally driven and subject to
evolutionary change.
In Foster (1994), we also find an evolutionary macroeconomic approach
stressing institutional behavior, used to estimate a model for Australian dollar
M3. The conclusion is that since Australia and the UK have the same cultural and
institutional heritage, evolutionary econometrics captured a similar M3 creation
process in both countries, implying the appropriateness of an evolutionary approach
for studies involving the diffusion process.
The most interesting of these is Hodgson (1996), as it is the most direct
theoretical and empirical study of long-term economic growth. He argues that his
work is in part inspired by works on institutional economics such as those by
Nelson and Winter and Thorstein Veblen (who was the first to suggest the use of
an evolutionary analogy taken from biology in economics). His empirical estimation
starts by placing major stress on institutional disruptions such as wars or
revolutions and on the existence of political institutions such as multiparty systems.
Hodgson used regression analysis to provide some preliminary empirical
validation for his ideas. He admitted that it was not a fully fledged macroeconomic
model, saying that the available data were too crude and limited to provide a more
ambitious and adequate test. He used real GDP per worker-hour from Maddison’s
data as the index of productivity and summarized his findings as follows. First, two
kinds of disruption (extensive foreign occupation of home soil and revolution)
seemed to be significant in determining and eventually advancing productivity
growth. Second, there was evidence that the growth trajectory was determined
by the timing of industrialization. Third, a relatively stable international order was
found to be significant and positively related to growth.
Stockhammer et al. (2008) estimated the relationship between functional income
distribution and aggregate demand (AD) in the Euro area. They modeled AD as the
sum of consumption (C), investment (I), net exports (NX) and government
expenditure (G). All variables are in real terms. In their general formulation,
consumption, investment and net exports are written as functions of income (Y), the
wage share (Ω) and some other control variables (summarized as z). The latter are
assumed to be independent of output and distribution. Government expenditure is
considered to be a function of output (because of automatic stabilizers) and
exogenous variables (such as interest rates). However, as our paper focuses on the
private sector, this will play no further role in our analysis. AD thus is:

$$AD = C(Y, \Omega) + I(Y, \Omega, z_I) + NX(Y, \Omega, z_{NX}) + G(Y, z_G) \tag{9.6}$$
Stockhammer et al.’s (2008) rationale for including income distribution in the
consumption, investment and net export terms of Eq. 9.6 is as follows. In the
consumption function, wage incomes (W) and profit incomes (R) are associated with
different propensities to consume. The Kaleckian assumption is that the marginal
propensity to save is higher for capital incomes than for wage incomes; consumption
is therefore expected to increase when the wage share rises. They argue that
Keynesian as well as neo-classical investment functions depend on output (Y) and
the long-term real interest rate or some other measure of the cost of capital. The
latter is part of z_I. The authors further argue that, in addition to the effects of
output and the interest rate, investment is expected to decrease when the wage
share rises because future profits may be expected to fall. Moreover, it is often
argued that retained earnings are a privileged source of finance and may thus
influence investment expenditures.
They draw two policy implications from their findings. First, wage moderation in
the EU is unlikely to stimulate employment: their results suggest that wage
moderation leads to a (moderate) contraction in output. Since an expansion in
output can be regarded as a necessary (but not sufficient) condition for an expansion
in employment, wage moderation (at the EU level) is not an ‘employment-friendly’
wage policy. The second implication refers to wage coordination: they contend
that their findings suggest that demand is wage-led in the Euro area as a whole, a
finding that does not extend to individual Euro member states.
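One way to make the term ‘wage-led’ concrete, offered here as an illustrative sketch rather than as the authors’ exact estimation procedure, is to differentiate Eq. 9.6 with respect to the wage share while holding income and the control variables fixed. Writing Ω for the wage share, and noting that government expenditure does not depend on it,

$$\frac{\partial AD}{\partial \Omega} = \frac{\partial C}{\partial \Omega} + \frac{\partial I}{\partial \Omega} + \frac{\partial NX}{\partial \Omega}.$$

The discussion above implies that the consumption term is positive and the investment term is negative; the passage does not state the sign of the net export term. Demand is called wage-led when the sum is positive and profit-led when it is negative.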
Our paper takes advantage of the formalization of evolutionary economics by
Foster (1994, 2014) and Foster and Wild (1999a, b).
9.4.1 The Data and Variables
This section examines whether firms’ access to bank loans has any effect on growth
through¹ its effects on functional income distribution. The dataset covers the medium
and large manufacturing industries, as compiled by the Central Statistical Agency
(CSA) of Ethiopia. The available panel data cover 1996–2009, with 611 firms in
1996 and 1943 firms in 2009.
If access to bank loans affects functional income distribution, and if functional
income distribution in turn affects productivity growth, then facilitating access to
bank loans might ultimately foster growth in the economy. To examine this, we first
explore the firm-level data over the period for some key variables and then
econometrically estimate Eq. 9.5 using the generalized method of
moments (GMM). Finally, alternative policy simulation scenarios are performed to
understand the full effect of the linkage between bank loans, income distribution and
productivity growth.
First, from the firm-level data, the variables of interest are computed for each firm
for each year:
• Employment share (EMPSHAFIRM): Intended to capture whether there is an
indication of structural change, that is, the movement of labor from less
productive to more productive sectors;
• Market share (MKTSHARE): The market is the available resource over which firms
have to compete. It is through this competition process that decisions to invest in
productivity-fostering factors are undertaken;
• Output share (OUSHA): Firms can also compete over industry output; and
• Productivity growth (GROWTHPRO): The main variable of interest. Its
growth rate is understood as the growth of a mean characteristic in evolutionary
economics. Thus, growth is taken to mean growth in productivity.
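The construction of these variables can be sketched as follows. The records are hypothetical, and the definitions are assumptions made for illustration: shares are computed within each industry and year, market share is taken to be a firm's share of industry sales, productivity is output per employee, and productivity growth is the year-on-year log difference. The chapter does not spell out its exact formulas, so none of this should be read as the author's procedure.

```python
import math
from collections import defaultdict

# Hypothetical records: (firm_id, industry, year, employees, sales, output).
records = [
    ("f1", "food", 2008, 40, 200.0, 220.0),
    ("f2", "food", 2008, 60, 300.0, 330.0),
    ("f1", "food", 2009, 50, 260.0, 300.0),
    ("f2", "food", 2009, 50, 240.0, 300.0),
]


def firm_variables(rows):
    """Return {(firm, year): variables} with shares taken by industry-year."""
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])
    for _, industry, year, emp, sales, out in rows:
        t = totals[(industry, year)]
        t[0] += emp
        t[1] += sales
        t[2] += out

    table = {}
    for firm, industry, year, emp, sales, out in rows:
        emp_tot, sales_tot, out_tot = totals[(industry, year)]
        table[(firm, year)] = {
            "EMPSHAFIRM": emp / emp_tot,
            "MKTSHARE": sales / sales_tot,
            "OUSHA": out / out_tot,
            "productivity": out / emp,
        }

    # Productivity growth is only defined when the previous year is observed.
    for (firm, year), row in table.items():
        prev = table.get((firm, year - 1))
        row["GROWTHPRO"] = (
            math.log(row["productivity"] / prev["productivity"]) if prev else None
        )
    return table


table = firm_variables(records)
print(table[("f1", 2009)])
```

In an unbalanced panel such as the one described, firms that enter or skip a year simply have no growth observation for that year, which is why the sketch returns `None` rather than imputing a value.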
Based on these variables, our paper draws some inferences about the connection
between access to bank loans, functional income distribution and productivity
growth.
¹ In the evolutionary growth framework, growth is mainly understood as growth of any mean
characteristic (in our case, productivity growth).
9.4.2 Results from Data Exploration
The evolution of employment shares, market shares, output shares and growth in
productivity are shown in Figs. 9.1, 9.2, 9.3 and 9.4 in Annexure 2. The purpose of
these figures is to learn if there is any indication of a structural transformation process
within the manufacturing sector. If there is a change in the structure of production in
the manufacturing sector, we expect the labor share to be continuously shifting within
the industry. The shift should take place from low productivity to high productivity
industries. This would mean higher labor productivity and consequently higher labor
incomes which will form a positive feedback loop with productivity.
In Fig. 9.1, we observe movements in employment share for only 11 industries.
We identified these industries from the data as:
• Production, processing and preserving of meat, fruits and vegetables
• Manufacture of animal feed
• Manufacture of non-metallic NEC
• Manufacture of basic iron and steel
• Manufacture of other fabricated metal products
• Manufacture of pumps, compressors, valves and taps
• Manufacture of other general purpose machinery
• Manufacture of batteries
• Manufacture of bodies of motor vehicles
• Manufacture of parts and accessories
• Manufacture of furniture.
From the firm-level dataset, it was possible to learn that most of the firms within
these industries had access to bank loans. For example, overall, the 105 firms within
the production, processing and preserving of meat, fruits and vegetables industries
had access to bank loans. In the manufacture of animal feed industry, out of 98
firms, 37 had access to bank loans. Generally, all the indicated firms had access to
bank loans during the years of observation. In Fig. 9.1, we can observe that in these
industries, there is significant movement (fluctuation) in employment shares. The
only exceptions are the spinning, tanning and publishing industries, in which all firms
had access to bank loans but which display no movement in their employment
shares.
One can argue that the reallocation of employment must be taking place within the
same sectors (industries) and not across industries. If the reallocation of labor were
taking place across industries, we would have observed variations in the employment
share in the rest of the industries, but this is not evidenced.
Whether these industries are high productivity sectors and hence growth and
equality promoting can be another area of enquiry. But looking at their face value
alone, we may tentatively conclude that those industries which are related to
metallic manufacturing in particular are connected to the government (see Fig. 9.1
in Annexure 2).
[Figure: panel plots by International Standard Industrial Classification (ISIC) manufacturing industry; x-axis: period, 1995 to 2010; y-axis: employment share, 0 to 1]
Fig. 9.1 Evolution of employment share
Referring to Fig. 9.3, firms’ shares in total industry output are more pronounced
than their market shares. This tells us the underlying market structure, which may
subsequently have an effect on functional income distribution and productivity
growth (see Fig. 9.3 in Annexure 2).
[Figure: panel plots by manufacturing industry; x-axis: period, 1995 to 2010; y-axis: market share, 0 to 8]
Fig. 9.2 Evolution of market share
As discussed, firms are at the heart of an evolutionary approach to
economic growth, and productivity growth at the firm level is key to economic
growth. Fig. 9.4 shows fluctuations in the productivity
growth rate (from −20 to 10%). We also note that, for example, productivity
growth in the production, processing and preserving of meat, fruits and vegetables
remained positive, which might indicate an effect of access to bank
loans (see Fig. 9.4 in Annexure 2).
[Figure: one panel per ISIC manufacturing industry; y-axis: output share of each industry based on the value of output (0 to .15); x-axis: year of observation (1995 to 2010).]
Fig. 9.3 Evolution of output share at the industry level
9.4.3 Econometric Results
This section deals with the econometric estimation of the logistic differential
equation in Eq. 9.5. The variables entering the model are of two kinds: the
evolutionary component and the exogenous component.
We estimated Eq. 9.5 using firm-level panel data. To do so, the data were
transformed (logarithms, growth rates, lags and differences) so that they were
consistent with the evolutionary econometric framework.
[Figure: one panel per ISIC manufacturing industry; y-axis: GROWTHPRO, productivity growth (−20 to 10); x-axis: period (1995 to 2010).]
Fig. 9.4 Evolution of productivity growth
The dependent variable is the change in the mean characteristic (growth in
productivity). The explanatory variables are: growth in labor share (GRWTHLSHARE);
the complement² of output share (COMPVOUSHA), that is, one minus output share,
to fit the first term in Eq. 9.5; the complement of market share (COMPMKTSHARE),
defined in the same way so that it is consistent with Eq. 9.5; the lagged
change in labor productivity (LAGDELTFP), which represents the last term of Eq. 9.5;
and, finally, the employment share of each firm (EMPSHAFIRM).
² Here the complement of variable x is equal to (1 − x) (see the first term on the right hand side in
Eq. 9.5).
Table 9.2 Estimation results (GMM): dependent variable: growth in productivity
Variable        Coeff.     Std. error   z         P > |z|
GRWTHLSHARE     0.00052    0.0001       3.47      0.001
COMPVOUSHA      −5.626     0.409        −13.75    0.000
COMPMKTSHARE    4.251      0.456        9.32      0.000
LAGDELTFP       −0.412     0.0203       −20.20    0.000
EMPSHAFIRM      −4.068     1.556        −2.61     0.009
cons            0.9196     0.421        2.18      0.029
For the evolutionary approach, once the logistic differential equation in Eq. 9.5 is
formulated, it can be estimated using standard panel data econometric techniques
(random effects, fixed effects or GMM), which do not require separate treatment
here. The reported results have a Wald chi-square value of 773.57 with six
degrees of freedom and a probability value (Prob > χ²) of 0.0000 (Table 9.2).
The estimated results indicate that all explanatory variables entered the
estimation with statistically significant coefficients. As expected, productivity was
positively affected by the growth in labor share. However, the employment share
entered with a negative and statistically significant coefficient. We may interpret
this as a lack of labor movement from low-productivity to high-productivity industries.
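As a rough cross-check of the significance statements, the sketch below recomputes two-sided p-values from the z statistics reported in Table 9.2 and the upper-tail probability of the reported Wald statistic. It assumes that the z statistics are asymptotically standard normal and that the Wald statistic is chi-square with six degrees of freedom; the function names are illustrative and are not taken from the chapter or from any estimation package.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total

# z statistics as reported in Table 9.2 (not re-estimated here)
reported_z = {
    "GRWTHLSHARE": 3.47,
    "COMPVOUSHA": -13.75,
    "COMPMKTSHARE": 9.32,
    "LAGDELTFP": -20.20,
    "EMPSHAFIRM": -2.61,
    "cons": 2.18,
}

p_values = {name: two_sided_p(z) for name, z in reported_z.items()}
wald_p = chi2_sf_even_df(773.57, 6)   # reported Wald statistic, six df

for name, p in p_values.items():
    print(f"{name:13s} p = {p:.4f}")
print(f"Wald chi-square p = {wald_p:.3e}")
```

Under these assumptions the recomputed p-values round to the figures in the last column of the table, and the Wald probability is far below any conventional threshold, which is why it is reported as 0.0000.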
9.5 Summary, Conclusions, Policy Recommendations and Future Areas of Research
The basic research question in this paper was how the firm-level labor share
affects firm- and industry-level productivity, and how it affects aggregate productivity
in an economy, taking the case of Ethiopia.
The most direct interpretation of the estimated results is that evolution and
change in mean characteristics (change in productivity) are positively affected by
the growth of functional income distribution (the growth in labor share). Even though
the coefficient is small in economic magnitude, it is statistically significant.
The other variable of interest here is the employment share of each firm within an
industry, which entered the model with a negative and statistically significant
coefficient. In economic terms, the positive coefficient on the labor share within a
firm and the negative coefficient on the employment share of each firm within the
industry are informative about structural change in the manufacturing sector.
Had structural change been evident, the employment share would have entered with
a positive effect; it did not. The result therefore does not support the
popular structural bonus hypothesis, which postulates a positive
relationship between structural change and economic growth. This hypothesis is
based on the assumption that, during the process of economic development,
economies upgrade from industries with comparatively low value added per labor
input to those with a higher one. Timmer and Szirmai (2000) discuss this in detail.
The result is instead consistent with an almost opposite mechanism, in which structural
change has a negative effect on aggregate growth, as described by Baumol’s
hypothesis of unbalanced growth. Intrinsic differences between industries in their
opportunities to raise labor productivity (for a given level of demand) shift ever
larger shares of the labor force away from industries with high productivity growth
toward stagnant industries with low productivity growth and accordingly higher
labor requirements. In the long run, the structural burden of a growing share of labor
being employed in the stagnant industries tends to diminish the prospects for
aggregate growth of per capita income. Baumol (1967) is the key reference on this.
Because the complement of firms’ market share enters the regression with a
positive sign, the market share itself would enter with a negative sign.
This has a direct and clear economic meaning: firms that try to
capture the market through nominal means (for example, price competition,
advertising or other institutional arrangements) harm productivity. Our
major conclusion is that there is no strong evidence for intra-industry selection.
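The sign reversal between a complement and the underlying share can be made explicit with a short worked equation. The symbols s (market share) and v (output share) are introduced here only for illustration and do not appear in the chapter; the coefficients are those reported in Table 9.2, and the remaining regressors are left unchanged.

```latex
\begin{align*}
4.251\,(1 - s) &= 4.251 - 4.251\,s, \\
-5.626\,(1 - v) &= -5.626 + 5.626\,v, \\
0.9196 + 4.251\,(1 - s) - 5.626\,(1 - v)
  &= -0.4554 - 4.251\,s + 5.626\,v .
\end{align*}
```

Rewriting the regression in terms of the shares themselves therefore flips the sign of each slope and moves the constant, while leaving the fitted values unchanged.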
The policy lesson is that access to bank loans is of great importance to firms.
In particular, the industries (spinning, tanning and publishing) in which all firms
had access to bank loans showed movements in employment share, which is
evidence of structural transformation.
There are two reasons why it is important to introduce appropriate public loan
policies, that is, to ensure that the lending channel of monetary policy works without
breaks. First, a credit aggregate can be a better indicator of monetary policy than an
interest rate or a monetary aggregate in Ethiopia. Second, monetary tightening that
reduces loans to firms can have negative distributional consequences. Particularly
for those firms for whom bank loans are a primary source of finance, ease of access
to bank loans can have economy-wide distributional consequences. More specifi-
cally, the credit policy should be such that manufacturing firms get better access to
banks.
Future research should include economy-wide modeling,
estimation and further formalization of evolutionary economic models to study the
link between access to bank loans and its effects on income distribution and
inclusive economic growth.
Acknowledgments This research is supported by the Jönköping International Business School,
Jönköping University (Sweden), in collaboration with Addis Ababa University for doctoral studies
in economics, a project supported by the Swedish International Development Cooperation Agency
(SIDA).
The author would like to thank Professor Almas Heshmati, Professor Andreas Stephan,
Dr Tadele Ferede and other participants of a seminar at the Jönköping International Business
School for their comments and suggestions on an earlier version of this research.
Annexure 1: Basic Logistic Differential Equation
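\[ \dot{X} + a(t)X = b(t)X^{r} \]
If $r = 1$, it is easily separable. Otherwise, introducing $Z = X^{1-r}$,
\[ \dot{Z} = (1-r)X^{-r}\dot{X}. \]
But
\[ \frac{\dot{X}}{X} + a(t) = b(t)X^{r-1} \;\Rightarrow\; \dot{X} = \left[ b(t)X^{r-1} - a(t) \right] X. \]
Therefore,
\[ \dot{Z} = (1-r)X^{-r}\dot{X} = (1-r)X^{-r}\left[ b(t)X^{r-1} - a(t) \right] X, \]
\[ \dot{Z} + (1-r)\,a(t)\,Z = (1-r)\,b(t). \tag{Eq. A2} \]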
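The substitution in this annexure can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the chapter's estimation: it takes constant coefficients a and b (with a nonzero), sets r = 2 so that the equation is logistic, integrates the original nonlinear equation with a classical Runge-Kutta scheme, and compares the result with the closed-form solution of the linear equation in Z = X^(1−r). The parameter values (growth rate 0.5, carrying capacity 10) are hypothetical.

```python
import math

def bernoulli_rhs(x, a, b, r):
    """Right-hand side of X' = b*X**r - a*X for constant a and b."""
    return b * x**r - a * x

def integrate_rk4(x0, a, b, r, t_end, steps=2000):
    """Integrate the original nonlinear equation with classical RK4."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = bernoulli_rhs(x, a, b, r)
        k2 = bernoulli_rhs(x + 0.5 * h * k1, a, b, r)
        k3 = bernoulli_rhs(x + 0.5 * h * k2, a, b, r)
        k4 = bernoulli_rhs(x + h * k3, a, b, r)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def solve_via_substitution(x0, a, b, r, t):
    """Solve Z' + (1-r)*a*Z = (1-r)*b in closed form, then map back to X.

    Assumes constant a != 0 and r != 1.
    """
    z0 = x0 ** (1.0 - r)
    z_star = b / a                                   # steady state of Z
    z = z_star + (z0 - z_star) * math.exp(-(1.0 - r) * a * t)
    return z ** (1.0 / (1.0 - r))

# Hypothetical logistic case: X' = g*X*(1 - X/K) with g = 0.5, K = 10,
# which corresponds to a = -g, b = -g/K and r = 2.
g, K = 0.5, 10.0
a, b, r = -g, -g / K, 2

x_numeric = integrate_rk4(1.0, a, b, r, t_end=10.0)
x_closed = solve_via_substitution(1.0, a, b, r, t=10.0)
print(x_numeric, x_closed)
```

The two values agree to within the integration error, which is what the linearised equation implies when the coefficients are constant; time-varying a(t) and b(t) would require an integrating factor instead of the closed form used here.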
Annexure 2: Evolution of Key Variables
See Figs. 9.1, 9.2, 9.3 and 9.4.
References
Aghion P, Howitt P (1992) A model of growth through creative destruction. Econometrica 60
(2):323–351
Aghion P, Howitt P (1998) Endogenous growth theory. MIT Press, Cambridge
Alchian A (1950) Uncertainty, evolution, and economic theory. J Polit Econ 58(3):211–221
Alesina A, Rodrik D (1994) Distributive politics and economic growth. Quart J Econ 109(2):465–
490
Andersen E (2004) Population thinking, price’s equation and the analysis of economic evolution.
Evol Inst Econ Rev 1(1):127–148
Barro RJ (1990) Government spending in a simple model of endogenous growth. J Polit Econ 98
(5):103–125
Baumol WJ (1967) Macroeconomics of unbalanced growth: the anatomy of urban crisis. Am Econ
Rev 57(3):415–426
Beinhocker ED (2006) The origin of wealth: evolution, complexity, and the radical remaking
of economics. Random House Business Books, Boston
Benhabib J, Rustichini A (1996) Social conflict, growth and income distribution. J Econ Growth
1:125–142
Boulton J (2010) Why is economics not an evolutionary science? Quart J Econ 12(2):41–69
Carlsson B, Eliasson G (2003) Industrial dynamics and endogenous growth. Ind Innov 10(4):435–
455
Castellacci F (2007) Evolutionary and new growth theories. Are they converging? J Econ Surv 21
(3):585–627
Cecilia G (2010) Income distribution, economic growth and European integration. J Econ Inequal
8(3):277–292
Coase RH (1937) The nature of the firm. Economica 4(16):386–405
Dopfer K (2005) The evolutionary foundations of economics. Cambridge University Press,
Cambridge
Dopfer K, Potts J (2004) Evolutionary realism: a new ontology for economics. J Econ Method 11
(2):195–212
Dosi G, Nelson RR (2010) Technical change and industrial dynamics as evolutionary processes. In
Hall B, Rosenberg N (eds) Handbook of the economics of innovation. Academic Press,
Amsterdam, pp 51–127
Foster J (1992) The determination of sterling M3, 1963–88: an evolutionary macroeconomic
approach. Econ J 102(412):481–496
Foster J (1994) An evolutionary macroeconomic model of Australian dollar M3 determination:
1967–93. Appl Econ 26(11):1109–1120
Foster J (2014) Energy, knowledge and economic growth. J Evol Econ 24(2):209–238
Foster J, Wild P (1999a) Detecting self-organizational change in economic processes exhibiting
logistic growth. J Evol Econ 9(1):109–133
Foster J, Wild P (1999b) Econometric modeling in the presence of evolutionary change. Camb J
Econ 23(6):749–770
Delli Gatti D, Gaffeo E, Gallegati M, Giulioni G, Kirman A, Palestrini A, Russo A (2007)
Complex dynamics and empirical evidence. Inf Sci 177:1204–1221
Goldfarb RS, Leonard TC (2005) Inequality of what among whom: rival conceptions of
distribution in the 20th century. In: Samuels WJ, Biddle J, Emmett RB (eds) Research in
the history of economic thought and methodology, vol 23-A. Emerald Group Publishing
Limited, Amsterdam, pp 79–123
Griliches Z (1957) Hybrid corn: an exploration in the economics of technological change.
Econometrica 25(4):501–522
Hahn F, Matthews R (1964) The theory of economic growth: a survey. Econ J 74(296):779–902
Hahn F, Solow R (1995) A critical essay on modern macroeconomic theory. MIT Press,
Cambridge
Hodgson G (1996) An evolutionary theory of long-term economic growth. Int Stud Quart 40
(3):391–440
Holm J (2014) The significance of structural transformation to productivity growth: how to
account for levels in economic selection. J Evol Econ 24(5):1009–1036
Krugman P (1996) What economists can learn from evolutionary theorists. A talk given to the
European Association for Evolutionary Political Economy. November 1996. http://web.mit.
edu/krugman/www/evolute.html
Leijonhufvud A (1992) Keynesian economics: past confusions, future prospects. In: Vercelli A,
Dimitri M (eds) Macroeconomy: a survey of research strategies. Oxford University Press,
Oxford
Lucas R (1988) On the mechanics of economic development. J Monet Econ 22(1):3–42
Mansfield E (1981) Composition of R and D expenditures: relationship to size of firm,
concentration, and innovative output. Rev Econ Stat 63(4):610–615
Metcalfe JS (1998) Evolutionary economics and creative destruction. Routledge, London
Metcalfe JS, Foster J, Ramlogan R (2006) Adaptive economic growth. Camb J Econ 30:7–32
Mincer J (1958) Investment in human capital and personal income distribution. J Polit Econ 66
(4):281–302
Nelson RR, Winter SG (1974) Neoclassical vs. evolutionary theories of economic growth: critique
and prospectus. Econ J 84:886–905
Nelson R (1996) The sources of economic growth. Harvard University Press, Cambridge
Nelson R, Winter S (1982) An evolutionary theory of economic change. Harvard University Press,
Cambridge
Perotti R (1996) Growth, income distribution, and democracy: what the data say. J Econ Growth 1
(2):149–187
Persson T, Tabellini G (1994) Is inequality harmful for growth? Am Econ Rev 84(3):600–621
Rehme G (2006) Redistribution and economic growth in integrated economies. J Macroecon 28
(2):392–408
Ricardo D (1815) Essay on The influence of a low price of corn upon the profits of stock, 2nd edn.
John Murray, London
Romer P (1986) Increasing returns and long-run growth. J Polit Econ 94(5):1002–1037
Romer P (1987) Crazy explanations for the productivity slowdown. NBER Macroecon Annu
2:163–202
Romer P (1990) Endogenous technological change. J Polit Econ 98(5):71–102
Romer P (1994) The origins of endogenous growth. J Econ Perspect 8(1):3–22
Safarzyńska K, van den Bergh JCJM (2010) Evolutionary models in economics: a survey of
methods and building blocks. J Evol Econ 20(3):329–373
Salvadori N (2003) The theory of economic growth. Edward Elgar, Cheltenham
Solow R (1956) A contribution to the theory of economic growth. Q J Econ 70(1):65–94
Stockhammer E, Onaran O, Ederer S (2008) Functional income distribution and aggregate demand
in the Euro area. Camb J Econ 33(1):139–159
Swan TW (1956) Economic growth and capital accumulation. Econ Rec 32(2):334–361
Timmer M, Szirmai A (2000) Productivity growth in Asian manufacturing: the structural bonus
hypothesis examined. Struct Change Econ Dyn 11(4):371–392
Todaro MP (1997) Economic development. Longman, London
Veblen T (1898) Why is economics not an evolutionary science? Quart J Econ 12(4):373–397
Part IV
Trade, Mineral Exports and Exchange Rate
Chapter 10
Determinants of Trade with Sub-Saharan Africa: The Secret of German Companies’ Success
Johannes O. Bockmann
Abstract This paper evaluates the degree to which internal, micro and
macro-environmental variables explain why some small- and medium-sized
enterprises (SMEs) based in Germany export more successfully to sub-Saharan
Africa (SSA) than other firms in the same category. It derives explanatory factors
specific to the region from experts. A bivariate correlation analysis identifies
relations between dependent and independent export performance (EP) measures. Stepwise
multiple regression equations for firms’ overall EP and overall export profitability
in the last three years highlight the factors with the most significant correlations. In
line with previous research and with the experts’ views, the study applies a
multidimensional approach, investigating variables according to the resource-based view
and the contingency paradigm. The results indicate that SSA has specific
requirements for successful exports which differ from those of other regions. Knowledge
of these particular characteristics of the market will enable managers and
policymakers to improve trade relations. By focusing on the EP of German SMEs in
SSA, this study fills a research gap since no previous study has concentrated on this
specific aspect.

Keywords German small- and medium-sized enterprises · Export performance ·
Comparative advantages · Internal, micro and macro-environmental factors ·
Sub-Saharan Africa
J.O. Bockmann
Department of Economics and Logistics, International School
of Management Hamburg, Hamburg, Germany
e-mail: johannesbockmann@gmail.com
DOI 10.1007/978-981-10-4451-9_10
10.1 Introduction
Exports represent the preferred method for entry into foreign markets (Lado et al.
2004; Sousa et al. 2014; Zhao and Zou 2002) since they offer firms a comparatively
high level of flexibility with relatively small necessary investments thus permitting
a fast entry into new markets (Katsikea et al. 2007; Leonidou 1995; Sousa and
Novello 2014). Research on export modalities is of high interest to three major
stakeholders: public policymakers, managers, and researchers (Katsikea et al. 2000;
Sousa 2004).
Scholars explain the increasing interest in exports on the basis of its positive
effect on a country’s growth alongside the business opportunities that it offers
individual firms (Dean et al. 2000). Public policymakers encourage export activities
since they foster the accumulation of foreign exchange reserves, support the
development of national industries, create new jobs, and improve productivity
(Czinkota 1994). Developed countries see cross-border economic relationships as a
necessary instrument for maintaining their standard of living (Baldauf et al. 2000).
In a detailed review of 33 articles on export performance (EP) published between
2000 and May 2015, we identified 65 internal and 35 external determinants.
However, none of the articles focused on sub-Saharan Africa (SSA). This is surprising
since these markets offer great business opportunities. According to data from the
World Bank (Catalog Sources World Development Indicators 2015), the region’s
total GDP grew by 5.72% per year on average from 2000 to 2013. Further, imports
of goods and services increased by an average of 12.05% per year from 2010 to
2012 (Catalog Sources World Development Indicators 2015; United Nations
Statistics Division 2011, 2014). In 2012, SSA countries imported US$496.50 billion
worth of goods and services (United Nations Statistics Division 2014). The
increasing demand for foreign products, together with a relatively high level of
uncertainty in the region, makes SSA better suited to exports than to alternative
market entry methods such as foreign direct investment (Boly et al. 2014; Riddle
2008; Sousa and Novello 2014).
Regarding the exporter’s home country, only three papers concentrated on
Germany although the country was one of the top three merchandise exporters with
a share of 7.7% of world trade in 2013 and a trade surplus of US$264 billion (WTO
2014). The main drivers of this success are Germany’s small- and medium-sized
enterprises (SMEs) (MoAE 2015), a situation which is similar to that in most
European countries (Bijmolt and Zwart 1994). According to an EU definition,
SMEs include all firms with a maximum of 250 employees (Sousa et al. 2014).
However, Katsikea et al. (2007) argue that SMEs are not just smaller versions of
large firms but that they operate differently because of their size. Therefore, an
insight into the success factors of German SMEs may be relevant for German
policymakers and executives interested in the guarantors of EP (Baldauf et al.
2000).
Between 2000 and 2013, exports from Germany to all SSA countries grew on
average by 8.8% to US$13.51 billion. Of the German exporters with experience in
Africa, 89% plan to expand their commitments, especially in West and Central Africa
(Foly 2013). Politicians, including German Chancellor Angela Merkel, are also
showing increasing interest in Africa. At conferences such as the EU–Africa summit,
for example, a steady cross-sectoral rise in demand has been forecast thanks to
a growing middle class (Merkel 2014). Consequently, a deeper insight into the
factors which influence German EP in SSA is necessary.
Scholars argue that further research is needed to investigate the possible pre-
dictors of EP (Baldauf et al. 2000; Fevolden et al. 2015; Navarro-García et al.
2015). A focus on the EP of SMEs is specifically important since they in particular
profit from a combination of flexibility with limited resource commitments (Sousa
et al. 2014), while their significant contributions to national economies underline
their relevance for policymakers (Sousa and Novello 2014). Further, there is a need
to investigate the specifics of EP in selected regions/countries (Navarro-García et al.
2015; Rambocas et al. 2015). Concerning Germany, Wagner (2014) maintains that
detailed company characteristics should be worked out. Sousa et al. (2008) and
Sung (2015) have identified a strong demand for more research on developing
countries (DC), such as the ones in SSA, since their share in world trade is
increasing thus offering significant opportunities in the present and future global
economic order.
In summary, the quoted views substantiate the need for additional research in the
field of EP, covering individual regions and explanatory variables. To provide
evidence on whether SSA requires different or additional internal, micro- and
macro-environmental variables, this study concentrates on the factors relevant for
German SMEs targeting this region. The rest of this paper is organized as follows.
A literature review comes first, followed by a section on methodology. The next section
presents the findings and analysis of the semi-structured interviews and questionnaire.
The last section draws conclusions and discusses possible areas for further
research.
10.2 Literature Background
Research about EP goes back to Tookey’s (1964) work about factors associated
with success in exporting. In a wider context, it addresses the outcomes of export
activities, mostly at the firm or export venture level (Kahiya and Dean 2014).
Nowadays, EP is the most studied topic in the field of export marketing (Leonidou and
Katsikea 2010). Its many facets arise from the fact that the ‘Export performance
dialogue is spread over a large pan-discipline research landscape which includes
International Businesses, International Marketing, International Entrepreneurship,
Small Business Management and International Trade’ (Kahiya and Dean 2014:
378).
10.2.1 Measuring EP
Approaches for measuring EP are fragmented and uncoordinated (Kahiya and Dean
2014; Katsikea et al. 2000) and no single view prevails (Sousa 2004). An almost
philosophical approach points out that for most export start-ups pure survival is
already some measurement of success (Kahiya and Dean 2014). Indicators reflect
objective and subjective facts. While objective measures deal with absolute per-
formance, subjective ones are concerned with a firm’s expectations or its perceived
performance as compared to its competitors (Akyol and Akehurst 2003). Scholars
have identified 42 (Katsikea et al. 2000: 497) or even 50 (Sousa 2004: 9) indicators
for EP. Since no individual indicator adequately captures the phenomenon of EP
(Kahiya and Dean 2014; Lages and Lages 2004; Zou et al. 1998), there is general
agreement in favor of a multidimensional approach, and researchers such as
Baldauf et al. (2000) and Papadopoulos and Martín-Martín (2010) accordingly use
multiple indicators.
10.2.2 Determinants of EP
Two major theoretical approaches to classify the determinants of EP stand out. The
resource-based view emphasizes a firm’s individual competencies as its unique
bundles of assets (Conner and Prahalad 1996; Nalcaci and Yagci 2014; Stoian et al.
2011). Accordingly, the success of a company is a result of its acquiring and
exploiting its own unique resources such as competence, experience, and size (Zou
and Stan 1998). Research also identifies how higher performance can be achieved
in comparison with other firms (Barney 2002; Dhanaraj and Beamish 2003; Singh
and Mahmood 2014).
On the other hand, the contingency paradigm proposes that environmental fac-
tors affect the companies’ strategies and EP which is then the result of a specific
company context (Sousa et al. 2008). Consequently, exports are considered an
organization’s strategic response to the interaction of external and internal factors
(Robertson and Chetty 2000; Sousa et al. 2008; Yeoh and Jeong 1995).
There is now general agreement that a multidimensional approach
including a range of determinants such as managerial, organizational, and
environmental aspects is most appropriate (Baldauf et al. 2000; Katsikea et al. 2000;
Rambocas et al. 2015). This is confirmed by Morgan et al. (2004) who synthesized
the different views into a robust theoretical model.
10.2.3 Internal and Microenvironmental Factors
Thirty-three papers published between 2000 and May 2015 were analyzed and 65
variables were identified. International experience measured in years (21.21% of the
reviewed papers), firm size as represented by the number of employees (18.18%),
adaptation of the price strategy to market conditions (15.15%), and the number of foreign
markets served by a firm (12.12%) are the variables most often used to explain a firm’s EP.
10.2.4 Macro-environmental Factors
Most scholars extend their research scope by using qualitative and quantitative
determinants. In the 33 papers published between 2000 and May 2015, 21 studies
covered external variables, identifying 35 external factors. An increasing level of
competition in the foreign market influences EP, but there is no consensus on whether
its effect is positive (9.09% of reviewed papers) or negative (6.06%). Scholars are equally
inconsistent regarding the influence of distance. Two papers (6.06%) show that an
increasing distance has a positive impact, whereas one paper presents a negative
result. Also, the foreign exchange rate plays a multifaceted role: in one paper it has
a positive influence, whereas three papers (9.09%) found no significant effect.
Customs and tariffs (9.09%) and regulations (15.15%) are frequently named as
impacting EP negatively, while one study claimed that they were irrelevant.
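The percentages in Sects. 10.2.3 and 10.2.4 are shares of the 33 reviewed papers, so each can be converted back into a number of papers. The sketch below does this arithmetic; the text itself states only "two papers (6.06%)" and "three papers (9.09%)", so the other counts are inferred from the reported shares rather than taken from the source, and the function names are illustrative.

```python
TOTAL_PAPERS = 33  # number of reviewed papers stated in the text

def papers_from_share(share_pct, total=TOTAL_PAPERS):
    """Number of papers implied by a reported percentage share."""
    return round(share_pct * total / 100.0)

def share_from_papers(count, total=TOTAL_PAPERS):
    """Percentage share, rounded to two decimals as in the text."""
    return round(100.0 * count / total, 2)

# Shares reported in Sects. 10.2.3 and 10.2.4
reported_shares = [21.21, 18.18, 15.15, 12.12, 9.09, 6.06]

# Counts inferred from the shares (an inference, not source data)
implied_counts = {s: papers_from_share(s) for s in reported_shares}

for share, count in implied_counts.items():
    print(f"{share:5.2f}% of {TOTAL_PAPERS} papers -> {count} papers")
```

Read this way, international experience appears in seven of the 33 papers and firm size in six, which puts the reported shares on the same footing as the paper counts quoted for distance and the exchange rate.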
10.3 Methodology
10.3.1 Research Philosophy
Our research applies pragmatism, which is not committed to any single philosophy.
The lack of studies on the EP of German SMEs in SSA favors pragmatism since it
allows a researcher to consider different points of view to get a holistic picture.
Consequently, multiple approaches are necessary to gain quantitative and qualitative
data (Collis and Hussey 2014; Saunders 2012). Several EP studies (e.g.,
Freeman and Styles 2014; Rambocas et al. 2015) have applied this philosophy.
10.3.2 Research Approach
Our study is abductive since it combines both deductive and inductive elements.
The initial semi-structured interviews aimed at expanding knowledge about EP
from experts without reference to the existing theory. The respective results were
merged with the findings from existing literature into one questionnaire. Thus, for
German SMEs targeting SSA, the existing theory could be tested and modified by
new insights (Collis and Hussey 2014; Saunders 2012).
10.3.3 Research Purpose
To answer the research question, multiple methods rather than a single
method were chosen to achieve a broader view (as in, e.g., Freeman and Styles 2014;
Rambocas et al. 2015; Wagner 2014). First, semi-structured interviews were carried
out, which mainly yielded qualitative data. Subsequently, a questionnaire survey
was conducted to gain primarily quantitative, but also qualitative, data.
The aim of exploratory research is to ‘seek new insights into phenomena, to ask
questions, and to assess the phenomena in a new light’ (Saunders 2012: 670).
Consequently, this study started with semi-structured interviews examining the
factors known to influence EP as well as searching for additional ones prior to
developing a questionnaire. A good reason to include exploratory research as a first
step is the positive experience of Freeman and Styles (2014), Lacka and Stefko
(2014), and Nalcaci and Yagci (2014) who gained new insights about EP for other
regions by conducting interviews.
Explanatory research has its emphasis on clarifying the relationship between
variables. The questionnaire supports this purpose by enabling the identification of
interrelations between dependent and independent factors of EP and the development
of causal relationships between them (Saunders 2012). It tests the interaction
between existing measurements for EP relevant in other countries identified during
the literature review. The fact that researchers such as Singh and Mahmood (2014)
and Sousa and Novello (2014) have applied explanatory research in their EP studies
underlines the value of this approach.
10.3.4 Research Strategy
Based on a detailed literature study to gain secondary data and information about
the current status of research activities, semi-structured interviews were chosen to
extract new insights from experts concerning the factors which influence a firm’s
EP, thus getting answers to specific key questions while providing the flexibility to
react to the flow of conversation (Saunders 2012). Freeman and Styles (2014) have
previously used a similar approach.
Subsequently, a self-completion questionnaire (Collis and Hussey 2014) was
developed to collect data for empirical tests. The nature of this questionnaire was
mainly quantitative and explanatory since the participants were asked to grade the
influence of different variables on their firm’s EP. By evaluating the data with a
bivariate correlation and multiple regression, the relationships were identified, as
previously done, for example, by Castellacci and Fevolden (2014), Fevolden et al.
(2015) and Stoian et al. (2011). Moreover, the participants were encouraged to
explain their grading and to suggest additional factors influencing EP.
The applied semi-structured interviews and questionnaire fall under the survey
strategy, which is mostly applied to gain quantitative data, although qualitative
information can also be gathered this way. A questionnaire allows an efficient
collection of standardized data from a large population enabling comparisons and
further analysis. Moreover, it helps define the relationship between EP’s indepen-
dent and dependent factors. This strategy is generally perceived as authoritative,
comparatively easy to explain and understandable for participants.
10.3.5 Semi-structured Interviews
At first, general information about the participants and their firms was derived from
answers to closed questions, followed by an inquiry regarding target markets in
SSA. Closed questions were used since the participants were surveyed on a specific
issue. In the second part, participants elaborated freely on internal and external
factors which were perceived to influence their firm’s EP (Saunders 2012).
As a sampling technique, a non-probability sample was chosen because ‘the
probability of each case being selected from the total population is not known’
(Saunders 2012: 261). More specifically, purposive sampling based on the scholar’s
judgment was applied. Although all participants had been in charge of exports to
SSA for several years and were therefore a good fit, this approach is not statistically
representative. Therefore, it was followed by a questionnaire survey (Saunders
2012). The response rate of 40% was fairly high compared to Sousa’s reviews with
30 and 25% (Sousa et al. 2008).
Table 10.1 summarizes general information about the participants, which has been
altered to ensure confidentiality.
10.3.6 Questionnaire Survey
Based on the literature review and the interviews, a Web-based questionnaire was
developed. For Easterby-Smith et al. (2015), this is an efficient way to collect data
from a large number of people, which was also important for our analysis (Collis
and Hussey 2013). First, general information about the respondents was gathered,
which was followed by questions regarding their target markets in SSA. Later, the
participants were asked to grade their EP and the respective determinants. Finally,
they could enter personal data to receive an executive summary of the findings.
The seven-point Likert scale: Answers were graded on a seven-point Likert scale
because this allows the gathering of perceptions (Navarro-García et al. 2015).
Table 10.1 Participants of the semi-structured interview

| Firm | A | B | C | D |
|---|---|---|---|---|
| Industry | Trading house, incl. finance | Medical turnkey projects | Medical turnkey projects | Textiles and advertising industry |
| Interviewee | Senior Project Manager for SSA | Executive Director turnkey projects | Chief Executive Officer | Chief Executive Officer |
| Employed (years) | 4 | 27 | 5 | 9 |
| Target countries (years of export activities) | Ghana (4.5); Kenya (4.5); South Africa (4.5); Angola (4.5); Mozambique (1); Tanzania (1); Guinea (1) | Congo (7); Senegal (6); South Africa (1); Zimbabwe (25); Nigeria (4); Ghana (7) | Ghana (50); Nigeria (35); South Sudan (10) | Several countries in SSA such as South Africa, Congo, Namibia, Liberia, and South Sudan (56) |

Further, this extended scale ‘has been shown to process valid psychometric measure
properties’ (Singh and Mahmood 2014: 88) and has been successfully used in
previous EP studies (e.g., Rambocas et al. 2015; Singh and Mahmood 2014; Ward
and Duray 2000).
Subjective self-reporting was employed because of the expectation (and expe-
rience) that firms are unwilling to disclose full data (Leonidou et al. 2002; Singh
and Mahmood 2014) and because of a proven correlation between subjective and
objective measures (Akyol and Akehurst 2003; Dess and Robinson 1984; Matanda
and Freeman 2009; Stoian et al. 2011).
Dependent variables of EP: Since there is no generally accepted definition for EP
(Sousa 2004; Sousa et al. 2008; Stoian et al. 2011; Wheeler et al. 2008), the
measurements for our study were developed on the basis of existing literature which
guaranteed success and facilitated comparisons with previous results.
First, the respondents were encouraged to rate their overall perceived satisfaction
with EP in SSA in the last three years on a seven-point Likert scale, ranging from
‘extremely dissatisfied’ to ‘extremely satisfied’ (similarly applied, e.g., by Akyol
and Akehurst 2003; Cadogan et al. 2012; Freeman and Styles 2014; Lee and Griffith
2004; Navarro-García et al. 2015; Sousa and Novello 2014; Sousa et al. 2014).
They were told that the overall satisfaction about EP should include the areas of
international sales growth, export business profitability, the firm’s image in foreign
markets, international expansion, and market share (Cavusgil and Zou 1994;
Navarro-García et al. 2015; Navarro et al. 2010).
Second, they were asked about their overall satisfaction with their company’s
performance in terms of export profitability in SSA in the last three years (similar
to, e.g., Cadogan et al. 2002; Dean et al. 2000; Nalcaci and Yagci 2014; Robertson
and Chetty 2000; Singh and Mahmood 2014; Sousa and Novello 2014; Stoian et al.
2011). The time frame was adapted from Cadogan et al. (2012) and
Navarro-García et al. (2015). The approach of Sousa and Novello (2014) and Sousa
et al. (2014) of asking for the overall satisfaction with EP and with export
profitability was employed.
Independent variables of EP: The items applied to measure each construct were
based on the earlier interviews with professionals as well as existing literature.
Participants were again asked to grade internal, micro and macro-factors on a
seven-point Likert scale.
Questionnaire sampling: Only German SMEs exporting to at least one SSA
country in the last three years were considered. Following most researchers in the
field of EP such as Nalcaci and Yagci (2014), Sousa et al. (2014), and Sousa and
Novello (2014), only CEOs and managers with decision making responsibilities
regarding exports to SSA were accepted. As shown in Table 10.2, the response rate
was quite low compared with the rates reported in Sousa’s reviews, possibly because
the authors of the reviewed studies sent out the questionnaire only or additionally
by post, or called all potential participants (Sousa 2008).
To ensure representative sampling, the number of participants should be as large
as possible (Cooper and Schindler 2014; Saunders 2012). According to Saunders
et al. (2012), a relatively low response rate, however, is not necessarily bad as a
sample size of 30 or more represents a high degree of accuracy and reliability. With
a useable sample size of 41, this was a given. Moreover, Armstrong and Overton’s
(1977) extrapolation procedure was applied to ensure that no differences existed
between early and late responses (the basic details of the participants in the ques-
tionnaire are given in Table 10.3).
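The extrapolation check is commonly operationalized by comparing early with late respondents, on the premise that late respondents resemble non-respondents. The following is a minimal sketch of such a comparison, not the procedure actually run for this study: the Likert scores are invented, the split into two waves is an assumption, and Welch's t statistic is used here as one possible test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(early, late):
    """Welch's t statistic for the difference in means of two independent groups."""
    standard_error = sqrt(variance(early) / len(early) + variance(late) / len(late))
    return (mean(early) - mean(late)) / standard_error


# Hypothetical seven-point Likert scores for overall EP (illustrative only).
early_wave = [5, 4, 6, 5, 3, 5]  # first respondents
late_wave = [4, 5, 5, 6, 4, 5]   # last respondents, used as a proxy for non-respondents

t_statistic = welch_t(early_wave, late_wave)
print(f"Welch t = {t_statistic:.3f}")
```

A t statistic close to zero, as in this invented example, would be read as no evidence of a difference between early and late responses.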
10.4 Findings and Analysis: Semi-structured Interviews
10.4.1 Method of Analysis
A content analysis was done to quantify the orally given data. Using this widely
applied method, items of qualitative data were systematically converted into
numerical data (Collis and Hussey 2014; Easterby-Smith et al. 2015).
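The conversion of coded statements into counts can be sketched as follows. This is only an illustration under stated assumptions: the code labels and the assignment of codes to firms A to D are hypothetical, chosen so that the totals agree with three of the frequencies reported in Table 10.4.

```python
from collections import Counter

# Hypothetical coding result: each interview reduced to the set of factor codes it mentions.
# The allocation of codes to firms is invented for illustration.
coded_interviews = {
    "A": {"made_in_germany", "illegal_contributions", "low_cost_competition"},
    "B": {"made_in_germany", "illegal_contributions"},
    "C": {"made_in_germany", "illegal_contributions", "low_cost_competition"},
    "D": {"made_in_germany"},
}


def count_mentions(coded):
    """Count in how many interviews each factor code occurs."""
    counts = Counter()
    for codes in coded.values():
        counts.update(codes)
    return counts


mentions = count_mentions(coded_interviews)
for code, n in mentions.most_common():
    print(code, n)
```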
10.4.2 Evaluated Macro-environmental Factors
The factors that interviewees named in response to an open question as influencing
EP are given in Table 10.4.
Table 10.2 Comparison of response rates

Sousa’s (2000) review of EP papers Sousa’s (2008) review of EP papers This study
30% 25% 10.96%
Table 10.3 Basic data of questionnaire participants

Total number of participants 58
Number of analyzed participants 41
Company size, range (number of 247.00
full-time employees)
Company size, mean (number of 129.62
full-time employees)
Industries Advertising Materials/Textiles; Agriculture; Architecture;
Automation Technology; Automotive; Building; Business
Services; Cables and Wires; Commodities Trading;
Construction; Consulting; Consumer Goods; Energy
(Services); Engineering; Export Trade; Finance; Food;
Health Services; Healthcare/Medical; ICT/Consulting; IT;
Management Consultancy; Manufacture of Welding
Consumables; Metalwork; Refrigeration and Air
Conditioning; Shipping; Solar energy; Toys; Trading
Key informants Area Sales Director—Africa and Middle East; Authorized
Officer; Chief Executive Officer; Export Manager; General
Coordinator Africa; Head of Department International;
Head of Sales Department Africa and Asia; Head of Sales
Department EMEA; International Business Development;
Managing Director; Market Development Manager Africa;
Marketing Director/Manager; President Region Africa;
Regional (Sales) Manager, Sales Director; Sales Manager;
Senior Manager; Shareholder; Speaker; VP International
Sales
Head offices All over Germany
Unit of analysis Firm
Company age, range (years) 174.00
Company age, mean (years) 63.02
Besides other factors such as export promotion programs and the prohibition of
bribery, German politics and the legal environment were also considered to have an
impact on EP. A survey by Transparency International (Hardoon 2013) shows that
bribery is a serious matter in Africa and that decision makers are willing to accept
such payments. For example, 54% of the 2207 households questioned in Ghana in
2013 said that they had paid bribes; politicians were described as corrupt by 76%
(Hardoon 2013). One interviewee stated that for this reason his
firm concentrated on private customers. Two others argued that such contributions
were illegal in all European countries, but that Germany was the only country where
the law was strictly enforced. France, among others, was said not to apply existing
legislation. In cultures where expensive presents express esteem and where decision
makers depend on special payments to support their families and tribes, German
companies have no chance of getting contracts. This supports O’Cass and Julian’s
study (2003) stating that legal and political decisions influence EP. Dean et al.
(2000) confirm that governmental agencies may support exports.
Table 10.4 Macro-factors which influence EP

Positive influence:
– Made in Germany (four times)
– Export promotion by the German government (once)
– Local conditions: Some countries are not able to coordinate projects by themselves
so they need companies specializing in offering turnkey projects (once)

Negative influence:
– Difficulties in finding partners to finance big projects (once)
– Contributions to decision makers are illegal in Germany, but especially offered by
firms based in other countries (three times)
– German politics does not consider the special characteristics of the region,
information level does not correspond with the current situation (once)
– German politics should support German producers by financing exports to the
region (once)
– Competition from China and other countries with cheaper products (twice)

The country of origin, referred to by all interviewees as influencing EP, has
previously been reported as relevant by Lacka and Stefko (2014). The difficulties in
finding partners to finance big projects have been addressed by Felbermayr and
Yalcin (2013). The identified competition from other countries matches the factor
‘market competitiveness’, reported as significant, for example, by Cadogan et al.
(2012), Lages and Montgomery (2005), and Navarro-García et al. (2015).
10.4.3 Evaluated Internal and Microenvironmental Factors
The variables that interviewees named in response to an open question as influencing
EP are given in Table 10.5.
The relevance of product quality falls in line with the importance of the product
strategy. Previously, O’Cass and Julian (2003) and Shoham et al. (2002) have
identified its significance for Australian firms, Lee and Griffith (2004) for South
Korea, and Piercy et al. (1997) for Britain.
The influence of price has been highlighted by various scholars such as Lado
et al. (2004), Morgan et al. (2004), and Sousa et al. (2014). However, Sousa and
Novello’s study (2014) found that there was no influence of the price strategy.
Market knowledge, as well as know-how and social competencies, emerged as
significant in studies by Kahiya and Dean (2014) and Ling-yee (2004).
Also, company size matters. Among others, Kahiya and Dean (2014) and Lado
et al. (2004) describe it as fundamental, and Lee and Griffith (2004) mention that a
certain size is necessary to export successfully. For example, one participant
mentioned that his firm as a medium-sized company concentrated on smaller
projects. There is no consensus, however, about its relevance. For instance, Lee and
Griffith (2004) and Stoian et al. (2011) could not prove any influence.
Table 10.5 Internal and micro-factors which influence EP

Positive influence:
– Concept of sustainability, for example, not only building a hospital but also
training employees and finding qualified staff (once)
– Continuous physical presence in the target market (twice)
– Network in the industrial sector in the firm’s home country (once)
– As a medium-sized company concentration on smaller projects (once)
– General willingness of the firm to deal with risks in Africa caused by insufficient
experience in the region (once)
– Cooperation with local partners (once)
– High local market knowledge (once)
– Company image (once)
– Employees: Know-how and social competence (twice)
– Competence not only to offer good quality, but also good prices (twice)
– Product quality (once)

Negative influence:
– Initially mistrust toward the region, it was necessary to build trust in different
departments such as risk control and accounting (once)

General willingness of firms to deal with the aspect of risk in Africa has not been
mentioned in previous studies.
Two participants said that time spent abroad, or rather continuous physical
presence in the target country, was essential. However, Stoian et al. (2011) could not
prove any relevance of this for Spanish exporters.
Employees’ fundamental mistrust toward SSA was mentioned as influencing EP
negatively. The attitude of employees toward a target market has been previously
researched by Nalcaci and Yagci (2014).
10.5 Findings and Analysis: Questionnaire
10.5.1 Method of Analysis
Data were imported from the online questionnaire provider into IBM SPSS. From
58 given datasets, 41 emerged as valid, once they were edited following Brase and
Brase (2010) and Pallant (2013). The included datasets fulfilled the mathematical
requirements for analysis and fit the target group. The screening:
– excluded firms with more than 250 employees (SME threshold),
– excluded unfinished datasets, and
– included only individuals who are involved in their firm’s exports to SSA.
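The three screening criteria can be expressed as a simple filter. The sketch below is illustrative only: the field names and the four sample records are hypothetical, and the actual editing was carried out in SPSS.

```python
from dataclasses import dataclass


@dataclass
class Response:
    employees: int                 # number of full-time employees
    completed: bool                # questionnaire filled in completely
    involved_in_ssa_exports: bool  # respondent is involved in exports to SSA


def is_valid(response, sme_threshold=250):
    """Apply the three screening criteria listed above."""
    return (
        response.employees <= sme_threshold
        and response.completed
        and response.involved_in_ssa_exports
    )


# Invented records, one per possible outcome of the screening.
responses = [
    Response(120, True, True),   # kept
    Response(400, True, True),   # dropped: above the SME threshold
    Response(80, False, True),   # dropped: unfinished dataset
    Response(35, True, False),   # dropped: not involved in exports to SSA
]
valid = [r for r in responses if is_valid(r)]
print(len(valid), "of", len(responses), "datasets retained")
```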
First, the absence of non-response bias was checked by an extrapolation procedure.
To isolate those regions of SSA where the results of the analysis were applicable, the
information provided by the participants was evaluated by means of descriptive
statistics. EP’s dependent variables were studied regarding their frequency and
possible bivariate correlations to ensure their validity for further analysis. Then the
independent variables were examined with Pearson’s correlation and Spearman’s rank
correlation. Finally, a stepwise multiple regression analysis was carried out for both
EP measurements.
10.5.2 Target Regions
Figure 10.1 gives the regions served by at least 20% of the participants’ firms.
Countries colored green (Ghana, Nigeria, and South Africa) enjoyed the
patronage of more than 60% of the German SMEs exporting to SSA. However, this
was almost equally true for the orange zone (Cameroon, Angola, Namibia,
Mozambique, Tanzania, Kenya, and Ethiopia), with a total of 50–60% of the
companies having export activities there.
Obviously, all areas colored in green and orange (except Ethiopia) are located by
the sea. German SMEs prefer exporting to countries that are easily accessible and
they avoid landlocked markets.
Fig. 10.1 Markets served by participants’ firms in SSA
Fig. 10.2 Overall EP–frequency distribution (n = 41)
Fig. 10.3 Export profitability–frequency distribution (n = 41)
10.5.3 Influence of Determinants on EP
Dependent variables
The dependent variables, overall EP and export profitability, were graded by
all participants on a scale from one (extremely dissatisfied) to seven (extremely
satisfied). The distributions are approximately symmetric and bell-shaped, suggesting
a normal distribution (Figs. 10.2 and 10.3).
To test the null hypothesis that there is no correlation between overall EP and
export profitability of German SMEs, a Pearson product-moment correlation
coefficient was calculated following Anderson et al. (2014). Since the significance
(2-tailed) is less than 0.05, the correlation is significant. The Pearson correlation
shows a strong positive relationship between the variables (r = 0.682), that is,
higher levels in one variable are associated with higher values in the other. The two
variables share 46.51% of their variance (r² = 0.682² ≈ 0.4651). In view of these
results, there is significant evidence to reject the formulated null hypothesis
(Pallant 2013) (Table 10.6).
Table 10.6 Pearson correlation with overall EP and export profitability

| | | Overall EP | Export profitability |
|---|---|---|---|
| Overall EP | Pearson correlation | 1 | 0.682** |
| | Sig. (2-tailed) | | 0.000 |
| | N | 41 | 41 |
| Export profitability | Pearson correlation | 0.682** | 1 |
| | Sig. (2-tailed) | 0.000 | |
| | N | 41 | 41 |

Note **Correlation is significant at the 0.01 level (2-tailed)
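The shared variance is the square of the Pearson coefficient. The following sketch shows both steps in plain Python; the two score lists are invented for illustration, and only the value r = 0.682 is taken from Table 10.6.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def shared_variance(r):
    """Proportion of variance the two variables have in common (r squared)."""
    return r ** 2


# Hypothetical seven-point Likert ratings (not the survey data).
overall_ep = [3, 4, 5, 4, 6, 2, 5]
export_profitability = [2, 4, 4, 5, 6, 3, 5]
r_example = pearson_r(overall_ep, export_profitability)

# Reported coefficient from Table 10.6.
print(round(shared_variance(0.682) * 100, 2))  # 46.51
```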
Independent variables
The participants were asked to grade the influence of different macro-factors on
their company’s EP in SSA from one (none) to seven (substantial). Internal and
microenvironmental factors were graded from one (much worse) to seven (much
better) in comparison with major competitors in the market.
Both measurements for EP were tested with each factor by a bivariate correlation
to describe the strength and direction of their relationship following Anderson et al.
(2014) and Pallant (2013). The following hypotheses were tested:
H0a: There is no correlation between overall EP and the ‘independent variable.’
H1a: There is a significant correlation between overall EP and the ‘independent
variable.’
H0b: There is no correlation between export profitability and the ‘independent
variable.’
H1b: There is a significant correlation between export profitability and the ‘inde-
pendent variable.’
In case of p < 0.05, the correlation is significant at the 0.05 level (2-tailed) and
H0 can be rejected. If p < 0.01, the correlation is even significant at the 0.01 level
(2-tailed) and H0 can be rejected (Pallant 2013). The relationships were charac-
terized depending on ‘r’ (Table 10.7).
Table 10.8 summarizes the results on internal and microenvironmental factors.
Table 10.9 summarizes the results on macro-environmental factors.
Table 10.7 Guidelines for interpreting the correlation coefficient based on Pallant (2013)

Small positive relation r = 0.10 to 0.29
Medium positive relation r = 0.30 to 0.49
Large positive relation r = 0.50 to 1.0
Small negative relation r = −0.10 to −0.29
Medium negative relation r = −0.30 to −0.49
Large negative relation r = −0.50 to −1.0
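The decision rules for significance and the bands of Table 10.7 can be combined into two small helper functions. This is a sketch with hypothetical function names; the label "negligible" for |r| below 0.10 and the treatment of values between 0.29 and 0.30 as "small" are assumptions, since the table does not cover them.

```python
def classify_r(r):
    """Label a correlation coefficient using the bands of Table 10.7."""
    size = abs(r)
    if size < 0.10:
        return "negligible"  # assumption: not covered by the table
    if size < 0.30:
        strength = "small"
    elif size < 0.50:
        strength = "medium"
    else:
        strength = "large"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} relation"


def significance_stars(p):
    """Return '**' for p < 0.01, '*' for p < 0.05, and '' otherwise."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Values taken from Tables 10.6 and 10.9.
print(classify_r(0.682), significance_stars(0.000))
print(classify_r(-0.326), significance_stars(0.037))
```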
Table 10.8 Pearson correlation/Spearman of internal and microenvironmental factors with
overall EP/export profitability
Overall EP Export profitability
Age of firm in years Pearson correlation 0.010 −0.052
Sig. (2-tailed) 0.948 0.748
Total number of full-time employees Pearson correlation −0.008 −0.009
Sig. (2-tailed) 0.959 0.954
Years your firm has been exporting Pearson correlation 0.074 −0.020
in general Sig. (2-tailed) 0.646 0.901
Years your firm has been exporting Pearson correlation 0.161 0.041
to sub-Saharan Africa Sig. (2-tailed) 0.314 0.797
Number of languages spoken in the Pearson correlation 0.019 0.093
export department (fluently or better) Sig. (2-tailed) 0.905 0.562
Number of countries in the Pearson correlation 0.183 0.178
sub-Saharan Africa region that your Sig. (2-tailed) 0.253 0.266
company serves
Adaptation of product strategy to the Pearson correlation 0.333* 0.470**
markets of sub-Saharan Africa Sig. (2-tailed) 0.034 0.002
Adaptation of price strategy to the Pearson correlation 0.168 0.317*
markets of sub-Saharan Africa Sig. (2-tailed) 0.294 0.044
Adaptation of promotion strategy to Pearson correlation 0.280 0.456**
the markets of sub-Saharan Africa Sig. (2-tailed) 0.076 0.003
Adaptation of distribution strategy to Pearson correlation 0.338* 0.491**
the markets of sub-Saharan Africa Sig. (2-tailed) 0.031 0.001
Firm characteristics: federal state the Spearman’s rho 0.066 −0.135
company is located in Correlation
coefficient
Sig. (2-tailed) 0.681 0.399
Firm characteristics: company’s Pearson correlation 0.219 0.357*
image in sub-Saharan Africa is … Sig. (2-tailed) 0.169 0.022
Firm characteristics: willingness to Pearson Correlation 0.463** 0.542**
deal with risks in sub-Saharan Africa Sig. (2-tailed) 0.002 0.000
caused by insufficient experience in
the region is …
Firm characteristics: product/service Pearson Correlation −0.001 0.187
quality Sig. (2-tailed) 0.996 0.241
Firm characteristics: product/service Pearson correlation 0.034 0.157
sustainability Sig. (2-tailed) 0.831 0.326
Firm characteristics: our firm keeps Pearson correlation 0.322* 0.282
up to date with relevant export Sig. (2-tailed) 0.040 0.074
market information
Firm characteristics: research and Pearson correlation 0.076 0.119
development Sig. (2-tailed) 0.635 0.458
Firm characteristics: resources in Pearson correlation 0.093 0.257
managerial, financial, and staff Sig. (2-tailed) 0.564 0.105
endowments
Managerial characteristics and Pearson correlation 0.033 0.135
relationships: network in the Sig. (2-tailed) 0.837 0.401
industrial sector in the home country
Managerial characteristics and Pearson correlation 0.086 0.319*
relationships: export commitment Sig. (2-tailed) 0.591 0.042
and support
Managerial characteristics and Pearson correlation 0.100 0.350*
relationships: international business Sig. (2-tailed) 0.533 0.025
knowledge
Managerial characteristics and Pearson correlation 0.356* 0.426**
relationships: social competencies Sig. (2-tailed) 0.022 0.006
Managerial characteristics and Pearson correlation 0.242 0.399**
relationships: access to information Sig. (2-tailed) 0.128 0.010
about foreign market/opportunities
Managerial characteristics and Pearson correlation 0.267 0.518**
relationships: attitude toward the Sig. (2-tailed) 0.091 0.001
region in involved departments
Relationship with foreign Pearson correlation 0.377* 0.198
intermediaries: Sig. (2-tailed) 0.015 0.216
commitment/cooperation with
intermediaries
intermediaries: trust in intermediaries Sig. (2-tailed) 0.021 0.123
intermediaries: information exchange Sig. (2-tailed) 0.048 0.377
Relationship with foreign Pearson correlation 0.290 0.094
intermediaries: output control Sig. (2-tailed) 0.066 0.557
intermediaries: process control Sig. (2-tailed) 0.067 0.358
Relationship with foreign Pearson correlation 0.423** 0.296
intermediaries: flexibility Sig. (2-tailed) 0.006 0.060
Relationship with foreign Pearson correlation 0.058 −0.044
intermediaries: relative dependence Sig. (2-tailed) 0.720 0.786
on intermediaries
intermediaries: integration Sig. (2-tailed) 0.092 0.584
Relationships with customers and Pearson correlation 0.018 −0.033
customer characteristics: need of Sig. (2-tailed) 0.913 0.836
bribery to get contracts
Relationships with customers and Pearson correlation 0.353* 0.387*
customer characteristics: continuous Sig. (2-tailed) 0.024 0.012
physical presence in the foreign
market
Relationships with customers and Pearson correlation 0.332* 0.479**
customer characteristics: price Sig. (2-tailed) 0.034 0.002
sensitivity of customers regarding
product/service
Relationships with customers and Pearson correlation 0.141 0.483**
customer characteristics: customer Sig. (2-tailed) 0.380 0.001
sensitivity concerning product
origin/image of company’s home
country …
Relationships with customers and Pearson correlation −0.135 0.156
customer characteristics: power of Sig. (2-tailed) 0.399 0.330
customers
Relationships with customers and Pearson correlation 0.263 0.496**
customer characteristics: developing Sig. (2-tailed) 0.096 0.001
and maintaining relationships with
export customers
Concerning your exports to Spearman’s rho −0.239 −0.242
sub-Saharan Africa: do you sell more Correlation
proactively or reactively? coefficient
Sig. (2-tailed) 0.132 0.128
Do you provide after sales services? Spearman’s rho 0.305 0.384*
correlation
Coefficient
Sig. (2-tailed) 0.052 0.013
Note
**Correlation is significant at the 0.01 level (2-tailed)
*Correlation is significant at the 0.05 level (2-tailed) n = 41
10.5.4 Multiple Regression Analysis with Dependent Factor of Overall EP
Stepwise multiple regressions were carried out using SPSS. As suggested by
Anderson et al. (2014), for all multiple regressions an alpha of 0.05 was used to add
and 0.10 to remove determinants. Further, the appropriateness of the procedure was
checked with regard to sample size (at least 40 participants), multicollinearity and
singularity, the influence of outliers, and normality and linearity (Pallant 2013).
Table 10.10 gives details about the variables selected for the stepwise multiple
regression analysis. Three different models with either one, two or three indepen-
dent variables were constructed.
Table 10.9 Pearson correlation/Spearman of macro-environmental factors with overall EP/export
profitability
Overall EP Export profitability
Germany: availability of export Pearson correlation −0.186 −0.046
financing programs Sig. (2-tailed) 0.244 0.776
Germany: availability of export Pearson correlation −0.310* −0.053
guarantees Sig. (2-tailed) 0.048 0.741
Germany: offset agreements between Pearson correlation 0.044 0.215
Germany and SSA Sig. (2-tailed) 0.786 0.176
Germany: export assistance Pearson correlation −0.101 −0.061
Sig. (2-tailed) 0.530 0.705
Germany: home country’s legal Pearson correlation 0.027 −0.162
environment Sig. (2-tailed) 0.867 0.312
Germany: home country’s political Pearson correlation −0.035 −0.119
influence Sig. (2-tailed) 0.826 0.459
SSA: environmental turbulences Pearson correlation −0.233 −0.212
Sig. (2-tailed) 0.143 0.183
SSA: local partners to finance Pearson correlation 0.020 0.075
projects Sig. (2-tailed) 0.899 0.641
SSA: bribery to fulfill contract Pearson correlation −0.089 −0.283
obligations Sig. (2-tailed) 0.578 0.073
SSA: customs and tariffs Pearson correlation −0.138 −0.417**
Sig. (2-tailed) 0.390 0.007
SSA: ecological environment Pearson correlation −0.326* −0.220
Sig. (2-tailed) 0.037 0.167
SSA: economic policies Pearson correlation −0.153 0.068
Sig. (2-tailed) 0.338 0.674
SSA: foreign exchange rate Pearson correlation 0.174 0.301
Sig. (2-tailed) 0.275 0.056
SSA: legal influences Pearson correlation −0.025 0.161
Sig. (2-tailed) 0.875 0.316
SSA: political influences Pearson correlation 0.100 0.238
Sig. (2-tailed) 0.536 0.134
SSA: social environment Pearson correlation −0.043 −0.036
Sig. (2-tailed) 0.789 0.825
SSA: technical environment Pearson correlation −0.095 −0.155
Sig. (2-tailed) 0.554 0.334
SSA: GDP Pearson correlation 0.018 −0.083
Sig. (2-tailed) 0.910 0.604
SSA: infrastructure Pearson correlation 0.118 0.083
Sig. (2-tailed) 0.462 0.605
SSA: level of competition Pearson correlation 0.391* 0.074
Sig. (2-tailed) 0.011 0.646
SSA: psychic distance Pearson correlation 0.021 0.119
Sig. (2-tailed) 0.899 0.457
SSA: market distance Pearson correlation −0.087 0.013
Sig. (2-tailed) 0.588 0.937
SSA: mining/export of oil and rare Pearson correlation 0.081 0.069
earth elements Sig. (2-tailed) 0.613 0.670
SSA: regulations Pearson correlation 0.157 0.089
Sig. (2-tailed) 0.327 0.579
Note
**Correlation is significant at the 0.01 level (2-tailed)
*Correlation is significant at the 0.05 level (2-tailed) n = 41
Table 10.10 Variables entered/removed during the stepwise multiple regression analysis
(dependent factor overall EP)

| Model | Variables entered | Variables removed | Method |
|---|---|---|---|
| 1 | Firm characteristics: willingness to deal with risks in sub-Saharan Africa caused by insufficient experience in the region is … | . | Stepwise (criteria: probability-of-F-to-enter 0.050, probability-of-F-to-remove 0.100) |
| 2 | SSA: level of competition | . | |
| 3 | SSA: ecological environment | . | |

As recommended by Pallant (2013), for relatively small sample sizes the model
summary is evaluated regarding the adjusted R square which helps understand the
degree to which each model represents the variance of the dependent variable. It
turns out that Model 1 explains 19.4%; Model 2, 31.9%; and Model 3, 41.1% of the
variance of overall EP. See Table 10.11.
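The adjusted R square corrects R square for the number of predictors k and the sample size n via 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below recomputes the three values from the multiple correlation R in Table 10.11, under the assumption that all 41 analyzed cases enter each model.

```python
def adjusted_r2(r, n, k):
    """Adjusted R^2 from the multiple correlation R, sample size n, and k predictors."""
    r_squared = r ** 2
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


N = 41  # assumption: no cases were dropped within the regression

# Model number -> (R from Table 10.11, number of predictors)
models = {1: (0.463, 1), 2: (0.594, 2), 3: (0.675, 3)}

adjusted = {m: round(adjusted_r2(r, N, k), 3) for m, (r, k) in models.items()}
print(adjusted)  # {1: 0.194, 2: 0.319, 3: 0.411}
```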
To determine the statistical significance of the three models, the ANOVA tables
were checked. All three models reached an overall statistical significance since in
each case p < 0.01 (Pallant 2013). In each of the three models, all independent
variables had a significance value of <0.05. This indicates that all variables made a
significant statistical contribution to the prediction of overall EP (Pallant 2013).
According to Pallant (2013), an adjusted R square of 0.411 for Model 3 is quite a
respectable result since it explains 41.1% of the variance in overall EP. The third
factor, ecological environment in SSA, has a part-correlation coefficient of −0.32.
Its squared value, 0.1024, indicates that 10.24% of the variance in overall EP is
uniquely attributable to the ecological environment. The same procedure shows that
the level of competition makes a unique contribution of 11.09% and willingness to
deal with risks in SSA one of 22.09% (Pallant 2013; Tabachnick and Fidell 2013).
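The squaring step described above can be illustrated with a minimal Python sketch. It is not part of the original SPSS analysis; the function names are hypothetical. Only the part correlation of −0.32 is reported in the text; the magnitudes for the other two factors are back-derived from the reported percentages and are an inference, with their signs left open.

```python
def unique_contribution(part_correlation: float) -> float:
    """Share of variance uniquely explained: the squared part correlation."""
    return part_correlation ** 2


def implied_part_correlation(unique_share: float) -> float:
    """Magnitude (sign unknown) of the part correlation implied by a unique share."""
    return unique_share ** 0.5


# Reported in the text: part correlation of the ecological environment.
print(f"{unique_contribution(-0.32):.4f}")  # 0.1024

# Back-derived magnitudes (an inference, not values reported in the chapter).
for factor, share in {"level of competition": 0.1109,
                      "willingness to deal with risks": 0.2209}.items():
    print(factor, round(implied_part_correlation(share), 3))
```

Because the three unique contributions are computed from part correlations, they need not add up to the total R square; variance shared among the predictors is counted in R square but in none of the unique shares.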
Table 10.11 Model summary of stepwise multiple regression analysis with dependent factor
overall EP
Model R R² Adjusted R² Std. error of the estimate
1 0.463a 0.215 0.194 1.214
2 0.594b 0.353 0.319 1.116
3 0.675c 0.455 0.411 1.038
Note Significant at 1% (a), 5% (b) and 10% (c) levels of significance
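As a rough consistency check on Table 10.11, the adjusted R square can be recomputed from R with the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below is illustrative only. It assumes that the regression used n = 41 cases (the n given in the note to the preceding correlation table) and that Models 1 to 3 contain one, two, and three predictors, as Table 10.10 suggests; the function name is hypothetical.

```python
def adjusted_r_squared(r_squared: float, n_cases: int, n_predictors: int) -> float:
    """Standard adjusted R square: 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)


N_CASES = 41  # assumption: taken from the note to the correlation table (n = 41)

# (model, multiple R, number of predictors, adjusted R square reported in Table 10.11)
MODELS = [(1, 0.463, 1, 0.194), (2, 0.594, 2, 0.319), (3, 0.675, 3, 0.411)]

for model, multiple_r, k, reported in MODELS:
    recomputed = adjusted_r_squared(multiple_r ** 2, N_CASES, k)
    print(f"Model {model}: recomputed {recomputed:.3f}, reported {reported:.3f}")
```

Under these assumptions the recomputed values agree with the reported ones to three decimals, which supports reading n = 41 as the sample size of the regression.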
Following Tabachnick and Fidell (2013), the regression equation was formulated
using the unstandardized coefficients B from Model 3. The regression equation
for overall EP is:
Y = b_0 + b_1 x_1 + b_2 x_2 - b_3 x_3
where
Y Overall EP (seven-point Likert scale)
b_0 Constant (intercept)
x_1 Willingness to deal with risks in SSA (seven-point Likert scale)
x_2 Level of competition in SSA (seven-point Likert scale)
x_3 Ecological environment in SSA (seven-point Likert scale)
Overall EP = 1.111
+ 0.501 × Willingness to deal with risks in SSA
+ 0.281 × Level of competition in SSA
- 0.233 × Ecological environment in SSA
Since the values were entered on a seven-point Likert scale, the results are expressed
on this scale as well. The equation shows that the managers' willingness to deal with
risks had the greatest positive influence on overall EP: an increase of one point on
the Likert scale was associated with an increase of 0.501 Likert points in overall EP.
Since this factor has not been researched before, no comparisons with existing
literature can be made.
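The fitted equation for overall EP can be evaluated with a short Python sketch. This is an illustration rather than part of the original analysis: the function name and the example scores are hypothetical, and the constant and coefficients are taken as printed in the equation (the sign of the constant is assumed to be positive).

```python
INTERCEPT = 1.111  # constant as printed; assumed positive
COEFFICIENTS = {
    "willingness": 0.501,   # willingness to deal with risks in SSA
    "competition": 0.281,   # level of competition in SSA
    "ecological": -0.233,   # ecological environment in SSA
}


def predict_overall_ep(willingness: float, competition: float, ecological: float) -> float:
    """Predicted overall EP (Likert points) from three seven-point Likert scores."""
    return (INTERCEPT
            + COEFFICIENTS["willingness"] * willingness
            + COEFFICIENTS["competition"] * competition
            + COEFFICIENTS["ecological"] * ecological)


# Hypothetical respondent scoring 4 on every item, then 5 on willingness.
baseline = predict_overall_ep(4, 4, 4)
higher_willingness = predict_overall_ep(5, 4, 4)
print(round(baseline, 3))                       # 3.307
print(round(higher_willingness - baseline, 3))  # 0.501
```

The second printed value reproduces the interpretation given in the text: holding the other two factors constant, one additional Likert point of willingness changes the prediction by the coefficient 0.501.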
The level of competition in SSA also had a positive influence on the dependent
factor: an increase of one point was associated with an increase of 0.281 Likert
points. This confirms the work of Matanda and Freeman (2009) and Sousa and Novello
(2014), who identified a positive relation. However, Cadogan et al. (2012), Lee and
Griffith (2004), and Navarro-García et al. (2015) found a negative relation in their
research.
The ecological environment had the smallest (yet negative) influence. Higher
ecological standards resulted in a lower overall EP; an improvement by one Likert
point was associated with a decrease of 0.233. Again, a comparison with existing
literature is not possible since this factor, which emerged during the semi-structured
interviews, has not been researched before.
228 J.O. Bockmann
10.5.5 Multiple Regression Analysis with Dependent Factor Export Profitability
Table 10.12 shows the variables that were selected during the stepwise multiple
regression analysis.
To confirm statistical significance, the ANOVA tables were checked again. Model 11,
which explains 80.4% of the variance in export profitability, was selected since it
had the highest adjusted R square (Pallant 2013) (Table 10.13).
Following Tabachnick and Fidell (2013), the subsequent regression equation
was formulated based on the unstandardized coefficient B. Regression equation for
export profitability:
Y = b_0 + b_1 x_1 - b_2 x_2 + b_3 x_3 + b_4 x_4 - b_5 x_5 + b_6 x_6 + b_7 x_7 + b_8 x_8 + b_9 x_9
where
Y Export profitability (seven-point Likert scale)
b_0 Constant (intercept)
x_1 Customer sensitivity for product origin (seven-point Likert scale)
x_2 Customs and tariffs in SSA (seven-point Likert scale)
x_3 Psychic distance (seven-point Likert scale)
x_4 Adaptation of product strategy (seven-point Likert scale)
x_5 Network in industrial sector in home country (seven-point Likert scale)
x_6 Updating with market information (seven-point Likert scale)
x_7 Foreign exchange rate (seven-point Likert scale)
x_8 Research and development (seven-point Likert scale)
x_9 Dependence on intermediaries (seven-point Likert scale)
Export Profitability = 2.228
+ 0.420 × Customer sensitivity for product origin
- 0.402 × Customs and tariffs in SSA
+ 0.351 × Psychic distance
+ 0.580 × Adaptation of product strategy
- 0.566 × Network in industrial sector in home country
+ 0.388 × Updating with market information
+ 0.271 × Foreign exchange rate
+ 0.181 × Research and development
+ 0.172 × Dependence on intermediaries
The positive influence of customer sensitivity to product origin, which was mentioned
during the interviews, was previously confirmed for Poland by Lacka and Stefko (2014).
Table 10.12 Variables entered/removed during stepwise multiple regression analysis (dependent
factor export profitability)
Model | Variables entered | Variables removed | Method
1 | Firm characteristics: willingness to deal with risks in sub-Saharan Africa caused by insufficient experience in the region is … | . | Stepwise (criteria: probability-of-F-to-enter 0.050, probability-of-F-to-remove 0.100)
2 | Relationships with customers and customer characteristics: customer sensitivity concerning product origin/image of company’s home country … | . |
3 | SSA: customs and tariffs | . |
4 | SSA: psychic distance | . |
5 | Adaptation of product strategy to the markets of sub-Saharan Africa | . |
6 | Managerial characteristics and relationships: network in the industrial sector in home country | . |
7 | Firm characteristics: our firm keeps up to date with relevant export market information | . |
8 | SSA: foreign exchange rate | . |
9 | . | Firm characteristics: willingness to deal with risks in sub-Saharan Africa … |
10 | Firm characteristics: research and development | . |
11 | Relationship with foreign intermediaries: Relative dependence on intermediaries | . |
The negative influence of customs and tariffs in SSA confirms the results from
the semi-structured interviews. Although Baldauf et al. (2000) consider this factor
to have a neutral influence, most researchers (e.g., Fugazza and McLaren 2014;
Jordan 2014; Kahiya and Dean 2014) have found a negative influence.
Table 10.13 Model summary of stepwise multiple regression analysis with dependent factor
export profitability
Model R R square Adjusted R square Std. error of the estimate
1 0.542a 0.294 0.276 1.094
2 0.637b 0.406 0.375 1.016
3 0.713c 0.509 0.469 0.936
4 0.776d 0.602 0.558 0.854
5 0.808e 0.652 0.603 0.810
6 0.834f 0.695 0.641 0.770
7 0.877g 0.769 0.720 0.680
8 0.894h 0.799 0.749 0.644
9 0.889i 0.790 0.745 0.648
10 0.907j 0.823 0.779 0.605
11 0.921k 0.848 0.804 0.569
Note Significant at 1% (a), 5% (b) and 10% (c) levels of significance
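The selection rule used in Sect. 10.5.5 (choose the model with the highest adjusted R square) can be sketched in a few lines of Python. The values are those reported in Table 10.13; the function names are hypothetical and the sketch is only an illustration of the rule, not the procedure SPSS applies when entering or removing variables.

```python
# (model, R square, adjusted R square) as reported in Table 10.13
TABLE_10_13 = [
    (1, 0.294, 0.276), (2, 0.406, 0.375), (3, 0.509, 0.469), (4, 0.602, 0.558),
    (5, 0.652, 0.603), (6, 0.695, 0.641), (7, 0.769, 0.720), (8, 0.799, 0.749),
    (9, 0.790, 0.745), (10, 0.823, 0.779), (11, 0.848, 0.804),
]


def select_best_model(rows):
    """Return the row with the highest adjusted R square."""
    return max(rows, key=lambda row: row[2])


def adjusted_r2_changes(rows):
    """Change in adjusted R square from each model to the next, keyed by model number."""
    return {curr[0]: round(curr[2] - prev[2], 3) for prev, curr in zip(rows, rows[1:])}


best_model, _, best_adjusted = select_best_model(TABLE_10_13)
print(best_model, best_adjusted)             # 11 0.804
print(adjusted_r2_changes(TABLE_10_13)[9])   # -0.004
```

The small negative change at Model 9 corresponds to the step in Table 10.12 at which a variable was removed rather than entered.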
According to the regression equation, psychic distance has a positive influence
on export profitability. The same effect has been established for other regions, for
example, by Lee and Griffith (2004), Sousa et al. (2014) and Stoian et al. (2011).
The positive influence of the adaptation of product strategy confirms Lado et al.
(2004), Lee and Griffith (2004), and Shoham et al.’s (2002) results. For the regions
they researched, they found a positive influence of this factor. However, Freeman
and Styles’ (2014) research about Australian firms showed a neutral influence. This
indicates that the factor adaptation of product strategy may have a positive influence
in some regions and is relevant for German SMEs which target SSA.
The negative impact of networking activities in the industrial sector cannot be
explained. Since this factor, mentioned during the semi-structured interviews, has
not been researched before, no comparisons with existing literature are possible.
In the regression equation, updating with market information has a positive
influence. Recently, Freeman and Styles (2014) also found a positive effect on
EP.
The positive influence of research and development falls in line with Kahiya and
Dean’s (2014) findings.
Wierts et al. (2014) substantiated a positive influence of the foreign exchange
rate on EP. The regression equation for export profitability confirms this.
However, Baldauf et al. (2000), Lacka and Stefko (2014), and Jordan (2014)
concluded that the foreign exchange rate had no significant influence.
Among all the positive relations, the positive influence of dependence on
intermediaries is interesting. According to Porter’s five forces, an increasing
dependence on intermediaries should instead be negative (Porter 2014). In SSA,
however, there is an unpredictable environment where local partners safeguard and
increase the chances of getting business. The price to pay is dependence (Foly 2013).
Similar to the regression equation for overall EP, all values were entered and
presented on a seven-point Likert scale.
10.5.6 Comparison of Multiple Regression Analyses Results on Overall EP and Export Profitability
Both analyses indicate that the willingness to deal with risks in SSA has a high
impact on the dependent variables. All three models constructed with overall EP as
the dependent variable include this factor, whereas the models for export
profitability exclude it from Model 9 onwards (see Table 10.12). All other variables
included in the two sets of models differ. Therefore, decision makers wanting to
influence EP need to distinguish between targeting overall EP and targeting export
profitability, and choose suitable strategies accordingly. These findings tally with
suggestions made by, for example, Sousa et al. (2008), Stoian et al. (2011), and
Wheeler et al. (2008) that different measurements of EP are necessary for adequate
results.
10.6 Conclusion
Sousa et al. (2008) name EP as one of the most widely researched but least
understood areas of international marketing. Our paper, which specifically analyzes
the EP of German SMEs targeting SSA, contributes to knowledge in this field and fills
a research gap. It carried out and evaluated a comprehensive literature review,
semi-structured interviews, and a questionnaire survey. New questions were
identified, such as why German SMEs tend to prefer exporting to countries with direct
access to the sea.
The results indicate that SSA has specific requirements for successful exports
which differ from those of other regions. This knowledge enables managers and
policymakers to improve trade relations and to enhance their businesses.
10.7 Further Research
In order to generalize the findings, as in the cases of Sousa et al. (2014), Stoian
et al. (2011), and Styles (2014), we suggest that the scope of work be extended to
additional home markets as well as to other foreign countries and regions. Since our
paper evaluated the whole of SSA without considering country specifics, additional
research focusing on individual target markets within SSA is desirable. Another
shortcoming of this paper is that it covers only a specific time frame.
Longitudinal studies of German SMEs targeting SSA would be useful for
gaining further insights into their EP. It would also be useful to research individual
industries rather than several industries together, to find out whether particular
criteria need to be considered (Stoian et al. 2011). Although there is no academic
limit to the number of independent and dependent variables for further analysis, two
concrete ideas can be derived from the suggestions made by respondents. They said
that ‘area competitiveness of German industry should be analyzed more deeply, also
with regard to raw materials’ and raised the aspect of ‘local content.’ However,
these valuable aspects were not included in our study since the respective
questionnaires were received after data collection had been completed.
The collected data indicate that German SMEs tend to export to a limited number of
countries in SSA. They seem to be attracted to regions with direct access to
the sea. Additional research should be done to identify the reasons for this
preference.
References
Akyol A, Akehurst G (2003) An investigation of EP variations related to corporate export market
orientation. Eur Bus Rev 15(1):5–19
Anderson DR, Sweeney DJ, Williams TA, Freeman J, Shoesmith E (2014) Statistics for business
and economics. Cengage Learning, Hampshire, p 3
Armstrong JS, Overton TS (1977) Estimating non-response bias in mail surveys. J Mark Res 14
(3):396–402
Baldauf A, Cravens DW, Wagner U (2000) Examining determinants of EP in small open
economies. J World Bus 35(1):61–79
Barney J (2002) Gaining and sustaining competitive advantage. Prentice Hall, New Jersey
Bijmolt THA, Zwart PS (1994) The impact of internal factors on the export success of Dutch small
and medium-sized firms. J Small Bus Manage 32(2):69–83
Boly A, Coniglio N, Prota F, Seric A (2014) Diaspora investments and firm EP in selected
sub-Saharan African countries. World Dev 59:422–433
Brase CH, Brase CP (2010) Understanding basic statistics. Brooks/Cole Cengage Learning,
Belmont
Cadogan JW, Diamantopoulos A, Siguaw J (2002) Export market-oriented activities: Their
antecedents and performance consequences. J Int Bus Stud 33(3):615–626
Cadogan JW, Sundqvist S, Puumalainen K, Salminen RT (2012) Strategic flexibilities and EP: the
moderating roles of export market-oriented behavior and the export environment. Eur J Mark
46(10):1418–1452
Castellacci F, Fevolden A (2014) Capable Companies or changing markets? Explaining the EP of
firms in the defence industry. Def Peace Econ, pp. 1–27
Catalog Sources World Development Indicators (2015) GDP (current US$). The World Bank
Group
Cavusgil ST, Zou S (1994) Marketing strategy-performance relationship: an investigation of the
empirical link in export market ventures. J Marketing 58(1):1–21
Collis J, Hussey R (2013) Business research: a practical guide for undergraduate and postgraduate
students, 4th edn. Palgrave Macmillan, Basingstoke
Collis J, Hussey R (2014) Business research: a practical guide for undergraduate and postgraduate
students, 4. Palgrave Macmillan, Basingstoke
Conner KR, Prahalad CK (1996) A resource based theory of the firm: knowledge versus
opportunism. Organ Sci 7:477–501
Cooper DR, Schindler PS (2014) Business research methods, vol 12. McGraw-Hill Education,
New York
Czinkota MR (1994) A national export assistance policy for new and growing businesses. J Int
Marketing 2(1):91–101
Dean DL, Mengüç B, Myers CP (2000) Revisiting firm characteristics, strategy, and EP
relationship. Ind Mark Manage 29(5):461–477
Dess GG, Robinson RB (1984) Measuring organizational performance in the absence of objective
measures: the case of privately-held firm and conglomerate business unit. Strateg Manag J
5:265–273
Dhanaraj C, Beamish PW (2003) A resource-based approach to the study of EP. J Small Bus
Manage 41(3):242–261
Easterby-Smith M, Thorpe R, Jackson P (2015) Management research, 5th edn. Sage Publications
Ltd, London
Felbermayr GJ, Yalcin E (2013) Export credit guarantees and EP: an empirical analysis for
Germany. World Econ 36(8):967–999
Fevolden AM, Herstad SJ, Sandven T (2015) Specialist supplier or systems integrator? The
relationship between competencies and EP in the Norwegian defence industry. Appl Econ Lett
22(2):153–157
Foly C (2013) Chancenkontinent Afrika. Bundesverband der Deutschen Industrie e.V, Berlin.
Available at: http://www.bdi.eu/images_content/GlobalisierungMaerkteUndHandel/BDI-
Umfrage_SSA.pdf (online)
Freeman J, Styles C (2014) Does location matter to EP? Int Mark Rev 31(2):181–208
Fugazza M, Mclaren A (2014) Market access, EP and survival: evidence from Peruvian Firms. Rev
Int Econ 22(3):599–624
Hardoon H (2013) Global corruption barometer. Transparency International, Berlin
Jordan AC (2014) The impact of trade facilitation factors on South Africa’s exports to a selection
of African countries. Dev Southern Afr 31(4):591–605
Kahiya ET, Dean DL (2014) EP: multiple predictors and multiple measures approach. Asia Pac J
Marketing Logistics 26(3):378–407
Katsikea CS, Leonidas S, Leonidou C, Morgan NA (2000) Firm-level EP assessment: review,
evaluation, and development. J Acad Mark Sci 28(4):493–511
Katsikea E, Theodosiou M, Morgan RE (2007) Managerial, organizational, and external drivers of
sales effectiveness in export market ventures. J Acad Mark Sci 35(2):270–283
Lacka I, Stefko O (2014) Key factors for development of export in Polish food sector. Organizacija
47(2):107–115
Lado N, Martínez-Ros E, Valenzuela A (2004) Identifying successful marketing strategies by
export regional destination. Int Mark Rev 21(6):573–597
Lages LF, Lages CR (2004) The ‘STEP’ scale. A measure of short term EP improvement. J Int
Marketing 12(1):36–56
Lages LF, Montgomery DB (2005) The relationship between export assistance and performance
improvement in Portuguese export ventures: an empirical test of the mediating role of pricing
strategy adaptation. Eur J Mark 39(7/8):755–784
Lee C, Griffith D (2004) The marketing strategy-performance relationship in an export-driven
developing economy: a Korean illustration. Int Mark Rev 21(3):321–334
Leonidou LC (1995) Export barriers: non-exporters’ perceptions. Int Mark Rev 12:4–25
Leonidou LC, Katsikea CS (2010) Integrative assessment of exporting research articles in business
journals during the period 1960–2007. J Bus Res 63(8):879–887
Leonidou LC, Katsikea CS, Samiee S (2002) Marketing strategy determinants of EP: a
meta-analysis. J Bus Res 55:51–67
Ling Yee L (2004) An examination of the foreign market knowledge of exporting firms based in
the People’s Republic of China: its determinants and effect on export intensity. Ind Mark
Manage 33(7):561–572
Matanda MJ, Freeman S (2009) Effect of perceived environmental uncertainty on
exporter-importer inter-organisational relationships and EP improvement. Int Bus Rev 18
(1):89–107
Merkel A (2014) Speech by Federal Chancellor Angela Merkel at the reception for the diplomatic
corps at the federal chancellery. In Federal Chancellor of Germany. The Press and Information
Office of the Federal Government, Berlin, p. 1
MoAE (2015). German Mittelstand: Motor der deutschen Wirtschaft. Bundesministerium für
Wirtschaft und Energie (BMWi) [Federal Ministry for Economic Affairs and Energy],
Öffentlichkeitsarbeit. Available at:
http://www.bmwi.de/BMWi/Redaktion/PDF/Publikationen/factbook-german-mittelstand,property=pdf,bereich=bmwi2012,sprache=de,rwb=true.pdf
(online)
Morgan NA, Kalek A, Katsikea CS (2004) Antecedents of export venture performance: a
theoretical model and empirical assessment. J Marketing 68(1):90–108
Nalcaci G, Yagci MI (2014) The effects of marketing capabilities on EP using resource-based
view: assessment on manufacturing companies. Proc Soc Behav Sci 148:671–679
Navarro A, Acedo FJ, Robson MJ, Ruzo E, Losada F (2010) Antecedents and consequences of
firms’ export commitment: an empirical study. J Int Marketing 18(3):41–61
Navarro-García A, Schmidt ACM, Rey-Moreno M (2015) Antecedents and consequences of
export entrepreneurship. J Bus Res 68(7):1532–1538
O’Cass A, Julian C (2003) Examining firm and environmental influences on export marketing mix
strategy and EP of Australian exporters. Eur J Mark 37(3/4):366–384
Pallant J (2013). SPSS survival manual—a step by step guide to data analysis using IBM SPSS,
vol 5. Mc Graw-Hill, Berkshire
Papadopoulos N, Martín-Martín P (2010) Toward a model of the relationship between
internationalization and EP. Int Bus Rev 19(4):388–406
Piercy NF, Cravens DW, Katsikea S (1997) Examining the role of buyer-seller relationships in
EP. J World Bus 32(1):73–86
Porter ME (2014) Wettbewerbsvorteile—Spitzenleistungen erreichen und behaupten, 3rd edn.
Campus Verlag, Frankfurt
Rambocas M, Meneses R, Monteiro C, Brito PQ (2015) Direct or indirect channel structures.
Evaluating the impact of channel governance structure on EP. Int Bus Rev 24(1):124–132
Riddle L (2008) Diasporas: exploring their development potential. ESR Rev 10:28–35
Robertson C, Chetty SK (2000) A contingency-based approach to understanding EP. Int Bus Rev
9(2):211–235
Saunders M (2012) Choosing research participants. In: Symons G, Cassell C (eds) The practice of
qualitative organizational research: core methods and current challenges. Sage Publications,
London, pp 37–55
Shoham A, Evangelista F, Albaum G (2002) Strategic firm type and EP. Int Mark Rev
19(3):236–258
Singh H, Mahmood R (2014) Aligning manufacturing strategy to EP of manufacturing small and
medium enterprises in Malaysia. Proc Soc Behav Sci 130:85–95
Sousa CMP (2004) EP measurement: an evaluation of the empirical research in the literature. Acad
Marketing Sci Rev 2004(9):1–22
Sousa CMP, Martínez-López FJ, Coelho F (2008) The determinants of EP: a review of the
research in the literature between 1998 and 2005. Int J Manag Rev 10(4):343–374
Sousa CMP, Novello S (2014) The influence of distributor support and price adaptation on the EP
of small and medium-sized enterprises. Int Small Bus J 32(4):359–385
Sousa CMP, Lengler JFB, Martínez-López FJ (2014) Testing for linear and quadratic effects
between price adaptation and EP: the impact of values and perceptions. J Small Bus Manage 52
(3):501–520
Stoian MC, Rialp A, Rialp J (2011) EP under the microscope: a glance through Spanish lenses. Int
Bus Rev 20(2):117–135
Sung B (2015) Public policy supports and EP of bioenergy technologies: a dynamic panel
approach. Renew Sustain Energy Rev 42:477–495
Tabachnick BG, Fidell LS (2013). Using multivariate statistics, vol 7. Pearson, Boston, p. 138
Tookey DA (1964) Factors associated with success in exporting. J Manage Stud 1(1):48–66
United Nations Statistics Division (2011) World Statistics pocketbook. Department of Economic
and Social Affairs, Statistics Division, USA
United Nations Statistics Division (2014). World statistics pocketbook. Department of Economic
and Social Affairs, Statistics Division
Wagner J (2014) Is export diversification good for profitability? First evidence for manufacturing
enterprises in Germany. Appl Econ 46(33):4083–4090
Ward PT, Duray R (2000) Manufacturing strategy in context: environment, competitive strategy
and manufacturing strategy. J Oper Manag 18(2):123–138
Wheeler C, Ibeh K, Dimitratos P (2008) UK EP research: review and implications. Int Small Bus J
26(2):207–239
Wierts P, Van Kerkhoff H, De Haan J (2014) Composition of exports and EP of Eurozone
countries. J Common Market Stud 52(4):928–941
WTO (World Trade Organization) (2014). International trade statistics 2014. World Trade
Organization International Trade Statistics, pp 178–179
Yeoh PL, Jeong I (1995) Contingency relationships between entrepreneurship, export channel
structure and environment: a proposed conceptual model of EP. Eur J Mark 29:95–115
Zhao H, Zou S (2002) The impact of industry concentration and firm location on export propensity
and intensity: an empirical analysis of Chinese manufacturing firms. J Int Marketing 10(1):52–71
Zou S, Stan S (1998) The determinants of EP: a review of the empirical literature between 1987
and 1997. Int Mark Rev 15:333–356
Zou S, Taylor C, Osland G (1998) The EXPERF scale: a cross-national generalized EP measure.
J Int Marketing 6(3):37–58
Chapter 11
An Assessment of the Contribution of Mineral Exports to Rwanda’s Total Exports
Emmanuel Mushimiyimana
Abstract In 2012, the International Council on Mining and Metals (ICMM) reported
that mineral exports can be an alternative means of increasing exports for agrarian,
low- and middle-income countries and that in the past two decades their contribution
to total exports increased from 30 to 60%. Based on this theory, we use an
econometric model and work with data techniques to test whether Rwanda maintained
this pace from 1998 to 2014. Our results show that Rwanda did not reach that level,
since the share averaged only 29.1%. Our findings show that if mineral exports
increase by 10%, total exports will increase by 7%. This implies that the
Government of Rwanda needs to introduce substantial reforms in the mining sector and
take Botswana and Namibia as its role models.
Keywords Mineral export · Governance · Mining sector · Resources · Rwanda
11.1 Introduction
Modern mining started in Rwanda in the 1930s, even though before colonialism
Rwandans heated tin for the production of traditional hoes, machetes, spears, and
other domestic material. The mining sector in Rwanda was started by Belgians who
had gained mining experience in Katanga, in southeastern DRC. Two companies then
emerged: MINETAIN¹ and SOMUKI.² The two remained important in Rwanda’s mining
sector until the country’s independence in 1962. In 1985, SOMIRWA became
bankrupt. In 1988, COPIMAR (Mining Cooperative of Artisan Miners) started
operations. In 1989, the government created another company, REDEMI,³ with an
investment of almost 100 million RWF. However, this company collapsed due to
the genocide.
The International Council on Mining and Metals was formed in 2001 to catalyze improved
performance and enhance the contribution of mining, minerals, and metals to sustainable
development.
E. Mushimiyimana (✉)
Department of Political Science and International Relations, College of Arts and Social
Sciences, University of Rwanda, Butare, Rwanda
e-mail: manemanu12@yahoo.fr
DOI 10.1007/978-981-10-4451-9_11
After the genocide against the Tutsi in 1994, REDEMI continued to function but
without enough capacity since its infrastructure base was almost fully destroyed. In
2001, mineral exports recovered and reached 45.7% of Rwanda’s total exports.
Mineral revenue increased gradually: ‘In 2006, the Rwandan Minerals Industry set
revenue targets of $54 million and $63 million for 2007 and 2008 respectively. The
targets were exceeded with revenues of $71 million in 2007 and $93 million in
2008. In 2011, the export revenue reached to $156 million and $136 million in 2012
and US$228 million in 2013. The performance of this sector is due to strengthened
supervision regulation, availability of new data for investor’s interest and the
support for value addition in metallic ores and quarries. The main issue with
Rwanda’s mineral exports is to increase the scale at which the current mineral
exports are produced’ (RNRA 2014: 1).
In 2007, the Office of Geology and Mines (OGMR) replaced REDEMI, at a time when the
government was privatizing most of its companies. In 2008, the Government of
Rwanda contracted a South African company, New Resolutions Geophysics (NRG),
to carry out an aerial survey covering almost the whole country to acquire gravity
and new magnetic data for a better understanding of the subsurface and its possible
associated mineral potential. In 2011, OGMR changed its name and became the
Geology and Mines Department. Through the privatization process, the Rutongo
Mining Company took over most of the public shares and organizational parts of the
cassiterite (tin ore) mining sector. Mining deposits were liberalized to include
private firms. In effect, the government privatized mining concessions to improve
performance.
The Government of Rwanda sets up prospective target areas (PTAs) to delineate
and quantify mineral resources. It has invested in exploration work in PTAs to
generate geological data for use by mineral exploration companies (RNRA 2014). It
has also enacted a mining law granting the right to exploit three categories of
mines (artisanal, small scale, and large scale) to any person or company with proven
technical expertise and the financial capacity to develop and run a mining project
(RNRA 2014). Industrial mining has yet to intensify in Rwanda, since what exists at
present is artisanal mining. The sector needs modern technology and mechanization;
the equipment needed includes drills, bulldozers, and gravity table shakers. There
is no value addition to Rwanda’s mineral exports, since metal resources such as tin
and tungsten are exported as raw materials rather than as metals. ‘The establishment
of processing plants to smelt cassiterite into tin, refining wolframite and tantalite
into tungsten and tantalum respectively is open to private investors’ (RDB 2014: 1).
The government is committed to supporting over 400 local mining companies, and 30
cooperatives are open to considering partnerships and joint ventures covering
financing, capital equipment, technical support, and competitive mineral trade
contracts (RDB 2014). Besides, there is a need to boost the exploitation of
gemstones: ‘Rwanda possesses a variety of gemstones including; beryl (aquamarine),
amblygonite, corundum (ruby and sapphire), tourmalines and different types of quartz
and granites. Setting up cutting and polishing plants of gemstones is also an
opportunity’ (RDB 2014). Trading of minerals is carried out by ‘holders of mining and
mineral trading licenses and owners of smelting and screening companies’ (RDB 2014: 2).
Rwanda’s target is ‘trading in minerals, including cassiterite, wolframite and
niobium-tantalite must contain at least 30% value added’ (RDB 2014). There is a
need to develop industrial minerals in order to meet the ‘demand for construction
materials especially tiles, slabs sculptures, paints, bricks and concrete aggregates.
Rwanda possesses a variety of minerals such as good quality silica sands, kaolin,
vermiculite, diatomite, clays, limestone, talcum, gypsum and pozzolan’ (RDB
2014). However, compared with other countries, Rwanda’s performance in mineral
exports has yet to improve.
¹ Société des Mines d’Etain du Ruanda-Urundi.
² Société Minière de Muhinga-Kigali.
³ Régie d’Exploitation et de Développement des Mines.
Botswana, for instance, used mineral resources as a source of income to finance
her expenditure for her independence. ‘Botswana’s success appears so exceptional
because the driving force behind Botswana’s economy has been its mineral sector’
(Dougherty 2011: 9). By contrast, Rwanda treated the mining sector as subsidiary.
Its main source of income has been aid, and mineral exploitation has
remained weak since independence. However, due to the developmental needs of
the country in the twenty-first century, the policy is changing and the mining sector
is now considered one of the strategic inputs that will help the country to sustain
growth, independence, and self-reliance. The question remains how to bring about
these positive changes.
In comparison, Botswana’s strategy was to attract foreign direct investment
(FDI) and to protect investors from failure or compensate them when they
failed. This made the country FDI friendly, and it accumulated more and
more resources from abroad. Botswana’s openness to foreign assistance was also
reflected in its export-oriented productive structure. Initially, Botswana produced
beef and diamonds for export, but over time it diversified into non-traditional export
crops, mostly for South Africa (Dougherty 2011). The government was able to retain
a significant portion of the wealth generated by Botswana’s diamond mines through
a policy which, rather than retaining a fixed percentage of sales, involved
profit-sharing agreements and a portion of equity in mining operations. This policy
allowed the government to retain significant shares of profitable ventures and smaller
shares of less profitable ventures; it also did not deter new investors
(Dougherty 2011).
Further, interest in mining investments needs to be underpinned by an open
market economy. Restricted trade halts competitiveness. However, there should be
a sense of control of the mining sector since it is based on natural resources and has
both embedded advantages and risks. One of the mechanisms of controlling mining
companies is framing proper agreements.
Botswana signed a contract based on production sharing with De Beers, a heavy
investor in the country. In this regard, there are four types of
contracts: license agreements, production-sharing agreements, joint ventures, and
service agreements. License agreements give a contracting firm more rights, such
as the right to a mining concession, production, and exports. Under
production-sharing agreements, the state cedes all production and exporting authority
to the firm, but this usually involves an equity arrangement and higher returns to the
government in the long run. Under these two types of agreements, the government
does not shoulder any risks (Dougherty 2011). However, under a license agreement,
the government can lose total control of mining concessions. In joint ventures and
service agreements, a firm gets a limited right to mineral exploitation and trading,
and the government controls the concessions and the trading of the production. A
possible consequence is that political elites who control the government use
political power to mismanage production, and the firm that works with the
government becomes overly constrained. It is worth knowing the type of contracts that
Rwanda has signed with key mining companies, as improvements in mineral exports
depend not only on the type of contract and natural resource endowments but also
on the diversification of mineral products for export.
Namibia is a sound example of successful mining of gold and dimension stones
such as granite and marble; Rwanda too has potential in these minerals. Some
minerals that were previously neglected are now important given that Africa
is modernizing in both styles and sizes. For instance, Namibia exports granite, and
Rwanda has a new industry that processes granite, the East African Granite Industry
Ltd. Rwanda also has gold at Miyove in the Northern Province. In 2011, Simba Gold
Corp. of Canada engaged in soil and rock sampling at its Miyove Gold project. In
November, Desert Gold Ventures Inc. of Canada purchased the Byumba concession,
which had resources of 5.55 million metric tons at a grade of 1.48 g of gold per
metric ton. Desert Gold and Simba planned to drill at Byumba and Miyove Gold,
respectively, in 2012 (Desert Gold Ventures Inc. 2012). Since gold is a precious and
lucrative metal worldwide, its exports can yield enough money for Rwanda once it
is well exploited.
In short, Namibia and Botswana are role models for sub-Saharan African
countries as they have enhanced their economic development by strengthening their
mining sectors. Unlike some other sub-Saharan countries, Rwanda has not yet
extracted diamonds or oil, but she has gold, cassiterite, and tantalum in addition
to methane gas, granite, and other types of dimension stones. What is needed is
to boost production and attract more foreign direct investment in order to generate
more income from mineral exports.
Our research hinges on the hypothesis that the exports of mineral resources can
contribute significantly and progressively to Rwanda’s total export revenue as has
happened in other low- and middle-income countries. In our research, we use
econometric methods to investigate the contribution of Rwanda’s mineral exports to
total exports from 1998 to 2014. The literature review discusses recent arguments
developed by ICMM that mineral exports increased in value from 2005
to 2010 and that this played a significant role in enhancing sustainable
economies and reducing poverty in developing nations. The contribution of
our research is in testing whether this ICMM theory is applicable to Rwanda from
1998 to 2014. It also looks at different perspectives that Botswana and Namibia
have used to reach high levels of mineral production and exports and thus highlight
the way that Rwanda can follow these African role models in the mining sector.
The research outcomes show that if mineral exports increase by 10%, then total
exports will increase by 7%. The calculated probability value (Pr = 0.00) is below
0.05. Therefore, mineral exports (MINEX) make a statistically significant
contribution to total exports (TOTEX) at the 5% significance level. The
recommendation is that the Government of Rwanda can set up mechanisms to boost mineral
exploitation both at her domestic mineral sites and in neighboring countries through
private companies or public-private joint ventures. The government should respect
the legalization standards set up regionally and internationally so that the revenue
from mining empowers the state and the region instead of destroying it (Collier and
Hoeffler 2002).
Our study concludes that Rwanda did not reach the minimum average level of
contribution of mineral exports to total exports which was between 30 and 60%
according to ICMM. It is also argued that the pace is still slow for the country to
reach other low- and middle-income countries because even if Rwanda increases
mineral exports by 10%, ceteris paribus, total exports will only increase by 7%.
Instead, Rwanda needs to increase her mineral exports by at least 50% in order to
have a 35% increase in total exports, or achieve a 100% increase in mineral exports
in order to have a 70% increase in total exports. Therefore, there is a need to reform
the mining sector by referring to role models such as Botswana and Namibia.
Our research hinges on the ICMM theory that mineral exports increase rapidly to
become a major share of total exports in low-income agrarian economies even when
they start from a low base. Developing countries’ exports are less than their
imports, and this implies that the balance of payments of LDCs (less developed
countries, or developing nations with GDP less than US$5000 per capita) is always
in deficit. Increasing exports is a good way of boosting the economy. Increasing
exports implies that the government earns more foreign currency to be able to
purchase the commodities that the country needs to import for economic
sustainability and the welfare of its citizens. In a framework of self-reliance,
the government of Rwanda is looking at lowering its aid dependency and building an
economy based on production, accumulation of FDI, and expansion of other sectors
such as services and
industries. The key sectors in Rwanda have been mainly agriculture, industry, and
services. According to Minister Gatete, the service sector was the main contributor
to the country’s GDP in 2011: ‘The Service sector contributed 45% of GDP
compared to 33 and 16% of agriculture and industrial sectors respectively. The
Service sector had the highest growth of 12% followed by Industry 7% and
agriculture 3%.’ Based on the Prebisch–Singer hypothesis: ‘(a country) with high
export dependence on primary products5 stands to lose out from a worsening of the
terms of trade’ (Riley 2012), ICMM posits that the contribution of mineral
resources to the accumulation of FDI and to total exports is high at a level of 60–90
and 30–60%, respectively, while it is limited and very low to government revenue
(2–20%), national income (3–10%), and total employment (1–2%) in low- and
middle-income countries.
On the one hand, mining FDI often dominates total FDI flows in low-income
economies that have only limited other attractions for international capital; on the
other hand, mineral exports can increase rapidly to become a major share of total
exports (ICMM 2012). These are the domains in which mineral resources have
provided considerable outputs in the last two decades. However, without a
considerable increase in government revenue, income, and employment, no one can
be sure of the role of the mining sector in building a more sustainable economy in a
developing nation. The mining sector has contributed to the growth of countries such
as Botswana and Namibia (Dougherty 2011), while elsewhere it has led to the reverse
outcome, namely a resource curse, or put countries at high risk (Collier and Hoeffler
2002; Global Witness 2010). In sub-Saharan Africa, the countries endangered by
mineral resources are Sierra Leone, Zimbabwe, DRC, and Angola. Therefore,
accumulation of FDI and increase in total exports go hand in hand with strategies
for the government to get a considerable share in mining revenue, otherwise
minerals will only raise profits for companies rather than for states and societies.
Mineral taxation has become a very significant source of tax revenue in many
low-income economies with limited tax-raising capacities (2–20%) (ICMM 2012).
However, this is not high because of lack of institutional capacity to tax mineral
exploiters and having mining concessions that are dominated by informal trade.
Moreover, some low- and middle-income nations have corrupt tax systems or
inefficiencies in managing collected money and other resources.
5
Goods with low levels of processing, diversification and raw materials.
Mineral exports of some developing nations lack value addition since they
export raw materials. Modern mineral processing technology is sophisticated and
requires intensive capital (ICMM 2012) and skilled labor to be more effective for
total exports. Wright and Czelusta (2004) argue that it is no coincidence that
countries’ exports of minerals and metals tend to emerge across multiple
commodities in concert. Davis (2009) has argued that many countries have multiple
and various mineral endowments that are there for the taking, and mineral
extraction is a matter of domestic public interest, supported by sufficient
country-specific technological knowledge and in some cases technological advances
that lead to production and exports across a broad range of endowments.
According to Davis (2009), a mining policy is important for potential augmentation
of endowments. For instance, Chile was a major exporter of copper in the 1800s,
which then fell away as its high grade deposits got exhausted and there was no
national consensus for supporting the industry. Production surged again in the
mid-1900s as government support for mining was renewed (Davis 2009: 5). In
actual fact, the main difficulties lay in the link between mineral income profitability
and the welfare of citizens.
Mining employment on its own is usually small relative to the total national
labor force (ICMM 2012) because the mining sector is developing and using more
machines than man power. This means that for minerals to be profitable for the
people and the economy in general, economic distribution is important. Other
findings also show that countries with mineral endowments become poorer than
those without mining concessions. Zimbabwe and Nigeria are an illustration of this.
‘Zimbabwe is a country tremendously blessed with vast and diverse precious stones
ranging from gold, chrome, lithium, asbestos, and cesium, as well as high-quality
emeralds and other minerals and metals’ (Mahonye and Mandishara 2015: 1–2).
Since independence, the mining sector has contributed an average of about 40% to
total exports (Hawkins 2009) with the major share coming from gold and other
minerals such as ferrochrome, nickel, and platinum. This, however, still falls in the
range of low-income countries with many people under the poverty line. In another
case, Mills (2010) highlights that although Nigeria earned an estimated
US$400 billion from oil in the past 40 years, the number of Nigerians living
under US$1 per day has increased consistently. Says Mills (2010: 171b): ‘Nigeria
would have been better – by some estimates the economy would have been 25%
bigger – if the Niger delta had no oil.’ Table 11.1 shows that not only have the
countries in the Great Lakes region misused natural resources for their economic
growth but also that the contribution of mineral exports was very poor in the other
countries in the same sub-Saharan region. This implies that mining policies in the
Great Lakes region in general and in Rwanda in particular should be taken
seriously.
Our research uses the ICMM theory that mineral resources can rapidly contribute
to total exports even if the economy of that country is agrarian. Therefore, we rely
on ICMM’s measurable data highlighted earlier besides Davis’ (2009) theory
referred to earlier which argues that the development of mineral exports does not
depend on an abundance of natural reservoirs but mostly on policy choices to
develop an added value for minerals for export and increasing their endowments in
the national economy. The contribution of our research is that it tests the appli-
cability of the existing knowledge to the Rwandan situation and tests the position
and pace of Rwanda as one of the low-income countries in the area.
Table 11.1 Mineral resources and the GDP PPP per capita of GLR countries as compared to advanced countries in the mining sector in the sub-Saharan region

Great Lakes region (GDP PPP per capita)
Country   2012    2013    Natural resources
Burundi   $600    $600    nickel, uranium, rare earth oxides, peat, cobalt, copper, platinum, vanadium, arable land, hydropower, niobium, tantalum, gold, tin, tungsten, kaolin, limestone
DRC       $400    $400    cobalt, copper, niobium, tantalum, petroleum, industrial and gem diamonds, gold, silver, zinc, manganese, tin, uranium, coal, hydropower, timber
Rwanda    $1500   $1500   gold, cassiterite (tin ore), wolframite (tungsten ore), methane, hydropower, granites, sand, and arable land

Other sub-Saharan countries with mining efficiency (GDP PPP per capita)
Country                     2012      2013      Natural resources
Botswana                    $15,900   $16,400   diamonds, copper, nickel, salt, soda ash, potash, coal, iron ore, silver
The Republic of the Congo   $4700     $4800     petroleum, timber, potash, lead, zinc, uranium, copper, phosphates, gold, magnesium, natural gas, hydropower
Namibia                     $7900     $8200     diamond, copper, uranium, gold, silver, lead, tin, lithium, cadmium, tungsten, zinc, salt, hydropower

Source CIA world fact book (data value in US$ 2013)
E. Mushimiyimana
11.3 Methods
Our research used quantitative methods, especially econometrics. Econometrics is
the application of statistics and mathematics to economic variables for testing
hypotheses and predicting future outcomes. The term econometrics was coined by
Ragnar Frisch (1895–1973) of Norway, through the foundation of the Econometric
Society and the journal Econometrica. Frisch described the Econometric Society as an
international society for the advancement of economic theory in its relation to
statistics and mathematics. Frisch explained that statistics, economic theory, and
mathematics were each necessary but not by themselves sufficient conditions for a
real understanding of the quantitative relations in modern economic life. It is the
unification of all three which is powerful, and it is this unification that
constitutes econometrics (Bjerkholt 1995).
Methods alone cannot be useful without use of research instruments employed in
data collection and analysis.
Our research used a triangulation of techniques such as documentary approach
and working with data to support the econometric analysis of Rwanda’s mineral
exports to total exports from 1998 to 2014. Our research also used a comparative
analysis of Rwandan mineral exports with other sub-Saharan countries like
Botswana and Namibia. EViews software was used to estimate the
contribution of mineral exports to total exports and to see whether there was a
significant effect of the former on the latter. The method critically assessed whether
Rwandan mineral exports were moving at the pace of other low- and
middle-income countries that are performing very well in mining exports as
highlighted by ICMM.
11.4 Data
Our research used secondary data, official documents, and discourses related to
Rwandan exports. It also compared data from known sources such as the CIA
World Fact Book, the National Bank of Rwanda (BNR), and the Rwanda Natural
Resource Authority (RNRA). We visited BNR for a field visit and data gathering.
Table 11.1 gives information about mineral resources and GDP per capita of the
countries in the Great Lakes region (GLR) as compared to advanced countries in the
mining sector in the sub-Saharan region. From the table, it is clear that GLR’s
mineral resources did not contribute to the countries’ GDPs. Though our research
did not measure the rate of contribution of the mining sector to the rest of the
countries highlighted earlier due to the limitation of the scope of the research, it is
clear that countries such as Botswana and Namibia benefitted from good policies in
the mining sector to help them overcome poverty. Besides, Rwanda and GLR in
general have different mineral endowments. Development of Rwanda’s mineral
exports during 1999–2013 is shown in Table 11.2.
Table 11.2 Rwanda’s mineral exports (1999–2013)
Year   Volume (tons)   Value (US$ million)
1999   943             6.9
2000   1,012           12.6
2001   2,102           42.6
2002   2,083           15.9
2003   2,599           11.1
2004   5,082           29.3
2005   6,465           37.3
2006   6,187           37.0
2007   8,283           70.6
2008   7,364           94.0
2009   7,960           54.6
2010   8,406           71.0
2011   9,697           158.0
2012   7,588           136.3
2013   7,639           226.2
Source RNRA (2014)
Table 11.3 shows Rwanda’s annual mineral export earnings and the annual contribution
of mineral exports to total exports during 1995–2013. There is a strong and positive
trend in both indicators over time.
Table 11.3 shows that the contribution of mineral exports to total exports,
calculated in percentages, increased from 1995 to 2001, and then fell and rose
in a U-shaped curve from 2001 to 2005. It increased again in 2008 before
stabilizing at about 30% between 2010 and 2013 (see also Table 11.4). However,
though there was a positive increase in general, mineral exports rose sharply from
1995 to 2001 and grew unevenly from 2002 to 2012 (see Fig. 11.1).
Table 11.3 Annual contribution of mineral exports to total export of Rwanda since 1995 (in %)
Year           Export earnings (US$ million)   Contribution of mineral exports (%)
1995           1.5                             3.0
1996           2.3                             3.7
1997           3.8                             4.1
1998           4.7                             7.3
1999           6.9                             11.2
2000           12.6                            18.2
2001           42.6                            45.6
2002           15.9                            23.6
2003           11.1                            17.5
2004           29.3                            29.9
2005           37.3                            29.9
2006           37.0                            24.8
2007           70.6                            40.0
2008           94.0                            40.0
2009           54.6                            30.0
2010           71.0                            30.0
2011           158.0                           30.0
2012           136.3                           28.3
2013           226.2                           31.0
Average in %                                   29.1
Source RNRA (2014)
Table 11.4 Mineral exports contribution and total exports

Year MINEX (US$) TOTEX (US$)
1998 4,690,000 64,140,000
1999 6,930,000 62,010,000
2000 12,580,000 69,040,000
2001 42,630,000 93,550,000
2002 15,870,000 67,360,000
2003 11,080,000 63,030,000
2004 29,280,000 98,110,000
2005 37,300,000 124,980,000
2006 36,570,000 147,380,000
2007 70,620,000 176,770,000
2008 92,350,000 264,820,000
2009 55,430,000 234,970,000
2010 67,850,000 297,280,000
2011 151,430,000 464,240,000
2012 136,070,000 590,750,000
2013 225,700,000 703,010,000
2014 203,320,000 723,090,000
Source BNR (2015)
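The percentage contributions discussed above can be recomputed directly from Table 11.4. The following is a minimal sketch, not the author's own calculation: the function name `mineral_share` and the dictionary layout are illustrative assumptions, and the data are simply the MINEX and TOTEX columns of Table 11.4. The simple average printed at the end covers 1998–2014 and therefore need not equal the 29.1% average reported in Table 11.3, which covers 1995–2013.

```python
# MINEX and TOTEX in US$, copied from Table 11.4 (year: (MINEX, TOTEX)).
DATA = {
    1998: (4_690_000, 64_140_000),
    1999: (6_930_000, 62_010_000),
    2000: (12_580_000, 69_040_000),
    2001: (42_630_000, 93_550_000),
    2002: (15_870_000, 67_360_000),
    2003: (11_080_000, 63_030_000),
    2004: (29_280_000, 98_110_000),
    2005: (37_300_000, 124_980_000),
    2006: (36_570_000, 147_380_000),
    2007: (70_620_000, 176_770_000),
    2008: (92_350_000, 264_820_000),
    2009: (55_430_000, 234_970_000),
    2010: (67_850_000, 297_280_000),
    2011: (151_430_000, 464_240_000),
    2012: (136_070_000, 590_750_000),
    2013: (225_700_000, 703_010_000),
    2014: (203_320_000, 723_090_000),
}


def mineral_share(minex, totex):
    """Return mineral exports as a percentage of total exports (hypothetical helper)."""
    return 100.0 * minex / totex


# Share of mineral exports in total exports for every year in the table.
shares = {year: mineral_share(m, t) for year, (m, t) in DATA.items()}

for year, share in shares.items():
    print(f"{year}: {share:5.1f}%")

average_share = sum(shares.values()) / len(shares)
print(f"Simple average, 1998-2014: {average_share:.1f}%")
```

For the years that both tables cover, the shares obtained this way can be compared with the last column of Table 11.3 as a consistency check.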
11.5.1 Calculation of Predictability of Increase in Mineral Resource Export Value
Figure 11.1 shows the increase in mineral revenues from 1998 to 2014 (drawn from
Table 11.2). There is a prediction that in 2020, mineral exports will be equal to or
more than US$300 million. A scatter plot of mineral export data for Rwanda was
done between 1999 and 2013 to find the progress in generating revenue.
Fig. 11.1 Prediction of increase in revenue from mineral exports in Rwanda
The results as shown in Table 11.4 and Fig. 11.1 are that annual revenue grew from
less than US$20 million at the start of the period to almost US$250 million in 2013.
This shows how progressive mineral income has been for Rwanda’s total revenue in
the last 15 years. The linear
shape of the scatter plot shows that Rwanda will continue to get more and more
mineral revenue in the coming years, if other factors remain constant.
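A linear extrapolation of this kind can be sketched as follows. This is only an illustration under stated assumptions: it fits an ordinary least-squares straight line to the value column of Table 11.2 (1999–2013) and projects it to 2020. The chapter does not state how the line in Fig. 11.1 was fitted, so the helper `fit_line` is hypothetical and the projected figure need not coincide with the US$300 million quoted above.

```python
# Mineral export value in US$ million, copied from Table 11.2 (1999-2013).
YEARS = list(range(1999, 2014))
VALUES = [6.9, 12.6, 42.6, 15.9, 11.1, 29.3, 37.3, 37.0,
          70.6, 94.0, 54.6, 71.0, 158.0, 136.3, 226.2]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(YEARS, VALUES)

# Project the fitted line forward, holding the estimated trend constant.
forecast_2020 = intercept + slope * 2020

print(f"Estimated trend: {slope:.2f} US$ million per year")
print(f"Linear projection for 2020: {forecast_2020:.1f} US$ million")
```

A straight line is the simplest choice consistent with the "linear shape" described in the text; an exponential or log-linear trend would give a different projection.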
Though revenues from mineral exports increased positively from 2000 to 2013,
Fig. 11.2 shows that there were some declines in 2003, 2009, and 2012 and that the
share in total revenue, in percentage, decreased little by little in 2003 and 2006, to
be constant at almost 30% from 2009 to 2013. The share in percentage is still low
though the real income from mineral exports increased sharply, owing to
improvements in other sectors of Rwanda’s GDP, mainly the service sector, which
has taken the lead in the last few years. This is also quite similar to the Rwanda
Development Board’s (RDB 2014) position and prediction: ‘In the last three years,
mineral exports recorded USD 96.4M (2010), USD 15.4M (2011) and USD
136.1M (2012). The sub-sector’s contribution to GDP is to increase from 1.2 to
5.27% (10% growth rate per each year) up to 2017/2018.’
Fig. 11.2 Contribution of mineral exports to total exports and earnings for Rwanda (in %)
11.5.2 Specification of the Econometric Model
The model expresses the hypothesis that the more mineral export revenue (LMINEX)
increases, the more it significantly increases Rwanda’s total exports (LTOTEX). If
total export revenue increases at a high pace, then Rwanda’s balance of payments
will be positive and the country will be able to finance most of its imports
and other public expenditure. Therefore, the econometric model defines the
contribution of mineral exports to total exports:
$$\mathrm{LTOTEX}_t = b_1 + b_2\,\mathrm{LMINEX}_t + e_t \qquad (1)$$
The data in Table 11.4 are transformed into logarithms so that the coefficients can
be interpreted in percentage terms; the estimation results are reported in Table 11.5.
The estimated equation is LTOTEX = 6.4318 + 0.7143 * LMINEX. This means that if
mineral exports increase by 10%, total exports will increase by about 7%. The
calculated probability is Pr = 0.00 < 0.05. Therefore, there is a significant effect
of mineral revenue (LMINEX) on total export revenue (LTOTEX) at the 5% level of
significance, but this pace is very slow considering the level of other
performing low- and middle-income states as stipulated by ICMM. The parameter
estimates are significant and the model fits the data well: the R-squared is
0.88, which means that the model explains 88% of the variation in (log) total
exports. Table 11.3 supports this by providing the average contribution
of 29.1%, which is slightly below the average contribution of
mineral exports to total exports of 30–60% highlighted by ICMM. This means
that though Rwanda is making some progress, like other low- and middle-income
countries, she is still moving at a slow pace in terms of the contribution of mineral
exports to total exports.
Table 11.5 Econometric model and results
Dependent variable: LTOTEX
Method: Least Squares
Date: 05/27/16 Time: 23:20
Sample: 1998–2014
Variable   Coefficient   Std. error   t-statistic   Prob.
C          6.431831      1.184529     5.429862      0.0001
LMINEX     0.714291      0.067415     10.59542      0.0000

R-squared                  0.882134    Mean dependent variable   18.95617
Adjusted R-squared         0.874276    S.D. dependent variable   0.890265
S.E. of regression         0.315667    Akaike info criterion     0.641871
Sum of squared residuals   1.494681    Schwarz criterion         0.739896
Log likelihood             −3.455902   F-statistic               112.2628
Durbin-Watson stat         0.775939    Prob. (F-statistic)       0.000000
Source Eviews data
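The estimation can be retraced outside EViews with a few lines of code. The sketch below is an illustration rather than the author's procedure: it assumes natural logarithms (which is consistent with the mean of the dependent variable reported in Table 11.5), uses the MINEX and TOTEX columns of Table 11.4, and implements simple ordinary least squares by hand in a hypothetical helper `ols_simple`. Under these assumptions the slope, intercept, R-squared, and t-statistic should come out close to the values in Table 11.5.

```python
import math

# MINEX and TOTEX in US$, copied from Table 11.4 (1998-2014).
MINEX = [4_690_000, 6_930_000, 12_580_000, 42_630_000, 15_870_000, 11_080_000,
         29_280_000, 37_300_000, 36_570_000, 70_620_000, 92_350_000, 55_430_000,
         67_850_000, 151_430_000, 136_070_000, 225_700_000, 203_320_000]
TOTEX = [64_140_000, 62_010_000, 69_040_000, 93_550_000, 67_360_000, 63_030_000,
         98_110_000, 124_980_000, 147_380_000, 176_770_000, 264_820_000,
         234_970_000, 297_280_000, 464_240_000, 590_750_000, 703_010_000,
         723_090_000]


def ols_simple(x, y):
    """Simple OLS of y on a constant and x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals and the usual standard error of the slope.
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(ssr / (n - 2) / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "r2": 1.0 - ssr / syy,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,
    }


# Log-log specification of Eq. (1): LTOTEX = b1 + b2 * LMINEX + e.
lminex = [math.log(v) for v in MINEX]
ltotex = [math.log(v) for v in TOTEX]
fit = ols_simple(lminex, ltotex)

for name, value in fit.items():
    print(f"{name:>10}: {value:.4f}")
```

Because both variables are in logarithms, the slope is read as an elasticity: the approximate percentage change in total exports associated with a 1% change in mineral exports.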
Rwanda is far away from Botswana and Namibia, which have average percentage
contribution of mineral exports to total exports of 83.7 and 53.4%, respectively
(ICMM 2012). The results of our econometric model show that if Rwanda wants to
reach the levels of these role models, she has to increase her mining sector’s
performance to 80 or 120%.
Based on the model used, our research recommends that the mining policy of
Rwanda should focus on the following:
(1) Setting up a main strategy to boost exports of minerals.
(2) Structuring and industrializing the mining sector so that the exploitation and
production of minerals stay smooth and increase instead of fluctuating from year to
year, and adding value, especially by setting up refineries.
(3) Determining the types of contracts that the government signs with firms. We
recommend production-sharing agreements instead of license agreements or any
other type of contract. Production-sharing agreements maximize the government’s
revenue while giving all rights of exploitation and exports to private firms.
(4) Reallocating mineral incomes to other pro-development policies such as
education and infrastructure, starting from where mining concessions are given, as
compensation for local environmental damage.
(5) Ensuring that the mining sector goes hand in hand with other public reforms such
as good governance and politics that decrease the gaps between the rich and the
poor. Once the government has accrued mining revenue, it can also help other
sectors such as manufacturing, agriculture, and industry to develop.
(6) Providing the mining sector with more modern technology and market openness so
that it is more effective and efficient; attracting efficient investors could be an
added value.
(7) Developing not only cassiterite or tantalum production but also gold
exploitation, methane gas, and the processing of dimension stones such as granite,
as Namibia did.
Our research concludes that mineral exports have not contributed considerably to
Rwanda’s total export revenues. However, Rwanda had a significant increase in
revenues from mineral resources between 1998 and 2014 but did not reach the
average contribution of mineral exports to total exports of 30–60% as highlighted
by ICMM. Minerals only contributed 29.1% to her total exports, and this implies
that Rwanda still has a lot to do in terms of improving its mining sector. We have
also seen that Botswana and Namibia in Africa took off due to strategic and wise
exploitation of resources. Rwanda can learn from them.
The econometric model indicates that if mineral revenues from exports increase by
10%, then total export revenues will increase by 7%. Rwanda needs to multiply its
existing efforts by 8–12 times if, like Namibia and Botswana, she is to achieve a
more significant effect of mineral exports on her economy.
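The percentage statements above follow from reading the LMINEX coefficient as an elasticity. The sketch below illustrates the arithmetic under that reading; the function names are hypothetical. The chapter uses the linear rule (elasticity times the percentage change), which is an approximation that is accurate for small changes. For comparison, the sketch also evaluates the change implied exactly by the log-log form of the model, which is somewhat smaller for large increases.

```python
# Slope on LMINEX reported in Table 11.5, read as an elasticity.
ELASTICITY = 0.714291


def approx_change(pct_minex, elasticity=ELASTICITY):
    """Linear rule used in the text: % change in TOTEX = elasticity * % change in MINEX."""
    return elasticity * pct_minex


def exact_change(pct_minex, elasticity=ELASTICITY):
    """Change implied by the log-log model: (1 + g)^elasticity - 1, in percent."""
    return ((1.0 + pct_minex / 100.0) ** elasticity - 1.0) * 100.0


for pct in (10, 50, 100):
    print(f"MINEX +{pct}%: linear rule {approx_change(pct):.1f}%, "
          f"log-log form {exact_change(pct):.1f}%")
```

The two calculations agree closely for a 10% increase and diverge as the assumed increase grows, which is worth keeping in mind when reading the 50% and 100% scenarios.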
The Government of Rwanda can set up mechanisms to boost mineral
exploitation so that this sector contributes significantly to its economy. She can
come up with policy measures to attract foreign companies to invest heavily in the
exploitation of gold, methane gas, and dimension stones such as granite and marble,
as happened in Namibia, rather than focusing only on cassiterite or tantalum. The
contractual frameworks with companies should be based on production-sharing
agreements, as in Botswana, in order to liberalize the mining sector while the state
maximizes its profits.
References
Bjerkholt O (1995) The economic society. Econometrica 63(4):755–765
CIA (2015) Natural resources. The world fact book. Available online: https://www.cia.gov/library/publications/the-world-factbook/fields/2111.html, last consulted 02 June 2016
Collier P, Hoeffler A (2002) On the incidence of civil war in Africa. J Conflict Resolut 46(1):13–28
Davis GA (2009) Extractive economies, growth and the poor. In: Richards J (ed) Mining, society, and a sustainable world. Springer, Berlin, pp 37–60
Desert Gold Ventures Inc. (2012) Desert gold announces encouraging results at its Rubaya project located in Rwanda. Desert Gold Ventures Inc. press release, Vancouver, British Columbia, Canada, January 26, p 8. Available online: http://www.desertgold.ca/news/February%206%202012.pdf, last consulted 02 June 2016
Dougherty ML (2011) A policy framework for new mineral economies: lessons from Botswana. Illinois State University. Available online: http://www.uvm.edu/ieds/sites/default/files/Botswana_Minerals.pdf, last consulted 01 June 2016
Global Witness (2010) The hill belongs to them: the need for international action on Congo’s conflict minerals trade. Global Witness, London, United Kingdom, 31 p
Hawkins T (2009) The mining sector in Zimbabwe and its potential contribution to recovery. United Nations Development Programme, Working paper Series No 1
ICMM (2012) The role of mining in national economies. Available online: http://www.icmm.com/the-role-of-mining-in-national-economies, last consulted 01 June 2016
Mahonye N, Mandishara L (2015) Mechanism between mining sector and economic growth in Zimbabwe, is it a resource curse? Economic Research Southern Africa, South Africa. Available online: http://www.econrsa.org/system/files/publications/working_papers/working_paper_499.pdf, last consulted 01 June 2016
Mills G (2010) Why Africa is poor and what Africans can do about it. Penguin Books, South Africa
RDB (2014) Mining. Available online: http://www.rdb.rw/rdb/mining.html, last consulted 24 Oct 2014
Riley G (2012) Economic growth: Prebisch–Singer hypothesis. Available online: http://www.tutor2u.net/economics/revision-notes/a2-macro-economic-growth-prebisch-singer-hypothesis.html, last consulted 14 Oct 2014
RNRA (2014) Mining in Rwanda. Available online: http://rnra.rw/fileadmin/user_upload/Mining_in_Rwanda_.pdf, last consulted 30 May 2016
Wright G, Czelusta J (2004) The myth of the resource curse. Challenge 47(2):6–38
Chapter 12
Testing the Balassa Hypothesis
in Low- and Middle-Income Countries
Fentahun Baylie
Abstract This study analyses the long-run relationship between economic growth
and real exchange rate for a group of 15 low- and middle-income countries for the
period 1950–2011. Co-integration between growth and exchange rate is established
by means of an augmented pooled mean group estimation method (which controls
for heterogeneity and cross-sectional dependence). Unlike previous studies,
cross-sectional dependence is accounted for which implies that the productivity
effect of the Balassa term is expected to be estimated consistently and without bias.
Moreover, our results indicate that the effect of the Balassa term depends more on
the income group (level of per capita income) than the rate of economic growth.
In general, the power of the effect is stronger for higher income countries in the long
run. The study clearly indicates that the Balassa hypothesis holds for middle-
income countries, while this is not the case for low-income countries. However,
fiscal policy and exchange rate volatility rather clearly explain the variations in the
real exchange rate.
Keywords Productivity · Growth · Real exchange rate · Balassa hypothesis · Panel data
12.1 Introduction
The Balassa hypothesis tests the impact of productivity growth on the real exchange
rate. It states that for a growing economy, the real exchange rate is expected to
appreciate in the long run. Our study is based on a finding by Baylie (2008). The
real effective exchange rate is an important policy parameter and among the most
determining factors of growth in Ethiopia (Baylie 2008). Though Baylie recom-
mends depreciation of the domestic currency for promoting economic growth in the
short run, the author discovered that it is healthier to allow appreciation in the long
run to encourage sustainable economic growth. Hence, he provided exchange
rate policy recommendations which promote appreciation of the domestic currency
for sustainable growth in the long run.
F. Baylie (✉)
e-mail: fbaylie@yahoo.com
DOI 10.1007/978-981-10-4451-9_12
Neither depreciation nor appreciation is readily welcomed by the monetary
authority. As suggested by Baylie (2008), depreciation in particular is not
favored by the monetary authority as it increases the burden on the importing
capacity for a developing country like Ethiopia. In contrast, by the time it is
recommended that a country allows appreciation, all advantages of depreciation
have been exhausted while prospects of appreciation are pending. Depreciation may
initially help promote exports and generate sufficient foreign earnings. Once this
objective is met, there arises a need to promote imports of capital goods by allowing
appreciation to establish import-substituting industries to transform the economy.
The only issue to consider in this case is the ‘timing’ of switching policy. The
solution to this dilemma is provided by the Balassa hypothesis.
At the time when the Balassa hypothesis holds in a particular economy,
depreciation is not gainful. In short, it states that if economic growth is
accompanied by appreciation of the domestic currency (Balassa hypothesis), the
monetary authority should not constrain the appreciation merely because it may
discourage exports. If economic growth by itself brings appreciation, it can be
sustained as the latter further adds momentum to the former. There is a possibility
of one driving the other in the long run when the hypothesis holds.
In short, the hypothesis states that the impact of growth on the exchange rate is
positive; that is, there is appreciation of the domestic currency. The main purpose of
our study, therefore, is to show whether this analysis can be extended to a group of
low- and middle-income countries on various continents. While there is evidence in
favor of the hypothesis, there are also some anti-Balassa results in some studies.
The negative results could be associated with different reasons specific to each
study.
Tica and Druzic’s (2006) survey shows that since its discovery in 1964, the
hypothesis has been tested 58 times in 98 countries in time series or panel analyses
and in 142 countries in cross-country analyses. In these analyzed estimates,
country-specific Balassa hypothesis coefficients have been estimated 164 times. The
first empirical test of the theory was carried out by Balassa (1964) himself. Kravis
and Lipsey (1983) and Bhagwati (1984) were also among the forerunners. The conclusions
from all these studies confirm the difficulty in ignoring the significance of the
hypothesis in general. The strongest empirical support in favor of the relationship
between productivity and exchange rate is found in cross-sectional and panel
empirical studies.
Choudhri and Khan (2004) found evidence of Balassa–Samuelson effects in a
panel of 16 developing countries. They found the traded and non-traded
12 Testing the Balassa Hypothesis in Low- and Middle-Income Countries 255
productivity differential to be a significant determinant of the relative prices of

non-traded goods, and the relative price in turn exerted a significant effect on the
real exchange rate. Similarly, Guo and Hall (2010) and Jabeen et al. (2011) also
show that productivity differences directly explained changes in the real exchange
rate by using the Johansen co-integration approach for China and Pakistan,
respectively.
A positive relationship between productivity and the real exchange rate is not,
however, a common fact in all studies. There are a number of studies that show
anti-Balassa results. Drine and Rault (2002, 2004), for example, tested the Balassa
hypothesis for 20 Latin American (middle-income) and six Asian (low-income)
developing countries separately. They applied Pedroni’s co-integration techniques
in both studies. Though they were able to find evidence for the hypothesis in the
first study for middle-income countries, they failed to replicate the result in the
second study for low-income countries. The reason given for the failure is a break
in the relationship between productivity and relative price, one of the assumptions
of the hypothesis. Asea and Mendoza (1994a, b), Harberger (2003), Hassan (2011),
Isard and Symansky (1996), Miyajima (2005), and Wilson (2010) also found
anti-Balassa results in their studies on developing countries.
The study that comes the closest to our study is Chuah’s (2012). This study
found mixed results from a panel study of 142 developing (middle- and
low-income) and developed (high-income) countries. The estimation of the fixed
effect model showed that productivity growth in developed economies resulted in
real appreciation of domestic currencies, while the relationship was nonlinear in
developing economies. In the latter group, the real exchange rate initially depre-
ciated and then appreciated after per capita income jumped to a higher level (above
$2200), the main reason being the level of development.
Our study makes three improvements over Chuah’s (2012) study in terms of data
quality, methodology, and variables. First, our study uses data from the latest
version of the Penn World Table (PWT), version 8. Data from this version address
shortcomings associated with previous versions. In particular, Chuah (2012) used
an expenditure-based measure of GDP from version 7, while our study uses the
output-based measure of GDP from version 8. Feenstra et al. (2013) suggested
using the second measure for studies interested in an economy’s productive
capacity. Second, our study accounts for cross-sectional dependence and hetero-
geneity by applying the common correlated effect approach of Pesaran (2013) and
pooled mean group estimation, respectively. Third, our study controls for important
supply- and demand-side factors.
The remainder of the paper is organized as follows. Section 12.2 provides the
theoretical background of the hypothesis. Section 12.3 discusses the methodology. The
findings are presented in Sect. 12.4, and Sect. 12.5 gives the conclusion and policy
implications derived from the findings.
256 F. Baylie
12.2 Theoretical Framework of the Model: The Balassa–Samuelson Hypothesis¹
The Balassa hypothesis demonstrates the relationship between exchange rate,
purchasing power parity (PPP), and inter-country income comparisons in general.
The hypothesis emanates from the PPP theory. It explains the reason why the PPP
theory of exchange rate is imperfect. In the absence of all frictions, the prices of a
common basket of goods in two countries measured in the same currency should be
the same at all times for absolute PPP to hold, that is, $P/(eP^*) = 1$. The Balassa–Samuelson
effect, first formulated by Harrod in 1933 and later by Balassa and
Samuelson in 1964 separately, says that distortions in purchasing power parity are
the result of international differences in relative productivity growth between the
tradable goods sector (mainly manufacturing and agriculture) and the non-tradable
goods sector (mainly services) (Harberger 2003; Tica and Druzic 2006). In contrast
to the PPP theory, price levels are higher in rich countries than poor ones when
converted to a common currency. This may be associated with higher productivity
growth in the tradable sector in rich countries (Rogoff 1996).
A nation’s prosperity is mainly associated with productivity growth in the
tradable goods sector. This has an effect of reducing costs in the same sector and
increasing real wages in the economy and puts an upward pressure on relative
prices of non-tradable goods where productivity has not grown by the same mag-
nitude. This distorts the PPP relationship and results in appreciation of the real
exchange rate. The same effect holds true across nations. A more prosperous nation
experiences higher productivity growth in the tradable goods sector than a poor
nation. Thus, an increase in the prices of non-tradable goods will be higher in a rich
country. As a result, a rich country’s real exchange rate will appreciate relative to
that of a poor nation (Asea and Corden 1994a, b).
The Balassa hypothesis may be tested in two forms: external and internal ver-
sions. The external version analyzes the impact of productivity growth on the real
exchange rate. The internal version analyzes the impact of productivity on relative
prices. If one fails to prove a relationship between productivity and the real
exchange rate, it is most likely that the hypothesis is functioning through the
internal version; that is, the relationship between relative prices and the real
exchange rate or relative prices and productivity growth should be tested. The main
objective of our study is to examine the validity of the external version of the
hypothesis.
The core idea of the Balassa hypothesis is related to the concept of convergence
(β-convergence) in growth theories. Both describe features of developing
economies. Convergence between economies may be roughly defined as the
tendency for levels of per capita income or productivity to equalize over time.
Growth theories² state that countries with low capital-to-labor ratios (high marginal
productivity of capital) in general and with advantages of elements such as innovation
ability, human capital formation, technical progress, and economies of scale
in particular grow faster than others (Kumo 2011; Orlik 2003; Soukiazis 1995).

¹ Though the idea has been mentioned by several authors (like Ricardo 1911; Harrod 1933; Viner
1937), the contributions of other authors were not as prominent as those of Paul A. Samuelson and
Bela Balassa, hence the name Balassa–Samuelson hypothesis (Tica and Druzic 2006). The term
‘Balassa hypothesis’ is used in this study.
According to these growth theories, there is a tendency for developing countries
to grow faster than developed countries if some conditions in particular are satis-
fied. Given that the Balassa hypothesis is related to the impact of economic (pro-
ductivity) growth on the real exchange rate, there should be a greater probability of
finding evidence for the hypothesis in converging economies as compared to
developed ones. The convergence process, thus, may be used as a criterion for
identifying candidate countries for a sample study.
12.3 Methodology
12.3.1 Data Type and Collection Methods
Data for all countries and variables are from the Penn World Table for the period
1950–2011. The variables include exchange rate, per capita GDP, and government
expenditure. While the choice of the study period for each country depends on
data availability, countries are selected on the basis of the convergence criterion
which suggests that the fastest growing economies are mainly the developing
economies.
According to IMF’s World Economic Outlook Report (2015), all 15 countries in
our sample are developing countries. However, for comparison purposes, the
sample is divided into two categories on the basis of the size of economies (relative
GDP). The first group represents the top five largest economies in the sample—
BRICS (Brazil, Russia, India, China, and South Africa). They are from the (upper)
middle-income countries’ category (except India) which together represent nearly
90% of the size of the US economy. The second group consists of 10 low-income countries
(Angola, Ethiopia, Ghana, Indonesia, Kenya, Nigeria, the Philippines, Rwanda,
Tanzania, and Uganda). Lower middle-income countries (with per capita income
lower than $4125) are included in the second group in our sample.
² There are three main theoretical approaches to explain the convergence phenomenon: the
neoclassical approach, endogenous growth theory, and the demand-orientated approach. While
(absolute) convergence is inherent in the diminishing returns to reproducible capital of the first
approach, it is conditional on different factors and elements such as innovation ability, human
capital formation, technical progress, and economies of scale in the second and third approaches
(Soukiazis 1995).
12.3.2 Model Specification
The original Balassa model was designed for a fully employed small open economy;
a 2 × 2 × 2 system (two countries, two commodities, two factors); labor
(the scarce factor) that is mobile between sectors and capital that is mobile between
nations; the law of one price for factors within a nation and for tradables across
nations; a constant returns to scale production frontier; perfect competition in both
markets (goods and factors); neutral technical progress; and constant terms of trade
(Podkaminer 2003).
A derivation of the Balassa–Samuelson model may be considered as a
three-stage process. The first is to derive the relationship between the productivity
differential and relative price. The second is to derive the relationship between
relative price and exchange rate. The third is to derive the relationship between
productivity differential and exchange rate.
STEP 1: The original Balassa–Samuelson model is framed on the basis of the
traditional Ricardian trade model (Asea and Corden 1994a, b). It is a supply-side
model defined by constant returns to scale Cobb–Douglas style production functions
in two sectors as (Podkaminer 2003):
$$Y_T = A_T L_T^{\alpha} K_T^{1-\alpha} \tag{12.1}$$
$$Y_N = A_N L_N^{\beta} K_N^{1-\beta} \tag{12.2}$$
where $T$ and $N$ refer to traded and non-traded sectors, and $\alpha$ and $\beta$ represent the
share of labor in each sector, respectively, with $\beta \ge \alpha$.
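In a perfectly competitive market, factor prices must equal their respective value
of marginal products at equilibrium for both sectors:
$$P_T A_T \alpha \left(\frac{K_T}{L_T}\right)^{1-\alpha} = w \tag{12.3}$$
$$P_T A_T (1-\alpha) \left(\frac{K_T}{L_T}\right)^{-\alpha} = r \tag{12.4}$$
$$P_N A_N \beta \left(\frac{K_N}{L_N}\right)^{1-\beta} = w \tag{12.5}$$
$$P_N A_N (1-\beta) \left(\frac{K_N}{L_N}\right)^{-\beta} = r \tag{12.6}$$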
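The first-order conditions (12.3)–(12.6) can be checked numerically. The following is a minimal sketch, not part of the original study: the parameter values are purely illustrative, and the function names are chosen here for exposition. It compares the closed-form value of the marginal product of labor with a finite-difference derivative of the production function (12.1), and confirms that under constant returns to scale factor payments exhaust the value of output.

```python
def cobb_douglas(A, L, K, labor_share):
    """Output A * L**s * K**(1 - s), as in Eqs. (12.1)-(12.2)."""
    return A * L**labor_share * K**(1.0 - labor_share)


def value_mp_labor(P, A, L, K, labor_share):
    """Left-hand side of Eqs. (12.3)/(12.5): P * A * s * (K/L)**(1 - s)."""
    return P * A * labor_share * (K / L)**(1.0 - labor_share)


def value_mp_capital(P, A, L, K, labor_share):
    """Left-hand side of Eqs. (12.4)/(12.6): P * A * (1 - s) * (K/L)**(-s)."""
    return P * A * (1.0 - labor_share) * (K / L)**(-labor_share)


# Illustrative (hypothetical) values for the tradable sector.
P_T, A_T, L_T, K_T, ALPHA = 1.0, 2.0, 50.0, 80.0, 0.6

wage = value_mp_labor(P_T, A_T, L_T, K_T, ALPHA)
rental = value_mp_capital(P_T, A_T, L_T, K_T, ALPHA)

# Central finite difference of the value of output with respect to labor.
STEP = 1e-5
fd_wage = P_T * (cobb_douglas(A_T, L_T + STEP, K_T, ALPHA)
                 - cobb_douglas(A_T, L_T - STEP, K_T, ALPHA)) / (2.0 * STEP)

# Under constant returns to scale, w*L + r*K equals P*Y.
factor_payments = wage * L_T + rental * K_T
value_of_output = P_T * cobb_douglas(A_T, L_T, K_T, ALPHA)
```

The same functions apply to the non-tradable sector by passing the labor share β in place of α.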
Combining the two factor market conditions for each sector independently and taking the
logarithm of both sides of each equation yields:
$$\log(P_T) = (1-\alpha)\log(r) - (1-\alpha)\log(1-\alpha) + \alpha\log(w) - \log(A_T) \tag{12.7}$$
$$\log(P_N) = (1-\beta)\log(r) - (1-\beta)\log(1-\beta) + \beta\log(w) - \log(A_N) \tag{12.8}$$
Recalling the assumption that the price of tradables (numeraire) and the interest rate
(but not technology) are the same across borders, differentiation of the above with
respect to time yields:
$$\frac{dP_T(\tau)/d\tau}{P_T(\tau)} = 0 = \alpha\,\frac{dw(\tau)/d\tau}{w(\tau)} - \frac{dA_T(\tau)/d\tau}{A_T(\tau)} \tag{12.9}$$
$$\frac{dP_N(\tau)/d\tau}{P_N(\tau)} = \beta\,\frac{dw(\tau)/d\tau}{w(\tau)} - \frac{dA_N(\tau)/d\tau}{A_N(\tau)} \tag{12.10}$$
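Substituting Eq. (12.9) into Eq. (12.10) helps define the relative price of
non-tradables in terms of productivity differentials for the home and foreign countries
($\hat{A}$ represents a growth rate, and an asterisk denotes the foreign country):
$$\frac{dP_N(\tau)/d\tau}{P_N(\tau)} = \frac{\beta}{\alpha}\,\frac{dA_T(\tau)/d\tau}{A_T(\tau)} - \frac{dA_N(\tau)/d\tau}{A_N(\tau)} \tag{12.11}$$
$$\hat{p}_N = \frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N \tag{12.12}$$
$$\hat{p}_N^* = \frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^* \tag{12.13}$$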
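The difference between Eqs. (12.12) and (12.13) defines price differentials
across countries:
$$\hat{p}_N - \hat{p}_N^* = \left(\frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N\right) - \left(\frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^*\right) \tag{12.14}$$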
This means that the price differential between sectors and across countries can be
explained by productivity differentials between sectors and across nations.
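The chain from Eq. (12.9) to Eq. (12.14) can be traced with a small numerical example. The sketch below is illustrative only: the growth rates and labor shares are hypothetical, the function names are chosen here, and the labor shares are assumed to be the same at home and abroad. Equation (12.9) pins down wage growth as tradable productivity growth divided by α, and substituting this into Eq. (12.10) gives the growth of non-tradable prices.

```python
def wage_growth(a_t_hat, alpha):
    """Eq. (12.9): 0 = alpha * w_hat - A_T_hat, so w_hat = A_T_hat / alpha."""
    return a_t_hat / alpha


def nontradable_price_growth(a_t_hat, a_n_hat, alpha, beta):
    """Eqs. (12.10)-(12.12): p_N_hat = beta * w_hat - A_N_hat."""
    return beta * wage_growth(a_t_hat, alpha) - a_n_hat


def price_growth_differential(home, foreign, alpha, beta):
    """Eq. (12.14): home minus foreign growth of non-tradable prices.

    `home` and `foreign` are (A_T_hat, A_N_hat) pairs.
    """
    return (nontradable_price_growth(home[0], home[1], alpha, beta)
            - nontradable_price_growth(foreign[0], foreign[1], alpha, beta))


# Hypothetical annual growth rates: tradable productivity grows by 4% at home
# and 2% abroad, non-tradable productivity by 1% in both countries.
ALPHA, BETA = 0.5, 0.75
home_growth = (0.04, 0.01)
foreign_growth = (0.02, 0.01)

p_n_home = nontradable_price_growth(*home_growth, ALPHA, BETA)
p_n_foreign = nontradable_price_growth(*foreign_growth, ALPHA, BETA)
differential = price_growth_differential(home_growth, foreign_growth, ALPHA, BETA)
```

With these numbers the home non-tradable price grows by about 5% and the foreign one by about 2%, so the differential is about 3 percentage points, driven entirely by the gap in tradable productivity growth.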
STEP 2: We follow Ahn (2009) to link the exchange rate and productivity differential
through the price index. The real exchange rate is defined in a log-linear form
as (increase shows appreciation):
$$Q = \frac{P}{eP^*}, \qquad q = p - e - p^* \tag{12.15}$$
Price indices are defined as weighted averages of prices in tradable and
non-tradable sectors in both domestic and foreign markets:
$$P = P_N^{\delta} P_T^{1-\delta} \quad \text{and} \quad P^* = (P_N^*)^{\theta} (P_T^*)^{1-\theta}$$
In log-linear form:
$$p = \delta p_N + (1-\delta) p_T \tag{12.16}$$
$$p^* = \theta p_N^* + (1-\theta) p_T^* \tag{12.17}$$
$\delta$ and $\theta$ represent the share of non-tradables in the consumer basket at home and
abroad, respectively. Substituting Eqs. (12.16) and (12.17) into Eq. (12.15) helps
define the real exchange rate as a function of the price differential:
$$q = \delta(p_N - p_T) - \theta(p_N^* - p_T^*) + p_T - e - p_T^* \tag{12.18}$$
Since $p_T = e + p_T^*$ (law of one price for tradables), Eq. (12.18) will be:
$$q = \delta(p_N - p_T) - \theta(p_N^* - p_T^*) \tag{12.19}$$
STEP 3: Eq. (12.19) defines the real exchange rate as a function of the relative price
differential between countries. Substituting Eq. (12.14) into Eq. (12.19) helps define
the exchange rate as a function of the productivity differential. We assume that the
share of non-tradables in the foreign consumer basket ($\theta$) is the same as at home ($\delta$).
Hence:
$$\hat{q} = \delta\left[\left(\frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N\right) - \left(\frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^*\right)\right] \tag{12.20}$$
If the home market grows faster than the foreign one, then the domestic currency
appreciates and vice versa.
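Steps 2 and 3 can be illustrated with a short sketch. All numbers below are hypothetical and the function names are chosen here for exposition. The first part checks that the direct definition of the log real exchange rate, q = p − e − p*, agrees with the relative-price form of Eq. (12.19) once the law of one price holds for tradables; the second part evaluates Eq. (12.20) under the assumption θ = δ and common labor shares.

```python
def log_price_index(p_n, p_t, share_n):
    """Eqs. (12.16)-(12.17): weighted average of log sector prices."""
    return share_n * p_n + (1.0 - share_n) * p_t


def log_real_exchange_rate(p, e, p_star):
    """Eq. (12.15) in logs: q = p - e - p*."""
    return p - e - p_star


def q_from_relative_prices(p_n, p_t, p_n_star, p_t_star, delta, theta):
    """Eq. (12.19), valid when p_T = e + p_T* (law of one price)."""
    return delta * (p_n - p_t) - theta * (p_n_star - p_t_star)


def q_growth(a_t, a_n, a_t_star, a_n_star, alpha, beta, delta):
    """Eq. (12.20) with theta = delta and common labor shares."""
    home = (beta / alpha) * a_t - a_n
    foreign = (beta / alpha) * a_t_star - a_n_star
    return delta * (home - foreign)


# Hypothetical log prices; the tradable price obeys the law of one price.
e, p_t_star, p_n_star, p_n = 0.5, 0.2, 0.3, 1.0
p_t = e + p_t_star
DELTA, THETA = 0.4, 0.3

p = log_price_index(p_n, p_t, DELTA)
p_star = log_price_index(p_n_star, p_t_star, THETA)
q_direct = log_real_exchange_rate(p, e, p_star)
q_relative = q_from_relative_prices(p_n, p_t, p_n_star, p_t_star, DELTA, THETA)

# Hypothetical growth rates: faster tradable productivity growth at home.
q_hat = q_growth(0.04, 0.01, 0.02, 0.01, alpha=0.5, beta=0.75, delta=DELTA)
```

A positive value of `q_hat` corresponds to a real appreciation of the home currency, in line with the sign convention of Eq. (12.15).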
In order to avoid the assumption of neutral technical progress, we introduced an
intercept in the econometric model (Kohler 1998). We also introduced demand-side
factors as the Balassa model is not complete by itself (De Gregorio and Wolf 1994).
Therefore, the econometric model used in our study is derived from Eq. (12.20) (see
Annexure 1 for derivation). It includes two more factors (demand and supply sides):
$$(\ln Q)_{it} = \alpha_i + \beta_{1i} \ln(Y/Y^*)_{it} + \beta_{2i} \ln(G/G^*)_{it} + \beta_{3i}\,\mathrm{vol}(E)_{it} + \varepsilon_{it} \tag{12.21}$$
where $Q$ and $E$ are real and nominal exchange rates.
$(\ln Q)_{it}$ is the log of the real exchange rate of each country measured against the US
dollar. An increase implies appreciation. The subscript ‘it’ refers to the ith country in
period t. $\ln(Y/Y^*)_{it}$ is the log of real GDP per capita relative to the US economy. It is
a proxy for the productivity growth differential in each country. The Balassa hypothesis
declares that productivity growth has a positive impact on the real exchange rate.
$\ln(G/G^*)_{it}$ is the log of relative real government expenditure. It is a proxy for fiscal
policy. Kohler (1998) argues that government expenditure accounts for demand
shifts toward non-tradables which results in appreciation of the real exchange rate in
the short run. In the long run, it does not have an impact unless financed by
distortionary taxes. Distortionary taxes reduce real wages and relative prices of
non-tradables, and this leads to the depreciation of the real exchange rate in the long
run.
$\mathrm{vol}(E)_{it}$ is exchange rate volatility measured as the absolute value of the percentage
change in the nominal exchange rate. It is a supply-side factor. The impact of
volatility on the real exchange rate may be positive or negative; it depends on the
time horizon and type of regime. Kohler (1998) shows that the impact of volatility
is smaller in the short run and in poor countries due to greater nominal rigidities. In
relatively fixed exchange rate regimes (mainly poor economies), movements in
nominal exchange rate are restricted. In this case, growing economies experience
inflation in both sectors with relative prices of non-tradables falling. This leads to a
depreciation of the real exchange rate. In contrast, there is smaller rigidity in freely
floating exchange rate regimes (mainly rich economies). With productivity growth,
inflation in the non-tradable sector is balanced by deflation in the tradable sector
(as a result of a nominal appreciation). Relative prices of non-tradables increase,
and this leads to the real exchange rate appreciation.
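As a minimal sketch of how the regressors in Eq. (12.21) could be built from raw series, the code below computes the log of a home-to-US ratio and the volatility measure (absolute percentage change in the nominal exchange rate). The series are made-up numbers, not Penn World Table data, and the function names are chosen here for illustration.

```python
import math


def log_ratio(home, foreign):
    """Log of the home series relative to the foreign (US) series."""
    return [math.log(h / f) for h, f in zip(home, foreign)]


def volatility(nominal_rate):
    """Absolute percentage change in the nominal exchange rate.

    The first period has no previous value, so it is returned as None.
    """
    vol = [None]
    for previous, current in zip(nominal_rate, nominal_rate[1:]):
        vol.append(abs((current - previous) / previous) * 100.0)
    return vol


# Hypothetical series for one country over three years.
gdp_home = [1000.0, 1100.0, 1210.0]
gdp_us = [40000.0, 40000.0, 40000.0]
nominal_rate = [10.0, 11.0, 9.9]

ln_y = log_ratio(gdp_home, gdp_us)
vol_e = volatility(nominal_rate)
```

Because volatility is defined from period-to-period changes, one observation per country is lost at the start of each series.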
12.3.3 Cross-sectional Dependence Test
Cross-sectional dependence is a problem associated with panel data that mixes
information from different cross sections and leads to a difficulty in interpreting the
individual effects of each section. It may be caused by socioeconomic network
effects, spatial effects, or the influence of a dominant unit or common unobserved
factors. When the problem is ignored, estimates are badly biased and the tests may
be misleading (Shin 2014). Factor models are used to filter out cross-sectional
dependence due to unobserved common factors. We used the Pesaran
cross-sectional independence test in our study as it is the most powerful test
(Eberhardt 2011). It is given by the CD (cross-sectional dependence) statistic, which is
distributed as $N(0, 1)$ under the null hypothesis of cross-sectional independence:
$$CD = \sqrt{\frac{2}{N(N-1)}} \left( \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \sqrt{T_{ij}}\, \hat{\rho}_{ij} \right) \tag{12.22}$$
where $\hat{\rho}_{ij}$ is the pairwise correlation of the residuals of units i and j, and $T_{ij}$ is the
number of time periods the two units have in common.
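The statistic in Eq. (12.22) is simple enough to compute by hand. The following sketch uses only the Python standard library and is meant as an illustration, not as the routine used in the study; the data layout (a dictionary of residuals keyed by unit and period) and the function names are assumptions made here. Correlations are computed over the periods common to each pair, which accommodates unbalanced panels.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pesaran_cd(residuals):
    """CD statistic of Eq. (12.22).

    `residuals` maps unit -> {period: residual}; panels may be unbalanced.
    """
    units = sorted(residuals)
    n = len(units)
    if n < 2:
        raise ValueError("at least two cross-sectional units are required")
    total = 0.0
    for a in range(n - 1):
        for b in range(a + 1, n):
            res_i, res_j = residuals[units[a]], residuals[units[b]]
            common = sorted(set(res_i) & set(res_j))
            if len(common) < 2:
                raise ValueError("each pair needs at least two common periods")
            rho_ij = correlation([res_i[t] for t in common],
                                 [res_j[t] for t in common])
            total += math.sqrt(len(common)) * rho_ij
    return math.sqrt(2.0 / (n * (n - 1))) * total


# Hypothetical residuals: three units that move together perfectly.
series = {1: 0.5, 2: -1.0, 3: 0.2, 4: 0.3}
identical = {"A": dict(series), "B": dict(series), "C": dict(series)}
cd_identical = pesaran_cd(identical)
```

In this extreme case every pairwise correlation equals one, so the statistic is far above the usual normal critical values and independence would be rejected.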
12.3.4 Panel Unit Root Tests
Six types of panel unit root tests are available: Levin–Lin–Chu (LLC), Harris–Tzavalis
(HT), Breitung, Im–Pesaran–Shin (IPS), Fisher type, and Hadri LM. The
panel data for our study are unbalanced, and N is fixed and small relative to T. We
also assume that the autoregressive parameter, $\rho$, is panel specific. Hence, the
candidate panel unit root tests that fit these criteria are the IPS and Fisher-type tests.
Another advantage of these tests is that they can be used to test a series which is not
serially independent across cross sections.
(a) The Im–Pesaran–Shin test
The following is a panel unit root test as proposed by Pesaran (2007) which
accounts for cross-sectional dependence. The standard Augmented Dickey–Fuller
(ADF) regressions are further augmented with cross-sectional averages of lagged
levels and first differences of individual series. Let $y_{i,t}$ be the observation on the ith
cross-sectional unit at time t, and suppose that it is generated according to the
simple dynamic linear heterogeneous panel data model:
$$y_{i,t} = (1 - \phi_i)\mu_i + \phi_i y_{i,t-1} + e_{it} \tag{12.23}$$
where $e_{it} = \gamma_i f_t + \varepsilon_{it}$; $i = 1, \ldots, N$; $t = 1, \ldots, T$.
The initial value, $y_{i,0}$, has a given density function with a finite mean and
variance, and the error term, $e_{it}$, has a single-factor structure. $f_t$ is the unobserved
common effect, and $\varepsilon_{it}$ is an individual-specific (idiosyncratic) error. The unit root
hypothesis of interest is expressed as:
$H_0: \phi_i = 1$ for all $i$, against the possibly heterogeneous alternatives
$H_1: \phi_i < 1$, $i = 1, 2, \ldots, N_1$; $\phi_i = 1$, $i = N_1 + 1, N_1 + 2, \ldots, N$.
$N_1/N$, the fraction of the individual processes that are stationary, is nonzero and
tends to the fixed value $\delta$ such that $0 < \delta < 1$ as $N \to \infty$. This condition is
necessary for the consistency of unit root tests.
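For reference, the cross-sectionally augmented Dickey–Fuller (CADF) regression described above can be written, in its simplest form without additional lags, as follows. The coefficient labels $a_i, b_i, c_i, d_i$ and the residual $\eta_{it}$ are notation chosen here, and $\bar{y}_t$ denotes the cross-sectional average of $y_{it}$.

```latex
\Delta y_{it} = a_i + b_i\, y_{i,t-1} + c_i\, \bar{y}_{t-1} + d_i\, \Delta \bar{y}_t + \eta_{it},
\qquad
\bar{y}_t = \frac{1}{N} \sum_{j=1}^{N} y_{jt}
```

Here $b_i = -(1 - \phi_i)$, so the unit root null $\phi_i = 1$ corresponds to $b_i = 0$ for all $i$, and the panel statistic is the average across units of the individual t-ratios on $b_i$.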
(b) Fisher-type tests

Maddala and Wu (1999) provide a Fisher-type panel unit root test which
accounts for cross-sectional dependence. Like the IPS test, the Fisher-type test is a
way of combining evidence on the unit root hypothesis from the N unit root tests
performed on N cross-sectional units. The Fisher-type test makes this approach more
explicit. It combines p values from panel-specific unit root tests using four methods.
Three of the methods differ in whether they use inverse chi-square, inverse-normal,
or inverse-logit transformation of p values, and the fourth is a modification of the
inverse Chi-square transformation. The inverse-normal Z statistic offers the best
trade-off between size and power.
Let $G_{i,T_i}$ be a unit root test statistic for the ith group, and assume that as $T_i \to \infty$,
$G_{i,T_i} \Rightarrow G_i$. Let $p_i$ be the p value of a unit root test for cross section i, that is,
$p_i = 1 - F(G_{i,T_i})$, where $F(\cdot)$ is the distribution function of the random variable $G_i$. In
Chen (2013), the Fisher-type test is given as:
$$P = -2 \sum_{i=1}^{N} \ln p_i \tag{12.24}$$
$P$ is distributed as $\chi^2$ with $2N$ degrees of freedom as $T_i \to \infty$ for all $N$. A $p_i$ value
closer to zero ($\ln p_i$ closer to $-\infty$) implies a large value of $P$, and then the null
hypothesis of a panel unit root is rejected. A $p_i$ value closer to 1 ($\ln p_i$ closer
to zero) implies that the null hypothesis of a panel unit root cannot be rejected.
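The inverse chi-square combination in Eq. (12.24) can be sketched in a few lines. The code below is illustrative: the p values are hypothetical, the function names are chosen here, and the chi-square tail probability uses the closed form that is available when the degrees of freedom are even (as they always are for 2N).

```python
import math


def fisher_statistic(p_values):
    """Eq. (12.24): P = -2 * sum of ln(p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m the survival function is exp(-x/2) * sum_{k<m} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_test(p_values):
    """Return the Fisher statistic and its combined p value (2N df)."""
    stat = fisher_statistic(p_values)
    return stat, chi2_sf_even_df(stat, 2 * len(p_values))


# Hypothetical unit-specific p values from individual unit root tests.
statistic, combined_p = fisher_test([0.40, 0.65, 0.20, 0.85])
```

A large combined p value, as in this made-up example, means the null hypothesis of a unit root in all panels is not rejected.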
12.3.5 Panel Co-integration Tests
There are two ways to deal with nonstationary variables in a given model
after the stationarity test. The first step is to test whether a linear combination of the
nonstationary variables is stationary by using a co-integration test. If they are
co-integrated, then we proceed to a long-run analysis with the nonstationary variables.
Otherwise, we difference the nonstationary variables for a short-run analysis.
Engle and Granger (1987) noted that ‘a test for co-integration can be thought as a
pretest to avoid “spurious regression” situations.’ If regression of one nonstationary
variable over another nonstationary variable yields a stationary series, it is known as
a co-integrating regression and the slope parameter in such a regression is known as
a co-integrating parameter.
We employ a residual-based Pedroni co-integration test which is simply a unit
root test applied to the residuals obtained from a co-integrating regression. If
variables are co-integrated, then the residuals should be I(0). If the variables are not
co-integrated, then the residuals are not I(0) (Pedroni 2004). The test allows for
heterogeneous intercepts and trend coefficients across cross sections. It is based on a
residual obtained from a regression:
$$y_{it} = \alpha_i + \delta_i t + \beta_{1i} x_{1i,t} + \beta_{2i} x_{2i,t} + \cdots + \beta_{Mi} x_{Mi,t} + e_{i,t} \tag{12.25}$$
for $t = 1, \ldots, T$; $i = 1, \ldots, N$; $m = 1, \ldots, M$; and x and y are assumed to be
integrated of order 1, I(1). The parameters $\alpha_i$ and $\delta_i$ are individual and trend effects.
Pedroni proposes seven different statistics to test panel data co-integration: panel
v-statistic, panel rho-statistic, panel PP-statistic, panel ADF-statistic, group
rho-statistic, group PP-statistic, and group ADF-statistic. The first four are based on
pooling or the ‘within’ dimension, and the last three are based on the ‘between’
dimension. The null hypothesis is no co-integration for both. However, the alternative
hypothesis is $\rho_i = \rho < 1$ for all i in the former, and it is $\rho_i < 1$ for all i in the
latter (Pedroni 2004).
12.3.6 Estimation Method
The choice of estimation method mainly depends on the results of preliminary tests
of the data. In our case, we looked for a method suited to the analysis of nonstationary
variables which are co-integrated. We also required a method that provides estimated
coefficients for individual countries. Therefore, we do not consider traditional
estimators such as pooled OLS, fixed effects, and first-difference OLS models, which
assume homogeneous technology parameters and factor loadings
(common slope). Eberhardt et al. (2011) and others have suggested using the
pooled mean group estimation method for analyzing nonstationary variables which
are co-integrated in a long panel setting. This method is helpful for heterogeneous
technology parameters and factor loadings in particular.
The pooled mean group (PMG) estimator involves averaging and pooling. It
restricts long-run coefficients to be homogenous over cross sections, but allows for
heterogeneity in intercepts, short-run coefficients (including the speed of adjust-
ment), and error variances. It is argued that country heterogeneity is particularly
relevant in short-run relationships given that countries may be affected by
over-lending, borrowing constraints and financial crises in short-time horizons.
Homogenous long-run relationships may be assumed for reasons such as budget or
solvency constraints, arbitrage conditions, or common technologies (Cavalcanti
et al. 2011).
The relationship in pooled mean group estimation may be defined by an ARDL
model as:
$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i (q_{i,t-1} - \theta x_{i,t-1}) + \varepsilon_{it} \tag{12.26}$$
where $q = \ln Q$ and $x = \ln X$. The $\beta_i$ are short-run parameters, which, like the error
variances $\sigma_i^2$, differ across countries. The error correction term, $\lambda_i$, also differs across
$i$; the long-run parameter, $\theta$, however, is constant across the groups. This estimator is
quite appealing when studying small sets of arguably ‘similar’ countries. In I(1) panels,
this estimator allows for a mix of co-integration ($\lambda_i \neq 0$) and non-co-integration
($\lambda_i = 0$). $x_{i,t}$ represents the set of explanatory variables defined in Eq. (12.21).
To account for cross-sectional dependence which may result from any common
unobserved factor incorporated in the error term, we follow Pesaran’s (2013)
common correlated effect approach. Unlike de-meaning, the approach handles
multiple factors which can be correlated with regressors and serial correlation in
errors and lagged dependent variables (Shin 2014). It does not require prior
knowledge of the number of unobserved common factors and can be applied to
dynamic panels with heterogeneous coefficients and weakly exogenous regressors
(Pesaran 2013). The procedure consists of approximating the linear combinations of
unobserved common factors by cross-sectional averages of the dependent and
explanatory variables and then running standard panel regressions augmented with
these cross-sectional averages.
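The augmentation step can be sketched as follows. This is an illustration under simplifying assumptions made here: the panel is stored as nested dictionaries, there is a single regressor, and averages in each period are taken over the units observed in that period. The function names and data are hypothetical; the resulting rows would then be passed to whichever panel estimator is used.

```python
def cross_section_averages(panel):
    """Average of a variable across units, period by period.

    `panel` maps unit -> {period: value}; units may cover different periods.
    """
    totals, counts = {}, {}
    for series in panel.values():
        for period, value in series.items():
            totals[period] = totals.get(period, 0.0) + value
            counts[period] = counts.get(period, 0) + 1
    return {period: totals[period] / counts[period] for period in sorted(totals)}


def augment_with_averages(y_panel, x_panel):
    """Rows (unit, period, y, x, y_bar, x_bar) for the augmented regression."""
    y_bar = cross_section_averages(y_panel)
    x_bar = cross_section_averages(x_panel)
    rows = []
    for unit in sorted(y_panel):
        for period in sorted(y_panel[unit]):
            if period in x_panel.get(unit, {}):
                rows.append((unit, period,
                             y_panel[unit][period], x_panel[unit][period],
                             y_bar[period], x_bar[period]))
    return rows


# Hypothetical unbalanced panel with two units.
y_data = {"A": {1: 1.0, 2: 2.0}, "B": {1: 3.0, 2: 4.0, 3: 5.0}}
x_data = {"A": {1: 10.0, 2: 20.0}, "B": {1: 30.0, 2: 40.0}}

averages_y = cross_section_averages(y_data)
augmented_rows = augment_with_averages(y_data, x_data)
```

The averaged columns carry no direct interpretation; they stand in for the unobserved common factors.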
The PMG estimator for a cross-sectionally dependent series may be explicitly
defined as:
$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i (q_{i,t-1} - \theta x_{i,t-1}) + \gamma_i' f_t + \varepsilon_{it} \tag{12.27}$$
where $\gamma_i' f_t + \varepsilon_{it} = e_{it}$.
$f_t$ is a vector of unobserved common shocks which captures the source of error
term dependencies across countries. It may be stationary or nonstationary. The
impacts of these factors on each country are governed by the idiosyncratic loadings
in $\gamma_i$. The individual-specific errors, $\varepsilon_{it}$, are distributed independently across i and t;
they are not correlated with the unobserved common factors or the regressors; and
they have zero mean, variance greater than zero, and finite fourth moments
(Cavalcanti et al. 2011). The augmented pooled mean group estimator is,
therefore, defined by substituting cross-sectional averages for the unobserved
common factors, $f_t$:
$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i (q_{i,t-1} - \theta x_{i,t-1}) + \frac{1}{N} \sum_{l=1}^{P_T} \delta\, z_{w,t-l} + \varepsilon_{it} \tag{12.28}$$
where $z_{w,t}$ represents a set of cross-sectional averages of the dependent and independent
variables and their lagged values which approximate/proxy the unobserved
common factors ($f_t$). The focus of this estimator is on obtaining consistent estimates
of parameters related to observable variables, while the estimated coefficients on
cross-sectionally averaged variables are not interpretable in a meaningful way:
They are merely present to remove the biasing impact of unobservable common factors
(Eberhardt 2012).
12.3.7 Error Correction Mechanism (ECM)
If two or more variables are co-integrated or prove to have a long-run relationship,
then an error correction mechanism is called for. The error correction
mechanism (ECM) is a method used to correct any short-run deviations of variables
from their long-run equilibrium; that is, it corrects for short-run disequilibrium. An
important theorem, the Granger representation theorem, states that if two variables
Y and X are co-integrated, then the long-term or equilibrium relationship that exists
between the two can be expressed as an ECM (Engle and Granger 1987). This means
that one should construct an error correction model if the two
variables are co-integrated. The ECM is given as follows in Bhattarai (2011) for ARDL
(1,1) with $\beta_i = 0$:
$$q_{it} = \alpha_i + \gamma_i q_{i,t-1} + \beta_i x_{it} + \theta_i x_{i,t-1} + \varepsilon_{it}$$
$$\Delta q_{it} = \alpha_i + \beta_i \Delta x_{it} + \lambda_i u_{i,t-1} + \varepsilon_{it} \tag{12.29}$$
$\Delta$ denotes the first-difference operator, $\varepsilon_{it}$ is a random error term, and
$u_{i,t-1} = q_{i,t-1} - \theta x_{i,t-1}$ is the one-period lagged value of the error term from a
co-integrating regression.
This ECM equation states that $\Delta q_{it}$ depends on $\Delta x_{it}$ and also on the equilibrium
error term. If the error term is nonzero, the model is out of equilibrium. Suppose $\Delta x_{it}$
is zero (Bhattarai 2011) and $u_{i,t-1}$ is positive; this means $q_{i,t-1}$ is too high (above) to be
in equilibrium. Since $\lambda_i$ is expected to be negative, the term $\lambda_i u_{i,t-1}$ is negative, and
therefore, $\Delta q_{it}$ will be negative to restore equilibrium. That is, if $q_{it}$ is above its
equilibrium value, it will start falling in the next period to correct the equilibrium
error. Similarly, if $u_{i,t-1}$ is negative (i.e., $q_{it}$ is below its equilibrium value), $\lambda_i u_{i,t-1}$
will be positive, which causes $\Delta q_{it}$ to be positive, leading $q_{it}$ to rise in the next
period. The absolute value of $\lambda_i$ determines how quickly the equilibrium is restored
(Engle and Granger 1987).
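The link between the ARDL(1,1) form and the error correction form can be made concrete with a short sketch. The mapping below follows from subtracting $q_{i,t-1}$ from both sides of the ARDL equation and collecting terms; the names `theta_lag` (coefficient on lagged x in the ARDL form) and `theta_long` (long-run coefficient in the ECM) are labels chosen here to keep the two apart, and all numbers are hypothetical. The second function traces the disequilibrium error when x is constant and there are no shocks or intercept, in which case $u_t = (1 + \lambda) u_{t-1}$.

```python
def ardl_to_ecm(gamma, beta, theta_lag):
    """Map ARDL(1,1) coefficients to ECM parameters.

    q_t = a + gamma*q_{t-1} + beta*x_t + theta_lag*x_{t-1} + e_t
    implies
    dq_t = a + beta*dx_t + lam*(q_{t-1} - theta_long*x_{t-1}) + e_t
    with lam = gamma - 1 and theta_long = (beta + theta_lag) / (1 - gamma).
    """
    if gamma == 1.0:
        raise ValueError("gamma = 1 implies no error correction")
    lam = gamma - 1.0
    theta_long = (beta + theta_lag) / (1.0 - gamma)
    return lam, theta_long


def adjustment_path(u0, lam, periods):
    """Disequilibrium error over time with constant x and no shocks."""
    path = [u0]
    for _ in range(periods):
        path.append((1.0 + lam) * path[-1])
    return path


# Hypothetical coefficients.
lam, theta_long = ardl_to_ecm(gamma=0.7, beta=0.2, theta_lag=0.1)

# Starting one unit above equilibrium with lam = -0.5, half of the
# remaining gap is closed in each period.
gap = adjustment_path(1.0, -0.5, 3)
```

A value of λ closer to −1 closes the gap faster, while a value closer to zero implies slow adjustment.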
12.4.1 Test Results
This analysis begins by performing different econometric tests. Since not all unit
root tests are appropriate under cross-sectional dependence, a cross-sectional
independence test was performed to decide the type of panel unit root test to be considered. Using the
Pesaran CD test, and possibly all other tests, the null hypothesis of cross-sectional
independence was rejected for the original data. Hence, the series for our data was
initially cross-sectionally dependent. However, after the data were augmented for
cross-sectional averages to eliminate unobserved common factors, the Pesaran CD
test, and possibly two other tests, failed to reject the null hypothesis of
cross-sectional independence. The test results are given in Annexure 2.
IPS and Fisher-type tests are panel unit root tests which account for cross-sectional
dependence. The results of the tests with different assumptions are given in
Annexure 2. All the variables are nonstationary at the 1% level of significance.
The next step is to test for co-integration—whether there is a long-run relation
between our nonstationary variables. The test for co-integration is residual based.
We used two Pedroni type tests (ADF and PP tests) and the IPS test. In all the cases,
we strongly reject the null hypothesis of no co-integration for both types of models
(augmented and non-augmented) (see Annexure 2). Augmented models include
cross-sectional averages of dependent and independent variables to account for
cross-sectional dependence.
We propose three types of augmented models for the model selection criterion:
models I, II, and III with one, two, and three explanatory variables, respectively.
Even though the model selection criterion suggests that a model with three variables
is our ‘best model’ in terms of log-likelihood ratio and Akaike information criteria
(see Annexure 2), we present the results of the other models as well for comparison.
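The model comparison by information criterion works as in the sketch below. The log-likelihood values and parameter counts are invented for illustration and are not the figures reported in Annexure 2; the function names are chosen here.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def best_by_aic(models):
    """Return the name of the model with the smallest AIC.

    `models` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(models, key=lambda name: aic(*models[name]))


# Hypothetical log-likelihoods and parameter counts for Models I-III.
candidates = {
    "Model I": (-120.0, 3),
    "Model II": (-110.0, 4),
    "Model III": (-100.0, 5),
}
selected = best_by_aic(candidates)
```

An extra regressor is retained only if the gain in log-likelihood outweighs the penalty of one additional parameter.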
12.4.2 Estimation Results
Unlike most previous studies, the results of our study were not uniform across all
developing countries. The impact of productivity growth on the real exchange rate
differed by income group or per capita income. Productivity growth led to an
appreciation in middle-income countries and depreciation in low-income countries
in the long run. Our results substantiate the findings of Drine and Rault (2002,
2004) and Chuah (2012). Drine and Rault (2002, 2004) found evidence for the
hypothesis in a study for middle-income countries (MICs) in 2002 and failed to
arrive at the same conclusion for low-income countries (LICs) in another study in
2004. Our findings also seem to be in implicit confirmation of Chuah’s (2012)
results. He calculated a turning point ($2200) below which change in income
resulted in depreciation of the real exchange rate. Almost all LICs in our study had
a per capita income less than $2200. The conclusions of Chuah’s (2012) study
coincide with our conclusions for LICs such as Indonesia, Kenya, Nigeria,
Tanzania, and Uganda.
Table 12.1 shows the long-run results of the panel co-integration estimation
using the augmented PMG estimator for the different country groups and models in
the sample. Following convention, we present the estimated coefficients of the
observable variables only, since the coefficients on the cross-sectionally averaged
variables are not directly interpretable in a meaningful way. Estimated coefficients
of the full models (with observable and unobservable variables) are reported in
Annexure 3.
Basically, we consider three types of models in comparing three types of groups:
the all countries group (15 countries), country groups by income (middle-income
countries (MICs), 5 countries; low-income countries (LICs), 10 countries), and
268 F. Baylie
Table 12.1 Panel co-integration estimation: the augmented PMG estimator
Long-run coefficients; dependent variable = ln Q

Sample [# of countries]  Model      ln(Y/Y*)        ln(G/G*)        vol(E)
All countries [15]       Model I    0.378296***
                                    (0.108540)
                         Model II   0.246642***     0.170621***
                                    (0.075056)      (0.037414)
                         Model III  0.388355***     0.110547**      −0.021531**
                                    (0.092802)      (0.045996)      (0.010345)
MICs (BRICS) [5]         Model I    0.109014
                                    (0.151776)
                         Model II   0.382324**      0.220887*
                                    (0.160930)      (0.120288)
                         Model III  0.344657**      0.252061***     0.024845**
                                    (0.149000)      (0.092608)      (0.012602)
LICs [10]                Model I    0.320488*
                                    (0.132415)
                         Model II   0.211173**      0.140663***
                                    (0.098594)      (0.041793)
                         Model III  −0.287286***    0.010920        −3.21610***
                                    (0.086673)      (0.049502)      (0.419248)
Africa [9]               Model I    0.366216**
                                    (0.144702)
                         Model II   0.239056**      0.168407***
                                    (0.103439)      (0.043046)
                         Model III  −0.247591***    0.077565**      −2.56978***
                                    (0.071625)      (0.038439)      (0.285754)
Asia [4]                 Model I    0.248016
                                    (0.258474)
                         Model II   0.519768**      −0.179409**
                                    (0.216328)      (0.078622)
                         Model III  0.771434*       −0.339890       −7.71735***
                                    (0.439197)      (0.207397)      (2.826821)

Note Q and E are the real and nominal exchange rates, Y/Y* = real GDP of home relative to foreign
(US), G/G* = real government expenditure of home relative to foreign (US), and vol(E) = exchange
rate volatility
***, **, and * refer to significance at the 1, 5, and 10% levels. Standard errors in parentheses
MICs refers to middle-income countries of the BRICS group (Brazil, Russia, India, China, and
South Africa)
LICs refers to low-income countries (Angola, Ethiopia, Ghana, Indonesia, Kenya, Nigeria, the
Philippines, Rwanda, Tanzania, and Uganda)
Africa refers to African countries (Angola, Ethiopia, Ghana, Kenya, Nigeria, Rwanda, Tanzania,
Uganda, and South Africa)
Asia refers to Asian countries (China, India, Indonesia, and the Philippines)
country groups by region (Africa, 9 countries; Asia, 4 countries). For each group in
Table 12.1, the first row shows a model with one explanatory variable
(productivity); the second row shows a model with two explanatory variables
(productivity and government expenditure); and the third row shows a model with
three explanatory variables (productivity, ln(Y/Y*); government expenditure,
ln(G/G*); and exchange rate volatility, vol(E)). Our discussion below centers on
Model III (shaded rows) for each group.
In general, the results in Table 12.1 show that the Balassa hypothesis holds in the
long run for all countries in the sample taken as a group; that is, a 1%
improvement in productivity leads to an appreciation of domestic currencies in the
developing countries in the group by 0.388% on average. We find a different result,
however, when the sample is split into groups. When countries are categorized by
level of per capita income, the Balassa hypothesis holds only for middle-income
countries (MICs). The same is true when countries are categorized by region; that
is, the Balassa hypothesis holds only for Asian countries. This may be related to
the fact that in our sample, most middle-income countries are from Asia and poor
countries are from Africa. In both cases, a 1% increase in productivity appreciates
the domestic currencies of countries in the MICs and Asia groups by about 0.34 and
0.77%, respectively (though only at the 10% level of significance for the latter).
For the LICs and Africa groups, a 1% increase in productivity depreciates domestic
currencies by about 0.287 and 0.248%, respectively.
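Because both the real exchange rate and relative productivity enter in logs, each long-run coefficient is an elasticity, and the "1% leads to b%" reading is a first-order approximation. The sketch below computes the exact implied change for the Model III productivity coefficients of Table 12.1; the helper name `implied_change` is a hypothetical label, and the 10% scenario is purely illustrative.

```python
import math

# Long-run productivity coefficients, Model III, Table 12.1.
beta = {
    "All countries": 0.388355,
    "MICs": 0.344657,
    "LICs": -0.287286,
    "Africa": -0.247591,
    "Asia": 0.771434,
}

def implied_change(coef, pct_increase):
    """Exact % change in Q implied by a log-log coefficient (hypothetical helper)."""
    return (math.exp(coef * math.log(1 + pct_increase / 100)) - 1) * 100

for group, b in beta.items():
    small = implied_change(b, 1)    # effect of a 1% productivity increase
    large = implied_change(b, 10)   # effect of a 10% productivity increase
    print(f"{group:14s} 1% -> {small:+.3f}%   10% -> {large:+.2f}%")
```

For a 1% change the exact figure is almost identical to the coefficient itself; the gap widens only for larger productivity changes.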
The long-run relationship between government expenditure and the real
exchange rate shows that expansionary fiscal policies result in an appreciation of
domestic currencies in all cases except the LICs and Asia groups. This may not
be surprising, as the ‘big economies’ in the two groups are largely the same
(Indonesia and the Philippines are members of both groups). The results for these
groups are in line with Kohler (1998), who argues that government expenditure does
not have an impact in the long run unless it is financed by distortionary taxes.
In the long run, exchange rate volatility depreciates the real exchange rate in
all groups except the middle-income group. This confirms theoretical arguments
which associate relatively fixed or highly managed exchange rate systems (mainly in
poor countries) with depreciation, and flexible regimes with appreciation, of the
real exchange rate.
Table 12.2 presents the results of short-run dynamics of the same groups of
countries and models as given in Table 12.1. The discussion that follows focuses on
Model III (the shaded rows). Short-run dynamics show that the impact of change in
productivity on change in the real exchange rate is significant but negative; that is, it
has the impact of depreciating the real exchange rate for all countries in all groups
in the short run.
Fiscal policy does not significantly explain the variations in the real exchange
rate. Exchange rate volatility has an impact only in MICs and all countries groups.
It negatively impacts the real exchange rate in the short run. This may be due to
greater rigidity in the short run.
Table 12.2 Short-run dynamics of panel co-integration estimation: the augmented PMG
estimator
Short-run coefficients; dependent variable = Δ ln Q

Sample [# of countries]  Model      Adjustment      Δ ln(Y/Y*)      Δ ln(G/G*)      Δ vol(E)
                                    coefficient
All countries [15]       Model I    −0.109894***    −0.327721***
                                    (0.022462)      (0.081918)
                         Model II   −0.142981***    −0.359401***    −0.046910*
                                    (0.032498)      (0.091545)      (0.027791)
                         Model III  −0.122432***    −0.397465***    −0.057736*      −0.161279***
                                    (0.037203)      (0.082479)      (0.030909)      (0.038693)
MICs (BRICS) [5]         Model I    −0.167326***    −0.300195***
                                    (0.068251)      (0.082509)
                         Model II   −0.193645***    −0.408364***    −0.082710
                                    (0.102726)      (0.133144)      (0.073367)
                         Model III  −0.12243***     −0.3974***      −0.057736*      −0.161279***
                                    (0.037203)      (0.082479)      (0.030909)      (0.038693)
LICs [10]                Model I    −0.111716***    −0.352239***
                                    (0.027206)      (0.118025)
                         Model II   −0.141304***    −0.364848***    −0.042973
                                    (0.036735)      (0.136587)      (0.032613)
                         Model III  −0.086261***    −0.423972***    −0.016775       −0.020308
                                    (0.024822)      (0.099213)      (0.028716)      (0.031332)
Africa [9]               Model I    −0.098028***    −0.427619***
                                    (0.031963)      (0.104199)
                         Model II   −0.131601***    −0.427570***    −0.058400**
                                    (0.041812)      (0.122564)      (0.031572)
                         Model III  −0.107015***    −0.408247***    −0.032190       −0.034526
                                    (0.032358)      (0.100776)      (0.029437)      (0.046803)
Asia [4]                 Model I    −0.127473***    −0.213683
                                    (0.041693)      (0.193959)
                         Model II   −0.165803*      −0.229174       0.035224
                                    (0.092356)      (0.192154)      (0.067468)
                         Model III  −0.052518***    −0.458148***    −0.013347       −0.044258*
                                    (0.019297)      (0.162181)      (0.067688)      (0.023795)

Note Δ ln Q = log of real exchange rate differenced, Δ ln(Y/Y*) = log of real GDP relative to
foreign (US) differenced, Δ ln(G/G*) = log of real government expenditure relative to foreign
(US) differenced, and Δ vol(E) = exchange rate volatility differenced
***, **, and * refer to significance at the 1, 5, and 10% levels. Standard errors in parentheses
MICs refers to middle-income countries of the BRICS group (Brazil, Russia, India, China, and
South Africa)
Africa refers to African countries (Angola, Ethiopia, Ghana, Kenya, Nigeria, Rwanda, Tanzania,
Uganda, and South Africa)
Asia refers to Asian countries (China, India, Indonesia, and the Philippines)
The negative signs and statistical significance of the error-correction terms
show that the system is stable: in a stable co-integrating relationship, short-run
deviations from the long-run equilibrium are corrected each period at the rate given
by the error-correction coefficient. The rate of adjustment is, however, higher in
MICs (12.2%) than in LICs (8.6%). This means that MICs adjust faster and return to
equilibrium earlier than LICs. This may be associated with better conditions for
fulfilling the assumptions of the model in the former group.
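One common way to make "faster adjustment" concrete is the half-life of a deviation from equilibrium. This is a standard back-of-the-envelope calculation rather than a result reported in the chapter: it assumes that, with adjustment coefficient φ, a fraction (1 + φ)^t of a shock remains after t periods, so the half-life is ln(0.5)/ln(1 + φ). The function name `half_life` is a hypothetical label; the coefficients are the Model III values from Table 12.2.

```python
import math

def half_life(phi):
    """Periods needed to close half of a deviation, for an adjustment coefficient phi in (-1, 0)."""
    if not -1 < phi < 0:
        raise ValueError("phi must lie strictly between -1 and 0")
    return math.log(0.5) / math.log(1 + phi)

# Adjustment coefficients, Model III, Table 12.2.
adjustment = {"MICs": -0.12243, "LICs": -0.086261}

for group, phi in adjustment.items():
    print(f"{group}: half-life of about {half_life(phi):.1f} periods")
```

Under this assumption, half of a deviation is eliminated after roughly 5.3 periods in MICs and roughly 7.7 periods in LICs.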
Tables 12.3 and 12.4 present the short-run dynamics for individual countries in
two income groups (MICs and LICs), respectively. The results are for Model III.
The short-run dynamics show that the impact of productivity on the real
exchange rate was significant and negative for all countries except Brazil and South
Africa. In South Africa, productivity had no significant short-run effect, while in
Brazil the estimated effect was positive and significant only at the 10% level.
Expansionary fiscal policies resulted in depreciation of the real exchange rate in
Brazil, Russia, and China. The role of exchange rate volatility was significant in
all countries; the effect was negative everywhere except Russia, where it was
positive.
The rate of adjustment was the highest in Russia (56.25%) followed by Brazil
(22.76%). This may be associated with the size and features of these economies.
These are the two biggest economies in the group which account for 40 and 20% of
the US economy, respectively. A faster rate of adjustment means that they can
achieve equilibrium earlier than others.
Table 12.4 presents the short-run dynamics for LICs. The short-run dynamics
show that the impact of productivity on the real exchange rate was significant and
negative for all countries except Indonesia and Uganda. Productivity did not impact
Table 12.3 Short-run dynamics by country: middle-income group (BRICS): Model III
Short-run coefficients; dependent variable = Δ ln Q

Cases          Adjustment      Δ ln(Y/Y*)      Δ ln(G/G*)      Δ vol(E)
               coefficient
All countries  −0.12243***     −0.3974***      −0.057736*      −0.161279***
               (0.037203)      (0.082479)      (0.030909)      (0.038693)
Brazil         −0.227627***    0.228636*       −0.088359***    −0.002061***
               (0.004377)      (0.096920)      (0.006094)      (1.62E−05)
China          −0.064666***    −0.5913***      −0.101536***    −0.354891***
               (0.000587)      (0.012107)      (0.006333)      (0.007472)
India          −0.046854***    −0.4483***      0.015571        −0.301000***
               (0.000797)      (0.026819)      (0.008586)      (0.006721)
Russia         −0.562507***    −0.4439***      −0.451463***    0.006537***
               (0.015174)      (0.045163)      (0.010053)      (3.27E−05)
South Africa   −0.078684***    −0.194967       0.039537        −0.373313***
               (0.001472)      (0.151551)      (0.058783)      (0.009758)

MICs refers to middle-income countries of the BRICS group (Brazil, Russia, India, China, and
South Africa)
Table 12.4 Short-run dynamics by country: low-income group
Short-run coefficients; dependent variable = Δ ln Q

Cases            Adjustment      Δ ln(Y/Y*)      Δ ln(G/G*)      Δ vol(E)
                 coefficient
All countries    −0.086261***    −0.423972***    −0.016775       −0.020308
                 (0.024822)      (0.099213)      (0.028716)      (0.031332)
Angola           −0.001548***    −0.372509***    −0.171725**     −0.005968***
                 (4.27E−07)      (0.025896)      (0.003968)      (7.06E−06)
Ethiopia         −0.136337***    −0.816907***    −0.03830***     0.065656***
                 (0.000650)      (0.015748)      (0.002945)      (0.007485)
Ghana            −0.094831***    −0.691301***    0.03351***      −0.042398***
                 (0.000562)      (0.035033)      (0.002317)      (0.002733)
Indonesia        −0.000108***    −0.006073       0.06723***      −0.017159
                 (1.02E−05)      (0.086557)      (0.010901)      (5.17E−05)
Kenya            −0.248868***    −0.121422***    0.11217***      0.164651***
                 (0.001492)      (0.012865)      (0.000995)      (0.003452)
Nigeria          −0.042414***    −0.632917***    −0.02030***     −0.125337***
                 (0.000194)      (0.009143)      (0.000807)      (0.002453)
The Philippines  −0.145762***    −0.647025***    0.05421***      −0.089533***
                 (0.000584)      (0.022619)      (0.003975)      (0.002379)
Rwanda           −0.075394***    −0.345743***    −0.012847**     −0.028748***
                 (0.000518)      (0.005333)      (0.002458)      (0.002997)
Tanzania         −0.107271***    −0.665093***    −0.04128***     −0.177574***
                 (0.000337)      (0.013354)      (0.002109)      (0.003046)
Uganda           −0.010293***    0.059272        −0.15042***     0.053328***
                 (5.76E−05)      (0.032875)      (0.007855)      (0.002087)

***, **, and * refer to significance at the 1, 5, and 10% levels. Standard errors in parentheses
the real exchange rate in the short run in these countries. Fiscal policy was
significant in all countries, although the direction of the effect differed. An
increase in government expenditure resulted in a depreciation of the real exchange
rate in all countries except Ghana, Kenya, Indonesia, and the Philippines. The
largest fiscal effects in absolute terms were in Angola (−0.17) and Uganda (−0.15).
The impact of exchange rate volatility was significant in all countries except
Indonesia.
The rate of adjustment was the highest in Kenya (24.89%) followed by the
Philippines (14.58%) and Ethiopia (13.63%). These three countries may achieve
equilibrium earlier than others in the group.
12.5 Conclusions and Policy Implications
12.5.1 Conclusions
Unlike most previous studies, the results of our study are not uniform across all the
developing countries in our sample. The impact of productivity growth on the real
exchange rate varied by income group or per capita income. Productivity growth
led to an appreciation of the real exchange rate in middle-income countries and
depreciation of the real exchange rate in low-income countries in the long run. In
general, the results of our study confirm that the relationship between the real
exchange rate and productivity does exist and is stronger for higher income
countries in the long run. Real per capita income matters more than the rate of
economic growth in explaining the effects of the Balassa term in our study.
In the short run, however, we find almost uniform results across income groups.
Productivity growth (possibly of non-tradables), expansionary fiscal policies, and
high exchange rate volatility result in real exchange rate depreciation. More
specifically:
• Improvements in productivity and expansionary fiscal policies both depreciate
the real exchange rate in almost all countries, in both the middle- and
low-income groups.
• The impact of exchange rate volatility is significant only in middle-income
countries. This may be associated with the type of exchange rate policy/regime
adopted. It is mainly fixed (unchanged) in low-income countries in which case it
may not be useful to explain variations in the real exchange rate in the short run.
The reasons for the anti-Balassa hypothesis results in low-income countries in
our study may be associated with a failure to satisfy the basic assumptions of the
model. The relationship between the real exchange rate and productivity in the
external version of the hypothesis assumes a positive relationship between pro-
ductivity and relative prices as well as relative prices and the real exchange rate in
the internal version. In addition, the law of one price must hold in the tradable
sector.
12.5.2 Policy Implications
On the basis of our findings, we recommend the following policy options for MICs
and LICs:
• The Balassa hypothesis holds for middle-income countries in our sample.
Economic growth leads to an appreciation in the real exchange rate in these
countries. Hence, countries in this group may promote growth by increasing
productivity in the tradable sector.
• Since the Balassa hypothesis does not hold for low-income countries in our
sample, economic growth does not lead to real exchange rate appreciation in
these countries. Hence, countries in this group may continue to grow by
promoting productivity growth in the non-tradable sector.
• Depreciation of the real exchange rate can be associated with improvements in
the productivity of the non-tradable sector for low-income countries and should
be used accordingly.
• The role of fiscal policy may not last long in low-income countries and so should
be used accordingly.
Acknowledgements I am grateful for all comments and contributions of Professor Scott Hacker,
Professor Par Sjolander, and Dr. Girma Estiphanos for this work. It was a great pleasure to have
their say in my paper.
Annexure 1
1.1 Model Derivation (Scott Hacker’s Contribution)
Suppose that the growth rate of the real exchange rate is defined as a function of
productivity differential between the non-tradable and tradable sectors as in:

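$$\hat{Q} = \delta\left[\left(\frac{\beta}{\alpha}\hat{A}_T - \hat{A}_N\right) - \left(\frac{\beta}{\alpha}\hat{A}_T^* - \hat{A}_N^*\right)\right] \tag{1}$$

with $\hat{Q} \equiv \hat{p} - \hat{p}^*$.
This is the same as:

$$\hat{Q} = \delta\left[\frac{\beta}{\alpha}\left(\hat{A}_T - \hat{A}_T^*\right) - \left(\hat{A}_N - \hat{A}_N^*\right)\right] \tag{1.1}$$

If we let $\hat{M}_0 = -\delta\left(\hat{A}_N - \hat{A}_N^*\right)$ and $m_1 = \delta\beta/\alpha$, then:

$$\hat{Q} = \hat{M}_0 + m_1\left(\hat{A}_T - \hat{A}_T^*\right) \tag{1.2}$$

In levels form, this is equivalent to:

$$Q = M_0\left(A_T/A_T^*\right)^{m_1} \tag{1.3}$$

and in log-levels, it is

$$q = m_0 + m_1 \ln\left(A_T/A_T^*\right) \tag{1.4}$$

where $q \equiv \ln Q$ and $m_0 \equiv \ln M_0$.
We proxy $A_T/A_T^*$ with $Y/Y^*$, where $Y$ is the home real GDP per capita and $Y^*$ is
the foreign (US) real GDP per capita, so we get Eq. (2.23):

$$q = m_0 + m_1 \ln\left(Y/Y^*\right) \tag{1.5}$$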
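The step from the growth-rate form to the level form can be spelled out. The following is a sketch under two assumptions that the derivation leaves implicit: a hat denotes a log-derivative with respect to time, and $m_1$ is constant over time.

```latex
\begin{align*}
\hat{Q} = \frac{d\ln Q}{dt}
  &= \frac{d\ln M_0}{dt} + m_1\,\frac{d\ln\left(A_T/A_T^*\right)}{dt} \\
\Longrightarrow\quad \ln Q
  &= \ln M_0 + m_1 \ln\left(A_T/A_T^*\right) + c
  && \text{(integrating over time)} \\
\Longrightarrow\quad Q
  &= M_0\left(A_T/A_T^*\right)^{m_1}
  && \text{(exponentiating, with } c \text{ absorbed into } M_0\text{)}
\end{align*}
```

The constant of integration $c$ only rescales $M_0$, so it does not affect the slope $m_1$, which is the coefficient of interest in the estimated log-level equation.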
Annexure 2
See Tables 12.5, 12.6, 12.7, 12.8, 12.9, and 12.10
Table 12.5 Descriptive statistics (all countries)

Statistics ln(Q) ln(Y/Y*) ln(G/G*) vol(E)
Mean −0.6737 −2.7011 0.9462 0.4726
Median −0.6646 −2.7219 1.1813 0.0424
Maximum 0.4235 −0.4348 2.8605 45.552
Minimum −1.6263 −4.6914 −2.2488 0.0000
Std. dev. 0.3675 0.7907 0.9795 2.8547
Skewness 0.2293 0.0585 −0.7068 11.761
Kurtosis 2.7686 2.5192 2.9466 164.64
Jarque−Bera 9.1598 8.4988 69.465 909407
Probability 0.0102 0.0143 0.0000 0.0000
Sum −561.22 −2250.0 788.16 386.62
Sum sq. dev. 112.38 520.19 798.28 6658.1
Observations 833 833 833 818
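The Jarque–Bera figures in Table 12.5 can be reproduced approximately from the reported skewness and kurtosis using the standard formula JB = (n/6)[S² + (K − 3)²/4]. The sketch below does this; the function name `jarque_bera` is a hypothetical label, and small discrepancies from the table are expected because the inputs are rounded to four decimals.

```python
def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness, and (non-excess) kurtosis."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# (observations, skewness, kurtosis, reported JB) taken from Table 12.5.
table_12_5 = {
    "ln(Q)":    (833, 0.2293, 2.7686, 9.1598),
    "ln(Y/Y*)": (833, 0.0585, 2.5192, 8.4988),
    "ln(G/G*)": (833, -0.7068, 2.9466, 69.465),
    "vol(E)":   (818, 11.761, 164.64, 909407),
}

for name, (n, s, k, reported) in table_12_5.items():
    print(f"{name:9s} recomputed = {jarque_bera(n, s, k):.4f}   reported = {reported}")
```

Large values indicate departure from normality; the extreme value for vol(E) reflects its very high skewness and kurtosis.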
Table 12.6 Descriptive statistics (by country)

Country          ln(Q)              ln(Y/Y*)           ln(G/G*)           vol(E)              Obs.
                 Mean     Std.dev.  Mean     Std.dev.  Mean     Std.dev.  Mean     Std.dev.
Angola           −0.4650  0.2996    −2.2869  0.4128    1.5568   0.5088    3.4174   10.3081    41
Brazil           −0.3824  0.3295    −1.8292  0.2390    0.4919   0.6321    1.8454   4.5140     61
China            −0.8927  0.3308    −2.7127  0.4237    1.4123   0.2741    0.0474   0.0861     59
Ethiopia         −0.6491  0.3398    −3.8203  0.3501    1.0952   0.4464    0.0434   0.1172     61
Ghana            −0.4949  0.2539    −2.7498  0.4530    −0.0584  1.0071    0.2148   0.2427     56
India            −0.9066  0.2218    −2.9414  0.2323    1.1495   0.5007    0.0528   0.0688     61
Indonesia        −0.9281  0.2891    −2.5105  0.1529    1.0049   0.7692    0.6411   2.5342     51
Kenya            −0.6909  0.2801    −2.8806  0.3690    0.5182   0.6943    0.0628   0.1158     61
Nigeria          −0.4329  0.5292    −2.7153  1.0063    0.9769   1.0085    0.1322   0.2678     61
The Philippines  −0.6683  0.3533    −2.3381  0.1450    0.7938   0.7134    0.0769   0.1468     61
Russia           −0.9519  0.2779    −1.1607  0.3472    1.7785   0.3746    1.5517   3.5475     21
Rwanda           −0.9272  0.2395    −3.3872  0.3605    1.5705   0.9514    0.0867   0.1667     51
South Africa     −0.4349  0.1711    −1.4829  0.2699    −0.8529  0.4004    0.0757   0.1007     61
Tanzania         −0.7518  0.2005    −3.3301  0.4605    2.0835   0.4700    0.1328   0.2058     51
Uganda           −0.7283  0.3749    −3.4294  0.4039    1.6334   0.2242    0.2552   0.4379     61
All              −0.6737  0.3675    −2.7011  0.7907    0.9462   0.9795    0.4726   2.8547     818
Table 12.7 Cross-sectional dependence tests

Tests Non-augmented model Augmented model
Breusch-Pagan LM 235.4183*** 145.5989***
Pesaran scaled LM 7.964616*** 1.766490*
Bias-corrected scaled LM 7.837498*** 1.639371
Pesaran CD 11.56950*** −0.781383
Note Null hypothesis: no cross-sectional dependence (correlation)
Note *** and * refer to significance at the 1 and 10% levels.
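For readers unfamiliar with the last row of Table 12.7, the sketch below computes the Pesaran CD statistic in its usual balanced-panel form, CD = sqrt(2T / (N(N − 1))) multiplied by the sum of pairwise residual correlations over all pairs i < j, which is asymptotically standard normal under the null of no cross-sectional dependence. This is a minimal illustration, not the routine that produced the table: the function names are hypothetical, the toy residuals are invented, and the chapter's panel is unbalanced, which this sketch does not handle.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Sample correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pesaran_cd(residuals):
    """CD statistic for a balanced panel; residuals[i] is the residual series of unit i."""
    n_units = len(residuals)
    t_len = len(residuals[0])
    if any(len(series) != t_len for series in residuals):
        raise ValueError("this sketch assumes a balanced panel")
    rho_sum = sum(pearson(residuals[i], residuals[j])
                  for i, j in combinations(range(n_units), 2))
    return math.sqrt(2 * t_len / (n_units * (n_units - 1))) * rho_sum

# Toy residuals (invented): units 0 and 1 co-move, unit 2 mirrors them.
toy = [
    [0.1, -0.2, 0.3, -0.1],
    [0.2, -0.4, 0.6, -0.2],
    [-0.1, 0.2, -0.3, 0.1],
]
print(round(pesaran_cd(toy), 4))
```

A CD value close to zero, as in the augmented model of Table 12.7, is consistent with the cross-sectional averages having absorbed most of the common correlation in the residuals.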
Annexure 3
See Tables 12.11 and 12.12

12
Table 12.8 Panel unit root tests (IPS and Fisher-type tests)
Variables     Specifications      Pesaran statistics  Fisher statistics  Order of integration
ln(Q)         Constant            −1.0567             32.73              I(1)
              Constant and trend  1.6161              21.13
Δ ln(Q)       Constant            −21.07***           401.23***          I(0)
              Constant and trend  −20.07***           344.39***
ln(Y/Y*)      Constant            −0.3133             34.84              I(1)
Δ ln(Y/Y*)    Constant            −16.22***           296.62***          I(0)
ln(G/G*)      Constant            −0.9245             32.50              I(1)
Δ ln(G/G*)    Constant            −19.99***           377.62***          I(0)
vol(E)        Constant            −13.74***           246.72***          I(0)
              Constant and trend  −16.07              230.49
Testing the Balassa Hypothesis in Low- and Middle-Income Countries
Note *** indicates the rejection of the null hypothesis (unit root) at 1%
Table 12.9 Results of co-integration tests

Model      IPS test                      ADF test                      PP test
           Non-augmented  Augmented      Non-augmented  Augmented      Non-augmented  Augmented
Model I    −19.15***      −22.18***      356.99***      427.72***      353.95***      430.16***
Model II   −18.13***      −22.00***      339.99***      426.95***      351.52***      499.48***
Model III  −17.36***      −20.75***      314.36***      392.64***      317.27***      409.21***
Note *** indicates rejection of the null hypothesis (unit root/no co-integration) at 1%
Table 12.10 Model selection criteria

Model type Log L AIC* BIC HQ Specification
I 822.9247 −1.8213 −1.3725 −1.6491 ARDL (1, 1)
II 858.6465 −1.8304 −1.1975 −1.5875 ARDL (1, 1, 1)
III 920.9778 −1.9800 −1.2444 −1.6975 ARDL (1, 1, 1, 1)
Note * refer to significance level at 10%.
Table 12.11 Panel co-integration estimation: pooled mean group estimator (MICs = 5 countries)
Cases  Model type  Adjustment coefficient  Long-run coefficients; dependent variable = ln(Q)
Long-run columns: bar ln(Q)  ln(Y/Y*)  bar ln(Y/Y*)  ln(G/G*)  bar ln(G/G*)  vol(E)
All countries I 0.7986 0.1090 −0.2792
(0.3929) (0.1518) (0.4087)
II 2.5721*** 0.3823** −2.0355** 0.2209* −1.0014***
(0.6182) (0.1609) (0.7849) (0.1203) (0.2819)
III 3.0730*** 0.3447** −2.2913*** 0.2521*** −0.8858*** 0.0248**
(0.5858) (0.1490) (0.7061) (0.0926) (0.2532) (0.0126)
Short-run coefficients; dependent variable = Δ ln(Q)
Short-run columns: Δ bar ln(Q)  Δ ln(Y/Y*)  Δ bar ln(Y/Y*)  Δ ln(G/G*)  Δ bar ln(G/G*)  Δ vol(E)
All countries I −0.1673*** 1.1840*** −0.3002*** 0.1930
(0.0682) (0.2976) (0.0825) (0.1433)
II −0.1936*** 0.9115*** −0.4084*** 0.2804 −0.0827 0.1863***
(0.10272) (0.1791) (0.1331) (0.2339) (0.0734) (0.0612)
III −0.1224*** 0.8424*** −0.3974*** 0.1953** −0.0577* 0.0648 −0.1613***
(0.0372) (0.1041) (0.0825) (0.0869) (0.0309) (0.0663) (0.0387)
Brazil I −0.0940*** 1.5177*** 0.0801 −0.1346
(0.0026) (0.0929) (0.0812) (0.0974)
II −0.1458*** 1.2512*** 0.0152 −0.1126 −0.0096 0.0453
(0.0044) (0.1148) (0.0927) (0.0964) (0.0059) (0.0435)
III −0.2276*** 1.0306*** 0.2286* −0.0875 −0.0883*** 0.2952** −0.0021***
(0.0044) (0.1047) (0.0969) (0.0865) (0.0060) (0.0529) (0.0001)
(continued)
Cases  Model type  Adjustment coefficient  Short-run coefficients; dependent variable = Δ ln(Q)
China I −0.0625*** 0.7314*** −0.5448*** 0.1722*
(0.0012) (0.0440) (0.0166) (0.0610)
II −0.0933*** 0.5149*** −0.5655*** 0.1126 −0.1301*** 0.2787***
(0.0011) (0.0416) (0.0143) (0.0523) (0.0076) (0.0293)
III −0.0647*** 0.5464*** −0.5913*** 0.1947** −0.1015*** 0.2244*** −0.3549***
(0.0006) (0.0344) (0.0121) (0.0438) (0.0063) (0.0242) (0.0075)
India I −0.0883*** 0.6518*** −0.4247*** 0.1166**
(0.0019) (0.0235) (0.0294) (0.0267)
II −0.0488*** 0.6234*** −0.3918*** 0.1173** 0.0029 0.1059***
(0.0011) (0.0301) (0.0327) (0.0271) (0.0091) (0.0129)
III −0.0468*** 0.5813*** −0.4483*** 0.1445*** 0.0156 0.0836*** −0.3010***
(0.0008) (0.0267) (0.0268) (0.0229) (0.0086) (0.0112) (0.0067)
Russia I −0.4328*** 2.2026*** −0.2004*** 0.7271*
(0.0122) (0.1078) (0.0308) (0.3085)
II −0.5998*** 1.4202*** −0.7845*** 1.2003** −0.3464*** 0.3769**
(0.0173) (0.2234) (0.0570) (0.3171) (0.0124) (0.1013)
III −0.5625*** 1.3547*** −0.4439*** 1.1735** −0.4515*** 0.3237** 0.0065***
(0.0152) (0.1719) (0.0452) (0.3029) (0.0100) (0.0646) (0.0001)
South Africa I −0.1589*** 0.8167*** −0.2509* 0.0837

(0.0051) (0.0582) (0.1060) (0.0756)
II −0.0806*** 0.7475*** −0.3152 0.0843 0.0696 0.1248**
(0.0015) (0.0622) (0.1463) (0.0773) (0.0426) (0.0356)
III −0.0787*** 0.6452*** −0.1949 −0.0028 0.0395 0.2259*** −0.3733***
(0.0015) (0.0548) (0.1515) (0.0679) (0.0588) (0.0306) (0.0097)
Note ***, **, and * refer to significance level at 1, 5, and 10%. Standard errors in parentheses
Table 12.12 Panel co-integration estimation: pooled mean group estimator (LICs = 10 countries)
Cases  Model type  Adjustment coefficient  Long-run coefficients; dependent variable = ln(Q)
All countries I 0.3168 0.3205* 0.8329***
(0.2237) (0.1324) (0.2129)
II 0.6872*** 0.2112** 0.3854 0.1407*** −0.3507***
(0.2259) (0.0986) (0.2512) (0.0418) (0.0959)
III 1.0669*** −0.2873*** 0.2720 0.0109 −0.0182 −3.2161***
(0.2127) (0.0867) (0.2552) (0.0495) (0.1282) (0.4192)
Short-run coefficients; dependent variable = Δ ln(Q)
Short-run columns: Δ bar ln(Q)  Δ ln(Y/Y*)  Δ bar ln(Y/Y*)  Δ ln(G/G*)  Δ bar ln(G/G*)  Δ vol(E)
All countries I −0.1117*** 0.7647*** −0.3522*** 0.2879**
(0.0272) (0.1003) (0.1180) (0.1313)
II −0.1413*** 0.7635*** −0.3648*** 0.2653** −0.0429 −0.0144
(0.0367) (0.1032) (0.1366) (0.1351) (0.0326) (0.1094)
III −0.0863*** 0.5631*** −0.4239*** 0.1896 −0.0168 0.0173 −0.0203
(0.0248) (0.1294) (0.0992) (0.1162) (0.0287) (0.0951) (0.0313)
Angola I −0.1021*** 0.8559** −0.2539** 0.6178**
(0.0129) (0.2088) (0.0445) (0.3376)
II −0.1835*** 1.0438*** −0.2890*** 0.6206** −0.1326*** −0.7164***
(0.0105) (0.1673) (0.0349) (0.2624) (0.0049) (0.0993)
III −0.0015*** 0.9369*** −0.3725*** 0.4225 −0.1717** −0.5855** −0.0059***
(0.0001) (0.1255) (0.0259) (0.1812) (0.0039) (0.0886) (0.0001)
(continued)

Ethiopia I −0.3364*** 0.5018*** −0.9714*** 0.3830*** 0.3830***
(0.0034) (0.0332) (0.0187) (0.0419) (0.0419)
II −0.4113*** 0.4470*** −1.1079*** 0.4443*** −0.1390*** 0.1123***
(0.0031) (0.0291) (0.0161) (0.0334) (0.0028) (0.0161)
III −0.1363*** 0.3654*** −0.8169*** 0.3477*** −0.038*** 0.1337*** 0.0657***
(0.0006) (0.0279) (0.0157) (0.0343) (0.0029) (0.0192) (0.0075)
Ghana I −0.1228*** 0.3116** −0.7780*** 0.7832***
(0.0032) (0.0722) (0.0433) (0.0938)
II −0.2304*** 0.2622** −0.8854*** 0.6431*** −0.0894*** 0.1710**
(0.0046) (0.0696) (0.0518) (0.0870) (0.0034) (0.0427)
III −0.0948*** 0.0931 −0.6913*** 0.3100** 0.0335*** −0.079616* −0.0424***
(0.0006) (0.0470) (0.0350) (0.0622) (0.0023) (0.025340) (0.0027)
Indonesia I −0.1386*** 0.8098*** 0.2978** 0.5774**
(0.0043) (0.1281) (0.0631) (0.1625)
II −0.1571*** 0.8070*** 0.3864** 0.4926* 0.0454** −0.0530
(0.0050) (0.1307) (0.0765) (0.1635) (0.0103) (0.0744)
III 0.0001*** 0.8071*** −0.0061 0.6202** 0.0672*** 0.0295 −0.0172
(0.0001) (0.1249) (0.0866) (0.1609) (0.0109) (0.0685) (0.0001)
Kenya I −0.1036*** 1.0582*** −0.4606*** 0.1793**

(0.0025) (0.0396) (0.0340) (0.0450)
II −0.1197*** 1.0269*** −0.4043*** 0.1692** 0.0740*** −0.1535***
(0.0033) (0.0374) (0.0334) (0.0443) (0.0037) (0.0211)
III −0.2489*** 0.2617*** −0.1214*** −0.0381* 0.1122*** −0.0537*** 0.1646***
(0.0015) (0.0133) (0.0129) (0.0124) (0.0009) (0.0074) (0.0034)
(continued)

Nigeria I −0.0364*** 0.8468*** −0.4929*** 0.8597***
(0.0002) (0.0885) (0.0112) (0.1219)
II −0.0277*** 0.7961*** −0.5074*** 0.9467*** −0.0155*** 0.5019***
(0.0002) (0.0832) (0.0106) (0.1138) (0.0011) (0.0436)
III −0.0424*** 0.7329*** −0.6329*** 0.7799*** −0.0203*** 0.5929*** −0.1253***
(0.0002) (0.0613) (0.0091) (0.0850) (0.0008) (0.0378) (0.0024)
The I −0.1089*** 0.6733*** −0.2433** −0.1986*
Philippines (0.0036) (0.0759) (0.0713) (0.0830)
II −0.1138*** 0.7101*** −0.3810** −0.1423 0.1328*** −0.3828***
(0.0045) (0.0724) (0.0716) (0.0821) (0.0141) (0.0469)
III −0.1458*** 0.3199*** −0.6470*** 0.1060** 0.0542*** −0.0446* −0.0895***
(0.0006) (0.0226) (0.0226) (0.0239) (0.0039) (0.0158) (0.0024)
Rwanda I −0.0665*** 0.8509*** −0.4273*** −0.0079
(0.0008) (0.0487) (0.0069) (0.0531)
II −0.1001*** 0.7458*** −0.4331*** −0.0645 −0.0098* −0.0204
(0.0018) (0.0559) (0.0071) (0.0540) (0.0035) (0.0313)
III −0.0754*** 0.7087*** −0.3457*** −0.1616** −0.0128** 0.0278 −0.0287***
(0.0005) (0.0363) (0.0053) (0.0416) (0.0024) (0.0235) (0.0029)
Tanzania I −0.0505*** 0.3739*** −0.2922*** −0.0059
(0.0019) (0.0578) (0.0151) (0.0659)
II −0.0359*** 0.4408*** −0.0900** −0.0628 −0.1428*** 0.1009**
(0.0024) (0.0572) (0.0263) (0.0658) (0.0058) (0.0306)
III −0.1073*** 0.0668** −0.6651*** −0.1276** −0.0412*** −0.1234*** −0.1776***
(0.0003) (0.0195) (0.0133) (0.0238) (0.0021) (0.0102) (0.0030)
(continued)

Uganda I −0.0512*** 1.3649*** 0.0995* −0.3089**
(0.0005) (0.0555) (0.0325) (0.0661)
II −0.0335*** 1.3547*** 0.0635 −0.3939*** −0.1529*** 0.2964***
(0.0004) (0.0526) (0.0314) (0.0651) (0.0069) (0.0228)
III −0.0103*** 1.3388*** 0.0593 −0.3629** −0.1504*** 0.2763*** 0.0533***
(0.0001) (0.0542) (0.0329) (0.0673) (0.0078) (0.0276) (0.0021)

Note ln(Q) = log of real exchange rate, ln(Y/Y*) = log of real GDP relative to foreign (US), ln(G/G*) = log of real government expenditure relative to foreign
(US), vol(E) = exchange rate volatility, bar ln(Q) = log of real exchange rate demeaned, bar ln(Y/Y*) = log of real GDP relative to foreign (US) demeaned, and
bar ln(G/G*) = log of real government expenditure relative to foreign (US) demeaned
***, **, and * refer to significance at the 1, 5, and 10% levels. Standard errors in parentheses
References
Ahn M (2009) Looking for the Balassa-Samuelson effect in real exchange rate changes: Andong
National University. J Econ Res 14:219–237
Asea K, Corden M (1994a) The Balassa-Samuelson model: an overview. USA
Asea K, Mendoza G (1994b) The Balassa-Samuelson model: a general equilibrium appraisal.
Review of International Economics, Working Paper #709
Balassa B (1964) The purchasing power parity doctrine: a reappraisal. J Polit Econ 72(6):584–596
Baylie F (2008) The impact of real effective exchange rate on the economic growth of Ethiopia.
Master thesis, Addis Ababa University, Ethiopia
Bhagwati N (1984) Why are services cheaper in the poor countries? The Econ J 94(374)
Bhattarai K (2011) Co-integration and error correction models: econometric analysis. Hull
University, Business School, England
Cavalcanti V, Mohaddes K, Raissi M (2011) Commodity price volatility and the sources of
growth. IMF Working Paper, Middle East and Central Asia Department
Chen M (2013) Panel unit root and co-integration tests. National Chung Hsing University, USA
Chuah P (2012) How real exchange rate move in growing economies: Anti-Balassa evidence in
developing countries. Malaysia
Choudhri E, Khan M (2004) Real exchange rates in developing countries: are Balassa-Samuelson
effects present? IMF Working Papers WP/04/188
De Gregorio J, Wolf H (1994) Terms of trade, productivity and the real exchange rate. NBER
Working Paper No. 4407
Drine I, Rault C (2002) Do panel data permit to rescue the Balassa-Samuelson hypothesis for Latin
American countries? An empirical analysis using panel data co-integration tests. William
Davidson Working Paper Number 504
Drine I, Rault C (2004) Does the Balassa-Samuelson hold for Asian countries? An empirical
analysis using panel data co-integration tests. Appl Econ Int Dev 4(4):000
Eberhardt M (2011) Panel time-series modelling: new tools for analyzing xt-data. University of
Nottingham, Case Business School, England
Eberhardt M (2012) Estimating Panel Time Series Models with Heterogeneous Slopes. Stata
Journal 12 (1):61–71
Engle R, Granger C (1987) Co-integration and error correction: representation, estimation, and
testing. Econometrica 55(2):251–276
Feenstra R, Inklaar R, Timmer M (2013) The next generation of the Penn World Table. Am Econ
Rev 105(10):3150–3182
Guo Q, Hall G (2008) A test of the Balassa-Samuelson effect applied to Chinese regional data.
Rom J Econ Forecast 2:57–78
Hassan F (2011) The Penn-Balassa-Samuelson effect in developing countries: price and income
revisited. London School of Economics and Political Science, London
Herberger C (2003) Economic growth and the real exchange rate: revising the Balassa-Samuelson
effect. University of California, Los Angeles
IMF (2015) World Economic Outlook Report. IMF
Isard P, Symansky S (1996) Long-run movements in real exchange rates. IMF Occasional Paper
No. 145
Jabeen S, Malik S, Haider A (2011) Testing the Harrod-Balassa-Samuelson hypothesis: the case of
Pakistan. Quaid-i-Azam University, Islamabad
Kohler M (1998) The Balassa-Samuelson effect and monetary targets. Centre for Central Banking
Studies, Bank of England
Kravis B, Lipsey E (1983) Towards an explanation of national price levels. Princeton Studies in
International Finance, No. 52, Princeton University, USA
Kumo W (2011) Growth and macroeconomic convergence in Southern Africa. African
Development Bank Group, Working Paper No. 130
Maddala S, Wu S (1999) A comparative study of unit root tests with panel data and a new simple
test. Oxford Bull Econ Stat 61:631–652
Miyajima K (2005) Real exchange rates in growing economies: how strong is the role of the
nontradables sector? IMF Working Paper No. 05/233
Orlik A (2003) Real convergence and its different measures; lessons to be learnt by EMU applicant
countries
Pedroni P (2004) Panel co-integration: asymptotic and finite sample properties of pooled time
series tests with an application to the PPP hypothesis. Econ Theor 20:597–625
Pesaran H (2007) A simple panel unit root test in the presence of cross-section dependence. J Appl
Econ 22(2):265–312
Pesaran H (2013) Large panel data models with cross-sectional dependence: a survey.
Unpublished, Cambridge, UK
Podkaminer L (2003) Analytical notes on the Balassa-Samuelson effect. BNL Q Rev 226
Rogoff K (1996) The purchasing power parity puzzle. J Econ Lit XXXIV:647–668
Shin Y (2014) Dynamic panel data workshop. University of Melbourne
Soukiazis E (1995) The endogeneity of factor inputs and the importance of balance of payments on
growth: an empirical study for the OECD countries with special reference to Greece and
Portugal. unpublished PhD dissertation, University of Kent, Canterbury
Tica J, Druzic I (2006) The Harrod-Balassa-Samuelson effect: a survey of empirical evidence.
University of Zagreb, Zagreb
Wilson E (2010) European real effective exchange rate and total factor productivity: an empirical
study. Victoria University of Wellington, New Zealand
Part V
Growth, Productivity and Efficiency in
Various Industries
Chapter 13
Agricultural Tax Responsiveness
and Economic Growth in Ethiopia
Hassen Azime, Gollagari Ramakrishna and Melesse Asfaw
Abstract Of late, the pattern of tax revenues and its nexus with economic growth
in developing countries has become an increasing concern for policymakers and
researchers. Since tax revenue is one of the important sources of government
revenue, a tax policy assumes significance as a vehicle for a viable and long-term
source of revenue and economic growth. Similarly, economic growth has aug-
menting effects on the tax revenue of a country. This study investigates tax
responsiveness to the changes in gross domestic product in Ethiopia in the period
1981–2014. It mainly focuses on the components of agricultural tax revenue:
agricultural income tax and land use fee. In addition, it also studies personal income
tax and business profit income. Understanding and analyzing the level of sensitivity
of tax revenue to discretionary policy measures and GDP are essential in formu-
lating fiscal policy. The empirical evidence on Ethiopia suggests that the trends in
agricultural income tax and land use fee collection are highly inconsistent.
Agricultural income tax and land use fee are not buoyant, indicating that the growth
of the agricultural sector has no statistically significant impact on agricultural
income tax buoyancy. However, personal income tax revenue, business profit
revenue, and total direct tax revenue are responsive to changes in non-agricultural
GDP in Ethiopia. In light of these findings, some policy interventions for improving
tax revenue are suggested.
Keywords Tax buoyancy · Tax elasticity · Agricultural tax revenue · Direct tax revenue
H. Azime (✉)
Institute of Tax and Customs Administration, Department of Public Finance
Ethiopian Civil Service University, Addis Ababa, Ethiopia
e-mail: azimeadem@gmail.com
G. Ramakrishna · M. Asfaw
School of Graduate Studies, Ethiopian Civil Service University, Addis Ababa, Ethiopia
e-mail: profgrk@gmail.com
M. Asfaw
e-mail: drmelesse@gmail.com

DOI 10.1007/978-981-10-4451-9_13
292 H. Azime et al.
13.1 Introduction
Several studies have emphasized the importance of tax revenue in promoting
economic development. In a recent study, Feger and Asafu-Adjaye (2014) conclude
that in order to advance development, governments are required to spend more on
public services and this can be achieved by improving tax revenue mobilization.
A similar opinion was expressed by Besley and Ghatak (2006) when they wrote,
‘the different public goods such as availability of clean drinking water, sewage
disposal, transportations, health care, and primary and secondary schools are the
necessity for well-being as well as an input for increasing the productivity.’ Even
though the main purpose of taxation is financing public goods and services, the tax
policy should be based on certain fiscal principles. In this connection, Tanzi and
Zee (2000) emphasize that a tax system should be guided by the equity principle
that stipulates that taxpayers should only pay what is deemed to be their fair share
of taxes. Additionally, the tax administration should have certain efficiency
objectives whereby the government collects sufficient revenues to carry out its
welfare and development goals. Therefore, while designing a tax system, it is
important that the equity principle and the efficiency objectives do not come in
direct conflict with each other.
Economic development, particularly rural development in countries such as
Ethiopia, requires substantial financial investments in infrastructure, education,
health, and other social services. Fiscal policy needs to fund a larger proportion
of public expenditure from the domestic economy.
One important source for meeting these investment needs is tax revenue.
Sub-Saharan African countries are facing several challenges in augmenting their
tax collections (Sanjeev and Tareq 2008). Rural development in these countries
largely depends on the contribution of the agricultural sector, and thus, the tax
revenue raised in this sector also plays a major role in economic development. The
main purpose of our study is to examine the responsiveness of agricultural tax
revenue to growth in the agricultural component of Ethiopia’s GDP. This revenue is
the legal levy imposed on farmers on their incomes generated through agricultural
activities as well as the fee imposed on the land owned. Empirical evidence on this
issue, more particularly on Ethiopia, is limited as well as mixed and does not
provide comprehensive and conclusive evidence. Our paper tries to fill this gap.
Tax revenue responsiveness to changes in the economic activity of a country
affects its revenue mobilization efforts. In theory, tax revenue is said to increase
with economic growth based on the assumption that the tax base grows as GDP
increases (Milwood 2011). Increase in revenue during a particular fiscal year may
occur either due to an effect of changes in the tax policy or as a result of a natural
increase in tax revenue because of an increase in GDP. The responsiveness of tax
revenue to changes in GDP is usually measured by two concepts: tax elasticity and
tax buoyancy. The first concept measures the extent to which a tax structure gen-
erates revenue in response to increases in taxpayer incomes without a change in
statutory tax rates (Bunescu and Comaniciu 2013; Craig and Heins 1980).
13 Agricultural Tax Responsiveness and Economic Growth in Ethiopia 293
The second concept is defined as the overall reaction of tax revenue to changes
in GDP and discretionary changes in the tax policy over time. It is a measure of
how tax revenue varies with changes in GDP. Tax revenue is therefore expected to
increase as the economy grows, and the buoyancy estimate indicates how far tax
revenue reacts to changes in GDP. Tax buoyancy measures can be used to assess
the efficiency of a given tax system regarding its revenue generation capacity
(Jenkins et al. 2000). Knowledge of this measure is important in decision making
about the fiscal policy of a country because it allows us to determine the evolution
of the tax revenue collected by the government (Bunescu and Comaniciu 2013;
Moreno and Maita 2014). Hence, tax buoyancy is a valuable method for analyzing
the tax policy and examining the composition of a tax system.
The tax structures in developing countries should be responsive enough so as to
enable the countries to meet their government spending for development. Thus, the
main objective of our study is to examine the responsiveness of agricultural tax
revenue and other tax revenues to the changes in economic growth in Ethiopia.
More specifically, the paper has the following objectives:
• To estimate and analyze the responsiveness of agricultural income tax revenue
and land use fee to changes in the agricultural component of GDP, and
• To estimate the responsiveness of personal income tax and business income tax
revenue to changes in non-agricultural GDP.
The rest of the paper is organized as follows: The next section gives a brief
overview of the tax structure in Ethiopia. In Sect. 13.3, a conceptual model and a
brief review of earlier studies are presented. The data collection methods and the
variables are presented in Sect. 13.4. Section 13.5 gives the data analysis and
empirical findings. The last section gives a summary and conclusion.
13.2 An Overview of the Tax Structure in Ethiopia
This section presents an overview of the tax structure in Ethiopia across its two
economic regimes: the state-led liberalized regime (1991 onward) and the socialist
regime (1974–91) called the Derg regime. Under both the regimes, the Ethiopian
tax system consisted of direct and indirect taxes. Direct taxes include agricultural
income, land use fee, personal income, rental income, business profit, interest
income, and capital gain tax while indirect taxes include value-added tax (VAT),
turnover tax, excises, stamp duties, customs duties, and export taxes.
During the socialist regime, the government controlled all economic spheres
including agriculture. The land reform policy of 1975 nationalized land and took
another step of distributing land equally among peasants. Consequently, the peas-
ants were forced to establish and organize themselves into peasant associations
(Prichard 2015). Smallholder farmers in Ethiopia depend on small plots of land that
are owned or rented to generate income. The term ‘agricultural taxation’ used in our
study includes only taxes paid by the farmers. The smallholder farmers’ tax burden
therefore consists of agricultural income tax and land use fee.
During the socialist regime, the objective of agricultural tax was transferring a
substantial portion of the agricultural surplus to industry. As a result, the govern-
ment taxed the agricultural sector heavily. In particular, the agricultural income tax
rate was progressive and was as high as 89% in the highest income bracket.
Taxation on exports of the main crop reached as high as 100% of the farm gate
price (Rashid et al. 2007).
Because of the change of government in Ethiopia in 1991, the country witnessed
a shift in the policy regime. Different reforms were initiated in 1992. These included
new legislation for earnings tax, business income tax, rural land, and agricultural
income tax (Alemayehu and Shimeles 2005). During 1992, agricultural taxes were
not collected because of the transition period and difficulties in collecting taxes
from farmers. Since 1992, IMF and the World Bank have supported Ethiopia in
liberalizing its economy and implementing structural adjustment programs (SAPs)
to address the internal and external imbalances in the economy.
The government has initiated different reforms to liberalize its economy. It
undertook comprehensive tax reforms encompassing most of the principal revenue
sources. Along with the reforms in the tax system, the liberalization policies were
also extended to monetary policy tools, foreign and domestic trade, production, and
distribution (Geda and Shimeles 2005). The major goals of the tax reforms initiated
during this regime included increasing the tax base, improving tax collection, tax
incentives for the private sector, and dealing with equity in taxation.
13.2.1 The Current Agricultural Tax Structure in Ethiopia
Agricultural income tax is one of the most sensitive features of income taxation in
general. In most developing countries, governments impose taxes on agricultural
income, but it is hard to determine the income of smallholder farmers and to reach
income earners. These difficulties are due to the large number of small units of income
generation, the absence of accounting procedures suited to income taxation, the
fluctuating nature of agricultural productivity and profits, and low levels of education.
Ethiopia amended its 1978 agricultural income tax rates in 1995 and 1997.
Moreover, annual revenue exceeding birr¹ 1200 was subjected to a progressive tax
rate. Agricultural income tax rates imposed by the regional states under the provisions
of the constitution ranged widely, from 5 to 40%. Agricultural income taxation
was based on the size of the landholding rather than the amount of annual agri-
cultural production. For instance, the Oromia regional state (the largest and most
populous region in Ethiopia) initially adopted a progressive agricultural income tax
system but replaced it with an agricultural income tax system based on the size of the
landholding, rather than the amount of agricultural produce (ONRS 2002, 2005).
¹ Birr is the currency used in Ethiopia. Currently, one (1) USD is equal to about 22.24 birr.
The agricultural income tax rate, exemption limits, and assessment differ slightly
across regions. Each region levying the tax has its statutes with specific provisions
for determining taxable incomes.
13.2.2 Land Use Fee
In principle, land taxes are less complex than agricultural income tax
because assessing land tax requires only the total area of the land, its location, and
type of land grade; suitability for irrigation; land fertility; and rural transportation
to a market. As Newbery (1987) has suggested, this information might not be too
costly to collect. Based on this information, it would be possible to design a simple
presumptive tax structure for land tax (Sarris 1994).
According to the amended proclamation number 77/1997 of income tax for land
use and agricultural activities, smallholder farmers in the regional states are taxed
birr 10 for the first hectare and birr 7.5 for each extra half hectare (Geda and
Shimeles 2005). In some regions, the area of land and the land classification system
that is based on relative soil fertility estimates determine the level of taxation.
During 2004–14, the total rural area cultivated and expanded for agricultural pur-
poses increased by 2.7% per year and the number of smallholder farmers increased
by 3.8%. The total agricultural output level also increased during this period
(Bachewe et al. 2015; Moller 2015).
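The fee schedule quoted above (birr 10 for the first hectare and birr 7.5 for each extra half hectare) can be illustrated with a minimal sketch. The function name and the treatment of partial units are assumptions made here for illustration; the text does not say how holdings below one hectare or partial half hectares are charged.

```python
import math

def land_use_fee(hectares, first_hectare_fee=10.0, extra_half_hectare_fee=7.5):
    """Land use fee in birr under the quoted schedule.

    Assumptions (not stated in the source): any holding of up to one hectare
    pays the full first-hectare fee, and a partial extra half hectare is
    charged as a full half hectare.
    """
    if hectares <= 0:
        return 0.0
    extra_area = max(0.0, hectares - 1.0)
    extra_half_hectares = math.ceil(extra_area / 0.5)
    return first_hectare_fee + extra_half_hectare_fee * extra_half_hectares

# A two-hectare holding: one first hectare plus two extra half hectares.
fee_two_hectares = land_use_fee(2.0)
```

Under these assumptions a two-hectare holding pays 10 + 2 × 7.5 = 25 birr. Regional rules that weight the fee by soil fertility class would need an extra multiplier that is not modelled here.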
13.2.3 Agricultural Tax Revenue Growth
In macroeconomic terms, the level of tax revenue is measured relative to
GDP. Measuring tax revenue as a share of GDP compares the level of taxes collected to
the tax base; this helps in evaluating the tax performance for a given tax base.
Evidently, developing countries have lower tax-to-GDP ratios when compared to
developed countries. According to Besley and Persson (2014), developing countries
collect taxes which are 10–20% of GDP, whereas developed countries on average
raise around 40% of GDP. Similar to this, Ethiopia’s tax revenue to its GDP is also
low. Despite the government’s tax revenue mobilization efforts, the total tax
revenue-to-GDP ratio was 11.4% in 2009–10, and with some small fluctuations, it
rose to 11.7% in 2013–14.
Although direct taxes increased from 0.02 in 2009–10 to 0.022 in 2013–14,
Fig. 13.1 indicates that the ratio of direct taxes to GDP declined steadily. Indirect
tax revenues were twice as high as direct taxes in most years, and Fig. 13.1 shows
that the ratio of indirect taxes to GDP increased steadily. This is in line with the
findings of Feger and Asafu-Adjaye (2014), who concluded that the tax structure in
sub-Saharan Africa (SSA) is skewed toward indirect taxes because the existing
structural, institutional, and policy characteristics in these countries are not
conducive to the collection of direct taxes. It is also argued that indirect taxes are
less sensitive to these influences; hence, they can be collected with little effort and
are relatively easy to administer (Khan 2001).
[Figure: series Tax Revenue, Direct Tax, and Indirect tax plotted against Year, 1980–2020]
Fig. 13.1 Total tax revenue, direct and indirect tax as shares of GDP. Source Authors’
computations using data from the Ministry of Finance and Economic Development (MOFED)
As depicted in Fig. 13.2, the agricultural tax revenue series shows a decline in
revenue until 1992. Because of the change in regime during 1991–92, there was no
assessment of agricultural tax revenue. The figure also shows that the tax ratio has
fluctuated persistently in the last two decades in Ethiopia. In fact, the tax ratio trend
is not stable, implying inconsistency in tax performance that could be due to
fluctuations in GDP.
[Figure: two panels plotting Agricultural Income tax and Land Use fee against Year, 1980–2020. Panel titles: Agricultural Income tax and Land use fee as % of Total GDP; Agricultural Income tax & Land Use fee as % of Agricultural GDP]
Fig. 13.2 Agricultural tax revenue as share of total GDP and agricultural GDP. Source Authors’
computations using data from MOFED
According to Feger and Asafu-Adjaye (2014), to date, total tax revenue col-
lection in SSA countries has only averaged about 15% of GDP. However, in the
case of Ethiopia, it is 11.5%, which is still below the SSA average amount.
Moreover, agricultural income tax collection in Ethiopia is not as
broad-based as it should be. The effort, capacity, and efficiency of the tax
administration may have contributed to the limited progress in collecting the revenue
generated from the agricultural income tax base. In 2003–04, the agricultural income tax
revenue was 0.13% of agricultural GDP (0.06% of total GDP). It dropped to 0.07%
of agricultural GDP in 2007–08 (0.03% of total GDP), but it picked up to 0.13% of
agricultural GDP (0.08% of the total GDP) in the 2010–11 fiscal year.
Though agriculture remains the mainstay of the Ethiopian economy when it
comes to employment and its contribution to GDP, its contribution to the total tax
revenue collection is below 1%. Figure 13.3 shows the shares of personal income
tax and business profit tax to GDP from 1981 to 2014. In 1981, personal income
tax’s revenue share was around 0.1% of GDP; its share grew to 2% of GDP in
2014. Business profit income tax also fluctuated but was still slightly higher than
personal income tax until 2005. However, after this period, it increased moderately
and its contribution reached a 3.5% share of GDP.
[Figure: series Personal income tax and Business profit income tax plotted against Year, 1980–2020]
Fig. 13.3 Personal income tax and business profit income tax revenue as shares of total
GDP. Source Authors’ computations based on data from MOFED
13.3 The Conceptual Model and a Review of Earlier Studies
The need to measure tax responsiveness in relation to its revenue-generating
capabilities can be seen in light of monitoring the progress of tax collections and tax
revenue forecasting. Two measures for monitoring the government’s
revenue-generating capabilities have been formulated: tax buoyancy and tax elas-
ticity. These two concepts measure the response of tax revenue to changes in
income. According to Howard et al. (2009), estimating elasticity and
buoyancy is relevant in sub-Saharan African countries, where tax collection lags
considerably and is inefficient relative to its potential.
13.3.1 Tax Buoyancy
The buoyancy of a tax is estimated as the relative change in total tax collection,
or in a specific tax revenue item, compared to the relative change in the tax base.
Thus, buoyancy is based on actual tax income and reflects the changes in the tax
structure, which may include tax rates, the tax base, and tax administration and
compliance. Therefore, tax buoyancy is a measure of both the soundness of the tax
base and the usefulness of tax changes regarding revenue collection.
13.3.2 Tax Elasticity
On the other hand, tax elasticity measures the automatic response of tax revenue to
the evolution of the tax base. Tax elasticity does not include the effects of fiscal
policy changes in the tax structure such as a change in tax rates, coverage,
exemptions, and deductions or administration. Tax elasticity reflects only the
built-in responsiveness of tax revenue to movements in the national income.
Both the tax buoyancy and elasticity concepts help analyze the capacity of the
tax system in mobilizing revenue with and without changes in the tax policy. Tax
buoyancy is a useful concept for measuring the performance of both the tax policy
and tax administration over time whereas tax elasticity is a relevant factor for
forecasting purposes (Jenkins et al. 2000). The tax elasticity coefficient gives an
indication to policymakers on whether tax revenues will increase at the same pace
as the national income.
Different studies have investigated the impact of GDP on the sensitivity of tax
revenues for African countries. Among these, Osoro (1993) concluded that for the
main categories of taxes in Tanzania, elasticities were less than one.
Buoyancy, however, was higher than the elasticity coefficient because it also
reflects discretionary changes. Mawia and Nzomoi (2013) evaluated the tax
buoyancy of different taxes in Kenya and found that tax revenue did not respond to
economic changes except excise duty. Ahmed and Muhammad (2010) analyzed 25
countries for the period 1998–2008 and applied a pooled least squares analysis
method. Their results show that growth in the agricultural sector had little impact on
the efficiency of tax revenue and was also less responsive to revenue mobilization in
the case of developing countries mainly due to difficulties in assessing the incomes
generated and the low incomes that may not be taxed or may be under-taxed.
Other studies show that the agricultural share’s contribution demonstrated a
consistently negative impact on revenue collections, but tax revenue increased with
trade share (Prichard 2015). Leuthold (1991) studied eight African countries by
measuring the tax effort for the period 1973–81 with panel data and OLS
estimation. The author argues that the agricultural share affects the estimated
coefficients of direct and indirect tax revenues negatively. This review suggests that
the evidence is not in favor of improving tax buoyancy in agriculture, and it also seems
that there is no evidence available on Ethiopia. Studying the responsiveness of
agricultural taxation in Ethiopia’s current context is expected to inform the design of
an effective agricultural taxation system that enhances domestic revenue mobilization
and rural investments, which can be used for stimulating development.
13.4 Estimation Methods and Data Collection
13.4.1 Estimation of Tax Buoyancy
Public finance policies in developing countries typically change tax parameters and
structures from time to time. This affects ‘revenue buoyancy.’ According to Creedy
and Gemmell (2001), the tax buoyancy estimation coefficient is the ratio of the
observed increase in revenues to the observed increase in incomes. A tax is buoyant
if revenue increases by more than 1% for a 1% increase in GDP or
national income (Creedy and Gemmell 2008; McCluskey and Trinh 2013). A buoyancy
coefficient greater than one indicates a more than proportionate increase in tax revenues
compared to that of GDP. Therefore, tax buoyancy that includes discretionary
changes is a measure of the efficiency of the tax base and the soundness of changes
in the tax policy regarding revenue collection and mobilization.
According to Haughton (1998), tax buoyancy (TB) is formulated as the ratio of the
percentage change in tax revenue to the percentage change in the tax base:
$$TB = \frac{\%\Delta\,\text{Revenue}}{\%\Delta\,\text{Base}} \qquad (13.1)$$
where the base can be GDP, or the relative base can be considered. Revenue could
refer to the different components of the total tax or individual taxes.
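As a worked illustration of Eq. 13.1, the following minimal sketch computes buoyancy from two observations of revenue and of the base. The function names and the figures are hypothetical and are not taken from the study's data.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0

def tax_buoyancy(revenue_old, revenue_new, base_old, base_new):
    """Eq. 13.1: percentage change in revenue over percentage change in the base."""
    base_growth = pct_change(base_old, base_new)
    if base_growth == 0:
        raise ValueError("buoyancy is undefined when the base does not change")
    return pct_change(revenue_old, revenue_new) / base_growth

# Hypothetical figures: revenue grows by 12% while the base grows by 10%.
tb = tax_buoyancy(100.0, 112.0, 1000.0, 1100.0)
```

With these illustrative numbers the ratio is 12/10, so the tax would be classed as buoyant (TB greater than one).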
In our study, the focus is on two types of agricultural taxes: agricultural income
tax (AgIT) and agricultural land tax or land use fee (AgLT):
$$\log AgIT_t = \alpha_0 + \alpha_1 \log AgrGDP_t + \varepsilon \qquad (13.2)$$
$$\log AgLT_t = \alpha_0 + \alpha_1 \log AgrGDP_t + \varepsilon \qquad (13.3)$$
where AgrGDP is agricultural gross domestic product and $\varepsilon$ is a stochastic
disturbance term. Since the variables are converted into their natural log forms, the
coefficient estimates indicate tax buoyancy as they measure the percentage response
in agricultural income tax and land use fee variables for a given 1% change in
agricultural GDP.
In estimating the coefficient of buoyancy, no attempt is made to control for
discretionary changes in the tax policy and administration. Discretionary tax
measures refer to legal changes in tax rates, tax base, tax allowances and credits,
and administrative tax efficiency. Consequently, buoyancy reflects both discre-
tionary changes and anticipated revenue growth. It helps investigate whether
growth in the agricultural sector has an impact on tax revenue.
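The log-log specification in Eqs. 13.2 and 13.3 can be sketched as a simple least-squares fit in which the slope is read as the buoyancy coefficient. The data below are synthetic and built from a known slope of 0.8 so that the result can be checked; the function and variable names are illustrative assumptions, not the authors' code.

```python
import math

def buoyancy_from_logs(tax, base):
    """Fit log(tax) = a0 + a1 * log(base) by ordinary least squares.

    Returns (a0, a1); a1 is interpreted as the buoyancy coefficient.
    """
    x = [math.log(v) for v in base]
    y = [math.log(v) for v in tax]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic series: tax = 2 * GDP^0.8, so the fitted slope should be 0.8.
agr_gdp = [100.0, 110.0, 121.0, 133.1, 146.41]
agr_tax = [2.0 * g ** 0.8 for g in agr_gdp]
intercept, buoyancy = buoyancy_from_logs(agr_tax, agr_gdp)
```

A slope below one, as in this synthetic case, would mean that revenue grows less than proportionately with the base.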
13.4.2 Estimation of Tax Elasticity
Tax elasticity measures the extent to which a tax structure generates revenues in
response to increases in taxpayer incomes without a change in statutory tax rates
(Craig and Heins 1980). A tax is elastic if a 1% increase in GDP brings in
a more than 1% increase in revenue from the tax, holding discretionary tax changes
constant.
Singer (1968) and Ehdaie (1990) have developed an econometric model for
estimating the tax elasticity coefficient. The model takes into account
the relations between GDP, tax revenue, the formation of the tax system, and the tax
base, using time series data and a specification based on logarithmic functions
(Bunescu and Comaniciu 2013).
Accordingly, we used a dummy variable (Di) to represent the shift in tax policy
during the study period 1981–2014. From Eq. 13.2, the functional tax form is as
follows:
$$\log(AgIT)_t = \log\alpha + \beta \log(AgrGDP)_t + \sum_i \theta_i D_i + \varepsilon_t \qquad (13.4)$$
where
$\alpha$ Constant;
$\beta$ Elasticity coefficient;
$\theta_i$ Impact or coefficient of the discretionary change; and
$D_i$ Dummy variable as a proxy for the ith discretionary tax measure (DTM) taken
during the period under review. The summation sign in Eq. 13.4 creates room
for the possibility of multiple changes in the tax system during the study period.
We introduced a dummy variable to represent a shift in tax policy during the
administrative reforms starting from 1992. The decade of the 1990s differed from
the previous period in the application of a more liberal policy. During the second
half of the 1990s, tax reforms were implemented. Since 1993, the tariff structure has
improved extensively and more proclamations and regulations have been intro-
duced to streamline the old tax system.
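The dummy-variable approach of Eq. 13.4 can be sketched with a small least-squares routine. This is only an illustration under stated assumptions: the data are synthetic, a single policy dummy equal to one from 1992 onward is assumed, and the helper `ols` is a hypothetical name written here for the example rather than a routine used by the authors.

```python
import math

def ols(regressors, y):
    """Least-squares coefficients (constant first) via the normal equations."""
    X = [[1.0] + list(row) for row in regressors]
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = A[row][col] / A[col][col]
            for c in range(col, k):
                A[row][c] -= factor * A[col][c]
            b[row] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef

# Synthetic data: log agricultural GDP and a policy dummy equal to 1 from 1992.
years = list(range(1981, 1997))
log_gdp = [4.0 + 0.03 * i + 0.05 * math.sin(i) for i in range(len(years))]
policy = [1.0 if year >= 1992 else 0.0 for year in years]
# Log tax revenue built from known parameters so the fit can be checked.
log_tax = [0.5 + 0.9 * g - 0.4 * d for g, d in zip(log_gdp, policy)]

const, elasticity, policy_shift = ols(list(zip(log_gdp, policy)), log_tax)
```

Because the dummy absorbs the level shift attributed to the policy change, the coefficient on log GDP is read as the elasticity net of that discretionary measure. Several dummies could be passed as further columns to mirror the summation in Eq. 13.4.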
In estimating the coefficient of tax buoyancy, annual time series data was col-
lected from 1981 to 2013. The data comprises the following variables of interest:
agricultural GDP, non-agricultural GDP, aggregated agricultural income tax,
aggregated land tax, personal income tax, business profit income tax, aggregated
direct tax, and consumer price index. This data is from the Ministry of Finance and
Economic Development (MOFED) and the World Development Indicators’
(WDI) database.
Agricultural income tax revenue, land tax revenue, personal income tax, busi-
ness profit income tax, and aggregated direct tax were converted to their real values
by dividing the nominal values by the consumer price index (CPI). The use of
CPI as the deflator helps smooth the data and also avoids biased results that could
have resulted from inflation. CPI is used because it falls on the expenditure side of
the GDP equation. According to Triplett (2001), CPI is preferable as it represents
the cost-of-living index and provides appropriate guidance for measuring consumer
inflation. Hence, it is best used in deflating tax revenues.
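The deflation step described above can be sketched as follows. The values are hypothetical, and scaling by a base-year index of 100 is an assumption made here for readability; the text only states that nominal values are divided by the CPI.

```python
def deflate(nominal, cpi, base_index=100.0):
    """Convert nominal values to real values using a price index.

    Assumes the CPI is expressed with the base year equal to base_index.
    """
    if len(nominal) != len(cpi):
        raise ValueError("nominal and cpi must have the same length")
    return [value * base_index / index for value, index in zip(nominal, cpi)]

# Hypothetical nominal tax revenue and CPI for three years.
nominal_tax = [50.0, 60.0, 90.0]
cpi = [100.0, 120.0, 150.0]
real_tax = deflate(nominal_tax, cpi)
```

In this example nominal revenue rises from 50 to 60 between the first two years, but real revenue is unchanged because prices rose by the same 20%.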
The variables used in the models are as follows:
D.ln_RealAGDP is the first differenced log of real agricultural GDP;
D1992 is a dummy variable for 1992, when there was a change
in government and no collection of tax revenue;
Dpolicy is a dummy variable to capture policy changes due to the tax
reforms; and
t is the time trend.
A limitation of this approach is that it requires data that separate tax
revenue from the effects of discretionary changes. Lacking such data, we corrected the
dataset for the effects of tax reforms and tax policy changes using dummies. This technique
assumes that income elasticity is constant over the range of revenues considered.
13.5 Empirical Findings
Initially, agricultural GDP fluctuated, and this was followed by a period
of rapid increase. Since 1992, the new Ethiopian regime has
introduced various changes in the tax system and it is expected that real agricultural
GDP could be non-stationary. As such, to have meaningful results, the Dickey–Fuller
test with a constant and a time trend and the augmented Dickey–Fuller (ADF) test
were employed to test for the presence of unit roots in the variables. The
Kwiatkowski–Phillips–Schmidt–Shin (KPSS) and Phillips–Perron (PP) unit root tests
were also employed.
The results indicate that real agricultural GDP exhibited a unit root at the different
critical levels. However, real agricultural GDP was found to be stationary after
differencing once, implying that the variable is integrated of order one. In contrast,
the real agricultural income tax, the real land use fee, and the total agricultural tax
variables were found to be stationary at levels (see Annexure 1). Thus, real agri-
cultural income tax and real land use fee, as well as the total agricultural tax series,
are integrated of order zero. Therefore, the first difference of the log of real agricultural
GDP (D.ln_RealAGDP) was used as an explanatory variable in the model. The other
independent variables in the model include time (t) and a dummy variable, d1992,
introduced for 1992 when real economic activity for assessing agricultural income tax
and land use fee was substantially slower than the historical trend.
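The differencing step can be sketched as below. The variable name D.ln_RealAGDP suggests a first difference of the log series; the helper name and the GDP figures here are hypothetical and serve only to show the transformation.

```python
import math

def diff_log(series):
    """First difference of the natural log of a series.

    The result has one fewer observation than the input and approximates
    the period-on-period growth rate.
    """
    logs = [math.log(value) for value in series]
    return [current - previous for previous, current in zip(logs, logs[1:])]

# Hypothetical real agricultural GDP growing by 5% per year.
real_agdp = [100.0, 105.0, 110.25]
d_ln_real_agdp = diff_log(real_agdp)
```

Each element is close to log(1.05), about 0.0488, which illustrates why a differenced log series is often read as a growth rate. Note that differencing drops the first observation, which affects the usable sample length.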
13.5.1 Agricultural Income Tax and Land Use Fee Buoyancy
The results suggest that agricultural GDP had some significant impact on agricultural
income tax. In fact, the estimated value of revenue buoyancy is −1.13 which is
significant at the 10% level. This implies that a 1% increase in agricultural GDP was
associated with a 1.13% decrease in agricultural income tax in Ethiopia. The findings
also suggest that agricultural GDP had no statistically significant influence on agri-
cultural land use fee and total agricultural tax. The R2 value is high, suggesting that
the model is a good fit. Table 13.1 presents the regression results on tax buoyancy.
Table 13.1 Estimates of tax buoyancy for Ethiopia (1984–2014)

                 Agricultural    Agricultural    Total
                 income tax      land use fee    agricultural tax
D.ln_RealAGDP    −1.126          −0.716          −0.645
                 (1.83)*         (1.16)          (1.06)
d1992            −3.160          −3.038          −3.103
                 (10.57)***      (10.11)***      (10.54)***
_cons            4.912           4.826           5.571
                 (85.42)***      (83.55)***      (98.39)***
R2               0.79            0.77            0.79
N                33              33              33
*p < 0.1; **p < 0.05; ***p < 0.01
Source Authors’ computations using data from MOFED
13.5.2 Agricultural Income Tax and Land Use Fee Elasticity
A further analysis was done in evaluating tax elasticity by including a dummy
variable for a policy change since 1992. The basic model was extended by
including time trend t and a dummy variable to capture policy changes that rep-
resent tax reforms for the period 1992–2014.
The elasticity of agricultural income tax was estimated to be −1.16, implying
that a 1% increase in agricultural GDP was associated with a 1.16% decrease in
agricultural income tax (Table 13.2). On the other hand, agricultural GDP had no
statistically significant effect on agricultural land use fee and total agricultural tax.
This may be due to the declining share of agriculture in GDP and employment as
the economy grew over this period.
The coefficient of time is statistically significant at 5% for land use fee and
significant for total agricultural tax at 1%; its t-value is approximately 2.54 and
2.78, respectively. However, the coefficient of time is not statistically significant for
agricultural income tax at the 5% significance level; its t-value is approximately
1.87. This supports the observation that agricultural income tax is driven not only
by agricultural GDP but also by other internal developments in the country and
by improvements in tax administration. It also shows that informational requirements of
land taxation affect the design of taxes in the rural sector. Thus, the case for
agriculture productivity as a focus of economic growth strategies must rely on
identifying a set of inter-linkages through which agricultural growth contributes to
growth in revenue sources for the effective provision of public services in the rural
Ethiopian economy. See Table 13.3.

Table 13.2 Estimate of tax elasticity (1984–2014)

                 Agricultural     Agricultural     Total
                 income tax       land use fee     agricultural tax
D.ln_RealAGDP    −1.156           −0.918           −0.833
                 (1.89)*          (1.53)           (1.45)
d1992            −3.379           −3.249           −3.335
                 (11.48)***       (11.21)***       (12.05)***
Dpolicy          −0.439           −0.459           −0.498
                 (2.48)**         (2.63)**         (2.99)***
T                0.016            0.021            0.022
                 (1.87)*          (2.54)**         (2.78)***
_cons            4.940            4.778            5.533
                 (47.43)***       (46.59)***       (56.49)***
R2               0.83             0.82             0.84
N                33               33               33

*p < 0.1; **p < 0.05; ***p < 0.01
Source Authors’ computation using data from MOFED

Table 13.3 Estimates of buoyancy and elasticity (1984–2014)

                          Buoyancy    Elasticity
Agricultural income tax   −1.12       −1.15
Land use fee              −0.71       −0.91
Total agricultural tax    −0.64       −0.83
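The significance levels quoted for the time trend can be checked from the t-values alone. The sketch below computes two-sided p-values from the Student t distribution by numerical integration and maps them to the star convention of Table 13.2. It assumes 28 degrees of freedom (N = 33 less the five reported coefficients), which the chapter does not state; the function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t| (Simpson's rule)."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (s * h / 3)

def stars(p):
    """Star convention used in the tables: * 10%, ** 5%, *** 1%."""
    return "***" if p < 0.01 else "**" if p < 0.05 else "*" if p < 0.1 else ""

df = 33 - 5  # assumption: observations minus estimated coefficients
for name, t in [("agricultural income tax", 1.87),
                ("land use fee", 2.54),
                ("total agricultural tax", 2.78)]:
    p = two_sided_p(t, df)
    print(f"{name}: t = {t}, p = {p:.3f} {stars(p)}")
```

Under this assumption the three trend coefficients fall in the 10%, 5%, and 1% bands respectively, which matches the stars reported in Table 13.2.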
13.5.3 Buoyancy and Elasticity of Personal Income Tax and Business Income Tax
Under the category of direct taxes, non-agricultural tax revenue variables, which are
real personal income tax and business profit income tax, as well as the total direct
tax series, were analyzed. As the first step, a more detailed examination of the data
properties and the final model specification was done and the property of the series
was analyzed using the augmented Dickey–Fuller (ADF), KPSS, and Phillips–
Perron (PP) unit root tests (the results are presented in Annexure 1).
Since all series were found to be I(1), co-integration tests were required to
establish the relationship of personal income tax and business income tax
with non-agricultural GDP. Once the existence of a unique co-integrating
vector was established, the structural vector auto-regressive (SVAR) model was used
to estimate the short-run elasticity and buoyancy between the variables. AIC
was used to select the optimum lag length of the SVAR models. Based on the SVAR
estimation, tax buoyancy and elasticity results are given in Table 13.4.
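The way the unit root statistics lead to an I(1) classification can be illustrated with a small decision rule. The sketch below is only a reading aid, assuming the 5% critical values and the usual null hypotheses (unit root for ADF, stationarity for KPSS); the function names are hypothetical, and the numbers are the personal income tax entries reported in Annexure 1.

```python
def adf_rejects_unit_root(stat, crit):
    # ADF null: the series has a unit root. Reject when the statistic
    # lies below (is more negative than) the critical value.
    return stat < crit

def kpss_rejects_stationarity(stat, crit):
    # KPSS null: the series is stationary. Reject when the statistic
    # exceeds the critical value.
    return stat > crit

def looks_i1(level, diff):
    """Each argument is (adf_stat, adf_crit, kpss_stat, kpss_crit)."""
    nonstationary_in_levels = (not adf_rejects_unit_root(level[0], level[1])
                               and kpss_rejects_stationarity(level[2], level[3]))
    stationary_in_differences = (adf_rejects_unit_root(diff[0], diff[1])
                                 and not kpss_rejects_stationarity(diff[2], diff[3]))
    return nonstationary_in_levels and stationary_in_differences

# Personal income tax, 5% critical values, as reported in Annexure 1.
level = (1.406, -3.6, 0.624, 0.146)
diff = (-3.952, -3.576, 0.104, 0.146)
print(looks_i1(level, diff))
```

The rule requires both tests to agree, which is a stricter reading than relying on either test alone.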
The results in Table 13.4 suggest that personal income tax had a buoyancy of
0.08. That is, a 1% change in non-agricultural GDP yielded a 0.08% change in tax
revenue as a consequence of both automatic changes and changes in discretionary
fiscal policy. In other words, a 1% increase in non-agricultural GDP led to only
a 0.08% increase in personal income tax during the current period. Although some
proportion of incremental income was transferred to the government in the form of
taxes, that proportion was small, implying that the tax system was not buoyant.
Table 13.4 Estimates of buoyancy and elasticity for personal income tax and business income tax

                             Buoyancy    Elasticity
Personal income tax          0.08        0.068
Business profit income tax   0.12        0.11
Total direct tax             0.13        0.118
The results show that the elasticity of Ethiopia’s personal income tax was
0.068, which indicates that developments in non-agricultural GDP over the
study period spurred a less than proportionate automatic increase in tax revenue.
The implication is that the tax system yielded a 0.068% change in tax revenue
resulting from economic activity for every 1% change in non-agricultural
GDP. Thus, a decreasing proportion of incremental income was collected and
transferred to the government in the form of tax revenue, which shows that the
personal income tax system in Ethiopia was inelastic over the study period.
This also shows that a 1% increase in non-agricultural GDP led to a 0.12%
increase in business profit income tax in the current fiscal year. Thus, a decreasing
amount of incremental business profit income tax was collected and transferred to
the government in the form of taxes, implying that the tax system was less buoyant.
When the policy change is captured as a dummy variable, the estimates of tax
elasticity result in a 1% increase in non-agricultural GDP leading to a 0.11%
increase in business profit income tax in the current period. Thus, a lesser pro-
portion of incremental business profit income tax was collected and transferred to
the government in the form of tax revenue. This shows that this tax was also
inelastic over the study period. Because personal income tax and business profit
income tax are progressive in nature, their elasticities were expected to be greater
than 1; the estimates fall well short of this.
Further, a 1% increase in non-agricultural GDP led to a 0.13% increase in total
direct tax in the current period. When a policy change was included as a dummy
variable, a 1% increase in non-agricultural GDP led to about a 0.12% increase in
direct tax in the current period.
The overall elasticity of the tax system clearly shows that the tax system in the
country is inelastic and is therefore not responsive to changes in national income.
The elasticity coefficient was not much lower than buoyancy for all the variables,
implying that the discretionary measures did not significantly impact own revenue.
It can easily be observed that discretionary changes to personal income tax and
business profit income tax made little contribution to the growth in overall direct tax
revenues.
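The comparison between buoyancy and elasticity can be made explicit with simple arithmetic on the values in Table 13.4. In the sketch below, the difference between the two coefficients is read as the part of the revenue response attributable to discretionary changes; the function name `discretionary_gap` is illustrative and not taken from the chapter.

```python
# Buoyancy and elasticity as reported in Table 13.4.
estimates = {
    "Personal income tax": (0.08, 0.068),
    "Business profit income tax": (0.12, 0.11),
    "Total direct tax": (0.13, 0.118),
}

def discretionary_gap(buoyancy, elasticity):
    """Revenue response per 1% GDP change attributed to discretionary measures."""
    return buoyancy - elasticity

gaps = {tax: round(discretionary_gap(b, e), 3) for tax, (b, e) in estimates.items()}
for tax, gap in gaps.items():
    print(f"{tax}: {gap:.3f}")
```

The gaps are all about 0.01, that is, roughly one hundredth of a percentage point of revenue growth per 1% of GDP growth, which is why the discretionary contribution is described as small.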
Our study analyzed and measured the responsiveness of agricultural tax to eco-
nomic growth in Ethiopia. Agricultural tax buoyancy measures growth in agricul-
tural tax revenue as a ratio of the growth in agricultural GDP. The study concludes
that growth in agricultural GDP had a significant and negative impact on the growth
in agricultural income tax collections in Ethiopia. Agriculture’s share consistently
had an adverse influence on revenue collections, whereas non-agricultural direct
taxes rose as personal income and business profit taxes increased. In
general, a tax buoyancy or elasticity coefficient that is lower than unity may
indicate issues related to the structure of the tax, administration or compliance in the
tax system. Based on these findings, the study recommends that reviewing the tax
system is crucial as and when the economic structure changes. Tax policy measures
should aim at increasing the tax base by bringing in the growing agricultural sector
of smallholder farmers under the tax administration of the federal government.
There is also a need to improve the tax administration continuously so that tax
evasion and other malpractices can be tackled. Efforts are also needed to minimize
the costs of tax collection.
With inelastic tax estimates for personal income tax and profit tax, the Ethiopian
government has to review its tax collection system and pursue further reforms to
exploit the tax revenue potential of the economy fully. The sensitivity response of
revenue to changes in the tax base for personal income tax and business profit
income tax was also found to be less than unity, indicating that the possibility of
enhancing revenue proceeds from these taxes remains fairly weak. This requires the
implementation of discretionary measures coupled with other measures for the
shortfalls in revenue.
Annexure 1
ADF, KPSS, and PP unit root test results
(1%, 5%, 10% = critical values at the respective levels; Stat = test statistic; Lag = lag order)

Variables             | ADF test                            | KPSS test                        | PP test
                      | 1%      5%      10%     Stat    Lag | 1%     5%     10%    Stat    Lag | 1%       5%       10%      Stat     Lag
Levels
Agricultural GDP      | −4.38   −3.6    −3.24   −2.025  12  | 0.216  0.146  0.119  0.633   3   | −23.524  −18.508  −15.984  −2.935   3
Agricultural income   | −4.38   −3.6    −3.24   −4.817  12  | 0.216  0.146  0.119  0.16    3   | −23.524  −18.508  −15.984  −22.573  3
  tax
Land use fee          | −4.38   −3.6    −3.24   −3.658  12  | 0.216  0.146  0.119  0.116   3   | −23.524  −18.508  −15.984  −18.661  3
Total Agr tax         | −4.38   −3.6    −3.24   −6.739  10  | 0.216  0.146  0.119  0.108   3   | −23.524  −18.508  −15.984  −20.679  3
Non-agricultural GDP  | −4.38   −3.6    −3.24   −0.079  12  | 0.216  0.146  0.119  0.654   3   | −23.524  −18.508  −15.984  −3.274   3
Personal income tax   | −4.38   −3.6    −3.24   1.406   12  | 0.216  0.146  0.119  0.624   3   | −23.524  −18.508  −15.984  0.659    3
Business income tax   | −4.38   −3.6    −3.24   1.115   12  | 0.216  0.146  0.119  0.384   3   | −23.524  −18.508  −15.984  −4.116   3
Direct tax            | −4.38   −3.6    −3.24   0.527   12  | 0.216  0.146  0.119  0.429   3   | −23.524  −18.508  −15.984  −2.695   3
First differences
Agricultural GDP      | −4.325  −3.576  −3.226  −4.698  1   | 0.216  0.146  0.119  0.0634  3   | −23.396  −18.432  −15.936  −23.538  3
Non-agricultural GDP  | −4.334  −3.58   −3.228  −3.632  2   | 0.216  0.146  0.119  0.041   3   | −23.524  −18.508  −15.984  −23.396  3
Personal income tax   | −4.325  −3.576  −3.226  −3.952  1   | 0.216  0.146  0.119  0.104   3   | −23.396  −18.432  −15.936  −30.704  3
Business income tax   | −4.316  −3.572  −3.223  −4.205  0   | 0.216  0.146  0.119  0.0922  3   | −23.396  −18.432  −15.936  −22.889  3
Direct tax            | −4.325  −3.576  −3.226  −4.427  0   | 0.216  0.146  0.119  0.102   3   | −23.268  −18.356  −15.888  −25.549  3

Source Computed by the authors

308 H. Azime et al.
References
Ahmed QM, Muhammad SD (2010) Determinant of tax buoyancy: empirical evidence from
developing countries. Eur J Soc Sci 13(3):408–418
Alemayehu G, Shimeles A (2005) Taxes and tax reform in Ethiopia, 1990–2003. Research Paper,
UNU-WIDER, United Nations University (UNU), No. 65
Bachewe FN, Guush B, Berhane B, Minten M, Taffesse AS (2015) Agricultural growth in Ethiopia
(2004–2014): Evidence and drivers. International Food Policy Research Institute (IFPRI),
Working Paper, Washington, DC, p 81
Besley T, Ghatak M (2006) Public goods and economic development. Oxford University Press,
Oxford
Besley T, Persson T (2014) Why do developing countries tax so little? J Econ Perspect 28(4):
99–120
Bunescu L, Comaniciu C (2013) Tax elasticity analysis in Romania: 2001–2012. Proc Econ
Finance 6:609–614
Craig ED, Heins AJ (1980) The effect of tax elasticity on government spending. Public Choice 35
(3):267–275
Creedy J, Gemmell N (2001) The revenue elasticity of taxes in the UK. Melbourne Institute
Working Paper, No. 11/01
Creedy J, Gemmell N (2008) Corporation tax buoyancy and revenue elasticity in the UK. Econ
Model 25(1):24–37
Ehdaie J (1990) An econometric method for estimating the tax elasticity and the impact on
revenues of discretionary tax measures: applied to Malawi and Mauritius. Country Economics
Deprtment, The World Bank, Working Paper Series, No. 334
Feger T, Asafu-Adjaye J (2014) Tax effort performance in sub-Sahara Africa and the role of
colonialism. Econ Model 38:163–174
Geda A, Shimeles A (2005) Taxes and Tax Reform in Ethiopia, 1990–2003 WIDER Working
Paper Series 065, World Institute for Development Economic Research (UNU-WIDER)
Haughton J (1998) Estimating tax buoyancy, elasticity and stability. United States Agency for
International Development, EAGER/PSGE Excise Project
Howard M, Foucade AL, Scott E (2009) Public Sector Economics for Developing Countries.
University of the West Indies Press
Jenkins GP, Kuo CY, Shukla GP (2000) Tax analysis and revenue forecasting. Harvard Institute
for International Development, Harvard University, Cambridge, Massachusetts
Khan MH (2001) Agricultural taxation in developing countries: a survey of issues and policy.
Agric Econ 24(3):315–328
Leuthold JH (1991) Tax shares in developing economies a panel study. J Dev Econ 35(1):173–185
Mawia M, Nzomoi J (2013) An empirical investigation of tax buoyancy in Kenya. Afr J Bus
Manage 7(40):4233–4246
McCluskey WJ, Trinh HL (2013) Property tax reform in Vietnam: options, direction and
evaluation. Land Use Policy 30(1):276–285
Milwood TAT (2011) Elasticity and Buoyancy of the Jamaican tax system. Bank of Jamaica
Moller LC (2015) Ethiopia’s great run: the growth acceleration and how to pace it. The World
Bank Group, Washington, DC
Moreno MA, Maita M (2014) Tax elasticity in Venezuela a dynamic cointegration approach.
Central Bank of Venezuela
Newbery DM (1987) Taxation and development. In the theory of taxation for developing
countries. Newbery DM, Stern NH (eds), pp 165–204. Published for the World Bank [by]
Oxford University Press
ONRS (2002) Oromia Rural Land Use and Administration Proclamation. No. 56, (ed.), Oromia
National Regional State. Finfinnee, Ethiopia
ONRS (2005) Oromia national regional Government rural land use payment and agricultural
income tax amendment, 2005, Proc. No. 99, Megeleta Oromia, 13th year, No. 13. Oromia
National Regional State
Osoro N (1993) Revenue productivity implications of tax reform in Tanzania. African Economic
Research Consortium, Research Paper No. 20
Prichard W (2015) Taxation, responsiveness and accountability in Sub-Saharan Africa: the
dynamics of tax bargaining. Cambridge University Press
Rashid S, Assefa M, Ayele G (2007) Distortions to agricultural incentives in Ethiopia. Agricultural
Distortions Working Paper 43. The World Bank, Washington, DC
Sanjeev G, Tareq S (2008) Mobilizing revenue. Finance Dev 45(3):44–47
Sarris AH (1994) Agricultural taxation under structural adjustment. Food and Agriculture
Organization of the United Nations
Singer NM (1968) The use of dummy variables in estimating the income-elasticity of state
income-tax revenues. National Tax J 21(2):200–204
Tanzi V, Zee HH (2000) Tax policy for emerging markets: developing countries. National Tax J
53(2):299–322
Triplett JE (2001) Should the cost-of-living index provide the conceptual framework for a
consumer price index? Econ J 111(472):311–334
Chapter 14
Improving Agricultural Productivity Growth in Sub-Saharan Africa
Olaide Rufai Akande, Hephzibah Onyeje Obekpa and Djomo-Raoul Fani
Abstract Improved agricultural productivity is central to achieving inclusive
development, reducing poverty, and enhancing the living standards of most people
in sub-Saharan Africa. Concerned by the declining state of agricultural productivity
in this region, we ask whether agro-processing activities and exports of raw
agricultural materials have backward linkage effects on agricultural production
activities and, if such a relationship exists, how it can be used more effectively.
The regression results indicate that increases in exports of raw agricultural
materials negatively influence productivity growth in agriculture. Consistent with
the findings of other studies that agro-industrial growth in the sub-Saharan region
faces several challenges, the response of agricultural production to agro-industrial
activities was positive but inelastic. Overcoming these challenges, improving the
value of agricultural exports, and thereby improving agricultural productivity
growth require policy, regulatory, and institutional frameworks across
countries in the region that will enable agro-industrial development to become
stronger; create opportunities for increased private sector engagement,
including through the formation of public–private partnerships for developing
synergies; provide access to credit for participants along the agricultural value
chain; provide rural infrastructure that reduces postharvest losses and transport
costs and shortens transit time, while increasing overall rural mobility; support
innovations and technology for developing competitive value chains; provide
access to value-responsive markets; provide access to timely information for
improving bargaining power; establish organizations to reduce transaction costs;
and include women, the poor, and/or marginal groups in the value chains.
Overall, this strategy will be optimal when it simultaneously increases
agro-industrial activities and decreases agricultural raw material exports by 2.5% of
their existing values each year, given 1981 as the base year.
O.R. Akande (✉) · H.O. Obekpa · D.-R. Fani
Department of Agricultural Economics, University of Agriculture, Makurdi, Nigeria
e-mail: akande.olaide@uam.edu.ng
DOI 10.1007/978-981-10-4451-9_14
Keywords Total factor productivity · Agricultural production · Inclusive development ·
Agro-processing · Export of agricultural raw materials · Panel data · Simulation
14.1 Introduction
Growth theories emphasize the influential role of nonconventional inputs in
accounting for productivity and income differences among output producing units.
However, in contrast to the neoclassical growth theory, arguments based on the
endogenous growth theory (Aghion and Howitt 1992; Grossman and Helpman
1991; Romer 1990) assume that differences in growth among economic entities
using the same or similar inputs are accounted for by factors and disturbances
within the growth model. By implication, therefore, policy interventions can be
used to adjust suboptimal production.
In sub-Saharan Africa (SSA), agriculture remains the major occupation of most
people, contributing to the population’s food security and providing rural dwellers
with livelihood options. In many countries in the region, agriculture is the key source of
foreign exchange and revenue for the government. If properly developed, agri-
culture also has the potential of stemming the current dangerous trend of rural-urban
migration, reducing the numerous social problems in cities and spurring sustainable
inclusive development. However, as in other regions of the world, the capacity of
the sector to meet its potential critically depends on the growth of the agro-industry.
Agro-industrial development spurs growth in primary agricultural production
because of the forward and backward linkages existing between these sectors
(Hirschman 1958). Agro-processing in particular has several positive effects on
agricultural production because it is a necessary part of the agricultural value chain.
Thus, its absence retards the flow of value in an agricultural economy.
Agro-industrial development also promotes job creation and inclusive development
because of its potential to provide jobs for disadvantaged groups like women.
Further, a growth in agro-processing reduces postharvest losses, thereby increasing
incomes and helping people fulfill their economic aspirations. However, while
agriculture-led growth has played an important role in reducing poverty and
transforming the economies in many Asian countries, the strategy has not worked in
Africa. For example, most African countries have failed to meet the requirements of
a successful agricultural revolution. An obvious corollary to this is deep and
prevalent poverty in the region as compared with the other regions in the world
(Kharas 2007; Strawson et al. 2015; UNDP 2011).
Two mutually reinforcing problems are contributing to the high prevalence of
poverty in the region: bad policies and low agricultural productivity. For instance,
Fuglie and Rada (2013) point out that some of the lowest levels of agricultural land
and labor productivity in the world are found in sub-Saharan Africa. Anderson and
Masters (2009) say that farmers in many parts of Africa continue to face more
discriminatory policies as compared with farmers in other global regions because
farmers in the continent are confronted with policies that lower economic incentives
to invest in agricultural production and modern inputs.
This situation stresses the need for strategies that stimulate more rapid agricul-
tural growth in sub-Saharan Africa. However, increased exploitation of natural
resources or a spike in commodity terms of trade may only spur limited growth in
the long run. In contrast, policies anchored on key productivity determinants
(Binswanger and Townsend 2000) can help maintain agricultural growth over the
long run. In our paper, we pursue the question of how agro-industrial activities and
exports of agricultural raw materials can be used to generate effective agricultural
productivity growth in SSA. Our study differs from the literature on sources of total
factor productivity (TFP) growth in agriculture in two respects. First, we circumvent
the simultaneity bias associated with TFP estimation from panel data
by using the hybrid Olley and Pakes (1996) and Levinsohn and Petrin (2003)
procedure. Second, as against the deterministic forecasting approach in most
studies, the simulation approach that we use acknowledges that uncertainties are
associated with realization of values of some TFP determinants, and by extension,
the random nature of TFP itself.
The rest of the paper is organized as follows. Section 14.2 presents the con-
ceptual framework, while Sect. 14.3 gives details of the econometric model
underlying the analysis. It also presents the estimated model and data sources.
Section 14.4 discusses the results and gives a conclusion.
14.2 Agro-Industrial Development: A Conceptual Framework
14.2.1 Agro-Industrial Development and Productivity Growth
According to FAO (1997), agro-industry refers to a subset of manufacturing that
processes raw materials and intermediate products derived from the agricultural
sector. Agro-industry transforms products originating in agriculture, forestry, and
fisheries and processes them into canned food, beverages, fruit juice, meat and dairy
products, textile and clothing, leather wood and rubber products, and animal feed,
among others.
Support for the development of agro-industry as a precursor to agricultural
productivity growth is rooted in the “linkage hypothesis.” The original version of
the theory of unbalanced growth pioneered by Hans Singer, Albert Hirschman, and
Walt Rostow emphasized the need for investments in strategic sectors of the
economy instead of all the sectors simultaneously. In Hirschman’s (1958) view, the
other sectors will automatically develop themselves through what are known as
“linkage effects.” The implicit assumption is that the best development path for
developing countries with income scarcity lies in selecting those enterprises and
industries where progress will induce further progress elsewhere. By implication,
therefore, any industry that shows a high degree of dependency as measured by the
proportion of output sold to or purchased from other industries, can provide a strong
stimulus to economic growth. Thus, where a complementary backward relationship
exists between industry A and industries B and C, growth of output of industry A
may generate demand for products of B and C and may also reduce the marginal
cost of production in these industries.
Correspondingly, through its backward and forward linkages, the agro-industry
can play a substantial role in spurring agricultural growth, providing employment in
rural areas, ensuring food security, and stimulating innovativeness among farmers.
According to FAO (1997), the agro-industry could spur productivity growth in
agricultural production through market expansion because establishing processing
facilities is an essential first step toward stimulating both consumer demand for
processed products and an adequate supply of the needed raw materials. Second,
the provision of transport, power, and other infrastructural facilities required for
agro-industries also benefits the agricultural production process and enhances
productivity. Ramachandran (2009) further states that agro-based industries can
spark innovativeness among farmers by encouraging them to resort to new pro-
duction techniques because the agro-industry helps agriculture become more pro-
ductive by enlarging the supply of inputs like fertilizers, pesticides, and improved
farm implements and equipment. The development of an agricultural output-based
industry automatically encourages farmers to produce the concerned crops. In the
absence of agro-based industries, the farmer community develops a sort of
frog-in-the-well attitude toward farming (Ramachandran 2009).
Another important effect of agro-processing is a substantial increase in
employment that may result from setting up an industry using raw materials. For
instance, considerable employment may be generated in agriculture by being the
raw material base, even if the agro-industrial process is itself capital-intensive (FAO
1997). In particular, food processing in the early stages of development can be an
important direct complement to agriculture as a source of employment for seasonal
labor. The off-farm employment opportunities provided by food processing may
thus represent the first instrument of time-smoothing in the labor market and as such
is an important factor of capital accumulation in rural areas. Ramachandran (2009)
further argues that by helping provide employment opportunities locally,
agro-industries stop the dangerous consequences of mass exodus of farmers and
rural dwellers associated with rural-urban migration.
The agro-industry’s capacity to generate demand and employment in other
industries is also important because of its role in activating sideways linkages, that
is, linkages derived from the use of by-products or waste products of the main
industrial activity (FAO 1997). For example, animal feed industries can utilize
several agro-industrial by-products such as whey, oilseed press cakes, and blood,
carcass, and bone meal. In addition, many industries using agricultural raw materials
produce waste that can be used as fuel, paper pulp, or fertilizers. Smallholder
producers in developing countries have been experiencing high postharvest losses
threatening their food security and negatively affecting the financial sustainability
of their operations. For instance, the Africa Post Harvest Loss Index (2014)
estimates losses for roots and tubers at 10–40%, for fruits and vegetables at
15–44%, and for fish and seafood at 10–40%. Developing the agro-processing
potential, either through indigenous knowledge (drying, salting, crushing, pre-
cooking) or modern technology-based methods (extraction, canning, bottling,
concentration), has the capacity to reverse these losses. Therefore, agro-industrial
activities also have the potential to contribute toward food security.
However, unplanned agro-industrial development may generate negative exter-
nalities and sustain primary agricultural production in a low level of equilibrium.
For example, there may be significant risks in terms of equity, sustainability, and
inclusiveness when value addition and capture are concentrated in the hands of a
few value chain participants to the detriment of the others (da Silva and Baker
2007). This will be the case in a situation of unbalanced market power in the
agri-food chain. Moreover, sustainability of agro-industrial development depends
on its competitiveness in terms of costs, prices, operational efficiencies, product
offers, and other associated parameters. Establishing and maintaining competi-
tiveness may constitute a particular challenge for small- and medium-scale
agro-industrial enterprises and small-scale farmers.
The preconditions for developing agro-industries include necessary transportation,
information, and communication technologies and access to reliable supplies of key
utilities, notably electricity and water. Therefore, infrastructural constraints influence
the cost and reliability of the physical movement of raw materials and end products,
the efficiency of processing operations, and responsiveness to customer demands. The
prevailing macroeconomic and business conditions and the level, quality, and relia-
bility of infrastructure are also critical determinants of competitiveness in the export of
processed agro-food products (Crammer 1999). In a situation of acute infrastructural
constraints, the additional complexities of processing operations may outweigh the
benefits of diversification in the exports of primary commodities toward value addi-
tion (Love 1983). Weak infrastructure may further put agro-processing enterprises at a
competitive disadvantage vis-à-vis their industrialized competitors and distort the
competitiveness of developing countries relative to one another. Unreliable and costly
supplies of utilities may also prevent enterprises from operating at or near full capacity
utilization. Overall, a weak infrastructural environment will lower the rate of transi-
tion of agro-industries from informal to formal operators and steer the structure of the
sector toward a higher level of concentration.
14.2.2 Export of Agricultural Raw Materials and Productivity Growth
Arguments supporting commodities trade across international borders are rooted in
the export-led growth hypothesis (see Adams 1973; Crafts 1973; Edwards 1992,
1998). According to this model, export trade is a key determinant of economic
growth. The key premise of this argument is that overall growth in a country can be
generated not only by increasing the amount of labor and capital within the
economy, but also by expanding exports. Accordingly, exports can serve as an
“engine of growth.” An offshoot of this idea is the assumption that developing
countries have comparative advantages in agricultural production, thus only
needing to forward their agricultural produce to international markets (Akande
2012). However, empirical analyses to confirm this proposition have shown mixed
results. While positive for some countries (Krueger 1978; Lussier 1993), they were
negative for others with more than half the empirical investigations published in the
1990s finding no long-run relationship between exports and economic growth,
suggesting that correlations between these variables arise as a result of short-term
fluctuations.
A critical factor that affects the chances of developing countries benefiting from
export trade in agriculture is increasing consumer concerns about food safety.
Specifically, food exports from the developing world are exposed to demanding
food safety standards from organizations such as Codex Alimentarius and by
unilateral requests from individual importers. Also, attitudes and standards in vogue
in the developed world spill over to local markets (Pinstrup-Andersen 2000). A new
form of protectionism often arises in which high quality and safety standards
imposed by importing countries cannot be accommodated rapidly by local pro-
duction technologies or guaranteed by local analytical capabilities. The latter may
lead to increased levels of rejection at entry ports. Moreover, even if the problem
regarding the safety of an imported food has been overcome, the credibility of the
exporting country to produce safe food may be at stake, thus affecting the volume of
its food exports. For this reason, developing countries that fail to implement
or strengthen their food-borne disease controls and investigation and surveillance
systems are unlikely to gain in the long run from food and agricultural export
trade.
In summary, the review indicates that depending on the prevailing factors the
correlation between agricultural productivity, agro-processing and raw material
exports can be positive or negative and is also subject to random influence from
market forces. Hence, the focus of this paper is establishing this correlation and
how the equilibrium can be shifted in a way so as to achieve sustainable growth and
inclusive development in SSA.
14.3 Econometric Framework and Data
The simulation approach examines the future evolution of TFP in SSA agricultural
production under the assumption that uncertainties are associated with the evolution
of certain TFP determinants (Davidson and MacKinnon 2004). First, we estimated
the TFP data from the aggregate agricultural production function using the hybrid
Olley and Pakes (1996) and Levinsohn and Petrin (2003) procedure. Second, the
fixed coefficients in the TFP simulation model were estimated from a Tobit
regression. Finally, the impact of varying scenarios of agro-processing activities
and raw material exports on TFP’s evolution under uncertainty was forecast
using the Monte Carlo simulation. The random values of the uncertain variables in
the simulation model were generated from their probability distribution functions
(PDF).
Specifically, the simulated TFP (\(\theta\)) model is:
\[
\theta = E[f(X_{it})], \quad X \sim \mathrm{PDF}(X_{it}), \quad \text{or} \quad
E(f(X_i)) = \theta_N = \frac{1}{N} \sum_{i=1}^{N} f(X_{it}) \qquad (14.1)
\]
where \(X\) is a vector of TFP determinants.
By the law of large numbers, the approximation \(\theta_N\) converges to the true value
as \(N\) increases to infinity. Therefore, the \(\theta_N\) estimate is unbiased if:
\[
E(\theta_N) = \theta
\]
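The averaging step in Eq. (14.1) can be illustrated with a toy Monte Carlo estimator. The sketch below is not the chapter's simulation model: it assumes a single standard normal input and the function f(x) = x², chosen only because the true expectation (equal to 1) is known, so that convergence as N grows can be observed. The names `monte_carlo_mean`, `f`, and `draw` are hypothetical.

```python
import random

def monte_carlo_mean(f, draw, n, seed=0):
    """Approximate E[f(X)] by averaging f over n independent draws of X."""
    rng = random.Random(seed)
    return sum(f(draw(rng)) for _ in range(n)) / n

# Toy check (assumption): X ~ N(0, 1) and f(x) = x^2, so E[f(X)] = 1.
f = lambda x: x * x
draw = lambda rng: rng.gauss(0.0, 1.0)

for n in (100, 10_000, 200_000):
    print(n, monte_carlo_mean(f, draw, n))
```

The printed averages should settle near 1 as the number of draws grows, which is the law-of-large-numbers behaviour the text relies on.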
As a first step, agricultural TFP was estimated from the hybrid Olley and
Pakes-Levinsohn and Petrin production function:
\[ y_{it} = \beta_{0i} + \beta_k k_{it} + \beta_l l_{it} + \beta_{ld}\, ld_{it} + \omega(k_{it}, i_{it}) + u^{q}_{it} \tag{14.2} \]
where lower case letters represent the log transform of the respective variable; \(y\) is
gross domestic product measured in million purchasing power parity dollars
(PPP$); \(k\) is gross capital investment measured in million US dollars; \(l\) is
agricultural labor measured in million people employed in agriculture; \(ld\) is agricultural
land measured in square kilometers; \(i\) is gross agricultural investment
measured in million US dollars; and \(u^{q}\) is the error term, distributed \(N(0, \sigma^2)\).¹
The fixed parameters in the TFP simulation model were estimated from the Tobit
regression:
\[ \mathit{tfp}_{it} = \alpha_{0i} + \alpha_1\, \mathit{agvadd}_{it} + \alpha_2\, \mathit{agrmtexpt}_{it} + \alpha_3\, \mathit{agr\&d}_{it} + \alpha_4\, \mathit{agfdi}_{it} + \alpha_5\, \mathit{agoda}_{it} + \varepsilon_{it} \tag{14.3} \]
where \(\alpha_{0i}\) are fixed effects parameters on countries; \(\alpha_j\) (\(j > 0\)) are parameters on the
associated variables; agvadd is value addition to agricultural products through
agro-processing measured in current market prices (USD); agrmtexpt is the value of
agricultural raw materials exported measured in current US dollars; agr&d is the
public expenditure on agricultural research and development measured in million
constant 2011 US dollars; agfdi is the value of foreign direct investment in agriculture
measured in current US dollars; agoda is the value of official development
assistance to agriculture measured in constant 2012 US dollars; and \(\varepsilon_{it}\) is the error term,
distributed \(N(0, \sigma^2)\).
¹ Annexure A gives a derivation of this model.
Finally, TFP was simulated from the stochastic model:

\[ \mathit{tfp}_{it} = \alpha_{0i} + \alpha_1 (\mathit{agvadd}_{it} + \eta_{1,it}) + \alpha_2 (\mathit{agrmtexpt}_{it} + \eta_{2,it}) + \alpha_3\, \mathit{agr\&d}_{it} + \alpha_4\, \mathit{agfdi}_{it} + \alpha_5\, \mathit{agoda}_{it} + \xi_{it} \tag{14.4} \]
where \(\eta_{1,it}\) and \(\eta_{2,it}\) are uncertainties associated with measurements of
agro-processing and agricultural raw material exports, respectively. They are
expected to capture random events associated with these business and open economy
variables. \(\xi_{it}\) is an exogenous white noise disturbance in the model.
Given the stochastic nature of this model, the behavior of TFP growth under
various scenarios was investigated. The simulated scenarios consisted of con-
comitant yearly positive changes to the state of agro-processing activities and
decreases in exports of agricultural raw materials by 1, 2.5, 5, 7.5, and 10% with
1981 as the starting point.
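The scenario exercise can be sketched along the lines of Eq. (14.4). The block below is an illustration under stated assumptions rather than the chapter's actual simulation: it takes the coefficients on agvadd and agrmtexpt from the mixed effects column of Table 14.2, treats both variables as logarithms, draws the uncertainty terms from normal distributions with made-up standard deviations, and uses a 24-year horizon (1981 to 2005). The function names are hypothetical, and the output is not meant to reproduce Table 14.3.

```python
import math
import random

# Coefficients on agvadd and agrmtexpt from the mixed effects column of Table 14.2.
ALPHA_AGVADD = 0.09
ALPHA_RAWEXP = -0.04


def expected_tfp_shift(rate, years):
    """Deterministic change in tfp relative to the baseline (assumes log variables)."""
    d_agvadd = years * math.log(1.0 + rate)  # cumulative log increase in value addition
    d_rawexp = years * math.log(1.0 - rate)  # cumulative log decrease in raw exports
    return ALPHA_AGVADD * d_agvadd + ALPHA_RAWEXP * d_rawexp


def simulate_tfp_shift(rate, years, n_draws=20_000, sd_eta=0.05, sd_xi=0.10, seed=0):
    """Monte Carlo mean of the tfp change with noise terms eta_1, eta_2 and xi.

    The normal distributions and their standard deviations are assumptions.
    """
    rng = random.Random(seed)
    d_agvadd = years * math.log(1.0 + rate)
    d_rawexp = years * math.log(1.0 - rate)
    total = 0.0
    for _ in range(n_draws):
        eta1 = rng.gauss(0.0, sd_eta)  # uncertainty in agro-processing
        eta2 = rng.gauss(0.0, sd_eta)  # uncertainty in raw material exports
        xi = rng.gauss(0.0, sd_xi)     # white noise disturbance
        total += ALPHA_AGVADD * (d_agvadd + eta1) + ALPHA_RAWEXP * (d_rawexp + eta2) + xi
    return total / n_draws


# The five yearly rates of change considered in the text.
scenarios = (0.01, 0.025, 0.05, 0.075, 0.10)
simulated = {rate: simulate_tfp_shift(rate, years=24) for rate in scenarios}
```

Because the noise terms have zero mean, each simulated value should sit close to `expected_tfp_shift(rate, 24)`; the spread around it depends entirely on the assumed standard deviations.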
14.3.1 The Data
The study uses panel (longitudinal time series) data on 13 countries in
sub-Saharan Africa covering the period 1981–2005. Data were collected
from the databases of the Food and Agriculture Organization (FAO) of the United
Nations, Agricultural Science and Technology Indicators (ASTI) (www.asti.cgiar.org),
and the World Bank (www.worldbank.org). Data on agricultural raw materials
exported were derived by multiplying the proportion of agricultural raw materials in
total merchandise exports by the value of total merchandise exports. The value of
agro-industrial value addition was proxied by industrial value added, which was
obtained by multiplying industrial value added as a proportion of GDP by
GDP. Values of official development assistance in agriculture (agoda) and foreign
direct investment in agriculture (agfdi) were obtained by weighting the aggregate of
these variables by the proportion of agriculture value added in GDP.
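The variable construction described above amounts to multiplying a share by an aggregate. A minimal sketch follows; the figures are hypothetical and not taken from the chapter's data, shares are assumed to be expressed as proportions between 0 and 1, and `share_to_value` is an illustrative helper name.

```python
def share_to_value(share, total):
    """Value of a component given its share (a proportion between 0 and 1) of a total."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be a proportion between 0 and 1")
    return share * total


# Hypothetical figures for one country-year (current US dollars).
total_merchandise_exports = 2_000_000_000.0
gdp = 10_000_000_000.0
total_oda = 500_000_000.0

# Raw material exports: share of raw materials in merchandise exports times total exports.
agrmtexpt = share_to_value(0.15, total_merchandise_exports)

# Agro-industrial value addition proxy: industrial value added share of GDP times GDP.
agvadd = share_to_value(0.20, gdp)

# Agricultural ODA: aggregate ODA weighted by the agriculture share of GDP.
agoda = share_to_value(0.30, total_oda)
```

If the source databases report shares in percent, they would need to be divided by 100 before being passed to the helper.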
14.4.1 Results
Table 14.1 Parameter estimates of the hybrid Olley and Pakes-Levinsohn and Petrin regression model of
agricultural production in sub-Saharan Africa
Variable(a)       Coefficient    Std. error    Sig. level
Labor             0.72           0.36          0.05
Land              −0.16          0.46          0.74
Gross capital     1              0.42          0.02
Investment        0.001          0.10          0.99
Wald              0.43 (0.43)    SS
Source Author’s computation
(a) All variables are in logarithm form

Table 14.2 Parameter estimates of the Tobit regression model of TFP in SSA’s agriculture
                         Mixed effects model         Random effects model
Variable                 Coefficient (std. error)    Coefficient (std. error)
agr&d                    −0.15 (0.05)**              −0.133 (0.032)***
agoda                    0.04 (0.02)**               0.027 (0.021)
agfdi                    −0.004 (0.001)*             −0.004 (0.002)**
agvadd                   0.09 (0.02)**               0.034 (0.024)
agrmtexpt                −0.04 (0.01)***             −0.032 (0.013)**
Burkina Faso             −1.56 (0.05)**
Madagascar               −2.35 (0.06)*
Ghana                    −0.37 (0.07)*
Mali                     −1.42 (0.06)*
Togo                     0.06 (0.05)**
Kenya                    −1.47 (0.09)*
Nigeria                  −1.20 (0.14)
Malawi                   −0.80 (0.07)*
sigma_u                  2.68e−19 (1.00)             0.79 (0.206)***
sigma_e                  0.12 (0.01)***              0.12 (0.01)***
rho                      4.81e−36 (3.69e−19)         0.98 (0.01)
Fit statistics:
Log likelihood           90.23                       63.96
AIC                      −150.46                     −113.92
BIC                      −106.01                     −93.18
Wald chi-square          7163.81***                  29.96***
Likelihood ratio (LR)    52.54***
Source Author’s computation
***, **, and * denote significance at 1, 5, and 10%, respectively

Annexure B summarizes the data, while Tables 14.1 and 14.2 give estimates
from the production function and the TFP model. The goodness of fit statistics of the
hybrid Olley and Pakes-Levinsohn and Petrin production function indicate a good
fit of the model to the data. The returns to scale statistics show that agricultural
production in SSA exhibits constant returns to scale. The coefficients on labor and
gross capital were significantly different from zero, whereas those on land and
investment were not significant. Specifically, the elasticity coefficient on labor
indicates that a 1% increase in labor increased aggregate agricultural
production by 0.71%. A 1% increase in capital, on the other hand, increased
the value of agricultural production by the same percentage; in other words,
agricultural output changed at the same rate as gross capital. This result
is consistent with the finding of Griliches (1998) that if TFP is correctly estimated,
the coefficient on capital should be roughly equal to unity. The negative but
insignificant coefficient on the land variable points to the potential for productivity
depletion arising from extensive land use practices without corresponding nutrient
replenishment through the use of fertilizers and other soil additives. These results
support Nkamelu’s (2013) finding that the land extensification path in Africa is
rapidly becoming unsustainable or impractical as land grows scarcer.
The estimated TFP Tobit model indicates a good fit of the model to the data. The
likelihood ratio (LR) test showed a better fit of the mixed effects model relative to
the random effects model (LR = 52.54; P < 0.01). Other fit statistics,
including the log likelihood, the Akaike information criterion (AIC), and the
Schwarz (Bayesian) information criterion (BIC), also selected the mixed effects Tobit model in preference
to the random effects model.
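The reported fit statistics can be cross-checked from the log likelihoods in Table 14.2. The sketch below recomputes the LR statistic and the AIC values. The parameter counts (15 and 7) are not stated in the chapter; they are inferred here as the counts that reproduce the reported AIC values. The 8 degrees of freedom for the LR test (the eight country effects) and the tabulated 1% chi-square critical value of about 20.09 are likewise assumptions of this illustration.

```python
def likelihood_ratio(loglik_unrestricted, loglik_restricted):
    """LR statistic: twice the gap between the two maximized log likelihoods."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * loglik


# Log likelihoods reported in Table 14.2.
LOGLIK_MIXED = 90.23
LOGLIK_RANDOM = 63.96

# Assumed parameter counts, inferred from the reported AIC values.
K_MIXED = 15
K_RANDOM = 7

lr_stat = likelihood_ratio(LOGLIK_MIXED, LOGLIK_RANDOM)
aic_mixed = aic(LOGLIK_MIXED, K_MIXED)
aic_random = aic(LOGLIK_RANDOM, K_RANDOM)

# Tabulated 1% critical value of the chi-square distribution with 8 degrees of freedom.
CHI2_CRIT_1PCT_DF8 = 20.09
prefers_mixed_effects = lr_stat > CHI2_CRIT_1PCT_DF8 and aic_mixed < aic_random
```

Under these assumptions the recomputed values agree with the table: an LR statistic of 52.54 and AIC values of −150.46 and −113.92.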
The elasticity coefficients on agro-industrial value addition and on exports of
agricultural raw materials in the mixed effects Tobit model were statistically significant.
Specifically, the coefficient on value addition through agro-processing was
positive, indicating that intensification of agro-processing activities improved agricultural
productivity in SSA. In contrast, the negative coefficient on raw material
exports indicates that increasing exports of raw agricultural products
reduces the productivity of the agricultural sector in the region. Moreover,
the coefficients of the control variables including public investment in agricultural
R&D, agricultural development assistance, and foreign direct investment in agri-
culture were statistically significant. However, while the coefficient of value of
development assistance to agriculture was positive, those of R&D and foreign direct
investment in agriculture were negative. These negative coefficients suggest that
excess public investments in research and development crowd out private partici-
pation while the level of investments by foreign nationals in the agricultural sector
is inconsistent with the growth of the agricultural economy in sub-Saharan Africa.
The simulation (Table 14.3 and Fig. 14.1) revealed that policies that yearly and
concomitantly increase agro-industrial value addition and reduce agricultural raw
material exports by 2.5%, assuming 1981 as the base year, will lead to acceptable
progressive growth in TFP in agriculture in SSA.
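The marginal column of Table 14.3 appears to equal the total growth divided by the scenario's percentage change; this reading is an inference from the reported numbers, not a definition given in the chapter. The short check below reproduces the column on that assumption and identifies the scenario with the largest marginal gain.

```python
# Total progressive growth in TFP over the baseline (%), keyed by scenario (% change),
# as reported in Table 14.3.
total_growth = {1.0: 1.33, 2.5: 8.0, 5.0: 13.33, 7.5: 20.0, 10.0: 21.33}

# Assumed definition: marginal growth = total growth per percentage point of change.
marginal_growth = {pct: round(total / pct, 2) for pct, total in total_growth.items()}

# Scenario with the highest marginal growth.
best_scenario = max(marginal_growth, key=marginal_growth.get)
```

On this reading the 2.5% scenario yields the highest growth per percentage point of change, which matches the scenario singled out in the text.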
Table 14.3 Scenario analysis of the effect of increases in agro-industrial activities and decreases
in export of agricultural raw materials on TFP in sub-Saharan Africa
Scenario (% increase in agro-processing      Progressive growth in TFP       Progressive growth in TFP
plus corresponding % decrease in             over the baseline, total (%)    over the baseline, marginal (%)
agricultural raw materials export)
Baseline                                     0                               0
1                                            1.33                            1.33
2.5                                          8                               3.2
5                                            13.33                           2.67
7.5                                          20                              2.67
10                                           21.33                           2.13
Source Author’s calculation

Fig. 14.1 Effect of improving agro-industrial activities and decreasing agricultural raw material
exports on progressive growth of TFP in agriculture in SSA

14.4.2 Discussion
Evidence from the regression analysis indicates that increases in
agro-processing activities and the corollary decrease in the export of raw agricultural
materials increase agricultural productivity in SSA. However, the low elasticity
coefficient on value addition (less than unity) implies that agricultural productivity
in the region responds little to changes in value addition activities, which further
suggests that the growth of agro-industry in SSA faces some challenges. AfDB
(2008), the World Bank, and Information Development/Agribusiness (2013)
identified challenges that confront agro-industrial development in many parts of
Africa, including lack of infrastructure, storage, finance, competencies,
adequate technologies, and a good policy environment. Specifically, these studies report
that lack of storage capacity in conjunction with poor rural electrification and water
access, insufficient road networks, and difficult access to communication tools
(telephone, e-mail, etc.) affect the competitiveness of the final agro-processing
products in terms of costs, quality, and supplies. Low and unstable agricultural
productivity in Africa further constrains the success of the agro-industry.
Moreover, the level of capacity building in agro-processing in sub-Saharan
Africa is low with the focus being on production extension. This partially explains
the high percentage of postharvest losses apart from lack of appropriate logistics
and storage capacity. Public R&D has also focused on production and prioritizing
investments in agricultural research extension but not in postharvest and food
technology. Most ongoing agricultural operations in Africa (especially at the
small-medium farmer level) continue to be focused on production aspects with no
forward linkages. And, in most cases, agro-processing at the rural level in Africa
ranges from nonexistent to very basic, because access to
agro-processing technologies is limited by a lack of expertise and by
unaffordable costs. Besides, due to poor infrastructure, production factors such as
water, electricity, and diesel-petrol are either not available or very expensive. The
high costs of these production factors affect the availability, quality, and cost of
other key inputs like packaging materials in the agro-industry.
Further, accessing technologies is not always affordable because taxation systems
in many African countries inflate the cost of imported agro-industry
equipment. There is also a challenge in incorporating certification systems that
could fulfill local and regional requirements in the first phase and regional and
international requirements at a later stage if the final target is the export market.
A typical African farmer has no expertise in this area because the priority so far has been
simple production.
Africa’s business environment is also characterized by limited financial
resources, which has direct implications for industrial development. Commercial
banks lend at very high interest rates, which are unaffordable for many small and medium
entrepreneurs. These financial constraints are further magnified when start-up
businesses in agro-industry have to be serviced. Many African countries still rank
very low on ease of doing business. This can in some cases
deter foreign agro-processing investors and make it difficult to access technologies
and equipment. Licensing, business start-up costs, trade procedures, and
the time required are worse in sub-Saharan Africa than in other developing
regions.
Overcoming these challenges for successful agro-industrialization requires
carefully chosen policy strategies. The solution to this problem must start with the
policy environment recognizing that appropriate infrastructure together with
capacity building are the key pillars that can successfully decrease postharvest
losses and serve as an initial trigger for attracting private sector investments. Road
and market infrastructure is also important as they provide critical linkages for
connections and transactions between value chain participants besides the other
rural functions that they perform that indirectly support the development of the
value chain. While roads are useful for value chains, they must connect agricultural
areas with competitive advantage to strategic markets.
Similarly, more infrastructure for production (irrigation schemes, dams) is
needed in SSA to increase production, make it more cost-effective, and meet
the volume and quality demands of the agro-industry. The needed policy strategy
must consider strengthening market intelligence and market linkages and make
them sustainable, especially in rural areas. An enabling environment must also be
established for developing the value chain through policies, regulations, and sup-
porting institutions. To facilitate increased private sector engagement, greater
clarity is needed on the evolving and expected roles of the public and private
sectors. Public–private partnerships can support the development of agriculture
value chains, but require significant inputs to identify opportunities and imple-
mentation arrangements.
Extension support services also need to be closer to a business development
model than the traditional agricultural extension model; they should also be able to
bring the market and value addition needs to the farmer and the small-medium
agro-processor level. Farmers’ associations and cooperatives based on the scale
economy could also overcome the gaps that individual farmers cannot. However,
the challenge may be how to promote and support them in a sustainable way and
how to equip them with a comprehensive tool package (finance and marketing
services, technical and managerial skills, extension services) that could make them
competitive enterprises.
Access to credit is a key requirement for all participants in a value chain, just as
access to timely market information, such as prices, is essential for a functioning
value chain. This helps participants such as producers to respond to
changes in market prices and improves their negotiating powers with traders and
processors. The creation of free trade areas at the regional level can help overcome
problems when local equipment is required, but still the challenge is how to make
international technology available and affordable without undermining the potential
emergence of local technology providers.
14.4.3 Limitations and Suggestions for Further Studies
The limitation of our study is associated with the fact that the findings may be
affected by the quality of the data used. Specifically, nonavailability of data on
many variables and missing data reduced the number of countries used for the
analysis. A more precise estimate may be obtained by a study that uses datasets
with improved quality.
14.4.4 Conclusion and Recommendations
This paper investigated how agro-processing and agricultural raw
material exports can be used effectively to improve the productivity of agriculture in
SSA. Our findings lead to the conclusion that, while intensifying efforts to export
raw agricultural materials leads to decreased productivity growth in agriculture,
increasing agro-processing activities leads only marginally to improved agricultural
productivity growth, suggesting that agro-industrial activities are locked in a low-level
equilibrium.
To overcome the challenges associated with agro-industrialization and improv-
ing the value of agricultural exports thereby improving agricultural productivity
growth, there is a need for a policy, regulatory, and institutional framework across
countries in the region that enables agro-industrial development to become stronger;
creating opportunities for increased private sector engagement including through
the formation of public–private partnerships for developing synergies; providing
access to credit for participants along the agricultural value chain; providing rural
infrastructure that reduces postharvest losses and transport costs and shortens transit
time while increasing overall rural mobility; supporting innovations and technology
for developing competitive value chains; providing access to value-responsive
markets; providing access to timely information to improve bargaining powers;
establishing organizations to reduce transaction costs; and including women, poor,
and/or marginal groups into value chains. This strategy will have optimal results if
it concomitantly and yearly increases agro-industrial activities and decreases agri-
cultural raw material exports by 2.5% from their existing values.
Appendix 1: Model Derivation
In deriving TFP data as Solow residuals, the aggregate agricultural production
function was conceived as
\[ Y_{it} = A_{it} K_{it}^{\beta_k} L_{it}^{\beta_l} \tag{14.5} \]
where \(Y\) is aggregate output, \(K\) is the vector of capital inputs, \(L\) is the labor input, and
\(A\) is the Hicks-neutral efficiency level.
While \(Y\), \(K\), and \(L\) are all observed by the econometrician, \(A\) is not.
Taking the natural logarithm of Eq. (14.5) yields:
\[ y_{it} = \beta_{0i} + \beta_k k_{it} + \beta_l l_{it} + \varepsilon_{it} \tag{14.6} \]
where lower case letters refer to the natural logarithms of the respective variables and
\(\ln(A_{it}) = \beta_{0i} + \varepsilon_{it}\). Here \(\beta_{0i}\) measures mean productivity, which varies over countries, and
\(\varepsilon_{it}\) is the time-specific deviation from that mean. When \(\varepsilon_{it}\) is decomposed into a
predictable and an unpredictable component, Eq. (14.6) becomes:
\[ y_{it} = \beta_{0i} + \beta_k k_{it} + \beta_l l_{it} + v_{it} + u^{q}_{it} \tag{14.7} \]
where \(\omega_{it} = \beta_{0i} + v_{it}\) represents sector-specific productivity and \(u^{q}_{it}\) is an i.i.d. error
term, representing unexpected deviations from the mean due to measurement error or
other unexpected circumstances. The task is to estimate Eq. (14.7) and solve for \(\omega_{it}\).
TFP can then be calculated by exponentiating \(\omega_{it}\) and then expressing it as a
function of its relevant determinants such as:
\[ TFP = g(X), \tag{14.8} \]
where X is a vector of TFP determinants.
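The residual calculation behind Eqs. (14.5) to (14.7) can be illustrated with a short round trip: generate log output from a Cobb-Douglas technology with a known efficiency level, then recover that level by subtracting the input contributions and exponentiating. The elasticities and input levels below are hypothetical, the function names are illustrative, and the sketch ignores the error term and the estimation problems discussed in this appendix.

```python
import math


def log_output(efficiency, capital, labor, beta_k, beta_l):
    """Log of Y = A * K**beta_k * L**beta_l."""
    return math.log(efficiency) + beta_k * math.log(capital) + beta_l * math.log(labor)


def tfp_residual(y, k, l, beta_k, beta_l):
    """TFP in levels: exponentiate log output net of the input contributions."""
    return math.exp(y - beta_k * k - beta_l * l)


# Hypothetical values for a single country-year.
beta_k, beta_l = 0.4, 0.6
true_efficiency = 1.5
capital, labor = 200.0, 50.0

y = log_output(true_efficiency, capital, labor, beta_k, beta_l)
recovered_tfp = tfp_residual(y, math.log(capital), math.log(labor), beta_k, beta_l)
```

With estimated rather than known elasticities, the recovered value also absorbs the unpredictable error component, which is one reason the estimator used for the elasticities matters.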

Estimation of Eq. (14.7) by OLS on panel data from continuing
firms or countries faces three particular difficulties: multicollinearity, selection bias, and
simultaneity bias. An endogeneity or simultaneity bias arises because
investments in inputs are likely to be correlated with past productivity shocks.
Specifically, endogeneity occurs because productivity is known to profit-maximizing
firms (but unknown to the econometrician) when they choose their input
levels (Marschak and Andrews 1944). Production units will increase their use of
inputs in response to positive productivity shocks. Under this condition, any
unobserved shock to productivity that raises output could indirectly raise investment
in inputs, inducing a correlation between the explanatory variables and the
error term in the productivity equation. Moreover, if no allowance is made for entry
and exit owing to productivity shocks, a selection bias will emerge (Van Beveren
2012). The implication is that the production elasticities of the observed
factors are not identified because the components of the compound error, \(v_{it}\) and \(u^{q}_{it}\), are not identically and
independently distributed. Therefore, OLS parameter estimates of the production function
will be biased. Specifically, input coefficients will be biased upward
if there is serial correlation in the productivity shock \(\omega_{it}\) (Petrick and Kloss 2013). This
effect will be stronger the easier it is to adjust input use in response to productivity
shocks.
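The upward bias described above can be seen in a small simulation. The sketch below is a simplified, static illustration and not the chapter's estimator: labor responds contemporaneously to a productivity shock that the econometrician does not observe, and a one-regressor OLS slope is then compared with the true elasticity. All parameter values and function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def estimated_labor_elasticity(response, true_beta_l=0.6, n_obs=5000, seed=0):
    """OLS labor coefficient when log labor reacts to productivity with the given strength."""
    rng = random.Random(seed)
    labor, output = [], []
    for _ in range(n_obs):
        omega = rng.gauss(0.0, 1.0)                 # productivity, unobserved by the econometrician
        l = response * omega + rng.gauss(0.0, 1.0)  # input choice reacts to productivity
        y = true_beta_l * l + omega + rng.gauss(0.0, 0.1)
        labor.append(l)
        output.append(y)
    return ols_slope(labor, output)


# No response: OLS is close to the true value of 0.6.
slope_exogenous = estimated_labor_elasticity(response=0.0)
# Labor responds to productivity: OLS overstates the elasticity.
slope_endogenous = estimated_labor_elasticity(response=0.5)
```

In this design the probability limit of the OLS slope is the true elasticity plus cov(l, ω)/var(l), which is 0.6 + 0.5/1.25 = 1.0 when the response parameter is 0.5, so a stronger input response produces a larger bias.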
Several approaches have been proposed to overcome these problems. Arellano
and Bond (1991) suggest an instrumental variable-based estimator. Within (fixed effects)
estimators have also been employed in studies on the productivity of R&D investments. Olley
and Pakes (1996) developed a semi-parametric estimation algorithm using investment
and age as proxies for productivity. Levinsohn and Petrin (2003) extended
Olley and Pakes’ (1996) semi-parametric estimator by using materials as an alternative
proxy to investment. However, the shortcoming of the fixed effects estimator is
that it overcomes the simultaneity problem only if we are willing to assume that the
unobserved, firm-specific productivity is time invariant (Yasar et al. 2008).
Moreover, the within and difference estimators may remove too much variance from
the data and render the estimation impracticable. The strength of Olley and Pakes’
(1996) algorithm is that it explicitly takes both the selection and simultaneity
problems into account by taking cognizance of the idiosyncratic productivity shocks
and exit behavior of the production unit. In this model, a firm is assumed to maximize
the expected discounted value of net cash flows (Van Beveren 2012). The investment and
exit decisions will depend on the firm’s perception of the distribution of the future
market structure given the information currently available. To achieve consistency, a
number of further assumptions are made. First, the productivity of the firm is
assumed to be the only unobserved state variable, evolving as a first-order Markov process.
Second, a monotonicity assumption is imposed on the investment variable to
ensure invertibility of the investment demand function. That is, investment must be increasing
in productivity, conditional on the values of all the state variables. Consequently,
only nonnegative values of investment can be used in the analysis. Moreover, if
industry-wide prices are used to deflate the inputs and output measured in value terms
to proxy their respective quantities, it is implicitly assumed that all firms in the
industry face common prices (Ackerberg et al. 2007).
Overall, the investment decision will depend on capital and productivity as:
\[ i_{it} = i_t(k_{it}, \omega_{it}) \tag{14.9} \]
where lower case letters represent the logarithmic transformation of variables. If we
assume that investment is strictly increasing with respect to productivity, conditional
on capital, the investment decision can be inverted to allow the expression of
the unobserved productivity as a function of the observables such that:
\[ \omega_{it} = h_t(k_{it}, i_{it}) \tag{14.10} \]
where \(h_t(\cdot) = i_t^{-1}(\cdot)\).

Given this understanding, Eq. (14.7) can be written as:
\[ y_{it} = \beta_0 + \beta_l l_{it} + \beta_k k_{it} + h_t(i_{it}, k_{it}) + u^{q}_{it} \tag{14.11} \]
Next, if we define the function \(\varphi_t(i_{it}, k_{it})\) as follows:
\[ \varphi_t(i_{it}, k_{it}) = \beta_0 + \beta_k k_{it} + h_t(i_{it}, k_{it}) \]
then Eq. (14.11) can be rewritten as:
\[ y_{it} = \beta_l l_{it} + \varphi_t(i_{it}, k_{it}) + u^{q}_{it} \tag{14.12} \]
Estimation of Eq. (14.12) proceeds in two stages (Olley and Pakes 1996). In the
first stage, log output (value added) is regressed on log labor and a
polynomial function of investment and capital (\(i\) and \(k\)) to obtain a consistent
estimate of the labor elasticity parameter and of \(\varphi_t(i_{it}, k_{it})\), the combined effect of
capital and the efficiency or productivity level. The estimated coefficients on labor
and other freely variable inputs are expected to be lower than under OLS, since the
procedure corrects the simultaneity bias, which also biases the capital coefficient
downward (Hall et al. 2007; Van Beveren 2012).
The second stage of the estimation process, which recovers the coefficient on
the capital variable, exploits the information on firm dynamics. Specifically,
productivity is assumed to follow a first-order Markov process, that is,
\[ \omega_{it+1} = E(\omega_{it+1} \mid \omega_{it}) + \xi_{it+1} \]
where \(\xi_{it+1}\) represents the news component, assumed to be uncorrelated with
productivity and capital in period \(t + 1\). Firms will continue to operate provided
their productivity levels exceed a lower bound \(\underline{\omega}_{it+1}\), that is,
\(\chi_{it+1} = 1\) if \(\omega_{it+1} \ge \underline{\omega}_{it+1}\), where \(\chi_{it+1}\) is a survival indicator variable. Because
the news component \(\xi_{it+1}\) is correlated with freely variable inputs,
labor and other freely variable inputs are subtracted from output. Therefore, the
analysis considers the expectation:
\[ E[y_{it+1} - \beta_l l_{it+1} \mid k_{it+1}, \chi_{it+1} = 1] = \beta_0 + \beta_k k_{it+1} + E[\omega_{it+1} \mid \omega_{it}, \chi_{it+1} = 1] \]
The second stage of the estimation algorithm is then derived by using the law of
motion.
In contrast to Olley and Pakes’ (1996) decision to use investment as a proxy for
productivity, Levinsohn and Petrin (2003) relied on intermediate inputs as the proxy.
In addition, their estimator does not correct for selection bias.
In our study, a hybrid Olley and Pakes (1996) and Levinsohn and Petrin (2003)
estimator was implemented. Specifically, the model is similar to the Olley and
Pakes (1996) estimator in employing investment as a proxy for productivity.
It resembles Levinsohn and Petrin (2003) in that it does not correct for selection
bias. The latter is consistent with the aggregate nature of the data used.
Appendix 2: Data Summary Statistics
See Table 14.4.
Table 14.4 Summary statistics of the data

Country / variable         Mean         Std. dev.    Min          Max
Benin Rep.
  Agricultural GDP         2506.621     546.5719     1494.044     3162.646
  Raw materials export     2.70e+08     1.08e+08     2,983,042    4.21e+08
  TFP                      1.008466     0.1942747    0.7939172    1.404898
Burkina Faso
  Raw materials export     1.68e+08     1.04e+08     2.53e+07     3.62e+08
  TFP                      0.9114271    0.2677091    7.18e−16     1.142234
Madagascar
  Raw materials export     2.32e+07     1.34e+07     3231.464     5.87e+07
  TFP                      0.988527     0.472788     0.8466374    1.074218
Ghana
  TFP                      3.76258      9.772114     0.6393817    34.79007
Mali
  TFP                      0.9706709    0.0782413    0.8783707    1.109699
Togo
  TFP                      5.70297      20.34471     0.2488871    92.08675
Kenya
  Agricultural GDP         9471.444     1797.619     6628.193     11,837.5
  TFP                      0.948233     0.20344      0.0231489    1.125005
Nigeria
  Agricultural GDP         42,000.04    9416.445     25,909.01    57,168.83
  Raw materials export     3.49e+07     6.54e+07     1,108,543    2.60e+08
  TFP                      1.018321     0.1350183    0.695404     1.195999
References
Adams NA (1973) A note on trade as a handmaiden of growth. Econ J 83(329):210–212

AfDB (African Development Bank) (2008) Key challenges in agro-industrial development in
Africa. Paper presented at Global Agribusiness Forum, New Delhi, April
Ackerberg D, Benkard CL, Berry S, Pakes A (2007) Econometric tools for analyzing market
outcomes. In: Heckman JJ, Leamer EE (eds) Handbook of econometrics, vol 6A.
Elsevier/North-Holland, Amsterdam, pp 4171–4176
Aghion P, Howitt P (1992) A model of growth through creative destruction. Econometrica 60
(2):323–351
Anderson K, Masters W (eds) (2009) Distortions to agricultural incentives in Africa. The World
Bank, Washington, DC
Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and
an application to employment equations. Rev Econ Stud 58(2):277–297
Agricultural Science and Technology Indicators (ASTI). Internationally comparable data on
agricultural R&D investments and capacity for developing countries. International Food policy
Research Institute, Washington, DC. Available at: http://www.asti.cgiar.org
Akande OR (2012) Do domestic food producers in food deficit countries benefits from
international trade: evidence from five rice markets in West African countries. Paper
presented at the 86th Agricultural Economists Conference, Warwick, UK, 16–18 April.
Available at: https://ideas.repec.org/p/ags/aesc12/134725.html
Binswanger H, Townsend R (2000) The growth performance of agriculture in sub-Saharan Africa.
Am J Agr Econ 85:1075–1986
Crammer C (1999) Can Africa industrialize by processing primary commodities? The case of
Mozambican cashew nuts. World Dev 27(7):1247–1266
Davidson R, MacKinnon JG (2004) Econometric theory and methods. Oxford University Press,
New York
da Silva CA, Baker B (2007) Introduction. In: da Silva CA, Baker B, Shepherd AW, Jenane C,
Miranda-Da-Cruz S (eds) Agro industries for development. Food and Agriculture Organization
(FAO) and United Nations Industrial Development Organizations (UNIDO), Rome
Edwards S (1992) Trade orientation, distortions and growth in developing countries. J Dev Econ
39(1):31–57
Edwards S (1998) Openness, productivity and growth: what do we really know? Economic Journal
108(2):383–398
FAO (Food and Agriculture Organization) (1997) “The state of food and agriculture 1997”, FAO
agriculture series No 30. FAO, Rome
Fuglie KO, Rada NE (2013) Resources, policies, and agricultural productivity in Sub-Saharan
Africa, ERR-145. US Department of Agriculture, Economic Research Service, February
Griliches Z (1998) R&D and productivity: the econometric evidence. University of Chicago Press,
Chicago
Grossman G, Helpman E (1991) Innovation and growth in the global economy. MIT Press,
Cambridge, MA
Hall BH, Lotti F, Mairesse J (2007). Employment, Innovation, and Productivity: Evidence from
Italian Microdata. NBER Working Paper No. 13296, August 2007
Hirschman AO (1958) The strategy of economic development. Yale University Press, New Haven
Kharas H (2007) Trend and issues in development aid. Wolfensohn Center for Development
working paper 1. Brookings Global Economy and Development
Krueger AO (1978) Foreign trade regimes and economic development: liberalization attempts and
consequences. Ballinger, Cambridge, MA
Levinsohn J, Petrin A (2003) Estimating production functions using inputs to control for
unobservables. Rev Econ Stud 70(2):317–342
Love J (1983) Concentration, diversification and earnings instability: some evidence on
developing countries’ exports of manufactures and primary products. World Dev 11(9):787–
793
Lussier M (1993) Impact of exports on economic performance: a comparative study. Journal of
African Economics 2(1):106–127
Marschak J, Andrews WH (1944) Random simultaneous equations and the theory of production.
Econometrica 12:143–205
Nkamelu GB (2013) Extensification versus intensification: revisiting the role of land in African
agricultural growth. Proceedings of African economic conference, Johannesburg, South Africa,
28–30 October. Available at: http://www.afdb.org/en/aec-2011/papers/extensification-versus-
intensification-revisiting-the-role-of-land-in-african-agricultural-growth/
Olley GS, Pakes A (1996) The dynamics of productivity in the telecommunications equipment
industry. Econometrica 64(6):1263–1297
Petrick M, Kloss M (2013) Identifying factor productivity from micro-data: the case of EU
agriculture. Factor markets working paper No. 34, January
Pinstrup-Andersen P (2000) Food policy research for developing countries: emerging issues and
unfinished business. Food Policy 25:125–141
Ramachandran MK (2009) Economics of agro-based industries: a study of Kerala. Mittal
Publications, New Delhi
Romer P (1990) Endogenous technological change. J Polit Econ 98:S71–S102
Strawson T, Beecher J, Hills R, Ifan G, Knox D, Lonsdale C, Osborne A, Tew R, Townsend I,
Walton D (2015). Improving ODA allocation for a post-2015 world: targeting aid to benefit the
poorest 20% of people in developing countries. United Nations Development Initiatives and
UK Aid
UNDP (2011). Towards human resilience: sustaining MDG progress in an age of economic
uncertainty. United Nations Development Programme Bureau for Development Policy, New
York
Van Beveren I (2012) Total factor productivity estimation: a practical review. J Econ Surv
26(1):98–128
Yasar M, Raciborski R, Poi BP (2008) Production function estimation in Stata using the Olley and
Pakes method. Stata J 8:221–231
Chapter 15
Determinants of Service Sector Firms’
Growth in Rwanda
Eric Uwitonze and Almas Heshmati
Abstract The service sector is an avenue for economic transformation as not all
countries have a competitive edge in manufacturing. Findings from micro-level
research on the service sector confirm that ICT integration, a firm’s age, the education
of the owner, the boss’ attitude, family business, networks, new processes, major
improvements, market share, on-the-job training and know-how significantly and
positively increase the probability of a firm’s growth. Even though the growth rate
of services is currently impressive in the Rwandan economy, the determinants of
the growth of firms in the service sector have not been investigated. This
paper studies the development of services in Rwanda’s economy over the years in
detail and empirically estimates its determinants using econometric methods.
The empirical results are based on micro-data collected by the
Rwanda Enterprise Survey (2011) and the 2014 Establishment Census. The survey
has data on 241 firms and establishments. Linear and limited dependent variable
techniques are employed to investigate the factors behind the development of
service firms. Models are specified and estimated to assess the factors contributing
to sales growth, innovations, and turnovers of service firms. The results show that
the key factors driving the development of service firms in Rwanda include access
to credit, application of ICT, availability of skilled labor, employee development
and acquisition of fixed assets. The results suggest that the government should
support the use of ICT in all service firms, promote access to finance for new service
firms and promote on-the-job training in service firms to speed up Rwanda’s shift
from a low-income to a middle-income state.
E. Uwitonze
Ministry of Gender and Family Promotion (MIGEPROF), MIGEPROF,
Single Project Implementation Unit, Kigali, Rwanda
e-mail: uwitonzeric@gmail.com
A. Heshmati (✉)
Jönköping International Business School (JIBS),
Jönköping University, Jönköping, Sweden

DOI 10.1007/978-981-10-4451-9_15
Keywords Limited dependent variables · Services · Openness · Growth · East Africa · Rwanda
JEL Classification Codes C35 · F13 · G29 · O47 · O55
15.1 Introduction
15.1.1 Background
As per the 2014 Rwanda Services Policy Review, the service sector was the largest
and most dynamic sector in the Rwandan economy. The Rwandan service sector is
subdivided into two broad categories: trade and transport services, and other
services. Trade and transport services include maintenance and repair of motor
vehicles, wholesale and retail trade and transport services. Other services include
hotels and restaurants; information and communication; financial services; real
estate activities; professional, scientific and technical activities; administrative
and support services; public administration and defense; compulsory social
security; education services; human health; social work services; and cultural,
domestic and other services.
The service sector spearheaded strong economic growth: by 2015 it accounted for
the largest share of GDP, 47%, as compared to 33% for the primary sector
(agriculture, forestry and fisheries), while the growth of services was impressive
at around 9% by 2014 against 7% for industry and 4% for agriculture. Trade and
transport services contributed 159 billion RWF¹ to GDP in 1999, which increased to
784 billion RWF in 2014; within this, wholesale and retail trade accounted for
615 billion RWF in 2014 against 133 billion RWF in 1999. Other services, including
hotels and restaurants, information and communication and financial services,
increased their contribution to GDP from 430 billion RWF in 1999 to 1505 billion
RWF in 2014. The service sector’s total contribution grew to 2290 billion RWF in
2014 as compared to 563 billion RWF in 1999. Loans to the service sector authorized
by the central bank increased from 1.5 billion RWF in 2010 to 12 billion RWF in
2014. All these statistics are at fixed 2011 prices and suggest increased attention
and public support for the service sector’s development.
The Doing Business in Sub-Saharan Africa Report (2013–2014) ranked Rwanda
second after Mauritius, and its service sector received a big share of foreign
private investments. As a matter of fact, ICT received 41.4% of foreign private
investments and tourism 12.8%, while mining received 13.8%, manufacturing 10.8%
and other sectors a significant 21.7%. Meanwhile, as documented in the Rwandan
Vision 2020 document, the service sector is believed to be the engine for Rwanda’s
economy with a growth rate of 13.5% and a contribution of 42% to GDP.
¹ USD 1 = 746 RWF on 9 March 2016.
15.1.1.1 The Rwandan Service Sector’s Development and Growth
According to the Rwandan Integrated Household Living Conditions Survey
(EICV4), the number of private and business-oriented mixed establishments
increased by 24% between 2011 and 2014, with individual service sub-sectors
showing both rises and falls. Increases were found in wholesale and retail trade
and repair of motor vehicles and motorcycles (21%); accommodation and food
service activities (34%); transport and storage (7%); professional, scientific and
technical activities (3.9%); administrative and support services (23.1%); health
and social work activities (33.1%); art, entertainment and recreation (31.0%);
financial and insurance activities (18.4%); private forms of education (0.6%); and
other service activities (32.0%), whereas falls were recorded in information and
communication (−28.3%) and real estate activities (−76.5%).
Employment in private and business-oriented establishments increased by 34.5%
between 2011 and 2014. Among the service sub-sectors, large increases were
recorded in administrative and support activities (268.3%); financial and insurance
activities (81.2%); transport and storage (54.9%); arts, entertainment and
recreation (67.7%); health and social work activities (50.2%); accommodation and
food service activities (37.7%); and wholesale and retail trade and repair of motor
vehicles and motorcycles (28.7%).
By 2020, the contribution of services is projected to be 57% of GDP as compared
to 24% for agriculture and 19% for industry. As per EICV4, the service sector was
the biggest contributor to GDP with 2536 billion RWF in 2013 compared to
774 billion RWF for industry and 1785 billion RWF for agriculture in the same
year. This reflects the transition of the Rwandan economy toward a service-based
one. This is also evidenced by the change in the shares of economic sectors in GDP
from 1970 to 2010. In 1970, agriculture led the other sectors with a 55.9% share in
GDP compared to a 19% share for industry and 25.0% for services. Since 2000, the
service sector has been leading with a contribution of 45.6% to GDP, rising to
49.7% in 2010 and 53.3% in 2013.
15.1.1.2 Developing Services by Economic Activity
The distribution of businesses by economic activity shows that the service sector
achieved positive growth in both rural and urban areas. The main sub-sectors in the
service sector that showed more than 30% growth include accommodation and food
services; human health and social work activities; and art, entertainment and
recreation activities. According to Singh and Kaur (2014), rapid urbanization is
a key factor contributing to the growth of services, which leads us to analyze
the growth of the service sector in urban and rural areas in 2011–2014.
Accommodation and food service activities showed the greatest growth. In rural
areas they had 26,190 registered establishments in 2011 and 36,545 in 2014, a 40%
increase, whereas in urban areas 7095 establishments were registered in 2011 and
8076 in 2014, a 13.8% increase. Across private establishments and the
business-oriented mixed sector, the average growth of the accommodation and food
services sub-sector was 34% between 2011 and 2014: 33,285 accommodation and food
establishments were registered (out of 119,270) in 2011 and 44,621 (out of 148,376)
in 2014. The accommodation and food services sub-sector is clearly growing faster
in rural areas than in urban areas, and the growth of this sub-sector contributed
to the overall growth of the service sector (NISR 2014).
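The growth rates quoted for accommodation and food services can be re-derived from the establishment counts given in the text. The short Python sketch below is only an arithmetic check; the helper name `pct_change` is our own and is not taken from NISR or the authors.

```python
def pct_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100.0


# Registered accommodation and food service establishments (2011, 2014),
# as quoted in the text.
counts = {
    "rural": (26_190, 36_545),
    "urban": (7_095, 8_076),
    "all areas": (33_285, 44_621),
}

growth = {area: pct_change(y2011, y2014) for area, (y2011, y2014) in counts.items()}

for area, rate in growth.items():
    print(f"{area}: {rate:.1f}% growth between 2011 and 2014")
```

The rural figure comes out at about 39.5%, which the text rounds to 40%; the urban and overall figures match the quoted 13.8% and 34%. The rural and urban counts also sum to the totals for all areas in both years.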
As stated by Latha and Shanmugam (2014), advancement of the service sector is
correlated with the expansion of quality health services indicated by complete
physical, mental and social well-being and not just the absence of diseases and
ailments. An analysis of the service sector’s development in Rwanda shows that
human health and social work activities grew by a notable 33.1%. In rural areas,
83 human health and social work establishments were registered in 2011 compared
to 167 in 2014, a 101% increase over the period. In urban areas, 261 establishments
in human health and social work activities were registered in 2011 as compared to
291 in 2014, an 11.5% increase. The growth of establishments in human health and
social work activities was thus more than eight times higher in rural areas than in
urban areas from 2011 to 2014. This supports the view that the growth of the
service sector is linked to high growth in its sub-sectors, particularly in rural
areas.
Though wholesale and retail trade and repair of motor vehicles and motorcycles
are not among the fastest growing service sub-sectors, they are worth analyzing
since they have the lion’s share of the service sector. In wholesale and
retail trade, there was an average increase of 7% by 2014 while the motor vehicles
and motorcycle repair sub-sector had a 37% increase in rural areas as compared to a
7% increase in urban areas. This was a result of 30,708 establishments registered in
2011 going up to 42,101 establishments registered in 2014 in rural areas as
compared to urban areas where 33,968 establishments were registered in 2011 and
36,352 in 2014. Generally, rural areas spearhead economic activity in the
service sector. Figure 15.1 shows the remarkable growth of two service sub-sectors:
accommodation and food activities, and wholesale trade and repair of motor vehicles
and motorcycles.
15.1.1.3 Employment Growth in the Service Sector
A growing body of literature supports using employment to measure the growth of firms
since it reflects both short-term and long-term changes (Isaga 2015). In keeping
with this thinking, this section gives a descriptive analysis of employment in the
service sector in Rwanda.
According to the Establishment Census (2014), the service sector employed
401,173 workers or 81.3% of the total workers. The biggest service sub-sectors in
terms of the number of people employed included wholesale and retail trade, repair
of motorcycles and motor vehicles (with 120,482 employees equivalent to 24.4% of
the total employment), followed by education employing 83,569 (16.9% of the total
employment) and accommodation and food service activities having 82,213
employed people (16.7% of the total employment). These sub-sectors supported the
growth of the service sector since they provided more jobs as compared to other
economic sectors.
Fig. 15.1 Economic activities of private and business-oriented mixed establishments according to
urban/rural areas (2011 and 2014). Source NISR’s Establishment Census (2014)
Men were predominant in almost all the service sub-sectors except human health
and social work activities, where they represented 47.7% of the total employed
while female workers accounted for 52.3%. A general picture of the share of
employment within the service sector shows that gender inequalities persist. Women
held only 36.8% of total employment in the service sector, while men had the lion’s
share at 63.2%. Considering women’s share in the total population of Rwanda (53% as
compared to 47% for men), there is hope that the service sector will continue to
grow if there is full participation of women in its employment. Figure 15.2
illustrates the way employment is divided across economic activities.
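The employment shares quoted above can be cross-checked against one another. The Python sketch below assumes that the 81.3% service-sector share and the sub-sector shares are all expressed relative to the same economy-wide total; that common denominator is our reading of the text, not something the Establishment Census figures quoted here state explicitly.

```python
# Figures quoted in the text from the Establishment Census (2014).
service_workers = 401_173
service_share = 0.813  # service sector share of all workers

# Assumption: sub-sector shares use the same economy-wide denominator.
implied_total = service_workers / service_share

subsector_workers = {
    "wholesale/retail trade and vehicle repair": 120_482,
    "education": 83_569,
    "accommodation and food services": 82_213,
}
shares = {name: n / implied_total * 100.0 for name, n in subsector_workers.items()}

# Gender split within service sector employment.
female_share = 36.8
male_share = 100.0 - female_share

print(f"implied total workers: {implied_total:,.0f}")
for name, share in shares.items():
    print(f"{name}: {share:.1f}% of total employment")
print(f"male share of service employment: {male_share:.1f}%")
```

Under that assumption the implied total is a little over 493,000 workers, and the recomputed shares agree with the quoted 24.4%, 16.9% and 16.7% to one decimal place.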
15.1.1.4 GDP Share of Service Sector Growth
According to the National Institute of Statistics of Rwanda (2014), the service
sector was the biggest contributor to GDP. The shift from an agriculture-based
economy to a service-led economy dates from 2004, when the annual
output in agriculture was 879 billion RWF compared to the service sector output of
882 billion RWF. Up to 2016, the service sector remained the leading sectoral
contributor to GDP growth in Rwanda.
Fig. 15.2 Distribution of number of workers and gender structure by economic activity (2014).
Source NISR’s Establishment Census (2014)
The impressive growth of the service sector was documented at around 9% by
2014 against 7% for industry and 4% for agriculture, while annual average GDP
growth was 8% by 2014. The total output of the service sector increased roughly
fourfold from 1999 to 2014, from 563 billion RWF in 1999 to 2290 billion RWF in
2014 (Fig. 15.3).
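As a quick check on the "fourfold" statement, the ratio of the two output figures can be computed directly. The sketch below also derives the compound annual growth rate implied by those two endpoints; that rate is our own illustration and is not a figure reported in the chapter.

```python
# Service sector output at constant 2011 prices, in billion RWF (from the text).
output_1999 = 563.0
output_2014 = 2290.0

multiple = output_2014 / output_1999
years = 2014 - 1999

# Implied compound annual growth rate between the two endpoints
# (an illustration only; not reported in the source).
cagr = multiple ** (1.0 / years) - 1.0

print(f"output multiple 1999 to 2014: {multiple:.2f}x")
print(f"implied compound annual growth: {cagr * 100:.1f}% per year")
```

The multiple is slightly above four, and the implied average annual rate is a little under 10%, which is broadly in line with the roughly 9% service sector growth quoted for 2014.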
The service sub-sectors that contributed most include wholesale and retail trade
with a contribution of 130 billion RWF in 1999 and 615 billion RWF in 2014.
Though they showed little growth, real estate activities also contributed
substantially to the share of the service sector in GDP. In 1999, the total output
in real estate activities was 283 billion RWF, which did not grow much and amounted
to 311 billion RWF by 2014. Tremendous growth in hotels and restaurants
(accommodation and food activities) is evident in the contribution of this
sub-sector to GDP. In 1999, hotels and restaurants contributed 19 billion RWF,
which grew to 113 billion RWF in 2014. In general, the contribution of the service
sector to GDP shows that the sector has been growing since 1999. The effective
transition of the economy happened in 2004, when the service sector became the top
sector. Though the contribution of the service sector is remarkable in the Rwandan
economy, there are no studies which analyze service firms’ growth or its
determinants.
Fig. 15.3 GDP by sector activity at constant 2011 prices (in billion RWF). Source National
Institute of Statistics of Rwanda (2014)
Our study analyzes the development of service firms and their contribution to
economic growth in Rwanda. Thus, the prime purpose of our study is analyzing
trends in the expansion of services in Rwanda and pointing out the contributing
factors driving its development using survey data covering various parts of the
service sector.
The general objective of our study is to investigate factors of development of
service firms. Its specific objectives include (i) analyzing the contribution of service
firms to economic growth in Rwanda and (ii) unveiling key factors contributing to
the development of service firms in Rwanda.
The findings open up avenues for additional academic investigation of service
firms, thus contributing to the body of knowledge about the role of the service
sector in economic growth in developing countries, of which Rwanda is one.
Further, the study sheds light on Rwanda’s ambitious targets as listed in its
Vision 2020 document by giving a holistic understanding of what to concentrate on
in the service sector for the economic growth of the country.
The rest of the paper is organized as follows. The next section reviews literature
on the service sector’s development in the world, and in Rwanda in particular.
Section 15.3 on methodology discusses data, the conceptual framework and
empirical models. Section 15.4 presents the empirical results and
gives an analysis of the trends. The last section gives the conclusion, a summary of
the findings and recommendations.
15.2 Literature Review
A literature review shows that a number of researchers and international
organizations have supported the role of the service sector as a key driver in the growth of
an economy in both developing and developed countries. Recently, the United
Nations Economic Commission for Africa (UNECA) affirmed that the service
sector was an avenue for economic transformation as not all countries had a
competitive edge in the manufacturing sector (UNECA 2015). The service sector’s
development is also providing infrastructure that promotes productivity in manu-
facturing and agriculture.
15.2.1 Growth and Development of the Service Sector
The service sector’s economic development is the only way of promoting economic
structural adjustment and accelerating the transformation of economic growth
(Zhou 2015). A declining share of agricultural employment is a key feature of
economic development (Alvarez-Cuadrado and Poschke 2011); structural
transformation usually coincides with a growing role of industry and services in the
economy (UNECA 2015). The growing size of the service sector and its impact on
the other parts of the economy make it all the more important to promote efficiency
in the provision of services, thereby boosting economy-wide labor productivity, as
witnessed in OECD member countries. The slowdown in the service sector brought
down labor productivity growth in the entire economy from more than 4% in
1976–1989 to less than 2% in 1999–2004 (Jones and Yoon 2008).
Acharya and Patel (2015) confirm that the service sector is the fastest growing
sector in India, contributing significantly to GDP, economic growth, trade and
foreign direct investment (FDI) inflows as the total share of this sector to India’s
GDP is around 65%.
Singh and Kaur (2014) state that the main reasons for the growth in services in
India are rapid urbanization, expansion of the public sector and increased demand
for intermediate and final consumer services. Domestic investments and openness
also positively affect the share of the service sector in GDP, and the main service
sectors attracting FDI in India are telecommunications, construction and hotels and
restaurants. Lee and Malin (2013) say that the service sector has become the main
contributor to GDP not only in developed economies such as the US, Japan and
UK, but also in developing economies such as China, Indonesia, Pakistan and India.
Concluding their study on the determinants of innovation capacity with empirical
evidence from service firms, Madeira et al. (2014) affirm that the greater the
financial investments in the acquisition of machinery, equipment and software; in
internal research and development; in acquisition of external knowledge; and in
marketing activities and other procedures, the greater the propensity of firms to
innovate in terms of services.
According to Park and Shin (2012), the general wisdom is that when a country
industrializes, the shares of the industry and service sectors in both GDP and
employment increase whereas the share of agriculture falls; when a country
de-industrializes and moves into the post-industrial phase, the share of services
increases while the shares of both industry and agriculture fall. They found that
when computing the contributions of agriculture, industry and services to GDP
growth, in general the service sector made the biggest contribution. Further, the
lower the per capita GDP, the greater the scope for labor productivity growth in
the service sector, which implies that there is still a lot of room for growth in the
productivity of services. Buera and Kaboski (2009) argue that as productivity
grows, individuals consume new services. Eventually, labor productivity increases
enough to make the absolute cost advantage of market production smaller, which
leads individuals to home-produce customized versions of services that yield
higher utility.
In the early 1980s, Fuchs (1980) argued that the decline in agriculture was
attributable primarily to differences in income elasticity of demand while the shift
from industry to services was attributable primarily to differential rates of growth of
output per worker. Economic growth also contributes to an increase in service
employment through an increase in female labor force participation because fam-
ilies with working wives tend to spend a higher proportion of their incomes on
consuming services.
15.2.2 Productivity of Service Sector Firms
Sahu (2015) analyzed micro-data on service sector companies to assess whether
better factor allocation led to high growth in total factor productivity (TFP). He
found that a reduction in the misallocation of resources in the service sector
resulted in an accelerated pace of TFP growth, and that the communication and
community service industries registered the fastest growth in terms of moving
toward efficient TFP levels. Acharya (2016) asks what accounts for the exceptional
TFP growth performance of some ICT-using industries, noting that productivity
gains in the production of ICT are given as an answer in the US and in the
Organisation for Economic Co-operation and Development (OECD) countries. Van
der Marel and Shepherd (2013) confirm that ICT capital and legal institutions are
particularly important determinants of a country’s ability to successfully export
services. Further, the tradability indices are strongly correlated with important
factors such as country productivity and size, factor endowments, trade costs and
regulatory measures.
Geishecker and Görg (2013) claim that measuring both service and material
off-shoring is not straightforward and is greatly limited by the lack of coherent
and comparable data on such activities. Thus, trade economists usually revert to
measuring trade in intermediates as a proxy. In addition, they assessed the impact
of off-shoring activities in an industry on individual wages, which are
conceptualized as average hourly gross labor earnings including bonuses, premiums
and other extra payments. The explanatory variables are demographic and human
capital variables including age and age squared; dummies for the presence of
children and being married; job tenure and tenure squared; a high education
indicator; dummies for occupation; dummies for firm size; and regional dummies.
Their results show that workers in industries with increasing levels of off-shoring
services were likely to experience a reduction in their wages. They conclude that
what would have been considered a perfect case of spillovers from ICT using
conventional methods is in fact the impact of research and development and other
intangible capital.
Madeira et al. (2014) investigated the main determinants of innovation in the
service sector. They found the logit model to be appropriate for measuring the
direct and indirect effects of a selected set of explanatory variables on the
innovation capacity of Portuguese service firms. They point out the existence of
several factors that stimulate or limit the innovation capacities of firms, such as
investments in innovation activities, firm size and the service sub-sector in which
the firm is active.
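For readers unfamiliar with the logit model mentioned above, the following Python sketch shows how such a model maps explanatory variables to a probability, here the probability that a firm innovates. It is a minimal illustration under stated assumptions: the function name, the two variables and all coefficient values are hypothetical and are not estimates from Madeira et al. (2014) or from this chapter.

```python
import math


def logit_probability(intercept, coefficients, features):
    """Return P(y = 1 | x) under a logit model: 1 / (1 + exp(-(a + b.x)))."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values for illustration only (not estimates from any cited study).
intercept = -1.0
coefficients = [0.8, 0.3]  # e.g. invests in innovation (0/1), log firm size

baseline = logit_probability(intercept, coefficients, [0.0, 2.0])
with_investment = logit_probability(intercept, coefficients, [1.0, 2.0])

print(f"P(innovate), no investment:   {baseline:.3f}")
print(f"P(innovate), with investment: {with_investment:.3f}")
```

Because the logistic function is nonlinear, the effect of a variable on the probability depends on the values of the other variables, which is why applied studies usually report marginal effects rather than raw coefficients.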
Many research findings show that the contribution of research and development
activities is fundamental to the growth of the economic sector in any country.
Jafaridehkord et al. (2015) argue that firms benefit immensely from spending on
their human capital because this investment adds value to their companies.
Heshmati and Kim (2011) discuss the fact that a decrease in research and devel-
opment investments results in decreasing productivity growth. Schoonjans et al.
(2013) claim that the effect of knowledge networking on firm growth is signifi-
cantly larger for service firms than for manufacturing firms since it positively affects
net asset and value-added growth of service firms.
According to Du and Temouri (2015), firms in both manufacturing and service
sectors are likely to become high-growth firms (HGF) when they exhibit higher
TFP. The TFP growth model shows that openness to foreign companies and the
world economy, restructuring the economy through a shift of resources between
sectors and the presence of foreign companies in Malaysia are major contributors to
TFP growth (Jajri 2008).
15.2.3 Determinants of Productivity Growth in Service Firms
Capital, labor and knowledge-based capital are key inputs in the production of
goods and services. Salehi-Isfahani (2006) claims that urban households are a
source of growth in human capital in the Middle East and North Africa (MENA)
countries. But households in that region face a state that plays a large role in
the economy, which distorts the incentive to invest in education, in the labor
market and in social norms regarding gender. As a result, households invest in an
inefficient portfolio of human capital with dire consequences for long-run growth.
The literature discusses the relevance of knowledge-based capital in a firm.
Yli-Renko et al. (2001) found that knowledge acquisition was positively associated
with knowledge exploitation for competitive advantages through new product
development, technological distinctiveness and sales cost efficiency. Corporate
entrepreneurship was positively associated with knowledge-based capital (Simsek
and Heavey 2011) and business services can have an effect comparable to the
traditional production factor only when it applies to the service sector (Drejer
2002).
A review of contemporary literature suggests that regulatory, policy and insti-
tutional environments, competition in the product market, spillovers and external-
ities and internalization and globalization are constituents of a business
environment affecting a firm’s performance.
Bouazza et al. (2015) confirm that the key factors of a business environment
affecting Algerian firms are unfair competition from the informal sector; cumber-
some and costly bureaucratic procedures; burdensome laws, policies and regula-
tions; an inefficient tax system; lack of access to external financing; and low human
resources capacity. The main internal factors responsible for unstable and limited
growth include entrepreneurial characteristics, low managerial capacity, lack of
market skills and low technological skills. Gale et al. (2015) confirm the existence
of a negative relationship between the rate of firm formation and the top income tax
rate by finding that a cut in top income tax automatically generates or necessitates
growth.
The economic growth of a country in terms of GDP growth is determined by the
real value-added growth of underlying firms. According to Pop et al. (2014), in an
economic crisis it becomes clear that smaller firms are often capable of
responding faster; they are more targeted, more flexible in the face of fluctuations
in the global economy and better able to withstand the recessionary phase.
Khan (2011) tested the important determinants of a firm’s growth. He highlights
that a firm’s age, the education of the owner, the boss’ attitude, family business,
networks, new processes, major improvements, market share, on-the-job training
and know-how significantly and positively increased the probability of a firm’s
growth. The age of the owner, foreign trade regulations, taxes, other regulations,
political instability, inflation and lack of skilled labor reduced the
probability of a firm’s growth in terms of employment opportunities. Oliveira and
Fortunato (2008) and Lenaerts and Merlevede (2015) claim that a firm’s growth is
mainly explained by the firm’s age and size.
Existing literature states that expenditure on ICT has a positive impact on
exports of producer services (Guerrieri and Meliciani 2004) and sees ICT as the
bedrock of improving business processes, customer relations and efficient delivery
of goods and services to satisfy customer needs (Atom 2013). According to
Bethapudi (2013), ICT integration provides a powerful tool that brings advantages
to promoting and strengthening the tourism industry. Mihalic et al. (2015) mention
that ICT is also becoming an important factor in business and competitiveness
because of, as discussed by Borghoff (2011), its influence on the three
sub-processes of globalization: internationalization, global network building and
global evolutionary dynamics.
As for ICT applicability in the service sector, its role is crucial in facilitating
trade (Gupta 2012). According to Liu and Nath (2013), the trade-enhancing effect of
ICT lies in its use: Internet subscriptions and Internet hosts have a significant
positive effect on both exports and imports. ICT in transport services plays a
decisive role in reducing energy consumption and CO2 emissions in the road
transport sector (Gupta 2012).
According to Agwu and Carter (2014) the use of mobile banking and automatic
teller machines (ATMs) has made financial services easily accessible and has
reduced costs for both customers and financial service providers in Nigeria.
Information technology has enabled banks to understand and serve customers better
than their competitors; they have developed and improved new products for cus-
tomers and further improved processes and relationships with customers and
business partners (Muro et al. 2013).
15.2.4 Employment and Productivity Growth in Services
Arnold et al. (2016) demonstrate the presence of a link between India’s policy
reforms in services and the productivity of manufacturing firms. They find that
banking, telecommunications, insurance and transport reforms have all had
significant effects on productivity in manufacturing firms; these effects tend to be
stronger on foreign-owned firms.
El-Said and Kattara (2013) researched the application of information technology
versus human interaction services in an Egyptian hotel. They found that customers
preferred to contact an employee rather than depending on technology-based
self-services in a majority of service encounters. In Uganda, more than 80% of the
households were employed in tourism services. Tourism employment can provide
initial capital for supplementary activities.
Heshmati and Kim (2011) came to the conclusion that the competitiveness in
Korea’s service industry can be driven by an incentive system for skilled workers
and investing more in research and development in order to increase labor pro-
ductivity. In addition, the Korean government should implement an open market
policy to liberalize labor movement and induce low paid labor to move to the
production process to a large extent.
15.2.5 Summary of the Literature Review on Service Development
Departing from the macroeconomic point of view, the growing size of the service
sector and its impact on the other parts of the economy makes it all the more
important to promote efficiency in the provision of services thereby boosting
economy-wide labor productivity (Jones and Yoon 2008). The main reasons for the
growth in services are rapid urbanization, domestic investments, openness,
expansion of the public sector and increased demand for intermediate and final
consumer services (Singh and Kaur 2014). The lower the per capita GDP, the
greater the scope for labor productivity growth in the service sector, which implies
that there is still a lot of room for growth in the productivity of services (Park and
Shin 2012).
Our microeconomic literature review indicates that the development of service
firms is mainly backed by knowledge acquisition (Yli-Renko et al. 2001),
knowledge-based capital (Simsek and Heavey 2011), on-the-job training and
know-how, skilled labor and ICT applicability (Gupta 2012). Firms benefit
immensely from spending on their human capital because this investment adds
value to their companies (Jafaridehkord et al. 2015). The effect of knowledge
networking on firm growth is significantly larger for service firms than for
manufacturing firms since it positively affects net asset and value-added growth of
service firms (Schoonjans et al. 2013).
To conclude, the literature identifies the following important determinants of
a firm’s growth: the firm’s age, the education of the owner, the boss’ attitude,
family business, networks, new processes, major improvements, market share,
on-the-job training and know-how, all of which significantly and positively
increase the probability of the firm’s growth (Khan 2011).
15.3 Methodology
15.3.1 Understanding the Key Concepts
In our study, services are conceptualized as non-agricultural and non-manufacturing
economic activities in firms operating in the Rwandan economy. National
accounting of GDP complies with the International Standard Industrial
Classification (ISIC) of all economic activities.2
Openness is conceived as an interaction with activities outside the Rwandan
service sector in terms of import and export of services, foreign direct investment
firms and acquisition of working capital externally. Yeboah et al. (2012) have
argued that the trade effect on productivity is much greater in an outwardly oriented
economy than in an inwardly oriented nation. The relationship between trade
openness and economic growth is significantly positive in developing countries
(Tahir and Azid 2015). The openness of a firm’s founders and early preparations for
growth determine both the extent of organizational learning and the speed at which
it is developed and used (Hagen and Zucchella 2014).
Growth is conceptualized as the increase in the service sector measured as
GDP. King and Levine (1993) claim that financial development is robustly corre-
lated to the future rate of economic growth, accumulation of physical capital and
improvements in economic efficiency. Growth in foreign sales contributes to a
firm’s growth if there is greater interaction among the management team’s members
and a higher degree of joint decision making among the owners and managers of
small firms (Reuber and Fischer 2002). Sustaining economic growth and improving
living standards requires shifting labor into both the manufacturing and service
sectors (Eichengreen and Gupta 2011).
2
ISIC classifies services into sections G to U as per individual categories. These sections
include wholesale and retail trade, repair of motor vehicles and motorcycles, transport and
storage, accommodation and food service activities, information and communication, financial and
insurance activities, real estate activities, professional, scientific and technical activities,
administrative and support service activities, public administration and defense, compulsory social
security, education, human health and social work activities, arts, entertainment and recreation,
other services’ activities, activities of households as employers, undifferentiated goods and
services producing activities for households for own use and activities of extra-territorial
organizations and bodies (UN 2008).
A firm’s growth is conceived as an increase in the product or service as the main
business, increase in sales, increase in the number of new employed persons and the
size of the establishment in the service sector. Smith and Verner (2006) found that
the proportion of women in top management jobs had a positive effect on a firm’s
performance and that the effect depended on the qualifications of female top
managers in Denmark. Dawkins et al. (2007) argue that both large firms and those
which are highly specialized, enjoy higher profit margins, whereas the more capital
intensive the firm the lower its profitability.
15.3.2 Performance Models
In order to investigate the determinants of service sector development, we focus on
total annual sales, innovation and turnover in service firms as
dependent variables. These are commonly used measures of performance
throughout the literature and are endogenous to firms in their decision making.
A number of hypotheses were formulated and tested. The first hypothesis was
that the service sector’s development can be investigated through total annual sales
of a firm. In the Rwanda Enterprise Survey (2011), firms were asked what the
establishment’s total sales were in 2010 and what the establishment’s total annual
sales were in the three previous fiscal years since fiscal year 2008. Thus, total sales
growth up to 2010 was used as the dependent variable. Variables that have an effect
on total sales growth are employment cost, loan size, ICT and a firm’s innovation
characteristics. The null hypothesis is that these factors have no effect on total sales
and growth rate, while the alternative hypothesis is that they have positive effects on
total sales and growth rate.
Total annual sales were measured in terms of the amount of money a firm
acquired by selling services domestically and through direct or indirect exports over
three years starting 2008. Labor utility was included in the costs incurred for
employment by a service firm. Working capital was estimated using the loan size
approved to track the role of financial institutions as channels of access to financial
service activities. ICT application was tracked by using e-mails to communicate
with clients or suppliers and the use of cell phones for the operations of an
establishment. A firm’s innovation characteristics were defined as employee
development, research and development activities, internal or external training, new
methods, new practices, new marketing strategies and new logistics.
The model for investigating the determinants of total sales growth in service
firms is constructed as:
\[
\text{Total sales growth} = f(\text{employment cost},\ \text{working capital},\ \text{ICT},\ \text{firm innovation criteria},\ \text{acquisition of fixed assets}) \tag{15.1}
\]
The second hypothesis is that the service sector’s development is reflected in its
innovations that are expressed in the introduction of new products or services. In
the Rwanda Enterprise Survey (2011), firms were asked whether they had introduced
new products or services in the last three years. The variable on the introduction
of new products or services, which is conceived as innovation, is taken as the
dependent variable. Independent variables include internal research and develop-
ment (R&D) activities, external or internal acquisition of research and development
(ext. R&D) as time given to employees in a service firm to develop or try out a new
approach or a new idea about products or services, business process, firm man-
agement, marketing, training, access to finance as illustrated by the acquisition of
fixed assets and a firm’s characteristics in term of size. The null hypothesis suggests
that these factors do not influence service innovation, while the alternative
hypothesis suggests that they have a positive effect on service innovation of new
products and services. The model to investigate the factors affecting service
innovation is structured as:
\[
\text{Service innovation} = f(\text{R\&D},\ \text{ext. acquisition of R\&D},\ \text{acquisition of training},\ \text{acquisition of fixed assets},\ \text{other firms' criteria}) \tag{15.2}
\]
The third hypothesis is that the turnover of a service firm is affected by a number
of factors such as the capital used, openness conceived as buying and selling outside
the country, the manager’s gender, paying value-added tax, paying income tax and
the service sub-sector. The turnover of a service firm is defined as the amount of
money that is received in sales. In the Establishment Census (2014), the information
collected on this variable is classified in categories where the first category includes
all firms with turnovers less than 300,000 RWF, the second category includes all
firms with turnovers ranging from 300,000 RWF to 12 million RWF, the third
category has all firms with turnovers ranging from 12 million to 50 million RWF
and the last category includes all firms with turnovers more than 50 million RWF.
This is a categorical dependent variable. Categorizing turnover leads to a loss of
information within each category; it also sheds light on differences in
performance between categories and the variations in their determinants.
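The four turnover bands described above can be written as a small lookup. This is a minimal sketch rather than the census coding itself: the function name `turnover_category` is hypothetical, and placing a value that lies exactly on a boundary in the higher band is an assumption, since the text does not say which band such values belong to.

```python
from bisect import bisect_right

# Band boundaries in RWF, taken from the four turnover categories in the text.
# Assumption: a value exactly on a boundary is placed in the higher band.
TURNOVER_BOUNDS_RWF = (300_000, 12_000_000, 50_000_000)


def turnover_category(turnover_rwf: float) -> int:
    """Return the ordinal turnover category (1 to 4) for a turnover in RWF."""
    if turnover_rwf < 0:
        raise ValueError("turnover cannot be negative")
    return bisect_right(TURNOVER_BOUNDS_RWF, turnover_rwf) + 1


# Two firms with very different turnovers fall into the same band,
# which illustrates the information lost inside a category.
small_firm = turnover_category(400_000)
larger_firm = turnover_category(11_000_000)
same_band = small_firm == larger_firm  # both are in category 2
```

The example at the end shows why a banded variable is less informative than the underlying amount: the two firms differ by more than 10 million RWF yet receive the same code.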
The first dummy variable on openness contains information on whether a firm
sells or buys goods or services abroad. The second dummy variable ‘gender’
defines whether the manager of a firm is female or male. The third dummy variable
on value-added tax (VAT) contains information on whether or not the firm pays
VAT. The fourth dummy variable has information on whether or not the firm pays
income tax. There is also a factor variable on the service sub-sector where 7 stands
for wholesale and retail trade and repair of motor vehicles and motorcycles, 8 stands
for transportation and storage, 9 stands for accommodation and food service
activities, 10 stands for information and communication, 11 stands for financial and
insurance activities and 12 stands for real estate activities. The other factor variable
‘capital’ contains information classified in categories in such a way that the first
category considers firms using less than 500,000 RWF as capital, the second using
500,000 to 15 million RWF, the third using 15 million to 75 million RWF and the
last category using capital more than 75 million RWF. Thus, this is a categorical
variable. Factors affecting change in turnover are constructed with the variables
mentioned earlier and are expressed as:
\[
\text{Turnover} = f(\text{capital used},\ \text{openness},\ \text{gender},\ \text{taxes},\ \text{service sub-sector}) \tag{15.3}
\]
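Factor variables such as the service sub-sector and the capital band are commonly entered into a regression as indicator columns with one level left out as the reference. The sketch below illustrates that encoding; the helper `dummy_code` is hypothetical, and the choice of sub-sector 7 and capital band 1 as reference levels is an assumption made for illustration.

```python
# Level codes as listed in the text.
SUBSECTOR_LEVELS = (7, 8, 9, 10, 11, 12)
CAPITAL_LEVELS = (1, 2, 3, 4)


def dummy_code(value, levels, reference):
    """Return 0/1 indicators for every level except the reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value}")
    if reference not in levels:
        raise ValueError(f"unknown reference level: {reference}")
    return [1 if value == level else 0 for level in levels if level != reference]


# A firm in accommodation and food services (9) in the third capital band.
row = dummy_code(9, SUBSECTOR_LEVELS, reference=7) + dummy_code(
    3, CAPITAL_LEVELS, reference=1
)
```

A firm in the reference level receives zeros in every indicator column, so each estimated coefficient is read relative to that level.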
15.3.2.1 Relationship Between Sales, Innovation and Turnover
As discussed earlier, sales are used as an indicator of a firm’s growth, and
turnover is another measure of the same growth. In our study, sales and turnover
are used with different model specifications because the datasets used are
different. Otherwise, they should have the same model specification since they
can be used interchangeably.
The model on the sales of service sector firms is constructed with the variables
used in the collection of data during the 2011 Rwanda Enterprise Survey by the
National Institute of Statistics of Rwanda in partnership with the World Bank.
Because this database contained missing values, we constructed a model on turnover
with the variables used to collect information in the Establishment Census
(2014) by the National Institute of Statistics of Rwanda. This was done to track the
main factors affecting sales or turnover.
For the innovation model, we used the same database as the sales model because
the 2011 Enterprise Survey attached more interest to the innovation factor in the
performance of firms. Only the predictors of the innovation model can appear in the
sales model in order to prove the contribution of innovation in the growth of sales
of service firms.
15.3.3 Description of the Data
Data about the performance of Rwanda’s service sector used in this study was
provided by the National Institute of Statistics of Rwanda. The data came from two
important data collection channels—the 2010–2012 Enterprise Survey in Rwanda
and the 2014 Establishment Census.
The Enterprise Survey focuses on the many factors which shape the business
environment and is useful for both policymakers and researchers. The Enterprise
Survey is conducted by the World Bank and its partners across all geographic
regions and covers small, medium and large companies. The sample is consistently
defined in all countries and includes the entire manufacturing sector, the service
sector and the transport and construction sector. The 2011 Rwanda Enterprise
Survey covered 241 firms including 159 service firms and 82 manufacturing firms.
The cleaned raw database contains 148 firm observations each with 247 variables
describing various aspects of the firms and their activities (WB 2014).
The Rwanda Establishment Census (2014) consists of a complete count of all
establishments practicing specific economic activities in Rwanda except
not-for-sale government services. It covered themes such as economic activity,
legal status, registration of establishment, taxation, capital employed, regular
operation accounts, socioeconomic characteristics of an establishment’s staff,
payment status and sex of employees. The dataset contains 154,236 cases with 91
variables (NISR 2014).
The dependent variable is service firm growth which is measured by several
attributes such as turnover/sales, employment, assets, market shares and profits. The
Rwanda Enterprise Survey (2011) provides data on total sales for three years and
the 2010 fiscal year and data on the introduction of new products or services which
are a measure of innovation output in the previous three years. Factors affecting
total sales, growth of employment and service innovation determine the develop-
ment of the service sector. Literature highlights key measures of a firm’s growth as
sales, employment and innovation. Zhou and Wit (2009) and Isaga (2015) used
sales and employment to measure the growth of a firm since they reflect both
short-term and long-term changes in a firm.
In the model on service innovation, the dependent variable is a binary variable
on the introduction of new products or services in three years from 2010. According
to Neely and Hii (1998), innovation has a direct impact on the competitiveness of a
firm. The values created by innovations are often manifested in new ways of doing
things or new products and processes that contribute to wealth. In their studies,
Arvanistis and Stucki (2012) and Madeira et al. (2014) used a firm’s innovations for
measuring growth because it is argued that innovation start-ups are important dri-
vers of economic growth.
The model on turnover uses a categorical dependent variable where the turnover
of a firm is classified into four categories as described earlier. An ordinal scale
with many categories (5 or more), as well as interval and ratio scales, is usually
analyzed using the traditional approaches of statistical tests (Newsom 2013).
Independent variables in new service development are classified into four cat-
egories—firm characteristics, innovation characteristics, managerial characteristics
and business environment. In this study, a firm’s characteristics consider the firm’s
size, gender composition and legal status. Considering firm size, Madeira et al.
(2014) found a positive and increasing effect of firm size on firm innovation.
Medium-sized firms showed greater propensity to innovate than small sized firms.
Innovation characteristics include market conditions, new management prac-
tices, new market methods, spending on research and development activities, a
service firm’s employees’ development, a firm’s access to finance expressed in the
acquisition of fixed assets and degree of competition. Acs and Audretch (1988) and
Prajogo and Sohal (2006) claim that there is a positive relationship between
innovation and research and development activities of firms.
Managerial characteristics are pointed out with the top managers’ levels of
education and the years of their working experience in the service sector. Education
is measured by level of education attained classified as: no education, primary
school, secondary school, vocational training, some university training and graduate
degree. Queiro (2016) found that firms which switch to more educated managers
experience sharp increases in growth relative to comparable firms managed by less
experienced managers. More educated managers increase the use of incentive pay
and are likely to report new products and services and incorporate new technolo-
gies. The correlation matrix of the dependent and independent variables is given in
Appendix 1.
15.3.4 Estimation Methods: Linear and Logistic Regression Models
Madeira et al. (2014) have argued that a firm’s capacity to innovate is a complex
phenomenon influenced by a wide range of factors. Thus, the logistic regression
(logit model) helps to study the statistical relationship of the dependent variable in
relation to more than one determinant variable. Stock and Watson (2011) discuss a
regression with a binary dependent variable and conclude that when dependent
variable Y is binary, the population regression function is the probability that Y = 1,
conditional on the regressors. The resulting predicted values are predicted proba-
bilities and the estimated effect of a change in regressor X is the estimated change in
the probability that Y = 1 arising from the change in X. The standard estimation
method is maximum likelihood, and statistical inference based on its estimates
proceeds in the same way as it does in linear multiple regressions.
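To make the interpretation of a binary-outcome regression concrete, the sketch below evaluates a logit probability and the change in that probability when a regressor moves. The intercept and coefficient used in the example are purely illustrative and are not estimates from this study; the function names are hypothetical.

```python
import math


def logistic(index: float) -> float:
    """Logistic function: maps a linear index to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-index))


def probability_change(intercept: float, coef: float, x_from: float, x_to: float) -> float:
    """Change in Pr(Y = 1) when the regressor moves from x_from to x_to."""
    before = logistic(intercept + coef * x_from)
    after = logistic(intercept + coef * x_to)
    return after - before


# Illustrative values only: switching a dummy regressor from 0 to 1.
delta = probability_change(intercept=0.0, coef=1.0, x_from=0.0, x_to=1.0)
```

Because the logistic function is nonlinear, the same coefficient implies a different change in probability depending on the starting value of the index.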
In our study, the dependent variable for service innovation is conceived as the
introduction of new products or services; it is a binary variable that takes the
value 0 if a firm did not introduce a new product or service and 1 if it
introduced new products or services. The same applies to
According to Verbeek (2004), who discusses models with limited dependent
variables, when the dependent variable is zero for a substantial part of the popu-
lation but positive for the rest of the population with many different outcomes, the
logistic regression model is particularly suited for these types of variables. Since a
violation of distribution leads to inconsistent maximum likelihood estimators,
testing for misspecifications is to be conducted and necessary measures undertaken.
To estimate the total sales growth in service firms, we used the multivariate
regression analysis since growth is expected to be analyzed in the three years’ total
annual sales of a service firm. We need to track the factors that contributed to the
change in total annual sales in service firms. In this case, using the linear regression
model is helpful.
15.3.5 The Empirical Model and Its Specifications
Empirical models for an analysis of the service sector’s development and its
determinants in Rwanda are expressed on the basis of total annual sales, service
sector innovativeness and service sector turnovers to track the factors influencing
the dependent variables. Starting with the factors affecting sales in service firms
(Model 1), we can construct the multivariate regression model as:
\[
\begin{aligned}
\text{Sales}_i = {} & b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4 + b_5 x_5 + b_6 x_6 \\
& + b_7 x_7 + b_8 x_8 + b_9 x_9 + b_{10} x_{10} + b_{11} x_{11} + b_{12} x_{12} + e_i
\end{aligned}
\tag{15.4}
\]
In this model, the dependent variable ‘Sales’ stands for the level of total sales
given the values of the x’s, which are the independent or determinant variables:
x1 stands for the total annual cost of labor including wages, salaries, bonuses and
social security payments as the performance expression in service firms; x2 stands
for the size of the most recent loan or line of credit approved as a source of
finance; x3 stands for a dummy variable on the use of Internet expressed by e-mails
to communicate with clients or suppliers as an ICT application; x4 stands for a
dummy variable on employees’ development activities through new ideas or approaches
about products or services; x5 stands for a dummy variable on spending on formal
research and development activities to create new products or to find more
efficient methods of production; x6 stands for a dummy variable on innovation
expressed as the introduction of new products or services; x7 stands for a dummy
variable on engaging in internal or external training of personnel; x8 stands for a
dummy variable on the acquisition of fixed assets such as machinery, vehicles,
equipment, land or buildings; x9 stands for a dummy variable on new or
significantly improved methods of offering services; x10 stands for a dummy
variable on new or significantly improved logistical or business support
processes; x11 stands for a dummy variable on the introduction of new or
significantly improved marketing methods; and x12 stands for a dummy variable on
new or significantly improved organizational structure or management practices.
The coefficients are represented by the symbol b with subscripts from 0 to 12
corresponding to the intercept and the independent variables. On the one hand is
the null hypothesis H0: bi = 0, that is, b1 = b2 = ... = bn = 0. In this case, no
independent variable has any effect on the total annual sales of service firms.
On the other hand is the alternative hypothesis, H1: bi ≠ 0, meaning that the
independent variables result in changes in the total annual sales of service firms.
A positive coefficient is interpreted as a positive effect on sales and a negative
coefficient as a negative effect. Thus, the main
focus is on the properties of the effects namely the signs of the effects and their
consistency with our expectations, the size of the effects and their statistical sig-
nificance. The model can also be specified in the form of changes in sales between
two years or labor productivity that is sales per employee.
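As an illustration of how a slope in a regression of log sales on a log regressor is obtained and read, the following sketch fits a one-regressor least-squares line in pure Python. It is a deliberate simplification of Eq. (15.4), which has twelve regressors; the data are synthetic, the function name `ols_simple` is hypothetical, and the use of logs is an assumption based on the variable labels in Table 15.1.

```python
def ols_simple(x, y):
    """Least-squares intercept and slope for a single regressor."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary across observations")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data generated from log(sales) = 3 + 0.7 * log(cost), without noise.
log_cost = [10.0, 11.0, 12.0, 13.0]
log_sales = [3.0 + 0.7 * value for value in log_cost]
intercept, slope = ols_simple(log_cost, log_sales)
```

With both variables in logs the slope is an elasticity: a slope of 0.7 means that a 1% change in the regressor is associated with roughly a 0.7% change in sales.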
The innovation model was also used to assess the determinants of service sector
innovativeness which can influence firms’ growth. The model for service innova-
tion (Model 2) is specified as:
\[
\begin{aligned}
\Pr(Y = 1 \mid z) = {} & u_0 + u_1 z_1 + u_2 z_2 + u_3 z_3 + u_4 z_4 + u_5 z_5 + u_6 z_6 + u_7 z_7 \\
& + u_8 z_8 + u_9 z_9 + u_{10} z_{10} + l_t
\end{aligned}
\tag{15.5}
\]
The probability that a service firm introduced new products or services is
modeled with Y as the binary dependent variable. The symbol z with subscripts
ranging from 1 to 10 stands for the different independent variables or
determinants of innovativeness that are thought to have an effect on the extent
to which a firm innovates.
As conceived in Eq. (15.5), z1 stands for new or significantly improved methods
of offering services, z2 stands for a dummy variable on new or significantly
improved logistical or business support processes, z3 stands for a dummy variable
on the introduction of new or significantly improved marketing methods, z5 stands
for a dummy variable on spending on formal research and development activities to
create new products or to find more efficient methods of production, z6 stands for
a dummy variable on employees’ development activities through new ideas or
approaches about products or services, z7 stands for a dummy variable on engaging
in internal or external training of personnel, z8 stands for a dummy variable on
the acquisition of fixed assets such as machinery, vehicles, equipment, land or
buildings, z9 stands for a dummy variable on having a line of credit or a loan from
a financial institution, z10 stands for a factor variable on the firm size defined
as small (5–19 employees), medium (20–99 employees) and large (100 employees and
above) and lt stands for the random error term.
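The firm-size factor defined above can be expressed as a simple classification rule. The sketch below is illustrative only: the function name `firm_size_category` is hypothetical, and rejecting firms with fewer than five employees is an assumption, because the text defines no size class below that threshold.

```python
def firm_size_category(employees: int) -> str:
    """Classify a firm as small, medium or large by its number of employees."""
    if employees < 5:
        # Assumption: the size classes in the text start at 5 employees.
        raise ValueError("no size class is defined below 5 employees")
    if employees <= 19:
        return "small"
    if employees <= 99:
        return "medium"
    return "large"


sizes = [firm_size_category(n) for n in (5, 19, 20, 99, 100, 250)]
```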
For this model, the null hypothesis, H0: ui = 0, implies that none of the independent
variables affects the introduction of new products or services and
the alternative hypothesis, H1: ui ≠ 0, suggests that the independent variables have
an effect on the introduction of new products or services. Although maximum
likelihood estimators have the property of being consistent, the likelihood function
has to be correctly specified for this to hold. The most convenient framework for
testing for such misspecification is the Lagrange multiplier framework (Verbeek 2004).
Turnover as a measure of growth is used to assess the factors that influence it in
the service sub-sectors. The model on the service firm turnover (Model 3) is con-
structed as:
\[
\text{Turnover} = h_0 + h_1 X_1 + h_2 X_2 + h_3 X_3 + h_4 X_4 + h_5 X_5 + h_6 X_6 + e_i \tag{15.6}
\]
In this model, Turnover is the level of turnover of a service firm given the
predictors Xi, and the coefficients are symbolized by h with subscripts 0–6. The
independent variable X1 stands for the gender of the manager, X2 stands for
openness of the service firm, defined as selling or buying goods or services
abroad, X3 stands for value-added tax, X4 stands for income tax, X5 stands for a
categorical variable on the main service sub-sector, X6 stands for a categorical
variable on the capital used by the service firm and ei represents the error term.
The null hypothesis, H0: h = 0, implies that the independent variables have no
effect on the level of turnover in service firms. The alternative hypothesis,
H1: h ≠ 0, implies that the independent variables affect the level of turnover in
service firms. The sign of each coefficient is checked for consistency with
expectations.
15.4 Estimation, Testing and Results
15.4.1 Linear Regression of Service Sales Model
The results of the multivariate linear regression of the service sales model (Model 1)
are presented in Table 15.1. At the 5% significance level, the coefficients on
employment cost, loan size, Internet use, employees’ development and research and
development are statistically significant; the first three have a positive effect
on sales, whereas employees’ development and research and development have a
negative effect. For these variables, we therefore reject the null hypothesis. The
other coefficients are statistically insignificant, thus we fail to reject the null
hypothesis: innovation, training, acquisition of fixed assets, new methods, new
practices, new marketing and new logistics have no statistically significant effect
on total annual sales. The R2 is 0.84, meaning that the independent variables
explain 84% of the variation in the sales of service firms.
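The goodness-of-fit figures quoted for Model 1 can be reproduced from the sums of squares reported in Table 15.1. The short check below uses only those published numbers; the variable names are our own.

```python
import math

# Sums of squares and sample size as reported in Table 15.1.
ss_model = 152.5216
ss_residual = 28.1634
n_obs = 48
n_regressors = 12

ss_total = ss_model + ss_residual
df_residual = n_obs - n_regressors - 1  # 35 residual degrees of freedom

r_squared = ss_model / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
f_statistic = (ss_model / n_regressors) / (ss_residual / df_residual)
root_mse = math.sqrt(ss_residual / df_residual)
```

The adjusted R2 is lower than the R2 because it penalizes the twelve regressors relative to the 48 observations available.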
15.4.2 Logistic Regression of the Service Innovation Model
The results of the logistic regression of the service innovation model (Model 2)
are given in Table 15.2. The results for the innovation model show that the
independent variables on new or improved methods of offering services, engaging
in internal or external training and acquisition of fixed assets are statistically
significant at 5%, that is, they affect the service firms’ innovation. Thus, we
reject the null hypothesis for these variables. The other variables in the model
are statistically insignificant; we find no evidence that they affect the
innovativeness of the service sector.
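Logit coefficients are often easier to read as odds ratios, obtained by exponentiating each coefficient. The sketch below does this for the three coefficients reported as significant in Table 15.2; the dictionary keys are our own shorthand for the table's row labels.

```python
import math

# Coefficients of the statistically significant regressors in Table 15.2.
significant_coefs = {
    "new_methods": 1.0971,
    "training": 0.9657,
    "fixed_asset": -1.1771,
}

# An odds ratio above 1 raises the odds of innovating; below 1 lowers them.
odds_ratios = {name: math.exp(coef) for name, coef in significant_coefs.items()}
```

Read this way, new methods and training are associated with higher odds of introducing a new product or service, while the negative coefficient on fixed assets corresponds to an odds ratio below one.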
Testing the fit of the model, we find that AIC is lower than BIC which implies
that our model is well fit (see Table 15.3). The logistic model of innovation
correctly classifies 76.58% of the observations. The likelihood ratio test, with a
log likelihood of −80.4422, gives Chi2 (1) = 1.63 and Prob > Chi2 = 0.2015, which is
not significant at 5%, implying that the model is adequately fitted (Appendix 2).
According to Scott (1997), the LR test assesses constraints by comparing the log
likelihood of the unconstrained model to the log likelihood of the constrained
model. If the constraint significantly reduces the likelihood, then the null
hypothesis is rejected. The results of an alternative skewed logistic regression of
innovation are presented in Appendix 3.

Table 15.1 Linear regression of service sales model (Model 1) and its determinants

Source        SS          df    MS
Model         152.5216    12    12.7101
Residual       28.1634    35     0.8046
Total         180.6851    47     3.8443

No. of obs = 48           F(12, 35) = 15.80          Prob > F = 0.000
R-squared = 0.8441        Adj. R-squared = 0.7907    Root MSE = 0.8970

Log total sales    Coef.      Std. Err.   t         p > |t|   [95% conf. interval]
Log employ cost     0.7220    0.1079       6.689    0.0000     0.5029     0.9412
Log loan size       0.2361    0.0852       2.771    0.0089     0.0631     0.4090
Internet use        1.2684    0.5292       2.397    0.0220     0.1940     2.3428
Employee dvpt      −1.0810    0.4163      −2.596    0.0137    −1.9262    −0.2358
Research dvpt      −0.9456    0.3223      −2.934    0.0059    −1.5999    −0.2914
Innovation          0.1124    0.4208       0.267    0.7910    −0.7419     0.9668
Training           −0.1875    0.4509      −0.416    0.6801    −1.1028     0.7278
Fixed asset        −0.2912    0.3970      −0.733    0.4681    −1.0971     0.5147
New methods        −0.6576    0.4593      −1.432    0.1611    −1.5901     0.2749
New practices       0.1796    0.4846       0.371    0.7132    −0.8043     1.1634
New marketing      −0.6149    0.3739      −1.645    0.1697    −1.3740     0.1442
New logistics       0.7042    0.5023       1.402    0.1697    −0.3155     1.7239
_Cons               3.4132    1.5283       2.233    0.0320     0.3105     6.5159

Table 15.2 Logistic regression model of innovation performance (Model 2) and its determinants

Logistic regression               Number of obs = 158
                                  LR chi2 (12) = 46.28
                                  Prob > chi2 = 0.0000
Log Likelihood = −81.257932       Pseudo R2 = 0.2217

Innovation         Coef.      Std. Err.   z         P > |z|   [95% conf. interval]
New methods         1.0971    0.4907       2.236    0.0254     0.1354     2.0587
New logistics       0.2143    0.5451       0.393    0.6943    −0.8542     1.2827
New practices      −0.1162    0.5654      −0.205    0.8372    −1.2243     0.9920
New marketing      −0.2969    0.4911      −0.605    0.5454    −1.2595     0.6656
Research dvpt       0.2238    0.4919       0.455    0.6491    −0.7402     1.1878
Employee dvpt       0.8771    0.4861       1.804    0.0712    −0.0757     1.8399
Training            0.9657    0.4720       2.046    0.0408     0.0406     1.8909
Fixed asset        −1.1771    0.4449      −2.646    0.0082    −2.0491    −0.3051
Loan                0.6215    0.4092       1.519    0.1288    −0.1805     1.4234
Firm size
  1                −0.4398    1.0077      −0.436    0.6625    −2.4148     1.5352
  2                 0.0959    1.0425       0.092    0.9267    −1.9473     2.1391
  3                 1.0922    1.2691       0.861    0.3895    −1.3952     3.5797
_Cons              −0.8578    1.0347      −0.829    0.4071    −2.8858     1.1702

Table 15.3 Summary of post-estimation of Akaike’s and Bayesian information criteria (AIC, BIC)

Model    Obs    ll(null)    ll(model)    df    AIC         BIC
–        158                −80.44222    14    188.8844    231.7608
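The information criteria and the likelihood ratio statistic quoted in this section follow from the reported log likelihoods. The sketch below recomputes them with the standard formulas. Treating the logit of Table 15.2 (log likelihood −81.257932) as the constrained model and the model with log likelihood −80.44222 as the unconstrained one is our reading of the reported figures, not something the text states explicitly; the function names are hypothetical.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: -2 ll + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2 ll + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def lr_statistic(ll_constrained: float, ll_unconstrained: float) -> float:
    """Likelihood ratio statistic: 2 (ll_unconstrained - ll_constrained)."""
    return 2.0 * (ll_unconstrained - ll_constrained)


def chi2_pvalue_df1(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


aic_value = aic(-80.44222, 14)
bic_value = bic(-80.44222, 14, 158)
lr_value = lr_statistic(ll_constrained=-81.257932, ll_unconstrained=-80.44222)
p_value = chi2_pvalue_df1(lr_value)
```

Note that BIC exceeds AIC whenever ln(n) is greater than 2, that is, for any sample of eight or more observations, so the two criteria are most informative when compared across competing models.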
15.4.3 Ordered Logistic Regression of the Service Turnover Model
In order to estimate the service turnover model, we used ordered logistic regression
because the dependent variable, turnover, is defined as an ordered categorical
variable. If the primary interest is understanding how the explanatory variables
affect the conceptual dimension represented by an ordinal variable, an ordinal
model is appropriate. The results of an ordinal logistic model are interpreted in
the same way as those of a traditional logistic model with the exception that
there are cut points instead of a constant (Powers and Xie 1999).
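The role of the cut points can be illustrated with a short calculation. In an ordered logit, the probability of falling at or below category j is the logistic function of (cut point j minus the linear index), and the category probabilities are the differences between successive cumulative probabilities. The cut points and index below are hypothetical values chosen for illustration; they are not estimates from this study.

```python
import math


def logistic(value: float) -> float:
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-value))


def ordered_logit_probs(index: float, cutpoints: list[float]) -> list[float]:
    """Category probabilities for an ordered logit with the given cut points."""
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cut points must be in increasing order")
    # Cumulative probabilities Pr(Y <= j) for each cut point.
    cumulative = [logistic(cut - index) for cut in cutpoints]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


# Three hypothetical cut points give four categories, as for turnover.
probs = ordered_logit_probs(index=0.0, cutpoints=[-1.0, 0.0, 1.0])
shifted = ordered_logit_probs(index=2.0, cutpoints=[-1.0, 0.0, 1.0])
```

Raising the linear index moves probability mass toward the higher categories, which is why a positive coefficient is read as increasing the chance of a higher turnover band.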
The results presented in Table 15.4 indicate that the coefficients of gender,
openness, value-added tax, income tax, capital used and service sub-sectors 8, 9 and
11 are statistically significant, meaning that they influence the level of turnover
of a service firm. The others are statistically insignificant, which implies that
there is no evidence that they affect the level of turnover.
Table 15.4 Ordered logistic regression of service turnover model (Model 3) and its determinants

Ordered logistic regression       Number of obs = 35,575
                                  LR chi2 (12) = 17,932.95
                                  Prob > chi2 = 0.0000
Log Likelihood = −21409.823       Pseudo R2 = 0.2952

Turnover             Coef.      Std. Err.   z          p > |z|   [95% conf. interval]
Gender manager       −0.0624    0.0280       −2.224    0.0262    −0.1174    −0.0074
Openness              0.7192    0.0891        8.075    0.0000     0.5447     0.8938
Value-added tax       1.8273    0.0816       22.380    0.0000     1.6672     1.9873
Income tax            0.2105    0.0479        4.394    0.0000     0.1166     0.3043
Service sub-sectors
  8                   0.7318    0.2213        3.306    0.0009     0.2980     1.1656
  9                  −0.3654    0.0277      −13.193    0.0000    −0.4197    −0.3111
  10                 −0.0246    0.2351       −0.105    0.9166    −0.4854     0.4361
  11                  1.9284    0.1207       15.983    0.0000     1.6920     2.1649
  12                 −0.4586    1.1115       −0.413    0.6399    −2.6371     1.7200
Capital
  2                   2.7719    0.0334       82.892    0.0000     2.7063     2.8374
  3                   5.3948    0.1121       48.128    0.0000     5.1751     5.6145
  4                   6.4496    0.1464       44.058    0.0000     6.1626     6.7365

15.4.4 Analysis of the Empirical Results
This section gives an interpretation and analysis of the results for the three models
specified and estimated earlier. From this, we can gain advanced knowledge about
the constituents of the service sector and the determinants contributing to the
development of this sector. Service sector development is measured by considering
key measures of a firm’s performance and growth such as innovation, sales and
turnover, and these are taken to be dependent variables for forming and estimating
the models. The growth in sales of service firms contributes to the growth of the
service sector’s share in Rwanda’s GDP. Innovations bring in new products or
services which in turn push the growth of the sector. The factors influencing growth
in sales, service innovativeness and turnover are used to find the drivers of service
sector development. These determinants are taken into consideration in shaping and
sustaining the service-led economy path as it is a national strategy for economic
growth.
15.4.4.1 Factors Affecting Total Sales Growth
Estimation results of the linear regression of the sales model indicate that
employment costs, size of the approved loan and use of Internet positively affected
the change in sales of service firms for the period 2008–2010. Growth in
employment is a good indicator of a firm’s performance whereby the cost of
employment for three years is positively reflected in total sales. A 1% change in
costs attributed to employment resulted in a 0.72% change in sales in service firms,
other things being held constant.
UNECA (2015) reported that financial services are the oil of transactions and
provide access to credit for investments for most other businesses. This is
consistent with our model on sales, in which the size of the most recent loan or
line of credit approved was positively associated with the total sales of service
firms. Other things being held constant, a 1% change in the size of the loan
resulted in a 0.236% change in the total annual sales of a service firm.
Liu and Nath (2013) argue that the trade-enhancing effect of ICT infrastructure
or ICT capability depends on its use. Internet subscriptions and Internet hosts have
significant positive effects on both exports and imports. In our model on sales, the
use of e-mails to communicate with clients or suppliers expressed as Internet use
had a positive relationship to total sales as has also been found in previous studies.
Holding other things constant, a 1% change in the use of Internet brought a 1.268%
change in the sales of a service firm.
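The percentage effects quoted above read the estimated coefficients as elasticities, which is the usual interpretation when both sales and the regressor enter the model in logarithms. The sketch below assumes such a log-log specification (the functional form is an assumption here, not something stated in the text) and compares the "1% gives β%" rule of thumb with the exact change the coefficient implies. The function and variable names are illustrative only.

```python
# Elasticities quoted in the text for the sales model.
ELASTICITIES = {
    "employment_cost": 0.72,
    "loan_size": 0.236,
    "internet_use": 1.268,
}


def pct_change_in_sales(beta: float, pct_change_x: float) -> float:
    """Exact % change in sales implied by a log-log coefficient `beta`
    when the regressor changes by `pct_change_x` percent."""
    return ((1.0 + pct_change_x / 100.0) ** beta - 1.0) * 100.0


if __name__ == "__main__":
    for name, beta in ELASTICITIES.items():
        small = pct_change_in_sales(beta, 1.0)   # close to beta itself
        large = pct_change_in_sales(beta, 10.0)  # rule of thumb is less exact here
        print(f"{name}: 1% -> {small:.3f}%, 10% -> {large:.3f}%")
```

For small changes the exact figure and the rule of thumb nearly coincide; for larger changes they drift apart, which is why the interpretation is normally stated for a 1% change.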
Both employees’ development and research and development activities were
negatively correlated with a change in the sales generated in service firms. Holding
other things constant, a 1% decrease in employees’ development resulted in a
0.108% decrease in total service sales. A 1% decrease in spending on research and
development activities induced a 0.94% decrease in total sales, other things holding
constant.
The change in total sales of service firms in Rwanda is attributed to financial
services through access to credit; ICT applications in service provision, principally
via e-mail; employment growth, expressed by the costs incurred by a service firm on
employment; and employees’ development as a trial of a new approach or new idea about
products or services, business process, firm management or marketing. Last but not
least is the expenditure incurred on research and development activities. Together
these variables explain 84% of the variation in sales, as measured by R2, and all are
statistically significant: their t-statistics are greater than 1.96 in absolute value
and their p-values are less than 0.05.
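The two significance criteria used above, a test statistic beyond 1.96 and a p-value below 0.05, are two views of the same rule. A minimal sketch follows, assuming the statistic is compared with the standard normal distribution (a large-sample approximation for a t-statistic). The helper names are hypothetical.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value of a test statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_significant(z: float, alpha: float = 0.05) -> bool:
    """True when the statistic is significant at level `alpha`."""
    return two_sided_p_value(z) < alpha


if __name__ == "__main__":
    for z in (1.5, 1.96, 2.5):
        print(f"z = {z}: p = {two_sided_p_value(z):.4f}, significant = {is_significant(z)}")
```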
15.4.4.2 Factors Contributing to Innovativeness
The logistic regression of the service innovation model (Model 2) identifies the
factors contributing to innovation in Rwanda’s service firms. As Table 15.2 shows,
158 firms were included in the estimation. The likelihood ratio statistic, which tests
whether the predictors in the model together account for significant variation in the
dependent variable, is 46.28, with a chi-square probability of 0.000. This implies
that the independent variables jointly influence the dependent variable. New methods,
training and acquisition of fixed assets were statistically significant at the 95%
confidence level, since their p-values are less than 0.05 and their z values in
absolute terms are greater than 1.96. The Pseudo R2 of 0.22 gives an approximate
measure of the variation accounted for by the independent variables in this model.
The log likelihood is −81.2579.
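The three fit statistics reported here are linked, so they can be cross-checked. The sketch below assumes the Pseudo R2 is McFadden's measure, 1 − LL(model)/LL(null), and that the likelihood ratio statistic is 2 × (LL(model) − LL(null)); both definitions are assumptions, since the text does not name them.

```python
LL_MODEL = -81.2579  # log likelihood reported in the text
LR_CHI2 = 46.28      # likelihood ratio statistic reported in the text


def null_log_likelihood(ll_model: float, lr_chi2: float) -> float:
    """Back out the intercept-only log likelihood from LR = 2 * (ll_model - ll_null)."""
    return ll_model - lr_chi2 / 2.0


def mcfadden_pseudo_r2(ll_model: float, ll_null: float) -> float:
    """McFadden's pseudo R-squared."""
    return 1.0 - ll_model / ll_null


if __name__ == "__main__":
    ll_null = null_log_likelihood(LL_MODEL, LR_CHI2)
    print(f"implied null log likelihood: {ll_null:.4f}")
    print(f"implied pseudo R2: {mcfadden_pseudo_r2(LL_MODEL, ll_null):.2f}")
```

Under these assumptions the implied value rounds to the reported 0.22, so the three figures are mutually consistent.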
A 1% increase in the use of new methods such as new or significantly improved
technology, equipment and software for production, finishing, packaging or quality
control resulted in a 1.097% increase in the innovativeness of a service firm,
holding other things constant.
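A complementary way to read a logistic regression coefficient is as a change in the odds of the outcome. The sketch below assumes the 1.097 quoted above is the logit coefficient on new methods and converts it to an odds ratio; it is offered as an alternative reading, not as the authors' own calculation.

```python
import math


def odds_ratio(coef: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in the regressor."""
    return math.exp(coef)


def pct_change_in_odds(coef: float) -> float:
    """The same effect expressed as a percentage change in the odds."""
    return (math.exp(coef) - 1.0) * 100.0


if __name__ == "__main__":
    coef_new_methods = 1.097  # value quoted in the text (assumed to be the logit coefficient)
    print(f"odds ratio: {odds_ratio(coef_new_methods):.2f}")
    print(f"change in odds: {pct_change_in_odds(coef_new_methods):.0f}%")
```

On this reading, firms using new methods have roughly three times the odds of introducing a new product or service.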
A 1% increase in the level of acquisition of internal or external training resulted
in a 0.965% increase in the level of introducing new products or services in a firm,
holding other things constant. This result is consistent with prior literature on the
importance of training in the performance of a firm. In his study on the effect of
training on employee performance with evidence from Uganda, Nassazi (2013)
reported that training and development had an impact on employees’ performance
with regard to their jobs. Training develops skills, competencies and abilities and
ultimately improves employee performance and organizational productivity (Amir
and Amen 2013).
A 1% decrease in the acquisition of fixed assets such as machinery, vehicles,
land and buildings resulted in a 1.17% decrease in the introduction of new products
or services, holding other things constant. This indicates that the acquisition of fixed
assets is a key factor for the innovation process in service firms. It is clear that a lack
of fixed assets not only hampers service innovation, but also affects existing service
provision which is bad for the country’s economy. Silva found that the greater the
financial investment in acquisition of machinery, equipment and software, the
greater the propensity for firms to innovate in their services.
In conclusion, service firms’ innovations in Rwanda are attributed to the new
methods applied, acquisition of internal or external training and acquisition of fixed
assets. These factors affect the service sector’s performance and growth by enabling
the introduction of new products or services.
15.4.4.3 Factors Determining Levels and Variations in Turnover
Turnover of a service firm is understood as the amount of money taken by the
business in a particular period. The estimation of the ordered logistic regression
model on turnover in the service sector revealed that the gender of the manager,
openness and taxes were statistically significant and influenced the turnover of
service firms at the 95% confidence level, that is, their p-values were less than
0.05. The control variable on the level of capital used was positive and
statistically significant at the same level. The service sub-sectors of transport
and storage (8), accommodation and food service activities (9) and financial and
insurance activities (11) were also statistically significant, with p-values below
0.05. According to our estimated model on turnover, these three service sub-sectors
influenced variations in the turnover of service firms in Rwanda, which affected the
service sector as a whole. The estimation results are presented in Table 15.3.
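The z statistics and confidence intervals in Table 15.3 follow from each coefficient and its standard error. A minimal sketch, assuming the usual Wald construction (z = coefficient / standard error, interval = coefficient ± 1.96 standard errors), reproduces them for four rows of the table up to rounding of the published inputs.

```python
Z_CRIT = 1.96  # two-sided 5% critical value of the standard normal

# (coefficient, standard error) as printed in Table 15.3
ROWS = {
    "Gender manager": (-0.0624, 0.0280),
    "Openness": (0.7192, 0.0891),
    "Value-added tax": (1.8273, 0.0816),
    "Income tax": (0.2105, 0.0479),
}


def wald_summary(coef: float, se: float) -> tuple[float, float, float]:
    """Return (z, lower, upper) for a 95% Wald interval."""
    return coef / se, coef - Z_CRIT * se, coef + Z_CRIT * se


if __name__ == "__main__":
    for name, (coef, se) in ROWS.items():
        z, lower, upper = wald_summary(coef, se)
        print(f"{name}: z = {z:.3f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

Small differences from the printed z values are expected, because the table reports coefficients and standard errors to four decimal places only.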
Table 15.3 shows that the gender of the manager was statistically significant and
negatively correlated with the turnover of a service firm. In our study, being a male
manager negatively influenced turnover in a service firm, with a coefficient of
−0.0624 (p = 0.026).
Meanwhile, Johnsen and McMahon (2005) report that consistent statistically sig-
nificant differences in financial performance and business growth do not exist
between female and male owned/managed concerns once appropriate demographic
and other relevant controlling influences are taken into account. According to
Watson (2003), female managers are just as effective (as males) in using resources.
However, females (on average) invest fewer resources in their ventures and also
seem to get involved in less risky enterprises. Their overall performance is likely to
be the same as that of the males provided appropriate measures of performance are
used such as sales or profits. Considering prior research findings, the negative
relationship found in our study does not imply differences in female and male
managers in terms of performance, rather it is possible to view this in terms of the
risk associated with a business and this is a subject for subsequent studies for more
clarifications.
Openness is conceived as an interaction outside the Rwandan service sector in
terms of imports and exports of services. In our study, interaction was assessed
through buying and selling services abroad and the estimation results show that
there was a statistically significant and positive relationship between turnover and
openness in service firms. A 1% change in the level of openness increased the level
of turnover by 0.72%. Singh and Kaur (2014) claim that openness positively affects
the share of the service sector in gross domestic product. According to Halpern
et al. (2015) importing all inputs will increase a firm’s revenue productivity by
0.22%, about one-half of which is due to imperfect substitution between foreign and
domestic inputs. They argue that productivity gains from a tariff cut are larger when
the economy has many importers and many foreign firms.
An assessment of the determinants of service sector development also looked at the
role of Rwanda’s taxation system in boosting the service sector. The ordered logistic
regression of turnover on value-added tax and income tax showed a statistically
significant positive relationship at the 95% confidence level. This means that the
tax system in Rwanda positively affects the development of the service sector. A 1%
change in payment of the value-added tax resulted in a 1.83% change in the growth of
turnover in services and a 1% change in the payment of income tax increased the
turnover of the service sector by 0.21%, holding other things constant. Stoilova and
Patonov (2013) claim the existence of a clear and strongly expressed impact of direct
taxes on economic growth. In addition, they argue that a tax structure based on
direct taxes is more efficient in terms of supporting economic growth. Wu et al.
(2012) argue that in China private firms with politically connected managers enjoy
tax benefits. Chude and Nkuru (2015) conclude that the positive and significant
relation between profitability and taxation explanatory variables indicates that if
policymakers expand tax revenue through more effective tax administration it will
positively impact a company’s profitability.
Capital is used as a control variable since the capital used by a service firm is
taken as a measure of its capability. Estimation results show that the capital used
at all levels was significantly positive. Holding other things constant, for a
service firm using capital ranging between 500,000 and 15 million RWF, a 1% change in
the level of capital used resulted in an increase in turnover of 2.77%. For firms
using capital between 15 million and 75 million RWF, a 1% increase in capital
resulted in a 5.39% increase in turnover, holding other things constant. For firms
using more than 75 million RWF, the estimation results indicated that a 1% change
in the capital used resulted in a 6.45% increase in turnover, holding other things
constant. In short, the more capital a service firm uses, the higher its turnover.
Thus, capital is another factor contributing to the service sector’s development since
any increase in the capital used results in an increase in the turnover of a service
firm.
An ordered regression of the service sector’s turnover by sub-sector indicates that
the transport and storage, accommodation and food services and financial and
insurance sub-sectors had a significant effect on the turnover of a service firm:
positive for transport and storage and for financial and insurance activities, and
negative for accommodation and food services (Table 15.3). Holding other things
constant, a 1% increase in transport and storage for a service firm resulted in a
0.73% increase in its turnover. A 1% increase in the level in the accommodation and
food services sub-sector resulted in a 0.37% decrease in turnover, holding other
things constant. Lastly, a 1% increase in financial and insurance activities brought
in a 1.93% increase in turnover, holding other things constant.
15.4.5 Usefulness of the Results
The ultimate goal of this study was to analyze trends in the development of service
firms in Rwanda and to identify the factors driving their performance, using survey
data covering various aspects of the service sector. Literature was reviewed to
assess the similarities and dissimilarities in findings all over the world. A
descriptive analysis of existing data and an empirical analysis of micro-data on
service firms were then used to understand the functioning of service firms in Rwanda
and in other parts of the world. The results are useful for academics and for both
the public and private sectors.
15.4.5.1 Adoption and Scaling-Up of Innovation Activities
The results on the factors influencing innovation in service firms are very useful
for the government because innovation is a key to economic growth and development. In
public sector management, innovation is a priority for all nations because today’s
wealthy nations have a wide range of innovations in various disciplines. In our
study, innovation as a stand-alone variable did not influence any change in total
sales, though some of the variables characterizing innovation, namely new methods and
training, were statistically significant. Therefore, the government could use these
findings to scale up innovation activities in service firms and to shape capacity
building strategies and policy. Innovation is a prime contributor to sales growth and
needs to be geared up to sustain service firms’ development as a route to economic
growth.
This study is an asset for academics and for future studies by researchers and
graduate students. Its findings on service innovation can form the basis for
expanding research on economic growth, since becoming a middle-income country is
Rwanda’s national policy under Vision 2020. Thus, it is the responsibility of
academia to support the government by providing facts to monitor the implementation
of government policies for evidence-based interventions and decision making.
15.4.5.2 Diversification of Sources of Service Firms’ Development
The results of the linear regression of the sales model are very useful in assessing
the role of economic integration. One of the objectives of economic integration is to
operate in a large market where nationals buy and sell their products and services.
Having openness as a significant determinant of turnover indicates that economic
agents in service firms should take advantage of this information to increase the
returns to their businesses. The private sector can use this information to exploit
unused channels and to study regional markets in order to expand, since it has been a
while since the government signed the agreement to become a member of the East
African Community (EAC) and joined other regional economic integration efforts.
Focus on ICT is found to be another source of better performance in service
firms. Daily use of Internet as a communication channel must be looked at as a
strategy to be widely adopted by competitive managers of service firms. This fits
well with Rwanda’s national commitment of becoming an ICT regional hub and an
ICT connected country.
For academic research purposes, this information is crucial since it opens up the
ground for further empirical studies to assess how the government is benefiting
from regional economic integration in terms of economic growth and development.
Further, it will be interesting to conduct an empirical study on ICT applicability and
economic performance in Rwanda.
15.4.5.3 Providing Insights on Turnovers of Firms and Their Access to Finance
All firms aim to increase their turnover as they are profit-based entities. The
results from the model on service firms’ turnover give information on interacting
with the foreign market by either buying or selling products or services. The more
capital is used, the more turnover increases, which could inform investors attracted
by service-related economic activities such as transport and storage and
accommodation and food services. These service sub-sectors are found to be more
profitable in the overall service sector. The spillover effects of taxes are marked in
the turnover of service companies. This could be used to back the importance of
paying taxes by service sector taxpayers. Looking at the value-added tax, which is
paid by consumers, helps us conclude that service firms’ development is demand
elastic because the more VAT consumers pay, the more turnover is generated. Income
tax is normally paid depending on the income earned by a firm through the year. The
correlation of income tax and growth in turnover implies good performance of service
firms. Generally, taxes support the economy at large and it is important to know how
taxes affect service firms’ development in particular.
Access to finance is one of the most needed inputs for the good performance of a
service firm; it is provided by financial institutions like banks. Our investigation
of the determinants of service sector development confirms its importance for service
firms’ performance, as indicated by the acquisition of fixed assets, loan size and
capital used. The government should take note of this in steering monetary policy and
encourage financial institutions to help investors in service firms access funds.
15.5 Conclusion and Recommendations
15.5.1 Conclusions
Our study on service firms’ development and their performance in Rwanda’s economic
growth provides useful details about service firms over the years and empirically
estimates the determinants of service firms’ development using econometric methods.
The measures of firm growth used are innovation, sales and turnover. The estimation
used micro-data collected by the National Institute of Statistics of Rwanda, namely
the 2011 Rwanda Enterprise Survey and the 2014 Establishment Census.
The literature review on the service sector supports the view that services
contribute substantially to economic growth. Zhou (2015) and William (1997) claim
that services accelerate the transformation of economic growth, raise employment and
boost economy-wide labor productivity. The key factors that contribute to the growth
of the service sector include rapid urbanization, expansion of the public sector,
increased demand for intermediate and final consumer services, domestic investments
and openness, education skills, cultural adaptability, financial attractiveness,
business environment, expansion of quality health services, application of
information and technology, increase in consumption expenditure, incentive systems
and investing more in research and development. In Rwanda, services are dominated by
wholesale and retail trade, motorcycle and motor vehicle repairs, accommodation and
food services and human health and social work sub-sectors.
After estimating models on sales, innovation and turnover in service firms, the
results show that service firms’ development in Rwanda is driven by access to
finance, an increased labor force, training of personnel, ICT applications, embryonic
innovations and the tax system. Access to finance has enabled service firms to grow
over the past few years in Rwanda. The size of the loans approved by financial
institutions such as banks and cooperatives had a positive effect on total annual
sales over three years; the capital used by service firms positively affected their
turnover; and the acquisition of fixed assets positively influenced service
innovativeness. FinScope Rwanda (2016) has revealed that 89% of the adult population
has access to finance.
Increased labor participation in services, employee development and training of
personnel have boosted services in Rwanda. As the literature suggests, service firms’
development can be attributed to employment. Our study shows that the cost allocated
to employment in services is positively correlated with total sales generated over
three years, and a descriptive analysis confirms that the service sector is the top
employer, even though there is gender inequality in the sector. Although innovation
itself did not influence changes in sales, some variables characterizing innovation,
such as internal and external training, induce service innovativeness. Further,
research and development and employee development were found to influence sales over
the three years studied. This calls for future research to assess the innovation
propensity in the service sector.
Openness and ICT applications have contributed to the growth of service firms in
Rwanda. Benefiting from access to a wider market was a national aspiration when
Rwanda signed the regional economic integration agreement. Our study indicates that
this is on track, as openness has had a positive effect on the turnover of service
firms. In addition, the Government of Rwanda has put more effort into extending optic
fiber across the country, which has influenced service firms. Our study finds that
communication via e-mail influenced sales generated over the three years studied.
Tax collection, typically VAT and income tax, impacted services’ development
in Rwanda. As previous findings have illustrated, there is a positive relationship
between taxes and economic growth. Our study also reaffirmed this as it found that
value-added tax and income tax had a positive effect on turnovers of service firms.
It is suggested that data should be collected on the sub-sectors in the service sector
to better understand why some service firms are growing faster than others in the
same sector. Our study opens up a number of research avenues for the future on the
contribution of regional economic integration to service sector development, an
analysis of ICT applicability and contribution to old service firms and an empirical
analysis of gender inequalities in the service sector’s growth in Rwanda.
15.5.2 Recommendations
As the Government of Rwanda has opted for driving its economic growth through
service development and aims to become a middle-income country, this study
makes recommendations that can help it speed up the shift from a low-income
to a middle-income economy.
Our study shows that both employees’ development and research and development
activities are negatively correlated with a change in the sales generated in
service firms. According to the literature, research and development has a great
effect on the development of service firms (Madeira et al. 2014). Reducing expenses
on research and development reduces the sales of service firms. On the basis of our
results, we recommend that policymakers encourage research and development in
service firms to increase and sustain their levels of performance in the Rwandan
economy.
The use of ICT in service firms, as expressed by the use of improved technology,
equipment and software for production and the use of the Internet, was a relevant
factor in the sales performance of service firms. A review of microeconomic literature
on service firms reaffirmed ICT as a bedrock for improving business processes,
customer relations and efficient delivery of goods and services to satisfy the needs
of customers (Atom 2013). The study recommends that the government should advance and
sustain the use of ICT in all service firms in Rwanda. In addition, the application of
ICT in service firms can generate multiple effects on their performance, which is in
line with the national aspiration of becoming an ICT regional hub to accelerate the
country’s target of economic growth.
Our study shows that the size of the loan approved and capital used by a service
firm influence its sales. New service firms need support if they are to add value to
the performance of the service sector in the economy. For this reason, the study
suggests that the government promote access to finance for new service firms through,
for instance, setting affordable collateral requirements and extending the time for
paying back an approved loan by giving a sufficient grace period.
In our study, we found both employees’ development and acquisition of internal
or external training to have a great impact on the performance of service firms in
Rwanda. Prior to this study, it was also found that training develops skills,
competencies and abilities and ultimately improves employee performance and
organizational productivity (Amir and Amen 2013). Hence, the government should
promote on-the-job training in service firms to speed up Rwanda’s shift from a
low-income to a middle-income state.
Because the acquisition of fixed assets such as machinery, vehicles, equipment, land
and buildings has multiple effects on innovation, the government should facilitate
the import of necessary fixed assets to be used by service firms. This could take the
form of tax exemptions and incentives depending on the value of the imported fixed
assets. Further, since the acquisition of fixed assets is a proxy indicator of firms’
access to finance, the government should regulate finance in a way that gives firms
easy access to finance from financial institutions, for example by reviewing the
interest rate charged to a firm when it wants to purchase fixed assets.
The key recommendations from the analysis of service firms’ development and
economic growth in Rwanda can be summarized as:
• Advancing and sustaining the use of ICT in all service firms, specifically the use
of the Internet;
• Promoting new service firms’ access to finance in terms of loans to acquire fixed
assets;
• Promoting on-the-job training in service firms to increase the level of firm
performance;
• Putting in place a services innovation policy complementing existing employment
with emphasis on employees’ development and enhanced training strategies;
• Expanding ICT applications for service firms to become mobile based for targeting
the countryside population; and
• Putting in place a foreign trade policy with emphasis on service exports in forms
that benefit from existing economic integration.
Appendix 1: Correlation Matrix of Different Covariates, n = 158

Columns: (1) New methods, (2) New logistics, (3) New practices, (4) New marketing,
(5) New research developments, (6) Employee, (7) Training, (8) Fixed asset

                             (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
New methods                 0.2407
New logistics              −0.0909    0.2972
New practices              −0.0117   −0.0840    0.3197
New marketing              −0.0513   −0.0224   −0.0104    0.2419
New research developments  −0.0092   −0.0172   −0.0262   −0.0136    0.2419
Employee                   −0.0128    0.0054   −0.0699   −0.0381    0.0028    0.2363
Training                   −0.0039   −0.0356   −0.0186    0.0480   −0.0563   −0.0295    0.2228
Fixed asset                 0.0138   −0.0328   −0.0245   −0.0226   −0.0131   −0.0298   −0.0101    0.1980
Loan                       −0.0066   −0.0255   −0.0073    0.0047   −0.0119    0.0210    0.0332   −0.0172
1. Firm size                0.0115   −0.0644    0.0660   −0.0040    0.0435   −0.0713    0.0018   −0.214
2. Firm size               −0.0148   −0.0355    0.630     0.282     0.0304   −0.0624   −0.0236   −0.0742
3. Firm size                0.0049   −0.0589    0.0078    0.0174    0.0028   −0.0327    0.0099   −0.0623
_Cons                      −0.0485    0.0358   −0.1518   −0.0462   −0.0102    0.0072   −0.0161    0.0229

                 Loan      1. Firm size   2. Firm size   3. Firm size   _Cons
Loan             0.1674
1. Firm size    −0.0165    1.0154
2. Firm size    −0.0221    0.9493         1.0868
3. Firm size    −0.0684    0.9390         0.9593         0.6107
_Cons           −0.0518   −0.9038        −0.8926        −0.8474         1.0707
Appendix 2: Logistic Model for Innovation
                    True
Classified      D       ~D      Total
+               87      25      112
−               12      34       46
Total           99      59      158

Classified + if predicted Pr(D) >= 0.5
True D defined as innovation != 0

Sensitivity                       Pr(+ | D)     87.88%
Specificity                       Pr(− | ~D)    57.63%
Positive predictive value         Pr(D | +)     77.68%
Negative predictive value         Pr(~D | −)    73.91%
False + rate for true ~D          Pr(+ | ~D)    42.37%
False − rate for true D           Pr(− | D)     12.12%
False + rate for classified +     Pr(~D | +)    22.32%
False − rate for classified −     Pr(D | −)     26.09%
Correctly classified                            76.58%
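The rates in this appendix can be recomputed directly from the four cell counts of the classification table. The sketch below does so; the function and key names are illustrative, and the counts are those printed above.

```python
# Cell counts from the classification table (rows: classified, columns: true state).
TP, FP = 87, 25  # classified +, true D / true not-D
FN, TN = 12, 34  # classified −, true D / true not-D


def classification_rates(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Return the main classification rates as percentages."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "positive_predictive_value": 100.0 * tp / (tp + fp),
        "negative_predictive_value": 100.0 * tn / (tn + fn),
        "correctly_classified": 100.0 * (tp + tn) / total,
    }


if __name__ == "__main__":
    for name, value in classification_rates(TP, FP, FN, TN).items():
        print(f"{name}: {value:.2f}%")
```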
Appendix 3: Skewed Logistic Regression of Innovation
Skewed logistic regression            Number of obs    = 158
                                      Zero outcomes    = 59
Log likelihood = −80.44222            Nonzero outcomes = 99

Innovation        Coef.       Std. Err.    z         p > |z|   [95% conf. interval]
New methods       0.7162      0.3399       2.107     0.0351     0.0499      1.3824
New logistics     0.1210      0.3527       0.343     0.7316    −0.5703      0.8123
New practices    −0.0990      0.3551      −0.279     0.7804    −0.7950      0.5970
New marketing    −0.1649      0.3152      −0.523     0.6009    −0.7827      0.4529
Research dvpt     0.2385      0.2765       0.862     0.3885    −0.3035      0.7804
Employee dvpt     0.6093      0.3223       1.891     0.0587    −0.0224      1.2410
Training          0.5703      0.2875       1.984     0.0473     0.0068      1.1339
Fixed asset      −0.6896      0.2680      −2.573     0.0101    −1.2149     −0.1643
Loan              0.4360      0.2527       1.725     0.0845    −0.0593      0.9313
Firm size
  1              −0.3431      0.6728      −0.510     0.6101    −1.6617      0.9756
  2               0.0838      0.6807       0.123     0.9021    −1.2504      1.4180
  3               0.5803      0.7563       0.767     0.4429    −0.9020      2.0626
_Cons           −15.5998      1523.5291   −0.010     0.9918    −3.00e+03    2970.4623
/lnalpha         14.5702      1523.5288    0.010     0.9924    −2.97e+03    3000.6318
alpha             2.13e+06    3.24e+09                          0.0000
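The last two rows of the table are linked: alpha is the exponential of /lnalpha. The sketch below checks this and shows how alpha enters the outcome probability, assuming the form usually given for the skewed logit (scobit) model, Pr(y ≠ 0) = 1 − 1/(1 + exp(xb))^alpha, which reduces to the ordinary logit when alpha = 1. The functional form and the helper names are assumptions made for illustration.

```python
import math

LN_ALPHA = 14.5702  # /lnalpha row of the Appendix 3 table


def alpha_from_lnalpha(ln_alpha: float) -> float:
    """Recover the skewness parameter from its logarithm."""
    return math.exp(ln_alpha)


def scobit_probability(xb: float, alpha: float) -> float:
    """Pr(y != 0) = 1 - 1 / (1 + exp(xb))**alpha, evaluated in logs for stability."""
    return 1.0 - math.exp(-alpha * math.log1p(math.exp(xb)))


if __name__ == "__main__":
    alpha = alpha_from_lnalpha(LN_ALPHA)
    print(f"alpha = {alpha:.3e}")
    # With alpha = 1 the formula is the standard logistic function.
    print(f"logit special case at xb = 0: {scobit_probability(0.0, 1.0):.2f}")
```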
References
Acharya R (2016) ICT use and total factor productivity growth: intangible capital or productive
externalities? Oxf Econ J 68(1):16–39
Acharya R, Patel R (2015) Contribution of telecom sector to growth of Indian service sector: an
empirical study. Indian J Sci Technol 8(4):101–105
Acs ZJ, Audretsch DB (1988) Innovation in large and small firms: an empirical analysis. Am Econ
Rev 78(4):678–690
Agwu EM, Carter AL (2014) Mobile phone banking in Nigeria: benefits, problems and prospects.
Int J Bus Commer 3(6):50–70
Alvarez-Cuadrado F, Poschke M (2011) Structural change out of agriculture: labor push versus
labor pull. Am J Macroecon 3(3):127–158
Amir E, Amen I (2013) The effect of training on employee performance. Eur J Bus Manag
5(4):137–147
Arnold JM, Jovorcik B, Lipsomb M, Mattoo A (2016) Services reform and manufacturing
performance: evidence from India. Econ J 129(590):1–39
Arvanitis S, Stucki T (2012) What determines the innovation capability of firm founders? Ind Corp
Change 24(4):1049–1084
Atom B (2013) The impact of information communication technology (ICT) on business. Asian J
Bus Manage Sci 3(2):13–28
Bethapudi A (2013) The role of ICT in tourism Industry. J Appl Econ Bus 1(4):67–79
Borghoff T (2011) The role of ICT in the globalization of firms. J Mod Account Audit 7(10):1128–
1149
Bouazza AB, Ardjouman D, Abada O (2015) Establishing the factors affecting the growth of small
and medium-sized enterprises in Algeria. Am Int J Soc Sci 4(2):101–115
Buera FJ, Kaboski JK (2009) The rise of the service economy. Am Econ Rev 102(6):2540–2569
Chude DI, Nkuru P (2015) Impact of company income taxation on the profitability of companies in
Nigeria: a study of Nigerian breweries. Eur J Account Audit Financ Res 3(8):1–11
Dawkins P, Feeny S, Harris MN (2007) Benchmarking firm performance. Benchmark Int J 14
(6):693–710
Drejer I (2002) Business services as a production factor. Econ Syst Res 14(4):389–405
Du J, Temouri Y (2015) High-growth firms and productivity: evidence from the United Kingdom.
Small Bus Econ 44(1):123–143
Eichengreen B, Gupta P (2011) The service sector as India’s road to economic growth. NBER
El-Said OA, Kattara HS (2013) Customers’ preferences for new technology-based self-services
versus human interaction services in hotels. Tour Hosp Res 13(2):67–82
Fuchs VR (1980) Economic growth and the rise of service employment. NBER working paper no.
486
Gale WG, Krupkin A, Rueben K (2015) The relationship between taxes and growth at the state
level: new evidence. Nat Tax J 68(4):919–942
Geishecker I, Görg H (2013) Services offshoring and wages: evidence from micro data. Oxf Econ J
65(1):124–146
Guerrieri P, Meliciani V (2004) International competitiveness in producer services. Available at
SSRN: https://ssrn.com/abstract=521445
Gupta R (2012) The role of ICTs in facilitating India’s external trade. J Decis Making 12(1):11–15
Hagen B, Zucchella A (2014) Born global or born to run? The long-term growth of born global
firms. Manage Int Rev 54(4):497–525
Halpern L, Koren M, Szeidl A (2015) Imported inputs and productivity. Am Econ Rev 105
(2):3660–3703
Heshmati A, Kim H (2011) The R&D and productivity relationship of Korean listed firms. J Prod
Anal 36(2):125–142
Isaga N (2015) Owner-managers’ demographic characteristics and the growth of Tanzanian small
and medium enterprises. Int J Bus Manage 10(5):168–181
Jafaridehkordi H, Rahim RA, Jafaridehkordi P (2015) Intellectual capital and investment
opportunity set in advanced technology companies. Int J Innov Appl Stud 10(3):1022–1027
Jajri I (2008) Determinants of total factor productivity growth in Malaysia. J Econ Coop 28(3):41–
58
Johnsen GJ, McMahon RGP (2005) Owner-manager gender, financial performance and business
growth amongst SMEs from Australia’s business longitudinal survey. Int Small Bus J 23
(2):115–142
Jones RS, Yoon T (2008) Enhancing the globalisation of Korea. Economic department working
papers no. 614. OECD
Khan KS (2011) Determinants of firm growth: an empirical examination of SMEs in Gujranwala,
Gujarat and Sialkot districts. Interdisci J Contemp Res Bus 3(1):1389–1409
King RG, Levine R (1993) Finance and growth: Schumpeter might be right. Q J Econ 108(3):717–
737
Latha CM, Shanmugam V (2014) Growth of service sector in India. IOSR J Humanit Soc Sci 19
(1):8–12
Lee S, Malin BA (2013) Education’s role in China’s structural transformation. J Dev Econ
101:148–166
Lenaerts K, Merlevede B (2015) Firm size and spillover effects from foreign direct investment: the
case of Romania. Small Bus Econ 45(3):595–616
Liu L, Nath HK (2013) Information and communications technology and trade in emerging market
economies. Emerg Markets Financ Trade 49(6):67–87
Madeira MJ, Jorge S, Sousa G, Moreira J, Mainardes EW (2014) Determinants of innovation
capacity: empirical evidence from service firms. Innov Manage Policy Pract 16(3):404–416
Mihalic T, Praniceric DG, Arneric J (2015) The changing role of ICT competitiveness: the case of
the Slovenian hotel sector. Econ Res Ekon Istraživanja 28(1):367–383
Muro MB, Magutu PO, Getembe KN (2013) The strategic benefits and challenges in the use of
customer relationship management systems among commercial banks in Kenya. Eur Sci J 9
(13):327–349
Nassazi A (2013). Effect of training on employee performance with evidence from Uganda.
Business Economics and Tourism, 1–57
Neely A, Hii J (1998) Innovation and business performance. University of Cambridge, London
Newsom J (2013) Levels of measurement and choosing the correct statistical test. USP 634 data
analysis, Spring 2013, New York
NISR (2014) Establishment census. National Institute of Statistics of Rwanda, Kigali
Oliveira B, Fortunato A (2008) The dynamics of the growth of firms: evidence from the services
sector. Empirica 35(3):293–312
Park D, Shin K (2012) The service sector in Asia: is it an engine of growth? ADB economics
working paper series no. 322
Pop ZW, Stümpel HJ, Bordean ON (2014) From strategic decisions to corporate governance in the
SME sector in Germany. Stud Univ Babeș-Bolyai Oecon 59(3):57–67
Powers AD, Xie Y (1999) Statistical methods for categorical data analysis. Academic Press Inc,
Austin
Prajogo DI, Sohal AS (2006) The integration of TQM and technology/R&D management in
determining quality and innovation performance. Omega 34(3):296–312
Queiro F (2016) The effect of managers education on firm growth. Q J Econ 118(4):1169–1208
RES (2011) Rwanda Enterprise Survey. Rwanda National Institute of Statistics. http://www.
statistics.gov.rw/
RES (2016) Rwanda Enterprise Survey. Rwanda National Institute of Statistics. http://www.
statistics.gov.rw/
Reuber AR, Fischer E (2002) Foreign sales and small firm growth: the moderating role of the
management team. Entrep Theory Pract 27(1):29–45
Sahu S (2015) Source of service sector TFP growth in India: evidence from micro-data. South
Asian J Macroeconom Pub Financ 4(1):62–90
Salehi-Isfahani D (2006) Microeconomics of growth in MENA: the role of households. Contrib
Econ Anal 278:159–194
Schoonjans B, Van Cauwenberge P, Vander Bauwhede H (2013) Does Formal Business
Networking contribute to SME Growth?—An empirical examination. Working Papers of
Faculty of Economics and Business Administration, Ghent University, Belgium from Ghent
University, Faculty of Economics and Business Administration, 2011/708
Schoonjans B, Van Cauwenberge P, Vander Bauwhede H (2013) Formal Business Networking
and SME Growth. Small Business Economics 41(1):169–181
Scott L (1997) Advanced quantitative techniques in the social science. International Educational
and Professional Publisher, New Delhi
Simsek Z, Heavey C (2011) The mediating role of knowledge-based capital for corporate
entrepreneurship effects on performance: a study of small- to medium-sized firms. Strateg
Entrepreneurship J 5(1):81–100
Singh M, Kaur K (2014) India’s service sector and its determinants: an empirical investigation.
J Econ Dev Stud 2(2):385–406
Smith N, Smith V, Verner M (2006) Do women in top management affect firm performance? A
panel study of 2,500 Danish firms. Int J Prod Perform Manage 55(7):569–593
Stock JH, Watson MW (2011) Introduction to econometrics, 3rd edn. Pearson Education Inc,
Boston
Stoilova D, Patonov N (2013) An empirical evidence for the impact of Taxation on economy
growth in the European Union. In: Proceedings TMS international conference 2012: Financial
management, accounting & taxation, vol 2, pp 1031–1039
Tahir M, Azid T (2015) The relationship between international trade openness and economic
growth in the developing economies: Some new dimensions. J Chin Econ Foreign Trade Stud 8
(2):123–139
UN (2008) International standards industrial classification of all economic activities. Revision 4
UNECA (2015) Economic report on Africa 2015: industrializing through trade
Van der Marel E, Shepherd B (2013) International tradability indices for services: policy research
Verbeek M (2004) A guide to modern econometrics, 2nd edn. Wiley, Rotterdam
Watson J (2003) SME performance: does gender matter. A paper for the small enterprise
association of Australia and New Zealand 16th Annual conference. Ballarat
WB (2014) Enterprise surveys: Rwanda country profile 2011. Enterprise surveys country profile.
World Bank Group, Washington, DC
Williams CC (1997) Understanding the role of consumer services in local economic development:
some evidence from the Fens. Environ Plann 28(3):555–571
Wu W, Wu C, Zhou C, Wu J (2012) Political connections, tax benefits and firm performance:
evidence from China. J Account Public Policy 31
(3):277–300
Yeboah O, Naanwaab C, Saleem S, Akuffo A (2012) Effects of trade openness on economic
growth: the case of African countries. Agribusiness, Applied Economics and Agriscience
Education—NC A&T, Birmingham
Yli-Renko H, Autio E, Sapienza HJ (2001) Social capital, knowledge acquisition, and knowledge
exploitation in young technology-based firms. Strateg Manag J 22(6–7):587–613
Zhou Z (2015) The development of service economy: a general trend of the changing economy.
Development Research Center of Shanghai, Shanghai
Zhou H, de Wit G (2009) Determinants and dimensions of firm growth. SCALES EIM research
reports (H200903). Groningen
Chapter 16
Labor-Use Efficiency in Kenyan
Manufacturing and Service Industries
Masoomeh Rashidghalam
Abstract This study uses the labor-use requirement model to estimate labor-use
efficiency of Kenyan manufacturing and service sectors. It also studies the deter-
minants of labor-use efficiency. The data are obtained from the World Bank’s
Enterprise Survey (ES). The Cobb–Douglas functional form of labor-use frontier
estimates shows that wages, sales, capital, fuel, and electricity affected the amount
of labor used in Kenya. The determinants of labor-use efficiency were the man-
ager’s experience, female share, labor training, education, and obstacles. The results
show that the estimated firm labor-use efficiency ranged from 0.14 to 0.87 with a
mean labor-use efficiency value of 0.66. According to the results, most of the firms
operated within the labor-use efficiency range of 0.70–0.80 suggesting that there is
space for improvements in labor use of 20–30% as compared to the firms with best
labor-use practices.
Keywords Firm · Kenya · Labor-use efficiency · Labor-use requirement frontier

JEL Classification Codes C23 · E24 · J23 · L60
16.1 Introduction
The World Bank’s most recent Kenya Economic Update (KEU) (March 2016)
projected a 5.9% growth in 2016, rising to 6% in 2017. The report attributes this
positive outlook to low oil prices, good agricultural performance, a supportive
monetary policy, and ongoing infrastructure investments. According to the latest
Kenya National Bureau of Statistics’ (KNBS) quarterly report, Kenya’s economy
expanded by 6.2% in the second quarter of 2016 as compared to 5.9% in the same
period in 2015. This growth was mainly supported by agriculture, forestry, and
fishing; transportation and storage; real estate; and wholesale and retail trade.
M. Rashidghalam (✉)
Department of Agricultural Economics, University of Tabriz, Tabriz, Iran
e-mail: maso.azar@gmail.com

DOI 10.1007/978-981-10-4451-9_16
Manufacturing, construction, and the financial and insurance sectors slowed down
during this quarter while accommodation and food services, mining and quarrying,
electricity and water supply, and information and communication sectors recorded
improvements.1
Although manufacturing companies in Kenya are small, they are the most
sophisticated in East Africa. Industries in Kenya have been growing since the late
1990s and into the new century. These companies are also relatively diverse. The
transformation of agricultural raw materials, particularly of coffee and tea, remains
the principal industrial activity. Meat and fruit canning, wheat flour and cornmeal
milling, and sugar refining are also important. Production of electronics, vehicle
assembly, publishing, and soda ash processing are all significant. Assembly of
computer components began in 1987. Kenya also manufactures chemicals, textiles,
ceramics, shoes, beer and soft drinks, cigarettes, soap, machinery, metal products,
batteries, plastics, cement, aluminum, steel, glass, rubber, wood, cork, furniture, and
leather goods. It also produces a small number of trucks and automobiles. The most
common manufacturing industries in Kenya include small-scale consumer goods
(plastic, furniture, batteries, textiles, clothing, soaps, cigarettes, and flour), agri-
cultural products, horticulture, oil refining, aluminum industries, steel industries,
lead industries, cement industries, and commercial ship repairs.2
Kenya is also a leading sub-Saharan African (SSA) producer and exporter of
services. According to the World Bank, Kenya has a comparative advantage in
services production. It has the largest service economy in the East African
Community (EAC). It produced $19 billion of services in 2012; this amount
represents almost half of the nation’s GDP and accounted for an estimated 43% of the
EAC’s total services output (Serletis 2014). As East Africa’s distribution hub,
telecommunication axis, and financial center, Kenya has a broad array of
well-developed service industries with an abundance of service suppliers. These
factors make Kenya a promising source of increased exports of services. In addi-
tion, the Government of Kenya is aiming to spur economic growth by promoting
exports of services, including professional services, which are critical for Kenya’s
economic development and also serve as key inputs for East Africa’s growth. In
most of the years, this sector accounts for the largest share of jobs in Kenya.
In 2006, Kenya’s labor force was estimated to include about 12 million workers
of which almost 75% worked in agriculture. About 6 million were employed
outside small-scale agriculture and pastoralism. Approximately 15% of the labor
force was officially classified as unemployed in 2004. As Kenya became increas-
ingly urbanized, the labor force shifted from the countryside to cities (The World
Bank 2015). The service sector absorbed a majority of the inflow of labor to urban
areas. Labor force participation rates for both women and men were broadly stable
between 1997 and 2010. In 1997, 65% of the women were employed in some type
of labor market activity, while the corresponding number for men was 76%
(The World Bank 2015). Around 60% of the women and 70% of the men were in the
labor force in 2005. Their shares increased slightly in 2010, when 61% of the women and
72% of the men were a part of the labor force.
1 http://www.worldbank.org/en/country/kenya/overview.
2 https://softkenya.com/industry/.
In this regard, studying labor-use efficiency in these two main economic sectors
of Kenya is important. Therefore, our study investigates labor-use efficiency and its
determinants in manufacturing and service sectors in Kenya. Labor efficiency is a
measure of how efficiently a given workforce accomplishes a task when compared
to the standard in that industry or setting. As labor efficiency goes up, costs go
down. It may also be possible to increase production because more labor hours are
available for producing goods and services. This will be even more important in
periods of increased demand when a company needs more laborers to make more
goods or offer more services. More efficiency can also translate into wider oppor-
tunities for research and development as a company has workers available to put to
these tasks instead of having to focus on meeting the needs of the production line.
One way of looking at labor efficiency is by comparing the number of hours
actually required to produce a given product or service with those usually spent. If
the workforce is producing products and services in fewer hours than usual, it is
operating with high efficiency, cutting time off production. This can translate into
significant savings as the company will spend less money on wages and overheads
because it is turning out finished services and products at a more efficient rate.3
In particular, we address the following questions in our paper: What are the
levels of labor-use efficiency in manufacturing and service sectors in Kenya and
which factors determine the efficiency of labor in Kenya? The results of our study
will provide researchers and employers with information about how labor and firm
characteristics affect labor-use efficiency.
The rest of the paper is organized as follows. Section 2 includes a brief review of
relevant literature. Section 3 outlines the relevant labor-use requirement model and
determinants of efficiency. Data sources along with identification of inputs and
outputs are reported in Sect. 4. Section 5 discusses the findings from the empirical
analysis, and Sect. 6 gives a conclusion.
16.2 Literature Review
A number of studies in the production, cost, and performance analysis literature
analyze labor-use efficiency, including those by Heshmati and Su (2014), LaFave and
Thomas (2016), Nagler and Naudé (2014), and Ogutu et al. (2014).
Abid and Drine (2011) studied the determinants of the inefficient functioning of
the Tunisian labor market. They took advantage of recent developments in
stochastic frontier techniques and estimated the matching function for Tunisia using
disaggregated data. They included control variables as determinants of matching
efficiency and regional disparities and confirmed that the persistently high rate of
unemployment was the result not only of excess labor supply but also of a mismatch
between supply and demand (by sector, location, and qualification).
3 http://www.wisegeek.com/what-is-labor-efficiency.htm.
Anyiro et al. (2013) examined labor-use efficiency by smallholder yam farmers
in Nigeria. The Cobb–Douglas functional form of labor-use frontier estimates
showed that the quantity of harvested yam, size of the cleared farmland, and the
quantity of fertilizers applied significantly affected the amount of labor used in yam
production. The statistically significant socioeconomic determinants of labor-use
efficiency were age, education, farm size, gender, labor wage, and household size.
According to their results, labor-use efficiency ranged from 0.20 to 0.97 with a
mean labor-use efficiency value of 0.76. Policies aimed at increasing yam farmers’
scale of operations through improved access to production inputs such as fertilizers,
agrochemicals, and capital are required for increasing labor-use efficiency in the
area.
Das et al. (2009) used a data envelopment analysis to measure labor-use effi-
ciency of individual branches of a large public sector bank in India. They intro-
duced the concept of area or spatial efficiency for each region relative to the nation
as a whole. Their findings suggest that the policies, procedures, and incentives
handed down from the corporate level cannot fully neutralize the detrimental
influence of the local work culture across different regions. Most of the potential
reduction in labor cost appeared to come from possible downsizing in the clerical
and subordinate staff.
16.3 Methods
The labor-use requirement frontier model determines the minimum amount of labor
required to produce a given level of output. This model is expressed as (Akanni and
Dada 2012; Anyiro et al. 2013; Martinez and Burns 1999; Masso and Heshmati
2004):
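$$\mathrm{Labor}_i = f(W_i, \mathrm{Output}_i, Z_u; \beta) \qquad (16.1)$$
where
$\mathrm{Labor}_i$: labor-use requirement of firm $i$ (the dependent variable of the frontier model)
$W_i$: real wage
$\mathrm{Output}_i$: sales
$Z_u$: vector characterizing the production process
$\beta$: unknown parameters associated with the determinants of optimal labor use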
Our study estimated a Cobb–Douglas labor-use frontier of the form:
$$\ln \mathrm{Labor} = \beta_0 + \beta_1 \ln \mathrm{Wage} + \beta_2 \ln \mathrm{Capital} + \beta_3 \ln \mathrm{Sale} + \beta_4 \ln \mathrm{Electricity} + \beta_5 \ln \mathrm{Fuel} + \varepsilon \qquad (16.2)$$
where
Ln Labor: natural log of annual employment
Ln Wage: natural log of wage per employee in KES
Ln Capital: natural log of annual investment per employee
Ln Sale: natural log of total sales (in Kenyan shilling, KES)
Ln Electricity: natural log of annual cost of electric energy per labor
Ln Fuel: natural log of annual cost of all fuels per labor (fuel intensity) in KES
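As a minimal sketch of how a log-linear specification such as Eq. 16.2 can be fitted by ordinary least squares, the following pure-Python example simulates data and recovers the coefficients through the normal equations. The data are synthetic, the helper `ols` is hypothetical, and the values in `TRUE_BETA` are borrowed from Table 16.3 purely as illustrative targets; this is not the code used in the study.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], one row per coefficient.
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Illustrative targets only (intercept, wage, capital, sale, electricity, fuel).
TRUE_BETA = [-0.503, -0.261, -0.083, 0.440, 0.031, -0.042]

rng = random.Random(0)  # fixed seed so the simulation is reproducible
X, y = [], []
for _ in range(500):
    # Simulated logged regressors, preceded by a constant term.
    row = [1.0] + [rng.gauss(0.0, 1.0) for _ in range(5)]
    noise = rng.gauss(0.0, 0.1)
    X.append(row)
    y.append(sum(b * x for b, x in zip(TRUE_BETA, row)) + noise)

beta_hat = ols(X, y)
names = ["const", "ln_wage", "ln_capital", "ln_sale", "ln_elec", "ln_fuel"]
for name, b in zip(names, beta_hat):
    print(f"{name:>10s}: {b: .3f}")
```

Because both sides are in logs, each slope estimate reads directly as an elasticity of labor use with respect to the corresponding regressor.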
To study the determinants of labor-use efficiency (LE), the following model was
formulated:
$$\mathrm{LE} = \delta_0 + \delta_1 Z_1 + \delta_2 Z_2 + \cdots + \delta_8 Z_8 \qquad (16.3)$$
where
LE: labor-use efficiency
Z1: experience of manager (in years)
Z2: female share of employees
Z3: training programs for employees (Yes = 1, No = 0)
Z4: average number of years of education of a typical female production worker (years)
Z5: percentage of full-time permanent workers who have completed secondary school (%)
Z6: age of firm (years)
Z7: does the firm face minor and moderate obstacles (Yes = 1, No = 0)
Z8: does the firm face major and severe obstacles (Yes = 1, No = 0).
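Equation 16.3 takes the efficiency score LE as given. One simple way to obtain such a score from a fitted labor requirement function is a corrected-OLS shift, sketched below under stated assumptions: the fitted line is moved down to the smallest residual so that it becomes a lower bound on labor use, and each firm's score is the ratio of frontier labor to observed labor. This deterministic shortcut is used here only for illustration and is not necessarily the estimator behind the chapter's results; the function name and the residual values are hypothetical.

```python
import math


def labor_use_efficiency(residuals):
    """Corrected-OLS style scores for a labor requirement (minimum) frontier.

    The fitted line is shifted down to the smallest residual, so the excess
    log-labor of firm i is u_i = e_i - min(e) >= 0 and its efficiency is
    exp(-u_i), i.e. frontier labor divided by observed labor.
    """
    floor = min(residuals)
    return [math.exp(-(e - floor)) for e in residuals]


# Hypothetical log-labor residuals for four firms (not data from the study).
residuals = [-0.30, 0.00, 0.25, 0.60]
scores = labor_use_efficiency(residuals)
for e, s in zip(residuals, scores):
    print(f"residual {e:+.2f} -> efficiency {s:.3f}")
```

Scores lie in (0, 1], with 1 assigned to the firm that uses the least labor relative to the fitted requirement. A score of 0.66, for example, would mean the firm could in principle produce the same output with 66% of its current labor.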
16.4 Data
The data used in this study are from the World Bank’s Enterprise Survey (ES). As
part of these surveys, the World Bank collects data from key manufacturing and
service sectors in every region of the world. The surveys use standardized survey
instruments and a uniform sampling methodology to minimize measurement errors
and to yield data that is comparable across the world’s economies and as such is
suitable for comparative economic studies. The initial dataset consisted of 670
firm-level observations in Kenya’s manufacturing and service firms in 2013. Data
for estimating labor efficiency determinants comprised dependent, independent,
and characteristic variables. The dependent variable is labor (LABPRO), defined as
the number of workers.
The independent and characteristic variables fall into two categories: main
variables, and labor- and firm-related variables. The main variables are capital intensity,
electricity intensity, fuel intensity, wages, and sales. The capital intensity (CAPINT)
variable is measured as the sum of annual investments in machinery, vehicles, and
equipment, and annual investments in land, buildings, and structures per labor. The
electricity intensity (ELEINT) variable is the annual cost of electric energy per
employee purchased from public or private utility companies or received from other
establishments that belong to the same firm. The fuel intensity (FEUINT) variable is
the annual cost of all fuels per labor which are consumed for heating, power,
transportation, or the generation of electricity.
SALE is the value of all annual sales, counting manufactured goods and goods that
an establishment has bought for trading per labor. For services, it refers to the value
of all the services provided during the year per unit of labor. Finally, the wage
(WAGE) variable is the average wage per employee in a given firm and is obtained
by dividing total wages by the total yearly average number of workers. It includes
wages, salaries, and benefits including food, transport, and social security.
The second category includes eight variables related to employment: experience
of manager (EXPERI), female share of employees (SFEM), training programs for
employees (TRAIN), average number of years of education of a typical female
production worker (FEDUC), percentage of full-time permanent workers who have
completed secondary school (PSEC), age of firm (AGE), does the firm face minor
and moderate obstacles (MMOBS), and does the firm face major and severe
obstacles (MSOBS). The variable training program for employees is a dummy
variable where 1 indicates skill upgrading for a firm’s labor force and zero no skill
upgrading. A firm’s age (AGE) is measured in years. The two obstacle dummies
(MMOBS and MSOBS) are based on the degree to which telecommunications, an
infrastructure service that plays a crucial role in the smooth operations of a firm,
are seen as an obstacle by the firm (TOBSTA).
Table 16.1 provides summary statistics for the input and output variables
and the labor, firm, and market characteristics used in this study. Sales averaged
1170 million Kenyan shilling (KES) with a standard deviation 6.54 times the mean. The
average employment in a sample firm was 98 persons. It varied between 1 and
8000, with a dispersion of 4.32 around the mean value. The ratio of the two
variables, the amount of sales per employee, which measures labor productivity,
varied from 3000 KES to 1720 million KES, with a mean and standard deviation of 14.4
and 90.2 million KES, respectively. The value of investment per employee indicates
considerable variation in the dataset. Mean wage per employee was 1.05 million KES with
a large standard deviation of 7.1 million KES. It varied between 1000 KES and
170 million KES. Energy and fuel intensity variables also showed large variations
among firms.

Table 16.1 Summary statistics of key variables and labor-use efficiency determinants in the
Kenyan (2013) enterprise data (N = 670)

| Variable | Variable definition | Mean | Std. dev. | Minimum | Maximum |
|---|---|---|---|---|---|
| A. Key variables | | | | | |
| Employment | Annual employment | 98 | 424.80 | 1 | 8000 |
| Sale | Total sales (in Kenyan shilling, KES) | 1,170,000,000 | 7,650,000,000 | 90,000 | 120,000,000,000 |
| LABOR | Sale per employee (labor productivity) in KES | 14,400,000 | 90,200,000 | 3000 | 1,720,000,000 |
| CAPINT | Annual investment per employee (capital intensity) in KES | 7,806,039 | 35,000,000 | 0.16 | 484,000,000 |
| EENINT | Annual cost of electric energy per labor (energy intensity) in KES | 238,817 | 1,610,259 | 0.00 | 36,000,000 |
| FENINT | Annual cost of all fuels per labor (fuel intensity) in KES | 4,821,125 | 10,800,000 | 0.00 | 200,000,000 |
| WAGE | Wage per employee in KES | 1,049,785 | 7,095,087 | 1000 | 170,000,000 |
| SALE | Total sales (in Kenyan shilling, KES) | 1,170,000,000 | 7,650,000,000 | 90,000 | 120,000,000,000 |
| B. Labor-use efficiency determinants | | | | | |
| expe | Manager’s experience in years | 18.79 | 10.77 | 2 | 57.00 |
| femsh | Female share of employees | 0.81 | 1.27 | 0 | 9.50 |
| train | Training programs for employees (equals 1 if employees underwent a training program) | 0.45 | 0.50 | 0 | 1.00 |
| feduc | Average number of years of education of a typical female production worker | 11.66 | 2.95 | 0 | 25.00 |
| psec | Percentage of full-time permanent workers who have completed secondary school | 79.18 | 28.09 | 0 | 100.00 |
| age | Enterprise age in years | 24.63 | 18.14 | 2.00 | 108.00 |
| tobst | To what degree are telecommunications an obstacle? | | | | |
| Tobst0 | No obstacle = 0 | 0.41 | 0.49 | 0 | 1.00 |
| Tobst1 | Minor and moderate obstacle = 1 | 0.39 | 0.48 | 0 | 1.00 |
| Tobst2 | Major and severe obstacle = 2 | 0.19 | 0.39 | 0 | 1.00 |

Note US$1 = 99.7 Kenyan shilling on March 13, 2016
The sample average capital intensity per employee was 7.8 million KES with a
standard deviation of 35 million KES. The highest value in the sample, for a
capital-intensive technology firm, was 484 million KES in capital per employee.
Energy (electricity and fuel) use per employee also varied greatly. An
average manager’s experience was about 19 years in our dataset, and it varied
between 2 and 57 years. The average age of firms was 25 years with a standard
deviation of 18 years. It varied between 2 and 108 years. On average, the female labor
share was 0.81, and a typical female production worker had about 12 years of education.
Around 80% of the permanent workers had completed secondary schooling.
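The dispersion figures quoted for sales and employment are ratios of the standard deviation to the mean (coefficients of variation). The short sketch below recomputes them from the rounded entries of Table 16.1; the dictionary name and helper function are illustrative, and small differences from the figures in the text (for example about 4.33 rather than 4.32 for employment) presumably reflect rounding in the published table.

```python
# (mean, standard deviation) pairs copied from Table 16.1
SUMMARY = {
    "Employment": (98.0, 424.80),
    "Sale": (1_170_000_000.0, 7_650_000_000.0),
    "WAGE": (1_049_785.0, 7_095_087.0),
}


def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a multiple of the mean."""
    return std_dev / mean


cv = {name: coefficient_of_variation(m, s) for name, (m, s) in SUMMARY.items()}
for name, value in cv.items():
    print(f"{name:<10s} {value:.2f}")
```

Ratios well above 1 indicate a strongly right-skewed size distribution, which is one reason the frontier is specified in logarithms.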
In order to check for collinearity among the explanatory variables, correlation
coefficients among all the 14 variables are presented in Table 16.2. Labor use, as
expected, was negatively correlated with wages, capital intensity, and fuel intensity; it
correlated positively with sales and electricity intensity. The remaining pairs were only
weakly correlated with each other and did not show any signs of serious multicollinearity.
The age of a firm, training for workers, secondary education of workers, and female
education were positively correlated with labor.
The model in Eq. 16.2 is estimated by ordinary least squares. The labor
requirement frontier is a function of wages, capital, sales, fuel, and electricity, and
the results are presented in Table 16.3. The slope parameters are statistically significant at
conventional significance levels. The elasticities with respect to wages, capital, and
fuel were significantly negative, while those with respect to sales and electricity were
significantly positive. The signs of the wage and sales elasticities were as
expected: the wage elasticity was negative, and the sales elasticity was positive.
Wage elasticity was negative (−0.261) and statistically significant at the less than
1% level. Consistent with theory and our expectations, a higher level of wages
decreased labor demand.
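To make the size of this elasticity concrete, the following sketch converts it into the percentage change in labor use implied by a given percentage change in wages, holding the other regressors fixed. The function name is illustrative, and the 10% wage increase is an arbitrary example rather than a scenario from the study.

```python
import math

WAGE_ELASTICITY = -0.261  # wage coefficient reported in Table 16.3


def labor_change_pct(elasticity, pct_change):
    """Exact percentage change in labor implied by a log-log model.

    In ln L = ... + b * ln W, scaling W by (1 + g) scales L by (1 + g) ** b.
    """
    growth = 1.0 + pct_change / 100.0
    return (math.exp(elasticity * math.log(growth)) - 1.0) * 100.0


effect = labor_change_pct(WAGE_ELASTICITY, 10.0)
print(f"A 10% wage increase changes labor use by {effect:.2f}%")
```

For small changes the result is close to the usual approximation of elasticity times the percentage change, which here would be about −2.6%.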
Sales had the strongest effect on the level of labor use.
In this model, the effect of capital intensity on labor use was negative (−0.083),
implying that labor and capital were competing (substitute) inputs. Sales, a third
significant factor, had the expected positive effect on labor use (0.440), implying
that higher sales require more labor. Two other significant factors that
affected labor use were electricity intensity (0.031) and fuel intensity (−0.042).
The determinants of labor-use efficiency in manufacturing and service sectors in
Kenya are presented in Table 16.3. The table shows that experience, female share,
training, age of the firm, and infrastructure obstacles were statistically significant.
Female share in employment showed a negative relationship (−0.025) with
labor-use efficiency. This implies that increasing the female share will lead to a
decrease in labor-use efficiency. In other words, labor-use efficiency was gender
sensitive, and it can be adduced from the results that firms with a larger male
share were more efficient in the use of labor.
The coefficient for labor training (0.022) was positive and significant
(at the 5% level). This implies that an increase in
the level of training led to an increase in labor-use efficiency. A firm’s age also
showed a positive effect on labor-use efficiency (0.001). This result supports the
argument that labor becomes more efficient in older firms.
As expected, minor and major telecommunication obstacles had negative coefficients
(−0.019 and −0.023, respectively), indicating that such obstacles reduced labor-use
efficiency.
Table 16.4 presents the distribution of labor-use efficiency in the manufacturing
and service sectors in Kenya. According to Table 16.4, in 2013, the mean technical
efficiency of labor in Kenya was 0.66 with a maximum of 0.87 and minimum of
0.14. Therefore, the gap between the average firm and fully efficient labor use was about
0.34. The results also show that about 40.14% of the firms operated within the
labor-use efficiency range of 0.70–0.80. The distribution is concentrated in the upper
part of the range, with a long tail toward low scores, implying a relatively high level of
efficiency.
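The shape of this distribution can be checked directly from the frequencies in Table 16.4. The sketch below computes the percentage shares and a grouped mean that places every firm at the midpoint of its efficiency range; the midpoint assumption and the variable names are illustrative, so the grouped mean only approximates the mean computed from firm-level scores.

```python
# (lower bound, upper bound, frequency) rows copied from Table 16.4
BINS = [
    (0.10, 0.20, 4),
    (0.20, 0.30, 9),
    (0.30, 0.40, 29),
    (0.40, 0.50, 35),
    (0.50, 0.60, 88),
    (0.60, 0.70, 201),
    (0.70, 0.80, 269),
    (0.80, 0.90, 35),
]

total = sum(freq for _, _, freq in BINS)

# Percentage of firms in each efficiency range.
shares = {(lo, hi): 100.0 * freq / total for lo, hi, freq in BINS}

# Grouped mean: every firm in a bin is treated as sitting at the bin midpoint.
grouped_mean = sum((lo + hi) / 2 * freq for lo, hi, freq in BINS) / total

# Share of firms scoring 0.70 or above.
share_070_plus = sum(pct for (lo, _), pct in shares.items() if lo >= 0.70)

print(f"firms: {total}")
print(f"grouped mean efficiency: {grouped_mean:.3f}")
print(f"share at or above 0.70: {share_070_plus:.1f}%")
```

The grouped mean falls between the two reported means of 0.65 and 0.66, and fewer than half of the firms reach 0.70 or higher.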
Table 16.2 Correlation matrix of the variables (N = 670)

llabor lwagei lcapi lsale leleci lfueli expe femsh train feduc psec age obst1 obst2
llabor 1.00
lwagei −0.07 1.00
lcapi −0.28 0.08 1.00
lsale 0.68 0.34 −0.13 1.00
leleci 0.13 0.36 0.02 0.27 1.00
lfueli −0.28 0.18 0.12 −0.09 0.13 1.00
expe 0.23 0.00 −0.06 0.20 0.08 −0.15 1.00
femsh −0.43 0.15 0.21 −0.25 −0.05 −0.16 −0.06 1.00
train 0.28 −0.01 −0.14 0.26 0.03 −0.06 0.00 −0.09 1.00
feduc 0.12 0.05 −0.10 0.11 0.08 0.09 −0.03 −0.12 −0.01 1.00
psec 0.10 0.10 0.06 0.14 0.10 0.10 0.00 −0.16 0.00 0.25 1.00
age 0.31 0.02 −0.08 0.28 0.06 −0.20 0.32 0.02 0.10 −0.01 −0.07 1.00
obst1 0.06 0.02 −0.07 0.09 0.12 −0.12 0.10 0.06 0.02 0.05 0.05 0.09 1.00
obst2 −0.01 0.03 0.00 0.04 −0.06 0.06 −0.07 −0.08 0.06 0.04 −0.02 0.01 −0.39 1.00
Table 16.3 Ordinary least squares parameter estimates of the frontier model and efficiency
determinants (N = 670)

| Variable | Coefficient | Std. err. |
|---|---|---|
| A. Frontier model | | |
| Intercept | −0.503 | 0.367 |
| WAGE | −0.261^a | 0.024 |
| CAPINT | −0.083^a | 0.014 |
| SALE | 0.440^a | 0.017 |
| ELEINT | 0.031^b | 0.013 |
| FEUINT | −0.042^a | 0.007 |
| B. Efficiency model | | |
| Intercept | −0.679^a | 0.025 |
| expe | 0.001 | 0.000 |
| femsh | −0.025^a | 0.004 |
| train | 0.022^b | 0.010 |
| feduc | 0.001 | 0.002 |
| psec | −0.001 | 0.000 |
| age | 0.001^a | 0.000 |
| Tobst1 | −0.019^c | 0.011 |
| Tobst2 | −0.023^c | 0.013 |
| sigma_u | 0.448^a | 0.086 |
| sigma_v | 0.754^a | 0.047 |
| F-value | | |

Note Significant at less than 1% (a), 1–5% (b) and 5–10% (c) levels of significance
Table 16.4 Distribution of labor-use efficiency in Kenyan manufacturing and service sectors
Labor-use efficiency range Frequency Percentage
0.10–0.20 4 0.59
0.20–0.30 9 1.34
0.30–0.40 29 4.32
0.40–0.50 35 5.22
0.50–0.60 88 13.13
0.60–0.70 201 30.00
0.70–0.80 269 40.14
0.80–0.90 35 5.22
Total 670 100.00
Maximum labor-use efficiency 0.87
Minimum labor-use efficiency 0.14
Mean labor-use efficiency 0.65
16.6 Conclusion
This paper analyzed labor-use efficiency at the firm level using data from 670 firms
in the manufacturing and service sectors in Kenya. The data were sourced from the
World Bank’s Enterprise Survey (ES). The paper was concerned with two important issues:
first, modeling labor-use requirements, and second, considering labor-use
efficiency and its determinants. In estimating the labor-use requirement model, we
studied the effects of wages, sales, capital, electricity, and fuel use on labor demand.
Then, to study labor-use efficiency, we considered the manager’s experience,
female share, employee training and education, age of the firm, and infrastructure
obstacles.
The results imply that a firm’s labor use decreased with an increase in wages and
capital, and it increased with an increase in sales and electricity use. Mean technical
efficiency of labor use was 0.66 with a maximum of 0.87 and minimum of 0.14.
Most of the firms operated within the labor-use efficiency range of 0.70–0.80. The
results suggest that firms, on average, can reduce labor use by 20–30% compared to
firms with best practices in the use of labor. According to the results of labor-use
efficiency, training programs for labor increased their ability and hence improved
efficiency. The older a firm, the greater its labor efficiency. Female education
might be regarded as a factor for increased efficiency since it enhances workers’ ability
to read and adopt recommended practices. As the female share of employees
increases, efficiency decreases. Finally, as expected, infrastructure obstacles
decreased efficiency.
References
Abid AB, Drine I (2011) Efficiency frontier and matching process on the labor market: evidence
from Tunisia. Econ Model 28:1131–1139
Akanni AK, Dada AO (2012) Analysis of labor-use patterns among small-holder Cocoa farmers in
South Western Nigeria. J Agric Sci Technol 2:107–113
Anyiro CO, Emerole CO, Osondu CK, Udah SC, Ugorji SE (2013) Labor-use efficiency by
smallholder yam farmers in Abia State Nigeria: a labor-use requirement frontier approach. Int J
Food Agricul Econ 1(1):151–163
Das A, Subhash CR, Nag A (2009) Labor-use efficiency in Indian banking: a branch-level
analysis. Int J Manag Sci 37:411–425
Heshmati A, Su B (2014) Development and sources of labor productivity in Chinese provinces.
China Econ Policy Rev 2(2):1–30
LaFave D, Thomas D (2016) Height and cognition at work: labor market productivity in a low
income setting. Available at doi:10.1016/j.ehb.2016.10.008
Martinez MG, Burns J (1999) Sources of technological development in the Spanish food and drink
industry, a ‘supplier dominated’ industry. Agribusiness 15(4):4431–4448
Masso J, Heshmati A (2004) Optimality and overuse of labor in Estonian manufacturing
enterprises. Econ Transit 12(4):683–720
Nagler P, Naudé W (2014) Labor productivity in rural African enterprises: empirical evidence
from the LSMS-ISA. IZA Discussion Paper No. 8524
Ogutu SO, Okello JJ, Otieno DJ (2014) Impact of information and communication
technology-based market information services on smallholder farm input use and productivity:
the case of Kenya. World Dev 64:311–321
Serletis G (2014) Kenya’s services output and exports are among the highest in Sub-Saharan
Africa. USITC Exec Brief Trade 202:205–3315
The World Bank (IBRD-IDA) (2015) Labor force participation rate, female (per cent of female
population ages 15+) (modeled ILO estimate). Available at http://data.worldbank.org/indicator/
SL.TLF.CACT.FE.ZS?
DOI 10.1007/978-981-10-4451-9
Journal of Statistical and Econometric Methods, vol.5, no.4, 2016, 63-91
ISSN: 1792-6602 (print), 1792-6939 (online)
Scienpress Ltd, 2016
Autoregressive Distributed Lag (ARDL) cointegration technique:
application and interpretation
Emeka Nkoro 1 and Aham Kelvin Uko 2
1 Department of Economics, University of Port Harcourt, Port Harcourt, Nigeria.
E-mail: nkoro23@yahoo.co.uk
2 Department of Economics, University of Port Harcourt, Port Harcourt, Nigeria.
Article Info: Received: September 2, 2016. Revised: October 4, 2016.
Published online: December 1, 2016.
Abstract
Economic theory suggests that a long run relationship exists between the variables
under consideration. Conventional estimation assumes that the properties of this long
run relationship are intact, that is, that the means and variances of the variables are
constant and do not depend on time. However, most empirical research has
shown that time series variables do not satisfy this constancy of means and variances.
In attempting to resolve this problem, most cointegration
techniques are wrongly applied, estimated, and interpreted. One of these techniques
is the Autoregressive Distributed Lag (ARDL) cointegration technique, or bounds
cointegration technique. Hence, this study reviews the issues surrounding the way
cointegration techniques are applied, estimated, and interpreted within the context
of the ARDL cointegration framework. The study shows that the adoption of the
ARDL cointegration technique does not require pretests for unit roots, unlike other
techniques. Consequently, the ARDL cointegration technique is preferable when
dealing with variables that are integrated of different orders, I(0), I(1), or a
combination of both, and it is robust when there is a single long run relationship
between the underlying variables in a small sample. The long run relationship
of the underlying variables is detected through the F-statistic (Wald test). In this
approach, a long run relationship of the series is said to be established when the
F-statistic exceeds the upper bound of the critical value band. The major advantage of
this approach lies in its identification of the cointegrating vectors where there are
multiple cointegrating vectors. However, this technique breaks down in the presence of an
integrated stochastic trend of order two, I(2). To avoid a futile effort, it may be advisable
to test for unit roots, though not as a necessary condition. For the sake of forecasting and
policy, there is a need to explore the necessary conditions that give rise to the
ARDL cointegration technique in order to avoid its wrongful application,
estimation, and interpretation. If these conditions are not followed, the result may be
model misspecification and inconsistent and unrealistic estimates, with
implications for forecasting and policy. This paper cannot claim to have
treated the underlying issues in their greatest detail, but it endeavours to
provide sufficient insight into the issues surrounding the ARDL cointegration
technique to enable young practitioners to apply, estimate, and
interpret it properly, and to follow discussions of the issues in more advanced
texts.
Keywords: Cointegration; Unit Roots; Autoregressive Distributed Lag
Cointegration technique; Error Correction Mechanism
JEL Classification: C5; C51; C52
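The decision rule summarized in the abstract can be sketched in a few lines of code. The sketch assumes the usual bounds-testing convention of Pesaran et al. (2001), in which the lower critical bound corresponds to all regressors being I(0) and the upper bound to all regressors being I(1). The function name and the numerical bounds used in the example are purely illustrative and are not tabulated critical values.

```python
def bounds_test_decision(f_stat: float, lower_bound: float, upper_bound: float) -> str:
    """Classify an ARDL bounds-test F-statistic against a critical value band.

    lower_bound: critical value assuming all regressors are I(0)
    upper_bound: critical value assuming all regressors are I(1)
    """
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "cointegration"      # long run relationship established
    if f_stat < lower_bound:
        return "no cointegration"   # null of no long run relationship not rejected
    return "inconclusive"           # F-statistic falls inside the band


# Illustrative (made-up) band; real bounds depend on the number of regressors,
# the deterministic terms, and the chosen significance level.
LOWER, UPPER = 3.0, 4.0
for f in (5.2, 3.5, 1.8):
    print(f"F = {f:.1f}: {bounds_test_decision(f, LOWER, UPPER)}")
```

When the statistic falls inside the band, the outcome depends on the order of integration of the regressors, which is one reason a unit root pretest can still be informative even though the technique does not strictly require it.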
1 Introduction
Theoretically, economic analysis suggests that there is a long run relationship
between the variables under consideration. Oftentimes, econometricians and researchers
have ignored the inherent dynamic features of most time series when
analyzing them and formulating traditional regression models. It was assumed
that the underlying time series were stationary, or at least stationary around a
deterministic trend, and that they exhibited a long run relationship. Hence, it was
normal to formulate an econometric model in the conventional way, assuming that
the means and variances of the variables were constant and did not depend on time.
The estimated models were then used to analyze theories formulated at an abstract
level and to forecast, evaluate, and simulate policies.
Recent developments in econometrics have, however, revealed that
most time series are often not stationary, as was conventionally thought, and that
different time series may not display the same features. It is possible to see
some time series diverge from their mean over
time while others converge to their mean over time. Time series that diverge
from their mean over time are said to be non-stationary. Classical
estimation of relationships between such variables often gives misleading
inferences, that is, spurious regression.
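The danger of spurious regression can be illustrated with a small Monte Carlo experiment. The sketch below is not taken from the paper: it regresses one simulated series on another, independent one and records how often the slope appears significant at the nominal 5% level, first for stationary white noise and then for random walks. The sample size, number of replications, seed, and function names are all assumptions chosen for illustration.

```python
import numpy as np


def ols_slope_stats(y: np.ndarray, x: np.ndarray) -> tuple[float, float]:
    """Return (t-statistic of the slope, R-squared) for y = a + b*x + e."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 2)             # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)        # classical OLS covariance
    t_slope = beta[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return float(t_slope), float(r2)


def rejection_rate(random_walk: bool, n_obs: int = 200,
                   n_reps: int = 500, seed: int = 0) -> float:
    """Share of replications in which |t| > 1.96 for two independent series."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        e1 = rng.standard_normal(n_obs)
        e2 = rng.standard_normal(n_obs)
        # Cumulating white noise produces a random walk, i.e. an I(1) series.
        y, x = (np.cumsum(e1), np.cumsum(e2)) if random_walk else (e1, e2)
        t_stat, _ = ols_slope_stats(y, x)
        rejections += int(abs(t_stat) > 1.96)
    return rejections / n_reps


rate_stationary = rejection_rate(random_walk=False)
rate_random_walk = rejection_rate(random_walk=True)
print(f"Rejection rate, independent white noise:  {rate_stationary:.2f}")
print(f"Rejection rate, independent random walks: {rate_random_walk:.2f}")
```

For the stationary series the rejection rate should stay close to the nominal level, whereas for the independent random walks it should be far higher, even though no relationship exists between the series by construction.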
To overcome the problem of non-stationarity and of prior restrictions on the lag
structure of a model, econometric analysis of time series data has increasingly
moved towards cointegration. The reason is that cointegration is a
powerful way of detecting the presence of a steady state equilibrium between
variables. Cointegration has become an overriding requirement for any economic
model using non-stationary time series data. If the variables do not cointegrate,
we face the problem of spurious regression and the results become
almost meaningless. If, on the other hand, the variables do cointegrate, then a
stable long run relationship exists between them.
In applied econometrics, the Granger (1981) and Engle and Granger (1987)
procedure, the Autoregressive Distributed Lag (ARDL) cointegration technique or
bound test of cointegration (Pesaran and Shin 1999 and Pesaran et al. 2001), and
the Johansen and Juselius (1990) cointegration technique have become the solution
to determining the long run relationship between series that are non-stationary, as
well as reparameterizing them to the Error Correction Model (ECM). The
reparameterized result gives the short-run dynamics and long run relationship of
the underlying variables. However, despite the versatility of the cointegration
technique in estimating relationships between non-stationary variables and
reconciling the short run dynamics with long run equilibrium, most researchers
still adopt the conventional way of estimation even when the need to test for
cointegration among the variables under consideration is glaring. That is, most
researchers are not conversant with the conditions that necessitate the application
of a cointegration test or with the interpretation of its results, and hence present
misleading inferences.
With this background, the objective of this paper is to examine the conditions that
necessitate the application of the Autoregressive Distributed Lag (ARDL)
cointegration or bound test of cointegration technique and its interpretation.
Accordingly, this paper is divided into five sections. Section one is the
introduction. Section two examines the concept of stationarity, section three
focuses on various unit root tests, section four deals with the ARDL cointegration
approach, and section five presents the summary and conclusions.
2 Stationary and Non-Stationary Series Concept

A non-stationary time series is a stochastic process with unit roots or structural
breaks. Unit roots are, however, a major source of non-stationarity: the presence
of a unit root implies that the time series under consideration is non-stationary,
while its absence implies that the series is stationary. A non-stationary stochastic
process could be a Trend Stationary (deterministic) Process (TSP) or a Difference
Stationary Process (DSP). A time series is said to be a trend stationary process if
the trend is completely predictable and not variable, whereas if the trend is not
predictable, we call it an integrated stochastic trend and the series a difference
stationary process. In the case of a deterministic trend, the deviations from the
trend (which represents the non-stationary mean) are purely random and die out
quickly. They do not contribute to or affect the long run development of the time
series. However, in the case of an integrated stochastic trend, the random
component (Ut) affects the long run development of the series. Before time series
with these features can be used in any meaningful empirical analysis, they must be
purged of this trend. This is referred to as detrending of the series. It can be
carried out in two ways, depending on whether the series is a difference stationary
process or a trend stationary process. If a series is a DSP, it has a unit root and
its first difference is stationary. Therefore, the solution to the non-stationarity
is to difference the series. If a series is a TSP, it exhibits a deterministic trend,
and such a trend stationary variable with a non-constant mean will be I(0) after
removal of the deterministic trend. That is, regressing such a series on time (t)
yields residuals that are stationary (Yt = βt + Ut). Hence, cointegration is not an
all-purpose remedy; its application is restricted. It should be made clear that if a
time series is a TSP but is treated as a DSP, this is called over-differencing. On
the other hand, if a time series is a DSP but is treated as a TSP, this is referred
to as under-differencing. The implications of these types of specification error can
be serious, depending on how the serial correlation properties of the resulting
error terms are handled. However, it has been observed that most time series are
DSP rather than TSP. When such non-stationary time series (DSP) are used in the
estimation of an econometric model, the traditional Ordinary Least Squares (OLS)
diagnostic statistics for evaluating the validity of the model estimates, such as
the coefficient of determination (R2), Fisher's ratio (F-statistic), the
Durbin-Watson statistic (DW) and the t-statistic, become highly misleading and
unreliable in terms of forecast and policy. In such series, the mean, variance,
covariance and autocorrelation functions change over time and affect the long run
development of the series. The presence of a unit root in these series violates the
OLS assumptions of constant means and variances. This review dwells on the
Difference Stationary Process rather than the Trend Stationary Process, since most
time series are Difference Stationary Processes.
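The two detrending routes described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the series are
simulated, the trend slope 0.5 and the helper names detrend and difference are
hypothetical, and the regression on time includes an intercept.

```python
import random


def detrend(series):
    """Residuals from an OLS regression of the series on a constant and time t."""
    n = len(series)
    t = list(range(n))
    t_bar = sum(t) / n
    y_bar = sum(series) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series)) / sxx
    intercept = y_bar - slope * t_bar
    return [yi - intercept - slope * ti for ti, yi in zip(t, series)]


def difference(series):
    """First difference Y_t - Y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


random.seed(0)
shocks = [random.gauss(0, 1) for _ in range(200)]

# TSP: deterministic trend plus stationary noise, Y_t = 0.5 t + U_t
tsp = [0.5 * t + u for t, u in enumerate(shocks)]

# DSP: random walk, Y_t = Y_{t-1} + U_t (shocks accumulate permanently)
dsp, level = [], 0.0
for u in shocks:
    level += u
    dsp.append(level)

tsp_resid = detrend(tsp)    # stationary after removing the trend
dsp_diff = difference(dsp)  # stationary after differencing (recovers the shocks)
```

Differencing the random walk returns the original shocks, while the trend
stationary series only needs its fitted trend removed.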
As discussed above, many time series variables are stationary only after
differencing. However, using differenced variables in regressions implies a loss of
relevant long run information about the equilibrium relationship between the
variables under consideration. This means that we have to devise a way of
retaining the relevant long run information of the variables. Cointegration makes it
possible to retrieve the long run information about the relationship between the
considered variables that is lost on differencing. That is, it integrates
short run dynamics with long run equilibrium. This is the basis for obtaining
realistic estimates of a model, which is the driver of meaningful forecasts and
policy implementation. Cointegration is a preferred step for modeling empirically
meaningful relationships between DSP series. Cointegration is concerned with the
analysis of long run relations between integrated variables and with
reparameterizing the relationship between the considered variables into an Error
Correction Model (ECM). The conventional Granger (1981) and Engle and Granger
(1987) cointegration analysis is not applicable to variables that are integrated of
different orders (i.e. series A is I(1) and series B is I(0)), while the Johansen and
Juselius (1990) and ARDL cointegration procedures are. The ARDL
cointegration technique is used in determining the long run relationship between
series with different orders of integration (Pesaran and Shin, 1999, and Pesaran et
al. 2001). The reparameterized result gives the short-run dynamics and long run
relationship of the considered variables.
Although the ARDL cointegration technique does not require pre-testing for unit
roots, to avoid the ARDL model breaking down in the presence of an integrated
stochastic trend of order two, I(2), we are of the view that unit root tests should
be carried out to know the number of unit roots in the series under consideration.
This is presented in the next section.
3 Unit Root Stochastic Process

Given a first-order autoregressive model, of which the Random Walk Model (RWM)
is a special case;
Yt = ρYt-1 + Ut (3.1)
-1 ≤ ρ ≤ 1
where Ut is white noise, normally distributed with zero mean and unit variance. If
ρ = 1, (3.1) is a RWM without drift and we are faced with the unit root problem,
that is, a situation of non-stationarity. In this case the variance of Yt is not
constant but grows with t. However, if |ρ| < 1, that is, if the absolute value of ρ is
less than one, then the series Yt is stationary, and it follows that
E(Yt) = 0 and Var(Yt) = 1/(1 - ρ2).
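The variance result can be checked numerically. The sketch below assumes the
process starts at Y0 = 0, so that Yt is a weighted sum of the last t shocks; the
function names are hypothetical and the code only illustrates the formula.

```python
def ar1_variance(rho, t, sigma2=1.0):
    """Var(Y_t) for Y_t = rho * Y_{t-1} + U_t with Y_0 = 0 and Var(U_t) = sigma2.

    Y_t = sum_{j=0}^{t-1} rho**j * U_{t-j}, so the variance is
    sigma2 * sum_{j=0}^{t-1} rho**(2j).
    """
    return sigma2 * sum(rho ** (2 * j) for j in range(t))


def stationary_variance(rho, sigma2=1.0):
    """Limiting variance sigma2 / (1 - rho**2), defined only for |rho| < 1."""
    if abs(rho) >= 1:
        raise ValueError("no finite stationary variance when |rho| >= 1")
    return sigma2 / (1.0 - rho ** 2)


# |rho| < 1: the variance settles at 1 / (1 - rho^2)
print(ar1_variance(0.5, 200), stationary_variance(0.5))

# rho = 1 (random walk): the variance equals t and keeps growing
print(ar1_variance(1.0, 50), ar1_variance(1.0, 100))
```

With ρ = 1 the sum has t terms each equal to one, which is why the variance of
a random walk is proportional to time.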
A stochastic process Yt is said to have a unit root if it is non-stationary in
levels but its first difference, (Yt - Yt-1), is stationary. In practice, the
presence of a unit root shows that the time series under consideration is
non-stationary. A series with a unit root has no tendency to return to a long-run
deterministic path, and the variance of the series is time dependent. A series with
a unit root suffers permanent effects from random shocks and thus follows a random
walk. Consequently, when time series that contain a unit root are used as dependent
or independent variables in regression analysis, the classical results of the
regression may be misleading. However, I(1) variables that follow a random walk
without drift have a mean that is constant over time (an expected value of zero)
together with a trending variance, whereas a series that returns to its long-run
path after removal of a deterministic trend is trend stationary rather than
difference stationary. This re-emphasizes that cointegration is not an all-purpose
remedy; its application is restricted. This paper focuses on series with a unit
root, I(1) (no constant mean and variance), that have no tendency of returning to
the long-run path.
There are various methods of testing for unit roots. They include the Durbin-Watson
(DW) test, the Dickey-Fuller (1979) (DF) test, the Augmented Dickey-Fuller (1981)
(ADF) test and the Phillips-Perron (1988) (PP) test, among others. It is advisable,
before pursuing formal tests, to plot the time series under consideration to
determine the likely features of the series, and to run the classical regression.
If the series is trending upwards, the mean of the series has been changing with
time. In the case of the classical regression, a very low Durbin-Watson statistic
together with a high R2 (Granger and Newbold, 1974) suggests that the series is not
stationary. This is the initial step towards a more formal test of stationarity. The
most popular strategy for testing the stationarity property of a single time series
involves using the Dickey-Fuller or Augmented Dickey-Fuller test. The choice of the
right test depends on the set-up of the problem that is of interest to the
practitioner. It is difficult to follow the latest advances or to understand the
trade-offs between the various tests, but this should not be taken as a reason for
not performing other types of unit root tests. Comparing the results from different
test methods is a good way of testing the sensitivity of your conclusions. Once you
understand how these tests work, and their limitations, you will understand when to
use each test and what its meaning and purpose are. However, when a test result is
inconclusive, the usual way is to continue the analysis with a warning note, or
simply to assume one of the alternatives. The unit root test is basically required
to ascertain the number of times a variable/series has to be differenced to achieve
stationarity. From this comes the definition of integration: a variable Y is said to
be integrated of order d, [I(d)], if it attains stationarity after differencing d
times (Engle and Granger, 1987).
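The mechanics behind the definition can be illustrated with a small sketch:
cumulating a series d times builds in d unit roots, and differencing d times
removes them again. The helper names integrate and difference and the example
numbers are hypothetical.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def integrate(series, d=1):
    """Cumulate (partial-sum) the series d times, the inverse of differencing."""
    out = list(series)
    for _ in range(d):
        total, cumulated = 0, []
        for value in out:
            total += value
            cumulated.append(total)
        out = cumulated
    return out


shocks = [3, -1, 4, 1, -5, 9]       # stand-in for a stationary I(0) series
i2_series = integrate(shocks, d=2)  # cumulated twice: behaves like I(2)

# Differencing twice recovers the underlying shocks (minus two lost observations)
print(difference(i2_series, d=2))
```

Each round of differencing costs one observation, which is why the recovered
series is shorter than the original.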
3.1 The Durbin-Watson Test
This is a simple but unreliable test for a unit root. To understand how the
test works, recall that the DW value is calculated as DW = 2(1 − ρ̂) (Harvey,
1981), where ρ̂ is the estimated first order autocorrelation. Thus, if Yt is a
random walk, ρ equals unity and the DW value is zero. Under the null that Yt is
a random walk, Yt = Yt−1 + Vt, the estimated first order autocorrelation of the
series approaches one, so the DW value approaches 0. A DW value significantly
different from zero rejects the hypothesis that Yt is a random walk and I(1), in
favor of the alternative that Yt is not I(1), and perhaps I(0). The test is limited
by the assumption that Yt is a random walk variable, and it is not good for
integrated variables in general. The critical value at the 5% level for the
maintained hypothesis of I(1) versus I(0) is 0.17. A higher value rejects I(1)
(Bo Sjö, 2008).
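A minimal sketch of the calculation follows. It assumes the DW statistic is
computed on the demeaned series itself, and it uses a smoothly trending series
purely as a stand-in for a highly persistent one; the function names are
hypothetical.

```python
def _demean(series):
    mean = sum(series) / len(series)
    return [v - mean for v in series]


def durbin_watson(series):
    """DW = sum (e_t - e_{t-1})^2 / sum e_t^2, with e_t the demeaned series."""
    e = _demean(series)
    num = sum((b - a) ** 2 for a, b in zip(e, e[1:]))
    return num / sum(v * v for v in e)


def first_order_autocorrelation(series):
    """Estimated first-order autocorrelation of the demeaned series."""
    e = _demean(series)
    return sum(a * b for a, b in zip(e, e[1:])) / sum(v * v for v in e)


persistent = list(range(100))  # highly persistent: DW close to zero
dw = durbin_watson(persistent)
rho_hat = first_order_autocorrelation(persistent)

print(dw, 2 * (1 - rho_hat))            # DW is approximately 2(1 - rho_hat)
print(durbin_watson([1, -1, 1, -1]))    # strongly negative autocorrelation
```

The approximation DW ≈ 2(1 − ρ̂) differs from the exact value only through the
first and last observations, so it tightens as the sample grows.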
3.2 Dickey-Fuller (DF) (1979) Test for Unit Roots

Assume that Yt follows the first-order autoregression Yt = αYt−1 + ut, which is
a random walk process, Yt = Yt−1 + ut, when α = 1. Subtract Yt−1 from both sides
of the equation,
Yt - Yt-1 = αYt-1 - Yt-1 + ut (3.2)
ΔYt = (α-1)Yt-1 + ut (3.3)
ΔYt = (α-1)Yt-1 + α2T + ut (3.4)
Where α-1 = ρ1 (so that ρ1 = ρ - 1 in the notation of (3.1)), Δ is the first
difference operator, T is the trend factor and ut is a white noise residual.
ΔYt = ρ1Yt-1 + ut (3.5)
With a drift we have;
ΔYt = α0 + ρ1Yt-1 + ut (3.6)
In practice, we test the hypothesis that ρ1 = 0. If ρ1 = 0, α in equation 3.2 will
be equal to 1, meaning that we have a unit root. Therefore, the series under
consideration is non-stationary. If ρ1 < 0 (that is, α < 1), the time series is
stationary with zero mean in the case of 3.5 and, in the case of 3.4, stationary
around a deterministic trend. If ρ1 > 0 (that is, α > 1), the underlying variable
will be explosive.
However, in conducting the DF test as in (3.3) or (3.4), it is assumed that the
error term ut is serially uncorrelated. Where the error terms are correlated, the
Augmented Dickey-Fuller (ADF) test is resorted to, since it adjusts the DF test to
take care of possible autocorrelation in the error terms by adding lagged
difference terms of the dependent variable, ∆Yt.
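The DF regression without drift (3.5) can be sketched with nothing more than
sums of squares and cross-products. The simulated series, the seed and the
function names below are assumptions made for illustration; the resulting
t-ratio must be compared with Dickey-Fuller critical values, not with the usual
Student-t tables.

```python
import math
import random


def dickey_fuller(series):
    """OLS of dY_t on Y_{t-1} with no constant; returns (coefficient, t-ratio)."""
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series, series[1:])]
    sxx = sum(v * v for v in y_lag)
    coef = sum(x * d for x, d in zip(y_lag, dy)) / sxx
    resid = [d - coef * x for x, d in zip(y_lag, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)  # residual variance
    return coef, coef / math.sqrt(s2 / sxx)


def simulate_ar1(alpha, n, seed):
    """Simulate Y_t = alpha * Y_{t-1} + u_t with standard normal shocks."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for _ in range(n):
        y = alpha * y + rng.gauss(0, 1)
        out.append(y)
    return out


coef_rw, t_rw = dickey_fuller(simulate_ar1(1.0, 500, seed=1))  # unit root
coef_st, t_st = dickey_fuller(simulate_ar1(0.5, 500, seed=1))  # stationary

print(coef_rw, t_rw)  # coefficient close to zero
print(coef_st, t_st)  # coefficient close to alpha - 1 = -0.5, large negative t
```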
3.3 The Augmented Dickey-Fuller (ADF) (1981) tests for Unit Root
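Restrictive ADF Model: $\Delta Y_t = \rho_1 Y_{t-1} + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t$ (3.7)
Restrictive ADF Model: $\Delta Y_t = \rho_1 Y_{t-1} + \alpha_2 T + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t$ (3.8)
General ADF Model: $\Delta Y_t = \alpha_0 + \rho_1 Y_{t-1} + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t$ (3.9)
General ADF Model: $\Delta Y_t = \alpha_0 + \rho_1 Y_{t-1} + \alpha_2 T + \sum_{i=1}^{k} \alpha_i \Delta Y_{t-i} + u_t$ (3.10)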
ut is a pure white noise error term and ∆Yt-1 = (Yt-1 − Yt-2), ∆Yt-2 = (Yt-2 − Yt-3), etc.
The number of lagged difference terms to be included is often determined
empirically, the aim being to include enough terms so that the error terms in (3.7)
to (3.10) are serially uncorrelated. k is the number of lagged values of ∆Y included
to control for higher-order correlation, assuming that the series follows an AR(p)
process. In the ADF test, ρ1 = 0 is still tested and the test statistic follows the
same asymptotic distribution as the DF statistic. H0: ρ1 = 0 (Yt ∼ I(1)), against
Ha: ρ1 < 0 (Yt ∼ I(0)).
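A hedged sketch of how the ADF regressions might be set up with NumPy is given
below. The function adf_statistic and its trend argument are hypothetical names,
not a library API; the code only builds the regressor matrix for the chosen
variant and returns the t-ratio on Yt-1, which still has to be compared with
Dickey-Fuller critical values.

```python
import numpy as np


def adf_statistic(y, k, trend="c"):
    """t-ratio on Y_{t-1} in an ADF regression with k lagged differences.

    trend="n": no deterministic terms (3.7); "c": constant (3.9);
    "ct": constant and linear trend (3.10).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rows = len(dy) - k                        # usable observations
    target = dy[k:]                           # dY_t
    cols = [y[k:-1]]                          # Y_{t-1}
    for i in range(1, k + 1):
        cols.append(dy[k - i:len(dy) - i])    # dY_{t-i}
    if trend in ("c", "ct"):
        cols.append(np.ones(rows))
    if trend == "ct":
        cols.append(np.arange(rows, dtype=float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    s2 = resid @ resid / (rows - X.shape[1])  # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[0] / np.sqrt(cov[0, 0])


rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)
random_walk = np.cumsum(shocks)               # unit root series
stationary = np.zeros(500)
for t in range(1, 500):
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]

stat_rw = adf_statistic(random_walk, k=2, trend="c")
stat_st = adf_statistic(stationary, k=2, trend="c")
print(stat_rw, stat_st)
```

In applied work an established routine would normally be preferred, since it
also supplies the appropriate critical values and p-values.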
In practice, a DF or ADF statistic that is smaller in absolute value than its
critical value shows that the underlying series is non-stationary. Conversely, a DF
or ADF statistic that is greater in absolute value than its critical value shows
that the underlying series is stationary. However, failure to reject the null
hypothesis of non-stationarity on the basis of the ADF test is not conclusive, since
the power of the test is low. The decision can be verified using other related
tests, such as the Kwiatkowski-Phillips-Schmidt-Shin (1992) (KPSS) or
Phillips-Perron (PP) test. The PP test has the same null hypothesis as the ADF test,
and its asymptotic distribution is the same as that of the ADF test statistic. In
the case of the KPSS test, the null hypothesis is different: it assumes stationarity
of the variable of interest. The results from the ADF test also differ from those of
KPSS in that KPSS does not provide a p-value, showing different critical values
instead. In this case, the test statistic is compared with the critical value at the
desired significance level. If the test statistic is higher than the critical value,
we reject the null hypothesis, and when the test statistic is lower than the
critical value, we cannot reject the null hypothesis. When the tests conflict, the
choice depends on the researcher's aim and objective. In general, the null
hypothesis for ADF is that the series is non-stationary, while for KPSS it is that
the series is stationary. As for the treatment of serial correlation, PP corrects
for it non-parametrically while ADF corrects for it parametrically.
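Because the two tests have opposite null hypotheses, their outcomes can be
combined into a simple decision table. The sketch below is illustrative only:
the function names are hypothetical, and the critical values passed in the
example are placeholders that the user would replace with tabulated values for
the chosen specification and significance level.

```python
def adf_rejects_unit_root(adf_stat, adf_crit):
    """ADF is left-tailed: reject non-stationarity when the statistic is below
    (more negative than) the critical value."""
    return adf_stat < adf_crit


def kpss_rejects_stationarity(kpss_stat, kpss_crit):
    """KPSS is right-tailed: reject stationarity when the statistic exceeds
    the critical value."""
    return kpss_stat > kpss_crit


def classify(adf_stat, adf_crit, kpss_stat, kpss_crit):
    unit_root_rejected = adf_rejects_unit_root(adf_stat, adf_crit)
    stationarity_rejected = kpss_rejects_stationarity(kpss_stat, kpss_crit)
    if unit_root_rejected and not stationarity_rejected:
        return "stationary"
    if stationarity_rejected and not unit_root_rejected:
        return "non-stationary"
    return "conflicting"  # both reject or neither rejects


# Placeholder critical values (assumed 5% values for a model with a constant)
print(classify(-4.10, -2.86, 0.10, 0.463))  # both tests agree: stationary
print(classify(-1.20, -2.86, 0.90, 0.463))  # both tests agree: non-stationary
print(classify(-1.20, -2.86, 0.10, 0.463))  # neither rejects: conflicting
```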
The test can also be performed on variables in first differences as a test for I(2).
Under the null, the estimate of ρ1 will be negatively biased in a limited sample;
thus, unless Yt is explosive, only negative values are expected. A significant
positive value implies an explosive process, which can be a very difficult
alternative hypothesis to handle. When testing for I(2), that is, after differencing
twice, a trend term is not a plausible alternative. The two interesting models here
are the ones with and without a constant term. Furthermore, the lag length in the
augmentation can also be assumed to be shorter.
However, it is a good strategy to start with the model containing both a
constant and a trend (3.10), because this model is the least restricted. If a unit
root is rejected here, due to a significant ρ1, there is no need to continue
testing. If ρ1 = 0 cannot be rejected, a model without a time trend may be
preferable because of its improved efficiency. There is also the lag length in the
augmentation to consider (Bo Sjö, 2008).
A substantial weakness of the original Dickey-Fuller test (equation 3.3), as
stated earlier, is that it does not take account of possible autocorrelation in the
error process ut. If ut is autocorrelated (that is, it is not white noise), then the
OLS estimates of the equation and of its variants are inefficient. The simple
solution is to apply the ADF test, using lagged differences of the dependent
variable as explanatory variables to take care of the autocorrelation.
The choice of the number of lags (p) to be included in the unit root test is based
on the significant lags of the autocorrelation function (ACF) and the partial
autocorrelation function (PACF), as plotted in the correlogram and partial
correlogram. The value of p is taken to be the number of lags at which the ACF cuts
off or the number of lags of the PACF that are significantly different from zero.
As a rule of thumb, we compute the ACF up to one-quarter to one-third of the length
of the time series. The ACF and PACF values at different lags are compared with the
confidence bounds, mostly at the 95 percent level. This leads to an AR process
chosen in cognizance of the properties of the residual (Uko and Nkoro, 2012). The
characteristics of a time series have far reaching implications for economic policy
formulation and implementation. When a series has a unit root (ρ1 = 0), any
shock to the data series is long lasting. Hence, there will be a cumulative
divergence from the mean/trend of the series. The instability exhibited by such a
series will tend to render inefficient any policy formulated and implemented on the
basis of a model estimated using it. This is because what drives any
policy formulation and implementation is the clear assumption of the stability of
the series.
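The ACF-based screening of lags described above can be sketched as follows. The
approximate 95 percent bounds of ±1.96/√n and the default of one-quarter of the
sample length follow the rule of thumb in the text; the function names and the
toy series are hypothetical.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_0, r_1, ..., r_max_lag of the demeaned series."""
    n = len(series)
    mean = sum(series) / n
    e = [v - mean for v in series]
    denom = sum(v * v for v in e)
    return [
        sum(e[t] * e[t - lag] for t in range(lag, n)) / denom
        for lag in range(max_lag + 1)
    ]


def significant_lags(series, max_lag=None):
    """Lags whose autocorrelation lies outside the approximate 95% bounds."""
    n = len(series)
    if max_lag is None:
        max_lag = n // 4  # rule of thumb: one-quarter of the sample length
    bound = 1.96 / math.sqrt(n)
    r = acf(series, max_lag)
    return [lag for lag in range(1, max_lag + 1) if abs(r[lag]) > bound]


alternating = [1, -1] * 20  # toy series with strong negative autocorrelation
print(acf(alternating, 2))
print(significant_lags(alternating))
```

The same comparison against the bounds applies to the PACF, which would need a
separate calculation not shown here.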
The Augmented Dickey-Fuller (ADF) test is often preferred because of its
popularity and wide application. The ADF test adjusts the DF test to take care of
possible autocorrelation in the error terms by adding lagged difference terms of the
dependent variable. The PP test also takes care of the autocorrelation in the error
term, and its asymptotic distribution is the same as that of the ADF test statistic.
However, ADF is more commonly used because of its easy applicability.
4 Cointegration Test
Modeling time series in a way that keeps their long-run information intact can be
done through cointegration. Granger (1981) and Engle and Granger (1987) were
the first to formalize the idea of cointegration, providing tests and an estimation
procedure to evaluate the existence of a long-run relationship between a set of
variables within a dynamic specification framework. A cointegration test examines
how time series, which may be individually non-stationary and drift extensively
away from equilibrium, can be paired such that the workings of equilibrium forces
ensure that they do not drift too far apart. That is, cointegration involves a
stationary linear combination of variables that are individually non-stationary but
integrated of some order, I(d). Cointegration is an econometric concept that mimics
the existence of a long-run equilibrium among underlying economic time series that
converge over time. Thus, cointegration establishes a stronger statistical and
economic basis for the empirical error correction model, which brings together
short and long-run information in modeling variables. Testing for cointegration is a
necessary step to establish whether a model empirically exhibits meaningful long run
relationships. If cointegration among the underlying variables cannot be
established, it becomes imperative to work with the variables in differences
instead; however, the long run information will then be missing. There are several
tests of cointegration other than the Engle and Granger (1987) procedure; among them
is the Autoregressive Distributed Lag cointegration technique or bound cointegration
testing technique, which is the focal point of this paper.
4.1 Autoregressive Distributed Lag Model (ARDL) Approach to
Cointegration Testing or Bound Cointegration Testing Approach
When one cointegrating vector exists, the Johansen and Juselius (1990)
cointegration procedure cannot be applied. Hence, it becomes imperative to explore
the Autoregressive Distributed Lag (ARDL) approach to cointegration, or bound
procedure, proposed by Pesaran and Shin (1995) and Pesaran et al. (1996b) for
testing for a long-run relationship irrespective of whether the underlying variables
are I(0), I(1) or a combination of both. In such a situation, the application of the
ARDL approach to cointegration will give realistic and efficient estimates. Unlike
the Johansen and Juselius (1990) cointegration procedure, the ARDL approach to
cointegration helps in identifying the cointegrating vector(s). That is, each of the
underlying variables stands as a single long run relationship equation. If one
cointegrating vector (i.e. the underlying equation) is identified, the ARDL model of
the cointegrating vector is reparameterized into an ECM. The reparameterized result
gives the short-run dynamics (i.e. the traditional ARDL) and the long run
relationship of the variables of a single model. The re-parameterization is
possible because the ARDL is a dynamic single-equation model and of the same
form as the ECM. A distributed lag model simply means the inclusion of
unrestricted lags of the regressors in a regression function.
This cointegration testing procedure specifically helps us to know whether the
underlying variables in the model are cointegrated or not, given the endogenous
variable. However, when there are multiple cointegrating vectors, the ARDL approach
to cointegration cannot be applied, and the Johansen and Juselius (1990) approach
becomes the alternative. The next sections set out the requirements for using this
approach and its application.
The ARDL(p,q1,q2,...,qk) model specification is given as follows;
$\Phi(L,p)\,y_t = \sum_{i=1}^{k} \beta_i(L,q_i)\,x_{it} + \delta' w_t + u_t$ (4.1)
where
$\Phi(L,p) = 1 - \Phi_1 L - \Phi_2 L^2 - \dots - \Phi_p L^p$
$\beta_i(L,q_i) = \beta_{i0} + \beta_{i1} L + \beta_{i2} L^2 + \dots + \beta_{iq_i} L^{q_i}$, for i = 1, 2, ..., k, $u_t \sim iid(0, \sigma^2)$.
L is a lag operator such that $L^0 y_t = y_t$, $L^1 y_t = y_{t-1}$, and $w_t$ is an s × 1 vector of
deterministic variables such as the intercept term, time trends, seasonal dummies,
or exogenous variables with fixed lags. With p = 0, 1, 2, ..., m, qi = 0, 1, 2, ..., m
and i = 1, 2, ..., k, there are a total of $(m+1)^{k+1}$ different ARDL models. The
maximum lag order, m, is chosen by the user. The sample period is
t = m+1, m+2, ..., n.
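OR
The ARDL(p,q) model specification:
$\Phi(L)\,y_t = \varphi + \theta(L)\,x_t + u_t$, (4.2)
with
$\Phi(L) = 1 - \Phi_1 L - \dots - \Phi_p L^p$,
$\theta(L) = \beta_0 + \beta_1 L + \dots + \beta_q L^q$.
Hence, the general ARDL(p,q1,q2,...,qk) model;
$\Phi(L)\,y_t = \varphi + \theta_1(L)\,x_{1t} + \theta_2(L)\,x_{2t} + \dots + \theta_k(L)\,x_{kt} + u_t$ (4.3)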
Using the lag operator L, applied to each component of a vector so that
$L^k y_t = y_{t-k}$, it is convenient to define the lag polynomial Ф(L,p) and the
vector polynomial β(L,q). As long as it can be assumed that the error term ut is a
white noise process, or more generally, is stationary and independent of xt, xt-1, …
and yt-1, yt-2, …, the ARDL models can be estimated consistently by ordinary least
squares.
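As a hedged illustration of OLS estimation, the sketch below simulates an
ARDL(1,1) model with one I(1) regressor and no deterministic terms, then
recovers the coefficients with NumPy least squares. The parameter values, seed
and variable names are assumptions made for the example; the long run
coefficient is computed as the sum of the coefficients on x divided by one
minus the coefficient on lagged y.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Hypothetical data generating process:
# y_t = phi * y_{t-1} + theta0 * x_t + theta1 * x_{t-1} + u_t
phi, theta0, theta1 = 0.5, 0.3, 0.2
x = np.cumsum(rng.standard_normal(n))  # I(1) regressor (random walk)
u = rng.standard_normal(n)             # white noise error
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + theta0 * x[t] + theta1 * x[t - 1] + u[t]

# Regress y_t on y_{t-1}, x_t and x_{t-1} by ordinary least squares
X = np.column_stack([y[:-1], x[1:], x[:-1]])
coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
phi_hat, theta0_hat, theta1_hat = coef

# Implied long run coefficient: (theta0 + theta1) / (1 - phi)
long_run = (theta0_hat + theta1_hat) / (1.0 - phi_hat)
print(phi_hat, theta0_hat, theta1_hat, long_run)
```

With the assumed values the true long run coefficient is (0.3 + 0.2)/(1 − 0.5)
= 1, and the estimate should lie close to it in a sample of this size.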
4.2 Requirements for the Application of Autoregressive Distributed
Lag Model (ARDL) Approach to Cointegration Testing
• Irrespective of whether the underlying variables are I(0) or I(1) or a
combination of both, the ARDL technique can be applied. This helps to avoid
the pretesting problems associated with standard cointegration analysis,
which requires the classification of the variables into I(0) and I(1). This
means that the bound cointegration testing procedure does not require the
pre-testing of the variables included in the model for unit roots and is robust
when there is a single long run relationship between the underlying
variables.
• If the F-statistic (Wald test) establishes that there is a single long run
relationship and the sample size is small or finite, the ARDL error
correction representation becomes relatively more efficient.
• If the F-statistic (Wald test) establishes that there are multiple long-run
relations, the ARDL approach cannot be applied. Hence, an alternative
approach like Johansen and Juselius (1990) can be applied. That is, if the
single equations estimated with each underlying variable in turn as the
dependent variable show a feedback effect (multiple long run relationships)
between the variables, then a multivariate procedure needs to be employed.
• If the trace or maximal eigenvalue statistic or the F-statistic establishes that
there is a single long-run relationship, the ARDL approach can be applied
rather than the Johansen and Juselius approach.
To determine whether the above requirements are met or not, see section 4.4.
4.3 Advantages of ARDL Approach
• Since each of the underlying variables stands as a single equation,
endogeneity is less of a problem in the ARDL technique because it is free of
residual correlation (i.e. all variables are assumed endogenous). Also, it
enables us to analyze the reference model.
• When there is a single long run relationship, the ARDL procedure can
distinguish between dependent and explanatory variables. That is, the
ARDL approach assumes that only a single reduced form equation
relationship exists between the dependent variable and the exogenous
variables (Pesaran, Smith, and Shin, 2001).
• The major advantage of this approach lies in its identification of the
cointegrating vectors where there are multiple cointegrating vectors.
• The Error Correction Model (ECM) can be derived from the ARDL model
through a simple linear transformation, which integrates short run
adjustments with long run equilibrium without losing long run information.
The associated ECM takes a sufficient number of lags to capture the
data generating process in a general-to-specific modeling framework.
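The linear transformation mentioned in the last point can be made concrete for
the simplest case. The derivation below is a worked sketch for an ARDL(1,1)
without deterministic terms, written in the Ф and θ notation used for the ARDL
model; restricting attention to one lag of each variable is an assumption made
for brevity.

```latex
\begin{aligned}
y_t &= \Phi_1 y_{t-1} + \theta_0 x_t + \theta_1 x_{t-1} + u_t \\
y_t - y_{t-1} &= -(1-\Phi_1)\, y_{t-1} + \theta_0 (x_t - x_{t-1})
                 + (\theta_0 + \theta_1)\, x_{t-1} + u_t \\
\Delta y_t &= \theta_0 \Delta x_t
              - (1-\Phi_1)\left[\, y_{t-1} - \frac{\theta_0+\theta_1}{1-\Phi_1}\, x_{t-1} \right] + u_t
\end{aligned}
```

Here θ0 captures the short run impact of x, (θ0 + θ1)/(1 − Ф1) is the long run
coefficient, and −(1 − Ф1) is the speed of adjustment, which is negative
whenever |Ф1| < 1.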
4.4 The steps of the ARDL Cointegration Approach
This sub-section explores how one determines whether the above requirements
are met.
Step 1: Determination of the Existence of the Long Run Relationship of the
Variables
At the first stage, the existence of a long-run relation between the variables
under investigation is tested by computing the bound F-statistic (bound test for
cointegration). The bound F-statistic is computed for each of the variables in
turn, treating it as the endogenous variable while the others are assumed to be
exogenous.
In practice, testing the relationship between the forcing variable(s) in the
ARDL model leads to hypothesis testing of the long-run relationship among the
underlying variables. In doing this, current values of the underlying variable(s)
are excluded from the ARDL model approach to cointegration.
This approach is illustrated by using an ARDL(p,q) regression with an I(d)
regressor,
yt = Ф1yt-1 + … + Фpyt-p + θ0xt + θ1xt-1 + … + θqxt-q + u1t (4.4)
or
xt = Ф1xt-1 + … + Фpxt-p + θ0yt + θ1yt-1 + … + θqyt-q + u2t (4.5)
t = 1, 2, …, T; ut ~ iid(0, σ2).
For convenience, deterministic regressors such as a constant and linear time
trend are not included. Ф, θ0 and θ1 are unknown parameters, and xt (or yt)
is an I(d) process generated by;
xt = xt-1 + εt;
or
yt = yt-1 + εt;
ut and εt are uncorrelated at all lags, so that xt (or yt) is strictly exogenous
with respect to ut. εt is a general linear stationary process.
(Cointegration/stability condition) |Ф| < 1, so that the model is dynamically stable.
This assumption is similar to the stationarity condition for an AR(1) process and
implies that there exists a stable long-run relationship between yt (xt) and xt (yt).
If Ф = 1, then there would be no long-run relationship. In practice, this can also
be denoted as follows:
The ARDL(p,q1,q2,...,qk) model approach to cointegration testing;
$\Delta X_t = \delta_0 + \sum_{i=1}^{k} \alpha_{1i} \Delta X_{t-i} + \sum_{i=1}^{k} \alpha_{2i} \Delta Y_{t-i} + \delta_1 X_{t-1} + \delta_2 Y_{t-1} + v_{1t}$ (4.6)
$\Delta Y_t = \delta_0 + \sum_{i=1}^{k} \alpha_{1i} \Delta Y_{t-i} + \sum_{i=1}^{k} \alpha_{2i} \Delta X_{t-i} + \delta_1 Y_{t-1} + \delta_2 X_{t-1} + v_{2t}$ (4.7)
k is the maximum lag order of the ARDL model and is chosen by the user. The
F-statistic is computed for the joint null hypothesis that the coefficients of the
lagged level variables (δ1 and δ2 on Xt−1 and Yt−1 in (4.6), or on Yt−1 and Xt−1 in
(4.7)) are zero. δ1 and δ2 correspond to the long-run relationship, while α1 and α2
represent the short-run dynamics of the model.
The hypothesis that the coefficients of the lagged level variables are zero is to
be tested.
The null of non-existence of the long-run relationship is defined by;
Ho: δ1 = δ2 = 0 (null, i.e. the long run relationship does not exist)
H1: δ1 ≠ 0 or δ2 ≠ 0 (alternative, i.e. the long run relationship exists)
This is tested in each of the models as specified by the number of variables.
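The F-statistic for this joint restriction can be sketched by comparing the
residual sums of squares of the restricted regression (without the lagged
levels) and the unrestricted regression (with them). The code below is an
illustration under stated assumptions: the function names are hypothetical, the
data are simulated with a built-in error correction coefficient of −0.5, and
the statistic must be compared with the bounds critical values rather than the
standard F tables.

```python
import numpy as np


def ols_rss(X, y):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def bounds_f_statistic(y, x, k=1):
    """F-statistic for delta1 = delta2 = 0 in the regression of dY_t on a
    constant, k lags of dY and dX, and the lagged levels Y_{t-1}, X_{t-1}."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    dy, dx = np.diff(y), np.diff(x)
    rows = len(dy) - k
    target = dy[k:]
    short_run = [np.ones(rows)]
    for i in range(1, k + 1):
        short_run.append(dy[k - i:len(dy) - i])  # dY_{t-i}
        short_run.append(dx[k - i:len(dx) - i])  # dX_{t-i}
    levels = [y[k:-1], x[k:-1]]                  # Y_{t-1}, X_{t-1}
    X_r = np.column_stack(short_run)             # restricted model
    X_u = np.column_stack(short_run + levels)    # unrestricted model
    rss_r, rss_u = ols_rss(X_r, target), ols_rss(X_u, target)
    q = len(levels)                              # number of restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (rows - X_u.shape[1]))


rng = np.random.default_rng(7)
n = 400
x = np.cumsum(rng.standard_normal(n))            # I(1) forcing variable
v = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    # y adjusts by half of last period's gap between y and x
    y[t] = y[t - 1] - 0.5 * (y[t - 1] - x[t - 1]) + v[t]

f_stat = bounds_f_statistic(y, x, k=1)
print(f_stat)
```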
This can also be denoted as follows:
FX(X1│Y1,. . . . . Yk) (4.8)
Fy(Y1│X1,. . . . . Xk) (4.9)
The hypothesis is tested by means of the F-statistic (Wald test) in equations 4.8
and 4.9, respectively. The distribution of this F-statistic is non-standard,
irrespective of whether the variables in the system are I(0) or I(1). The critical
values of the F-statistic for different numbers of variables (K), and for whether
the ARDL model contains an intercept and/or trend, are available in Pesaran and
Pesaran (1996a) and Pesaran et al. (2001). They give two sets of critical values:
one set assuming that all the variables are I(0) (i.e. the lower critical bound,
which assumes all the variables are I(0), meaning that there is no cointegration
among the underlying variables) and another assuming that all the variables in the
ARDL model are I(1) (i.e. the upper critical bound, which assumes all the variables
are I(1), meaning that there is cointegration among the underlying variables). For
each application, there is a band covering all the possible classifications of the
variables into I(0) and I(1). However, according to Narayan (2005), the existing
critical values in Pesaran et al. (2001) cannot be applied for small sample sizes
as they are based on large sample sizes. Hence, Narayan (2005) provides a set of
critical values for small sample sizes, ranging from 30 to 80 observations. The
critical values are 2.496 - 3.346, 2.962 - 3.910, and 4.068 - 5.250 at 90%, 95%,
and 99%, respectively.
If the relevant computed F-statistic for the joint significance of the level
variables in each of the equations(4.6 and 4.9), δ1, and δ2 falls outside this band, a
conclusive decision can be made, without the need to know whether the underlying
variables are I(0) or I(1), or fractionally integrated. That is, when the computed F-
statistic is greater than the upper bound critical value, then the H0 is rejected (the
variables are cointegrated). If the F-statistic is below the lower bound critical value,
then the H0 cannot be rejected (there is no cointegration among the variables). If
long run (or multiple long-run relationships) relationships exist in both equations
(4.8 and 4.9) the ARDL approach cannot be applied, hence, Johansen and Juselius
(1990) approach becomes the alternative.
If the computed statistic falls within the critical value band (between the lower
and upper bounds), the result of the inference is inconclusive and depends on
whether the underlying variables are I(0) or I(1). It is at this stage in the analysis
that the investigator may have to carry out unit root tests on the variables (Pesaran
and Pesaran, 1996a). Also, if the variables are I(2), the computed F-statistics of the
bounds test are rendered invalid because they are based on the assumption that the
variables are I(0) or I(1) or mutually cointegrated (Chigusiwa et al., 2011).
Therefore, to avoid wasted effort, it may be advisable to first perform unit root
tests, though not as a necessary condition, in order to ensure that none of the
variables is I(2) or beyond before carrying out the bounds F-test.
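The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `bounds_test_decision` and the example F-statistics are hypothetical, and the band used is the 95% band quoted in the text (2.962 – 3.910).

```python
def bounds_test_decision(f_stat, lower, upper):
    """Classify a bounds-test F-statistic against a lower/upper critical value band."""
    if lower >= upper:
        raise ValueError("lower bound must be smaller than upper bound")
    if f_stat > upper:
        return "reject H0: evidence of a long-run relationship"
    if f_stat < lower:
        return "do not reject H0: no evidence of a long-run relationship"
    return "inconclusive: check the order of integration of the variables"


# 95% band quoted in the text; the F-statistics below are made-up inputs.
LOWER_95, UPPER_95 = 2.962, 3.910
for f in (5.10, 3.40, 1.80):
    print(f, "->", bounds_test_decision(f, LOWER_95, UPPER_95))
```

The function only encodes the comparison with the band; it does not check that none of the variables is I(2), which the text recommends verifying separately.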
Step 2: Choosing the Appropriate Lag Length for the ARDL Model/
Estimation of the Long Run Estimates of the Selected ARDL Model
If a long-run relationship exists between the underlying variables, while the
hypothesis of no long-run relationship between the variables in the other equations
cannot be rejected, then the ARDL approach to cointegration can be applied. Finding
the appropriate lag length for each of the underlying variables in the ARDL model
is very important because we want Gaussian error terms (i.e. normally distributed
error terms that do not suffer from autocorrelation, heteroskedasticity, etc.).
In order to select the appropriate model of the long-run underlying equation, it is
necessary to determine the optimum lag length (k) by using proper model order
selection criteria such as the Akaike Information Criterion (AIC), Schwarz Bayesian
Criterion (SBC) or Hannan-Quinn Criterion (HQC).
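The values of AIC, SBC, HQC and LR for model 4.3 are given by:
$$\mathrm{AIC}_p = -\frac{n}{2}\left(1 + \log 2\pi\right) - \frac{n}{2}\log\hat{\sigma}^2 - P$$
$$\mathrm{SBC}_p = \log(\hat{\sigma}^2) + \frac{\log n}{n}\,P$$
$$\mathrm{HQC}_p = \log(\hat{\sigma}^2) + \frac{2\log\log n}{n}\,P$$
$$\mathrm{LR}_{p,P} = n\left(\log[\hat{\Sigma}_p] - \log[\hat{\Sigma}_P]\right)$$
where $\hat{\sigma}^2$ is the Maximum Likelihood (ML) estimator of the variance of the regression
disturbances, $\hat{\Sigma}_p$ is the estimated sum of squared residuals for the model of order p, $n$ is
the sample size, and p = 0, 1, 2, ..., P, where P is the optimum order of the model
selected.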
The ARDL model should be estimated with the variables in their levels
(non-differenced) form. The lags of the variables should be varied, and the model
re-estimated and compared. Model selection criteria: the model with the smallest
AIC or SBC estimates, or with small standard errors and a high R², performs relatively
better. The estimates from the best-performing model become the long-run coefficients.
This step is appropriate only once it is established that there is a long-run
relationship between the underlying variables, in order to avoid spurious regression.
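The lag search described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a single regressor, uses the "smaller is better" forms log(σ̂²) + 2k/n for AIC and log(σ̂²) + k·log(n)/n for SBC with k the number of estimated coefficients, and the names `ardl_design` and `select_lags` as well as the simulated data are hypothetical.

```python
import numpy as np


def ardl_design(y, x, p, q, max_lag):
    """Build the ARDL(p, q) regressors on a common sample starting at max_lag."""
    rows = np.arange(max_lag, len(y))
    cols = [np.ones(len(rows))]                        # intercept
    cols += [y[rows - j] for j in range(1, p + 1)]     # y_{t-1}, ..., y_{t-p}
    cols += [x[rows - j] for j in range(0, q + 1)]     # x_t, ..., x_{t-q}
    return y[rows], np.column_stack(cols)


def select_lags(y, x, max_lag=4):
    """Return the (p, q) pairs that minimise AIC and SBC respectively."""
    table = []
    for p in range(0, max_lag + 1):
        for q in range(0, max_lag + 1):
            yy, X = ardl_design(y, x, p, q, max_lag)
            n, k = X.shape
            beta, *_ = np.linalg.lstsq(X, yy, rcond=None)
            resid = yy - X @ beta
            sigma2 = float(resid @ resid) / n          # ML variance estimate
            aic = np.log(sigma2) + 2.0 * k / n
            sbc = np.log(sigma2) + k * np.log(n) / n
            table.append((p, q, aic, sbc))
    best_aic = min(table, key=lambda r: r[2])[:2]
    best_sbc = min(table, key=lambda r: r[3])[:2]
    return best_aic, best_sbc


# Simulated ARDL(1, 1) data with illustrative coefficients (not estimates).
rng = np.random.default_rng(0)
n_obs = 400
x = np.cumsum(rng.standard_normal(n_obs))
y = np.zeros(n_obs)
for t in range(1, n_obs):
    y[t] = 0.5 * y[t - 1] + 1.0 * x[t] + 0.8 * x[t - 1] + rng.standard_normal()

best_aic, best_sbc = select_lags(y, x)
print("AIC picks (p, q) =", best_aic, "; SBC picks (p, q) =", best_sbc)
```

All candidate models are estimated on the same sample so that the criteria are comparable. Because SBC penalises additional lags more heavily than AIC, it never selects a larger model than AIC on a common sample.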
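The long-run coefficients for the response of yt (or xt) to a unit change in xt (or yt) are
estimated by
$$\hat{\theta}_i = \frac{\hat{\beta}_i(1, \hat{q}_i)}{\hat{\phi}(1, \hat{p})} = \frac{\hat{\beta}_{i0} + \hat{\beta}_{i1} + \dots + \hat{\beta}_{i\hat{q}_i}}{1 - \hat{\phi}_1 - \hat{\phi}_2 - \dots - \hat{\phi}_{\hat{p}}}, \quad i = 1, 2, \dots, k$$
where $\hat{p}$ and $\hat{q}_i$, i = 1, 2, ..., k, are the selected (estimated) values of p and $q_i$,
i = 1, 2, ..., k.
Similarly, the long-run coefficients associated with the deterministic/exogenous
variables with fixed lags are estimated by
$$\hat{\psi} = \frac{\hat{\delta}(\hat{p}, \hat{q}_1, \hat{q}_2, \dots, \hat{q}_k)}{1 - \hat{\phi}_1 - \hat{\phi}_2 - \dots - \hat{\phi}_{\hat{p}}}$$
where $\hat{\delta}(\hat{p}, \hat{q}_1, \hat{q}_2, \dots, \hat{q}_k)$ denotes the OLS estimate of δ in equation 4.1 for the
selected ARDL model.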
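As a small numerical illustration of the long-run formula for a single regressor, the sketch below divides the sum of the estimated distributed-lag coefficients by one minus the sum of the autoregressive coefficients. The function name `long_run_coefficient` and the coefficient values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def long_run_coefficient(beta_lags, phi_lags):
    """Long-run effect: sum of beta coefficients over (1 - sum of phi coefficients)."""
    denom = 1.0 - sum(phi_lags)
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("1 - sum(phi) is zero: no finite long-run coefficient")
    return sum(beta_lags) / denom


# Hypothetical ARDL(1, 1) estimates: beta_0 = 0.30, beta_1 = 0.10, phi_1 = 0.60.
theta = long_run_coefficient([0.30, 0.10], [0.60])
print(round(theta, 6))  # (0.30 + 0.10) / (1 - 0.60), i.e. approximately 1
```

The guard on the denominator reflects the fact that the formula is undefined when the autoregressive coefficients sum to one.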
In practice, this can also be denoted as follows. The selected ARDL(k) model long-run
equation is
$$Y_t = \delta_0 + \sum_{i=1}^{k} \alpha_1 X_{1t} + \sum_{i=1}^{k} \alpha_2 X_{2t} + \sum_{i=1}^{k} \alpha_3 X_{3t} + \dots + \sum_{i=1}^{k} \alpha_n X_{nt} + v_{1t} \quad (4.10)$$
where the Xs ($X_{1t}, X_{2t}, X_{3t}, \dots, X_{nt}$) are the explanatory or long-run forcing variables and k
is the optimum lag order.
The best-performing model provides the estimates of the associated Error Correction
Model (ECM).
Step 3: Reparameterization of ARDL Model into Error Correction Model
As noted earlier, when non-stationary variables are regressed in a model, we
may get results that are spurious. One way of resolving this is to difference the
data (since most data exhibit a difference-stationary process, DSP) in order to achieve
stationarity of the variables. In this case, the estimates of the parameters from the
regression model may be correct and the spurious regression problem resolved. However,
the differenced regression equation only gives us the short-run relationship between
the variables. It does not give any information about the long-run behaviour of the
parameters in the model. This constitutes a problem since researchers are mainly
interested in long-run relationships between the variables under consideration. To
resolve this, the concept of cointegration and the ECM becomes imperative. With the
specification of the ECM, both long-run and short-run information are incorporated.
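The unrestricted error correction model associated with the
ARDL($\hat{p}, \hat{q}_1, \hat{q}_2, \dots, \hat{q}_k$) model can be obtained by rewriting equation 4.1 in terms
of the lagged levels and the first differences of $y_t, x_{1t}, x_{2t}, \dots, x_{kt}$ and $w_t$. First note that
$$y_t = \Delta y_t + y_{t-1}$$
$$y_{t-s} = y_{t-1} - \sum_{j=1}^{s-1} \Delta y_{t-j}, \quad s = 1, 2, \dots, p$$
and similarly,
$$w_t = \Delta w_t + w_{t-1}$$
$$x_{it} = \Delta x_{it} + x_{i,t-1}$$
$$x_{i,t-s} = x_{i,t-1} - \sum_{j=1}^{s-1} \Delta x_{i,t-j}, \quad s = 1, 2, \dots, q_i$$
Substituting these relations into 4.1 we have
$$\Delta y_t = -\phi(1, \hat{p})\, EC_{t-1} + \sum_{i=1}^{k} \beta_{i0} \Delta x_{it} + \delta' \Delta w_t - \sum_{j=1}^{\hat{p}-1} \varphi_j \Delta y_{t-j} - \sum_{i=1}^{k} \sum_{j=1}^{\hat{q}_i - 1} \beta_{ij} \Delta x_{i,t-j} + \mu_t \quad (4.11)$$
where $EC_t$ is the error correction term defined by
$$EC_t = \varepsilon_t = y_t - \sum_{i=1}^{k} \hat{\theta}_i x_{it} - \hat{\psi}' w_t$$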
The coefficient on ECt−1 is the speed of adjustment parameter or feedback effect.
The error correction term itself is derived as the residual from the cointegration
models (4.6 and 4.7), whose coefficients are obtained by normalizing the equation on
Xt (4.6) and Yt (4.7) respectively. The coefficient on ECt−1 shows how much of the
disequilibrium is being corrected, that is, the extent to which any disequilibrium
in the previous period is adjusted in yt. A positive coefficient indicates
divergence, while a negative coefficient indicates convergence. If the estimated
coefficient is 1 in absolute value, then 100% of the adjustment takes place within
the period, i.e. the adjustment is instantaneous and full; if it is 0.5, then 50%
of the adjustment takes place each period/year. A coefficient of 0 shows that there
is no adjustment, and a claim that there is a long-run relationship no longer makes
sense.
Recall that ϕ(1,ˆp) = 1 − ˆϕ1 − ˆϕ2 − . . . − ˆϕp measures the quantitative importance
of the error correction term. The remaining coefficients, φj and βij, relate to the
short-run dynamics of the model’s convergence to equilibrium. ECt is the series of
residuals obtained from the estimated cointegration model of equations 4.6 and 4.7.
The ARDL model and its associated ECM can be estimated by the OLS
method.
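As a worked illustration of the reparameterization, consider the simplest case. This is a sketch under stated assumptions: a single regressor, an ARDL(1,1) model, an intercept $a$ as the only deterministic term, and an error term $u_t$; these symbols and the numerical values are introduced here for illustration only.

$$y_t = a + \phi_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t$$

Subtracting $y_{t-1}$ from both sides and using $x_t = \Delta x_t + x_{t-1}$ gives

$$\Delta y_t = a - (1 - \phi_1)\, y_{t-1} + \beta_0 \Delta x_t + (\beta_0 + \beta_1)\, x_{t-1} + u_t$$

$$\Delta y_t = a - (1 - \phi_1)\left(y_{t-1} - \theta x_{t-1}\right) + \beta_0 \Delta x_t + u_t, \qquad \theta = \frac{\beta_0 + \beta_1}{1 - \phi_1}$$

With, say, $\phi_1 = 0.6$, $\beta_0 = 0.3$ and $\beta_1 = 0.1$, the long-run coefficient is $\theta = 0.4 / 0.4 = 1$ and the coefficient on the lagged disequilibrium is $-(1 - 0.6) = -0.4$, so that 40% of the previous period's disequilibrium is corrected each period.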
5 Summary and Conclusion

Given the deficiencies associated with the standard Johansen and Juselius (1990)
cointegration procedure, it becomes imperative to explore the Autoregressive
Distributed Lag (ARDL) approach to cointegration, or bounds procedure, for a long-run
relationship proposed by Pesaran and Shin (1999) and Pesaran et al. (1996b). Some of
the deficiencies include: identifying the cointegrating vector(s) where there are
multiple cointegrating relations; and applicability when a single cointegrating vector
exists among variables of different orders of integration. Based on this, this study
reviewed the ARDL approach to cointegration testing in terms of its application,
estimation and interpretation. The following findings were made:
• The ARDL cointegration technique can be adopted irrespective of whether the
underlying variables are I(0), I(1) or a combination of both, but it cannot be
applied when the underlying variables are integrated of order I(2).
To avoid a breakdown of the ARDL technique and wasted effort, it is therefore
advisable to test for unit roots, since variables that are integrated of
order I(2) invalidate the technique.
• If the trace or maximal eigenvalue statistics or the F-statistic establish that
there exists a single long-run relation among the underlying variables, the
ARDL approach can be applied rather than the Johansen and Juselius
approach. The ARDL technique provides a unified framework
for testing and estimating cointegration relations in the context of a
single equation.
• If the F-statistic (Wald test) establishes that there is a single long-run
relationship and the sample size is small (n ≤ 30) or finite, the ARDL
error correction representation becomes relatively more efficient.
• The ARDL model is reparameterized into an ECM when there is one
cointegrating vector among the underlying variables. The reparameterized
result gives the short-run dynamics and long-run relationship of the
underlying variables.
• When there are multiple long-run relationships, the ARDL approach cannot be
applied. Hence, an alternative approach such as Johansen and Juselius (1990)
becomes more appropriate.
This review is an important starting point for future practitioners, as well as for
more reliable research. The ARDL cointegration technique is one of the important
late-20th-century solutions to the analysis of series with one cointegrating vector,
and it does not require pretesting for unit roots. Therefore, there is a need to
understand the necessary conditions for applying the ARDL cointegration technique
in order to avoid its wrongful application, estimation and interpretation,
which may in turn lead to model misspecification and unrealistic estimates.
This paper cannot claim to have treated the underlying issues in the
greatest detail, but it has endeavoured to provide sufficient insight into the issues
surrounding the ARDL cointegration technique to enable young practitioners to apply
the technique, estimate the model, and interpret the results, and to follow
discussions of these issues in more advanced texts.
References
[1] D. Asteriou, and S.G. Hall, Applied Econometrics: A Modern Approach,
Palgrave Macmillan, New York, 2007.
[2] J.E.H. Davidson, D.F. Hendry, F. Srba, and S. Yeo, Econometric
Modeling of the Aggregate Time Series Relationship Between Consumers’
Expenditure and Income in the United Kingdom, Economic Journal, 88,
(1978), 661–692.
[3] M. Bahmani-Oskooee, and T.J.A. Brooks, New Criteria for Selecting the
Optimum Lags in Johansen's Cointegration Technique, Applied Economics, 35,
(2003), 875-880.
[4] Bo Sjö, Testing for Unit Roots and Cointegration, Memo, (2008).
[5] G. Box and G. Jenkins, Time Series Analysis, Forecasting and Control. San
Francisco: Holden-Day, (1970).
[6] P.T. Brandt, and J.T. Williams, Multiple Time Series Models: Quantitative
Applications in the Social Sciences, Sage Publications Ltd, London, 2006.
[7] L. Chigusiwa, S. Bindu, V. Mudavanhu, L. Muchabaiwa and D. Muzambani,
Export-Led Growth Hypothesis in Zimbabwe: Does Export Composition
Matter? International Journal of Economic Resources, (2), (2011), 111-129.
[8] D. Dickey and W. Fuller, Distribution of the Estimators for Autoregressive
Time Series with a Unit Root, Journal of the American Statistical Association,
74, (1979), 427-431.
[9] D. Dickey and W. Fuller, Likelihood Ratio Statistics for Autoregressive Time
Series with a Unit Root, Econometrica, 49, (1981), 1057-1072.
[10] W. Enders, Applied Econometric Time Series, 2nd Ed, John Wiley & Sons Inc,
New York, 2004.
[11] R. Engle, and G. Granger, Cointegration and Error Correction: Representation,
Estimation and Testing, Econometrica, 55, (1987), 251-276.
[12] J. Geweke, R. Meese and W. Dent, Comparing Alternative Tests of
Causality in Temporal System, Journal of Econometrics, 77, (1983), 161-
194.
[13] C.W.J. Granger, Investigating Causal Relations by Econometric Models and
Cross Spectral Methods, Econometrica, 37, (1969), 428-438.
[14] C.W.J. Granger, Some Properties of Time Series Data and Their Use in
Econometric Model Specification, Journal of Econometrics, 28, (1981),
121-130.
[15] C.W.J. Granger, Cointegrated Variables and Error-Correcting Models,
UCSD Discussion, Paper 83-13, (1983).
[16] C.W.J. Granger, Some Recent Developments in a Concept of Causality,
Journal of Econometrics, 39, (1988), 199-211.
[17] C.W.J. Granger, and J. Lin, Causality in the Long-run. Econometric Theory,
11, (1995), 530-536.
[18] C.W.J. Granger, and P. Newbold, Spurious Regressions in Econometrics,
Journal of Econometrics, 26, (1974), 1045-1066.
[19] D. Hendry and G. Mizon, Serial Correlation as a Convenient Simplification
not a Nuisance: A Comment on a Study of the Demand for Money by the
Bank of England. Economic Journal, 88, (1978), 349-363.
[20] D. Hendry and J.F. Richard, The Econometric Analysis of Economic Time
Series, International Statistical Review, 51, (1983), 111-163.
[21] M. Iyoha and O.T Ekanem, Introduction to Econometrics. Mareh
Publishers, Benin City, Nigeria, 2002.
[22] S. Johansen and K. Juselius, Hypothesis Testing for Cointegration Vectors
with an Application to the Demand for Money in Denmark and Finland.
Working Paper No. 88-05, University of Copenhagen, (1988).
[23] S. Johansen and K. Juselius, Maximum Likelihood Estimation and
Inference on Cointegration-With Applications to the Demand for Money.
Oxford Bulletin of Economics and Statistics, 52(2), (1990), 169-210.
[24] D. Kwiatkowski, P.C.B. Phillips, P. Schmidt and Y. Shin, Testing the
Null Hypothesis of Stationarity Against the Alternative of a Unit Root,
Journal of Econometrics, 54, (1992), 159-178.
[25] G.S. Maddala, Introduction to Econometrics, 2nd Ed, Englewood Cliffs,
Prentice Hall, 1992.
[26] P.K Narayan, The Saving and Investment Nexus for China: Evidence from
Cointegration Tests, Applied Economics, 37, (2005), 1979–1990.
[27] C.R. Nelson and C.I. Plosser, Trend and Random Walks in
Macroeconomics Time Series. Journal of Monetary Economics, 10(2),
(1982), 139-162.
[28] M.H. Pesaran and B. Pesaran, Microfit 4.0, Cambridge London, Windows
Version, CAMFIT DATA LIMITED, 1996.
[29] M.H. Pesaran, R.J. Smith, and Y. Shin, Testing for the Existence of a long
run Relationship, DAE Working paper No.9622, Department of Applied
Economics, University of Cambridge, (1996b).
[30] M.H. Pesaran, R.J. Smith and Y. Shin, Bounds Testing Approaches to the
Analysis of Level Relationships, Journal of Applied Econometrics, 16,
(2001), 289-326.
[31] M.H. Pesaran and Y. Shin, An Autoregressive Distributed Lag Modeling
Approach to Cointegration Analysis, In: Strom, S., Holly, A., Diamond, P.
(Eds.), Centennial Volume of Ragnar Frisch, Cambridge University Press,
Cambridge, (1999).
[32] P. Phillips and P. Perron, Testing for a Unit Root in Time Series
Regression. Biometrika, 75, (1988), 335-346.
[33] J.D. Sargan, Wages and Prices in the United Kingdom: A Study in
Econometric Methodology. In Econometric Analysis for National Economic
Planning (R.E. Hart, G. Mills, J.K. Whittaker, eds.), Butterworths, London,
(1964), 25–54.
[34] H.R. Seddighi, K.A. Lawler and A.V. Katos, Econometrics: A Practical
Approach, Routledge, London EC4P4EE, 2000.
[35] C. Sims, Money, Income and Causality, American Economic Review, 62,
(1972), 540-552.
[36] A.K. Uko and E. Nkoro, Inflation Forecast with ARIMA, Vector
Autoregressive and Error Correction Models in Nigeria, EJEFAS, Issue 50,
July, (2012).
Munich Personal RePEc Archive
ARDL model as a remedy for spurious regression: problems, performance and prospectus
Ghouse, Ghulam and Khan, Saud Ahmed and Rehman, Atiq Ur
Pakistan Institute of Development Economics
10 January 2018
Online at https://mpra.ub.uni-muenchen.de/83973/
MPRA Paper No. 83973, posted 19 Jan 2018 02:37 UTC
(1) Ghulam Ghouse
Ghouserazaa786@gmail.com
PhD scholar (Department of Econometrics and Statistics)
Pakistan Institute of Development Economics, Islamabad, Pakistan.
(2) Saud Ahmed Khan
saudak2k3@yahoo.com
Assistant Professor (Department of Econometrics and Statistics)
(3) Atiq Ur Rehman
atiq@pide.org.pk
Assistant Professor (Department of Econometrics and Statistics)
Abstract
Spurious regression has played a vital role in the construction of contemporary time series
econometrics and has led to the development of many tools employed in applied macroeconomics.
Conventional econometrics has limitations in the treatment of spurious regression in non-stationary
time series. While reviewing the well-established study of Granger and Newbold (1974), we realized
that the experiments conducted in that paper lacked lag dynamics, thus leading to spurious
regression. As a result of that paper, unit root and cointegration analysis have become the only
ways in conventional econometrics to circumvent spurious regression. These procedures are
also unreliable because of specification decisions such as the choice of the deterministic
part, structural breaks, the autoregressive lag length and the innovation process distribution. This
study explores an alternative treatment for spurious regression. We conclude that missing
variables (lag values) are the major cause of spurious regression; therefore, an alternative way
to look at the problem of spurious regression takes us back to the missing variable, which in turn
leads to the ARDL model. The study mainly relies on Monte Carlo simulations. The results
provide justification that the ARDL model can be used as an alternative tool to avoid the spurious
regression problem.
Keywords: Spurious regression, Stationarity, unit root, cointegration and ARDL.
1. Introduction
The most important feature that led to the development of new time series econometrics was spurious
regression. Spurious regression is a phenomenon known to econometricians since the time of Yule
(1926). Spurious regression was attributed to missing variables until Granger and Newbold (1974)
showed that spurious regression could be found with nonstationary time series even with no
missing variable. Nelson and Plosser (1982) argued that most time series are better
characterized as nonstationary. Spurious regression has played a vital role in the construction
of contemporary time series econometrics and has led to the development of many tools employed in
applied macroeconomics. However, the widespread literature considers non-stationarity as the only
reason for spurious regression. To evade the problem of spurious regression caused by
non-stationarity, researchers frequently employ unit root and cointegration testing.
Even supposing that spurious regression occurs due to non-stationarity and that unit root and
cointegration testing are used as the remedy, it is very hard to obtain reliable inference.
There is no unit root test with good size and power in small samples. The unit root and
cointegration procedures involve many prior specification decisions, e.g. lag length, trend and
structural stability. If we make these decisions on the basis of the data, a large battery of
tests is involved. Each test carries its own statistical error (type I and type II). The cumulative
probability of error across all tests leaves the results of unit root tests unreliable. For these
reasons, the literature is still under development after four decades without reaching any conclusion.

It is a common fallacy that the unit root is the only cause of spurious regression. Nonetheless, a
missing relevant variable is a major cause of spurious regression. It can even be shown that the
spurious regression in the Granger and Newbold (1974) experiment was also due to a missing variable
(see section 5.1).
So, an alternative way to look at the problem of spurious regression takes us back to the missing
variable, which in turn leads us to ARDL. Suppose we have two independent autoregressive
nonstationary series
𝑌𝑡 = 𝜌𝑌𝑡−1 + 𝜀𝑦𝑡 ……….. (1) 𝜌=1
𝑋𝑡 = 𝜌𝑋𝑡−1 + 𝜀𝑥𝑡 ……….. (2) 𝜌=1

where 𝑋𝑡 and 𝑌𝑡 are both expressed in terms of their own lagged values. There is no third variable
involved in the construction of either variable. Granger and Newbold (1974) demonstrated spurious
regression by estimating a regression of the type
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝜀𝑦𝑡 ……….. (3)
But we know that the true data generating process (DGP) of Y and X contains lagged values. Including
the lags of Y and X, we get
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝛽2 𝑋𝑡−1 + 𝛽3 𝑌𝑡−1 + 𝜀𝑦𝑡 ……….. (4)
which is an ARDL model. It is observed in our study (section 4) that this kind of model
significantly reduces the probability of spurious regression in the case of nonstationary series.
This indicates that spurious regression occurs due to missing variables and can be avoided by
including the missing lags (see section 5).
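The contrast between equations (3) and (4) can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the sample size, number of replications, seed, the use of a 1.96 critical value for the t-statistic on 𝑋𝑡, and the helper names `t_stat` and `rejection_rates` are all assumptions made here for illustration.

```python
import numpy as np


def t_stat(y, X, col):
    """OLS t-statistic for the coefficient in column `col` of X."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)
    return beta[col] / np.sqrt(s2 * xtx_inv[col, col])


def rejection_rates(n_obs=100, n_rep=500, crit=1.96, seed=0):
    """Share of replications in which the coefficient on X_t looks significant."""
    rng = np.random.default_rng(seed)
    static_rej = 0
    ardl_rej = 0
    for _ in range(n_rep):
        # Two independent random walks, as in equations (1) and (2) with rho = 1.
        y = np.cumsum(rng.standard_normal(n_obs))
        x = np.cumsum(rng.standard_normal(n_obs))
        const = np.ones(n_obs - 1)
        X_static = np.column_stack([const, x[1:]])                # equation (3)
        X_ardl = np.column_stack([const, x[1:], x[:-1], y[:-1]])  # equation (4)
        static_rej += int(abs(t_stat(y[1:], X_static, 1)) > crit)
        ardl_rej += int(abs(t_stat(y[1:], X_ardl, 1)) > crit)
    return static_rej / n_rep, ardl_rej / n_rep


static_rate, ardl_rate = rejection_rates()
print("static regression:", static_rate, " ARDL regression:", ardl_rate)
```

Although the two series are independent by construction, the static regression flags 𝑋𝑡 as significant far more often than the nominal 5% rate, while the regression that includes the lagged terms does so much less often.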
The objective of this study is to explore an alternative solution that is expected to perform well
for nonstationary series. This study investigates whether it is possible to use the ARDL model to
evade spurious regression, bypassing the complicated and ambiguous unit root testing,
cointegration analysis and other treatments. We generate autoregressive (nonstationary,
stationary and negative moving average) series and investigate, through Monte Carlo simulations,
how the probability of spurious regression increases dramatically in the nonstationary case when
the lag dynamics are ignored.
2. Literature review
A large number of studies on spurious regression are available in the time series
econometrics literature. In this section we briefly discuss the theoretical and empirical
methods proposed in the literature for the treatment of spurious regression. The literature
review is arranged as follows.
2.1 Spurious Regression in Classical Econometrics
There is a long historical debate on the nonsense correlation (spurious regression) issue in the
econometrics literature, dating back at least to the well-known study of Yule (1926). In his study,
he reported a strong correlation of 0.95 between the mortality rate and the proportion of Church of
England marriages to all marriages during 1866 to 1911. Yule (1926) thought that spurious
regression is a consequence of relevant missing variables.
Simon (1954) also supported the idea that a missing variable is a source of spurious correlation.
Simon explained that if we are uncertain whether the perceived correlation is spurious, we have to
introduce an extra variable, which could reveal the genuine correlation.
2.1.1 Granger and Newbold’s Experiment

Granger and Newbold (1974) showed that if the series are nonstationary, the regression results can
appear significant even when the series are independent. In their experiment they generated
independent autoregressive series 𝑋𝑡 and 𝑌𝑡, each expressed in terms of its own lagged values.
𝑌𝑡 = 𝑌𝑡−1 + 𝜀𝑦𝑡 ……….. (5)
𝑋𝑡 = 𝑋𝑡−1 + 𝜀𝑥𝑡 ……….. (6)

There is no third variable involved in the construction of either variable. They regressed 𝑋𝑡 on 𝑌𝑡
and 𝑌𝑡 on 𝑋𝑡.
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝜀𝑦𝑡 ……….. (7)
𝑋𝑡 = 𝑎 + 𝛽1 𝑌𝑡 + 𝜀𝑥𝑡 ……….. (8)

They obtained spurious results. This alternative explanation of spurious regression became
more popular in the literature, and other explanations fell into obscurity.
2.1.2 Aftermath of Granger and Newbold’s Experiment
2.1.2.1 Why is spurious regression a problem?
Finding the relationship between economic variables is the core objective of economic studies.
Spurious regression offers deceptive statistical evidence of a strong relationship even though the
variables are independent. Hendry (1980) demonstrated a spurious correlation between cumulative
rainfall and the price level in the UK. He found that all these time series were stationary in first
differences except the unemployment rate. Plosser and Schwert (1978) claimed that regression without
differencing nonstationary series will most probably yield invalid or nonsense results. The
reasoning behind this claim is that if we run a regression without differencing
difference-stationary series, the estimator properties and the distribution of the test statistics
are no longer reliable. Phillips (1986) examined the asymptotic properties of the spurious least
squares regression model and endorsed the Granger and Newbold (1974) simulation results that
misspecification of the level of the series is the key element of spurious correlation.
2.1.2.2 Example of spurious regression in classical literature
Nominal economic variables are mostly correlated even when there is no relationship between them,
because the common presence of the price level in the data series induces correlation between them.
It has also been shown that many time series are nonstationary, which is why the probability of
spurious regression is very high. We present here some examples of spurious regression from the
time series econometrics literature.
Chaouachi (2013) observed that Dar et al. (2012) reported a spurious strong positive
relationship between nass chewing, hookah smoking and many other habits and the risk of
oesophageal squamous cell carcinoma (ESCC). Dar et al. (2012) conducted a case-control
study in the Kashmir valley, India. They considered 702 historical cases of ESCC and 1663
hospital-based controls, matched to the cases for sex, age and district of residence, using
monthly data from September 2008 to January 2012. They concluded that nass chewing and hookah
smoking are strongly positively associated with ESCC risk, a conclusion based on severe
misinterpretation. According to Chaouachi (2013), all the relevant studies showed a feeble or
insignificant association of nass chewing and hookah smoking with ESCC risk. Chaouachi (2013)
stated that Dar et al. (2012) obtained spurious results because they did not incorporate a very
significant element, namely the filtering effect of water.
Roger and Jupp (2006) described an example of a spurious positive relationship between human
births and stork nesting during spring, which arises because the two variables are correlated
with a third variable. According to Roger and Jupp (2006), Dutch statistics show a positive
relationship between stork nesting during spring and human births at that time because both
variables are associated with the state of the weather. The two variables are independent of each
other, but both are related to the weather, so they are spuriously correlated because of a missing
third variable. According to Hofer et al. (2004), this spurious correlation is due to a lack of
statistical information.
2.1.2.3 Nelson and Plosser experiment and implications
Nelson and Plosser (1982) found that most macroeconomic series of the U.S. economy
have a unit root. Their study is generally acknowledged as a significant contribution with
consequences for theory and policy. They applied the Dickey-Fuller unit root test
to fourteen historical macroeconomic series for the U.S. economy, including GNP, wages,
employment, prices, stock prices and interest rates, and found that twelve out of fourteen series
had a unit root. The Nelson and Plosser (1982) study is a noteworthy contribution to the time
series econometrics literature that increased researchers' interest in unit root tests and
thereby spurred the development of unit root theory.
2.1.2.4 Development in cointegration tests
Engle and Granger (1987) introduced the cointegration technique as a solution to spurious
regression due to non-stationary time series. According to Granger, non-stationary time series
are cointegrated if a linear combination of them is a stationary process. The problem then is how
to estimate the parameters of the long-run equilibrium relationship; for this, Engle and Granger
presented an Error Correction Mechanism. The residuals of the equilibrium regression can be used
in the error correction model. The first drawback of the EG (Engle and Granger) cointegration test
is that it deals with only one cointegrating vector. Second, it relies on a two-step estimator: the
first step produces the series of residuals and the second checks the stationarity of the residual
series. Third, and the major limitation, the distributions of the estimators are non-standard.
Phillips and Ouliaris (1990) proposed residual-based tests under the null hypothesis of no
cointegration, in which the asymptotic distributions of the test statistics depend on the number of
variables and the deterministic trend terms. Engle and Yoo (1991) proposed a three-step procedure,
an extension of the EG model, to overcome the limitations of the EG model. The Engle and Yoo (EY)
procedure ensures that the estimators are normally distributed. It, too, is useful only
for one cointegrating vector.
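The two steps of the EG procedure described above can be sketched as follows. This is a minimal illustration on simulated data, not a full test: the function names `ols` and `engle_granger_steps` are hypothetical, the second step uses a simple Dickey-Fuller regression without lag augmentation, and no critical values are included because, as noted above, the relevant distributions are non-standard.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def engle_granger_steps(y, x):
    # Step 1: long-run (equilibrium) regression in levels.
    X = np.column_stack([np.ones(len(x)), x])
    beta, resid = ols(y, X)
    # Step 2: Dickey-Fuller regression on the residuals,
    #         delta e_t = rho * e_{t-1} + u_t (no constant, no lags).
    de = np.diff(resid)
    lag = resid[:-1]
    rho = (lag @ de) / (lag @ lag)
    u = de - rho * lag
    s2 = (u @ u) / (len(de) - 1)
    t_rho = rho / np.sqrt(s2 / (lag @ lag))
    return beta, t_rho


# Simulated data that are cointegrated by construction (illustrative values).
rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(300))
y = 1.0 + 2.0 * x + rng.standard_normal(300)

beta, t_rho = engle_granger_steps(y, x)
print("estimated slope:", beta[1], " DF t-statistic on residuals:", t_rho)
```

A strongly negative t-statistic on the residuals points towards stationarity of the equilibrium error; in practice it must be compared with cointegration-specific critical values rather than the standard normal ones.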
When we have more than two variables, there is the possibility of more than one cointegrating
vector. The EG and EY cointegration procedures do not provide any solution in this situation. To
overcome this problem, Johansen and Juselius (1992) introduced the multivariate cointegration test.
The Johansen and Juselius (JJ) test allows more than one cointegrating vector to be found, so it is
more generally applicable than the EG and EY cointegration tests. The EG and EY single-equation
procedures ignore short-run dynamics when the relationships are estimated, whereas the JJ procedure
also considers the short-run dynamics. Pesaran et al. (1996) and Pesaran (1997) proposed a single-
equation ARDL (autoregressive distributed lag) approach to cointegration as an alternative to EG
and EY. Its first advantage is that the ARDL cointegration approach provides explicit tests for the
presence of a single cointegrating vector, instead of assuming uniqueness. Pesaran and Shin
(1995) showed that asymptotically valid inference on short-run and long-run parameters can be
made by ordinary least squares estimation of the ARDL model, provided that the order of the ARDL
model is appropriately augmented to allow for contemporaneous correlation among the stochastic
components of the data generating processes involved in estimation.

2.1.2.5 Problems in cointegration analysis
Cointegration testing involves many specification decisions, which reduce the reliability of the
results. The existing cointegration testing procedures do not provide any reasonable criteria
for these specification decisions: the choice of the deterministic part, structural breaks, the
autoregressive lag length and the innovation process distribution. For further detail, see section 2.3.2.
2.2 Conceptual Flaws in Understanding of Spurious Regression
It is a common misconception that spurious regression arises only because of unit roots.
Nevertheless, a missing relevant variable is a major cause of spurious regression. Yule (1926)
was the first to suggest that nonsense correlations could arise from a missing variable.
Simon (1954) argued that a missing variable is a cause of spurious correlation. Simon put
the problem as follows: if we are uncertain whether the observed correlation is
spurious, we should introduce another (extra) variable, which may reveal the true correlation.
Frey (2002) argued that spurious regression could probably be due to a missing variable.
It can even be shown that the spurious regression in the Granger and Newbold (1974) experiment was
also due to a missing variable. In their experiment they generated independent autoregressive series
𝑋𝑡 and 𝑌𝑡, each expressed in terms of its own lagged values.
𝑌𝑡 = 𝑌𝑡−1 + 𝜀𝑦𝑡 ……….. (9)
𝑋𝑡 = 𝑋𝑡−1 + 𝜀𝑥𝑡 ……….. (10)

There is no third variable involved in the construction of either variable. They regressed 𝑋𝑡 on 𝑌𝑡
or vice versa without including their lagged values in the regression.
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝜀𝑦𝑡 ……….. (11)
𝑋𝑡 = 𝑎 + 𝛽1 𝑌𝑡 + 𝜀𝑥𝑡 ……….. (12)

They obtained spurious results due to a missing variable because they did not include the lagged
values of the variables as regressors. It is obvious that one determinant of 𝑌𝑡, namely 𝑌𝑡−1, is
missing in equation (11), and similarly one determinant of 𝑋𝑡, i.e. 𝑋𝑡−1, is missing in equation (12).
Taking these missing variables into account, the equation becomes
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝛽2 𝑌𝑡−1 + 𝜀𝑦𝑡 ……….. (13)
Therefore, equation (13) should not exhibit spurious regression if our supposition of a missing
variable problem is true. It is shown in section (4) that it is actually true.
2.3 Problems in prevailing treatments
The most familiar procedures for evading spurious regression are unit root and cointegration
testing. These methods are equally unreliable because of specification decisions such as the choice
of the deterministic part, structural breaks, the autoregressive lag length and the innovation
process distribution (see section 2.3.1.1). Cointegration analysis, which is employed as a tool to
avoid spurious regression, also suffers from specification decision problems (see section 2.3.2). It
involves unit root testing, which is also unreliable. The unit root tests are so unreliable that
it is very hard to conclude anything reasonable (see section 2.3.1).
2.3.1 Unit root testing
Numerous financial and economic series, such as stock prices, exchange rates and Gross Domestic
Product (GDP), exhibit nonstationary or trending behavior, and regressions on trending series are
unlikely to give accurate results. Cointegration analysis, the usual tool for avoiding spurious
regression, first requires unit root testing, yet unit root tests are so unreliable that it is very
hard to conclude anything definite from them. US GNP is the series that a large number of
researchers have used as a guinea pig for unit root tests, and still nothing definite can be said
about whether it contains a unit root. Rehman and Zaman (2008) summarize researchers' findings on
US GNP as follows.
“Trend Stationary: Perron (1989), Zivot and Andrews (1992), Diebold and Senhadji (1996),
Papell and Prodan (2003),
Difference stationary: Nelson and Plosser (1982), Murray and Nelson (2002), Kilian and Ohanian
(2002),
Don’t know; Rudebusch (1993)”.
2.3.1.1 Why unit root tests are so unreliable
An important task in econometrics is therefore to determine the most suitable specification of the
trend in a time series. The two common procedures for removing a trend are regression on a time
trend and differencing, and unit root testing indicates which of them should be used to render the
series stationary. Yet the precision and specification of unit root procedures remain unsettled,
even though the literature on unit root testing has grown rapidly since the mid-eighties.
Rehman and Zaman (2008) found that the two main causes of the inadequate performance of unit root
tests are observational equivalence and model misspecification. They focused on four specification
decisions, namely the choice of the deterministic part, structural breaks, the autoregressive lag
length and the distribution of the innovation process, and examined their role in inference from
unit root tests. They found that these decisions seriously affect the performance of unit root
tests, and that existing unit root tests provide no set criteria for making them, which is why the
tests give unreliable results.
DeJong et al. (1992) found that the Choi and Phillips (1991) and Phillips and Perron (1988) unit
root procedures suffer from size distortion and low power in the presence of a moving average (MA)
component, while the Augmented Dickey-Fuller (ADF) test behaved well. Schwert (2002) found that the
Dickey-Fuller (1979, 1981) test is sensitive to the assumption that the data generating process of
the series is a pure autoregressive (AR) process. When a moving average component is present in the
underlying process, the distribution tabulated by Dickey and Fuller and the actual distribution of
the test statistic can be quite different. Many other unit root tests have been proposed, but to
some extent they all face similar problems.
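The sensitivity to a moving average component can be illustrated with a small simulation. The sketch below is not taken from any of the cited studies: it assumes a Dickey-Fuller regression with a constant and no lag augmentation, 100 observations, an MA(1) coefficient of -0.8 and a 5% critical value of -2.89, all chosen here purely for illustration.

```python
import numpy as np


def ols_t_stat(y, X, k):
    """t-statistic of coefficient k in an OLS regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[k] / np.sqrt(cov[k, k])


def df_rejection_rate(ma_coef, n_obs=100, n_reps=1000, crit=-2.89, seed=0):
    """Share of replications in which a true unit root is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        e = rng.standard_normal(n_obs + 1)
        u = e[1:] + ma_coef * e[:-1]      # MA(1) errors; ma_coef = 0 gives white noise
        y = np.cumsum(u)                  # unit-root series, so the null is true
        dy = np.diff(y)
        X = np.column_stack([np.ones(n_obs - 1), y[:-1]])  # constant and y_{t-1}
        rejections += int(ols_t_stat(dy, X, 1) < crit)
    return rejections / n_reps


rate_white_noise = df_rejection_rate(ma_coef=0.0)
rate_moving_average = df_rejection_rate(ma_coef=-0.8)
print(f"white-noise errors: {rate_white_noise:.3f}")
print(f"MA(1) errors:       {rate_moving_average:.3f}")
```

Because the null of a unit root is true in both designs, any rejection rate well above 5% in the second line reflects size distortion rather than power.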
2.3.2 Problems with Cointegration Testing
Like unit root testing, cointegration testing involves many specification decisions that reduce
the reliability of the results. The existing cointegration testing procedures do not provide
reasonable criteria for these decisions, which makes their results unreliable.
For example, lag length specification is an important practical question in any econometric
analysis. In a unit root test, if the lag length is too short, serial correlation remains in the
errors and the results are biased; if it is too long, the power of the test is reduced.
Cointegration tests are similarly sensitive to lag length selection. Agunloye et al. (2014) found
that the Engle-Granger (EG) cointegration test is extremely sensitive to lag length. Carrasco et
al. (2009) found that lag length misspecification may significantly affect cointegration results:
under-specification can undermine the results, and over-specification may diminish the power of
the test. Trend specification is likewise an important issue in the econometric literature.
Ahking (2002) found that Johansen’s cointegration test produced unsatisfactory results when a
deterministic linear time trend was included, and robust results once the trend was excluded. He
suggested that great care must be taken with trend specification in cointegration analysis. There
are many studies on this issue in the literature, but most of them reach different results.
Leybourne and Newbold (2003) applied three cointegration tests to independent integrated series,
each with a structural break, and found spurious evidence of cointegration among them unless the
structural breaks were properly treated. Choi et al. (2004) found that economic models of
cointegration often give erroneous results, mainly because the errors are unit root nonstationary
when one of the variables has a nonstationary measurement error.
They stated that “If the money demand function is stable in the long-run, we have a cointegrating
regression when money is measured with a stationary measurement error but have a spurious
regression when money is measured with a nonstationary measurement error”.
3. What is ARDL Model?
In an ARDL model the dependent variable is expressed in terms of the current and lagged values of
the independent variable and its own lagged values. Davidson et al. (1978) (DHSY hereafter)
proposed the ARDL methodology to model the UK consumption function. An ARDL analysis normally
starts from a reasonably general and large dynamic model and progressively reduces its size and
transforms its variables by imposing linear and non-linear restrictions (Charemza and Deadman,
1997). The autoregressive distributed lag (ARDL) model is one of the most general unrestricted
dynamic models in the econometric literature. Because the ARDL methodology follows the
general-to-specific approach, it can tackle many econometric problems, such as misspecification
and autocorrelation, and arrive at an appropriate and interpretable model.
The ARDL (1, 1) is the simplest form of the ARDL model. Consider an ARDL (1, 1) model
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝛽2 𝑋𝑡−1 + 𝛽3 𝑌𝑡−1 + 𝜀𝑦𝑡
Hendry and Richard (1983), Hendry, Pagan and Sargan (1984) and Charemza and Deadman (1997)
argued that by imposing restrictions on the ARDL (1, 1) model we can obtain at least ten
economically interpretable models. Some important cases of restrictions are given here:
1. 𝛽2 = 𝛽3 = 0 Static regression,
2. 𝛽1 = 𝛽2 = 0 First order autoregressive process,
3. 𝛽3 = 1, 𝛽1 = −𝛽2 Equation in first difference,
4. 𝛽2 = 0 Partial adjustment equation
As discussed, spurious regression may be a consequence of a missing variable. ARDL is a general
specification that takes the lag structure into account, and it could therefore give better results.
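The four restrictions listed above can be written out explicitly. The sketch assumes that the ARDL (1, 1) model attaches 𝛽1 to 𝑋𝑡, 𝛽2 to 𝑋𝑡−1 and 𝛽3 to 𝑌𝑡−1, which is the labelling implied by the restrictions themselves and by equation (23).

```latex
\begin{align*}
\text{ARDL}(1,1):\quad & Y_t = a + \beta_1 X_t + \beta_2 X_{t-1} + \beta_3 Y_{t-1} + \varepsilon_t \\
\beta_2 = \beta_3 = 0:\quad & Y_t = a + \beta_1 X_t + \varepsilon_t
  && \text{(static regression)} \\
\beta_1 = \beta_2 = 0:\quad & Y_t = a + \beta_3 Y_{t-1} + \varepsilon_t
  && \text{(first order autoregressive process)} \\
\beta_3 = 1,\ \beta_1 = -\beta_2:\quad & \Delta Y_t = a + \beta_1 \Delta X_t + \varepsilon_t
  && \text{(equation in first differences)} \\
\beta_2 = 0:\quad & Y_t = a + \beta_1 X_t + \beta_3 Y_{t-1} + \varepsilon_t
  && \text{(partial adjustment equation)}
\end{align*}
```

The third case follows by moving 𝑌𝑡−1 to the left-hand side and collecting 𝛽1 (𝑋𝑡 − 𝑋𝑡−1).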
4. The Methodology
This study relies mainly on Monte Carlo simulations. Data are generated under pre-specified
settings, and the probability of spurious regression is assessed using both classical methods and
the ARDL model.
The components of the methodology are as follows:
I. Data generating process (DGP), see Section 4.1
II. Testing and simulations, see Section 5

4.1 Data Generating Process (DGP)
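Suppose we have the data generating process
$$
\begin{bmatrix} x_t \\ y_t \end{bmatrix}
= \begin{bmatrix} \theta_1 & \theta_{12} \\ \theta_{21} & \theta_2 \end{bmatrix}
\begin{bmatrix} x_{t-i} \\ y_{t-i} \end{bmatrix}
+ \begin{bmatrix} \varepsilon_{xt} \\ \varepsilon_{yt} \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_{xt} \\ \varepsilon_{yt} \end{bmatrix}
\sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right) \qquad (18)
$$
To simplify the notation, we can rewrite it as
𝑋𝑡 = 𝐴𝑋𝑡−𝑖 + 𝜀𝑡 𝜀𝑡 ~𝑁(0, Σ) …….. (19)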
The data generating process in equation (18) can generate data under a wide range of scenarios.
Suppose 𝜃12 = 𝜃21 = 0 and 𝜌 = 0: the process then generates two independent series, and a
significant regression of 𝑥𝑡 on 𝑦𝑡 is an indication of spurious regression. If A = 0, there is no
autocorrelation or cross-autocorrelation in the series, so the series are IID (independent and
identically distributed) over time, and the degree of association between them depends only
on Σ.
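The data generating process can be sketched in a few lines of code. The function below is an illustration rather than the authors' implementation: it assumes a single lag (i = 1), zero starting values and unit innovation variances, and the name `simulate_dgp` and its arguments are hypothetical.

```python
import numpy as np


def simulate_dgp(n_obs, theta1, theta2, theta12=0.0, theta21=0.0, rho=0.0, seed=None):
    """Simulate X_t = A X_{t-1} + eps_t with eps_t ~ N(0, Sigma), starting from zero."""
    rng = np.random.default_rng(seed)
    A = np.array([[theta1, theta12],
                  [theta21, theta2]])
    sigma = np.array([[1.0, rho],
                      [rho, 1.0]])
    eps = rng.multivariate_normal(np.zeros(2), sigma, size=n_obs)
    data = np.zeros((n_obs, 2))
    prev = np.zeros(2)                 # assumed starting value X_0 = 0
    for t in range(n_obs):
        prev = A @ prev + eps[t]
        data[t] = prev
    return data                        # column 0 is x_t, column 1 is y_t


# Two independent random walks: theta12 = theta21 = 0, rho = 0, theta1 = theta2 = 1.
walks = simulate_dgp(50, theta1=1.0, theta2=1.0, seed=1)

# A = 0: the series are IID over time and their association comes only from Sigma.
iid_pairs = simulate_dgp(5000, theta1=0.0, theta2=0.0, rho=0.5, seed=2)
sample_corr = np.corrcoef(iid_pairs.T)[0, 1]
print(walks.shape, round(sample_corr, 2))
```

Setting the off-diagonal elements of A and 𝜌 to zero gives the independent-series scenario used in the spurious regression experiments, while A = 0 isolates the role of Σ.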
5. Results and Inference

In this section we present inferences based on real and simulated data. The real data are the
Gross Domestic Product series of thirty-seven countries, namely Albania, Antigua and Barbuda,
Argentina, Austria, Bahamas, Bahrain, Barbados, Belgium, Botswana, Brazil, Brunei Darussalam, Cabo
Verde, Canada, Comoros, Congo, Costa Rica, Denmark, Dominica, El Salvador, Fiji, Finland,
France, Gabon, Gambia, Germany, Grenada, Guinea-Bissau, Guyana, Honduras, Hong Kong, Iraq,
Iceland, Ireland, Israel, Italy, Kiribati and Luxembourg, from 1980 to 2014. The ADF unit root
test indicates that all the series are stationary in first differences. We take the series to be
statistically independent of each other. We regress the GDP of each of the other thirty-six
countries on that of Albania and find that every regression produces significant results, even
though the series are independent of each other. Table 1 reports the linear regression results:
the coefficient values, with the P-values in square brackets. The P-values indicate that all the
relationships are highly significant, even at the 1% level of significance.
Table 3 shows the residual analysis of the linear regression model. All the autocorrelation tests
are significant at the 1% level of significance. The LM test for heteroskedasticity is significant
in 15 of the 36 regressions. Table 4 presents the residual analysis of the ARDL model. The
autocorrelation tests are insignificant at the 5% level except for Argentina and Brunei
Darussalam, which are insignificant at the 1% level. The heteroskedasticity tests are insignificant
at 5% except for Argentina and Canada, and for Canada the test is insignificant at 1%.
These results suggest that the ARDL model reduced the probability of spurious regression from
100% to approximately 5%. They also contradict the common misconception that spurious regression
arises only from unit roots: a missing relevant variable is a major cause of spurious regression,
and once the lagged values were introduced the probability of spurious regression fell sharply.

Table 1 Results after running Simple Linear Regression Model
Countries    ATG       ARG       AUT       BHS       BHR       BRB       BEL       BWA       BRA       BRN       CPV       CAN
Coefficient  173.456   0.845535  2.79251   115.61    55.0787   1176.54   2.44742   6.94373   0.374535  89.9904   3.37695   0.355487
P-value      [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]
Countries    COM       COG       CRI       DNK       DMA       SLV       FJI       FIN       FRA       GAB       GMB       DEU
Coefficient  9.22622   0.558613  0.245078  0.299897  682.991   69.922    158.877   4.29561   0.278377  0.127738  32.1748   0.20898
P-value      [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]
Countries    GRD       GNB       GUY       HND       HKG       ISL       IRQ       IRL       ISR       ITA       KIR       LUX
Coefficient  324.926   1.98374   2.03643   3.9283    0.2911    0.374548  0.0050    2.78307   0.630868  0.319785  5020.83   13.7727
P-value      [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]
Table 2 Results after employing ARDL model
Countries    ATG       ARG       AUT       BHS       BHR       BRB       BEL       BWA       BRA       BRN       CPV       CAN
F-stat       2.2113    2.3136    1.9177    3.3884    2.3568    0.97170   1.5636    1.2890    2.6769    2.9427    2.5011    1.7673
P-value      [0.1271]  [0.0984]  [0.1515]  [0.024]*  [0.0949]  [0.4211]  [0.2220]  [0.2991]  [0.0679]  [0.0517]  [0.0692]  [0.1781]
Countries    COM       COG       CRI       DNK       DMA       SLV       FJI       FIN       FRA       GAB       GMB       DEU
F-stat       2.5938    1.0733    2.4079    0.55250   1.3533    2.5329    3.9684    2.7890    2.4591    0.75471   1.7834    1.2943
P-value      [0.0741]  [0.3776]  [0.0900]  [0.6510]  [0.2789]  [0.0789]  [0.018]*  [0.0605]  [0.0905]  [0.4795]  [0.1751]  [0.2900]
Countries    GRD       GNB       GUY       HND       HKG       ISL       IRQ       IRL       ISR       ITA       KIR       LUX
F-stat       2.5668    1.8490    2.7830    2.2760    1.4923    2.1955    2.1649    2.8124    2.2770    0.19603   2.5335    3.0034
P-value      [0.0947]  [0.1631]  [0.0609]  [0.1034]  [0.2399]  [0.1301]  [0.1163]  [0.0591]  [0.1033]  [0.8231]  [0.0666]  [0.0658]
Tables 1 and 2 report the coefficient values, with the P-values in square brackets. Table 2 reports the F-statistic used to test the joint
significance of the independent variable and its lagged values, under the null hypothesis H0: the restrictions are valid. * marks values that are
significant at the 5% level of significance.
Table 3 Residual Analysis after simple linear regression Model
Countries    ATG       ARG       AUT       BHS       BHR       BRB       BEL       BWA       BRA       BRN       CPV       CAN
AR (1-2)     108.46    37.166    74.421    50.957    44.826    28.088    58.430    47.607    46.912    70.425    42.454    93.299
P-value      [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]
Hetero test  4.8584    11.807    3.0080    3.4207    10.664    1.8093    4.6607    6.8516    1.5076    0.38325   12.618    0.42721
P-value      [0.0144]  [0.0002]  [0.0650]  [0.0464]  [0.0003]  [0.1818]  [0.0176]  [0.0037]  [0.2383]  [0.6850]  [0.0001]  [0.6564]
Countries    COM       COG       CRI       DNK       DMA       SLV       FJI       FIN       FRA       GAB       GMB       DEU
AR (1-2)     23.463    27.073    48.132    192.11    49.202    92.324    70.445    52.093    179.65    108.46    37.166    176.23
P-value      [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]
Hetero test  3.9430    0.47118   12.139    2.6543    6.0150    1.4328    0.93896   0.56898   1.9298    4.8584    11.807    3.0440
P-value      [0.0306]  [0.6290]  [0.0001]  [0.0874]  [0.0065]  [0.2550]  [0.4026]  [0.5723]  [0.1634]  [0.0144]  [0.0002]  [0.0631]
Countries    GRD       GNB       GUY       HND       HKG       ISL       IRQ       IRL       ISR       ITA       KIR       LUX
AR (1-2)     51.204    44.374    60.715    38.748    43.418    46.786    17.337    70.961    56.707    271.27    36.165    55.628
P-value      [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]  [0.0000]
Hetero test  0.84706   1.8925    1.7900    9.6925    6.4124    0.32064   8.8652    0.0082    9.2912    4.0579    0.54471   4.9846
P-value      [0.4390]  [0.1688]  [0.1849]  [0.0006]  [0.0049]  [0.7282]  [0.0010]  [0.9917]  [0.0008]  [0.0279]  [0.5858]  [0.0138]
Table 4 Residual Analysis after ARDL Model
Countries    ATG       ARG       AUT       BHS       BHR       BRB       BEL       BWA       BRA       BRN       CPV       CAN
AR (1-2)     3.2957    4.8584    2.8581    1.8220    1.9423    1.5511    4.0584    2.4124    2.5211    3.5946    2.0736    0.91149
P-value      [0.0530]  [0.0144]  [0.0770]  [0.1834]  [0.1653]  [0.2325]  [0.1144]  [0.1110]  [0.1014]  [0.0431]  [0.1477]  [0.4154]
Hetero test  0.32156   173.456   1.7750    1.7026    1.4521    1.0165    173.456   1.3759    1.9732    0.83462   0.74033   2.5478
P-value      [0.9195]  (0.000)   [0.1287]  [0.1461]  [0.2257]  [0.4624]  (0.8006)  [0.2572]  [0.0911]  [0.6020]  [0.6804]  [0.0341]
Countries    COM       COG       CRI       DNK       DMA       SLV       FJI       FIN       FRA       GAB       GMB       DEU
AR (1-2)     2.6788    2.5176    2.2340    1.6776    2.4530    2.6688    1.9338    2.9251    4.1468    3.2957    2.0250    2.4418
P-value      [0.0891]  [0.1017]  [0.1289]  [0.2080]  [0.1073]  [0.0898]  [0.1665]  [0.0730]  [0.5284]  [0.0530]  [0.1539]  [0.1067]
Hetero test  2.1383    0.85294   1.4666    0.90939   1.3667    1.4615    2.6555    2.0190    3.0658    0.32156   2.4831    2.1177
P-value      [0.0684]  [0.5871]  [0.2201]  [0.5422]  [0.2612]  [0.2221]  [0.0285]  [0.0841]  [0.2147]  [0.9195]  [0.0380]  [0.0869]
Countries    GRD       GNB       GUY       HND       HKG       ISL       IRQ       IRL       ISR       ITA       KIR       LUX
AR (1-2)     3.0651    3.3840    3.0038    3.3708    2.2546    3.1490    1.0909    2.4140    1.6316    3.2739    0.92427   3.2180
P-value      [0.0638]  [0.0507]  [0.0670]  [0.1499]  [0.1267]  [0.0596]  [0.3520]  [0.1109]  [0.2166]  [0.0539]  [0.4117]  [0.0564]
Hetero test  0.82311   1.2434    2.2719    0.26387   1.1274    0.60599   0.68400   1.6800    1.9253    1.5242    2.0197    1.3186
P-value      [0.5628]  [0.3214]  [0.0691]  [0.9486]  [0.3885]  [0.7231]  [0.7276]  [0.1520]  [0.0990]  [0.2112]  [0.0848]  [0.2857]
AR (1-2): test for autocorrelation with null hypothesis H0: there is no autocorrelation. Hetero test: LM test for heteroskedasticity with null
hypothesis H0: there is no heteroskedasticity.
The reason behind spurious regression is that, when a relevant variable is missing from the
regression, an irrelevant variable acts as a proxy for it. The irrelevant variable captures the
effect of the missing variable, and the results then appear significant. Starting with an ARDL
model overcomes the problem of the missing variable. It can even be shown that the results in the
Granger and Newbold (1974) experiments were significant only because of missing lagged values (see
Section 5.1).
5.1 Simulation results with nonstationary series of integrated order 1
We generated two independent autoregressive nonstationary series, each integrated of order 1,
from the data generating process given above by imposing the restrictions 𝜃12 = 𝜃21 = 0
and 𝜌 = 0. 𝑋𝑡 and 𝑌𝑡 are each expressed in terms of their own lagged value, with lag
coefficients 𝜃1 = 𝜃2 = 1.
𝑌𝑡 = 𝛼0 +𝜃1 𝑌𝑡−1 + 𝜀𝑦𝑡 ……….. (15)
𝑋𝑡 = 𝛼1 + 𝜃2 𝑋𝑡−1 + 𝜀𝑥𝑡 ……….. (16)
We use a sample size of 50 observations and regress 𝑌𝑡 on 𝑋𝑡 in a simple linear regression
model:
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝜀𝑦𝑡 ……….. (17)
We simulated the t-statistic of the coefficient on 𝑋𝑡 1000 times in a Monte Carlo experiment;
Figure 1 shows the results. The vertical lines indicate the asymptotic critical values of ±1.96
at the 5% nominal level of significance. A large part of the distribution lies in the rejection
region. Although the regression is tested at the 5% nominal level of significance, over 1000
simulations the probability of spurious regression rises from 5% to 67%: we obtained significant
results 670 times out of 1000 instead of the expected 50 times.
Figure 1: The distribution of t-statistics for coefficient of X (t) [plot; horizontal axis from -20 to 20]
These spurious results are due to a missing variable, because we did not include the lagged values
of the variables as regressors. If we include the lagged values as regressors, the model becomes
ARDL (1, 1):
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝛽2 𝑋𝑡−1 + 𝛽3 𝑌𝑡−1 + 𝜀𝑦𝑡
Figure 2 shows the distribution of the t-statistic of the coefficient on 𝑋𝑡 under the ARDL (1, 1)
model. The vertical lines again indicate the asymptotic critical values of ±1.96 at the 5% nominal
level of significance. Only a small part of the distribution lies in the rejection region: over
1000 simulations, the probability of spurious regression is approximately 5%, which is the nominal
level. This suggests that ARDL can be used as a treatment for spurious regression with
nonstationary series. Granger and Newbold (1974) ran the same kind of experiment without
considering the lag dynamics, which is why they obtained spurious results.
Figure 2: The distribution of t-statistics of coefficient of X(t) after ARDL model [plot; horizontal axis from -4 to 4]
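The experiment in this subsection can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes zero drift (𝛼0 = 𝛼1 = 0), standard normal innovations and zero starting values, and it compares rejection rates at the ±1.96 critical value for the static regression and for an ARDL (1, 1) regression on the same simulated samples.

```python
import numpy as np


def t_stat(y, X, k):
    """t-statistic of coefficient k in an OLS regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[k] / np.sqrt(cov[k, k])


def rejection_rates(n_obs=50, n_reps=1000, crit=1.96, seed=0):
    """Share of significant t-statistics on X_t for the static and ARDL(1,1) models."""
    rng = np.random.default_rng(seed)
    static_hits = 0
    ardl_hits = 0
    for _ in range(n_reps):
        x = np.cumsum(rng.standard_normal(n_obs))   # independent random walks
        y = np.cumsum(rng.standard_normal(n_obs))
        const = np.ones(n_obs - 1)
        # Both regressions use observations 2..n so that the samples coincide.
        X_static = np.column_stack([const, x[1:]])
        X_ardl = np.column_stack([const, x[1:], x[:-1], y[:-1]])
        static_hits += int(abs(t_stat(y[1:], X_static, 1)) > crit)
        ardl_hits += int(abs(t_stat(y[1:], X_ardl, 1)) > crit)
    return static_hits / n_reps, ardl_hits / n_reps


static_rate, ardl_rate = rejection_rates()
print(f"static regression: {static_rate:.3f}")
print(f"ARDL(1,1):         {ardl_rate:.3f}")
```

The exact rates depend on the seed and on the assumptions above, so they need not match the figures reported in the text, but the gap between the two models should be of the same order.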
5.2 Simulation results with nonstationary series of integrated order 2
We generated two independent autoregressive nonstationary series, each integrated of order 2,
from the data generating process given above by imposing the restrictions 𝜃12 = 𝜃21 = 0
and 𝜌 = 0. 𝑋𝑡 and 𝑌𝑡 are each expressed in terms of their own lagged values, with lag
coefficients equal to 1. We use a sample size of 50 observations.
𝑌𝑡 = 𝑌𝑡−1 + 𝑌𝑡−2 + 𝜀𝑦𝑡 ……….. (19)
𝑋𝑡 = 𝑋𝑡−1 + 𝑋𝑡−2 + 𝜀𝑥𝑡 ……….. (20)
No third variable is involved in the construction of either series. We regressed 𝑌𝑡 on 𝑋𝑡, and
𝑋𝑡 on 𝑌𝑡, in a simple linear regression model without including the lagged values:
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝜀𝑦𝑡 ……….. (21)
Figure 3: The distribution of t-statistics for coefficient of X (t) [plot; horizontal axis from -200 to 200]
The vertical lines indicate the asymptotic critical values of ±1.96 at the 5% nominal level of
significance. A large part of the distribution lies in the rejection region. Although the
regression is tested at the 5% nominal level of significance, over 1000 simulations the
probability of spurious regression is 92%, which is 87 percentage points above the nominal level.

These spurious results are due to a missing variable, because we did not include the lagged values
of the variables as regressors. As a first step, we include one lag of X and Y as regressors, so
that the model becomes ARDL (1, 1):
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝛽2 𝑋𝑡−1 + 𝛽3 𝑌𝑡−1 + 𝜀𝑦𝑡
Figure 4: The distribution of t-statistics for coefficient of X (t) after ARDL (1, 1) [plot; horizontal axis from -12.5 to 15.0]
Figure 4 shows the distribution of the t-statistic of the coefficient on 𝑋𝑡 under the ARDL (1, 1)
model. The vertical lines indicate the asymptotic critical values of ±1.96 at the 5% nominal level
of significance. A large part of the distribution still lies in the rejection region: although the
regression is tested at the 5% nominal level of significance, over 1000 simulations the actual
rejection rate is 50%. ARDL (1, 1) therefore reduced the excess probability of spurious regression
over the nominal level from 87 to 45 percentage points.

The remaining spurious results are due to a missing variable, because we did not include the
second lagged values of the variables as regressors. We now also include the second lags of X and
Y as regressors, so that the model becomes ARDL (2, 2):
𝑌𝑡 = 𝑎 + 𝛽1 𝑋𝑡 + 𝛽2 𝑋𝑡−1 + 𝛽3 𝑌𝑡−1 + 𝛽4 𝑋𝑡−2 + 𝛽5 𝑌𝑡−2 + 𝜀𝑦𝑡 ……….. (23)
Figure 5: The distribution of t-statistics for coefficient of X(t) after ARDL (2, 2) [plot; horizontal axis from -5 to 3]
Only a small part of the distribution lies in the rejection region. Although the regression is
tested at the 5% nominal level of significance, over 1000 simulations the probability of spurious
regression is 7%, a distortion of only 2 percentage points: we obtained significant results 70
times out of 1000 instead of the expected 50 times. This suggests that ARDL can also be used as a
treatment for spurious regression with time series of higher integration order.
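The effect of adding lags step by step can be sketched in code. The example assumes that an I(2) series is built by cumulating white noise twice, which is the standard construction and may differ from the recursion printed in equations (19) and (20); the helper names are hypothetical, and zero lags corresponds to the static regression.

```python
import numpy as np


def t_stat(y, X, k):
    """t-statistic of coefficient k in an OLS regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[k] / np.sqrt(cov[k, k])


def lagged_design(x, y, n_lags):
    """Return y_t and the regressors [1, x_t, x_{t-1}, y_{t-1}, ..., x_{t-n}, y_{t-n}]."""
    n_obs = len(y)
    cols = [np.ones(n_obs - n_lags), x[n_lags:]]
    for lag in range(1, n_lags + 1):
        cols.append(x[n_lags - lag:n_obs - lag])
        cols.append(y[n_lags - lag:n_obs - lag])
    return y[n_lags:], np.column_stack(cols)


def rejection_rate(n_lags, n_obs=50, n_reps=1000, crit=1.96, seed=0):
    """Share of significant t-statistics on x_t when n_lags lags of x and y are included."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_reps):
        x = np.cumsum(np.cumsum(rng.standard_normal(n_obs)))  # assumed I(2) construction
        y = np.cumsum(np.cumsum(rng.standard_normal(n_obs)))
        target, X = lagged_design(x, y, n_lags)
        hits += int(abs(t_stat(target, X, 1)) > crit)
    return hits / n_reps


rates = {n_lags: rejection_rate(n_lags) for n_lags in (0, 1, 2)}
for n_lags, rate in rates.items():
    print(f"lags of X and Y included: {n_lags}, rejection rate: {rate:.3f}")
```

Using the same seed for each lag order means the three models are compared on identical simulated series, so differences in the rejection rate come from the specification alone.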
6. Conclusion
In conventional econometrics, unit root and cointegration analysis are the only ways to circumvent
spurious regression in the presence of nonstationarity. Nevertheless, these procedures are
unreliable because of specification decisions such as the autoregressive lag length, the choice of
the deterministic part, structural breaks and the distribution of the innovation process. Having
reviewed an extensive body of literature and evidence, we conclude that missing variables (lagged
values) are the major cause of spurious regression in all the cases considered. An alternative way
of looking at the problem of spurious regression therefore leads back to the missing variable, and
in turn to the ARDL model. The results support the use of the ARDL model as a remedy for spurious
regression.
References
Atiq-ur-Rehman, A. U. R., & Zaman, A. (2008). Model specification, observational equivalence
and performance of unit root tests.
Ahking, F. W. (2002). Model mis-specification and Johansen's co-integration analysis: an
application to the US money demand. Journal of Macroeconomics, 24(1), 51-66.
Agunloye, O. K., & Shangodoyin, D. K. (2014). Lag Length Specification in Engle-Granger
Cointegration Test: A Modified Koyck Mean Lag Approach Based on Partial Correlation.
Statistics in Transition new series, 15(4).
Charemza, W. W., & Deadman, D. F. (1997). New directions in econometric practice. Books.
Choi, C. Y., Hu, L., & Ogaki, M. (2004). A spurious regression approach to estimating structural
parameters. Ohio State University Department of Economics Working Paper, (04-01).

Carrasco Gutierrez, C. E., Castro Souza, R., & Teixeira de Carvalho Guillén, O. (2009). Selection
of optimal lag length in cointegrated VAR models with weak form of common cyclical features.
Chaouachi, K. (2013). False positive result in study on hookah smoking and cancer in Kashmir:
measuring risk of poor hygiene is not the same as measuring risk of inhaling water filtered tobacco
smoke all over the world. British journal of cancer, 108(6), 1389.
Davidson, J. E., Hendry, D. F., Srba, F., & Yeo, S. (1978). Econometric modelling of the aggregate
time-series relationship between consumers' expenditure and income in the United Kingdom. The
Economic Journal, 661-692.
DeJong, D. N., Nankervis, J. C., Savin, N. E., & Whiteman, C. H. (1992). The power problems of
unit root test in time series with autoregressive errors. Journal of Econometrics, 53(1-3), 323-343.
Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series
with a unit root. Journal of the American statistical association, 74(366a), 427-431.
Dickey, D. A., & Fuller, W. A. (1981). Likelihood ratio statistics for autoregressive time series
with a unit root. Econometrica: Journal of the Econometric Society, 1057-1072.
Engle, R. F., & Granger, C. W. (1987). Co-integration and error correction: representation,
estimation, and testing. Econometrica: journal of the Econometric Society, 251-276.
Engle, R. and Yoo Sam (1991). Forecasting and Testing in Co-integrated Systems, In Engle and
Granger (eds.), Long Run Economic Relationships. Readings in Cointegration, Oxford University
Press, New York, 237-67.
Frey, B. S. (2002). Inspiring economics: Human motivation in political economy. Edward Elgar
Publishing.
Granger, C. W., & Newbold, P. (1974). Spurious regressions in econometrics. Journal of
econometrics, 2(2), 111-120.
Hashimzade, N., & Thornton, M. A. (Eds.). (2013). Handbook of research methods and
applications in empirical macroeconomics. Edward Elgar Publishing.
Hassler, U. (2003). Nonsense regressions due to neglected time-varying means. Statistical Papers,
44(2), 169-182.
Hendry, D. F. (1980). Econometrics-alchemy or science?. Economica, 387-406.
Hendry, D. F., & Richard, J. F. (1983). The econometric analysis of economic time series.
International Statistical Review/Revue Internationale de Statistique, 111-148.
Hendry, D. F., Pagan, A. R., & Sargan, J. D. (1984). Dynamic specification. Handbook of
econometrics, 2, 1023-1100.
Höfer, Thomas; Hildegard Przyrembel; Silvia Verleger (2004). New evidence for the Theory of
the Stork. Paediatric and Perinatal Epidemiology. 18 (1): 18–22.
Juselius, K. (1992). Testing structural hypotheses in a multivariate cointegration analysis of the
PPP and the UIP for UK. Journal of econometrics, 53(1-3), 211-244.
Leybourne, S. J., & Newbold, P. (2003). Spurious rejections by cointegration tests induced by
structural breaks. Applied Economics, 35(9), 1117-1121.
Nelson, C. R., & Plosser, C. R. (1982). Trends and random walks in macroeconmic time series:
some evidence and implications. Journal of monetary economics, 10(2), 139-162.
Plosser, C. I., & Schwert, G. W. (1978). Money, income, and sunspots: measuring economic
relationships and the effects of differencing. Journal of Monetary Economics, 4(4), 637-660.
Perron, P. (1990). Testing for a unit root in a time series with a changing mean. Journal of Business
& Economic Statistics, 8(2), 153-162.
Phillips, P. C. (1986). Understanding spurious regressions in econometrics. Journal of
econometrics, 33(3), 311-340.
Phillips, P. C., & Perron, P. (1988). Testing for a unit root in time series regression. Biometrika,
335-346.
Pesaran, M. H., Shin, Y., & Smith, R. J. (1996). Testing for the' Existence of a Long-run
Relationship' (No. 9622). Faculty of Economics, University of Cambridge.
Pesaran, M. H. (1997). The role of economic theory in modelling the long run. The Economic
Journal, 107(440), 178-191.
Pesaran, M. H., & Smith, R. (1995). Estimating long-run relationships from dynamic
heterogeneous panels. Journal of econometrics, 68(1), 79-113.
Rehman, A. U., & Malik, M. I. (2014). The modified R a robust measure of association for time
series. Electronic Journal of Applied Statistical Analysis, 7(1), 1-13.
Sapsford, Roger; Jupp, Victor, eds. (2006). Data Collection and Analysis. Sage. ISBN 0-7619-
4362-5.
Simon, H. A. (1954). Spurious correlation: a causal interpretation. Journal of the American
statistical Association, 49(267), 467-479.
Sun, Y. (2004). A convergent t-statistic in spurious regressions. Econometric Theory, 20(05), 943-
962.
Su, J. J. (2008). A note on spurious regressions between stationary series. Applied Economics
Letters, 15(15), 1225-1230.
Schwert, G. W. (2002). Tests for unit roots: A Monte Carlo investigation. Journal of Business &
Economic Statistics, 20(1), 5-17.
Ventosa-Santaulària, D. (2009). Spurious regression. Journal of Probability and Statistics, 2009.
Yule, G. U. (1926). Why do we sometimes get nonsense-correlations between Time-Series?--a
study in sampling and the nature of time-series. Journal of the royal statistical society, 89(1), 1-
63.
ardl: Estimating autoregressive distributed lag and equilibrium correction models
Sebastian Kripfganz [1], Daniel C. Schneider [2]
[1] University of Exeter Business School, Department of Economics, Exeter, UK
[2] Max Planck Institute for Demographic Research, Rostock, Germany
London Stata Conference, September 7, 2018
ssc install ardl

ARDL: autoregressive distributed lag model
The autoregressive distributed lag (ARDL) model (another commonly used abbreviation is ADL)
has been used for decades to model the relationship between
(economic) variables in a single-equation time series setup.
Its popularity also stems from the fact that cointegration of
nonstationary variables is equivalent to an error correction
(EC) process, and the ARDL model has a reparameterization
in EC form (Engle and Granger, 1987; Hassler and Wolters, 2006).
The existence of a long-run / cointegrating relationship can
be tested based on the EC representation. A bounds testing
procedure is available to draw conclusive inference without
knowing whether the variables are integrated of order zero or
one, I(0) or I(1), respectively (Pesaran, Shin, and Smith, 2001).
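The reparameterization in EC form can be worked through for the simplest case. The sketch assumes an ARDL(1, 1) model with a single regressor and no trend; the symbols α (speed of adjustment) and θ (long-run coefficient) are introduced here for illustration and need not match the notation used elsewhere in these slides.

```latex
\begin{align*}
y_t &= c_0 + \phi_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t \\
\Delta y_t &= c_0 - (1 - \phi_1)\, y_{t-1} + (\beta_0 + \beta_1)\, x_{t-1} + \beta_0 \Delta x_t + u_t \\
           &= c_0 - \alpha \left( y_{t-1} - \theta x_{t-1} \right) + \beta_0 \Delta x_t + u_t,
\qquad \alpha = 1 - \phi_1, \quad \theta = \frac{\beta_0 + \beta_1}{1 - \phi_1}.
\end{align*}
```

The second line follows by subtracting y_{t-1} from both sides and adding and subtracting \beta_0 x_{t-1}; the term in parentheses is the deviation from the long-run relationship.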
The ARDL / EC model is useful for forecasting and to
disentangle long-run relationships from short-run dynamics.
Long-run relationship: Some time series are bound together
due to equilibrium forces even though the individual time
series might move considerably.
[Figure: log consumption, log income and log investment, 1960 to 1980]
Data: National accounts, West Germany, seasonally adjusted, quarterly, billion DM, Lütkepohl (1993, Table E.1).
ARDL model
ARDL(p, q, . . . , q) model:
$$
y_t = c_0 + c_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=0}^{q} \beta_i' x_{t-i} + u_t ,
$$
p ≥ 1, q ≥ 0, for simplicity assuming that the lag order q is
the same for all variables in the K × 1 vector xt .
ardl depvar [indepvars] [if] [in] [, options]
ardl options for the lag order selection:
Fixed lag order for some or all variables: lags(numlist)
Optimally with the Akaike information criterion: aic
Optimally with the Bayesian information criterion (also known as the Schwarz or Schwarz-Bayesian information criterion): bic
Maximum lag order for selection criteria: maxlags(numlist)
Store information criteria in a matrix: matcrit(name)
Default: lags(.) bic maxlags(4)
Reproducible example: ARDL lag specification

. webuse lutkepohl2
(Quarterly SA West German macro data, Bil DM, from Lutkepohl 1993 Table E.1)
. ardl ln_consump ln_inc ln_inv, lags(. . 0) aic maxlags(. 2 .) matcrit(lagcombs)

F( 7, 80) = 49993.34
Prob > F = 0.0000
R-squared = 0.9998
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ln_consump |
L1. | .4568483 .1064085 4.29 0.000 .2450887 .6686079
L2. | .3250994 .1127767 2.88 0.005 .1006666 .5495322
L3. | .1048324 .1092992 0.96 0.340 -.11268 .3223449
L4. | -.1632413 .0853844 -1.91 0.059 -.3331616 .0066791
|
ln_inc |
--. | .4629184 .078421 5.90 0.000 .3068557 .6189812
L1. | -.202756 .0965775 -2.10 0.039 -.3949513 -.0105607
|
ln_inv | .0080284 .0118391 0.68 0.500 -.0155322 .0315889
_cons | .0373585 .0143755 2.60 0.011 .0087504 .0659667
------------------------------------------------------------------------------
Example (continued): Information criteria
. matrix list lagcombs
lagcombs[12,4]
ln_consump ln_inc ln_inv aic
r1 1 0 0 -585.22447
r2 1 1 0 -585.39189
r3 1 2 0 -583.88179
r4 2 0 0 -590.66282
r5 2 1 0 -592.6904
r6 2 2 0 -591.62792
r7 3 0 0 -588.69069
r8 3 1 0 -590.83183
r9 3 2 0 -589.67101
r10 4 0 0 -590.03466
r11 4 1 0 -592.73282
r12 4 2 0 -592.15636
. estat ic
Akaike’s information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
Model | Obs ll(null) ll(model) df AIC BIC
-------------+---------------------------------------------------------------
. | 88 -64.51057 304.3747 8 -592.7495 -572.9308
-----------------------------------------------------------------------------
Note: N=Obs used in calculating BIC; see [R] BIC note.
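As a check on the estat ic output above, the two criteria can be recomputed from the reported log likelihood. The Python sketch below uses the textbook definitions AIC = -2 ll + 2k and BIC = -2 ll + k ln(N); these formulas and the variable names are assumptions for illustration and are not stated on the slides.

```python
import math

# Values copied from the estat ic output above.
log_likelihood = 304.3747  # ll(model)
n_params = 8               # df: seven slope coefficients plus the constant
n_obs = 88                 # Obs

# Textbook definitions (assumed here, not quoted from the slides).
aic = -2.0 * log_likelihood + 2.0 * n_params
bic = -2.0 * log_likelihood + n_params * math.log(n_obs)

print(f"AIC = {aic:.4f}")
print(f"BIC = {bic:.4f}")
```

Both results agree with the reported AIC and BIC up to the rounding of the displayed log likelihood.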
Example (continued): Fast automatic lag selection

. timer on 1
. ardl ln_consump ln_inc ln_inv, aic dots noheader

----+---20%---+---40%---+---60%---+---80%---+-100%
..................................................
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 1
. timer list 1
1: 0.01 / 1 = 0.0150
Example (continued): Slow automatic lag selection

. timer on 2
. ardl ln_consump ln_inc ln_inv, aic dots noheader nofast

----+---20%---+---40%---+---60%---+---80%---+-100%
..................................................
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ln_consump |
L1. | .3068554 .0958427 3.20 0.002 .1160853 .4976255
L2. | .325385 .0789039 4.12 0.000 .1683307 .4824393
|
ln_inc | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
--. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
L1. | -.0375288 .0225036 -1.67 0.099 -.0823212 .0072636
L2. | .0228142 .0228968 1.00 0.322 -.0227607 .0683892
L3. | -.0129321 .0226411 -0.57 0.569 -.0579981 .0321339
L4. | -.0528173 .0184696 -2.86 0.005 -.0895801 -.0160544
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
. timer off 2
. timer list 2
2: 0.75 / 1 = 0.7520
Example (continued): Sample depends on lag selection

. ardl ln_consump ln_inc ln_inv, aic maxlags(8 8 4)

F( 8, 75) = 56976.90
Prob > F = 0.0000
R-squared = 0.9998
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ln_consump |
L1. | .30383 .0942165 3.22 0.002 .1161411 .491519
L2. | .3195318 .0776321 4.12 0.000 .1648808 .4741828
|
ln_inc | .3767587 .0389267 9.68 0.000 .2992128 .4543046
|
ln_inv |
--. | .0581759 .0170736 3.41 0.001 .0241635 .0921884
L1. | -.0185484 .0214624 -0.86 0.390 -.0613036 .0242068
L2. | .01012 .021505 0.47 0.639 -.0327202 .0529602
L3. | -.0146641 .0213098 -0.69 0.493 -.0571154 .0277872
L4. | -.0488136 .0174121 -2.80 0.006 -.0835003 -.0141269
|
_cons | .0416317 .0107782 3.86 0.000 .0201603 .063103
------------------------------------------------------------------------------
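The change in the estimation sample can be traced with simple arithmetic. The sketch below assumes that the data set holds 92 quarterly observations (inferred from the 88 observations used plus the 4 missing residuals reported later in this example) and that the first observations are dropped up to the largest maximum lag order; the function name is hypothetical.

```python
TOTAL_OBS = 92  # assumption: 88 used observations + 4 initial observations lost to lags


def effective_obs(total_obs: int, max_lag: int) -> int:
    """Observations left once the first max_lag periods are reserved for lags."""
    return total_obs - max_lag


def obs_from_f(numerator_df: int, denominator_df: int) -> int:
    """Sample size implied by F(k, N - k - 1) in a regression with a constant."""
    return numerator_df + denominator_df + 1


# Default maxlags(4): F(7, 80) in the first example.
print(effective_obs(TOTAL_OBS, 4), obs_from_f(7, 80))
# maxlags(8 8 4): F(8, 75) in the example above.
print(effective_obs(TOTAL_OBS, 8), obs_from_f(8, 75))
```

Under these assumptions both routes give the same sample size for each specification, which is why the coefficient estimates differ slightly from the earlier run even though the selected lag orders are the same.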
ARDL model: Optimal lag selection

The optimal model is the one with the smallest value (most
negative value) of the AIC or BIC. The BIC tends to select
more parsimonious models.
The information criteria are only comparable when the sample
is held constant. This can lead to different estimates even
with the same lag orders if the maximum lag order is varied.
ardl uses a fast Mata-based algorithm to obtain the optimal
lag order. This comes at the cost of minor numerical
differences in the values of the criteria compared to estat ic
but the ranking of the models is unaffected. The option
nofast avoids this problem but it uses a substantially slower
algorithm based on Stata’s regress command.
For very large models, it might be necessary to increase the
admissible maximum number of lag combinations with the
option maxcombs(#).
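The selection rule can be illustrated with the lagcombs matrix from the earlier example. The following Python sketch rebuilds the search grid implied by lags(. . 0) and maxlags(. 2 .) with the default maximum of 4 lags, and picks the combination with the smallest AIC; the dictionary and variable names are illustrative and not part of the ardl package.

```python
from itertools import product

# AIC values copied from the lagcombs matrix:
# (lags of ln_consump, lags of ln_inc, lags of ln_inv) -> AIC
lagcombs = {
    (1, 0, 0): -585.22447, (1, 1, 0): -585.39189, (1, 2, 0): -583.88179,
    (2, 0, 0): -590.66282, (2, 1, 0): -592.6904,  (2, 2, 0): -591.62792,
    (3, 0, 0): -588.69069, (3, 1, 0): -590.83183, (3, 2, 0): -589.67101,
    (4, 0, 0): -590.03466, (4, 1, 0): -592.73282, (4, 2, 0): -592.15636,
}

# Search grid: p = 1..4 for the dependent variable (p >= 1),
# q = 0..2 for ln_inc, and a fixed lag order of 0 for ln_inv.
grid = list(product(range(1, 5), range(0, 3), [0]))

# The optimal model has the smallest (most negative) criterion value.
best = min(lagcombs, key=lagcombs.get)

print(len(grid), best, lagcombs[best])
```

The selected orders correspond to the regression reported in the first example (four lags of ln_consump, one lag of ln_inc, none of ln_inv). The stored criterion value differs slightly from the estat ic value, in line with the remark above about the fast algorithm.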
EC representation
Reparameterization in conditional EC form (ardl option ec):
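$$\Delta y_t = c_0 + c_1 t - \alpha (y_{t-1} - \theta x_t) + \sum_{i=1}^{p-1} \psi_{yi} \Delta y_{t-i} + \sum_{i=0}^{q-1} \psi_{xi}' \Delta x_{t-i} + u_t,$$
with the speed-of-adjustment coefficient $\alpha = 1 - \sum_{j=1}^{p} \phi_j$ and
the long-run coefficients $\theta = \sum_{j=0}^{q} \beta_j / \alpha$.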
Alternative EC parameterization (ardl option ec1):
$$\Delta y_t = c_0 + c_1 t - \alpha (y_{t-1} - \theta x_{t-1}) + \sum_{i=1}^{p-1} \psi_{yi} \Delta y_{t-i} + \omega' \Delta x_t + \sum_{i=1}^{q-1} \psi_{xi}' \Delta x_{t-i} + u_t.$$
Example (continued): EC representation
. ardl ln_consump ln_inc ln_inv, aic ec noheader
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
ln_inv | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inv |
D1. | .080464 .0187106 4.30 0.000 .0432214 .1177066
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
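The ADJ and LR entries above can be reproduced from the levels coefficients of the ARDL(2, 0, 4) model reported in the automatic lag selection example. The Python sketch below applies the definitions of α and θ from the EC representation; the variable names are illustrative.

```python
# Levels coefficients copied from the ARDL(2, 0, 4) output.
phi = [0.3068554, 0.325385]  # ln_consump L1, L2
beta = {
    "ln_inc": [0.3682844],                                              # lag 0
    "ln_inv": [0.0656722, -0.0375288, 0.0228142, -0.0129321, -0.0528173],  # lags 0..4
}

# Speed of adjustment: alpha = 1 - sum of the autoregressive coefficients.
alpha = 1.0 - sum(phi)

# Long-run coefficients: sum of the distributed-lag coefficients divided by alpha.
theta = {name: sum(coefs) / alpha for name, coefs in beta.items()}

print(f"ADJ (reported as -alpha): {-alpha:.7f}")
for name, value in theta.items():
    print(f"LR {name}: {value:.6f}")
```

Up to rounding of the displayed coefficients, the results match the ADJ coefficient on L1.ln_consump and the two LR coefficients in the output.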
Example (continued): Alternative EC representation

. ardl ln_consump ln_inc ln_inv, aic ec1 noheader
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3677596 .0406085 -9.06 0.000 -.4485888 -.2869304
-------------+----------------------------------------------------------------
LR |
ln_inc |
L1. | 1.001427 .0265233 37.76 0.000 .9486337 1.05422
|
ln_inv |
L1. | -.0402213 .0309082 -1.30 0.197 -.1017424 .0212999
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.325385 .0789039 -4.12 0.000 -.4824393 -.1683307
|
ln_inc |
D1. | .3682844 .041534 8.87 0.000 .285613 .4509558
|
ln_inv |
D1. | .0656722 .0180596 3.64 0.000 .0297255 .1016189
LD. | .0429352 .0193931 2.21 0.030 .0043342 .0815361
L2D. | .0657494 .0181592 3.62 0.001 .0296045 .1018943
L3D. | .0528173 .0184696 2.86 0.005 .0160544 .0895801
|
_cons | .0469399 .0110639 4.24 0.000 .0249178 .068962
------------------------------------------------------------------------------
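The SR sections of the two parameterizations can also be recovered from the levels coefficients. The mapping used below (each short-run coefficient is minus the sum of the higher-order levels coefficients, and ω equals the contemporaneous coefficient under ec1) is the standard algebra of the reparameterization; it is not spelled out on the slides and is offered here as an assumption that can be checked against the reported output. Function names are hypothetical.

```python
def short_run_y(phi):
    """psi_{y,i} = -(phi_{i+1} + ... + phi_p) for i = 1, ..., p-1."""
    return [-sum(phi[i:]) for i in range(1, len(phi))]


def short_run_x_ec(beta):
    """Option ec: psi_{x,i} = -(beta_{i+1} + ... + beta_q) for i = 0, ..., q-1."""
    return [-sum(beta[i + 1:]) for i in range(len(beta) - 1)]


def short_run_x_ec1(beta):
    """Option ec1: omega = beta_0, followed by psi_{x,i} for i = 1, ..., q-1."""
    return [beta[0]] + short_run_x_ec(beta)[1:]


# Levels coefficients copied from the ARDL(2, 0, 4) output.
phi = [0.3068554, 0.325385]
beta_inc = [0.3682844]
beta_inv = [0.0656722, -0.0375288, 0.0228142, -0.0129321, -0.0528173]

print("LD.ln_consump:", short_run_y(phi))
print("ln_inv, ec:   ", short_run_x_ec(beta_inv))
print("ln_inv, ec1:  ", short_run_x_ec1(beta_inv))
print("ln_inc, ec:   ", short_run_x_ec(beta_inc))   # empty: no short-run term
print("ln_inc, ec1:  ", short_run_x_ec1(beta_inc))  # only D1 = omega
```

The two parameterizations differ only in the coefficient on the contemporaneous first difference; the remaining short-run, long-run, and adjustment coefficients coincide, as the two output tables show.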
Example (continued): Attaching exogenous variables
. ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) aic ec noheader
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.3788728 .0420886 -9.00 0.000 -.4626481 -.2950975
-------------+----------------------------------------------------------------
LR |
ln_inc | .9669152 .0039557 244.44 0.000 .9590416 .9747889
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.346926 .0806726 -4.30 0.000 -.5075007 -.1863512
L2D. | -.1074193 .0790118 -1.36 0.178 -.2646883 .0498497
|
ln_inv |
D1. | .0758713 .0176989 4.29 0.000 .0406425 .1111002
LD. | .0422224 .0191523 2.20 0.030 .0041008 .080344
L2D. | .0678568 .0185208 3.66 0.000 .030992 .1047216
L3D. | .0485441 .0179609 2.70 0.008 .0127938 .0842944
|
_cons | .0504873 .0114518 4.41 0.000 .027693 .0732816
------------------------------------------------------------------------------
EC representation: Interpretation
The long-run coefficients θ are reported in the output section
LR. They represent the equilibrium effects of the independent
variables on the dependent variable. In the presence of
cointegration, they correspond to the negative cointegration
coefficients after normalizing the coefficient of the dependent
variable to unity. The latter is not explicitly displayed.
The negative speed-of-adjustment coefficient −α is reported
in the output section ADJ. It measures how strongly the
dependent variable reacts to a deviation from the equilibrium
relationship in one period or, in other words, how quickly such
an equilibrium distortion is corrected.
The short-run coefficients $\psi_{yi}$, $\psi_{xi}$ (and $\omega$) are reported in the
output section SR. They account for short-run fluctuations not
due to deviations from the long-run equilibrium.
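To give a rough feel for the size of the speed-of-adjustment coefficient, the sketch below traces how much of an initial deviation from equilibrium would remain after k periods if only the error-correction term were at work. This is a deliberate simplification: it ignores the short-run terms, so it is an illustration of the magnitude of α and not the model's actual impulse response.

```python
# Speed of adjustment from the ADJ section of the example (reported as -alpha).
alpha = 0.3677596

# Share of an initial equilibrium deviation that remains after k periods,
# assuming (for illustration only) that no short-run dynamics interfere.
remaining = [(1.0 - alpha) ** k for k in range(5)]

for k, share in enumerate(remaining):
    print(f"after {k} period(s): {share:.3f} of the deviation remains")
```

With α close to 0.37, a bit more than a third of a deviation is corrected within one quarter under this simplification.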
EC representation: Integration order
The independent variables are allowed to be individually I(0)
or I(1).
The independent variables must be long-run forcing (weakly
exogenous) for the dependent variable, i.e. there can be at
most one cointegrating relationship involving the dependent
variable. (There might be further cointegrating relationships
among the independent variables themselves.)
By default, each independent variable is included in the
long-run relationship. I(0) variables that shall only affect the
short-run dynamics can be specified with the option
exog(varlist). An automatic lag selection or
first-difference transformation is not performed for the latter.
Pesaran, Shin, and Smith (2001) bounds test:
1 Use the F-statistic to test the joint null hypothesis
$H_0^F: (\alpha = 0) \cap \left(\sum_{j=0}^{q} \beta_j = 0\right)$ versus the alternative
hypothesis $H_1^F: (\alpha \neq 0) \cup \left(\sum_{j=0}^{q} \beta_j \neq 0\right)$ (footnote 3).
2 If $H_0^F$ is rejected, use the t-statistic to test the single
hypothesis $H_0^t: \alpha = 0$ versus $H_1^t: \alpha \neq 0$.
3 If $H_0^t$ is rejected, use conventional z-tests (or Wald tests) to
test whether the elements of θ are individually (or jointly)
statistically significantly different from zero.
There is statistical evidence for the existence of a long-run /
cointegrating relationship if the null hypothesis is rejected in
all three steps.
Footnote 3: The test is not directly performed on the long-run coefficients $\theta = \sum_{j=0}^{q} \beta_j / \alpha$.
The distributions of the test statistics in steps 1 and 2 are
nonstandard and depend on the integration order of the
independent variables.
Kripfganz and Schneider (2018) use response surface
regressions to obtain finite-sample and asymptotic critical
values, as well as approximate p-values, for the lower and
upper bound of all independent variables being purely I(0) or
purely I(1) (and not mutually cointegrated), respectively.
These critical values supersede the near-asymptotic critical
values provided by Pesaran, Shin, and Smith (2001) and the
finite-sample critical values by Narayan (2005), among others.
The critical values depend on the number of independent
variables, their integration order, the number of short-run
coefficients (footnote 4), and the inclusion of an intercept or time trend.
ardl options for the deterministic model components:
1 No intercept, no trend: noconstant
2 Restricted intercept, no trend: restricted
3 Unrestricted intercept, no trend: the default
4 Unrestricted intercept, restricted trend: trend(varname) and restricted
5 Unrestricted intercept, unrestricted trend: trend(varname)
Footnote 4: The number of short-run coefficients only affects the finite-sample but not the asymptotic critical values
(Cheung and Lai, 1995; Kripfganz and Schneider, 2018). The elements of ω in the ec1 parameterization for
variables that have 0 lags in the ARDL model do not count towards this number.
Test decisions:
Do not reject $H_0^F$ or $H_0^t$, respectively, if the test statistic is
closer to zero than the lower bound of the critical values.
Reject $H_0^F$ or $H_0^t$, respectively, if the test statistic is more
extreme than the upper bound of the critical values.
The first two steps of the bounds test are implemented in the
ardl postestimation command estat ectest.
By default, finite-sample critical values for the 1%, 5%, and
10% significance levels are provided. Asymptotic critical values
are displayed with option asymptotic. Alternative significance
levels can be specified with option siglevels(numlist).
The test statistics in step 3 have the usual asymptotic
standard normal (or χ²) distributions irrespective of the
integration order of the independent variables (footnote 5).
Footnote 5: The OLS estimator for the long-run coefficients θ of I(1) independent variables is “super-consistent” with
convergence rate $T$ instead of $\sqrt{T}$ (Pesaran and Shin, 1998; Hassler and Wolters, 2006).
Example (continued): Bounds test
. estat ectest

Case 3 t = -9.002
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.032 4.831 | 4.958 5.843 | 7.070 8.119 | 0.000 0.000
t | -2.550 -2.899 | -2.861 -3.225 | -3.470 -3.854 | 0.000 0.000
do not reject H0 if
reject H0 if
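The test decision described earlier can be written as a small rule. The Python sketch below is a hypothetical helper, not part of the ardl package: it compares a statistic with its I(0) and I(1) critical values, treating large positive values as extreme for the F-statistic and large negative values as extreme for the t-statistic. The label "inconclusive" for a statistic that falls between the two bounds is an assumption of this sketch. Only the t-statistic is checked, because the F-statistic value is not legible in the output above.

```python
def bounds_decision(stat: float, lower: float, upper: float, kind: str) -> str:
    """Classify a bounds-test statistic against its critical value bounds.

    lower: critical value if all independent variables are I(0)
    upper: critical value if all independent variables are I(1)
    kind:  "F" (large positive values are extreme) or
           "t" (large negative values are extreme)
    """
    if kind == "F":
        closer_to_zero = stat < lower
        more_extreme = stat > upper
    elif kind == "t":
        closer_to_zero = stat > lower
        more_extreme = stat < upper
    else:
        raise ValueError("kind must be 'F' or 't'")
    if closer_to_zero:
        return "do not reject H0"
    if more_extreme:
        return "reject H0"
    return "inconclusive"


# t-statistic and finite-sample critical values copied from the output above.
t_stat = -9.002
t_bounds = {"10%": (-2.550, -2.899), "5%": (-2.861, -3.225), "1%": (-3.470, -3.854)}

decisions = {level: bounds_decision(t_stat, lo, hi, "t")
             for level, (lo, hi) in t_bounds.items()}
print(decisions)
```

At all three significance levels the t-statistic lies beyond the I(1) bound, which is consistent with the reported p-values of 0.000.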
Example (continued): EC model with restricted trend
. ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic ec restricted noheader
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ADJ |
ln_consump |
L1. | -.341178 .0431316 -7.91 0.000 -.4270464 -.2553096
-------------+----------------------------------------------------------------
LR |
ln_inc | 1.14358 .0782318 14.62 0.000 .9878321 1.299327
qtr | -.0036516 .0016171 -2.26 0.027 -.006871 -.0004322
-------------+----------------------------------------------------------------
SR |
ln_consump |
LD. | -.4362663 .0851 -5.13 0.000 -.6056874 -.2668452
L2D. | -.1899566 .0825977 -2.30 0.024 -.354396 -.0255172
|
ln_inv |
D1. | .0842961 .0173889 4.85 0.000 .0496775 .1189146
LD. | .0517241 .0188448 2.74 0.008 .0142069 .0892412
L2D. | .0726232 .017972 4.04 0.000 .0368437 .1084027
L3D. | .0482872 .0173383 2.79 0.007 .0137693 .0828051
|
_cons | -.3188651 .1422961 -2.24 0.028 -.602155 -.0355753
------------------------------------------------------------------------------
Example (continued): Bounds test with restricted trend
. estat ectest

Case 4 t = -7.910
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 4.066 4.582 | 4.784 5.351 | 6.396 7.057 | 0.000 0.000
t | -3.107 -3.384 | -3.412 -3.704 | -4.014 -4.327 | 0.000 0.000
do not reject H0 if
reject H0 if
Further information on the bounds test
The validity of the bounds test relies on normally distributed
error terms that are homoskedastic and serially uncorrelated,
as well as stability of the coefficients over time.
If in doubt about remaining serial error correlation, increase
the lag order for testing purposes (e.g. use the AIC instead of
the BIC to obtain the optimal lag order).
A more parsimonious model for interpretation and forecasting
purposes can be estimated after the testing procedure.
If the bounds test does not reject the null hypothesis of no
long-run relationship, an ARDL model purely in first differences
(without an equilibrium correction term) might be estimated.
Besides estat ectest, the ardl command supports
standard Stata postestimation commands such as estat ic,
estimates, lincom, nlcom, test, testnl, and lrtest.
predict can be used to obtain fitted values (option xb) and
residuals (option residuals) in the usual way. In addition,
the option ec generates the equilibrium correction term:
$\widehat{ec}_t = y_{t-1} - \hat{\theta} x_t$ after ardl, ec
$\widehat{ec}_t = y_{t-1} - \hat{\theta} x_{t-1}$ after ardl, ec1
The diagnostic commands sktest, qnorm, and pnorm are
helpful as well to detect nonnormality of the residuals.
The final ardl estimation results are internally obtained with
the regress command. These underlying regress estimates
can be stored with the ardl option regstore(name) and
restored with estimates restore name.
Subsequently, all the familiar regress postestimation
commands are available, in particular:
estat hettest and estat imtest for heteroskedasticity and
normality testing,
estat bgodfrey and estat durbinalt for serial-correlation
testing (footnote 6),
estat sbcusum, estat sbknown, and estat sbsingle for
structural-break testing.
Footnote 6: estat dwatson is not valid for ARDL / EC models because the lagged dependent variable is not strictly
exogenous by construction.
Example (continued): Serial-correlation testing

. quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic ec regstore(ardlreg)
. estimates restore ardlreg
(results ardlreg are active now)
. estat bgodfrey, lags(1/4) small
Breusch-Godfrey LM test for autocorrelation

---------------------------------------------------------------------------
-------------+-------------------------------------------------------------
1 | 0.116 ( 1, 77 ) 0.7341
2 | 0.068 ( 2, 76 ) 0.9340
3 | 0.364 ( 3, 75 ) 0.7791
4 | 0.453 ( 4, 74 ) 0.7702
---------------------------------------------------------------------------
. estat durbinalt, lags(1/4) small
Durbin’s alternative test for autocorrelation

---------------------------------------------------------------------------
-------------+-------------------------------------------------------------
1 | 0.102 ( 1, 77 ) 0.7505
2 | 0.059 ( 2, 76 ) 0.9426
3 | 0.314 ( 3, 75 ) 0.8150
4 | 0.389 ( 4, 74 ) 0.8162
---------------------------------------------------------------------------
Example (continued): Heteroskedasticity testing
. estat hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity

Ho: Constant variance
Variables: fitted values of D.ln_consump
chi2(1) = 0.26
Prob > chi2 = 0.6067
. estat imtest, white
White’s test for Ho: homoskedasticity

against Ha: unrestricted heteroskedasticity
chi2(54) = 52.03
Prob > chi2 = 0.5508
Cameron & Trivedi’s decomposition of IM-test
---------------------------------------------------
Source | chi2 df p
---------------------+-----------------------------
Heteroskedasticity | 52.03 54 0.5508
Skewness | 12.24 9 0.2000
Kurtosis | 0.02 1 0.8967
---------------------+-----------------------------
Total | 64.29 64 0.4664
---------------------------------------------------
Example (continued): Normality testing

. predict resid, residuals
(4 missing values generated)
. sktest resid
Skewness/Kurtosis tests for Normality

------ joint ------
Variable | Obs Pr(Skewness) Pr(Kurtosis) adj chi2(2) Prob>chi2
-------------+---------------------------------------------------------------
resid | 88 0.3270 0.8107 1.04 0.5939
. qnorm resid
. pnorm resid
[Figures: normal quantile plot (qnorm) and standardized normal probability plot (pnorm) of resid.]

. estat sbcusum


------------------------------------------------------------------------------
recursive 1.4690 1.1430 0.9479 0.850
------------------------------------------------------------------------------
[Figure: recursive cusum plot of D.ln_consump, horizontal axis 1961 to 1981.]

. estat sbcusum, ols


------------------------------------------------------------------------------
ols 0.6793 1.6276 1.3581 1.224
------------------------------------------------------------------------------
[Figure: OLS cusum plot of D.ln_consump, horizontal axis 1961 to 1981.]
. estat sbsingle, all

----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
..........
Number of obs = 88


-----------------------------------------------
swald 20.1088 0.3040
awald 13.9245 0.1019
ewald 7.9897 0.1939
slr 22.7977 0.1605
alr 16.3306 0.0330
elr 9.3047 0.0886
-----------------------------------------------
Coefficients included in test: L.ln_consump ln_inc LD.ln_consump L2D.ln_consump D.ln_inv LD.ln_inv
L2D.ln_inv L3D.ln_inv qtr _cons
. estat sbsingle, breakvars(L.ln_consump ln_inc) all

----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
..........
Number of obs = 88


-----------------------------------------------
swald 8.9039 0.1457
awald 2.5060 0.2608
ewald 2.0321 0.1738
slr 9.7492 0.1046
alr 2.8269 0.2027
elr 2.3571 0.1225
-----------------------------------------------
Coefficients included in test: L.ln_consump ln_inc
Note: This is a test for a structural break in the speed-of-adjustment and long-run coefficients.
Further topics
The ardl command can estimate autoregressive models
without independent variables. In this case, the bounds test
collapses to the familiar augmented Dickey-Fuller unit root
test. The Kripfganz and Schneider (2018) critical values cover
this special case, too.
The forecast command suite can be used for model
forecasting after ardl.
ardl does not compute robust standard errors. Yet, once the
optimal lag order is obtained, the final model can be
reestimated with the newey command to obtain Newey-West
standard errors.
Example (continued): Augmented Dickey-Fuller regression
. ardl dln_inv, aic ec restricted
ARDL(4) regression

R-squared = 0.6462
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ADJ |
dln_inv |
L1. | -.755277 .2295731 -3.29 0.001 -1.211971 -.2985831
-------------+----------------------------------------------------------------
LR |
_cons | .015006 .0060544 2.48 0.015 .0029618 .0270501
-------------+----------------------------------------------------------------
SR |
dln_inv |
LD. | -.4633003 .2005284 -2.31 0.023 -.8622152 -.0643855
L2D. | -.4938993 .1577325 -3.13 0.002 -.8076796 -.180119
L3D. | -.3133117 .1029967 -3.04 0.003 -.5182049 -.1084184
------------------------------------------------------------------------------
Note: The aim is to test whether dln_inv, the first difference of ln_inv, is nonstationary.
. estat ectest

Case 2 t = -3.290
| 10% | 5% | 1% | p-value
| I(0) I(1) | I(0) I(1) | I(0) I(1) | I(0) I(1)
---+------------------+------------------+------------------+-----------------
F | 3.823 3.812 | 4.677 4.659 | 6.644 6.601 | 0.026 0.025
t | -2.565 -2.569 | -2.869 -2.874 | -3.463 -3.472 | 0.017 0.017
do not reject H0 if
reject H0 if
Note: The null hypothesis is that dln_inv follows a unit root process (without drift).
. dfuller dln_inv if e(sample), lags(3) regress
Augmented Dickey-Fuller test for unit root Number of obs = 87
---------- Interpolated Dickey-Fuller ---------

Test 1% Critical 5% Critical 10% Critical
Statistic Value Value Value
------------------------------------------------------------------------------
Z(t) -3.290 -3.528 -2.900 -2.585
------------------------------------------------------------------------------
MacKinnon approximate p-value for Z(t) = 0.0153
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
dln_inv |
L1. | -.755277 .2295731 -3.29 0.001 -1.211971 -.2985831
LD. | -.4633003 .2005284 -2.31 0.023 -.8622152 -.0643855
L2D. | -.4938993 .1577325 -3.13 0.002 -.8076796 -.180119
L3D. | -.3133117 .1029967 -3.04 0.003 -.5182049 -.1084184
|
_cons | .0113337 .0060208 1.88 0.063 -.0006437 .023311
------------------------------------------------------------------------------
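The two regressions above report different constants because ardl with option restricted places the intercept inside the long-run relationship, while dfuller reports it as an ordinary regression constant. Assuming the restricted-intercept form Δy_t = -α(y_{t-1} - μ) + ..., the regression constant equals α times the long-run constant μ. The Python sketch below checks this with the reported numbers; the variable names are illustrative.

```python
# Values copied from the ardl output (restricted intercept).
adj_coefficient = -0.755277    # ADJ: coefficient on L1.dln_inv, equal to -alpha
long_run_constant = 0.015006   # LR: _cons

# Value copied from the dfuller regression output.
dfuller_constant = 0.0113337   # _cons

alpha = -adj_coefficient
implied_constant = alpha * long_run_constant

print(f"alpha * long-run constant = {implied_constant:.7f}")
print(f"dfuller _cons             = {dfuller_constant:.7f}")
```

The two values agree up to rounding, while the coefficients on L1, LD, L2D, and L3D are identical in both tables.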
Example (continued): Forecasting

. quietly ardl ln_consump ln_inc ln_inv if qtr < tq(1981q1), trend(qtr)
. estimates store ardl
. forecast create ardl
Forecast model ardl started.
. forecast estimates ardl, predict(xb)

Added estimation results from ardl.
Forecast model ardl now contains 1 endogenous variable.
. forecast exogenous ln_inc ln_inv qtr

Forecast model ardl now contains 3 declared exogenous variables.
. forecast solve, begin(tq(1981q1))
Computing dynamic forecasts for model ardl.

-------------------------------------------
Starting period: 1981q1
Ending period: 1982q4
Forecast prefix: f_
1981q1: ...........
1981q2: ...........
1981q3: ...........
1981q4: ...........
1982q1: ...........
1982q2: ..........
1982q3: ..........
1982q4: ...........
Forecast 1 variable spanning 8 periods.

---------------------------------------
Example (continued): Forecast versus actual data

. twoway (tsline f_ln_consump if qtr>=tq(1979q1)) (tsline ln_consump if qtr>=tq(1979q1)), tline(1981q1)
[Figure: forecast of log consumption (f_ln_consump, model ardl) versus actual log consumption, 1979 to 1982, with a vertical line at 1981q1.]
Note: The forecast period (1981q1 – 1982q4) is excluded from the estimation period (1961q1 – 1980q4).
Example (continued): Newey-West standard errors

. quietly ardl ln_consump ln_inc, exog(L(0/3)D.ln_inv) trend(qtr) aic regstore(ardlreg)
. quietly estimates restore ardlreg
. local cmdline `"`e(cmdline)'"'
. gettoken cmd cmdline : cmdline
. newey `cmdline' lag(4)
Regression with Newey-West standard errors Number of obs = 88

maximum lag: 4 F( 9, 78) = 62645.21
Prob > F = 0.0000
------------------------------------------------------------------------------
| Newey-West
-------------+----------------------------------------------------------------
ln_consump |
L1. | .2225557 .0931767 2.39 0.019 .0370552 .4080562
L2. | .2463097 .1003579 2.45 0.016 .0465125 .4461068
L3. | .1899566 .1013927 1.87 0.065 -.0119008 .3918141
|
ln_inc | .3901642 .0400174 9.75 0.000 .3104956 .4698327
|
ln_inv |
D1. | .0842961 .0258047 3.27 0.002 .0329229 .1356693
LD. | .0517241 .0158053 3.27 0.002 .0202582 .08319
L2D. | .0726232 .0156803 4.63 0.000 .0414061 .1038404
L3D. | .0482872 .017342 2.78 0.007 .013762 .0828124
|
qtr | -.0012458 .000383 -3.25 0.002 -.0020083 -.0004833
_cons | -.3188651 .1104624 -2.89 0.005 -.5387789 -.0989513
------------------------------------------------------------------------------
Example (continued): Long-run coefficient
. nlcom _b[ln_inc] / (1 - _b[L.ln_consump] - _b[L2.ln_consump] - _b[L3.ln_consump])
_nl_1: _b[ln_inc] / (1 - _b[L.ln_consump] - _b[L2.ln_consump] - _b[L3.ln_consump])
------------------------------------------------------------------------------
ln_consump | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_nl_1 | 1.14358 .0691576 16.54 0.000 1.008033 1.279126
------------------------------------------------------------------------------
Note: This is the same long-run coefficient as earlier but with Newey-West standard errors.
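The point estimate returned by nlcom can be verified by hand from the newey coefficients, and the same ratio applied to the trend coefficient reproduces the long-run trend coefficient of the restricted-trend EC model shown earlier. The Python sketch below does only this arithmetic; it does not reproduce the delta-method standard errors, and the variable names are illustrative.

```python
# Coefficients copied from the newey output.
phi = [0.2225557, 0.2463097, 0.1899566]  # ln_consump L1, L2, L3
b_ln_inc = 0.3901642
b_qtr = -0.0012458

# Denominator of the nlcom expression, equal to the speed of adjustment alpha.
alpha = 1.0 - sum(phi)

theta_ln_inc = b_ln_inc / alpha
theta_qtr = b_qtr / alpha

print(f"alpha        = {alpha:.6f}")
print(f"theta ln_inc = {theta_ln_inc:.5f}")
print(f"theta qtr    = {theta_qtr:.7f}")
```

The value of alpha matches the ADJ coefficient (in absolute value) of the restricted-trend model, and the two ratios match its LR coefficients on ln_inc and qtr up to rounding.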
Summary: The ardl package for Stata
The ardl command estimates an ARDL model with optimal
or prespecified lag orders, possibly reparameterized in EC form.
The bounds test for the existence of a long-run /
cointegrating relationship is implemented as the
postestimation command estat ectest.
Asymptotic and finite-sample critical value bounds are
available (Kripfganz and Schneider, 2018).
The augmented Dickey-Fuller unit root test is a special case in
the absence of independent variables.
The usual regress postestimation commands can be applied.
ssc install ardl
help ardl
help ardl postestimation
References
Cheung, Y.-W., and K. S. Lai (1995). Lag order and critical values of the augmented Dickey-Fuller test.
Journal of Business & Economic Statistics 13(3): 277–280.
Engle, R. F., and C. W. J. Granger (1987). Co-integration and error correction: representation, estimation,
and testing. Econometrica 55(2): 251–276.
Hassler, U., and J. Wolters (2006). Autoregressive distributed lag models and cointegration. Allgemeines
Statistisches Archiv 90(1): 59–74.
Kripfganz, S., and D. C. Schneider (2018). Response surface regressions for critical value bounds and
approximate p-values in equilibrium correction models. Manuscript, University of Exeter and Max Planck
Institute for Demographic Research, www.kripfganz.de.
Lütkepohl, H. (1993). Introduction to Multiple Time Series Analysis (2nd edition), Berlin, New York:
Springer.
Narayan, P. K. (2005). The saving and investment nexus for China: evidence from cointegration tests.
Applied Economics 37(17): 1979–1990.
Pesaran, M. H., and Y. Shin (1998). An autoregressive distributed-lag modelling approach to cointegration
analysis. In Econometrics and Economic Theory in the 20th Century. The Ragnar Frisch Centennial
Symposium, ed. S. Strøm, chap. 11, 371–413. Cambridge: Cambridge University Press.
Pesaran, M. H., Y. Shin, and R. Smith (2001). Bounds testing approaches to the analysis of level
relationships. Journal of Applied Econometrics 16(3): 289–326.
This document is the property of the Institut Tunisien de la Compétitivité et des
Études Quantitatives (ITCEQ). Any reproduction or representation of this publication,
in whole or in part and by any process whatsoever, made without the written
authorization of the ITCEQ is considered unlawful and constitutes an infringement.
The results, interpretations, and conclusions expressed in this publication are those of
the author(s) and should not be attributed to the ITCEQ, its management, or its
supervisory authorities.
This document was produced as part of the ITCEQ's program of activities within the
Central Directorate of Economic Studies, under the supervision of Ms. Mounira BOU ALI.
Contents
Abstract
Introduction
1. Overview of public finances in Tunisia (2000-2017)
2. Definition of concepts and literature review
3. Methodology and empirical validation
3.1 Data and choice of estimation technique
3.2 Estimates and interpretation
3.3 Tax effort index
Conclusion
Appendices
List of figures and tables
Figure 1: Budget balances in Tunisia (% of GDP)
Figure 2: Public debt ratio and tax burden in Tunisia (% of GDP)
Table 1: Tax burden as % of GDP (2010-2016)
Table 2: ARDL(1, 0, 0, 0, 1, 0) model
Table 3: Long-run equation
Table 4: Error correction model of the ARDL(1, 0, 0, 0, 1, 0)
Table 5: Tax effort index in Tunisia

Abstract
When designing tax policy, public authorities must secure the tax revenue needed to sustain
budget balances while respecting the ability to pay of the economy and of taxpayers, and must
establish an equitable tax system. This paper addresses the notion of tax potential: it defines
the concept and studies its determinants for Tunisia using time-series econometric techniques.
More precisely, it relies on the ARDL approach, which allows tax potential to be estimated and
assessed as a function of several variables (the share of agricultural value added in GDP, the
M2 money supply as a ratio of GDP, the ratio of imports plus exports to GDP, real GDP per
capita, the urbanisation rate, etc.).
The estimation yields the long-run and short-run behaviour of the tax burden as a function of
its determinants. The estimated equation is then used to compute the theoretical tax burden
(the tax potential), from which the tax effort index is calculated as the ratio of the actual
tax burden to the estimated tax potential.
The results show that this index has always been very close to one, which implies that Tunisia
faces difficulties in raising more tax revenue from the existing taxpayer base. The following
proposals are therefore put forward:
• Broaden the taxpayer base to ensure greater tax equity.
• Adopt an awareness and incentive strategy aimed at improving tax compliance.
• Continue to fight tax evasion and fraud by strengthening the human and material resources
available to the tax administration and by consolidating its digitalisation efforts.
• Abolish the flat-rate (forfaitaire) regime.
• Introduce incentives and procedures that facilitate and encourage the move from the
informal to the formal sector.
Finally, regulate cash payments more tightly and ensure that the applicable rules of law are
enforced.
Introduction
Tunisia, like most developing countries, lacks the natural-resource endowment that would help
it meet its development challenges while striking a balance between the economic dimension and
the social dimension, which has gained in importance in recent years. It has always relied
largely on domestic resources, particularly tax revenue, to finance public spending: tax
revenue financed on average 60% of the State budget over 1986-2017.
Moreover, meeting citizens' growing demands for quality infrastructure and public services
makes improving the yield of tax revenue a complicated exercise. When designing tax policy,
public authorities must secure the tax revenue needed to sustain budget balances while
respecting the ability to pay of the economy and of taxpayers, and must establish an equitable
tax system. Such a challenge requires knowledge of the economy's tax potential in order to
determine the margin (the feasible range) within which tax authorities can rationalise the tax
system and avoid the extremes of under-taxation or over-taxation: "too much tax kills tax",
Ibn Khaldoun (El mokadima).
Benevolent governments always seek the combination that generates the least distortion while
guaranteeing sufficient revenue. Weak performance and insufficient tax revenue (that is,
excessive pressure accompanied by a budget imbalance financed through borrowing) imply that
the country may have limits in its tax collection mechanisms.
Given this, the question of tax performance matters all the more for countries with few
natural resources, such as Tunisia. It draws on fundamental notions: resource mobilisation,
and hence taxation, together with tax effort on the one hand and taxable capacity on the
other.
This paper falls within that context. Its objective is to seek answers to this question by
estimating the maximum tax collection capacity at the macro level, which requires knowledge of
the structural and non-structural determinants of tax effort. In other words, it addresses the
notion of tax potential: defining it and studying its determinants for Tunisia.
The paper is structured as follows:
• The first part presents an overview of public finances in Tunisia, in order to show the
importance of the fiscal dimension in the economy.
• The second part addresses the concept of tax potential through a review of the main studies
on the subject.
• The third part analyses the determinants of Tunisia's tax potential using time-series
econometric techniques over the period 1983-2016. It leads to an estimate and assessment of
Tunisia's tax potential and tax effort over the study period.
1. Overview of public finances in Tunisia (2000-2017)
The economic difficulties Tunisia has experienced in recent years have brought to the fore the
debate on the critical state of public finances, particularly the problems affecting budget
balances and public debt sustainability.
An examination of the budget deficit over 2000-2017 reveals two trends:
• A first trend covers 2000-2010, a period of fairly sound performance during which the
government kept the deficit below 3.4% of GDP, with an average of 2.61% over the whole period.
Results were particularly good in 2008 and 2010 (1% of GDP), despite growing pressure on
public finances from the fallout of the financial crisis and from the final phase of tariff
dismantling on industrial products under the association agreement with the EU.
• A second trend covers the post-revolution period (2011-2017), marked by slippages in public
finance balances. The budget deficit (excluding privatisation receipts and grants) was about
5.38% of GDP over these seven years and peaked at 6.9% in 2013. The improvements recorded in
the two following years did not last, and the deficit reached 6.1% of GDP in 2016 and 2017.
Social unrest, uncertainty and political instability at the national level were together the
main causes of the difficult state of public finances.
Furthermore, the primary budget balance over 2000-2017 alternated between periods of surplus
and periods of deficit. From 2011 in particular, Tunisia has run a primary deficit that has
widened from year to year (from 0.61% of GDP in 2011 to 2.44% in 2015 and 2.74% in 2016 and
2017), levels unprecedented over the past twenty years.
Figure 1: Budget balances in Tunisia (% of GDP)
Source: Ministry of Finance
This situation has continued to worsen given the economic downturn of the democratic
transition period since 2011. Political instability and uncertainty have weakened investment
and wealth creation and have led to a build-up of public debt, in both its domestic and
external components, to finance the budget deficit. As a result, the public debt ratio rose to
69.7% of GDP in 2017, compared with 44.6% in 2011 and an average of 51.34% over 2000-2017
(Figure 2). This points to difficulties for the sustainability of fiscal policy in Tunisia and
for the country's capacity to honour its commitments to its creditors.
Figure 2: Public debt ratio and tax burden in Tunisia (% of GDP)
[Two panels, 1986-2016: public debt ratio (axis 30% to 80%) and tax burden (axis 15% to 30%)]
Source: Ministry of Finance
Tunisia thus faces an increasingly complex dilemma. On the one hand, it must raise domestic
financial resources to fund public spending while avoiding excessive debt, which could
jeopardise the State's sovereignty and harm its solvency. On the other hand, it must curb the
upward trend in the tax burden, which peaked at 23.1% of GDP in 2014.
Benchmarking over 2010-2016 shows that Tunisia recorded a tax burden above the OECD and EU
averages, which stood at 15.4% and 20.2% respectively in 2016 (Table 1). Belgium and France
recorded a higher tax burden than Tunisia. Egypt, Lebanon and Romania tax their taxpayers less
overall than Tunisia. Portugal's tax burden was similar to Tunisia's, while Turkey ranked
below Tunisia in terms of average tax burden over 2010-2016.
Table 1: Tax burden as % of GDP (2010-2016)

Country         2010   2011   2012   2013   2014   2015   2016
Belgium         24.3   24.7   25.7   26.2   26.2   24.7   23.1
France          21.9   21.8   22.6   23.3   23.2   23.2   23.1
Lebanon         16.9   16.3   15.2   14.3   14.2   13.6   13.9
Portugal        19.7   21.2   20.9   22.8   22.7   22.9   22.5
Romania         16.4   17.9   17.9   17.6   17.9   18.9   16.8
Turkey          19.4   18.7   18.4   18.5   18.1   18.2   18.3
Egypt           14.1   14.0   12.4   13.5   12.2   12.5   -
Tunisia         20.1   21.1   21.1   21.7   23.1   21.6   20.7
EU average      19.4   19.6   19.8   20.2   20.2   20.2   20.2
OECD average    13.9   14.4   14.6   15.1   15.4   15.5   15.4
Source: Ministry of Finance, World Bank
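The period averages behind this benchmarking can be recomputed directly from Table 1. The sketch below is illustrative only: the dictionary and function names are chosen here, and the missing 2016 value for Egypt is simply skipped, which is an assumption about how the average would be taken.

```python
# Tax burden (% of GDP), 2010-2016, copied from Table 1.
# None marks the missing 2016 observation for Egypt.
TAX_BURDEN = {
    "Belgium":  [24.3, 24.7, 25.7, 26.2, 26.2, 24.7, 23.1],
    "France":   [21.9, 21.8, 22.6, 23.3, 23.2, 23.2, 23.1],
    "Lebanon":  [16.9, 16.3, 15.2, 14.3, 14.2, 13.6, 13.9],
    "Portugal": [19.7, 21.2, 20.9, 22.8, 22.7, 22.9, 22.5],
    "Romania":  [16.4, 17.9, 17.9, 17.6, 17.9, 18.9, 16.8],
    "Turkey":   [19.4, 18.7, 18.4, 18.5, 18.1, 18.2, 18.3],
    "Egypt":    [14.1, 14.0, 12.4, 13.5, 12.2, 12.5, None],
    "Tunisia":  [20.1, 21.1, 21.1, 21.7, 23.1, 21.6, 20.7],
}


def period_average(values):
    """Mean of the observed values, ignoring missing years (assumption)."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed)


def rank_by_average(table):
    """Countries sorted from highest to lowest average tax burden."""
    averages = {country: period_average(v) for country, v in table.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for country, average in rank_by_average(TAX_BURDEN):
        print(f"{country:<10} {average:5.2f}")
```

On these figures Tunisia's average sits just below Portugal's and above Turkey's, in line with the comparison made in the text.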
2. Definitions of concepts and literature review
Tax effort is defined as the degree to which a country exploits its tax potential.¹ Pessino
and Fenochietto (2010) present tax potential, or taxable capacity, as the maximum tax revenue
a given country can collect given structural factors of an economic, social, institutional and
demographic nature.
A country's tax effort is thus a point-in-time index of its performance in mobilising tax
revenue relative to its potential. It is calculated as the ratio of actual tax revenue to
estimated tax revenue. A tax effort above 1 signals that the country will find it difficult to
mobilise additional tax revenue, since its potential is already fully exploited. A tax effort
below 1 indicates that the country is under-exploiting its tax potential, in which case the
authorities can consider strengthening tax revenue mobilisation (Brun et al. 2006).
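The definition above reduces to a simple ratio. The following sketch illustrates it; the function names and the numerical values in the usage example are hypothetical and are not taken from the study.

```python
def tax_effort_index(actual_tax_burden, potential_tax_burden):
    """Ratio of the actual tax burden to the estimated (potential) one.

    Both arguments are expressed in the same unit, e.g. % of GDP.
    """
    if potential_tax_burden <= 0:
        raise ValueError("potential tax burden must be positive")
    return actual_tax_burden / potential_tax_burden


def interpret(index):
    """Reading of the index following the definition in the text."""
    if index > 1:
        return "potential fully exploited: little room for additional revenue"
    if index < 1:
        return "potential under-exploited: room to strengthen mobilisation"
    return "actual revenue equals estimated potential"


if __name__ == "__main__":
    # Hypothetical figures: actual burden 21.0% of GDP, potential 20.0%.
    index = tax_effort_index(21.0, 20.0)
    print(round(index, 3), "->", interpret(index))
```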
Tax effort and its determinants have attracted the interest of many economists and public
finance specialists. Hinrichs (1965) was the pioneer. He sought to explain the determinants of
government revenue (tax and non-tax) for a sample of 60 countries over 1957-1960. Using per
capita income and export effort (exports as a ratio of GNP) as explanatory variables, he
concluded that the latter is a better predictor of government revenue than per capita income
for the least developed countries (those with per capita income below $500).
Lotz and Morss (1967) examined the determinants of a country's level of taxation by modelling
the tax burden of 72 developed and developing countries with two independent variables: gross
national income per capita and trade openness (imports and exports as a ratio of GNP). They
concluded that both variables have a positive and significant effect on the tax burden.
UNCTAD's 1970 work extended the Lotz and Morss (1967) model by adding two variables (the share
of agriculture and the inflation rate) to estimate the tax burden of 36 developing countries
over 1955-1966. Two factors were retained as determinants of the tax burden: the share of
agriculture and trade openness.
1 Ministère de l'Economie et des Finances du Sénégal, Bulletin du CEPOD, fourth quarter 2012.
Tanzi (1992) studied the determinants of the tax burden in 83 developing countries over
1978-1988. He showed that the share of imports in GDP, GDP per capita, the share of
agriculture in GDP and the share of external debt in GDP influence the ratio of tax revenue to
GDP.
Stotsky and Wolde Mariam (1997) built on this earlier work by identifying the determinants of
the tax burden for 43 sub-Saharan African countries over 1990-1995 using panel data, and by
constructing a tax effort index. They concluded that the shares of agriculture and of mining
in GDP have a negative and significant effect on the tax burden, whereas the share of exports
and GDP per capita have positive and significant effects.
Eltony (2002) analysed the determinants of tax effort in sixteen Arab countries over
1994-2000. For the six Arab countries of the Gulf Cooperation Council (GCC), he concluded that
the share of mining has a significant negative effect on the tax burden, whereas the effect of
per capita income is positive. For the other, non-oil-producing countries, the results were
statistically significant, with a negative effect of the share of agriculture and positive
effects of the share of mining, the shares of imports and exports, and per capita income.
In addition, Gupta (2007), using a panel regression covering 105 developing countries over 25
years, concluded that structural factors (per capita income, the share of agriculture in GDP,
openness measured by the share of imports in GDP, and foreign aid) significantly determine the
revenue performance (excluding grants) of these countries.
Institutional factors, measured by the degree of corruption and political stability, also
significantly influence tax revenue in developing countries: corruption has a negative effect
in low- and middle-income countries, while political stability has a positive effect on
revenue collection in low-income countries.
Building on Aigner et al. (1977) and Afirman (2003), Pessino and Fenochietto (2010) used a
stochastic frontier model to determine the tax effort of 96 countries over 1991-2006. They
found that per capita income, trade openness and public spending on education as a percentage
of GDP have a positive and significant effect on the tax burden. By contrast, inflation
measured by the consumer price index, income concentration measured by the Gini index, the
share of agriculture in GDP and corruption tend to reduce the tax burden significantly. They
also concluded that countries that have not yet reached their taxable capacity and that seek
to raise the tax burden risk creating an environment conducive to corruption, a phenomenon
that considerably reduces the efficiency of tax revenue and therefore calls for greater
efforts to combat it.
Karagöz (2013) used time series from 1970 to 2010 to show how the sectoral structure of the
Turkish economy affects the tax burden. He concluded that the Turkish tax burden is positively
and significantly affected by the share of industry in GDP, total external debt as a ratio of
GDP, the degree of monetisation of the economy (M2 as a ratio of GDP) and the urbanisation
rate. The effect of the share of agriculture in GDP is significant but negative, and the
effect of openness is not significant. The author recommends that the Turkish authorities
raise the tax burden gradually over the medium term, alongside public expenditure reforms, in
order to create durable fiscal space for emerging spending priorities.
Amin et al. (2014) extended the set of determinants of the tax burden to factors affecting the
collection of taxes (direct, indirect and total), applying the Pesaran and Shin cointegration
method to time series from 1980 to 2010 for Pakistan. They found that the total tax burden is
inversely and significantly related to corruption, the political instability index and real
per capita income. The relationship is positive and significant with trade openness, and not
significant with inflation measured by the consumer price index.
In summary, the empirical and theoretical literature reviewed so far yields the following list
of determinants of the tax burden:

Category                                  Variables
Variables related to the structure        Share of agriculture in GDP
of the economy                            Share of industry and mining in GDP
                                          Share of energy
Institutional variables                   Urbanisation rate
                                          Corruption index
Openness variables                        Trade openness: ratio of trade volume
                                          (exports + imports) to GDP
                                          Import penetration rate
                                          Export effort
                                          Weight of the trade balance in GDP
Financial and monetary variable           Liquidity ratio or monetary depth, measured
                                          by the ratio of M2 to GDP
3. Methodology and empirical validation
Following the empirical literature on this subject, the method adopted to determine a
country's tax potential consists of regressing the tax burden on its theoretical determinants
and then comparing the estimated tax burden with the observed one.
The next step is to estimate the behaviour of the tax burden with respect to its determinants.
The model takes the following form:
$$PF_t = f(X_{it})$$
PF_t: actual tax burden
X_it: vector of explanatory variables i at time t, where i is one of:
agri: share of agricultural value added in GDP
M2: M2 money supply as a ratio of GDP
OUV: openness of the Tunisian economy, measured by the ratio of imports plus exports to GDP
PIBrh: real GDP per capita
urban: share of the urban population, or urbanisation rate
With this approach, determining the tax potential of the Tunisian economy amounts to
estimating its maximum level given these variables. The final model is written as follows:
$$l(pf)_t = \beta_0 + \beta_1 l(agri)_t + \beta_2 l(M2)_t + \beta_3 l(OUV)_t + \beta_4 l(PIBrh)_t + \beta_5 l(urban)_t + \varepsilon_t \quad (1)$$
\beta_0: constant
\beta_i: coefficient on explanatory variable i
The variables enter the model in logarithms (denoted l) in order to reduce their variance.
3.1 Data and choice of estimation technique
The time series for the variables are taken from the World Bank's World Development Indicators
(WDI) database, which provides longer series than other sources. The sample contains 34 annual
observations over 1983-2016.
Examining the stationarity of the variables is crucial for obtaining a reliable estimate of
tax potential and avoiding the risk of spurious regressions. A series is stationary if its
mean and variance are constant and finite and if its covariance does not depend on time.²
Plots of the series suggest that they are not stationary in levels. A more thorough analysis
using unit root tests is needed to establish this stochastic property. We first use the ADF
(Augmented Dickey-Fuller) test and the Phillips-Perron test. We then use the Zivot and Andrews
unit root test to allow for structural breaks in the series.
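To make the logic of a unit root test concrete, the sketch below computes the t-statistic of the basic Dickey-Fuller regression with a constant, dy_t = a + rho * y_{t-1} + e_t, in plain Python. It is a deliberately simplified illustration and not the procedure used in the study: it omits the lagged differences of the ADF test, the Phillips-Perron correction and the Zivot-Andrews break search, and the function name and simulated series are assumptions made here. The statistic must be compared with Dickey-Fuller critical values (around -2.9 at the 5% level for a regression with a constant), not with the usual Student thresholds.

```python
import math
import random


def dickey_fuller_tstat(series):
    """t-statistic on rho in  dy_t = a + rho * y_{t-1} + e_t.

    A strongly negative value is evidence against a unit root.
    """
    y_lag = series[:-1]
    dy = [curr - prev for prev, curr in zip(series[:-1], series[1:])]
    n = len(dy)
    mean_x = sum(y_lag) / n
    mean_y = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in y_lag)
    sxy = sum((x - mean_x) * (d - mean_y) for x, d in zip(y_lag, dy))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    residuals = [d - intercept - rho * x for x, d in zip(y_lag, dy)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)  # residual variance
    return rho / math.sqrt(sigma2 / sxx)


if __name__ == "__main__":
    random.seed(0)
    # Simulated data for illustration only: white noise (stationary)
    # and its cumulative sum (a random walk, i.e. a unit root process).
    noise = [random.gauss(0.0, 1.0) for _ in range(200)]
    walk = []
    level = 0.0
    for shock in noise:
        level += shock
        walk.append(level)
    print("white noise t-stat:", round(dickey_fuller_tstat(noise), 2))
    print("random walk t-stat:", round(dickey_fuller_tstat(walk), 2))
```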
Variable    Conclusion
L(PF)       I(1), integrated of order 1
L(ouv)      I(0), stationary in levels
L(urban)    I(0), stationary
L(M2)       I(1), integrated of order 1
L(PIBrh)    I(1), integrated of order 1
L(agri)     I(1), integrated of order 1
The unit root test results (Tables 1 and 2 in the appendix) show that the variables have mixed
orders of integration, I(0) and I(1), and suggest that a long-run equilibrium relationship may
exist among the series in the model, that is, that there may be at least one cointegrating
relationship among the variables.
2 Régis Bourbonnais, Econométrie, 9th edition, Dunod, 2015.
The presence of variables with mixed orders of integration, I(0) and I(1), points to the
cointegration technique based on the unrestricted error-correction model of Pesaran (2001), or
ARDL (Autoregressive Distributed Lag) model. The orders of integration of the variables
satisfy the condition for applying this approach, namely that none of the variables is
integrated of order 2 (I(2)) and that the dependent variable l(pf) is integrated of order 1
(I(1)).
The technique rests on estimating an optimal unrestricted error-correction model in its ARDL
representation, whose form is determined using lag selection criteria (AIC, SIC or HQ). The
model corresponding to equation (1) is then written in unrestricted error-correction form,
where Δ is the first-difference operator, p is the optimal lag chosen and DUM2011 is a dummy
variable capturing the shock of the revolution, equal to 0 before 2011 and 1 from 2011
onwards.
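For reference, a standard way of writing an unrestricted error-correction (ARDL) model built on equation (1) is sketched below. This is a textbook form under stated assumptions, not a transcription of the authors' equation: the symbols \alpha_0, \gamma_i, \delta_{ji}, \lambda_j and \theta are chosen here for illustration, x_1, ..., x_5 stand for agri, M2, OUV, PIBrh and urban, and a common maximum lag p is used for simplicity, although the selected lag orders may differ across variables.

```latex
\[
\begin{aligned}
\Delta l(pf)_t = {} & \alpha_0
  + \sum_{i=1}^{p} \gamma_i \, \Delta l(pf)_{t-i}
  + \sum_{j=1}^{5} \sum_{i=0}^{p} \delta_{ji} \, \Delta l(x_j)_{t-i} \\
& + \lambda_0 \, l(pf)_{t-1}
  + \sum_{j=1}^{5} \lambda_j \, l(x_j)_{t-1}
  + \theta \, DUM2011_t + \varepsilon_t
\end{aligned}
\]
```

In this notation the terms in first differences capture short-run dynamics, while the lagged levels carry the long-run information; the absence of a long-run relationship corresponds to \lambda_0 = \lambda_1 = \dots = \lambda_5 = 0.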
Next, to test for the existence of a long-run relationship, Pesaran et al. (1999) recommend
the ARDL bounds test, or bounds cointegration test.
Under the ARDL approach, testing for a cointegrating relationship among the variables amounts
to computing the F-statistic of a Wald test whose null hypothesis is the absence of
cointegration, that is, that the coefficients on the lagged explanatory variables of the
selected ARDL model are all zero. The F-statistic is then compared with the critical values
tabulated by Narayan (2005).³ There are two critical values: a lower bound, which assumes that
all variables are purely I(0), and an upper bound, which assumes that all variables are purely
I(1). The decision rule is as follows: if the F-statistic exceeds the upper bound, the null
hypothesis is rejected in favour of cointegration; if it falls below the lower bound, the null
hypothesis cannot be rejected; if it lies between the two bounds, the test is inconclusive.
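The decision rule just described can be written as a small function. This is only an illustration of the rule: the function name is chosen here, and the F-statistic and critical bounds in the usage example are hypothetical values, not those of the study or of Narayan's table.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Apply the bounds-test decision rule to a computed F-statistic.

    lower_bound: critical value assuming all variables are I(0)
    upper_bound: critical value assuming all variables are I(1)
    """
    if lower_bound > upper_bound:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper_bound:
        return "cointegration"       # null of no cointegration rejected
    if f_stat < lower_bound:
        return "no cointegration"    # null cannot be rejected
    return "inconclusive"            # F-statistic between the two bounds


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(bounds_test_decision(f_stat=5.2, lower_bound=2.9, upper_bound=4.2))
```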
This test establishes whether a long-run relationship exists, in which case the model is
written in its long-run form, whose coefficients are the long-run coefficients and where p is
the number of lags retained for the bounds test.
3.2 Estimates and interpretation
The results (Figure 1 in the appendix) show that the unrestricted error-correction model
selected by the SIC criterion is an ARDL (1, 0, 0, 0, 1, 0)⁴ (see Table 3 in the appendix for
the bounds test results).
The model therefore has L(PF) as its dependent variable and, as explanatory variables, lagged
l(PF), l(agri), l(M2), l(ouv), l(PIBrh), lagged l(PIBrh), l(urban) and the dummy DUM2011.
3 Narayan (2005) provides a table of critical values for sample sizes between 30 and 80 observations.
4 For annual series, Pesaran and Shin (1999) recommend choosing a maximum of two lags.
Table 2: ARDL (1, 0, 0, 0, 1, 0) model
Dependent variable: L(PF)
Variable        Coefficient   Std. error   t-Stat   Probability
L(PF(-1))          0.392        0.152       2.570    0.017 (**)
L(AGRI)            0.038        0.059       0.651    0.521
L(M2)             -0.241        0.099      -2.429    0.023 (**)
L(OUV)             0.233        0.065       3.560    0.002 (***)
L(PIBrh)          -0.619        0.286      -2.160    0.041 (**)
L(PIBrh(-1))       0.792        0.316       2.507    0.019 (**)
L(URBAN)          -0.611        0.220      -2.772    0.011 (**)
DUM2011            0.065        0.027       2.389    0.025 (**)
C                 -2.750        0.818      -3.362    0.003 (***)
R2 = 0.86
Adjusted R2 = 0.81
SSR = 0.017
(***), (**): significant at the 1% and 5% levels respectively
The bounds test results show that the computed F-statistic exceeds the upper-bound critical
values at the 5% and 10% significance levels. There is therefore a cointegrating relationship
among the variables of the model, which leads us to estimate a long-run relationship among
them.
Long-run relationship
Table 3: Long-run equation

Variable        Coefficient   Std. error   t-Stat   Probability
L(AGRI)            0.063        0.091       0.690    0.497
L(M2)             -0.396        0.189      -2.098    0.047 (**)
L(OUV)             0.382        0.152       2.522    0.019 (**)
L(PIBrh)           0.285        0.125       2.282    0.032 (**)
L(URBAN)          -1.003        0.329      -3.052    0.006 (***)
C                 -4.520        1.088      -4.154    0.000 (***)
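The long-run coefficients of Table 3 are linked to the ARDL estimates of Table 2 by a standard identity: each long-run coefficient is the sum of the coefficients on a regressor (current and lagged) divided by one minus the coefficient on the lagged dependent variable. The sketch below checks this arithmetic using the rounded values printed in Table 2; the variable and function names are chosen here, and small discrepancies with Table 3 are expected because the inputs are rounded to three decimals.

```python
# Coefficient on the lagged dependent variable L(PF(-1)), from Table 2.
PHI = 0.392

# Coefficients on each regressor (current and lagged terms), from Table 2.
ARDL_COEFFICIENTS = {
    "L(AGRI)":  [0.038],
    "L(M2)":    [-0.241],
    "L(OUV)":   [0.233],
    "L(PIBRH)": [-0.619, 0.792],   # current and one-period lag
    "L(URBAN)": [-0.611],
    "C":        [-2.750],
}

# Long-run coefficients as reported in Table 3.
TABLE_3 = {
    "L(AGRI)": 0.063, "L(M2)": -0.396, "L(OUV)": 0.382,
    "L(PIBRH)": 0.285, "L(URBAN)": -1.003, "C": -4.520,
}


def long_run_coefficients(phi, coefficients):
    """Long-run coefficient = sum of ARDL coefficients / (1 - phi)."""
    return {name: sum(values) / (1.0 - phi)
            for name, values in coefficients.items()}


if __name__ == "__main__":
    implied = long_run_coefficients(PHI, ARDL_COEFFICIENTS)
    for name, value in implied.items():
        print(f"{name:<9} implied {value:7.3f}   reported {TABLE_3[name]:7.3f}")
```

Note that 1 - 0.392 = 0.608, which is also the absolute value of the error-correction coefficient reported for the short-run model.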
Table 3 shows that, in the long run, the regressors L(M2), L(OUV), L(PIBrh) and L(URBAN)
explain the tax burden and that their coefficients are statistically significant.
According to the estimates, the effect of agriculture on the tax burden is positive but not
significant. This is unexpected in light of several earlier studies that found a negative and
significant effect (Lee et al. (2008) and Botlhole (2010)). It may be explained by the small
share of agriculture in GDP, which has not exceeded 12% since the late 1990s and averaged
12.63% over 1983-2016. The result is nonetheless consistent with Agbeyegbe et al. (2004),
Mahdavi (2008) and Chaudhry and Munir (2010).
The effect of the M2-to-GDP ratio on the tax burden is negative and significant. This does not
agree with earlier studies (Lutfunnahar (2007) and Karagöz (2013)). Ngakosso (2015) argues
that the more monetised an economy is, the more economic transactions develop and the more
taxable income is created. In our case, the degree of monetisation of the Tunisian economy is
inversely related to the State's capacity to mobilise more resources.
This intriguing result for Tunisia deserves closer analysis, given that the average degree of
monetisation of the Tunisian economy is about 53.4% over 1983-2016 and the banking penetration
rate is about 66.1% in 2015 (almost 2 bank accounts for every 3 inhabitants).⁵ The negative
relationship may be explained by a growing preference for liquidity over 2000-2015, as shown
by the increase in the ratio of banknotes and coins in circulation (BMC) to GDP. This
indicator peaked at 10.4% of GDP in 2012, confirming economic agents' appetite for cash
transactions.
This preference for cash over bank money makes the taxes attached to these transactions harder
for the tax authorities to trace. It therefore encourages corruption, tax fraud and evasion,
and the expansion of the informal sector.
5 Rapport sur la Supervision Bancaire 2015, Banque Centrale de la Tunisie, December 2016, p. 41.
Openness of the economy, measured by the ratio of imports plus exports to GDP, has a positive
effect that is significant at the 5% level. Since the model is estimated in logarithms, the
coefficient is an elasticity: a 1% increase in the ratio of imports plus exports to GDP (all
else being equal) raises the tax burden by about 0.382%. This result is expected and confirms
that of AFD (2007), which concludes that income from international trade is a more easily
taxable base than domestic income or consumption.
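Because equation (1) is specified in logarithms, its long-run coefficients can be read as elasticities: percentage responses of the tax burden to percentage changes in a regressor. The sketch below illustrates this reading with the long-run coefficients of Table 3; the function name is chosen here, and the 10% scenario is a hypothetical example used only to show that the linear approximation is accurate for small changes and less so for large ones.

```python
def pct_change_in_tax_burden(elasticity, pct_change_in_regressor):
    """Exact percentage response implied by a log-log coefficient.

    In a log-log model, y is proportional to x ** elasticity, so a change
    of g% in x multiplies y by (1 + g/100) ** elasticity.
    """
    factor = (1.0 + pct_change_in_regressor / 100.0) ** elasticity
    return (factor - 1.0) * 100.0


if __name__ == "__main__":
    openness_elasticity = 0.382      # L(OUV), Table 3
    urban_elasticity = -1.003        # L(URBAN), Table 3
    # A 1% rise in openness: close to the coefficient itself.
    print(round(pct_change_in_tax_burden(openness_elasticity, 1.0), 3))
    # A hypothetical 10% rise in openness: less than 10 x 0.382.
    print(round(pct_change_in_tax_burden(openness_elasticity, 10.0), 3))
    # A 1% rise in the urbanisation rate.
    print(round(pct_change_in_tax_burden(urban_elasticity, 1.0), 3))
```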
Real GDP per capita, used as a proxy for the level of economic development, is found to be a
determinant of the tax burden in Tunisia. Its coefficient is positive and significant at the
5% level. This result is expected and confirms those of Lotz and Morss (1967) and Pessino and
Fenochietto (2010): the development of economic activity creates wealth and increases the
capacity to raise and pay taxes.
The urbanisation rate, used in the model as a proxy for the demand for public services, has a
negative effect on the tax burden in Tunisia that is significant at the 1% level. Since the
model is estimated in logarithms, a 1% increase in the urbanisation rate lowers the tax burden
by about 1.003%. This result is unexpected, because in theory urbanisation raises the demand
for public goods and creates an easily taxable base owing to the concentration of formal
activities in urban areas (Bird 2007). However, the positive effect of urbanisation depends on
the capacity of public authorities to provide urban planning conducive to the development of
formal activities and to deliver quality public services. In other words, this positive effect
has prerequisites: tax compliance must first be achieved, so that citizens are motivated to
fulfil their tax obligations voluntarily and honestly. Conversely, poor governance undermines
trust between taxpayers and government and tends to lead to various forms of tax resistance,
on the pretext that quality public services are not provided in return. Tunisia is in this
second situation: the rise in the urbanisation rate in recent years has been accompanied by
slow progress in spatial planning and a deterioration in the quality of public services.
Short-run relationship
The results (Table 4) present the short-run model and show that the error-correction
coefficient ECT(-1), which measures the speed of convergence towards long-run equilibrium, is
negative (-0.608) and significant at the 1% level. This means that 60.8% of a short-run
deviation from long-run equilibrium is corrected within one year. In the short run, the tax
burden is negatively affected by economic growth and positively affected by the cyclical
effects of the revolution.
Table 4: Error-correction form of the ARDL(1, 0, 0, 0, 1, 0) model
Dependent variable: D(LPF)

Variable     Coefficient   Std. error   t-statistic   Prob.
DL(PIBrh)    -0.619        0.149        -4.152        0.000 (***)
DUM2011       0.065        0.013         5.006        0.000 (***)
ECT(-1)      -0.608        0.086        -7.047        0.000 (***)

R2 = 0.62
Adjusted R2 = 0.60
SSR = 0.017
(***) significant at the 1% level
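The speed-of-adjustment reading of ECT(-1) can be illustrated with a small calculation. The sketch below assumes that a deviation from the long-run equilibrium shrinks geometrically by the factor (1 + ECT) each year and ignores the other short-run terms of the model; the function names are hypothetical and are not taken from the study.

```python
import math


def remaining_gap(ect: float, years: int) -> float:
    """Share of an initial deviation still present after `years` periods,
    assuming pure geometric adjustment at rate (1 + ect)."""
    return (1.0 + ect) ** years


def half_life(ect: float) -> float:
    """Number of periods needed to close half of a deviation (for -1 < ect < 0)."""
    return math.log(0.5) / math.log(1.0 + ect)


ECT = -0.608  # error-correction coefficient reported in Table 4

for n in (1, 2, 3):
    print(f"after {n} year(s): {remaining_gap(ECT, n):.3f} of the gap remains")
print(f"half-life: {half_life(ECT):.2f} years")
```

Under this simplifying assumption, roughly 39% of a shock remains after one year, and half of it is absorbed in under a year.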
The existence of an equilibrium relationship between the tax burden and the explanatory variables of the model implies a long-run relationship between them in at least one direction. A causality analysis is used to refine the results and determine the direction of causality between the tax burden and the regressors. The Toda-Yamamoto causality test is chosen because the model contains a mix of I(0) and I(1) variables.
The main results are as follows (see Table 4 in the annex):
Three bidirectional causal relationships exist:
i) Economic development causes the urbanization rate: higher income improves living conditions, which is reflected in greater modernization and a higher urbanization rate. In the other direction, more urbanization improves the supply of factors, which can generate more wealth and therefore more development.
ii) The degree of monetization of the economy pushes toward openness. In the other direction, greater openness increases the volume of commercial transactions and therefore monetization.
iii) The two-way causality between the tax burden and economic development means that a higher tax burden allows the State to mobilize more resources to improve productive public investment and growth. Likewise, more development causes the tax burden, insofar as growth is still moderate relative to the increase in tax revenue.
All the other causal relationships are unidirectional, notably: the tax burden causes the openness of the economy and the share of agricultural value added in GDP.
Model validation tests
To validate the model, a series of econometric tests must be run on the residuals. Table 5 (in the annex) shows that the residuals are not serially correlated, as confirmed by the Breusch-Godfrey Lagrange multiplier test, and that they are not heteroskedastic (Breusch-Pagan-Godfrey test). The Jarque-Bera test does not reject normality of the residuals. In addition, the Ramsey RESET test supports the linear specification of the model.
Finally, to assess the structural stability of the model's coefficients, two tests were used: the cumulative sum of recursive residuals (CUSUM) test and the cumulative sum of squared recursive residuals (CUSUM of squares) test. The results, presented in Figures 2 and 3 (in the annex), show that the curves stay within the 5% significance bounds: the model is structurally stable.
3.3 Tax effort index

The tax effort index is the ratio of the actual tax burden to the tax potential of the Tunisian economy, the latter being determined from the model equation. It indicates whether Tunisia has difficulty raising tax revenue.
The results (Table 5) show that Tunisia is very close to, and at times beyond, its tax potential. The tax effort index is almost equal to one in normal periods and exceeds one in periods of economic downturn.
The values of the index over the study period were always very close to one, which indicates that Tunisia fully exploits its available tax potential and that the risk of overshooting it is very high. It should be noted that Tunisia is at the frontier of its tax potential while its taxpayer base is dominated by wage earners and a few oil companies, which brings the question of tax equity to the fore.
Table 5: Tax effort index in Tunisia over the period 1984-2016

Year   Actual tax burden   Tax potential     Tax effort
       (share of GDP)      (share of GDP)    index
1984   0.230               0.231             0.998
1985   0.220               0.223             0.985
1986   0.228               0.219             1.042
1987   0.207               0.212             0.978
1988   0.205               0.206             0.993
1989   0.203               0.204             0.993
1990   0.201               0.203             0.988
1991   0.205               0.204             1.006
1992   0.205               0.200             1.025
1993   0.210               0.206             1.017
1994   0.208               0.208             0.999
1995   0.205               0.209             0.981
1996   0.198               0.196             1.009
1997   0.184               0.193             0.953
1998   0.192               0.190             1.011
1999   0.191               0.187             1.023
2000   0.193               0.189             1.022
2001   0.196               0.191             1.024
2002   0.195               0.195             0.999
2003   0.187               0.189             0.987
2004   0.187               0.188             0.996
2005   0.189               0.189             0.999
2006   0.185               0.193             0.957
2007   0.191               0.195             0.982
2008   0.205               0.200             1.026
2009   0.198               0.198             0.999
2010   0.201               0.198             1.014
2011   0.211               0.216             0.977
2012   0.211               0.214             0.985
2013   0.217               0.217             1.000
2014   0.231               0.216             1.070
2015   0.216               0.219             0.987
2016   0.207               0.210             0.985
Source: Ministry of Finance, ITCEQ compilation
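The index in Table 5 is a simple ratio, which the following minimal sketch reproduces for three years of the table. The function name is hypothetical. Because the published inputs are rounded to three decimals, the recomputed ratios can differ from the published index in the third decimal.

```python
def tax_effort_index(actual: float, potential: float) -> float:
    """Ratio of the actual tax burden to the estimated tax potential."""
    if potential <= 0:
        raise ValueError("tax potential must be positive")
    return actual / potential


# (actual tax burden, tax potential) taken from Table 5 for three years
rows = {1986: (0.228, 0.219), 2013: (0.217, 0.217), 2014: (0.231, 0.216)}

for year, (actual, potential) in rows.items():
    index = tax_effort_index(actual, potential)
    position = "above" if index > 1 else "at or below"
    print(f"{year}: index = {index:.3f} ({position} potential)")
```

An index above one means that actual collection exceeds the potential implied by the model for that year.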
Conclusion
This paper is an attempt to analyse Tunisia's tax potential and its determinants. The ARDL approach was applied over the period 1983-2016 to the time series of the following variables: the share of agricultural value added in GDP, the M2 money supply relative to GDP, the ratio of imports plus exports to GDP, real GDP per capita and the urbanization rate. The estimation results show that an equilibrium relationship exists among the variables and confirm that openness and economic development have a positive effect on the tax burden in Tunisia, whereas the sign on the urbanization rate and on the degree of monetization of the economy is negative and does not match the results of previous studies. The share of agricultural value added in GDP, contrary to expectations, has no significant effect on the tax burden.
The tax effort index, which is the ratio of the actual tax burden to the estimated tax potential, indicates that Tunisia faces difficulties in mobilizing more tax revenue from the existing taxpayer base. It therefore needs to direct its reform efforts toward two major goals: broadening the taxpayer base to ensure greater tax equity, and adopting an awareness-raising and incentive strategy aimed at improving tax compliance.
From an economic policy perspective, a set of proposals can be put forward. First, Tunisia should continue to combat tax evasion and fraud by strengthening the human and material resources available to the tax administration and by consolidating its digitalization efforts. Second, ensuring that the rules of good governance are applied, so as to improve transparency in public action, should be one of the government's priorities. Third, to optimize the allocation of budgetary resources and ease the pressure on public finances, it would be advisable, even if belatedly given that the relevant legislation has been in force since the end of 2015, to launch public-private partnerships in order to prioritize public spending and improve the quality of public services.
Fourth, the taxpayer base should be broadened through reforms that eliminate the flat-rate (forfaitaire) regime and introduce incentives and procedures that facilitate and encourage the transition from the informal to the formal sector. Finally, it would be wise to regulate cash payments more tightly and to ensure that the applicable legal rules are enforced.
ANNEXES
Tables
Table 1: Unit root tests: Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP)
Series       ADF test and PP test statistics (specifications: constant; trend and constant)
L(pf) -2,099 -2,027 -2,341
L(agri) -3,145 -3,164 -1,179
L(M2) -1,174 -1,427 -0,235
L(OUV) -5,516 *** -2,345 -1,840
L(PIBrh) -2,337 -2,363 0,026
L(urban) -4,159 ** -2,011 -9,241
Δl(pf) -6,247 *** -6,253 -6,001
Δl(agri) -7,708 *** -8,378 -8,460
Δl(M2) -3,542 * -3,586 -3,649
Δl(OUV) -5,151 *** -6,944 -6,046
Δl(PIBrh) -3,729 ** -3,833 -3,776
Δl(urban) -2,448 -2,763 -0,818

(***), (**), (*): significant at the 1%, 5% and 10% levels, respectively
MacKinnon (1996) one-sided p-values.
Δ is the first-difference operator
Table 2: Zivot-Andrews unit root tests with a structural break

             Break in constant         Break in trend            Break in trend and constant
Series       Break date    Statistic   Break date    Statistic   Break date    Statistic
L(pf)        (0) 2008      -3.622      (0) 2004      -3.584      (0) 2003      -3.549
L(agri)      (0) 1997      -4.385      (0)* 2011     -4.386      (0) 2008      -4.265
L(M2)        (1) 1990      -3.913      (1)** 1998    -4.672      (1) 1997      -4.365
L(OUV)       (1) 1996      -3.363      (0) 1990      -2.714      (0) 1996      -3.039
L(PIBrh)     (2) 2011      -4.340      (2) 2009      -5.536      (2)** 2011    -5.129
L(urban)     (1) 2004      -4.459      (1)* 1992     -4.387      (1) 1991      -4.716
Δl(pf)       (4) 2008      -4.273      (3) 1998      -3.021      (4)* 2011     -4.833
Δl(agri)     (0)*** 1993   -8.726      (4)*** 1998   -8.118      (0) 1993      -8.575
Δl(M2)       (4) 1999      -4.511      (3)* 1992     -4.206      (4) 1999      -3.729
Δl(OUV)      (0)*** 2009   -5.499      (2)** 2009    -4.424      (0)*** 1990   -6.262
Δl(PIBrh)    (1) 2009      -3.555      (4)** 2000    -4.424      (1) 1999      -4.281
Δl(urban)    (0)* 1995     -6.767      (4)* 2000     -4.391      (0)*** 1995   -5.977
(***), (**), (*): significant at the 1%, 5% and 10% levels, respectively
Table 3: Bounds test for cointegration
Number of observations = 33
Number of variables = 5

Critical values    I(0)    I(1)
5%                 2.39    3.38
10%                2.08    3
F-statistic = 5.675
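The decision rule behind the bounds test can be sketched as follows: the F-statistic is compared with the lower, I(0), and upper, I(1), critical bounds. The function name and the returned labels are hypothetical; the numbers are those of Table 3.

```python
def bounds_test_decision(f_stat: float, lower: float, upper: float) -> str:
    """Classify a bounds-test F-statistic against the I(0) and I(1) critical bounds."""
    if f_stat > upper:
        return "cointegration"       # null of no long-run relation rejected
    if f_stat < lower:
        return "no cointegration"    # null not rejected
    return "inconclusive"            # statistic falls between the two bounds


F_STAT = 5.675
print("5% level: ", bounds_test_decision(F_STAT, lower=2.39, upper=3.38))
print("10% level:", bounds_test_decision(F_STAT, lower=2.08, upper=3.0))
```

With the values in Table 3 the statistic exceeds the upper bound at both levels, which is consistent with the long-run relationship reported in the main text.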
Table 4: Toda-Yamamoto causality test (χ²(2) statistics)

Dependent    Independent variables
variable     L(pf)          L(agri)   L(M2)          L(OUV)       L(PIBRH)       L(URBAN)
L(pf)        -              0.699     0.254          0.961        11.365 (***)   0.904
L(agri)      9.407 (***)    -         5.710 (*)      4.672 (*)    0.390          1.913
L(M2)        1.216          0.523     -              8.904 (**)   1.481          0.037
L(OUV)       26.775 (***)   0.598     42.001 (***)   -            19.985 (***)   39.928 (***)
L(PIBRH)     13.912 (***)   1.321     8.151 (**)     6.310 (**)   -              9.892 (***)
L(URBAN)     3.766          1.267     1.009          6.889 (**)   7.197 (**)     -
Table 5: Model validation tests
Breusch-Godfrey Serial Correlation LM Test:
F-statistic 1.009    Prob. F(3,21) 0.408
Heteroskedasticity Test: Breusch-Pagan-Godfrey
F-statistic 0.273    Prob. F(8,24) 0.969
Jarque-Bera normality test
JB = 2.804    Prob. 0.246
Ramsey RESET test (*)
F-statistic 0.201    Prob. 0.658

(*) Only the quadratic form was tested.
Figures
Figure 1: The 20 best models [plot ranking candidate ARDL specifications, including ARDL(1, 0, 0, 0, 1, 0)]
Figure 2: CUSUM test [CUSUM line with 5% significance bounds, 2012-2016]
Figure 3: CUSUM of squares test [CUSUM of squares line with 5% significance bounds, 2012-2016]
References
Agence Française de Développement. Aide et mobilisation fiscale dans les pays en développement. Octobre 2007/21.
Ahmad, H. K., Ahmed, S., Mushtaq, M., & Nadeem, M. (2016). Socio Economic Determinants of Tax Revenue in Pakistan: An Empirical Analysis. Journal of Applied Environmental and Biological Sciences, 6(2S), 32-42.
Ngakosso, A. (2015). Comment la fiscalité peut-elle contribuer à la monétarisation d’une économie ? Editions Publibook Université.
Chaudhry, I. S., & Munir, F. (2010). Determinants of Low Tax Revenue in Pakistan. Pakistan Journal of Social Sciences (PJSS), 30, 439-452.
Eltony, M. N. (2002). The Determinants of Tax Effort in Arab Countries. Arab Planning Institute.
Lotz, J. R., & Morss, E. R. (1967). Measuring ‘Tax Effort’ in Developing Countries. International Monetary Fund Staff Papers, 14, 479-497.
Lutfunnahar, B. (2007). A Panel Study on Tax Effort and Tax Buoyancy with Special Reference to Bangladesh. Working Paper 715, Policy Analysis Unit (PAU), Research Department, Bangladesh Bank.
Le, T. M., Moreno-Dodson, B., & Rojchaichaninthorn, J. (2008). Expanding taxable capacity and reaching revenue potential: cross-country analysis. Policy Research Working Paper no. WPS 4559. Washington, DC: World Bank.
Ministre de l'économie, des finances et du plan du Sénégal (2016). Evaluation du Potentiel fiscal du Sénégal. Document d’Etude N°34, Septembre 2016.
Stotsky, J. G., & WoldeMariam, A. (1997). Tax Effort in Sub-Saharan Africa. Working Paper 107. International Monetary Fund, Washington, DC.
Karagöz, K. (2013). Determinants of Tax Revenue: Does Sectorial Composition Matter? Journal of Finance, Accounting and Management, 4(2), 50-63.
An Autoregressive Distributed Lag Modelling Approach to Cointegration Analysis*
M. Hashem Pesaran
Trinity College, Cambridge, England
Yongcheol Shin
Department of Applied Economics, University of Cambridge, England
First Version: February, 1995, Revised: January, 1997
Abstract
This paper examines the use of autoregressive distributed lag (ARDL) models for the analysis of long-run relations when the underlying variables are I(1). It shows that after appropriate augmentation of the order of the ARDL model, the OLS estimators of the short-run parameters are $\sqrt{T}$-consistent with the asymptotically singular covariance matrix, and the ARDL-based estimators of the long-run coefficients are super-consistent, and valid inferences on the long-run parameters can be made using standard normal asymptotic theory. The paper also examines the relationship between the ARDL procedure and the fully modified OLS approach of Phillips and Hansen to estimation of cointegrating relations, and compares the small sample performance of these two approaches via Monte Carlo experiments. These results provide strong evidence in favour of a rehabilitation of the traditional ARDL approach to time series econometric modelling. The ARDL approach has the additional advantage of yielding consistent estimates of the long-run coefficients that are asymptotically normal irrespective of whether the underlying regressors are I(1) or I(0).
JEL Classifications: C12, C13, C15, C22.

Key Words: Autoregressive distributed lag model, Cointegration, I(1) and I(0) regressors, Model selection, Monte Carlo simulation.
* This is a revised version of a paper presented at the Symposium at the Centennial of Ragnar Frisch, The Norwegian Academy of Science and Letters, Oslo, March 3-5, 1995. We are grateful to Peter Boswijk, Clive Granger, Alberto Holly, Kyung So Im, Brendan McCabe, Steve Satchell, Richard Smith, Ron Smith and an anonymous referee for helpful comments. Partial financial support from the ESRC (Grant No. R000233608) and the Isaac Newton Trust of Trinity College, Cambridge is gratefully acknowledged.
1. INTRODUCTION
Econometric analysis of long-run relations has been the focus of much theoretical and empirical research in economics. In the case where the variables in the long-run relation of interest are trend stationary, the general practice has been to de-trend the series and to model the de-trended series as stationary distributed lag or autoregressive distributed lag (ARDL) models. Estimation and inference concerning the long-run properties of the model are then carried out using standard asymptotic normal theory. (For a comprehensive review of this literature see Hendry, Pagan and Sargan (1984) and Wickens and Breusch (1988)). The analysis becomes more complicated when the variables are difference-stationary, or integrated of order 1 (I(1) for short). The recent literature on cointegration is concerned with the analysis of the long run relations between I(1) variables, and its basic premise is, at least implicitly, that in the presence of I(1) variables the traditional ARDL approach is no longer applicable. Consequently, a large number of alternative estimation and hypothesis testing procedures have been specifically developed for the analysis of I(1) variables. (See the pioneering work of Engle and Granger (1987), Johansen (1991), Phillips (1991), Phillips and Hansen (1990) and Phillips and Loretan (1991).)
In this paper we re-examine the use of the traditional ARDL approach for the analysis of long run relations when the underlying variables are I(1). We consider the following general ARDL($p, q$) model:
\[ y_t = \alpha_0 + \alpha_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \beta' x_t + \sum_{i=0}^{q-1} \beta_i^{*\prime} \Delta x_{t-i} + u_t, \tag{1.1} \]
\[ \Delta x_t = P_1 \Delta x_{t-1} + P_2 \Delta x_{t-2} + \cdots + P_s \Delta x_{t-s} + \varepsilon_t, \tag{1.2} \]
where $x_t$ is a $k$-dimensional vector of I(1) variables that are not cointegrated among themselves, $u_t$ and $\varepsilon_t$ are serially uncorrelated disturbances with zero means and constant variance-covariances, and $P_i$ are $k \times k$ coefficient matrices such that the vector autoregressive process in $\Delta x_t$ is stable. We also assume that the roots of $1 - \sum_{i=1}^{p} \phi_i z^i = 0$ all fall outside the unit circle and there exists a stable unique long-run relationship between $y_t$ and $x_t$.
We consider the problem of consistent estimation of the parameters of the ARDL model both when $u_t$ and $\varepsilon_t$ are uncorrelated, and when they are correlated. In the former case we will show that the OLS estimators of the short-run parameters, $\alpha_0$, $\alpha_1$, $\beta$, $\beta_1^*, \ldots, \beta_{q-1}^*$ and $\phi = (\phi_1, \ldots, \phi_p)$ are $\sqrt{T}$-consistent, and the covariance matrix of these estimators has a well-defined limit which is asymptotically singular such that the estimators of $\alpha_1$ and $\beta$ are asymptotically perfectly collinear with the estimator of $\phi$. These results have the interesting implication that the OLS estimators of the long-run coefficients, defined by the ratios $\delta = \alpha_1/\phi(1)$ and $\theta = \beta/\phi(1)$, where $\phi(1) = 1 - \sum_{i=1}^{p} \phi_i$, converge to their true values faster than the estimators of the short run parameters $\alpha_1$ and $\beta$. The ARDL-based estimators of $\delta$ and $\theta$ are $T^{3/2}$-consistent and $T$-consistent, respectively. These results are not surprising and are familiar from the cointegration literature. But more importantly, we will show that despite the singularity of the covariance structure of the OLS estimators of the short-run parameters, valid inferences on $\delta$ and $\theta$, as well as on individual short run parameters, can be made using standard normal asymptotic theory. Therefore, the traditional ARDL approach justified in the case of trend-stationary regressors, is in fact equally valid even if the regressors are first-difference stationary.
In the case where $u_t$ and $\varepsilon_t$ are correlated the ARDL specification needs to be augmented with an adequate number of lagged changes in the regressors before estimation and inference are carried out. The degree of augmentation required depends on whether $q > s + 1$ or not. Denoting the contemporaneous correlation between $u_t$ and $\varepsilon_t$ by the $k \times 1$ vector $d$, the augmented version of (1.1) can be written as
\[ y_t = \alpha_0 + \alpha_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \beta' x_t + \sum_{i=0}^{m-1} \pi_i' \Delta x_{t-i} + \eta_t, \tag{1.3} \]
where $m = \max(q, s+1)$, $\pi_i = \beta_i^* - P_i' d$, $i = 0, 1, 2, \ldots, m-1$, $P_0 = I_k$, where $I_k$ is a $k \times k$ identity matrix, $\beta_i^* = 0$ for $i \geq q$, and $P_i = 0$ for $i \geq s$. In this augmented specification $\eta_t$ and $\varepsilon_t$ are uncorrelated and the results stated above will be directly applicable to the OLS estimators of the short-run and long-run parameters of (1.3). Once again traditional methods of estimation and inference, originally developed for trend-stationary variables, are applicable to first-difference stationary variables. The estimation of the short-run effects still requires an explicit modelling of the contemporaneous dependence between $u_t$ and $\varepsilon_t$. In practice, an appropriate choice of the order of the ARDL model is crucial for valid inference. But once this is done, estimation of the long-run parameters and computation of valid standard errors for the resultant estimators can be carried out either by the OLS method, using the so-called “delta” method ($\Delta$-method) to compute the standard errors, or by Bewley’s (1979) regression approach. These two procedures yield identical results and a choice between them is only a matter of computational convenience.
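The delta-method calculation mentioned above can be sketched for the scalar case $k = 1$ and a single lag. The function name and the numerical inputs are hypothetical illustrations; the covariance matrix is assumed to be that of the OLS estimators of $(\beta, \phi)$, in that order.

```python
import numpy as np


def long_run_with_delta_se(beta_hat, phi_hat, cov):
    """Long-run coefficient theta = beta / (1 - phi) and its delta-method
    standard error, given the 2x2 covariance matrix of (beta_hat, phi_hat)."""
    cov = np.asarray(cov, dtype=float)
    theta = beta_hat / (1.0 - phi_hat)
    # Gradient of theta with respect to (beta, phi)
    grad = np.array([1.0 / (1.0 - phi_hat), beta_hat / (1.0 - phi_hat) ** 2])
    variance = grad @ cov @ grad
    return theta, float(np.sqrt(variance))


# Hypothetical short-run estimates and covariance matrix, for illustration only
theta, se = long_run_with_delta_se(0.5, 0.5, [[0.01, 0.0], [0.0, 0.01]])
print(f"theta = {theta:.3f}, delta-method s.e. = {se:.4f}")
```

In applied work the off-diagonal element of the covariance matrix matters: the paper shows that the estimators of $\beta$ and $\phi$ are asymptotically perfectly collinear, so the diagonal matrix used here is only a convenient numerical example.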
The use of the ARDL estimation procedure is directly comparable to the semi-parametric, fully-modified OLS approach of Phillips and Hansen (1990) to estimation of cointegrating relations. In the static formulation of the cointegrating regression,
\[ y_t = \mu + \delta t + \theta' x_t + v_t, \tag{1.4} \]
where $\Delta x_t = e_t$, and $\xi_t = (v_t, e_t')'$ follows a general linear stationary process, the OLS estimators of $\delta$ and $\theta$ are $T^{3/2}$- and $T$-consistent, but in general the asymptotic distribution of the OLS estimator of $\theta$ involves the unit-root distribution as well as the second-order bias in the presence of the contemporaneous correlations that may exist between $v_t$ and $e_t$. Therefore, the finite sample performance of the OLS estimator is poor and in addition, due to the nuisance parameter dependencies, inference on $\theta$ using the usual t-tests in the OLS regression of (1.4) is invalid. To overcome these problems Phillips and Hansen (1990) have suggested the fully-modified OLS estimation procedure that asymptotically takes account of these correlations in a semi-parametric manner, in the sense that the fully-modified estimators have the Gaussian mixture normal distribution asymptotically, and inferences on the long run parameters using the t-test based on the limiting distribution of the fully-modified estimator are valid.
The ARDL-based approach to estimation and inference, and the fully-modified OLS procedure are both asymptotically valid when the regressors are I(1), and a choice between them has to be made on the basis of their small sample properties and computational convenience. To examine the small sample performance of the two estimators we have carried out a number of Monte Carlo experiments. Since in practice the “true” orders of the ARDL($p, m$) model are rarely known a priori, in the Monte Carlo experiments we also consider a two-step strategy whereby $p$ and $m$ are first selected (estimated) using either the Akaike Information Criterion (AIC), or the Schwarz Bayesian Criterion (SC), and then the long-run coefficients and their standard errors are estimated using the ARDL model selected in the first step. We refer to these estimators as ARDL-AIC and ARDL-SC. The main findings from these experiments are as follows:
(i) The ARDL-AIC and the ARDL-SC estimators have very similar small-sample performances, with the ARDL-SC performing slightly better in the majority of the experiments. This may reflect the fact that the Schwarz criterion is a consistent model selection criterion while Akaike is not.
(ii) The ARDL test statistics that are computed using the $\Delta$-method (or equivalently by means of the so-called Bewley’s regression), generally perform much better in small samples than the test statistics computed using the asymptotic formula that explicitly takes account of the fact that the regressors are I(1).
(iii) The ARDL-SC procedure when combined with the $\Delta$-method of computing the standard errors of the long-run parameters generally dominates the Phillips-Hansen estimator in small samples. This is in particular true of the size-power performance of the tests on the long-run parameter.
(iv) The Monte Carlo results point strongly in favor of the two-step estimation procedure, and this strategy seems to work even when the model under consideration has endogenous regressors, irrespective of whether the regressors are I(1) or I(0).¹
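The first step of the two-step strategy described above can be sketched as a grid search over $(p, m)$ using the Schwarz criterion. This is a minimal illustration under stated assumptions: a single regressor, no deterministic trend, the criterion written as $n \log(\mathrm{RSS}/n) + k \log n$, and a simulated data set; the function names, the grid limits and the data-generating process are hypothetical and do not reproduce the paper's Monte Carlo design.

```python
import numpy as np


def ardl_design(y, x, p, m, start):
    """Regressors for y_t on a constant, y_{t-1..t-p}, x_t and dx_t..dx_{t-m+1}."""
    dx = np.diff(x, prepend=x[0])  # dx[0] is a placeholder and is never used
    t = np.arange(start, len(y))
    cols = [np.ones(len(t))]
    cols += [y[t - i] for i in range(1, p + 1)]
    cols.append(x[t])
    cols += [dx[t - i] for i in range(m)]
    return np.column_stack(cols), y[t]


def schwarz(y, x, p, m, start):
    """Schwarz criterion (smaller is better) of the ARDL(p, m) regression."""
    Z, target = ardl_design(y, x, p, m, start)
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    n, k = Z.shape
    return n * np.log(resid @ resid / n) + k * np.log(n)


def select_order(y, x, p_max=3, m_max=3):
    """Return the (p, m) pair minimising the criterion on a common sample."""
    start = max(p_max, m_max)
    scores = {(p, m): schwarz(y, x, p, m, start)
              for p in range(1, p_max + 1) for m in range(0, m_max + 1)}
    return min(scores, key=scores.get), scores


# Simulated example: ARDL(1,0) with a random-walk regressor
rng = np.random.default_rng(0)
T = 400
x = np.cumsum(rng.standard_normal(T))
u = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.5 * x[t] + u[t]

best, scores = select_order(y, x)
print("selected (p, m):", best)
```

All candidate models are estimated on the same sample so that the criterion values are comparable; the second step would then estimate the long-run coefficients from the selected specification.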
The plan of the paper is as follows: Section 2 examines the asymptotic prop-
erties of the OLS estimators in the context of a simple autoregressive model with
a linear deterministic trend and the k-dimensional strictly exogenous I(1) regres-
sors. Section 3 considers a more general ARDL model, allowing for residual serial
correlations and possible endogeneity of the I(1) regressors, and develops the re-
sultant asymptotic theory. In Section 4 the ARDL-based approach is compared to
the cointegration-based approach of Phillips and Hansen (1990). Section 5 reports
and discusses the results of Monte Carlo experiments. Some concluding remarks
are presented in Section 6. Mathematical proofs are provided in an Appendix.
2. The Lagged Dependent Variable Model with the Deterministic Trend and Exogenous I(1) Regressors
Initially we consider the simple ARDL(1,0) model containing I(1) regressors and a linear deterministic trend,
\[ \phi(L) y_t = \alpha_0 + \alpha_1 t + \beta' x_t + u_t, \quad t = 1, \ldots, T, \tag{2.1} \]
where $y_t$ is a scalar, $\phi(L) = 1 - \phi L$, with $L$ being the one period lag operator, $x_t$ is a $k \times 1$ vector of regressors assumed to be integrated of order 1:²
\[ x_t = x_{t-1} + e_t, \tag{2.2} \]
and $\beta$ is a $k \times 1$ vector of unknown parameters. Suppose that the following assumptions hold:
(A1) The scalar disturbance term, $u_t$, in (2.1) is iid$(0, \sigma_u^2)$,
(A2) The $k$-dimensional vector, $e_t$, in (2.2) follows a general linear multivariate stationary process,
(A3) $u_t$ and $e_t$ are uncorrelated for all leads and lags such that $x_t$ is strictly exogenous with respect to $u_t$,
(A4) The I(1) regressors, $x_t$, are not cointegrated among themselves, and
(A5) $|\phi| < 1$, so that the model is dynamically stable, and a long-run relationship between $y_t$ and $x_t$ exists.³

¹ The case where the regressors are I(1) and cointegrated among themselves presents additional identification problems and is best analyzed in the context of a system of long-run structural equations. On this see Pesaran and Shin (1995).
² Specifications (2.1) and (2.2) can easily be adapted to allow for inclusion of a drift term in the $x_t$ process. Consider, for example, the process $\Delta x_t = \mu_x + e_t$, and note that it can also be written as $x_t = \mu_x t + \tilde{x}_t$, where $\Delta \tilde{x}_t = e_t$. Therefore, substituting $x_t$ in (2.1) we have
\[ \phi(L) y_t = \alpha_0 + (\alpha_1 + \beta' \mu_x) t + \beta' \tilde{x}_t + u_t, \]
where $\tilde{x}_t$ follows an I(1) process without a drift.
We shall distinguish between two types of parameters, the parameters capturing the short-run dynamics ($\alpha_0$, $\alpha_1$, $\beta$ and $\phi$), and the long run parameters on the trended regressors, $t$ and $x_t$, defined by
\[ \delta = \frac{\alpha_1}{1 - \phi}, \qquad \theta = \frac{\beta}{1 - \phi}. \tag{2.3} \]
Applying the decomposition $1 - \phi L = (1 - \phi) + \phi(1 - L)$ to (2.1), $y_t$ can be expressed as
\[ y_t = \mu + \delta t + \theta' x_t + v_t, \tag{2.4} \]
where
\[ \mu = \frac{\alpha_0}{1 - \phi} - \left( \frac{\phi}{1 - \phi} \right) \delta, \]
and
\[ v_t = \sum_{i=0}^{\infty} \phi^i u_{t-i} - \phi \sum_{i=0}^{\infty} \phi^i \theta' e_{t-i}. \]
From (2.1) and (2.4) it is clear that $y_t$ and $x_t$ are individually I(1), but must be cointegrated for (2.1) to be meaningful.⁴ Similarly, we obtain
\[ y_{t-1} = \mu_1 + \delta t + \theta' x_t + \kappa_t, \tag{2.5} \]
where $\mu_1 = \mu - \delta$, $\kappa_t = v_{t-1} - \theta' e_t$, and $\kappa_t$ is an I(0) process with variance $\sigma_\kappa^2$.
Our main aim is to derive the asymptotic properties of the OLS estimators of the short-run as well as the long-run parameters in the context of the ARDL(1,0) model, (2.1). For expositional convenience, we transform (2.1) to the partitioned regression model in the matrix form as,
\[ y_T = Z_T b + y_{T-1} \phi + u_T, \tag{2.6} \]
where $y_T = (y_1, \ldots, y_T)'$, $y_{T-1} = (y_0, \ldots, y_{T-1})'$, $\tau_T = (1, \ldots, 1)'$, $t_T = (1, \ldots, T)'$, $X_T = (x_1, \ldots, x_T)'$, $Z_T = (\tau_T, t_T, X_T)$, $u_T = (u_1, \ldots, u_T)'$, and $b = (\alpha_0, \alpha_1, \beta')'$. Since our main interest is in the long-run coefficients on trended regressors, $t$ and $x_t$, we also partition
\[ Z_T = (\tau_T, S_T); \quad S_T = (t_T, X_T); \quad b = \begin{pmatrix} \alpha_0 \\ c \end{pmatrix}; \quad c = \begin{pmatrix} \alpha_1 \\ \beta \end{pmatrix}, \]
where the dimensions of $Z_T$, $S_T$, $b$ and $c$ are $T \times (k+2)$, $T \times (k+1)$, $(k+2) \times 1$ and $(k+1) \times 1$, respectively.

³ Tests of the existence of long-run relationships between $y_t$ and $x_t$, when it is not known a priori whether $x_t$ are I(0) or I(1), are discussed in Pesaran, Shin and Smith (1996).
⁴ A relationship between I(1) variables is said to be “stochastically cointegrated” if it is trend stationary, while “deterministic cointegration” refers to the case where the cointegrating relation is level stationary. For a discussion of these two types of cointegrating relations see Park (1992).
Theorem 2.1. Under the assumptions (A1) - (A5), the OLS estimators of $\phi$ and $c = (\alpha_1, \beta')'$ in (2.6), denoted by $\hat{\phi}_T$ and $\hat{c}_T$, respectively, are $\sqrt{T}$-consistent, and have the following asymptotic distributions:
\[ \sqrt{T} (\hat{\phi}_T - \phi) \overset{a}{\sim} N\left\{ 0, \frac{\sigma_u^2}{\sigma_\kappa^2} \right\}, \tag{2.7} \]
\[ \sqrt{T} (\hat{c}_T - c) \overset{a}{\sim} N\left\{ 0, \frac{\sigma_u^2}{\sigma_\kappa^2} \lambda \lambda' \right\}, \tag{2.8} \]
where $\lambda = (\delta, \theta')'$ is a $(k+1) \times 1$ vector of the long run parameters on trended regressors, $t$ and $x_t$, and rank$(\lambda \lambda') = 1$. In addition, the OLS estimator of $\alpha_0$ in (2.6), denoted by $\hat{\alpha}_{0T}$, is also $\sqrt{T}$-consistent, but has the mixture normal distribution. Defining $h = (b', \phi)'$ and $P_{Z_T} = (Z_T, y_{T-1})$, and denoting the OLS estimator of $h$ by $\hat{h}_T$, the covariance matrix of $\hat{h}_T$ can be consistently estimated by
\[ \hat{V}(\hat{h}_T) = \hat{\sigma}_{uT}^2 (P_{Z_T}' P_{Z_T})^{-1}, \]
where $\hat{\sigma}_{uT}^2 = T^{-1} (y_T - P_{Z_T} \hat{h}_T)' (y_T - P_{Z_T} \hat{h}_T)$, and $\hat{V}(\hat{h}_T)$ is asymptotically singular with rank equal to 2.
Theorem 2.1 shows that despite the presence of stochastic and deterministic trends in the ARDL model, the OLS estimators of the short-run parameters are $\sqrt{T}$-consistent.⁵ The second and more important finding is that the OLS estimators of the coefficients on the trended regressors, $\alpha_1$ and $\beta$, in (2.1) are asymptotically perfectly collinear with the OLS estimator of the coefficient on the lagged dependent variable, $\phi$; namely,
\[ \sqrt{T} \left\{ (\hat{c}_T - c) + \lambda (\hat{\phi}_T - \phi) \right\} = o_p(1). \tag{2.9} \]
One interesting implication of this result is that the t-statistics for testing the significance of individual impact coefficients on the I(1) regressors are asymptotically equivalent, namely $t_{\hat{\beta}_i} - t_{\hat{\beta}_j} = o_p(1)$ for $i \neq j$, and $t_{\hat{\beta}_i} - t_{\hat{\alpha}_1} = o_p(1)$.⁶ Furthermore, $t_{\hat{\beta}_i} - t_{(1 - \hat{\phi})} = o_p(1)$. Relation (2.9) in conjunction with
\[ \hat{\lambda}_T - \lambda = \frac{(\hat{c}_T - c) + \lambda (\hat{\phi}_T - \phi)}{(1 - \hat{\phi}_T)}, \tag{2.10} \]
also yields an important result familiar from the cointegration literature, which we set out in the following theorem:

⁵ Similar results can also be obtained in the case of regressors with higher order trend terms such as $t^2, t^3, \ldots$, or I(2), I(3), ..., variables.
Theorem 2.2. Under assumptions (A1) - (A5), the ARDL-based estimators of the long-run parameters, given by $\hat{\delta}_T = \hat{\alpha}_{1T}/(1 - \hat{\phi}_T)$, and $\hat{\theta}_T = \hat{\beta}_T/(1 - \hat{\phi}_T)$, converge to their true values $\delta$ and $\theta$, respectively, at the rates $T^{3/2}$ and $T$. Also asymptotically, $T^{3/2}(\hat{\delta}_T - \delta)$ and $T(\hat{\theta}_T - \theta)$ have the (mixture) normal distributions, and therefore,
\[ Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1} (\hat{\lambda}_T - \lambda) \overset{a}{\sim} N\left\{ 0, \frac{\sigma_u^2}{(1 - \phi)^2} I_{k+1} \right\}, \tag{2.11} \]
where $\hat{\lambda}_T = (\hat{\delta}_T, \hat{\theta}_T')'$, $Q_{\tilde{S}_T} = D_{S_T} S_T' H_T S_T D_{S_T}$; $S_T = (t_T, X_T)$, $H_T = I_T - \tau_T (\tau_T' \tau_T)^{-1} \tau_T'$; and $D_{S_T} = \mathrm{Diag}(T^{-3/2}, T^{-1} I_k)$.
The finding that the estimator of $\theta$ is $T$-consistent is known as the “super-consistency” property in the cointegration literature. Since the limiting distributions of $T^{3/2}(\hat{\delta}_T - \delta)$ and $T(\hat{\theta}_T - \theta)$ are (mixture) normal, optimal two-sided inferences concerning $\delta$ and $\theta$ are possible. Notice also that the covariance matrix of the estimator of $\lambda$ simply depends on the inverse of the (scaled) demeaned data matrix and the spectral density at zero frequency of $(1 - \phi L)^{-1} u_t$, namely $\sigma_u^2/(1 - \phi)^2$. Once again, this finding is in line with the results already familiar from the cointegration literature. (See Section 4 for further discussions.)

⁶ For large enough $T$ we have $t_{\hat{\beta}_i} \approx (1 - \phi)(\sigma_\kappa/\sigma_u)$. This explains the relatively low t-ratios often obtained for short-run coefficients in ARDL regressions with I(1) variables, especially when $\phi$ is close to unity.
Hypothesis testing on the general linear restrictions involving the $(k+1)$-dimensional
long-run parameter vector, $\lambda$, can be carried out in the usual manner.
Consider the $g$ linear restrictions on $\lambda$,
$$R\lambda = r,$$
where $R$ is a $g \times (k+1)$ matrix and $r$ is a $g \times 1$ vector of known constants. These
restrictions can be tested using the Wald statistic,
$$\begin{aligned}
W &= (R\hat{\lambda}_T - r)' \left\{ R\,\mathrm{Cov}(\hat{\lambda}_T)\,R' \right\}^{-1} (R\hat{\lambda}_T - r) \\
  &= \frac{(1 - \hat{\phi}_T)^2}{\hat{\sigma}_{uT}^2} (R\hat{\lambda}_T - r)' \left\{ R\,(S_T' H_T S_T)^{-1} R' \right\}^{-1} (R\hat{\lambda}_T - r).
\end{aligned} \tag{2.12}$$
Of special interest is the t-statistic on the individual coefficients given by
$$t_i = \frac{\hat{\lambda}_{iT} - \lambda_i}{s_{\hat{\lambda}_i}}, \quad i = 1, \ldots, k+1, \tag{2.13}$$
where the standard error of the $i$-th coefficient is consistently estimated by
$$s_{\hat{\lambda}_i} = \sqrt{\frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2} \, (S_T' H_T S_T)^{-1}_{ii}},$$
and $(S_T' H_T S_T)^{-1}_{ii}$ denotes the $i$-th diagonal element of $(S_T' H_T S_T)^{-1}$. By Theorem
2.2, the Wald statistic in (2.12) follows the asymptotic $\chi^2$ distribution with $g$
degrees of freedom, and $t_i^2$ in (2.13) is distributed asymptotically as a $\chi^2$ variate
with one degree of freedom.
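As a minimal sketch of how (2.12) and (2.13) might be computed, the Python fragment below takes the matrix $S_T' H_T S_T$, $\hat{\sigma}_{uT}^2$, $\hat{\phi}_T$ and $\hat{\lambda}_T$ as given. All numerical inputs and function names (`invert`, `long_run_cov`, `wald`, `t_stat`) are hypothetical and serve only to illustrate the algebra; with a single restriction on one coefficient the Wald statistic should coincide with the squared t-ratio.

```python
from math import sqrt

def invert(a):
    """Invert a small square matrix (list of rows) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))  # partial pivoting
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def long_run_cov(shs, sigma2_u, phi_hat):
    """Estimated covariance of the long-run estimator: sigma^2/(1-phi)^2 * (S'HS)^{-1}."""
    scale = sigma2_u / (1.0 - phi_hat) ** 2
    return [[scale * v for v in row] for row in invert(shs)]

def wald(lam_hat, R, r, cov):
    """Wald statistic for R lambda = r, first line of (2.12)."""
    d = [[sum(c * l for c, l in zip(row, lam_hat)) - ri] for row, ri in zip(R, r)]
    middle = invert(matmul(matmul(R, cov), transpose(R)))
    return matmul(matmul(transpose(d), middle), d)[0][0]

def t_stat(lam_hat, lam0, cov, i):
    """t-ratio (2.13) for the i-th long-run coefficient (0-based index)."""
    return (lam_hat[i] - lam0) / sqrt(cov[i][i])

# Hypothetical inputs for k = 1 (trend coefficient delta and one slope theta).
shs = [[4.0, 1.0], [1.0, 3.0]]          # stands in for S'HS
cov = long_run_cov(shs, sigma2_u=0.5, phi_hat=0.6)
lam_hat = [0.1, 2.3]                     # (delta_hat, theta_hat)
W = wald(lam_hat, R=[[0.0, 1.0]], r=[2.0], cov=cov)   # H0: theta = 2
t = t_stat(lam_hat, 2.0, cov, 1)
```

Here `W` and `t**2` agree up to rounding, which is the one-restriction case of the $\chi^2$ results stated above.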
It is worth noting that the results in Theorem 2.2 equally apply to the purely
autoregressive model with deterministic trend,
$$y_t = \alpha_0 + \alpha_1 t + \phi y_{t-1} + u_t, \quad t = 1, \ldots, T, \tag{2.14}$$
and to the ARDL(1,0) model without a deterministic trend,
$$y_t = \alpha_0 + \beta' x_t + \phi y_{t-1} + u_t, \quad t = 1, \ldots, T. \tag{2.15}$$
For completeness the asymptotic results for these models are summarized in Theorems
2.3 and 2.4.
Theorem 2.3. Under the assumptions (A1) and (A5), the OLS estimators of
$\alpha_0$, $\alpha_1$ and $\phi$ in (2.14), denoted by $\hat{\alpha}_{0T}$, $\hat{\alpha}_{1T}$, and
$\hat{\phi}_T$, are all $\sqrt{T}$-consistent, and asymptotically normally distributed. In addition,
$\sqrt{T}(\hat{\alpha}_{1T} - \alpha_1)$ and $\sqrt{T}(\hat{\phi}_T - \phi)$ are perfectly collinear
asymptotically and the covariance matrix of $(\hat{\alpha}_{0T}, \hat{\alpha}_{1T}, \hat{\phi}_T)$
is asymptotically singular with rank equal to 2. Furthermore, the estimator of the
long run parameter $\delta$, computed by $\hat{\alpha}_{1T}/(1 - \hat{\phi}_T)$, has the following
asymptotic distribution:
$$T^{3/2}(\hat{\delta}_T - \delta) \overset{a}{\sim} N\left\{0, \frac{12\,\sigma_u^2}{(1-\phi)^2}\right\}. \tag{2.16}$$
Theorem 2.4. Under assumptions (A1) - (A5), the OLS estimators of $\alpha_0$, $\beta$ and
$\phi$ in (2.15), denoted by $\hat{\alpha}_{0T}$, $\hat{\beta}_T$, and $\hat{\phi}_T$, are
$\sqrt{T}$-consistent, and have the asymptotic (mixture) normal distributions. In addition,
$\sqrt{T}(\hat{\beta}_T - \beta)$ and $\sqrt{T}(\hat{\phi}_T - \phi)$
are perfectly collinear asymptotically and so the covariance matrix of
$(\hat{\alpha}_{0T}, \hat{\beta}_T, \hat{\phi}_T)$ is asymptotically singular with rank equal to 2.
Furthermore, the estimator of the long run parameter $\theta$, given by
$\hat{\theta}_T = \hat{\beta}_T/(1 - \hat{\phi}_T)$, has the mixture normal
distribution asymptotically, and
$$Q_{\tilde{X}_T}^{1/2}\, T(\hat{\theta}_T - \theta) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{(1-\phi)^2} I_k\right\}, \tag{2.17}$$
where $Q_{\tilde{X}_T} = T^{-2} X_T' H_T X_T$.
Before considering a more general specification of the ARDL model, we examine
the relation between the standard errors of the estimator of the long-run parameter,
$\theta$, obtained from our asymptotic results and the standard errors obtained
from the so called “delta” method ($\Delta$-method for short). For ease of exposition
we consider the simple model (2.15), and without loss of generality focus on the
case where $x_t$ is a scalar (i.e., $k = 1$). From Theorem 2.4 we have
$$Q_{\tilde{X}_T}^{1/2}\, T(\hat{\theta}_T - \theta) = \left[\sum_{t=1}^{T} (x_t - \bar{x})^2\right]^{1/2} (\hat{\theta}_T - \theta) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{(1-\phi)^2}\right\}, \tag{2.18}$$
where $Q_{\tilde{X}_T} = T^{-2} \sum_{t=1}^{T} (x_t - \bar{x})^2$ and
$\bar{x} = T^{-1} \sum_{t=1}^{T} x_t$.$^{7}$ Hence a consistent
estimator of the variance of $\hat{\theta}_T$ is given by
$$\hat{V}(\hat{\theta}_T) = \frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2} \, \frac{1}{\sum_{t=1}^{T} (x_t - \bar{x})^2}. \tag{2.19}$$
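$^{7}$ In the case where $x_t$ is I(0) we have the same asymptotic result given by (2.18); that is,
since $T^{-1} x_T' H_T x_T = O_p(1)$ and $\sqrt{T}(\hat{\theta}_T - \theta) = O_p(1)$, hence
$$(T^{-1} x_T' H_T x_T)^{1/2} \sqrt{T}(\hat{\theta}_T - \theta) = \left[\sum_{t=1}^{T} (x_t - \bar{x})^2\right]^{1/2} (\hat{\theta}_T - \theta) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{(1-\phi)^2}\right\}.$$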
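The computation of the variance of $\hat{\theta}_T$ by the $\Delta$-method involves approximating
$$\hat{\theta}_T = g(\hat{\Psi}_T) = \frac{\hat{\beta}_T}{1 - \hat{\phi}_T},$$
by a linear function of $\hat{\Psi}_T = (\hat{\beta}_T, \hat{\phi}_T)'$, and then approximating the
variance of $\hat{\theta}_T$ by the variance of the resulting linear function. Denoting the
estimator of the variance of $\hat{\theta}_T$ by $\hat{V}_\Delta(\hat{\theta}_T)$, we have
$$\begin{aligned}
\hat{V}_\Delta(\hat{\theta}_T) &= \left(\frac{\partial g(\hat{\Psi}_T)}{\partial \hat{\Psi}_T}\right)' \hat{V}(\hat{\Psi}_T) \left(\frac{\partial g(\hat{\Psi}_T)}{\partial \hat{\Psi}_T}\right) \\
&= \left[\frac{1}{1 - \hat{\phi}_T}, \; \frac{\hat{\beta}_T}{(1 - \hat{\phi}_T)^2}\right] \hat{\sigma}_{uT}^2 (R_T' H_T R_T)^{-1} \begin{bmatrix} \dfrac{1}{1 - \hat{\phi}_T} \\[2ex] \dfrac{\hat{\beta}_T}{(1 - \hat{\phi}_T)^2} \end{bmatrix},
\end{aligned}$$
where $R_T = (x_T, y_{T-1})$. After some algebra $\hat{V}_\Delta(\hat{\theta}_T)$ can be expressed as
$$\hat{V}_\Delta(\hat{\theta}_T) = \frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2} \left[1, \; \hat{\theta}_T\right] \frac{1}{D_T} \begin{bmatrix} \sum (y_{t-1} - \bar{y})^2 & -\sum (y_{t-1} - \bar{y})(x_t - \bar{x}) \\ -\sum (y_{t-1} - \bar{y})(x_t - \bar{x}) & \sum (x_t - \bar{x})^2 \end{bmatrix} \begin{bmatrix} 1 \\ \hat{\theta}_T \end{bmatrix}, \tag{2.20}$$
where the bar over the variable denotes the sample mean, and
$$D_T = \left[\sum_{t=1}^{T} (x_t - \bar{x})^2\right] \left[\sum_{t=1}^{T} (y_{t-1} - \bar{y})^2\right] - \left[\sum_{t=1}^{T} (y_{t-1} - \bar{y})(x_t - \bar{x})\right]^2.$$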
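Using (2.5), recalling that $\delta = 0$ and defining $\tilde{y}_{t-1} = y_{t-1} - \bar{y}$,
$\tilde{x}_t = x_t - \bar{x}$ and $\tilde{\kappa}_t = \kappa_t - \bar{\kappa}$, we also have
$$\tilde{y}_{t-1} = \theta \tilde{x}_t + \tilde{\kappa}_t, \tag{2.21}$$
where $\tilde{\kappa}_t$ follows a general linear stationary process. Substituting this result in
(2.20), we obtain
$$\hat{V}_\Delta(\hat{\theta}_T) = \frac{\hat{\sigma}_{uT}^2}{(1 - \hat{\phi}_T)^2} \, \frac{\sum_{t=1}^{T} \tilde{\kappa}_t^2 + (\hat{\theta}_T - \theta)^2 \sum_{t=1}^{T} \tilde{x}_t^2 - 2(\hat{\theta}_T - \theta) \sum_{t=1}^{T} \tilde{x}_t \tilde{\kappa}_t}{\left(\sum_{t=1}^{T} \tilde{x}_t^2\right)\left(\sum_{t=1}^{T} \tilde{\kappa}_t^2\right) - \left(\sum_{t=1}^{T} \tilde{x}_t \tilde{\kappa}_t\right)^2}. \tag{2.22}$$
Since $\tilde{\kappa}_t$ is I(0) and $\tilde{x}_t$ is I(1), using the results familiar in the
literature (see, for example, Phillips and Durlauf (1986)), we have
$$T^{-1} \sum_{t=1}^{T} \tilde{\kappa}_t^2 = O_p(1); \quad T^{-2} \sum_{t=1}^{T} \tilde{x}_t^2 = O_p(1); \quad T^{-1} \sum_{t=1}^{T} \tilde{x}_t \tilde{\kappa}_t = O_p(1).$$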
Also from the result of Theorem 2.4 we know that $T(\hat{\theta}_T - \theta) = O_p(1)$. Hence,
scaling (2.22) by $T^2$ and taking probability limits of the right hand side as $T \to \infty$, we have
$$T^2\, \hat{V}_\Delta(\hat{\theta}_T) = \frac{\sigma_u^2}{(1 - \phi)^2} \, \frac{1}{T^{-2} \sum_{t=1}^{T} (x_t - \bar{x})^2} + o_p(1).$$
Therefore, the standard error for the estimator of the long run parameter, $\theta$,
obtained using the $\Delta$-method is asymptotically the same as that given by (2.19),
which was derived assuming that $x_t$ is I(1). One important advantage of the
variance estimator obtained by the $\Delta$-method over the asymptotic formula (2.19)
lies in the fact that it is asymptotically valid irrespective of whether $x_t$ is I(1) or
I(0), while the latter estimator is valid only if $x_t$ is I(1).
The two variance estimators clearly differ in finite samples. Notice that
$(\sum_{t=1}^{T} \tilde{x}_t \tilde{\kappa}_t)^2$ is asymptotically negligible compared to other terms
in (2.22), but it may not be negligible in finite samples, especially when $\tilde{x}_t$ and
$\tilde{\kappa}_t$ are correlated. For a comparison of the small sample properties of the two
variance estimators see the Monte Carlo results reported in Section 5.
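The comparison between (2.19) and (2.20) can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions: it generates one sample from an ARDL(1,0) model with a random-walk regressor and independent standard normal errors (a hypothetical design, not the paper's Monte Carlo set-up), estimates the model by OLS on demeaned data, and evaluates the asymptotic formula, the closed form (2.20), and the gradient form of the $\Delta$-method. The function names and the choice of $T^{-1}$ in $\hat{\sigma}_{uT}^2$ are assumptions.

```python
import random

def simulate_ardl10(T, phi, beta, seed=0):
    """y_t = phi*y_{t-1} + beta*x_t + u_t with x_t a random walk (hypothetical design)."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(T):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
        y.append(phi * y[-1] + beta * x[-1] + rng.gauss(0.0, 1.0))
    return x, y

def variance_estimates(x, y):
    """Return theta_hat and the variance estimates (2.19), (2.20) and the gradient form."""
    xt, ylag, yt = x[1:], y[:-1], y[1:]
    n = len(yt)
    mx, ml, my = sum(xt) / n, sum(ylag) / n, sum(yt) / n
    xd = [v - mx for v in xt]
    ld = [v - ml for v in ylag]
    yd = [v - my for v in yt]
    sxx = sum(a * a for a in xd)
    sll = sum(b * b for b in ld)
    sxl = sum(a * b for a, b in zip(xd, ld))
    sxy = sum(a * c for a, c in zip(xd, yd))
    sly = sum(b * c for b, c in zip(ld, yd))
    D = sxx * sll - sxl ** 2                      # D_T in the text
    beta_hat = (sll * sxy - sxl * sly) / D        # OLS on demeaned data
    phi_hat = (sxx * sly - sxl * sxy) / D
    resid = [c - beta_hat * a - phi_hat * b for a, b, c in zip(xd, ld, yd)]
    sigma2 = sum(e * e for e in resid) / n        # assumption: divide by n
    theta_hat = beta_hat / (1.0 - phi_hat)
    scale = sigma2 / (1.0 - phi_hat) ** 2
    v_asym = scale / sxx                                                     # (2.19)
    v_delta = scale * (sll - 2 * theta_hat * sxl + theta_hat ** 2 * sxx) / D  # (2.20)
    g0, g1 = 1.0 / (1.0 - phi_hat), beta_hat / (1.0 - phi_hat) ** 2
    c00, c01, c11 = sigma2 * sll / D, -sigma2 * sxl / D, sigma2 * sxx / D
    v_grad = g0 * g0 * c00 + 2 * g0 * g1 * c01 + g1 * g1 * c11  # g' V(Psi) g
    return theta_hat, v_asym, v_delta, v_grad

x, y = simulate_ardl10(T=200, phi=0.5, beta=1.0)
theta_hat, v_asym, v_delta, v_grad = variance_estimates(x, y)
```

In this scalar case the ratio of the two estimators equals $1 + (S_{xy} - \hat{\theta}_T S_{xx})^2 / D_T$, where $S_{xx}$ and $S_{xy}$ denote the demeaned sums of squares and cross products of $x_t$ with $y_{t-1}$, so the $\Delta$-method variance is never smaller than the value from (2.19) in any given sample.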
3. General Autoregressive Distributed Lag Models with a Deterministic Trend and I(1) Regressors

So far we have derived the estimation and asymptotic results for the simple
ARDL(1,0) model under the two strong assumptions (A1) and (A3). These assumptions,
however, are too restrictive for time series analysis, and the estimation procedures
developed in Section 2 are not expected to be robust to their violation: the OLS
estimators would then be inconsistent and/or have limiting distributions that depend
on nuisance parameters.
We first relax the assumption (A1) and allow for the possibility of the error
process in (2.1) to be serially correlated. To deal with this serial correlation we
consider the ARDL($p, q$) model,$^{8}$
$$\phi(L) y_t = \alpha_0 + \alpha_1 t + \beta'(L) x_t + u_t, \tag{3.1}$$
where $\phi(L) = 1 - \sum_{j=1}^{p} \phi_j L^j$, and $\beta(L) = \sum_{j=0}^{q} \beta_j L^j$, and assume
(A1)$'$ The scalar disturbance, $u_t$, in the ARDL($p, q$) model (3.1) is iid$(0, \sigma_u^2)$.

$^{8}$ For convenience we use the same notation $u_t$ for the disturbance terms in (2.1) and (3.1). In
practice the order of the lag polynomials operating on different elements of $x_t$ could be different.
But this does not affect the asymptotic theory presented below.
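Using the decomposition $\beta(L) = \beta(1) + (1 - L)\beta^*(L)$, where
$\beta(1) = \sum_{j=0}^{q} \beta_j$, $\beta^*(L) = \sum_{j=0}^{q-1} \beta_j^* L^j$ and
$\beta_j^* = -\sum_{i=j+1}^{q} \beta_i$, (3.1) can be rewritten as
$$\phi(L) y_t = \alpha_0 + \alpha_1 t + \beta' x_t + \sum_{j=0}^{q-1} \beta_j^{*\prime} \Delta x_{t-j} + u_t, \tag{3.2}$$
where we have used $\beta = \beta(1)$. Similarly, applying the decomposition
$\phi(L) = \phi(1) + (1 - L)\phi^*(L)$ to (3.2), where $\phi(1) = 1 - \sum_{i=1}^{p} \phi_i$,
$\phi^*(L) = \sum_{j=0}^{p-1} \phi_j^* L^j$ and $\phi_j^* = \sum_{i=j+1}^{p} \phi_i$, we have
$$\phi(1) y_t = \alpha_0 + \alpha_1 t + \beta' x_t + \sum_{j=0}^{q-1} \beta_j^{*\prime} \Delta x_{t-j} - \phi^*(L) \Delta y_t + u_t. \tag{3.3}$$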
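The two lag-polynomial decompositions used to obtain (3.2) and (3.3) can be checked mechanically. The following sketch, with hypothetical coefficient values and helper names, builds $\beta_j^*$ and $\phi_j^*$ from their definitions and verifies that $\beta(1) + (1-L)\beta^*(L)$ and $\phi(1) + (1-L)\phi^*(L)$ reproduce the original polynomials. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def beta_star(beta):
    """beta = [b_0, ..., b_q]; returns [b*_0, ..., b*_{q-1}], b*_j = -sum_{i=j+1}^{q} b_i."""
    return [-sum(beta[j + 1:]) for j in range(len(beta) - 1)]

def phi_star(phi):
    """phi = [phi_1, ..., phi_p]; returns [phi*_0, ..., phi*_{p-1}], phi*_j = sum_{i=j+1}^{p} phi_i."""
    return [sum(phi[j:]) for j in range(len(phi))]

def recompose(const, star):
    """Coefficients (in powers of L) of const + (1 - L) * star(L)."""
    out = [const] + [0] * len(star)
    for j, s in enumerate(star):
        out[j] += s       # contribution of  s * L^j
        out[j + 1] -= s   # contribution of -s * L^(j+1)
    return out

# Hypothetical coefficients, chosen only to exercise the identities.
beta = [Fraction(1, 2), Fraction(1, 4), Fraction(-1, 3)]      # beta(L), q = 2
phi = [Fraction(1, 2), Fraction(1, 5)]                        # phi_1, phi_2, p = 2

beta_rebuilt = recompose(sum(beta), beta_star(beta))          # beta(1) + (1-L) beta*(L)
phi_rebuilt = recompose(1 - sum(phi), phi_star(phi))          # phi(1) + (1-L) phi*(L)
phi_poly = [1] + [-c for c in phi]                            # phi(L) = 1 - sum phi_j L^j
```

Both `beta_rebuilt == beta` and `phi_rebuilt == phi_poly` hold for any coefficient lists, since each identity is a telescoping sum.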
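Also from (3.1), we obtain
$$\Delta y_t = [\phi(L)]^{-1} \left\{\alpha_1 + \beta'(L) \Delta x_t + \Delta u_t\right\}.$$
Substituting for $\Delta y_t$ in (3.3) we have
$$y_t = \mu_0 + \delta t + \theta' x_t + \left\{\frac{\beta^*(L) - \phi^*(L)[\phi(L)]^{-1}\beta(L)}{\phi(1)}\right\}' \Delta x_t + \left\{\frac{1 - (1 - L)\phi^*(L)[\phi(L)]^{-1}}{\phi(1)}\right\} u_t, \tag{3.4}$$
where
$$\mu_0 = \frac{\alpha_0 - \phi^*(1)\delta}{\phi(1)}, \quad \delta = \frac{\alpha_1}{\phi(1)}, \quad \theta = \theta(1) = \frac{\beta}{\phi(1)}.$$
Now it is easily seen that
$$\frac{(1 - L)\beta^*(L) - (1 - L)\phi^*(L)[\phi(L)]^{-1}\beta(L)}{\phi(1)} = \theta(L) - \theta,$$
and
$$\frac{1 - (1 - L)\phi^*(L)[\phi(L)]^{-1}}{\phi(1)} = \frac{1 - \{\phi(L) - \phi(1)\}[\phi(L)]^{-1}}{\phi(1)} = [\phi(L)]^{-1},$$
where $\theta(L) = \beta(L)/\phi(L)$. Using these results and the decomposition
$\theta(L) = \theta(1) + (1 - L)\theta^*(L)$, where $\theta^*(L) = \sum_{j=0}^{\infty} \theta_j^* L^j$ and
$\theta_j^* = -\sum_{i=j+1}^{\infty} \theta_i$, in (3.4) we obtain
$$y_t = \mu_0 + \delta t + \theta' x_t + \theta^{*\prime}(L) \Delta x_t + [\phi(L)]^{-1} u_t. \tag{3.5}$$
Matching the regressors on the right-hand-side of (3.2) with those in (3.5) we
finally obtain
$$y_t = \mu_0 + \delta t + \theta' x_t + \sum_{j=0}^{q-1} \theta_j^{*\prime} \Delta x_{t-j} + \kappa_{0t}, \tag{3.6}$$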
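where $\kappa_{0t} = \sum_{j=q}^{\infty} \theta_j^{*\prime} e_{t-j} + [\phi(L)]^{-1} u_t$. Similarly,
$$y_{t-i} = \mu_i + \delta t + \theta' x_t + \sum_{j=0}^{q-1} g_{ij}' \Delta x_{t-j} + \kappa_{it}, \quad i = 1, \ldots, p, \tag{3.7}$$
where $\mu_i = \mu_0 - i\delta$, $i = 1, \ldots, p$;
$$g_{ij} = \begin{cases} -\theta & \text{if } i > j \\ \theta_{j-1}^* & \text{if } i \leq j \end{cases}, \quad 0 \leq j \leq q - 1, \; i = 1, \ldots, p,$$
and
$$\kappa_{it} = \begin{cases} \sum_{j=q-i}^{\infty} \theta_j^{*\prime} e_{t-i-j} + [\phi(L)]^{-1} u_{t-i} & \text{for } i \leq q \\ -\theta' \sum_{j=0}^{i-q-1} e_{t-q-j} + \theta^{*\prime}(L) e_{t-i} + [\phi(L)]^{-1} u_t & \text{for } i > q \end{cases}. \tag{3.8}$$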
As in the previous section, we rewrite the ARDL($p, q$) model (3.2) in matrix
notation in the partitioned regression form,
$$\begin{aligned}
y_T &= G_T f + Y_T \phi + u_T \\
    &= \alpha_0 \tau_T + S_T c + W_T \beta^* + Y_T \phi + u_T,
\end{aligned} \tag{3.9}$$
where $y_T = (y_1, \ldots, y_T)'$, $y_{T,-i} = (y_{1-i}, \ldots, y_{T-i})'$ for $i = 1, \ldots, p$;
$Y_T = (y_{T,-1}, \ldots, y_{T,-p})$; $\Delta X_{T,-j} = (\Delta x_{1-j}, \ldots, \Delta x_{T-j})'$ for
$j = 0, \ldots, q-1$; $W_T = (\Delta X_{T,0}, \Delta X_{T,-1}, \ldots, \Delta X_{T,-q+1})$;
$\tau_T = (1, \ldots, 1)'$, $t_T = (1, \ldots, T)'$, $X_T = (x_1, \ldots, x_T)'$,
$G_T = (\tau_T, t_T, X_T, W_T) = (\tau_T, S_T, W_T)$, $u_T = (u_1, \ldots, u_T)'$,
$f = (\alpha_0, c', \beta^{*\prime})'$, $c = (\alpha_1, \beta')'$,
$\beta^* = (\beta_0^{*\prime}, \ldots, \beta_{q-1}^{*\prime})'$
and $\phi = (\phi_1, \ldots, \phi_p)'$. Note that the dimensions of $Y_T$, $G_T$, $\phi$ and $f$ are
$T \times p$, $T \times (k + kq + 2)$, $p \times 1$ and $(k + kq + 2) \times 1$, respectively.
Theorem 3.1. Under assumptions (A1)$'$ and (A2) - (A5), the OLS estimators of
$\phi$ and $c = (\alpha_1, \beta')'$ in the ARDL($p, q$) model (3.9) are $\sqrt{T}$-consistent and have
the following asymptotic distributions:
$$\sqrt{T}(\hat{\phi}_T - \phi) \overset{a}{\sim} N\left\{0, \sigma_u^2 Q_K^{-1}\right\}, \tag{3.10}$$
where $Q_K$ is the $p \times p$ positive definite covariance matrix of
$(\kappa_{1t}, \kappa_{2t}, \ldots, \kappa_{pt})'$ defined by (3.8), and
$$\sqrt{T}(\hat{c}_T - c) \overset{a}{\sim} N\left\{0, \sigma_u^2\, \tau_p' Q_K^{-1} \tau_p\, \lambda\lambda'\right\}, \tag{3.11}$$
where $\lambda = (\delta, \theta')'$, $\tau_p$ is the $p$-dimensional unit vector, and
$\mathrm{rank}(\lambda\lambda') = 1$. The OLS estimators of $\alpha_0$ and $\beta^*$, denoted by
$\hat{\alpha}_{0T}$ and $\hat{\beta}_T^*$, are also $\sqrt{T}$-consistent, and
have the mixture normal distributions, asymptotically. The covariance matrix for
all the short-run parameters, $h = (f', \phi')'$, is asymptotically singular with rank
equal to $kq + 2$, and can be consistently estimated in the usual way by
$$\hat{V}(\hat{h}_T) = \hat{\sigma}_{uT}^2 (P_{G_T}' P_{G_T})^{-1},$$
where $P_{G_T} = (G_T, Y_T)$, and
$\hat{\sigma}_{uT}^2 = T^{-1} (y_T - P_{G_T} \hat{h}_T)' (y_T - P_{G_T} \hat{h}_T)$.
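From Theorem 3.1 we also find that $\sqrt{T}(\hat{\alpha}_{1T} - \alpha_1)$ and
$\sqrt{T}(\hat{\beta}_T - \beta)$ are asymptotically perfectly collinear with
$\sqrt{T}(\hat{\phi}_T - \phi)$; that is,
$$\sqrt{T}\left\{(\hat{c}_T - c) - \lambda[\hat{\phi}_T(1) - \phi(1)]\right\} = o_p(1), \tag{3.12}$$
where $\hat{\phi}_T(1) = 1 - \sum_{i=1}^{p} \hat{\phi}_{iT}$. For $p = 1$ this reduces to (2.9), since
$\hat{\phi}_T(1) - \phi(1) = -(\hat{\phi}_T - \phi)$. It is also straightforward to show that
$$\hat{\lambda}_T - \lambda = \frac{(\hat{c}_T - c) - \lambda[\hat{\phi}_T(1) - \phi(1)]}{\hat{\phi}_T(1)}. \tag{3.13}$$
Using Theorem 3.1, and results (3.12) and (3.13), we have: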
Theorem 3.2. Under the assumptions (A1)$'$ and (A2) - (A5), the OLS estimators
of the long-run parameters, $\hat{\lambda}_T = (\hat{\delta}_T, \hat{\theta}_T')' = \hat{c}_T/\hat{\phi}_T(1)$
in (3.9), converge to their true values at faster rates than the estimators of the associated
short-run parameters, and follow the mixture normal distribution asymptotically. Therefore,
$$Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1} (\hat{\lambda}_T - \lambda) \overset{a}{\sim} N\left\{0, \frac{\sigma_u^2}{[\phi(1)]^2} I_{k+1}\right\}, \tag{3.14}$$
where $Q_{\tilde{S}_T}$ and $D_{S_T}$ are as defined in Theorem 2.2.
Comparing Theorems 2.2 and 3.2, we find that the presence of the I(0) stationary
regressors in (3.9) (i.e., additional lagged changes in $y_t$ and the lagged changes
in $x_t$ which are introduced to deal with the residual serial correlation problem)
does not affect the asymptotic properties of the OLS estimator of the long run
coefficients, $\delta$ and $\theta$. Therefore, inferences concerning the long-run parameters
can be based on the same standard tests as given by (2.12) and (2.13). In this
more general case, however, the expression for the asymptotic variance of $\hat{\lambda}_T$ is
still given by (2.11), but with $\sigma_u^2/(1-\phi)^2$ replaced by the more general expression,
$\sigma_u^2/[\phi(1)]^2$.
We now relax assumption (A3) and allow for the possibility of endogenous
regressors, but confine our attention to the case where $\Delta x_t$ can be represented by
a finite order vector AR($s$) process,$^{9}$
$$P(L) \Delta x_t = \varepsilon_t, \tag{3.15}$$
where $P(L) = I_k - \sum_{i=1}^{s} P_i L^i$, and $P_i$, $i = 1, \ldots, s$, are the $k \times k$ coefficient
matrices such that the vector autoregressive process in $\Delta x_t$ is stable. Here
$\varepsilon_t$ are assumed to be serially uncorrelated, but possibly contemporaneously correlated
with $u_t$; namely, we assume that $\zeta_t = (u_t, \varepsilon_t')'$ follows the multivariate iid
process with mean zero and the covariance matrix,
$$\Sigma_{\zeta\zeta} = \begin{bmatrix} \sigma_u^2 & \Sigma_{u\varepsilon} \\ \Sigma_{\varepsilon u} & \Sigma_{\varepsilon\varepsilon} \end{bmatrix}. \tag{3.16}$$
$^{9}$ Our analysis can also allow for the inclusion of lagged $\Delta y$'s and a drift term in (3.15)
without affecting the results presented below. On this see Boswijk (1995) and Pesaran, Shin and
Smith (1996).

We will, however, continue to assume that $\mathrm{Cov}(u_{t-j}, \varepsilon_{t-i}) = 0$ for $i \neq j$.
Notice that despite this assumption the model is still general enough to allow not
only for the contemporaneous but also for cross-autocorrelations between $u_t$ and
$\Delta x_t$. With assumption (A3) relaxed, the OLS estimators in (3.1) are no longer
consistent. To correct for the endogeneity of $x_t$, we model the contemporaneous
correlation between $u_t$ and $\varepsilon_t$ by the linear regression of $u_t$ on $\varepsilon_t$,
$$u_t = d' \varepsilon_t + \eta_t, \tag{3.17}$$
where using (3.16) we have $d = \Sigma_{\varepsilon\varepsilon}^{-1} \Sigma_{u\varepsilon}'$, and $\varepsilon_t$ is
strictly exogenous with respect to $\eta_t$.$^{10}$ Substituting (3.15) in (3.17) we obtain:
$$u_t = d' P(L) \Delta x_t + \eta_t, \tag{3.18}$$
where $\Delta x_{t-i}$'s, $i = 0, \ldots, s$, are also strictly exogenous with respect to $\eta_t$. The
parametric correction for the endogenous regressors is then equivalent to extending
the ARDL($p, q$) model (3.2) to the more general ARDL($p, m$) specification,
$$\phi(L) y_t = \alpha_0 + \alpha_1 t + \beta' x_t + \sum_{j=0}^{m-1} \pi_j' \Delta x_{t-j} + \eta_t, \tag{3.19}$$
where $m = \max(q, s + 1)$, and $\pi_i = \beta_i^* - P_i' d$, $i = 0, 1, 2, \ldots, m - 1$, $P_0 = I_k$,
$\beta_i^* = 0$ for $i \geq q$, and $P_i = 0$ for $i \geq s$.
We now replace assumption (A3) by
(A3)$'$ The scalar disturbance $\eta_t$ in (3.19) is iid$(0, \sigma_\eta^2)$, and $\Delta x_t$ follows
the general stationary process given by (3.15). Furthermore, $\eta_t$ and $\varepsilon_t$ are
uncorrelated such that $x_t$ and $\Delta x_{t-j}$'s, $j = 0, \ldots, m-1$, are strictly exogenous
with respect to $\eta_t$ in the ARDL($p, m$) model (3.19).
There are two main differences between the ARDL models defined by (3.2) and
(3.19). Firstly, the order of lagged $\Delta x_t$'s in the two models can differ, and secondly,
the coefficients on $\Delta x_t$'s and their lagged values have different interpretations.
Although this alters the dynamic structure of the model, the basic framework for
estimating the long-run parameters and carrying out statistical inference on them
is the same as before.

$^{10}$ The relation (3.17) will be exact when the joint distribution of $u_t$ and $\varepsilon_t$ is normal.
Theorem 3.3. Under the assumptions (A3)$'$, (A4) and (A5), the OLS estimators
of the short-run parameters in (3.19), $\alpha_0$, $\alpha_1$, $\beta$, $\phi_1, \ldots, \phi_p$,
$\pi_0, \ldots, \pi_{m-1}$, are $\sqrt{T}$-consistent, and asymptotically have the (mixture) normal
distributions. Furthermore, $\sqrt{T}(\hat{c}_T - c)$ is asymptotically perfectly collinear with
$\sqrt{T}[\hat{\phi}_T(1) - \phi(1)]$, where $c = (\alpha_1, \beta')'$ and
$\phi(1) = 1 - \sum_{i=1}^{p} \phi_i$, such that the covariance matrix for
the estimators of the short-run parameters is asymptotically singular with rank
equal to $km + 2$. The asymptotic distribution of the OLS estimators of the long-run
parameters, $\hat{\lambda}_T = (\hat{\delta}_T, \hat{\theta}_T')' = \hat{c}_T/\hat{\phi}_T(1)$ in (3.19), is
mixture normal and therefore,
$$Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1} (\hat{\lambda}_T - \lambda) \overset{a}{\sim} N\left\{0, \frac{\sigma_\eta^2}{[\phi(1)]^2} I_{k+1}\right\}, \tag{3.20}$$
where $\sigma_\eta^2$ is the variance of $\eta_t$ in (3.19), and $Q_{\tilde{S}_T}$ and $D_{S_T}$ are as defined
in Theorem 2.2.
There are no fundamental differences between the results of Theorems 2.2, 3.2
and 3.3, as far as the estimators of the long-run parameters are concerned. A comparison
of (2.11), (3.14) and (3.20) shows that the asymptotic distributions of the
estimators of the long-run parameters, $\hat{\lambda}_T$, under various assumptions discussed
above differ only by a scalar coefficient.
In sum, in the context of the ARDL model inference on the long run parameters,
$\delta$ and $\theta$, is quite simple and requires a priori knowledge or estimation of
the orders of the extended ARDL($p, m$) model. Appropriate modification of the
orders of the ARDL model is sufficient to simultaneously correct for the residual
serial correlation and the problem of endogenous regressors. Variances of the
OLS estimators of the long-run coefficients can then be consistently estimated
using either (3.20), or by means of the $\Delta$-method applied directly to the long-run
estimators. Alternatively, one could compute the estimates of the long-run
coefficients and their associated standard errors using Bewley’s (1979) regression
procedure. Bewley’s method involves rewriting (3.19) as
$$y_t = \frac{\alpha_0}{\phi(1)} + \delta t + \theta' x_t + \frac{1}{\phi(1)} \sum_{j=0}^{m-1} \pi_j' \Delta x_{t-j} - \frac{1}{\phi(1)} \sum_{j=0}^{p-1} \phi_j^* \Delta y_{t-j} + \frac{\eta_t}{\phi(1)}, \tag{3.21}$$
and then estimating it by the instrumental variable method using ($1$, $t$, $x_t$, $\Delta x_t$,
$\Delta x_{t-1}, \ldots, \Delta x_{t-m+1}$, $y_{t-1}$, $y_{t-2}, \ldots, y_{t-p}$) as instruments. It is easy to
show that the IV estimators of $\delta$ and $\theta$ obtained using (3.21) are numerically identical
to the OLS estimators of $\delta$ and $\theta$ based on the ARDL model (3.19), and that the standard
errors of the IV estimators from Bewley’s regression are numerically identical
to the standard errors of the OLS estimators of $\delta$ and $\theta$ obtained using the
$\Delta$-method. (See, for example, Bardsen (1989).) The main attraction of Bewley’s
regression procedure lies in its possible computational convenience as compared to
the direct OLS estimation of (3.19) and computation of the associated standard
errors by the $\Delta$-method.$^{11}$
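The numerical identity of the point estimates can be illustrated in the simplest case. The sketch below assumes an ARDL(1,0) model without trend, so that Bewley's regression has regressors $(1, x_t, \Delta y_t)$ and instruments $(1, x_t, y_{t-1})$; the simulated data, the helper names (`solve`, `cross`, `ols_long_run`, `bewley_iv`) and the parameter values are all hypothetical. Only the equality of the point estimates of $\theta$ is checked here, not that of the standard errors.

```python
import random

def solve(a, b):
    """Solve the square system a z = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def cross(A, B):
    """A'B for data matrices stored as lists of rows."""
    return [[sum(ra[i] * rb[j] for ra, rb in zip(A, B)) for j in range(len(B[0]))]
            for i in range(len(A[0]))]

def simulate(T, phi, beta, seed=1):
    """Hypothetical ARDL(1,0) sample with a random-walk regressor."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(T):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
        y.append(phi * y[-1] + beta * x[-1] + rng.gauss(0.0, 1.0))
    return x, y

def ols_long_run(x, y):
    """theta_hat = beta_hat / (1 - phi_hat) from OLS of y_t on (1, x_t, y_{t-1})."""
    Z = [[1.0, x[t], y[t - 1]] for t in range(1, len(y))]
    Y = [[y[t]] for t in range(1, len(y))]
    b = solve(cross(Z, Z), [row[0] for row in cross(Z, Y)])
    return b[1] / (1.0 - b[2])

def bewley_iv(x, y):
    """Coefficient on x_t from IV estimation of y_t on (1, x_t, dy_t),
    using (1, x_t, y_{t-1}) as instruments: (Z'X)^{-1} Z'y."""
    X = [[1.0, x[t], y[t] - y[t - 1]] for t in range(1, len(y))]
    Z = [[1.0, x[t], y[t - 1]] for t in range(1, len(y))]
    Y = [[y[t]] for t in range(1, len(y))]
    b = solve(cross(Z, X), [row[0] for row in cross(Z, Y)])
    return b[1]

x, y = simulate(T=200, phi=0.5, beta=1.0)
theta_ols = ols_long_run(x, y)
theta_iv = bewley_iv(x, y)
```

The two numbers agree up to rounding because the OLS residuals are orthogonal to the instruments, so the rescaled OLS coefficients satisfy the IV moment conditions exactly.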
Finally, we note in passing that the results developed in this section also apply
to the case where the underlying regressors, $x_t$, given by (3.15), are I(0). (See
footnote 7 and the Monte Carlo simulation results in Section 5.)
4. A Comparison of ARDL and Phillips-Hansen Procedures

Here we focus on the case where there exists a unique cointegrating relation between
I(1) variables, $y_t$ and $x_t$, possibly with a deterministic trend. The case
where there are multiple cointegrating relations among I(1) variables presents
additional difficulties and will not be discussed in this paper. (See Pesaran and Shin
(1995), and the references cited therein.)
Consider the following cointegrating relation
$$y_t = \mu + \delta t + \theta' x_t + v_t, \tag{4.1}$$
$$\Delta x_t = e_t. \tag{4.2}$$
Although the OLS estimator of $\theta$ is shown to be $T$-consistent (see Stock (1987)),
it has also been found that the finite sample behavior of the OLS estimator is
generally very poor (see, for example, Banerjee et al. (1986)). In particular, in the
presence of non-zero correlation between $v_t$ and $e_t$, OLS estimators of $\theta$ in (4.1)
are often heavily biased in finite samples, and inferences based on them are invalid
because of the dependence of the limiting distribution of the OLS estimators on
nuisance parameters. For details see Phillips and Loretan (1991).
Broadly speaking, there are two basic approaches to cointegration analysis:
Johansen’s (1991) maximum likelihood approach, and Phillips-Hansen’s (1990, PH)
fully modified OLS procedure.$^{12}$ The ARDL approach to cointegration analysis
advanced in this paper is directly comparable to the PH procedure, and we shall,
therefore, concentrate on this method. PH assume that $v_t$ and $e_t$ in (4.1) and (4.2)
follow the general correlated linear stationary processes:$^{13}$
$$v_t = A_1(L) u_t, \quad e_t = A_2(L) \varepsilon_t, \tag{4.3}$$

$^{11}$ For a computer implementation of the ARDL approach using the $\Delta$-method see Pesaran
and Pesaran (1997).
$^{12}$ There are also other related procedures such as the original two-step method of Engle and
Granger (1987), the leads and lags estimation procedure suggested by Saikkonen (1991) and
Stock and Watson (1993), and the canonical method by Park (1992).
$^{13}$ For more details see Phillips and Solo (1992).
where $\zeta_t = (u_t, \varepsilon_t')'$ are serially uncorrelated random variables with zero means
and a constant variance matrix given by (3.16). Assuming $A_1(L)$ and $A_2(L)$ are
invertible, (4.1) can be approximated as an ARDL specification by truncating
the order of the infinite order lag polynomials $[A_1(L)]^{-1}$ and $[A_2(L)]^{-1}$ such that
$\phi(L) \approx [A_1(L)]^{-1}$ and $P(L) \approx [A_2(L)]^{-1}$, where the orders of the lag
polynomials $\phi(L)$ and $P(L)$ are denoted by $p$ and $s$, respectively. Then we obtain the
approximate finite-dimensional ARDL($p, m$) specification,
$$\phi(L) y_t = \{\phi(1)\mu + \delta\phi'(1)\} + \delta\phi(1) t + \phi(L)\theta' x_t + \Sigma_{u\varepsilon} \Sigma_{\varepsilon\varepsilon}^{-1} P(L) \Delta x_t + \eta_t, \tag{4.4}$$
where $\phi'(1) = -\sum_{i=1}^{p} i\phi_i$, $m = \max(p, s + 1)$, and by construction $x_t$ (and
$\Delta x_t$'s) are uncorrelated with $\eta_t$.$^{14}$ Notice that (4.4) is of the same form as (3.19),
with the following relations among their parameters: $\alpha_0 = \phi(1)\mu + \delta\phi'(1)$,
$\alpha_1 = \delta\phi(1)$, $\beta = \phi(1)\theta$,
$\pi'(L) = \phi^*(L)\theta' + \Sigma_{u\varepsilon} \Sigma_{\varepsilon\varepsilon}^{-1} P(L)$, where $\phi^*(L)$ is
defined by $\phi(L) = \phi(1) + (1 - L)\phi^*(L)$. Therefore, the ARDL specification (4.4) and the
static cointegrating formulation, (4.1) and (4.2), represent alternative ways of modelling
the serial correlation in $v_t$'s and the endogeneity of $x_t$.
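Here we examine the PH estimation procedure in the context of the ARDL
approximation for the $y_t$ process given by (4.4). Assuming that $\xi_t = (v_t, e_t')'$ in
(4.1) and (4.2) satisfy the multivariate invariance principle, the long-run variance
matrix of $\xi_t$ is given by$^{15}$
$$\Omega_\xi = \underset{T \to \infty}{\mathrm{Plim}} \left\{ T^{-1} \sum_{t=1}^{T} \xi_t \xi_t' + T^{-1} \sum_{j=1}^{\ell} \left[ \sum_{t=j+1}^{T} \xi_t \xi_{t-j}' + \sum_{t=j+1}^{T} \xi_{t-j} \xi_t' \right] \right\}, \tag{4.5}$$
where the lag truncation parameter $\ell$ increases with $T$, such that $\ell/T \to 0$, as
$T \to \infty$. We also define
$$\Delta_\xi = \underset{T \to \infty}{\mathrm{Plim}} \; T^{-1} \left\{ \sum_{t=1}^{T} \xi_t \xi_t' + \sum_{j=1}^{\ell} \sum_{t=j+1}^{T} \xi_t \xi_{t-j}' \right\}, \tag{4.6}$$
and partition $\Omega_\xi$ and $\Delta_\xi$ conformably to $\xi_t = (v_t, e_t')'$,
$$\Omega_\xi = \begin{bmatrix} \omega_{vv} & \Omega_{ve} \\ \Omega_{ev} & \Omega_{ee} \end{bmatrix}, \quad \Delta_\xi = \begin{bmatrix} \Delta_{vv} & \Delta_{ve} \\ \Delta_{ev} & \Delta_{ee} \end{bmatrix}.$$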
Although the use of the consistent estimator of the long-run variance matrix may
solve the serial correlation problem of $v_t$, this does not address the endogeneity
problem. To deal with the cross-correlations between $v_t$ and current and lagged
values of $e_t$, PH consider the modified error process, denoted by $v_t^+$, which is
obtained from the regression of $v_t$ on $e_t$,
$$v_t^+ = v_t - \Omega_{ve} \Omega_{ee}^{-1} e_t, \tag{4.7}$$
and $v_t^+$ is not correlated with $e_t$ by construction. Then, the long-run variance
matrix of $\xi_t^+ = (v_t^+, e_t')'$, denoted by $\Omega_\xi^+$, is block diagonal; that is,
$\Omega_\xi^+ = \mathrm{diag}(\omega_{v \cdot e}, \Omega_{ee})$, where
$$\omega_{v \cdot e} = \omega_{vv} - \Omega_{ve} \Omega_{ee}^{-1} \Omega_{ev}, \tag{4.8}$$
is the conditional long-run variance of $v_t$ given $e_t$. Combining (4.7) with (4.1) we
have the modified “static” cointegrating relation,
$$y_t^+ = \mu + \delta t + \theta' x_t + v_t^+, \tag{4.9}$$
where $y_t^+ = y_t - \Omega_{ve} \Omega_{ee}^{-1} \Delta x_t$. There is still a bias term remaining in (4.9)
because of the correlation between $x_t$ and current and lagged values of $v_t^+$, which is given
by $\Delta_{ev}^+ = \Delta_{ev} - \Delta_{ee} \Omega_{ee}^{-1} \Omega_{ev}$. Removing this bias leads to the
Phillips-Hansen fully-modified OLS estimators,
$$\begin{bmatrix} \hat{\mu}_T^+ \\ \hat{\delta}_T^+ \\ \hat{\theta}_T^+ \end{bmatrix} = (Z_T' Z_T)^{-1} \left\{ Z_T' \hat{y}_T^+ - \begin{bmatrix} 0 \\ 0 \\ \tau_k \end{bmatrix} T \hat{\Delta}_{ev}^+ \right\}, \tag{4.10}$$
where $Z_T = (\tau_T, t_T, X_T)$, $\tau_k$ is the $k$-dimensional column unit vector, and
$\hat{y}_T^+$ and $\hat{\Delta}_{ev}^+$ are consistent estimators of $y_t^+$ and $\Delta_{ev}^+$, respectively.

$^{14}$ As before, $\eta_t = u_t - \Sigma_{u\varepsilon} \Sigma_{\varepsilon\varepsilon}^{-1} \varepsilon_t$.
$^{15}$ The random sequence $\{\xi_t\}$ is said to satisfy the multivariate invariance principle if it is
strictly stationary and ergodic with zero mean, finite variances, and spectral density matrix
$f_{\xi\xi}(\omega) > 0$. See Phillips and Durlauf (1986) for details.
Since the asymptotic distribution of the PH estimators of the coefficients on $t$
and $x_t$ (standardized by $T^{3/2}$ and $T$, respectively) is (mixture) normal, we have
$$Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1} (\hat{\lambda}_T^+ - \lambda) \overset{a}{\sim} N\left\{0, \omega_{v \cdot e} I_{k+1}\right\}, \tag{4.11}$$
where $\hat{\lambda}_T^+ = (\hat{\delta}_T^+, \hat{\theta}_T^{+\prime})'$. This is directly comparable to the
asymptotic result in (3.20) obtained using the ARDL estimation procedure. First, we find that the
estimators of the long run parameters obtained using both the ARDL and the
PH estimation procedures have the mixture normal distributions asymptotically,
and standard inferences on $\theta$ using the Wald test are therefore asymptotically
valid. The main difference between the ARDL-based estimators and the fully-modified
OLS estimators lies in the computation of the long-run variance of the
disturbances in the cointegrating regression. In the case of the ARDL estimation
procedure the long run variance is given by $\sigma_\eta^2/[\phi(1)]^2$, while in the case of the
PH estimation procedure the long run variance is given by $\omega_{v \cdot e}$. But as Theorem
4.1 below shows, $\sigma_\eta^2/[\phi(1)]^2$ and $\omega_{v \cdot e}$ are identical for the ARDL
specification (3.19) (or (4.4)).
Theorem 4.1. In the context of the ARDL specification (3.19) or (4.4), the long-run
variance of the Phillips-Hansen modified error process, $v_t^+$ in (4.9) (denoted
by $\omega_{v \cdot e}$), is equal to $\sigma_\eta^2/[\phi(1)]^2$, which is the spectral density at zero
frequency of $[\phi(L)]^{-1} \eta_t$ in (3.19).
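In applications the probability limits in (4.5) and (4.6) are replaced by kernel-weighted sample analogues. As a rough sketch for a scalar, mean-zero series, the code below uses Bartlett weights $1 - j/(\ell + 1)$; the weighting scheme is an assumption made here for illustration (the text only states that a Bartlett window is used in the simulations), and the function names are hypothetical.

```python
def autocov(z, j):
    """T^{-1} * sum_{t=j+1}^{T} z_t z_{t-j}; z is assumed to have mean zero."""
    T = len(z)
    return sum(z[t] * z[t - j] for t in range(j, T)) / T

def long_run_variance(z, window):
    """Two-sided, Bartlett-weighted scalar analogue of (4.5)."""
    total = autocov(z, 0)
    for j in range(1, window + 1):
        weight = 1.0 - j / (window + 1.0)   # Bartlett weight (assumption)
        total += 2.0 * weight * autocov(z, j)
    return total

def one_sided_variance(z, window):
    """One-sided, Bartlett-weighted scalar analogue of (4.6)."""
    total = autocov(z, 0)
    for j in range(1, window + 1):
        weight = 1.0 - j / (window + 1.0)
        total += weight * autocov(z, j)
    return total

# A tiny alternating series: strong negative first-order autocorrelation
# pulls the long-run variance below the ordinary variance.
z = [1.0, -1.0, 1.0, -1.0]
lrv0 = long_run_variance(z, 0)     # window 0: just the sample second moment
lrv1 = long_run_variance(z, 1)
one1 = one_sided_variance(z, 1)
```

With a window of zero the estimator reduces to the sample second moment, which corresponds to the PH(0) case reported in Section 5.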
5. Finite Sample Simulation Results

In this section, using Monte Carlo techniques, we compare finite sample properties
of the Phillips-Hansen fully-modified estimators of the long-run parameters with
the ARDL-based estimators. In the case of the ARDL procedure we consider
two different estimators of the variance of the long-run parameter, namely the
asymptotic formula (2.19), which is valid only for I(1) regressors, and the
$\Delta$-method formula given by (2.20), which is valid more generally, irrespective of
whether the regressors are I(1) or I(0). We also include the OLS estimators of the
long-run parameters in the static cointegrating relation as a rather crude benchmark
of interest.
We consider the following data generating process (DGP), where the observations
on $y_t$ and $x_t$ are generated according to the finite-order ARDL(1,0) model:
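$$y_t = \alpha + \phi y_{t-1} + \beta x_t + u_t, \tag{5.1}$$
$$x_t - \psi x_{t-1} = \rho(x_{t-1} - \psi x_{t-2}) + \varepsilon_t, \tag{5.2}$$
$t = 1, \ldots, T$, where $(u_t, \varepsilon_t)$ are serially uncorrelated and are generated according to
the following bivariate normal distribution:
$$\begin{pmatrix} u_t \\ \varepsilon_t \end{pmatrix} \sim N\left\{0, \; \Omega = \begin{pmatrix} 1 & \omega_{12} \\ \omega_{12} & 1 \end{pmatrix}\right\}. \tag{5.3}$$
We set $\alpha = 0$, $\beta = 1$, $\rho = 0.2$, and experiment with the following parameter
values: $\phi = (0.2, 0.8)$, $\omega_{12} = (-0.5, 0.0, 0.5)$, and $T = (50, 100, 250)$.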
We carry out two sets of experiments: In the first set (Experiments 1) we fix $\psi$
at 1 and therefore generate $x_t$ as an I(1) process. In the second set (Experiments
2) we set $\psi$ to 0.95 such that $x_t$ is I(0) but with a high degree of persistence. It is
worth noting that in general (irrespective of whether $x_t$ is I(1) or I(0)), the long
run parameter on $x_t$ in (5.1) is given by
$$\theta = \frac{\beta + (1 - \psi)\omega_{12}}{1 - \phi},$$
and $\theta$ will be invariant to the parameters of the $x_t$ process only if $\omega_{12} = 0$ (i.e.,
$x_t$ is strictly exogenous in (5.1)) and/or when $\psi = 1$ (i.e., $x_t$ is I(1)). For a more
general treatment of this issue see Pesaran (1997).
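A data generating process of this kind might be coded as follows. The sketch reads (5.2) as a first-order autoregression in $w_t = x_t - \psi x_{t-1}$, draws $(u_t, \varepsilon_t)$ with unit variances and correlation $\omega_{12}$ from two independent standard normals, and starts all recursions at zero with a burn-in period. The burn-in length, the zero initial values, the seed and the function name `generate` are assumptions, since the text does not describe how the recursions are initialised.

```python
import random
from math import sqrt

def generate(T, phi, omega12, psi, alpha=0.0, beta=1.0, rho=0.2, burn=50, seed=0):
    """Return (x, y, u), each of length T, from a DGP of the form (5.1)-(5.3)."""
    rng = random.Random(seed)
    x_prev = y_prev = w_prev = 0.0           # assumption: zero initial values
    xs, ys, us = [], [], []
    for _ in range(T + burn):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = z1
        eps = omega12 * z1 + sqrt(1.0 - omega12 ** 2) * z2   # corr(u, eps) = omega12
        w = rho * w_prev + eps               # w_t = x_t - psi * x_{t-1}
        x = psi * x_prev + w
        y = alpha + phi * y_prev + beta * x + u
        xs.append(x); ys.append(y); us.append(u)
        x_prev, y_prev, w_prev = x, y, w
    return xs[burn:], ys[burn:], us[burn:]   # drop the burn-in observations

# Experiments 1 use psi = 1 (x_t is I(1)); Experiments 2 use psi = 0.95 (x_t is I(0)).
x1, y1, u1 = generate(T=100, phi=0.8, omega12=0.5, psi=1.0)
x2, y2, u2 = generate(T=100, phi=0.8, omega12=0.5, psi=0.95)
```

Returning the disturbances alongside the data makes it easy to confirm that the generated series satisfy the recursion in (5.1).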
Before discussing the simulation results, notice that when $\omega_{12} = 0$, the correct
specification is the ARDL(1,0) model, and when $\omega_{12} \neq 0$, it is the ARDL(1,2)
model. (See Section 3.) But since in general the true order of the ARDL model is
not known a priori, we estimated 30 different ARDL models, namely ARDL($p, m$),
$p = 1, 2, \ldots, 5$, $m = 0, 1, 2, \ldots, 5$, and used the Akaike Information Criterion (AIC),
and the Schwarz Criterion (SC) to select the orders of the ARDL model before
estimating the long-run coefficients and carrying out inferences. The estimates
obtained by these two-step procedures will be referred to as ARDL-AIC, and
ARDL-SC, respectively.
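The order-selection step can be sketched as below. The criteria are written in one common textbook form, $T\ln(\mathrm{RSS}/T) + 2n$ for the AIC and $T\ln(\mathrm{RSS}/T) + n\ln T$ for the SC, to be minimised, where $n$ is the number of estimated coefficients; this parameterisation is an assumption, as the text does not state the exact formulas used. The residual sums of squares in the example are made up purely to show how the two criteria can disagree.

```python
from math import log

def aic(rss, T, n_params):
    """Akaike criterion in a 'smaller is better' form (assumed parameterisation)."""
    return T * log(rss / T) + 2.0 * n_params

def sc(rss, T, n_params):
    """Schwarz criterion in a 'smaller is better' form (assumed parameterisation)."""
    return T * log(rss / T) + n_params * log(T)

def select_order(candidates, T, criterion):
    """candidates maps (p, m) -> (rss, n_params); returns the order minimising the criterion."""
    return min(candidates, key=lambda order: criterion(candidates[order][0], T,
                                                       candidates[order][1]))

# Hypothetical fits: the richer ARDL(1,1) lowers the RSS only slightly.
T = 100
candidates = {
    (1, 0): (100.0, 3),   # intercept, x_t, y_{t-1}
    (1, 1): (97.0, 4),    # adds one lagged difference of x_t
}
best_aic = select_order(candidates, T, aic)
best_sc = select_order(candidates, T, sc)
```

In this made-up example the AIC picks the larger model and the SC the smaller one, reflecting the heavier penalty of the SC whenever $\ln T > 2$.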
The simulation results are summarized in Tables 1a-1f and 2a-2f for Experiments
1 and 2, respectively. Summary statistics included in these tables are:
Bias = $\hat{\theta}_R - \theta_0$, where $\theta_0$ is the true value of the long-run coefficient $\theta$,
$\hat{\theta}_R$ is the mean of the estimates of $\theta$ across replications, i.e.,
$\hat{\theta}_R = \sum_{i=1}^{R} \hat{\theta}_i / R$, and $R$ is the number of replications,
STDE $\hat{\theta}$ = Standard error of the estimator, $\hat{\theta}_i$, across replications,
RMSE = The root mean squared error of $\hat{\theta}_i$, $\sqrt{R^{-1} \sum_{i=1}^{R} (\hat{\theta}_i - \theta_0)^2}$,
Mean t = Average t-statistic for testing $\theta = \theta_0$ against $\theta \neq \theta_0$,
STD t = Standard deviation of the t-statistic for testing $\theta = \theta_0$ against $\theta \neq \theta_0$,
SIZE = Empirical size of the t-test of the null hypothesis $\theta = \theta_0$ against $\theta \neq \theta_0$,
POWER+ = Empirical power of the t-test under the alternative $\theta = 1.05\theta_0$,
POWER$-$ = Empirical power of the t-test under the alternative $\theta = 0.95\theta_0$.
The nominal size of the tests is set at 5 percent, and the number of replications
at $R = 2{,}500$.$^{16}$
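These summary statistics are straightforward to compute once the replications are stored. The sketch below assumes that the estimates and the t-ratios for $\theta = \theta_0$ are available as lists, uses a two-sided normal critical value of 1.96 for the 5 percent nominal size, and divides by $R - 1$ in the standard deviations; the critical value, the divisor and the function name are assumptions, and the three replications in the example are invented for illustration.

```python
from math import sqrt

def summarize(estimates, t_stats, theta0, critical=1.96):
    """Bias, STDE, RMSE, Mean t, STD t and SIZE across Monte Carlo replications."""
    R = len(estimates)
    mean_est = sum(estimates) / R
    mean_t = sum(t_stats) / R
    return {
        "bias": mean_est - theta0,
        "stde": sqrt(sum((e - mean_est) ** 2 for e in estimates) / (R - 1)),
        "rmse": sqrt(sum((e - theta0) ** 2 for e in estimates) / R),
        "mean_t": mean_t,
        "std_t": sqrt(sum((t - mean_t) ** 2 for t in t_stats) / (R - 1)),
        # share of replications in which the true null is rejected
        "size": sum(1 for t in t_stats if abs(t) > critical) / R,
    }

# Three invented replications, only to show the mechanics.
stats = summarize(estimates=[1.0, 1.2, 0.8], t_stats=[0.5, -2.5, 2.1], theta0=1.0)
```

The POWER+ and POWER$-$ entries would be obtained in the same way as `size`, with the t-ratios recomputed under the false null values $1.05\theta_0$ and $0.95\theta_0$.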
Tables 1a-1f summarize the results for the correctly specified ARDL model
(namely the ARDL(1,0) when $\omega_{12} = 0$, and the ARDL(1,2) for $\omega_{12} \neq 0$), the
estimates based on ARDL-AIC and the ARDL-SC procedures, and the Phillips-Hansen
fully modified estimators based on the Bartlett window for window sizes
0, 5, 10, 20 and 40, which are reported under PH(0), PH(5), etc.
In the case where $\omega_{12} = 0$, the bias of the ARDL estimators is much smaller
than that of the PH estimators. The extent of the bias crucially depends on the
value of $\phi$, and not surprisingly increases as $\phi$ is increased from 0.2 in Table 1a
to 0.8 in Table 1d. Also the RMSE’s of the ARDL and the PH estimators are
very similar when $\phi = 0.2$, but diverge considerably for $\phi = 0.8$. As can be seen
from Table 1d, for $T = 50$, the RMSE of the ARDL estimators is about one-third
of the RMSE of the PH estimators. The empirical sizes of the ARDL procedure
are much more satisfactory than the ones obtained using the PH fully modified
estimators. When $\omega_{12} = 0$, the sizes of the tests based on the ARDL estimators
are generally reasonable and much nearer to their nominal size of 5 percent than
the sizes of tests based on the PH estimators.

$^{16}$ In a very small number of replications $\phi(1)$ was estimated to be in excess of 0.99. These
cases are not included in the summary results.

Empirical sizes of the tests based on the ARDL estimators computed using the
$\Delta$-method tend to be much closer to their nominal values than those computed
using the asymptotic formula. This is particularly so when $T$ is small. Therefore,
in what follows, we shall focus on the ARDL test statistics that are computed
using the $\Delta$-method.
Another general feature of the simulation results is the slight superiority of the
ARDL-SC method over the ARDL-AIC procedure; which is in accordance with
the fact that the SC is a consistent model selection criterion, while the AIC is
not. (See, for example, Lütkepohl (1991, Chapter 4)).
Finally, there is a clear tendency for the tests based on the PH method to
over-reject in small samples, and the extent of this over-rejection increases with
Á, and declines only slowly with the sample size, T . For example, for Á = 0:8
and T = 100, the empirical sizes of the t-tests based on the PH method exceed
41 percent for all the …ve window sizes, and even for T = 250 do not fall below
20 percent. (See the column headed “SIZE” in Table 1d). By contrast the size of
the test based on the ¢-method in Table 1d is reasonable even for T = 50. For
the correct ARDL(1,0) speci…cation, the size of the test based on the ¢-method
is 7.2 percent and increases to 12.8 and 8.6 percents for the ARDL-AIC and the
ARDL-SC procedures, respectively.
Similar results are obtained in the case where $\omega_{12} = 0.5$, and hence $x_t$ and $u_t$
are contemporaneously correlated. The ARDL estimators are now substantially
less biased than the PH estimators. (See the column headed “BIAS” in Table
1e.) Once again the performance of the PH estimators improves with the sample
size, but very slowly. For T = 250, the bias of the PH estimators for the most
favorable window size is still around -0.14, but the biases of the ARDL estimators
lie between -0.0017 and 0.0024. The size performance of the two test procedures
also closely mirrors these differences in the degree of bias of the estimators.
The empirical size of the tests based on the PH method ranges from 60 to 85
percent for T = 50, and falls to around 21 percent for T = 250 and a window size
of 20. The size of the tests based on the ARDL procedure, when the Δ-method
is used to compute the variances, is at most 13 percent for T = 50, and lies in the
range 5.2 to 7.7 percent when T is increased to 250. (See Table 1e.)
Due to the large size distortions of the PH procedure, the results presented in
Tables 1a-1f do not allow proper comparisons of the power properties of the two
test procedures. But for T = 250, where the size distortion of the PH test is not
too excessive, the ARDL procedure consistently outperforms the PH method. For
example, in the case of $\phi = 0.8$, $\omega_{12} = 0.5$, $\theta = 5$, and T = 250, the power of the
ARDL procedure in rejecting the false null hypothesis, $\theta = 0.95\theta_0$, is consistently
above 98 percent while the power of the PH method is at most 62 percent even
though its associated size is 85 percent! There seems also to be a tendency for the
power function of the ARDL procedure in the case where $\omega_{12} \neq 0$ and T small
to be asymmetric around $\theta = \theta_0$, showing a higher power for the alternatives
exceeding $\theta_0$ as compared to the alternatives falling below $\theta_0$.
The results for Experiments 2 with an I(0) regressor are summarized in Tables
2a-2f. These results are very similar to those obtained for Experiments 1. The
overall performance of the ARDL-based methods with variances estimated using
the Δ-method is satisfactory for most cases, though slightly worse than that
obtained for Experiments 1. (In particular, the biases are slightly larger and the
tests are less powerful.) But the performance of the PH estimators is still very
poor, especially when T is small.
Overall, the simulation results show that the ARDL-based estimation procedure
based on the Δ-method developed in the paper can be reliably used in small
samples to estimate and test hypotheses on the long-run coefficients in both cases
where the underlying regressors are I(1) or I(0). This is an important finding since
the ARDL approach can avoid the pretesting problem implicitly involved in the
cointegration analysis of the long-run relationships. (Also see Cavanaugh et al.
(1995) and Pesaran (1997).)
Before concluding this section, we note that the comparison of the small sample
performance of the ARDL-based and the PH estimators is not comprehensive
in the sense that the data generating process we have used is biased in favor of the
ARDL procedure (see Inder (1993)). In this regard, it is more appropriate to consider
the relative performances of the ARDL and the PH estimators using more
general DGPs, such as (4.1) and (4.2), that can allow for moving average error
processes. In the working paper version of this paper we also considered Monte
Carlo experiments using (4.1) and (4.2) as data generating processes. In one set of
experiments (called DGP 2) we used first-order bivariate vector moving-average
processes to generate the errors, $v_t$ and $e_t$, and in another set of experiments
(called DGP 3) we generated $v_t$ and $e_t$ according to first-order vector autoregressive
processes. Neither of these DGPs allows transformations of the model so
that $x_t$ could become strictly exogenous with respect to the disturbances of the
augmented ARDL model. We found that the simulation results based on these
DGPs are less clear-cut, but the ARDL-based estimator using the Δ-method
still outperforms the PH estimator in most experiments, especially for small T.
Broadly speaking, the relative small sample performance of the two estimators
seems to depend on the signal-to-noise ratio, $Var(e_t)/Var(v_t)$, with the ARDL
approach dominating the PH method when this ratio is low, and vice versa. This
is clearly an area for further research.17
6. Concluding Remarks
The theoretical analysis and the Monte Carlo results presented in this paper provide
strong evidence in favor of a rehabilitation of the traditional ARDL approach
to time series econometric modelling. The focus of this paper, however, has been
exclusively on single equation estimation techniques and the important issue of
system estimation is not addressed here. Such an analysis inevitably involves
the problem of identification of short-run and long-run relations and demands
a structural approach to the analysis of econometric models. The problem of
long-run structural modelling in the context of an unrestricted VAR model has
been addressed elsewhere. (See, for example, Johansen (1991), Phillips (1991)
and Pesaran and Shin (1995).) An alternative procedure, which takes us back to
the Cowles Commission approach, would be to extend the ARDL methodology
advanced in this paper to systems of equations subject to short-run and/or long-run
identifying restrictions. (See, for example, Boswijk (1995) and Hsiao (1995).)
We hope to pursue this line of research in the future, thus establishing a closer
link between the recent cointegration analysis and the traditional simultaneous
equations econometric methodology.
17
We are grateful to Peter Boswijk and an anonymous referee for drawing our attention to
this point.
Appendix: Mathematical Proofs
For notational convenience we use “$\xrightarrow{p}$”, “$\Rightarrow$” and “$\overset{a}{\sim}$” to signify convergence
in probability, weak convergence in probability measure, and asymptotic
equality in distribution, respectively. All sums are over $t = 1, 2, \ldots, T$.
In the case where the regressors are stationary the usual method of deriving
the asymptotic distribution of the OLS estimators of the short-run parameters in,
for example, (2.1), would be to apply Slutsky's theorem to $(P_{Z_T}' P_{Z_T})^{-1}$ and
$P_{Z_T}' u_T$ separately, where $P_{Z_T} = (\tau_T, t_T, X_T, y_{T,-1})$, after appropriately scaling
them by the sample size. (The appropriate scaling of $P_{Z_T}' P_{Z_T}$ in this case is given
by $D_{P_T} P_{Z_T}' P_{Z_T} D_{P_T}$, where $D_{P_T} = \mathrm{Diag}(T^{-1/2}, T^{-3/2}, T^{-1} I_k, T^{-1})$.) This procedure
cannot, however, be applied to dynamic time series models with trended regressors
(irrespective of whether the trends are stochastic or deterministic), because
$P_{Z_T}' P_{Z_T}$ does not converge to a non-singular matrix even if the individual elements
of $P_{Z_T}' P_{Z_T}$ are appropriately scaled by the sample size.
In what follows the asymptotic theory will be developed using partitioned
regression techniques and then writing individual elements of the OLS estimators
of the short-run parameters as ratios of random variables, thus avoiding the need
to apply Slutsky's theorem to $(P_{Z_T}' P_{Z_T})^{-1}$ directly.
Since Theorems 2.1 - 2.4 are special cases of Theorems 3.1 and 3.2, and can
be proved in a similar manner, we omit their proofs to save space.
Proof of Theorem 3.1.
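Before deriving the asymptotic distributions of the OLS estimators of the short-run
parameters in (3.9) we derive some preliminary results. Define
$$ q_{K_T u_T} = T^{-1/2} K_T' u_T, \qquad Q_{K_T} = T^{-1} K_T' K_T, $$
$$ q_{G_T u_T} = D_{G_T} G_T' u_T = \begin{bmatrix} D_{Z_T} Z_T' u_T \\ T^{-1/2} W_T' u_T \end{bmatrix} = \begin{bmatrix} q_{Z_T u_T} \\ q_{W_T u_T} \end{bmatrix}, $$
$$ q_{G_T K_T} = D_{G_T} G_T' K_T = \begin{bmatrix} D_{Z_T} Z_T' K_T \\ T^{-1/2} W_T' K_T \end{bmatrix} = \begin{bmatrix} q_{Z_T K_T} \\ q_{W_T K_T} \end{bmatrix}, $$
$$ Q_{G_T} = D_{G_T} G_T' G_T D_{G_T} = \begin{bmatrix} D_{Z_T} Z_T' Z_T D_{Z_T} & T^{-1/2} D_{Z_T} Z_T' W_T \\ T^{-1/2} W_T' Z_T D_{Z_T} & T^{-1} W_T' W_T \end{bmatrix} = \begin{bmatrix} Q_{Z_T} & Q_{Z_T W_T} \\ Q_{Z_T W_T}' & Q_{W_T} \end{bmatrix}, $$
where $K_T = (\kappa_{1T}, \kappa_{2T}, \ldots, \kappa_{pT})$ with $\kappa_{iT} = (\kappa_{i1}, \kappa_{i2}, \ldots, \kappa_{iT})'$ for $i = 1, \ldots, p$,
$D_{G_T} = \mathrm{Diag}(T^{-1/2}, T^{-3/2}, T^{-1} I_k, T^{-1/2} I_{kq})$ and $D_{Z_T} = \mathrm{Diag}(T^{-1/2}, T^{-3/2}, T^{-1} I_k)$. Using the
results in Phillips and Durlauf (1986), it is easily seen that as $T \to \infty$,
$$ q_{K_T u_T} \xrightarrow{p} q_{Ku}, \qquad Q_{K_T} \xrightarrow{p} Q_K, \tag{A.1} $$
$$ q_{G_T u_T} \Rightarrow q_{Gu} = \begin{bmatrix} q_{Zu} \\ q_{Wu} \end{bmatrix}, \qquad q_{G_T K_T} \Rightarrow q_{GK} = \begin{bmatrix} q_{ZK} \\ q_{WK} \end{bmatrix}, \tag{A.2} $$
$$ Q_{G_T} \Rightarrow Q_G = \begin{bmatrix} Q_Z & 0 \\ 0 & Q_W \end{bmatrix}, \tag{A.3} $$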
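where $q_{Ku}$, $q_{Wu}$, $q_{WK}$, $Q_K$ and $Q_W$ are (finite) probability limits of $q_{K_T u_T}$, $q_{W_T u_T}$,
$q_{W_T K_T}$, $Q_{K_T}$ and $Q_{W_T}$, respectively, and $q_{Zu}$, $q_{ZK}$ and $Q_Z$ are functionals of
Brownian motions given by
$$ q_{Zu} = \begin{bmatrix} B_u(1) \\ \int_0^1 r \, dB_u(r) \\ \int_0^1 B_e(r) \, dB_u(r) \end{bmatrix}, \qquad q_{ZK} = \begin{bmatrix} B_K(1)' \\ \int_0^1 r \, dB_K(r)' \\ \int_0^1 B_e(r) \, dB_K(r)' \end{bmatrix}, $$
$$ Q_Z = \begin{bmatrix} 1 & \frac{1}{2} & \int_0^1 B_e(r)' \, dr \\ \frac{1}{2} & \frac{1}{3} & \int_0^1 r B_e(r)' \, dr \\ \int_0^1 B_e(r) \, dr & \int_0^1 r B_e(r) \, dr & \int_0^1 B_e(r) B_e(r)' \, dr \end{bmatrix}. $$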
$B_u(r)$ is the scalar Brownian motion process with variance equal to r times $\sigma_u^2$
(since $u_t$ is not serially correlated), $B_e(r)$ is a k-dimensional Brownian motion on
$r \in [0, 1]$ with variance equal to r times the long-run variance of $e_t$, and $B_K(r)$
is the p-dimensional Brownian motion on [0,1] with variance equal to r times the
long-run variance of $(\kappa_{1T}, \kappa_{2T}, \ldots, \kappa_{pT})$. The long-run variance of a stochastic
process is given by $2\pi$ multiplied by the spectral density of the process at zero
frequency. Notice that $Q_Z$ (or $Q_G$) is of full column rank by assumption (A4),
and the elements in $Q_Z$ involving $B_e(r)$ are random even asymptotically.
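The definition of the long-run variance used above can be checked numerically in a simple case. The sketch below assumes, purely for illustration, a scalar stationary AR(1) process $e_t = \rho e_{t-1} + \varepsilon_t$ with innovation variance $\sigma^2$ (this process and all names are assumptions, not part of the paper). It compares the sum of autocovariances over all leads and lags with $2\pi$ times the spectral density at zero frequency, both of which equal $\sigma^2/(1-\rho)^2$.

```python
import math

def ar1_autocov(rho, sigma2, j):
    """Autocovariance at lag j of a stationary AR(1) process."""
    return sigma2 * rho ** abs(j) / (1.0 - rho ** 2)

def long_run_variance(rho, sigma2, max_lag):
    """Sum of autocovariances over lags -max_lag..max_lag (truncated sum)."""
    tail = sum(ar1_autocov(rho, sigma2, j) for j in range(1, max_lag + 1))
    return ar1_autocov(rho, sigma2, 0) + 2.0 * tail

def spectral_density_at_zero(rho, sigma2):
    """AR(1) spectral density f(0) = sigma2 / (2*pi*(1 - rho)^2)."""
    return sigma2 / (2.0 * math.pi * (1.0 - rho) ** 2)

lrv_from_autocov = long_run_variance(0.5, 1.0, max_lag=200)
lrv_from_spectrum = 2.0 * math.pi * spectral_density_at_zero(0.5, 1.0)
```

With $\rho = 0.5$ and $\sigma^2 = 1$ both quantities are equal to 4 up to truncation and rounding error, whereas the ordinary variance is only 4/3.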
Since $\kappa_{1T}, \kappa_{2T}, \ldots, \kappa_{pT}$ and $1, t, x_t, \Delta x_t, \Delta x_{t-1}, \ldots, \Delta x_{t-q+1}$ are all distributed
independently of $u_t$, such that $B_K(r)$ and $B_e(r)$ are independent of $B_u(r)$, it
follows that
$$ q_{Ku} \overset{a}{\sim} N\left(0, \sigma_u^2 Q_K\right), \qquad q_{Gu} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_G\right), \tag{A.4} $$
where MN denotes the mixture normal distribution. For details concerning the
theory of the mixture normal distribution see, for example, Phillips (1991). However,
this (mixture) normality result does not hold in the case of $q_{GK}$, because $x_t$
and the $\Delta x_{t-i}$'s ($i = 0, \ldots, q-1$) are correlated with $\kappa_{it}$, $i = 1, \ldots, p$.
The OLS estimators of $f$ and $\phi$ in (3.9), denoted by $\hat{f}_T$ and $\hat{\phi}_T$, satisfy the
relations
$$ \hat{\phi}_T - \phi = (Y_T' M_{G_T} Y_T)^{-1} (Y_T' M_{G_T} u_T), \tag{A.5} $$
$$ \hat{f}_T - f = (G_T' G_T)^{-1} \left[ G_T' u_T - G_T' Y_T \left( \hat{\phi}_T - \phi \right) \right], \tag{A.6} $$
where $M_{G_T} = I_T - G_T (G_T' G_T)^{-1} G_T'$ with $I_T$ being the $T \times T$ identity matrix.
Using (3.7), $Y_T$ can be expressed as
$$ Y_T = G_T \Gamma + K_T, \tag{A.7} $$
where
$$ \Gamma = \begin{bmatrix} \mu_1 & \mu_2 & \cdots & \mu_p \\ \delta & \delta & \cdots & \delta \\ \theta & \theta & \cdots & \theta \\ g_1 & g_2 & \cdots & g_p \end{bmatrix}, $$
and $g_i = (g_{i0}', g_{i1}', \ldots, g_{i,q-1}')'$ is a $kq \times 1$ vector of parameters. Using (A.7) we have
$$ Y_T' M_{G_T} Y_T = K_T' K_T - K_T' G_T (G_T' G_T)^{-1} G_T' K_T, $$
$$ Y_T' M_{G_T} u_T = K_T' u_T - K_T' G_T (G_T' G_T)^{-1} G_T' u_T, $$
where we used $G_T' M_{G_T} = 0$. Using (A.1) - (A.3), it can be shown that as $T \to \infty$,
$$ T^{-1} (Y_T' M_{G_T} Y_T) = Q_{K_T} + o_p(1) \xrightarrow{p} Q_K, \tag{A.8} $$
$$ T^{-1/2} (Y_T' M_{G_T} u_T) = q_{K_T u_T} + o_p(1) \xrightarrow{p} q_{Ku}. \tag{A.9} $$
Multiplying (A.5) by $\sqrt{T}$, and using (A.8), (A.9) and (A.4), we obtain (3.10).
Next, substituting $Y_T$ from (A.7) in (A.6), we obtain
$$ \hat{f}_T - f = (G_T' G_T)^{-1} G_T' u_T - \Gamma \left( \hat{\phi}_T - \phi \right) - (G_T' G_T)^{-1} G_T' K_T \left( \hat{\phi}_T - \phi \right). \tag{A.10} $$
Define
$$ d_T = \left( \hat{f}_T - f \right) + \Gamma \left( \hat{\phi}_T - \phi \right). \tag{A.11} $$
Multiplying (A.11) by $D_{G_T}^{-1}$, using (A.1) - (A.3) and (A.10), and applying the
continuous mapping theorem (see, for example, Phillips and Durlauf (1986)), it
follows that
$$ D_{G_T}^{-1} d_T = Q_{G_T}^{-1} q_{G_T u_T} + o_p(1) \Rightarrow Q_G^{-1} q_{Gu}. \tag{A.12} $$
Since $q_{Gu}$ is shown to be mixture normal in (A.4), it follows that
$$ Q_G^{-1} q_{Gu} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_G^{-1}\right), \qquad Q_{G_T}^{1/2} D_{G_T}^{-1} d_T \overset{a}{\sim} N\left(0, \sigma_u^2 I_{k+kq+2}\right). $$
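Next, pre-multiplying (A.12) by the diagonal matrix $\mathrm{Diag}(1, T^{-1}, T^{-1/2} I_k, I_{kq})$,
we have
$$ \sqrt{T} d_T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & T^{-1} & 0 & 0 \\ 0 & 0 & T^{-1/2} I_k & 0 \\ 0 & 0 & 0 & I_{kq} \end{bmatrix} Q_{G_T}^{-1} q_{G_T u_T} + o_p(1) \tag{A.13} $$
$$ \Rightarrow \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{kq} \end{bmatrix} Q_G^{-1} q_{Gu} \overset{a}{\sim} MN\left\{ 0, \sigma_u^2 \begin{bmatrix} Q_Z^{11} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & Q_W^{-1} \end{bmatrix} \right\}, $$
where $Q_Z^{11}$ is the (1,1) element of $Q_Z^{-1}$. The above result can be rewritten separately
for $\hat{\alpha}_{0T}$, $\hat{c}_T$ and $\hat{\beta}_T^*$ as
$$ \sqrt{T} (\hat{\alpha}_{0T} - \alpha_0) + (\mu_1, \mu_2, \ldots, \mu_p) \sqrt{T} \left( \hat{\phi}_T - \phi \right) = d_{Zu,1} + o_p(1), \tag{A.14} $$
$$ \sqrt{T} (\hat{c}_T - c) + \lambda \tau_p' \sqrt{T} \left( \hat{\phi}_T - \phi \right) = o_p(1), \tag{A.15} $$
$$ \sqrt{T} \left( \hat{\beta}_T^* - \beta^* \right) + (g_1, g_2, \ldots, g_p) \sqrt{T} \left( \hat{\phi}_T - \phi \right) = Q_W^{-1} q_{Wu} + o_p(1), \tag{A.16} $$
where $\tau_p$ is a $p \times 1$ vector of unity and $d_{Zu,1}$ is the first element of $Q_Z^{-1} q_{Zu}$. Using
(3.10) in (A.15) we obtain (3.11). It is also clear from the above results that the
OLS estimators of $\alpha_0$ and $\beta^*$ (standardized by $\sqrt{T}$) have (mixture) normal
distributions asymptotically.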
Finally, using (3.10), (3.11), and (A.13)-(A.16), it is easily seen that a consistent
estimator of the variance of $\hat{h}_T$ is given by $\hat{V}(\hat{h}_T) = \hat{\sigma}_{uT}^2 (P_{G_T}' P_{G_T})^{-1}$, with
the rank of $\hat{V}(\hat{h}_T)$ being equal to $kq + 2$.
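Partition $d_T = (a_T, s_T', w_T')'$ conformably to $G_T = (\tau_T, S_T, W_T)$; then $s_T$ is
given by
$$ s_T = \sqrt{T} (\hat{c}_T - c) + \lambda \tau_p' \sqrt{T} \left( \hat{\phi}_T - \phi \right). \tag{A.17} $$
Using (A.10) and (A.11), $(s_T', w_T')'$ can be expressed as
$$ \begin{bmatrix} s_T \\ w_T \end{bmatrix} = \begin{bmatrix} S_T' H_T S_T & S_T' H_T W_T \\ W_T' H_T S_T & W_T' H_T W_T \end{bmatrix}^{-1} \begin{bmatrix} S_T' H_T u_T \\ W_T' H_T u_T \end{bmatrix} - \begin{bmatrix} S_T' H_T S_T & S_T' H_T W_T \\ W_T' H_T S_T & W_T' H_T W_T \end{bmatrix}^{-1} \begin{bmatrix} S_T' H_T K_T \\ W_T' H_T K_T \end{bmatrix} \left( \hat{\phi}_T - \phi \right). \tag{A.18} $$
Let
$$ q_{\tilde{S}_T u_T} = D_{S_T} S_T' H_T u_T, \qquad Q_{\tilde{S}_T} = D_{S_T} S_T' H_T S_T D_{S_T}, $$
where $D_{S_T} = \mathrm{Diag}(T^{-3/2}, T^{-1} I_k)$. Then, it is also easily seen that as $T \to \infty$,
$$ q_{\tilde{S}_T u_T} \Rightarrow q_{\tilde{S}u} = \begin{bmatrix} \int_0^1 \left( r - \frac{1}{2} \right) dB_u(r) \\ \int_0^1 \tilde{B}_e(r) \, dB_u(r) \end{bmatrix}, \tag{A.19} $$
$$ Q_{\tilde{S}_T} \Rightarrow Q_{\tilde{S}} = \begin{bmatrix} \frac{1}{12} & \int_0^1 \left( r - \frac{1}{2} \right) \tilde{B}_e(r)' \, dr \\ \int_0^1 \left( r - \frac{1}{2} \right) \tilde{B}_e(r) \, dr & \int_0^1 \tilde{B}_e(r) \tilde{B}_e(r)' \, dr \end{bmatrix}, \tag{A.20} $$
where $\tilde{B}_e(r) = B_e(r) - \int_0^1 B_e(r) \, dr$ is a k-dimensional demeaned Brownian motion
on [0, 1]. Since $\tilde{B}_e(r)$ is also distributed independently of $B_u(r)$, we obtain, as in
(A.4),
$$ q_{\tilde{S}u} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_{\tilde{S}}\right). \tag{A.21} $$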
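Multiplying (A.18) by the diagonal matrix $\mathrm{Diag}(D_{S_T}^{-1}, T^{1/2})$, using (A.19)-(A.21)
and noting that
$$ D_{S_T} S_T' H_T W_T = O_p(1), \qquad T^{-1} W_T' H_T W_T = O_p(1), $$
$$ D_{S_T} S_T' H_T K_T = O_p(1), \qquad T^{-1/2} W_T' H_T K_T = O_p(1), $$
we obtain
$$ D_{S_T}^{-1} s_T \Rightarrow Q_{\tilde{S}}^{-1} q_{\tilde{S}u} \overset{a}{\sim} MN\left(0, \sigma_u^2 Q_{\tilde{S}}^{-1}\right), $$
and therefore,
$$ Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1} s_T \overset{a}{\sim} N\left(0, \sigma_u^2 I_{k+1}\right). \tag{A.22} $$
Finally, by (3.13) and (A.15) we have
$$ \hat{\lambda}_T - \lambda = \frac{s_T}{\hat{\phi}_T(1)}. \tag{A.23} $$
Multiplying (A.23) by $Q_{\tilde{S}_T}^{1/2} D_{S_T}^{-1}$, using (A.22) and noting that $\hat{\phi}_T(1) \xrightarrow{p} \phi(1)$, we
obtain (3.14).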
Proof of Theorem 3.3 can be established in a similar manner and is omitted to
save space.
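Consider the dynamic ARDL(p, m) model (3.19) (or (4.4)), and its static counterpart
(4.1). Applying the decomposition $\phi(L) = \phi(1) + (1 - L)\phi^*(L)$ to (3.19)
we have
$$ y_t = \frac{\alpha_0}{\phi(1)} + \delta t + \theta' x_t + \frac{\pi'(L)}{\phi(1)} \Delta x_t + \frac{\eta_t}{\phi(1)} - \frac{\phi^*(L)}{\phi(1)} \Delta y_t. \tag{A.24} $$
Substituting for $\Delta y_t = \delta + \theta' \Delta x_t + \Delta v_t$ from (4.1) in (A.24), we have
$$ y_t = \mu + \delta t + \theta' x_t + \frac{\pi'(L)}{\phi(1)} \Delta x_t + \frac{\eta_t}{\phi(1)} - \frac{\phi^*(L)}{\phi(1)} \left( \theta' \Delta x_t + \Delta v_t \right). \tag{A.25} $$
Using (A.25), $v_t$ in (4.1) can be expressed as
$$ v_t = \frac{\pi'(L) - \phi^*(L)\theta'}{\phi(1)} \Delta x_t + \frac{\eta_t}{\phi(1)} - \frac{\phi^*(L)}{\phi(1)} \Delta v_t. \tag{A.26} $$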
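Defining $k_t = (\eta_t, v_t, \Delta x_t')' = (\eta_t, v_t, e_t')'$ and
$$ \Psi(L) = \left[ \frac{1}{\phi(1)}, \; -\frac{\phi^*(L)(1 - L)}{\phi(1)}, \; \frac{\pi'(L) - \phi^*(L)\theta'}{\phi(1)} \right], $$
the spectral density of $v_t = \Psi(L) k_t$ is given by
$$ 2\pi f_{vv}(\omega) = \Psi(e^{i\omega}) \, Var(k_t) \, \Psi'(e^{-i\omega}), $$
where
$$ Var(k_t) = \begin{bmatrix} \sigma_\eta^2 & \sigma_{\eta v} & 0 \\ \sigma_{\eta v} & \sigma_v^2 & \Sigma_{ve} \\ 0 & \Sigma_{ve}' & \Sigma_{ee} \end{bmatrix}. $$
Hence, the spectral density of $v_t$ at zero frequency is given by
$$ 2\pi f_{vv}(0) = \frac{\sigma_\eta^2 + [\pi'(1) - \phi^*(1)\theta'] \, \Sigma_{ee} \, [\pi(1) - \phi^*(1)\theta]}{[\phi(1)]^2}. \tag{A.27} $$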
The Phillips-Hansen semi-parametric correction is equivalent to removing the second
part of (A.27), by subtracting the terms involving $\Delta x_t$ from $v_t$. Using (A.26)
we have the following expression for the modified disturbance term, $v_t^+$, in the
Phillips-Hansen procedure:
$$ v_t^+ = v_t - \frac{\pi'(L) - \phi^*(L)\theta'}{\phi(1)} \Delta x_t = \frac{\eta_t}{\phi(1)} - \frac{\phi^*(L)}{\phi(1)} \Delta v_t = \Psi^+(L) k_t^+, $$
where $k_t^+ = (\eta_t, v_t)'$ and $\Psi^+(L) = \left[ \frac{1}{\phi(1)}, \; -\frac{\phi^*(L)(1 - L)}{\phi(1)} \right]$. Therefore, the spectral
density of $v_t^+$ at zero frequency is given by
$$ 2\pi f_{v^+ v^+}(0) = \Psi^+(0) \, Var(k_t^+) \, \Psi^{+\prime}(0) = \frac{\sigma_\eta^2}{[\phi(1)]^2}. $$
Using (4.7) we also have
$$ f_{v^+ v^+}(0) = B f_{\xi\xi}(0) B', $$
where $B = [1, -\Omega_{ve} \Omega_{ee}^{-1}]$. By definition $\Omega_\xi = 2\pi f_{\xi\xi}(0)$, and
$$ 2\pi f_{v^+ v^+}(0) = B \Omega_\xi B' = \omega_{vv} - \Omega_{ve} \Omega_{ee}^{-1} \Omega_{ev} = \frac{\sigma_\eta^2}{[\phi(1)]^2}. $$
Hence, by (4.8), $\omega_{v \cdot e} = \sigma_\eta^2 / [\phi(1)]^2$.
References
[1] Banerjee, A., J. Dolado, D. Hendry and G. Smith (1986), “Exploring Equilibrium
Relationships in Economics through Statistical Models: Some Monte
Carlo Evidence,” Oxford Bulletin of Economics and Statistics, 48: 253-277.
[2] Bardsen, G. (1989), “The Estimation of Long-Run Coefficients from Error
Correction Models,” Oxford Bulletin of Economics and Statistics, 51: 345-350.
[3] Bewley, R. (1979), “The Direct Estimation of the Equilibrium Response in a
Linear Dynamic Model,” Economics Letters, 3: 357-361.
[4] Boswijk, H.P. (1995), “Efficient Inference on Cointegration Parameters in
Structural Error Correction Models,” Journal of Econometrics, 69: 133-158.
[5] Cavanaugh, C.L., G. Elliott and J.H. Stock (1995), “Inference in Models with
Nearly Integrated Regressors,” Econometric Theory, 11: 1131-1147.
[6] Engle, R.F. and C.W.J. Granger (1987), “Cointegration and Error Correction
Representation: Estimation and Testing,” Econometrica, 55: 251-276.
[7] Hendry, D., A. Pagan and J. Sargan (1984), “Dynamic Specifications,” Chapter
18 in Handbook of Econometrics, Vol II (ed., Z. Griliches and M. Intriligator),
North Holland.
[8] Hsiao, C. (1995), “Cointegration and Dynamic Simultaneous Equations
Model,” unpublished manuscript, University of Southern California.
[9] Inder, B. (1993), “Estimating Long Run Relationships in Economics,” Journal
of Econometrics, 57: 53-68.
[10] Johansen, S. (1991), “Estimation and Hypothesis Testing of Cointegrating
Vectors in Gaussian Vector Autoregressive Models,” Econometrica, 59: 1551-80.
[11] Lütkepohl, H. (1991), Introduction to Multiple Time Series Analysis, New
York, N.Y.: Springer-Verlag.
[12] Park, J.Y. (1992), “Canonical Cointegrating Regressions,” Econometrica, 60:
119-143.
[13] Pesaran, M.H. (1997), “The Role of Economic Theory in Modelling the Long-Run,”
The Economic Journal, 107: 178-191.
[14] Pesaran, M.H. and B. Pesaran (1997), Microfit 4.0: Interactive Econometric
Analysis, Oxford University Press (forthcoming).
[15] Pesaran, M.H. and Y. Shin (1995), “Long-Run Structural Modelling,” unpublished
manuscript, University of Cambridge.
[16] Pesaran, M.H., Y. Shin and R.J. Smith (1996), “Testing for the Existence of
a Long-Run Relationship,” DAE Working Papers Amalgamated Series, No.
9622, University of Cambridge.
[17] Phillips, P.C.B. (1991), “Optimal Inference in Cointegrated Systems,” Econometrica,
59: 283-306.
[18] Phillips, P.C.B. and S.N. Durlauf (1986), “Multiple Time Series Regression
with Integrated Processes,” Review of Economic Studies, 53: 473-496.
[19] Phillips, P.C.B. and B. Hansen (1990), “Statistical Inference in Instrumental
Variables Regression with I(1) Processes,” Review of Economic Studies, 57:
99-125.
[20] Phillips, P.C.B. and M. Loretan (1991), “Estimating Long Run Economic
Equilibria,” Review of Economic Studies, 58: 407-436.
[21] Phillips, P.C.B. and V. Solo (1992), “Asymptotics for Linear Processes,” Annals
of Statistics: 971-1001.
[22] Saikkonen, P. (1991), “Asymptotically Efficient Estimation of Cointegration
Regressions,” Econometric Theory, 7: 1-21.
[23] Stock, J.H. (1987), “Asymptotic Properties of Least Squares Estimates of
Cointegrating Vectors,” Econometrica, 55: 1035-1056.
[24] Stock, J.H. and M.W. Watson (1993), “A Simple Estimator of Cointegrating
Vectors in Higher Order Integrated Systems,” Econometrica, 61: 783-820.
[25] Wickens, M.R. and T.S. Breusch (1988), “Dynamic Specification, the Long
Run Estimation of the Transformed Regression Models,” The Economic Journal,
98: 189-205.
Distributed Lag Models and ARDL Models
Christophe Hurlin
April 26, 2019
Abstract
This note gives a brief presentation of distributed lag models in general
and of Autoregressive Distributed Lag (ARDL) models in particular.
The aim is to understand the specific features and advantages of ARDL models by placing
them in perspective relative to dynamic distributed lag models. The
first section presents unrestricted distributed lag models. The second
section is devoted to restricted models (linear, geometric, etc.), and in particular
to Almon polynomial models. The third section presents models with a lagged
dependent variable: Koyck, AR-X, and ARDL models. The last section describes the
estimation procedures for these models in R and SAS.
Keywords: Autoregressive Distributed Lag models, ARDL, distributed lag models,
specification, estimation.
JEL classification: C01, C22, C53.
Université d’Orléans (LEO, FRE CNRS 2014). This note was written as part of the preparation of
students in the ESA master's program at the Université d’Orléans for the 2018 DRIM Game challenge (Deloitte - RCI Bank).
1 Introduction
Distributed lag models are dynamic time series models. Their distinguishing
feature is that the dynamics of the dependent variable y are
explained by contemporaneous and lagged values of one or more explanatory variables
x. The main advantage of these models is that they allow richer dynamics
(compared with a simple linear model without lags on the explanatory variables) for the
marginal effects of the x variables on the dependent variable. One can thus distinguish
short-run dynamic marginal effects, which measure the instantaneous impact of the
contemporaneous variable $x_t$ (or the lagged variable $x_{t-s}$) on $y_t$, from the cumulative long-run
effect of x on the dependent variable y.
In general, a distinction is made between finite and infinite distributed lag models, depending on whether
a finite or an infinite number of lagged values of the explanatory variable is considered.
Obviously, only finite distributed lag models can be estimated
in practice. However, even with a finite and relatively small number
of lags, estimating this type of model by OLS or GLS can be problematic.
Indeed, the lagged values $x_t, x_{t-1}, \ldots, x_{t-q}$ are frequently highly correlated,
which creates a multicollinearity problem in the regression model. The OLS estimates of the
coefficients are then unreliable and may in particular take aberrant
values. Moreover, estimating these models requires large samples, given
the potentially large number of parameters to be estimated, which grows with the number of lags q
considered for the exogenous variable.
To overcome these problems, two types of solutions have been considered in the literature.
The first solution has been to impose restrictions on the coefficients associated with the
lagged values $x_t, x_{t-1}, \ldots, x_{t-q}$ of the explanatory variable (Almon, 1965; Smith and Giles,
1976; Madinier and Mouillart, 1983). This yields restricted distributed lag
models. These restrictions can take very different forms, but
they all aim (i) to limit the number of parameters to be estimated, (ii) to limit
potential near-collinearity problems, and (iii) to produce time profiles of marginal
effects that can be justified on economic grounds. On this last point, the main
prior one may hold about the marginal effects is that the instantaneous effect of the variable
$x_{t-s}$ on the level of $y_t$ decreases with the lag s, though not necessarily uniformly.
Several restricted models have been proposed to meet these three objectives. Examples
include the model with linearly declining lag parameters and the geometric
distributed lag model. But the most widely used model is
undoubtedly the polynomial distributed lag model, or Almon (1965)
model. The idea is to postulate that the parameter associated with the lagged variable $x_{t-s}$
is an (unknown) function of the lag s, and that this function can be approximated by a
polynomial of order p, with generally $p \ll q$. It then suffices to estimate the parameters of this
polynomial to recover the coefficients associated with the lagged variables $x_{t-s}$. One can thus
reduce the dimension of the problem and limit the risk of near-collinearity.
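The polynomial restriction described above can be sketched as follows. The notation is an assumption made for illustration: the lag coefficient is written $\beta_s = \sum_{j=0}^{p} \gamma_j s^j$, and the function names and numerical values are hypothetical. The sketch shows that the distributed lag term $\sum_s \beta_s x_{t-s}$ can be rewritten in terms of only $p + 1$ transformed regressors.

```python
def almon_weights(gamma, q):
    """Lag coefficients beta_s = sum_j gamma[j] * s**j for s = 0..q."""
    return [sum(g * s ** j for j, g in enumerate(gamma)) for s in range(q + 1)]

def almon_regressors(x, q, p):
    """Transformed regressors z_{j,t} = sum_s s**j * x_{t-s}, for j = 0..p.

    Regressing y_t on these p + 1 series estimates gamma instead of the
    q + 1 unrestricted lag coefficients. Rows correspond to t = q, q+1, ...
    """
    rows = []
    for t in range(q, len(x)):
        rows.append([sum(s ** j * x[t - s] for s in range(q + 1))
                     for j in range(p + 1)])
    return rows

# Hypothetical polynomial coefficients (p = 2) and data, for illustration only.
gamma = [4.0, -1.0, 0.05]
x = [1.0, 2.0, 0.0, 3.0, 1.0, 2.0, 4.0, 0.5]
beta = almon_weights(gamma, q=6)        # 7 lag coefficients from 3 parameters
z = almon_regressors(x, q=6, p=2)

# The lag term computed from beta equals the one computed from gamma and z.
t = 6
direct = sum(beta[s] * x[t - s] for s in range(7))
via_polynomial = sum(g * zj for g, zj in zip(gamma, z[0]))
```

Here seven lag coefficients are generated by three parameters, which is the dimension reduction the text refers to.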
The second solution is to introduce lagged values of the dependent variable. This
leads to an AR(p)-type representation for $y_t$, augmented with contemporaneous
and past values of an exogenous variable $x_t$. The simplest example is the Koyck (1954) model.
This very simple linear model explains the level of $y_t$ by a constant, the lagged value
$y_{t-1}$ and the contemporaneous level of an explanatory variable $x_t$. Note that in the
Koyck model no lag is introduced on the explanatory variable $x_t$, which rules out any collinearity
problem. What is the advantage of this model? By inverting the autoregressive polynomial associated
with $y_{t-1}$, one can show that this representation is equivalent to an infinite-dimensional
distributed lag model with geometrically declining weights. Thus, the Koyck model
is equivalent to a representation in which the variable $y_t$ is explained by the variables
$x_t, x_{t-1}, x_{t-2}, x_{t-3}, \ldots, x_{-\infty}$, and yet estimating this model (which simply involves
regressing $y_t$ on $y_{t-1}$ and $x_t$) raises no problem related to the correlation between the lagged
values.
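The equivalence described above can be checked numerically. The sketch below assumes the Koyck model is written $y_t = c + \rho y_{t-1} + \beta x_t + \varepsilon_t$ with $|\rho| < 1$ (the symbols and function names are assumptions made for illustration). Inverting the autoregressive part gives lag weights $\beta \rho^s$ on $x_{t-s}$, which should match the response of the recursion to a one-period unit impulse in x.

```python
def koyck_weights(beta, rho, n_lags):
    """Implied distributed lag weights beta * rho**s for s = 0..n_lags."""
    return [beta * rho ** s for s in range(n_lags + 1)]

def simulate_koyck(c, rho, beta, x, y0):
    """Iterate y_t = c + rho * y_{t-1} + beta * x_t without noise."""
    y = []
    previous = y0
    for xt in x:
        previous = c + rho * previous + beta * xt
        y.append(previous)
    return y

# Hypothetical parameter values, for illustration only.
beta, rho = 2.0, 0.5
weights = koyck_weights(beta, rho, n_lags=4)
# Response to a unit impulse in x at the first date, starting from rest.
impulse_response = simulate_koyck(0.0, rho, beta, [1, 0, 0, 0, 0], y0=0.0)
long_run_multiplier = beta / (1.0 - rho)
```

The impulse response of the two-regressor model reproduces the geometric weights, and the weights sum to $\beta/(1-\rho)$ in the limit.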
In the terminology of Box and Jenkins (1976), the Koyck model is akin to an
AR(1)-X model, where the letter X indicates the presence of the exogenous variable $x_t$ in the
conditional expectation equation of $y_t$. Naturally, this model can be extended to an
AR(p)-X representation, including not just a single lagged value $y_{t-1}$ but p values $y_{t-1}$,
$y_{t-2}, \ldots, y_{t-p}$. However, the Koyck model and its extension have a major drawback
when more than one exogenous variable is considered. In that case, the decay of the lag
coefficients (short-run marginal effects) with the time lag is identical for all
the explanatory variables. For example, the dynamic impacts on $y_t$ of two explanatory
variables $x_{1,t-s}$ and $x_{2,t-s}$ are assumed to evolve in the same way with the lag s. Such a
hypothesis is problematic because it generally corresponds to no theory and to no
empirical observation. The ARDL (autoregressive distributed lag) model addresses
this criticism. Formally, this model allows lags to be introduced both on
the dependent variable and on the exogenous variable. As a result, the marginal effect of the variable $x_t$
on $y_t$ is determined by the ratio of two lag polynomials (hence the alternative name
rational lag model), the first being specific to the variable $x_t$ and the second to the
dependent variable. Consequently, two exogenous variables, associated with two different lag polynomials, do not
necessarily have the same dynamic impact on the endogenous variable.
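The ratio of lag polynomials mentioned above can be expanded recursively. The sketch below assumes the ARDL model is written $\phi(L) y_t = c + B(L) x_t + \varepsilon_t$ with $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ and $B(L) = b_0 + b_1 L + \cdots + b_q L^q$ (notation and function name are assumptions made for illustration). The weights $w_s$ of $B(L)/\phi(L)$ satisfy $w_s = b_s + \sum_{i=1}^{p} \phi_i w_{s-i}$.

```python
def rational_lag_weights(ar, dl, n):
    """Weights w_0..w_n of B(L) / phi(L).

    ar = [phi_1, ..., phi_p] and dl = [b_0, ..., b_q].
    """
    w = []
    for s in range(n + 1):
        value = dl[s] if s < len(dl) else 0.0
        for i, phi in enumerate(ar, start=1):
            if s - i >= 0:
                value += phi * w[s - i]
        w.append(value)
    return w

# Hypothetical example: one autoregressive polynomial shared by two
# exogenous variables that have different lag polynomials.
ar = [0.5]
w_x1 = rational_lag_weights(ar, [1.0], n=4)             # geometric decay
w_x2 = rational_lag_weights(ar, [0.0, 1.0, 0.5], n=4)   # delayed, hump-shaped
long_run_x1 = sum([1.0]) / (1.0 - sum(ar))
long_run_x2 = sum([0.0, 1.0, 0.5]) / (1.0 - sum(ar))
```

With a common autoregressive polynomial, the two variables still have different dynamic profiles because their own lag polynomials differ, which is the flexibility the ARDL specification adds relative to the Koyck model.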
All these models can be estimated fairly easily using various procedures, whether
in SAS, Eviews, Matlab, or R. In this note we give the main
functions for SAS and R.
The note is organized as follows. In a first section, we
present unrestricted distributed lag models. In a second section, we
present restricted models (linear, geometric, etc.), and in particular Almon polynomial
models. The third section is devoted to models with a lagged dependent
variable: Koyck, AR-X, and ARDL models. The last section presents the estimation
procedures for these models in R and SAS.
2 Distributed Lag Models
As stated above, distributed lag models
are dynamic models in which the endogenous variable $y_t$ is explained by
contemporaneous and lagged values of one or more explanatory variables $x_t$. For
simplicity, this note deals only with the case where $x_t$ is a scalar variable, i.e.
the case with a single explanatory variable.
¹ This inversion is sometimes called the "Koyck transformation", as for example in the documentation
of the R package dLagM (Demirhan, 2018).
Definition 1 A linear distributed lag model is written as
$$ y_t = \alpha + B(L) x_t + \varepsilon_t = \alpha + \sum_{s=0}^{q} \beta_s x_{t-s} + \varepsilon_t \tag{1} $$
where $\{\varepsilon_t, t \in \mathbb{Z}\}$ is a weak white noise, L denotes the lag operator, and $B(L)$ is a lag polynomial
of order q with $B(L) = \sum_{s=0}^{q} \beta_s L^s$ and $\beta_q \neq 0$.
A distributed lag model thus resembles an ARMA model, except
that the lags apply to the explanatory variable x rather than to the dependent variable y or to
the innovation $\varepsilon_t$. As in an ARMA model, the parameters $\beta_s$ are called
lag parameters or lag coefficients, and $B(L)$ is called the lag polynomial. The lag parameters
describe how the variable x affects the contemporaneous level of y.
In theory, just as AR($\infty$) models are defined, it is possible to consider
distributed lag models with an infinite number of lags:
$$ y_t = \alpha + B(L) x_t + \varepsilon_t = \alpha + \sum_{s=0}^{\infty} \beta_s x_{t-s} + \varepsilon_t \tag{2} $$
The main difference² between a distributed lag model and a classical static
regression model lies in the interpretation of the marginal effects.
2.1 Marginal Effects
The marginal effects in a classical static regression model $y_t = \alpha + \beta x_t + \varepsilon_t$ are
one-off events. The response of $y_t$ to a change in $x_t$ is assumed to be immediate and
complete by the end of the measurement period. Formally, the marginal effect of x on y is defined
by $\partial y_t / \partial x_t = \beta$. By contrast, in a dynamic model the marginal effect is defined as
the effect of a one-time change in $x_t$ on the equilibrium value of $y_t$.
Consider the case of an infinite distributed lag model and suppose that $x_t$ takes the value $\bar{x}$
for an infinite number of periods. The equilibrium value of $y_t$, denoted $\bar{y}$, is then such that
$$ \bar{y} = E(y_t \mid x_t, x_{t-1}, \ldots) = \alpha + \sum_{s=0}^{\infty} \beta_s \bar{x} \tag{3} $$
Note that this value is finite provided that the parameters $\beta_s$ satisfy
$$ \sum_{s=0}^{\infty} |\beta_s| < \infty \tag{4} $$
Suppose now that the value of the variable x changes in period t. One can then distinguish
its immediate effect on $y_t$ (impact multiplier or short-run multiplier) from its
cumulative effect on the equilibrium value of y. The impact multiplier measures the immediate effect of a
marginal change in $x_t$ on $y_t$. Formally, the short-run dynamic multiplier at lag s is defined by:
$$ \text{Short-run dynamic multiplier} = \frac{\partial y_t}{\partial x_{t-s}} = \frac{\partial y_{t+s}}{\partial x_t} = \beta_s \tag{5} $$
² For a detailed discussion of distributed lag models, their specification and their estimation, see
the survey volume by Dhrymes (1971).
The long-run multiplier is defined by
$$ \text{Long-run multiplier} = \frac{\partial \bar{y}}{\partial \bar{x}} = \sum_{s=0}^{\infty} \beta_s \tag{6} $$
For example, consider a finite distributed lag model of order 2 such that
$$ y_t = 0.5 + 6 x_t - 2 x_{t-1} + 3 x_{t-2} + \varepsilon_t \tag{7} $$
Suppose that the variable x increases temporarily by one unit at date t, then returns
to its initial level at date t + 1. In this case, $y_t$ increases by 6 units at date t, since the
values $x_{t-1}$, $x_{t-2}$ and $\varepsilon_t$ are unchanged and $\partial y_t / \partial x_t = 6$. At date t + 1, the value of $y_{t+1}$
lies 2 units below its initial level since $\partial y_t / \partial x_{t-1} = -2$. Thus the quantity $\partial y_t / \partial x_{t-s}$ measures the
dynamic impact of a marginal change in $x_t$ on the successive values $y_t, y_{t+1}, y_{t+2}$, etc.
Suppose now that the variable $x_t$ increases permanently by one unit from
date t onwards:
$$ x_s = \begin{cases} 0 & \text{if } s < t \\ 1 & \text{if } s \geq t \end{cases} \tag{8} $$
At date t, $y_t$ increases by 6 units, as in the previous case. But at date t + 1,
$y_{t+1}$ increases by $\partial y_t / \partial x_t + \partial y_t / \partial x_{t-1} = 6 - 2 = 4$ units relative to its initial level. The limit of this cumulative effect is
determined by the sum of the lag coefficients, that is
$$ \frac{\partial y_t}{\partial x_t} + \frac{\partial y_t}{\partial x_{t-1}} + \frac{\partial y_t}{\partial x_{t-2}} = 6 - 2 + 3 = 7 \tag{9} $$
The long-run marginal effect of the variable x on the equilibrium value of y is therefore equal to 7
units.
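The worked example based on equation (7) can be reproduced with a few lines of code. This is a minimal sketch: the function name is hypothetical, and responses are measured as deviations of y from its initial level, with the noise term held fixed.

```python
def response(beta, dx):
    """Deviation of y from its initial level, sum_s beta[s] * dx[t - s],
    given the path dx of deviations of x from its initial level."""
    out = []
    for t in range(len(dx)):
        out.append(sum(b * dx[t - s] for s, b in enumerate(beta) if t - s >= 0))
    return out

beta = [6.0, -2.0, 3.0]                     # lag coefficients of equation (7)
transitory = response(beta, [1, 0, 0, 0])   # one-period unit increase in x
permanent = response(beta, [1, 1, 1, 1])    # permanent unit increase in x
long_run_multiplier = sum(beta)
```

The transitory shock traces out the lag coefficients themselves (6, -2, 3, then 0), while the permanent shock accumulates them (6, 4, 7, 7) and settles at the long-run multiplier of 7.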
2.2 Estimation of a Finite-Order Distributed Lag Model
Consider a model with a finite number of lags q (finite distributed lag model), such that
$$ y_t = \alpha + B(L) x_t + \varepsilon_t = \alpha + \sum_{s=0}^{q} \beta_s x_{t-s} + \varepsilon_t \tag{10} $$
As in the case of a simple linear model, the parameters $\beta_s$ can be estimated by
ordinary least squares (OLS) or generalized least squares
(GLS), assuming that the variable x is strictly exogenous. The interpretation of the coefficients
$\beta_s$ follows the analysis of marginal effects presented above. The advantage of this
specification is that no restriction is imposed a priori on the parameters
$\beta_s$, and hence on the dynamic effects of x on y.
However, estimating the parameters of a distributed lag model raises two main problems. The first is multicollinearity. Even when the explanatory variable x is stationary, the values $x_t$ and $x_{t-s}$ are often strongly autocorrelated, especially at low orders. Strong correlation among $x_t, x_{t-1}, x_{t-2}, \dots, x_{t-q}$ translates, in the regression model (10), into a near-multicollinearity problem (footnote 3). Highly correlated regressors can lead to unreliable coefficient estimates (footnote 4), with very large variances and standard errors.

Footnote 3. Multicollinearity in the strict sense means that the regressor matrix $X = (x_t : x_{t-1} : \dots : x_{t-q})$ does not have full rank q+1, i.e. some columns can be written as an exact linear combination of the other columns. Consequently, the matrix $X'X$ is not invertible. Under near-multicollinearity, $X'X$ is invertible but its determinant is very close to 0.

The second problem arises when the lag order q is large relative to the sample size available for estimation. If the sample size is T, the lags leave only T - q observations to estimate the q + 2 parameters of the model (including the constant), i.e. T - 2q - 2 degrees of freedom. Each time q is increased by one, two degrees of freedom are lost: one because an additional parameter must be estimated, and another because the effective sample shrinks by one observation. Estimation may therefore be imprecise if T is small relative to the maximum lag q. There is no absolute rule on the number of degrees of freedom required to guarantee both the convergence of the estimators and the relevance of the asymptotic normality result used for inference. As a rule of thumb, however, estimation results should be interpreted with caution below 50 degrees of freedom. This problem is of course not specific to distributed lag models; it affects all dynamic models (AR, MA, ARIMA, etc.).
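The degrees-of-freedom arithmetic above is easy to check. The sketch below is illustrative only: the function names and the sample size T = 100 are assumptions, and the threshold of 50 is the rule of thumb quoted in the text, not a formal criterion.

```python
def degrees_of_freedom(T, q):
    """Degrees of freedom of the finite distributed lag model (10)."""
    usable_obs = T - q   # observations left once q lags have been built
    n_params = q + 2     # constant plus the q + 1 lag coefficients
    return usable_obs - n_params


def largest_lag(T, min_dof=50):
    """Largest q that keeps at least min_dof degrees of freedom (hypothetical helper)."""
    q = -1
    while degrees_of_freedom(T, q + 1) >= min_dof:
        q += 1
    return q


print(degrees_of_freedom(100, 4))  # T - 2q - 2 with T = 100, q = 4
print(largest_lag(100))            # largest q with at least 50 degrees of freedom
```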
In summary, the finite distributed lag model is appropriate for estimating the dynamic relationship between x and y when (i) the parameters $\beta_s$ decline fairly quickly to zero as s increases, (ii) the explanatory variable $x_t$ is weakly autocorrelated, and (iii) the sample size T is large enough relative to the lag order q.
3 Restricted distributed lag models

Imposing constraints on the parameters of a distributed lag model makes it possible both to impose a regular profile on the decline of the lag parameters $\beta_s$ with s (and hence on the marginal effects) and, above all, to limit the number of parameters to be estimated. Reducing the dimension of the problem (the number of parameters to estimate) also limits the multicollinearity problems that often affect OLS estimation of the finite distributed lag model.
Many forms of constraint can be adopted, each defining a different type of restricted finite distributed lag model. We discuss the three main restricted finite distributed lag models available in R and SAS.
1. The model with linearly declining parameters (linear-declining lag model).
2. The model with a geometric distribution of lags (geometric lag model).
3. The polynomial distributed lag model, also known as the Almon distributed lag model.

Footnote 4. One possible symptom of near-multicollinearity is that the estimated coefficients $\hat\beta_s$ sometimes alternate between very large positive and negative values with no valid economic explanation. Such behaviour may reflect near-multicollinearity, but this is not an absolute rule: it may simply indicate that the roots of the lag polynomial $B(L)$ are complex.
3.1 Model with linearly declining parameters

The idea is that the parameters $\beta_1, \beta_2, \beta_3, \dots, \beta_q$ are linearly declining fractions of the short-run multiplier $\beta_0$. We set
$$\beta_s = \frac{q+1-s}{q+1}\,\beta_0 \qquad s = 1, \dots, q \tag{11}$$
For example, with q = 4 the parameters are $\beta_1 = 4\beta_0/5$, $\beta_2 = 3\beta_0/5$, $\beta_3 = 2\beta_0/5$ and $\beta_4 = \beta_0/5$. The finite distributed lag model of order q then becomes
$$y_t = \alpha + \beta_0 \sum_{s=0}^{q} \frac{q+1-s}{q+1}\, x_{t-s} + \varepsilon_t \tag{12}$$
In this specification only $\alpha$ and $\beta_0$ need to be estimated, and estimation is extremely simple. For a given order q, construct the transformed explanatory variable $z_t$ defined by
$$z_t = \sum_{s=0}^{q} \frac{q+1-s}{q+1}\, x_{t-s} = x_t + \frac{q}{q+1}x_{t-1} + \frac{q-1}{q+1}x_{t-2} + \dots + \frac{1}{q+1}x_{t-q} \tag{13}$$
Then regress $y_t$ on a constant and $z_t$ by OLS or GLS:
$$y_t = \alpha + \beta_0 z_t + \varepsilon_t \tag{14}$$
In this model the long-run cumulative effect is
$$\beta_0 \sum_{s=0}^{q} \frac{q+1-s}{q+1} = \beta_0\left(1 + \frac{q}{2}\right) \tag{15}$$
For example, for q = 4 the long-run cumulative effect is $\beta_0 + \beta_1 + \beta_2 + \beta_3 + \beta_4 = 3\beta_0$. Note that the linear-declining model can be seen as a special case of the polynomial distributed lag model, or Almon (1965) model, in which the weights are a first-degree polynomial in s (see below).
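To illustrate the construction of the transformed regressor in (13), here is a minimal Python sketch. The function names `linear_weights` and `build_z` are hypothetical, and the constant series used as input is made up for the illustration.

```python
def linear_weights(q):
    """Weights (q + 1 - s) / (q + 1) for s = 0, ..., q, as in equation (11)."""
    return [(q + 1 - s) / (q + 1) for s in range(q + 1)]


def build_z(x, q):
    """Transformed regressor z_t of equation (13), for t = q, ..., len(x) - 1."""
    w = linear_weights(q)
    return [sum(w[s] * x[t - s] for s in range(q + 1)) for t in range(q, len(x))]


weights = linear_weights(4)      # 1, 4/5, 3/5, 2/5, 1/5
long_run_factor = sum(weights)   # should equal 1 + q/2, i.e. 3 for q = 4
z = build_z([1.0] * 10, 4)       # with x constant at 1, every z_t equals that factor

print(weights)
print(long_run_factor)
print(z)
```

Regressing $y_t$ on a constant and this single variable is then an ordinary two-parameter regression; q observations are lost at the start of the sample.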
Finally, several variants of this model can be considered. For example, the weights $\beta_s$ can be assumed to increase linearly up to a peak at lag m and then to decline linearly to 0. It suffices to set
$$\beta_s = \beta_m\left(1 - \frac{|m-s|}{m+1}\right) \qquad s = 0, \dots, 2m \tag{16}$$
For example, for m = 3 we obtain $\beta_0 = \beta_3/4$, $\beta_1 = 2\beta_3/4$, $\beta_2 = 3\beta_3/4$, $\beta_3$, $\beta_4 = 3\beta_3/4$, $\beta_5 = 2\beta_3/4$ and $\beta_6 = \beta_3/4$.
3.2 Model with a geometric distribution of lags

An alternative to the linear-declining model is the geometric distributed lag model, in which the lag parameters satisfy the restrictions
$$\beta_s = \beta_0 (1-\lambda)\lambda^s \qquad s = 1, \dots, q \tag{17}$$
where $\lambda$ is a parameter satisfying $0 < \lambda < 1$. The idea is that the relative weights $\omega_s = \beta_s/\beta_0$ decline geometrically: the further the date t - s is from the current date t, the smaller the relative weight $\omega_s$ attached to the explanatory variable $x_{t-s}$.
$$\omega_s = (1-\lambda)\lambda^s \qquad 0 < \omega_s < 1 \tag{18}$$
Definition 2 The finite distributed lag model of order q with a geometric distribution of lags (geometric lag model) is written
$$y_t = \alpha + \beta_0 \sum_{s=0}^{q} (1-\lambda)\lambda^s x_{t-s} + \varepsilon_t \tag{19}$$
This representation can be justified as the reduced form of an expectations model in which $y_t$ depends on the expectation of the future value $x_{t+1}$ formed with the information available at date t. Under adaptive expectations, the reduced form of that model corresponds to equation (19). See Greene (2007) for details.
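As a quick illustration of the relative weights in (18), the following Python sketch computes them for a given decay parameter. The function name `geometric_weights` and the values 0.5 and q = 4 are assumptions chosen for the example.

```python
def geometric_weights(lam, q):
    """Relative weights omega_s = (1 - lam) * lam**s for s = 0, ..., q, equation (18)."""
    if not 0.0 < lam < 1.0:
        raise ValueError("the decay parameter must lie strictly between 0 and 1")
    return [(1.0 - lam) * lam ** s for s in range(q + 1)]


omega = geometric_weights(0.5, 4)
total = sum(omega)  # equals 1 - lam**(q + 1); tends to 1 as q grows

print(omega)
print(total)
```

Because the weights sum to $1 - \lambda^{q+1}$, truncating the distribution at a finite q leaves a small share of the total weight unassigned; that share vanishes as q increases.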
3.3 Polynomial distributed lag model, or Almon polynomial method

The polynomial distributed lag model defines the distribution of the lags by a polynomial function. It is also referred to as Almon lags (Almon, 1965), the Almon polynomial method (Madinier and Mouillart, 1983), or the Almon polynomial model. Whatever its name, this approach has two advantages: it reduces the collinearity among the explanatory variables, and it is flexible enough to reproduce different time profiles of the lag coefficients, and hence of the dynamic marginal effects.
Specification. The basic idea of Shirley Almon (1965) is to invoke the Weierstrass approximation theorem (1885). This theorem states that any continuous function defined on a closed interval can be uniformly approximated, to arbitrary precision, by a polynomial of some degree p. Note that the theorem does not say what the value of p should be, which is one of the limitations of Almon's approach. Applied to distributed lag models, this amounts to postulating that the lag parameters $\beta_s$ are unknown functions $g(s)$ of the time lag s:
$$\beta_s = g(s) \qquad s = 1, \dots, q \tag{20}$$
It is then always possible to approximate this function by a polynomial of order p:
$$\beta_s = g(s) \simeq \gamma_0 + \gamma_1 s + \gamma_2 s^2 + \dots + \gamma_p s^p \tag{21}$$
This yields a polynomial distributed lag model.
Definition 3 The Almon polynomial model postulates a restriction on the lag parameters $\beta_s$ of the form
$$y_t = \alpha + \sum_{s=0}^{q} \beta_s x_{t-s} + \varepsilon_t \tag{22}$$
$$\beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2 + \dots + \gamma_p s^p \qquad s = 0, 1, \dots, q \tag{23}$$
where the parameters $\gamma_j$, $j = 0, \dots, p$, are real constants with $\gamma_p \neq 0$. The polynomial distributed lag model then becomes
$$y_t = \alpha + \gamma_0 \sum_{s=0}^{q} x_{t-s} + \gamma_1 \sum_{s=0}^{q} s\,x_{t-s} + \dots + \gamma_p \sum_{s=0}^{q} s^p x_{t-s} + \varepsilon_t \tag{24}$$
A common specification of Almon lags is the quadratic function, obtained for p = 2, with $\beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2$. As Figure 1 shows, the quadratic function produces lag coefficient profiles $\beta_s$ that are varied enough to capture a wide range of configurations of the marginal effects.
Figure 1: Examples of lag coefficients for p = 2 and q = 5
[Two panels plotting the coefficient $\beta_s$ against s = 0, ..., 5. First panel: $\gamma_0 = 1$, $\gamma_1 = -0.5$, $\gamma_2 = 0.8$. Second panel: $\gamma_0 = 0.2$, $\gamma_1 = 1.2$, $\gamma_2 = -0.2$.]
Estimation. Estimating an Almon model is very simple. For a given lag order q and a given degree p of the Almon polynomial, construct the following transformed explanatory variables:
$$z_{0,t} = \sum_{s=0}^{q} x_{t-s} \quad z_{1,t} = \sum_{s=0}^{q} s\,x_{t-s} \quad z_{2,t} = \sum_{s=0}^{q} s^2 x_{t-s} \quad \dots \quad z_{p,t} = \sum_{s=0}^{q} s^p x_{t-s} \tag{25}$$
Then consider the linear regression model
$$y_t = \alpha + \gamma_0 z_{0,t} + \gamma_1 z_{1,t} + \dots + \gamma_p z_{p,t} + \varepsilon_t \tag{26}$$
The parameters $\alpha, \gamma_0, \gamma_1, \dots, \gamma_p$ can be estimated by OLS or GLS. From the estimates $\hat\gamma_0, \hat\gamma_1, \dots, \hat\gamma_p$, the estimators $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_q$ of the lag coefficients are recovered through the polynomial function
$$\hat\beta_s = \hat\gamma_0 + \hat\gamma_1 s + \hat\gamma_2 s^2 + \dots + \hat\gamma_p s^p \qquad s = 0, 1, \dots, q \tag{27}$$
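Consider the example of a model with q = 4 lags and an Almon polynomial of order p = 2:
$$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \beta_3 x_{t-3} + \beta_4 x_{t-4} + \varepsilon_t \tag{28}$$
$$\beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2 \qquad s = 0, 1, 2, 3, 4 \tag{29}$$
Construct the three variables $z_{0,t}$, $z_{1,t}$ and $z_{2,t}$ such that
$$z_{0,t} = x_t + x_{t-1} + x_{t-2} + x_{t-3} + x_{t-4} \tag{30}$$
$$z_{1,t} = x_{t-1} + 2x_{t-2} + 3x_{t-3} + 4x_{t-4} \tag{31}$$
$$z_{2,t} = x_{t-1} + 4x_{t-2} + 9x_{t-3} + 16x_{t-4} \tag{32}$$
Then estimate the parameters of the linear model
$$y_t = \alpha + \gamma_0 z_{0,t} + \gamma_1 z_{1,t} + \gamma_2 z_{2,t} + \varepsilon_t \tag{33}$$
by OLS or GLS, which gives
$$\hat\beta_0 = \hat\gamma_0 \qquad \hat\beta_1 = \hat\gamma_0 + \hat\gamma_1 + \hat\gamma_2 \qquad \hat\beta_2 = \hat\gamma_0 + 2\hat\gamma_1 + 4\hat\gamma_2 \tag{34}$$
and similarly $\hat\beta_3 = \hat\gamma_0 + 3\hat\gamma_1 + 9\hat\gamma_2$ and $\hat\beta_4 = \hat\gamma_0 + 4\hat\gamma_1 + 16\hat\gamma_2$. The fitted value of $y_t$ can then be written either in terms of the transformed variables $z_{j,t}$ or in terms of the lagged explanatory variables $x_{t-s}$:
$$\hat y_t = \hat\alpha + \hat\gamma_0 z_{0,t} + \hat\gamma_1 z_{1,t} + \hat\gamma_2 z_{2,t}$$
$$= \hat\alpha + \hat\gamma_0 x_t + (\hat\gamma_0 + \hat\gamma_1 + \hat\gamma_2)x_{t-1} + (\hat\gamma_0 + 2\hat\gamma_1 + 4\hat\gamma_2)x_{t-2} + (\hat\gamma_0 + 3\hat\gamma_1 + 9\hat\gamma_2)x_{t-3} + (\hat\gamma_0 + 4\hat\gamma_1 + 16\hat\gamma_2)x_{t-4} \tag{35}$$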
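The construction of the transformed variables and the mapping from polynomial parameters back to lag coefficients can be sketched in a few lines of Python. This is an illustration only: the function names `almon_z` and `almon_betas`, the toy series and the parameter values are assumptions, and the regression step itself (OLS on the transformed variables) is left out.

```python
def almon_z(x, q, p):
    """Rows [z_0t, ..., z_pt] of the transformed variables, for t = q, ..., len(x) - 1."""
    return [
        [sum((s ** j) * x[t - s] for s in range(q + 1)) for j in range(p + 1)]
        for t in range(q, len(x))
    ]


def almon_betas(gamma, q):
    """Lag coefficients beta_s = gamma_0 + gamma_1 s + ... + gamma_p s^p, s = 0, ..., q."""
    return [sum(g * s ** j for j, g in enumerate(gamma)) for s in range(q + 1)]


x = [1, 2, 3, 4, 5, 6]   # toy series, made up for the example
z = almon_z(x, q=4, p=2)  # one row per usable date (t = 4 and t = 5)

gamma = [2, -1, 1]                  # hypothetical polynomial parameters
betas = almon_betas(gamma, q=4)     # implied lag coefficients

# The two ways of writing the fitted value (without the constant) agree at t = 4:
via_z = sum(g * zj for g, zj in zip(gamma, z[0]))
via_x = sum(b * x[4 - s] for s, b in enumerate(betas))

print(z)
print(betas)
print(via_z, via_x)
```

Whatever the lag length q, the regression involves only p + 1 slope parameters, which is the source of the gain in degrees of freedom.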
The estimated distribution of the lag parameters can sometimes look counter-intuitive. For instance, the lag coefficients may move away from zero at the end of the distribution or take negative values in the middle. An implausible estimated lag distribution may be evidence of misspecification and should not be ignored. If one nevertheless wishes to keep the specification, the coefficients $\beta_s$ can be forced to have certain properties by imposing constraints on the parameters of the polynomial function. For example, consider a quadratic function (p = 2) and suppose we want the weights $\beta_s$ to converge smoothly to zero and to vanish at lag q + 1, as was the case for the linear lags discussed above. We then impose the constraint
$$\beta_{q+1} = \gamma_0 + \gamma_1 (q+1) + \gamma_2 (q+1)^2 = 0 \iff \gamma_0 = -\gamma_1 (q+1) - \gamma_2 (q+1)^2$$
Imposing this constraint on the parameters of the polynomial function during estimation yields estimated lag coefficients $\hat\beta_s$ that decline gradually towards 0 as the lag s approaches the maximum order q. For a more thorough discussion of the choice of the polynomial degree p, its implications for the profiles of the lag coefficients $\beta_s$, and the various restrictions that can be imposed on these parameters, see Smith and Giles (1976) (footnote 5).
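A short sketch of the endpoint constraint for the quadratic case follows. Substituting the constraint into the polynomial gives $\beta_s = \gamma_1 (s - (q+1)) + \gamma_2 (s^2 - (q+1)^2)$, so only two polynomial parameters remain free. The function name and the parameter values below are hypothetical, chosen so that the weights decline from lag 1 onwards.

```python
def constrained_betas(gamma1, gamma2, q):
    """Quadratic Almon weights beta_0, ..., beta_{q+1} under the constraint beta_{q+1} = 0.

    gamma_0 is not free: it is implied by gamma_1, gamma_2 and q.
    """
    gamma0 = -gamma1 * (q + 1) - gamma2 * (q + 1) ** 2
    return [gamma0 + gamma1 * s + gamma2 * s ** 2 for s in range(q + 2)]


# Hypothetical values for illustration: gamma_1 = 1, gamma_2 = -1, q = 4
betas = constrained_betas(1, -1, 4)
print(betas)  # last element is the weight at lag q + 1
```

In a regression, the same substitution means replacing the three transformed variables by the two combinations $z_{1,t} - (q+1)z_{0,t}$ and $z_{2,t} - (q+1)^2 z_{0,t}$.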
The Almon polynomial method is therefore very easy to use. Its drawback is that it requires an a priori choice not only of the number of lags q but also of the degree p of the polynomial. The choice of the latter is particularly delicate, and a wrong specification can introduce a substantial bias in the estimates of some coefficients.

Footnote 5. For an application of Almon lags outside the context of distributed lag models, see for example Banulescu, Candelon, Hurlin and Laurent (2016).
4 Models with a lagged dependent variable

The idea behind models with a lagged dependent variable is similar to that of AR and ARIMA models: one or more lagged values of y are used as determinants of the current value $y_t$. The simplest case is the Koyck model, which is based only on the lagged value $y_{t-1}$ and the current value of the explanatory variable $x_t$. By inverting the autoregressive polynomial, this model can be shown to have an equivalent representation as an infinite distributed lag model with geometric decline.

4.1 Koyck model and extensions

The model with a first-order autoregressive lag is often called the Koyck model, in recognition of the seminal application of this model to the macroeconomic investment function by Koyck (1954). It corresponds to an AR(1)-X model in the terminology of Box and Jenkins (1976) (footnote 6).
Definition 4 The Koyck (1954) model is defined by the relation
$$y_t = \mu + \phi y_{t-1} + \beta_0 x_t + v_t \tag{36}$$
where $v_t$ is a weak white noise and the parameter $\phi$ satisfies $|\phi| < 1$.
This model can be written as
$$(1 - \phi L)\,y_t = \mu + \beta_0 x_t + v_t \tag{37}$$
where L denotes the lag operator. Inverting the polynomial $(1 - \phi L)$ gives
$$y_t = \frac{\mu}{1-\phi} + \frac{\beta_0}{1 - \phi L}\,x_t + \frac{1}{1 - \phi L}\,v_t \tag{38}$$
Recall that if $|\phi| < 1$, then $(1 - \phi L)^{-1} = 1 + \phi L + \phi^2 L^2 + \phi^3 L^3 + \dots = \sum_{s=0}^{\infty} \phi^s L^s$. The equation can therefore be rewritten as
$$y_t = \frac{\mu}{1-\phi} + \beta_0 \sum_{s=0}^{\infty} \phi^s x_{t-s} + \sum_{s=0}^{\infty} \phi^s v_{t-s} \tag{39}$$
Equation (39) is immediately seen to be a restricted distributed lag model similar to those discussed in the previous section.
Proposition 1 The Koyck model can be rewritten as a restricted infinite distributed lag model with geometric decline:
$$y_t = \alpha + B(L)x_t + \varepsilon_t = \alpha + \beta_0 \sum_{s=0}^{\infty} \phi^s x_{t-s} + \varepsilon_t \tag{40}$$
with $\alpha = \mu/(1-\phi)$, $\varepsilon_t = \sum_{s=0}^{\infty} \phi^s v_{t-s}$, $B(L) = \beta_0 (1 - \phi L)^{-1} = \sum_{s=0}^{\infty} \beta_s L^s$, and $\beta_s = \beta_0 \phi^s$.
Footnote 6. The X added to the acronyms AR, MA, ARMA or ARIMA means that one or more explanatory variables, assumed exogenous, are added to the model equation. In an ARIMA-X model there is no auxiliary equation describing the dynamics of these exogenous X variables, unlike VAR models, which postulate joint (endogenous) dynamics.
A Koyck model is therefore an infinite distributed lag model with a geometric distribution of lags, defined implicitly by inverting the autoregressive lag polynomial $1 - \phi L$. Recall that a model with an (infinite) geometric distribution of lags is written
$$y_t = \alpha + \tilde\beta_0 (1-\lambda) \sum_{s=0}^{\infty} \lambda^s x_{t-s} + \varepsilon_t \tag{41}$$
Setting $\beta_0 = \tilde\beta_0 (1-\lambda)$ and $\phi = \lambda$ gives back the Koyck model.
Specification and marginal effects in the Koyck model. A few remarks on the specification of the Koyck model and the associated marginal effects. The dynamic marginal effect is
$$\frac{\partial y_t}{\partial x_{t-s}} = \beta_s = \beta_0 \phi^s \tag{42}$$
If $0 < \phi < 1$, this effect declines exponentially towards 0 from an initial value of $\beta_0$. The long-run cumulative effect of x on y is
$$\beta_0 \sum_{s=0}^{\infty} \phi^s = \frac{\beta_0}{1-\phi} \tag{43}$$
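The dynamic and long-run effects of the Koyck model can be checked numerically. In the sketch below the function name and the parameter values (an impact multiplier of 2 and an autoregressive coefficient of 0.5) are assumptions made for the example.

```python
def koyck_multipliers(beta0, phi, horizon):
    """Dynamic marginal effects beta0 * phi**s for s = 0, ..., horizon."""
    if abs(phi) >= 1:
        raise ValueError("the autoregressive coefficient must satisfy |phi| < 1")
    return [beta0 * phi ** s for s in range(horizon + 1)]


beta0, phi = 2.0, 0.5                       # hypothetical parameter values
effects = koyck_multipliers(beta0, phi, 3)  # effects at lags 0, 1, 2, 3
long_run = beta0 / (1.0 - phi)              # closed-form long-run effect
partial_sum = sum(koyck_multipliers(beta0, phi, 60))  # truncated cumulative effect

print(effects)
print(long_run, partial_sum)
```

The truncated cumulative sum approaches the closed-form long-run effect as the horizon grows, at a speed governed by the autoregressive coefficient.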
One of the main limitations of the Koyck specification is its lack of flexibility when more than one explanatory variable is considered. Consider a Koyck model with two explanatory variables $x_{1t}$ and $x_{2t}$:
$$y_t = \mu + \phi y_{t-1} + \beta_0 x_{1t} + \delta_0 x_{2t} + v_t \tag{44}$$
The dynamic marginal effects of $x_{1t}$ and $x_{2t}$ on $y_t$ are then
$$\frac{\partial y_t}{\partial x_{1,t-s}} = \beta_0 \phi^s \qquad \frac{\partial y_t}{\partial x_{2,t-s}} = \delta_0 \phi^s \tag{45}$$
The Koyck model thus forces the marginal effects of $x_{1t}$ and $x_{2t}$ to decay (with the time lag) at exactly the same rate. Such symmetry in the time profile of the dynamic responses of y to the different explanatory variables can be problematic. This is the main justification for ARDL models (see below): introducing a lag polynomial specific to each explanatory variable allows the time dynamics of the marginal effects of $x_{1t}$ and $x_{2t}$ to differ.
Estimation of the Koyck model. Estimating the parameters of the Koyck model can raise several problems, which are standard for autoregressive models. First, by definition the lagged dependent variable $y_{t-1}$ cannot be strictly exogenous, so the assumptions required for the Gauss-Markov theorem cannot hold.
Even the weak exogeneity of $y_{t-1}$ can be problematic. The variable $y_{t-1}$ depends on $v_{t-1}$, since $y_{t-1} = \mu + \phi y_{t-2} + \beta_0 x_{t-1} + v_{t-1}$. Hence, if the error term $v_t$ is autocorrelated, $\mathrm{cov}(v_t, y_{t-1}) \neq 0$ and the OLS estimator is no longer consistent. This is why the Koyck model was defined under the assumption that the error term is a weak white noise, which implies $\mathrm{cov}(v_t, v_{t-1}) = 0$. The absence of autocorrelation in the residuals of the Koyck model should therefore be checked, because such autocorrelation undermines the weak exogeneity of the regressor $y_{t-1}$. This runs into a circularity problem, however: to test for the absence of autocorrelation in the error term $v_t$ (and hence the weak exogeneity of $y_{t-1}$ and, ultimately, the consistency of the OLS estimator), one needs the residuals $\hat v_t$, which are built from OLS estimators that may themselves be inconsistent.
For this reason, the Koyck model and its extensions (ARDL, AR-X) are sometimes estimated by instrumental variables to account for the endogeneity of $y_{t-1}$. This is typically the case in R, with the function koyckDlm of the dLagM package (Demirhan, 2018).
Extension of the Koyck model. A natural extension of the Koyck model is the AR(p)-X model, which includes the lagged values of the dependent variable for lags 1 to p. The model is simply
$$y_t = \mu + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \beta_0 x_t + v_t \tag{46}$$
For p = 1 we of course recover the Koyck model. The AR(p)-X model can be written more concisely with a lag polynomial.
Definition 5 The AR(p)-X model is defined by the relation
$$\Phi(L)\,y_t = \mu + \beta_0 x_t + v_t \tag{47}$$
where $v_t$ is a weak white noise and the polynomial $\Phi(L)$ satisfies $\Phi(L) = 1 - \sum_{s=1}^{p} \phi_s L^s$, with $\phi_p \in \mathbb{R}^*$. All the roots of $\Phi(L)$ are assumed to lie outside the unit circle.
The condition on the roots of $\Phi(L)$ generalises the condition $|\phi| < 1$ of the Koyck model. For example, consider the AR(2)-X model
$$y_t = \frac{5}{8}\,y_{t-1} - \frac{1}{16}\,y_{t-2} + x_t + v_t \tag{48}$$
The autoregressive polynomial is $\Phi(L) = 1 - (5/8)L + (1/16)L^2$. Its roots, such that $\Phi(\lambda_1) = \Phi(\lambda_2) = 0$, are $\lambda_1 = 2$ and $\lambda_2 = 8$. Their modulus (their absolute value, for real roots) is greater than one. Both roots are therefore outside the unit circle, which guarantees the stability of the model.
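The root condition can be verified directly for the AR(2) example (48). The following Python sketch solves the quadratic with the standard library; the function name `ar2_roots` is an assumption, and the approach only covers the second-order case.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of Phi(z) = 1 - phi1 * z - phi2 * z**2 (requires phi2 != 0)."""
    a, b, c = -phi2, -phi1, 1.0            # a z^2 + b z + c = 0
    disc = cmath.sqrt(b * b - 4 * a * c)   # complex square root handles complex roots
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)


# Coefficients of example (48): y_t = (5/8) y_{t-1} - (1/16) y_{t-2} + x_t + v_t
roots = ar2_roots(5 / 8, -1 / 16)
moduli = sorted(abs(r) for r in roots)
is_stable = all(m > 1 for m in moduli)     # all roots outside the unit circle

print(roots)
print(moduli, is_stable)
```

Working with moduli rather than the roots themselves means the same check applies when the roots are complex conjugates.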
When this stability condition is not satisfied, a marginal change in x can lead to an explosive change in y. In other words, the dynamic response of y to a shock to x is explosive. One solution is then to difference y and to postulate a new AR(p-1)-X model for the change $\Delta y = (1 - L)y$ rather than for the level of y.
4.2 ARDL models

ARDL (Autoregressive Distributed Lag) models are a generalisation of the Koyck and AR-X models.
Specification and estimation of ARDL models. An ARDL(p, q) model is written
$$y_t = \mu + \sum_{s=1}^{p} \phi_s y_{t-s} + \sum_{s=0}^{q} \theta_s x_{t-s} + v_t \tag{49}$$
For example, an ARDL(2,1) model is
$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \theta_0 x_t + \theta_1 x_{t-1} + v_t \tag{50}$$
The ARDL(p, q) model can be written more concisely with two lag polynomials: one for the lags of the dependent variable y (the autoregressive polynomial) and one for the lags of the explanatory variable x.
Definition 6 The ARDL(p, q) model is defined by the relation
$$\Phi(L)\,y_t = \mu + \Theta(L)\,x_t + v_t \tag{51}$$
where $v_t$ is a weak white noise. The polynomial $\Phi(L)$ satisfies $\Phi(L) = 1 - \sum_{s=1}^{p} \phi_s L^s$, with $\phi_p \in \mathbb{R}^*$. All the roots of $\Phi(L)$ are assumed to lie outside the unit circle. The polynomial $\Theta(L)$ is defined by $\Theta(L) = \sum_{s=0}^{q} \theta_s L^s$, with $\theta_q \in \mathbb{R}^*$.
As for the Koyck model, this model can be rewritten as a restricted distributed lag model by inverting the polynomial $\Phi(L)$:
$$y_t = \frac{\mu}{\Phi(1)} + \frac{\Theta(L)}{\Phi(L)}\,x_t + \Phi(L)^{-1} v_t = \alpha + B(L)\,x_t + \varepsilon_t \tag{52}$$
with $\alpha = \mu/\Phi(1)$ and $\varepsilon_t = \Phi(L)^{-1} v_t$. This formulation explains why the ARDL model is sometimes called the rational lag model (footnote 7) (Jorgenson, 1966). Determining the terms of the polynomial $B(L)$ requires inverting the polynomial $\Phi(L)$. Several methods exist for this (see Appendix A).
In this model, the long-run cumulative effect of x on y is
$$B(1) = \sum_{s=0}^{\infty} \beta_s = \frac{\Theta(1)}{\Phi(1)} \tag{53}$$
As for the Koyck and AR-X models, the condition that all the roots of $\Phi(L)$ lie outside the unit circle guarantees that the dynamic effect (long-run cumulative effect) of x on y is non-explosive (footnote 8). This condition should be checked carefully on estimated models; otherwise the dynamic effects obtained may be at odds with economic reality. If the condition does not hold, $y_t$ should be differenced and a new ARDL model fitted to $\Delta y_t = (1 - L)y_t$.
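The coefficients of the rational lag polynomial can be obtained recursively from the identity $\Phi(L)B(L) = \Theta(L)$, which gives $\beta_j = \theta_j + \sum_{i=1}^{\min(j,p)} \phi_i \beta_{j-i}$ with $\theta_j = 0$ for j > q. The Python sketch below applies this recursion; the symbol $\theta$ for the coefficients on x, the function names and the parameter values are assumptions made for the illustration.

```python
def ardl_lag_weights(phi, theta, n):
    """First n coefficients of B(L) = Theta(L) / Phi(L).

    phi   = [phi_1, ..., phi_p]        (autoregressive coefficients)
    theta = [theta_0, ..., theta_q]    (coefficients on x_t, ..., x_{t-q})
    """
    b = []
    for j in range(n):
        value = theta[j] if j < len(theta) else 0.0
        value += sum(phi[i - 1] * b[j - i] for i in range(1, min(j, len(phi)) + 1))
        b.append(value)
    return b


def long_run_effect(phi, theta):
    """Theta(1) / Phi(1); meaningful only when the model is stable."""
    return sum(theta) / (1.0 - sum(phi))


phi, theta = [0.5], [1.0, 0.5]             # hypothetical ARDL(1,1) parameters
weights = ardl_lag_weights(phi, theta, 4)  # marginal effects at lags 0, 1, 2, 3
total = long_run_effect(phi, theta)

print(weights)
print(total, sum(ardl_lag_weights(phi, theta, 80)))
```

With a single coefficient on x and one autoregressive lag, the recursion reduces to the geometric weights of the Koyck model.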
Footnote 7. Recall that a rational number is a number that can be expressed as the quotient of two integers. By analogy, the lag polynomial $B(L)$ is here the ratio of two polynomials:
$$B(L) = \frac{\Theta(L)}{\Phi(L)}$$

Footnote 8. For more details, see Greene (2007), chapter 19, section 19.4.3, on the stability of a dynamic equation.

As for ARMA models, there are several (non-exclusive) ways of choosing the maximum lags p and q of ARDL models:
• Testing the significance of the parameters $\phi_p$ and $\theta_q$. If the null hypothesis that $\phi_p$ (respectively $\theta_q$) is zero is not rejected, the order p (respectively q) should be reduced.
• Using information criteria such as AIC and BIC. The best specification of the maximum lags (p, q) is the one that minimises the information criteria, i.e. that minimises the MSE of the model with as few estimated parameters as possible.
• Using residual autocorrelation tests such as the Breusch-Godfrey LM test or the Box-Ljung Q-test. For a given order (p, q), estimate the model and the residuals $\hat v_t$. If the null hypothesis of no autocorrelation is rejected, increase the order p of the autoregressive polynomial by one, and repeat until the residuals are compatible with the white noise hypothesis.
Forecasting with an ARDL model. Consider an ARDL(p, q) model
$$y_t = \mu + \sum_{s=1}^{p} \phi_s y_{t-s} + \sum_{s=0}^{q} \theta_s x_{t-s} + v_t \tag{54}$$
To simplify, set
$$\mu_t = \mu + \sum_{s=0}^{q} \theta_s x_{t-s} \tag{55}$$
The ARDL model can then be rewritten in a form equivalent to an AR(p):
$$y_t = \mu_t + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + v_t \tag{56}$$
Conditional on the information set available up to date T, denoted $\Omega_T$, and on a forecast $\hat x_{T+1|T}$ of the explanatory variable, the forecast of the dependent variable at horizon h = 1 is defined by
$$\hat y_{T+1|T} = \hat\mu_{T+1|T} + \phi_1 y_T + \dots + \phi_p y_{T-p+1} \tag{57}$$
$$\hat\mu_{T+1|T} = \mu + \theta_0 \underbrace{\hat x_{T+1|T}}_{\text{forecast}} + \theta_1 x_T + \dots + \theta_q x_{T-q+1} \tag{58}$$
In practice, the forecast error $y_{T+1} - \hat y_{T+1|T}$ has three sources:
1. The parameters $\mu, \phi_i, \theta_j$ are unknown and must be estimated, which generates an estimation error.
2. The future value of the explanatory variable $x_{T+1}$ is unknown. It must be forecast, which induces a forecast error $x_{T+1} - \hat x_{T+1|T}$ that feeds into the forecast error on $y_{T+1}$.
3. By definition, the white noise error component $v_{T+1}$ cannot be forecast, since $E(v_{T+1} \mid \Omega_T) = 0$.
In general the second source of uncertainty is neglected, because the form and properties of the forecast error on $x_{T+1}$ are unknown. The asymptotic variance of the forecast $\hat y_{T+1|T}$, and hence of the forecast error on $y_{T+1}$, depends in the usual way on the variance-covariance matrix of the estimated parameters and on the variance of the error term $v_t$. For more details, see Greene (2007).
The same reasoning applies to any horizon $h \ge 1$. For example, for horizon h = 2, the dynamic forecast of $y_{T+2}$ conditional on the information $\Omega_T$ available at date T becomes
$$\hat y_{T+2|T} = \hat\mu_{T+2|T} + \phi_1 \hat y_{T+1|T} + \phi_2 y_T + \dots + \phi_p y_{T-p+2} \tag{59}$$
$$\hat\mu_{T+2|T} = \mu + \theta_0 \underbrace{\hat x_{T+2|T}}_{\text{forecast}} + \theta_1 \underbrace{\hat x_{T+1|T}}_{\text{forecast}} + \theta_2 x_T + \dots + \theta_q x_{T-q+2} \tag{60}$$
In this case the forecast $\hat y_{T+2|T}$ requires the forecasts of x at horizons h = 1 and h = 2, denoted $\hat x_{T+1|T}$ and $\hat x_{T+2|T}$. The forecasting procedures for ARDL models in R and SAS therefore require the user to supply the forecasts $\hat x_{T+1|T}, \hat x_{T+2|T}, \dots, \hat x_{T+h|T}$ for all the (exogenous) explanatory variables of the model. The user must either build auxiliary models to produce these forecasts or make scenarios for these future values.
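The recursion behind these dynamic forecasts can be sketched as follows. This is a simplified illustration, not the procedure used by R or SAS: parameters are treated as known, the future values of x are supplied by the user, and the function name, argument names and numerical values are all assumptions.

```python
def ardl_forecast(mu, phi, theta, y_hist, x_hist, x_future):
    """Dynamic forecasts y_{T+1|T}, ..., y_{T+h|T} of an ARDL(p, q) model.

    phi = [phi_1, ..., phi_p], theta = [theta_0, ..., theta_q].
    y_hist and x_hist hold observations up to date T (same length, at least max(p, q)).
    x_future holds the user-supplied forecasts of x for dates T+1, ..., T+h.
    """
    y = list(y_hist)
    x = list(x_hist) + list(x_future)
    T = len(y_hist)
    for h in range(len(x_future)):
        t = T + h
        value = mu
        # Lagged y: observed values up to T, earlier forecasts beyond T
        value += sum(phi[s - 1] * y[t - s] for s in range(1, len(phi) + 1))
        # Current and lagged x: forecasts beyond T, observed values up to T
        value += sum(theta[s] * x[t - s] for s in range(len(theta)))
        y.append(value)
    return y[T:]


# Hypothetical ARDL(1,1) with made-up data and a two-step scenario for x
forecasts = ardl_forecast(mu=1.0, phi=[0.5], theta=[2.0, 1.0],
                          y_hist=[4.0], x_hist=[1.0], x_future=[1.0, 2.0])
print(forecasts)
```

The second forecast uses the first one in place of the unobserved value of y, which is what makes the forecast dynamic.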
5 Applications

We briefly discuss how these models can be applied in R and in SAS.

5.1 Implementation in R

The various distributed lag models can be estimated in R with the dLagM package (Demirhan, 2018). The package provides functions to estimate the parameters of the following models:
• Finite distributed lag models: function dlm
• Polynomial (Almon) distributed lag models: function polyDlm
• Geometric distributed lag models, with or without the Koyck transformation: function koyckDlm. Recall that a model with the Koyck transformation is equivalent to an infinite distributed lag model.
• Autoregressive distributed lag (ARDL) models: function ardlDlm
The forecast function of the package produces forecasts at a horizon h from any of these models. The forecasts are obtained by supplying, as an option, the predicted values of the exogenous explanatory variable(s) $\hat x_{t+1|t}, \hat x_{t+2|t}, \dots, \hat x_{t+h|t}$. Confidence intervals are obtained by Monte Carlo simulation under the assumption that the white noise $\varepsilon_t$ is normal. Finally, the function finiteDLMauto determines the optimal lag q of the lag polynomial on the explanatory variable according to different criteria (MASE, BIC or AIC). This function applies only to the model without an autoregressive component.
5.2 Implementation in SAS

In SAS, distributed lag models and their extensions can be estimated with the procedure PROC PDLREG. For more details, see SAS (2014). This procedure is mainly devoted to the estimation of Almon polynomial lag models. It can be extended to include the lagged dependent variable $y_{t-1}$ through the LAGDEP option. Note that in this case one obtains a Koyck model in which the constraints on the lag polynomial of the explanatory variables are determined by the Almon polynomials. The procedure is therefore much more specialised than the dLagM package in R. It does, however, make it easy to impose restrictions when estimating the parameters of the Almon polynomial.
The estimated coefficients $\hat\gamma_i$ and $\hat\beta_s$ are displayed automatically, as the figures below show (footnote 9). In this example, the dependent variable m is regressed on three explanatory variables (y, r and p) and one lagged value $m_{t-1}$. For the variable $y_t$ we consider q = 3 lags, i.e. the regressors $y_t$, $y_{t-1}$, $y_{t-2}$ and $y_{t-3}$ are included. The four associated parameters $\beta_0, \beta_1, \beta_2, \beta_3$ are determined by a polynomial of degree 3, of the form
$$\beta_s = \gamma_0 + \gamma_1 s + \gamma_2 s^2 + \gamma_3 s^3 \qquad s = 0, \dots, 3$$
The estimated parameters $\hat\gamma_0, \hat\gamma_1, \hat\gamma_2$ and $\hat\gamma_3$ are reported in Figure 2. The estimated parameters $\hat\beta_0, \hat\beta_1, \hat\beta_2, \hat\beta_3$ are reported in Figure 3.
Figure 2: Estimation of an Almon polynomial lag model

Footnote 9. The statements for this model are:
    proc pdlreg data=a;
    model m = lagm y(5,3) r(2, , ,first) p(3,2) / lagdep=lagm;
    run;

Figure 3: Estimated parameters $\hat\beta_s$
References
[1] Almon, S. (1965). The Distributed Lag Between Capital Appropriations and Expenditures. Econometrica, 33 (1), pp. 178-196.
[2] Banulescu, D., Candelon, B., Hurlin, C. and Laurent, S. (2016). Do We Need Ultra-High Frequency Data to Forecast Variances? Annales d'Economie et Statistiques, 123-124, pp. 135-174.
[3] Box, G.E. and G.M. Jenkins (1976). Time Series Analysis, Forecasting and Control. Wiley.
[4] Demirhan, H. (2018). Package 'dLagM', October 2018.
[5] Dhrymes, P. J. (1971). Distributed Lags: Problems of Estimation and Formulation. Holden-Day, San Francisco.
[6] Greene, W. (2007). Econometric Analysis, sixth edition. Pearson - Prentice Hall.
[7] Koyck, L. M. (1954). Distributed Lags and Investment Analysis. Amsterdam: North-Holland.
[8] Madinier, H. and M. Mouillart (1983). Les méthodes d'estimation des modèles à retards échelonnés en économie. Revue de statistique appliquée, 31 (4), pp. 53-73.
[9] SAS (2014). SAS/ETS 13.2 User's Guide: The PDLREG Procedure.
[10] Smith, R.G. and D.E.A. Giles (1976). The Almon estimator: Methodology and users' guide. Discussion Paper E76/3, Reserve Bank of New Zealand.
A Appendix: Inversion of a polynomial of order p

In time series analysis it is often useful to invert processes. For example, starting from a stationary AR process, inverting the autoregressive polynomial gives the MA($\infty$) form associated with the Wold decomposition. This yields equivalent representations of the same process. To do so, polynomials in the lag operator must be inverted. We have already seen how to do this for polynomials of degree one. We now generalise the method to polynomials of degree greater than or equal to one.
The problem is the following. Let $\Phi(z)$ be an invertible polynomial of order p with real coefficients and $\phi_0 = 1$. We want to determine $\tilde\Phi(z)$, the inverse polynomial of $\Phi(z)$. By definition, $\forall z \in \mathbb{C}$,
$$\Phi(z)\,\tilde\Phi(z) = \left(\sum_{j=0}^{p} \phi_j z^j\right)\left(\sum_{j=0}^{\infty} \tilde\phi_j z^j\right) = 1$$
Several methods exist to determine $\tilde\Phi(z)$. We retain only two of them here.
A.1 Identification method

Start from the relation $\Phi(z)\,\tilde\Phi(z) = 1$ and determine the parameters $\tilde\phi_j$ of
$\sum_{j=0}^{n} \tilde\phi_j z^j$ by matching coefficients of equal degree. In general, the $n$ equations
of the identification system are defined by:
$$\begin{cases}
\sum_{k=0}^{i} \tilde\phi_{i-k}\,\phi_k = 0 & \forall i \in [1, p] \\
\sum_{k=0}^{p} \tilde\phi_{i-k}\,\phi_k = 0 & \forall i > p
\end{cases} \qquad (61)$$
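Consider the following example. We wish to invert the AR(2) polynomial defined by
$$\Phi(z)\,x_t = \left(1 + \phi_1 z + \phi_2 z^2\right) x_t = \varepsilon_t$$
with $\phi_1 = -0.6$ and $\phi_2 = -0.3$. The polynomial is invertible since its roots have modulus
strictly greater than 1: $\lambda_1 = 1.08$ and $\lambda_2 = -3.08$. Let $\tilde\Phi(z)$ be the inverse polynomial of $\Phi(z)$,
which we take to be of infinite degree. Start from the identification relation:
$$\Phi(z)\,\tilde\Phi(z) = 1$$
Expanding gives:
$$\left(1 + \phi_1 z + \phi_2 z^2\right)\left(\tilde\phi_0 + \tilde\phi_1 z + \tilde\phi_2 z^2 + \tilde\phi_3 z^3 + \dots + \tilde\phi_p z^p + \dots\right) = 1$$
$$\begin{aligned}
\iff\ & \tilde\phi_0 + \tilde\phi_1 z + \tilde\phi_2 z^2 + \tilde\phi_3 z^3 + \dots + \tilde\phi_p z^p + \dots \\
& + \tilde\phi_0\phi_1 z + \tilde\phi_1\phi_1 z^2 + \tilde\phi_2\phi_1 z^3 + \tilde\phi_3\phi_1 z^4 + \dots + \tilde\phi_p\phi_1 z^{p+1} + \dots \\
& + \tilde\phi_0\phi_2 z^2 + \tilde\phi_1\phi_2 z^3 + \tilde\phi_2\phi_2 z^4 + \tilde\phi_3\phi_2 z^5 + \dots + \tilde\phi_p\phi_2 z^{p+2} + \dots = 1
\end{aligned}$$
Matching terms of equal degree on the two sides of the equality gives
the following system:
$$\begin{cases}
\tilde\phi_0 = 1 \\
\tilde\phi_1 + \phi_1 = 0 \\
\tilde\phi_2 + \tilde\phi_1\phi_1 + \phi_2 = 0 \\
\tilde\phi_3 + \tilde\phi_2\phi_1 + \tilde\phi_1\phi_2 = 0 \\
\dots \\
\tilde\phi_n + \tilde\phi_{n-1}\phi_1 + \tilde\phi_{n-2}\phi_2 = 0 \quad \forall n > 2
\end{cases}$$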
Solving this system yields a recurrence that defines the coefficients
of the MA(∞) representation of the process $x_t$:
$$x_t = \tilde\Phi(z)\,\varepsilon_t = \sum_{j=0}^{\infty} \tilde\phi_j\,\varepsilon_{t-j} \qquad (62)$$
where the parameters $\tilde\phi_j$ are defined by:
$$\tilde\phi_0 = 1 \qquad (63)$$
$$\tilde\phi_1 = 0.6 \qquad (64)$$
$$\tilde\phi_n = 0.6\,\tilde\phi_{n-1} + 0.3\,\tilde\phi_{n-2} \quad \forall n \ge 2 \qquad (65)$$
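The recurrence can be checked numerically. The sketch below is only an illustration: the function name `inverse_coefficients` is hypothetical, and the example assumes the sign convention $\Phi(z) = 1 - 0.6z - 0.3z^2$, which is the reading under which the recurrence in (63)-(65) holds as written.

```python
def inverse_coefficients(phi, n_terms):
    """Return the first n_terms (>= 1) coefficients of 1 / Phi(z).

    phi holds [phi_1, ..., phi_p] for Phi(z) = 1 + phi_1 z + ... + phi_p z^p.
    """
    p = len(phi)
    inv = [1.0]  # coefficient of z^0 is 1 because phi_0 = 1
    for n in range(1, n_terms):
        # The coefficient of z^n in Phi(z) * inverse(z) must vanish.
        acc = sum(phi[k - 1] * inv[n - k] for k in range(1, min(n, p) + 1))
        inv.append(-acc)
    return inv


# Assumed example: Phi(z) = 1 - 0.6 z - 0.3 z^2
coeffs = inverse_coefficients([-0.6, -0.3], 6)
print([round(c, 4) for c in coeffs])
```

The same function handles any order p, since the inner sum simply stops at min(n, p), which mirrors the two cases of the general identification system.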
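A.2 The guess-and-verify method

The other method for inverting polynomials is as follows. Consider $\tilde p$ distinct real roots $\lambda_i$
of the polynomial $\Phi(z)$ of order $p$, $(\lambda_i)_i \in \mathbb{R}^{\tilde p}$, with $\tilde p \le p$. For simplicity we assume here
that $\tilde p = p$, but the method can be extended to the general case $\tilde p \le p$. Another way to obtain
the parameters $\tilde\phi_i$ of the inverse polynomial $\tilde\Phi(z) = \Phi(z)^{-1}$ is to determine the parameters $a_j$
such that:
$$\tilde\Phi(z) = \frac{1}{\prod_{j=1}^{p}\left(1 - \tilde\lambda_j z\right)} = \sum_{j=1}^{p} \frac{a_j}{1 - \tilde\lambda_j z}$$
where the $\tilde\lambda_j$ are defined as the inverses of the roots of $\Phi(z)$:
$$\tilde\lambda_j = \frac{1}{\lambda_j} \quad \forall j \in [1, \tilde p] \qquad (66)$$
In general, the parameters $a_j$ are given by the following relation:
$$a_j = \frac{\tilde\lambda_j^{\,p-1}}{\prod_{k=1,\,k \ne j}^{p}\left(\tilde\lambda_j - \tilde\lambda_k\right)} \quad \forall j \le p \qquad (67)$$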
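One can show that:
$$\sum_{j=1}^{p} \frac{a_j}{1 - \tilde\lambda_j z} = \sum_{j=1}^{p} a_j \sum_{k=0}^{\infty} \tilde\lambda_j^{\,k} z^k = \sum_{k=0}^{\infty} \left(\sum_{j=1}^{p} a_j \tilde\lambda_j^{\,k}\right) z^k \qquad (68)$$
Hence, by identification:
$$\tilde\phi_i = \sum_{j=1}^{p} a_j \tilde\lambda_j^{\,i} \quad \forall i \ge 0 \qquad (69)$$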
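Consider again the AR(2) polynomial defined by $\Phi(z) = 1 + \phi_1 z + \phi_2 z^2$,
with $\phi_1 = -0.6$ and $\phi_2 = -0.3$. The two real roots are $\lambda_1 = 1.08$ and $\lambda_2 = -3.08$. We first
determine the parameters $a_i$ such that, for all $z \in \mathbb{C}$,
$$\frac{1}{\left(1 - \tilde\lambda_1 z\right)\left(1 - \tilde\lambda_2 z\right)} = \frac{a_1}{1 - \tilde\lambda_1 z} + \frac{a_2}{1 - \tilde\lambda_2 z}$$
with
$$\tilde\lambda_1 = \frac{1}{\lambda_1} = \frac{1}{1.08}, \qquad \tilde\lambda_2 = \frac{1}{\lambda_2} = -\frac{1}{3.08}$$
Expanding gives the following equality, for all $z \ne \lambda_i$, $i = 1, 2$:
$$a_1\left(1 - \tilde\lambda_2 z\right) + a_2\left(1 - \tilde\lambda_1 z\right) = 1$$
$$\iff (a_1 + a_2) - \left(a_1\tilde\lambda_2 + a_2\tilde\lambda_1\right) z = 1$$
Matching terms of equal degree gives the system:
$$a_1 + a_2 = 1$$
$$a_1\tilde\lambda_2 + a_2\tilde\lambda_1 = 0$$
from which we finally obtain:
$$a_1 = \frac{\tilde\lambda_1}{\tilde\lambda_1 - \tilde\lambda_2}, \qquad a_2 = \frac{\tilde\lambda_2}{\tilde\lambda_2 - \tilde\lambda_1}$$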
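We can then construct the inverse polynomial:
$$\begin{aligned}
\tilde\Phi(z) &= \sum_{k=0}^{\infty} \left(\sum_{j=1}^{p} a_j \tilde\lambda_j^{\,k}\right) z^k = \sum_{k=0}^{\infty} \left(a_1\tilde\lambda_1^{\,k} + a_2\tilde\lambda_2^{\,k}\right) z^k \\
&= \sum_{k=0}^{\infty} \left(\frac{\tilde\lambda_1^{\,k+1}}{\tilde\lambda_1 - \tilde\lambda_2} + \frac{\tilde\lambda_2^{\,k+1}}{\tilde\lambda_2 - \tilde\lambda_1}\right) z^k \\
&= \sum_{k=0}^{\infty} \left(\frac{\tilde\lambda_2^{\,k+1} - \tilde\lambda_1^{\,k+1}}{\tilde\lambda_2 - \tilde\lambda_1}\right) z^k
\end{aligned}$$
We therefore finally obtain:
$$\tilde\Phi(z) = \sum_{j=0}^{\infty} \tilde\phi_j z^j = \sum_{j=0}^{\infty} \left(\frac{\tilde\lambda_2^{\,j+1} - \tilde\lambda_1^{\,j+1}}{\tilde\lambda_2 - \tilde\lambda_1}\right) z^j \qquad (70)$$
One can show that the parameters $\tilde\phi_j$ defined in this way satisfy the recurrence
defined in (65).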
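The closing claim can be verified numerically. The following sketch builds the coefficients from the inverse roots using the partial-fraction weights of (67) and the sum in (69). The function name is hypothetical, and the example again assumes $\Phi(z) = 1 - 0.6z - 0.3z^2$, whose roots are computed from the quadratic formula rather than taken from the rounded values in the text.

```python
import math


def inverse_coefficients_from_roots(roots, n_terms):
    """Coefficients of 1 / Phi(z) from the distinct real roots of Phi(z)."""
    inv_roots = [1.0 / r for r in roots]  # the inverse roots of (66)
    p = len(inv_roots)
    weights = []
    for j in range(p):
        # Partial-fraction weight a_j from (67).
        denom = math.prod(inv_roots[j] - inv_roots[k] for k in range(p) if k != j)
        weights.append(inv_roots[j] ** (p - 1) / denom)
    # Coefficient i is sum_j a_j * inv_root_j ** i, as in (69).
    return [
        sum(a * lam ** i for a, lam in zip(weights, inv_roots))
        for i in range(n_terms)
    ]


# Roots of the assumed example 1 - 0.6 z - 0.3 z^2 = 0 (quadratic formula).
disc = math.sqrt(0.6 ** 2 + 4 * 0.3)
roots = [(0.6 + disc) / (-0.6), (0.6 - disc) / (-0.6)]
closed_form = inverse_coefficients_from_roots(roots, 6)
print([round(r, 4) for r in roots])
print([round(c, 4) for c in closed_form])
```

Up to floating-point error, the output should coincide with the sequence generated by the recurrence in (63)-(65).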
Publisher: Taylor & Francis
Journal of Applied Statistics

The use of the ARDL approach in estimating virtual exchange rates in India
Subrata Ghatak & Jalal U. Siddiki
Published online: 02 Aug 2010.
To cite this article: Subrata Ghatak & Jalal U. Siddiki (2001) The use of the
ARDL approach in estimating virtual exchange rates in India, Journal of Applied
Statistics, 28:5, 573-583, DOI: 10.1080/02664760120047906
To link to this article: http://dx.doi.org/10.1080/02664760120047906
Journal of Applied Statistics, Vol. 28, No. 5, 2001, 573- 583
The use of the ARDL approach in estimating

virtual exchange rates in India
SUBRATA GHATAK & JALAL U. SIDDIKI, School of Economics,

Kingston University, UK
Abstract This paper applies the autoregressive distributed lag approach to cointegration
analysis in estimating the 'virtual exchange rate' (VER) in India. The VER is the rate that would have
prevailed if the unconstrained import demand were equal to the constraint imposed by
foreign exchange rationing, and it is used to approximate the 'price' of rationed
foreign exchange reserves. We highlight the shortcomings of the existing literature in
approximating equilibrium exchange rates in a less developed country such as India and
propose the VER approach for equilibrium rates, which uses information from an estimated
structural model. In this relationship, the black market real exchange rate ($E_U$) is the dependent
variable and the real official exchange rate ($E_O$), the ratio of the foreign ($r^*$) to the domestic
($r$) interest rate ($I$), and official forex reserves ($Q$) are explanatory variables. In our
estimation, the VERs are higher than $E_O$ by about 10% in the short-run and 16% in the
long-run.
1 Introduction
The existence of 'dual' rates in the foreign exchange (forex) markets, one official
and the other 'unofficial' or black market (BM), is a common phenomenon in less
developed countries (LDCs) (Dornbusch, 1983; Phylaktis, 1992; Siddiki, 2000).
Dual exchange rates emerge as a result of controls on access to the official market.
Chronic and persistent balance of payments problems, trade controls, and
financial repression lead to the emergence of BMs in exchange rates.
These BM rates could render the use of the official exchange rate impotent
to control the trade balance and forex reserves. The negative consequences of
protectionist trade and financial policies on the Indian economy are enormous
(Bhagwati & Desai, 1970; Bhagwati, 1979; Siddiki & Daly, 1999). The major costs
of BM rates can be summarized as follows: (a) the mis-allocation of resources, as
dual exchange rates drive a wedge between marginal costs and prices and impose
a high tax on the export sector; (b) rent-seeking activities, which often imply large
resource costs (both direct and indirect due to the lobbying by rent-seeking groups)
due to corruption; and (c) government budget problems. In addition, the presence
of BMs and of a significant margin between official and BM rates may also fuel
speculative attacks on official forex markets and make the forecasting of official
rates problematic. These speculative attacks negate the effectiveness of official
devaluations in increasing the competitiveness of the economy (Kamin, 1993).
Correspondence: S. Ghatak, School of Economics, Kingston University, Kingston KT1 2EE, UK.
ISSN 0266-4763 print; 1360-0532 online/01/050573-11 © 2001 Taylor & Francis Ltd
DOI: 10.1080/02664760120047906
Budget deficits arise when the authorities in a country, due to dual exchange
rates, buy forex at a high price from exporters and sell it at a low price to importers.
Such deficits are to be financed either by higher taxes (rather rare phenomena in
LDCs) or by money creation (a popular measure). Monetization of deficits is
frequently inflationary (Ghatak & Ghatak, 1996). Thus, exchange rate regimes
dominated by BMs could be linked to high inflation. In the asset market also, an
exchange rate linkage exists when domestic agents can hold real assets (land) or
claims on real assets (stocks). Indeed, to differentiate between the domestic and
foreign asset markets, capital controls and dual exchange rates have often been
used in many East European countries and LDCs, particularly in Latin America (Charemza
& Ghatak, 1990).
The reduction of such costs emanating from the presence of a BM in many LDCs
provides a strong motivation for measuring a 'virtual' exchange rate (VER), a rate
that would have prevailed if the unconstrained import demand were equal to the
constraint imposed due to forex rationing. Such VERs can be regarded as 'just
bites' (i.e. prices) of rationed forex in the sense that the rationed levels coincide
with the quantities that would have been chosen by the unrationed agents facing
the same prices and income in the Tobin & Houthakker (1950; see also Neary &
Roberts (1980)) or Rothbarth (1940) sense. In this sense, a VER approximates the
equilibrium or 'just' price of rationed forex of a developing economy.
It is often argued by the International Monetary Fund/World Bank that getting
the real exchange rate 'right' should be one of the important goals for policy makers
in LDCs, particularly where official exchange rates are administratively determined.
The modelling and estimation of such 'right' rates in LDCs is the prime objective
of this paper since the concept of the 'equilibrium' exchange rate has long been
regarded as a chimera and its estimation is hazardous (see Section 3). In addition,
none of the available methods considers the relationship between official and BM
rates and the impact of financial policies on BM rates. Our paper seeks to fill this
gap by exploring the determinants of BM rates, i.e. the causes of distortions in the
forex market, in India and by deriving the VER from the information available in
both the official and the unofficial exchange rate markets. Our estimation of the
VER is based on the important structural factors in the economy that affect the
'equilibrium' exchange rate.
The autoregressive distributed lag (ARDL) approach to cointegration analysis
(Pesaran & Shin, 1998) and time series data from 1965-96 are used in estimating
our empirical model and in estimating VERs. The major advantage of the ARDL
method is that it avoids problems of serial correlation and of endogeneity, by an
appropriate augmentation, that may be experienced by other cointegration
methods. In addition, this method avoids pretesting of the order of integration,
which is associated with other cointegration analyses. Thus, the aims of the paper
are as follows:
(i) To develop a methodology to measure VERs to indicate the extent of
changes to be made in official exchange rates and thereby reduce, if not
eliminate, the size of a BM and its costs.
(ii) To analyse rigorously the relationships between official and unofficial
exchange rates.
(iii) To understand some of the determinants of VERs in LDCs.
(iv) To derive some policy implications.
This paper is organized as follows: Section 2 explains the trade and forex policies
in India. In Section 3 we survey the literature on BMs for forex and equilibrium
exchange rates. In this section we justify the empirical specification of our model.
This section also highlights the shortcomings of various types of equilibrium
exchange rates. Section 4 reports our empirical results. In Section 5, we estimate
the VERs. Section 6 draws conclusions.
2 The BM for forex in India

After independence in 1947, India followed a set of import substitution
industrialization (ISI) policies until the late 1970s in order to achieve national self-reliance.
Large scale industries were mainly promoted by planning and regulated by the
government. During the 1950s and 1960s, India launched a major planning
programme for rapid industrialization and economic growth via import
substitution, tariffs and exchange controls.
Imports were controlled while exports were taxed in various ways. These
interventions were reduced between 1985 and 1991. Even so, import tariffs in
India still remain among the highest in the world (World Bank, 1994).
Similarly, forex is strictly regulated by the central bank of India (IMF, 1997).
Consequently, BM premiums are very high, although since the 1990s they have
been declining (Siddiki, 2000). The real GDP growth rate until the mid-1970s was
low (i.e. 3.5% p.a.) despite the fact that savings and investment were about one
fifth of the GDP (World Bank, 1994). Export taxes, together with ISI policies,
failed to promote exports as a percentage of GDP. Similarly, import growth has
also been very low. In addition, India has been losing a huge amount of forex
through the BM due to controls over its external sector. Moreover, the exchange
rate distortions probably kept foreign investment in India well below that in other
large developing countries (World Bank, 1994).
3 The economics of BM and equilibrium exchange rates

The literature on the determinants of BM rates can be divided into three main
categories (see Kiguel et al., 1997; Siddiki, 1999; for a survey). First, the real trade
approach to BM (TABM) states that illegal trade to avoid tariffs and legal restrictions
on official forex and international trade is the main reason for the existence of the
BM (Sheik, 1976). The official supply of forex is generally lower than the demand
when official exchange rates are overvalued. This excess demand is met in the BMs
at a market determined rate, which is generally higher than the official rate. Under
such circumstances, importers over-invoice imports to obtain more forex from
official sources than the amount needed and the excess amount is sold via BMs to
earn BM premiums. Similarly, exporters under-invoice exports and sell the
unreported forex via BMs to reap premiums.
Second, the monetary approach to BM (MABM) postulates that excess money
supply creates BMs for forex when official forex markets are restricted; the margin
between BM and official rates depends on the extent of excess money supply
(Blejer, 1978; Biswas & Nandi, 1986; Gupta, 1980). The presence of excess money
supply creates inflationary pressures and leads to the emergence of BM rates.
Finally, the portfolio balance approach to the BM (PABM) argues that surpluses
or deficits in the current account of the BOPs create the BMs for forex when
official rates are administratively determined and overvalued (Dornbusch et al.,
1983; Phylaktis, 1992). The PABM postulates that an increase in the current
account surplus causes an appreciation in the domestic currency since such
increases raise the wealth of a nation, which in turn raises the demand for
(domestic) money and, hence, a fall in excess money supply. The opposite is true
when the current account is in deficit. Thus, current account deficits or surpluses
directly affect BM rates since economic agents could not change the assets in their
portfolios through official markets.

Theories on how to estimate equilibrium exchange rates are not very well
developed despite some efforts to estimate them (Khan et al., 1992). The first
approach is a forecasting technique that tries to estimate the required change in
exchange rates in order to achieve 'sustainable' balance of payments or to move
the economy closer to 'equilibrium'. Khan et al. (1992) suggest that one should
find a base period in which the real exchange rate is equal to the equilibrium rate
and then determine the extent of misalignment due to policy shocks or changes in
structural parameters. Thus, this method requires a base year. However, the
selection of a base year in which existing real and equilibrium exchange rates are
equal is rather arbitrary and implausible.
The second approach suggests using BM premiums as an indicator of real
exchange misalignments as BM exchange rates are considered as market
determined. However, none of the above approaches considers the structural relationship
between official and BM rates, official forex reserves and financial policies before
taking the BM premiums as an indicator of the excess demand. Our paper
circumvents the above criticisms by both using and estimating VERs in India in
the following way: first, a structural model for real BM exchange rates is estimated.
Secondly, we provide a methodology for approximating VERs, in the absence of
restriction on forex, using information obtained from the structural model.
Official forex reserves influence trade policies and thus the BM rate in a
developing country such as India since a low (high) level of official forex reserves
is associated with more restrictive (liberal) trade policies (Joshi et al., 1994), which
are among the important determinants of BM rates (Siddiki, 2000). Thus, we
include forex reserves as a determinant of the BM rates. Interest rate differentials,
represented by the ratio of foreign to domestic interest rates ($I$), rather than only
the domestic interest rate are also included as a determinant of BM rates in order
to test both MABM and PABM. Note that the MABM predicts that a relative
increase in domestic interest rates (the opportunity costs of holding money) reduces
the demand for domestic money and thus generates excess money supply, which
in turn increases the demand for forex in the BM. Thus, a relative rise in domestic
interest rates implying a reduction in $I$ causes a depreciation in BM rates. On the
other hand, the PABM predicts that this relative rise in domestic rates (the returns
to money), i.e. a fall in $I$, increases the portfolio demand for domestic money and
reduces excess money supply and therefore leads to an appreciation in BM rates.
Money supply is excluded as a determinant of BM rates due to double counting
problems since official reserves are a component of the money supply. Additionally,
the inclusion of interest rates as a determinant of BM rates captures the impact of
money supply on BM rates and causes the coefficient of money supply to be
statistically insignificant (Siddiki, 2000).
4 Model specification and empirical results

On the basis of the previous discussion, the empirical specification of our long-run
model for BM rates for a LDC such as India can be written as:
$$E_U = \alpha_0 + \beta E_O + \gamma Q + \delta I + u_t \qquad (1)$$
$$\beta > 0,\quad \gamma < 0,\quad \delta > 0$$
$E_U$ is the unofficial real exchange rate defined as the ratio of the foreign price level
($P^*$) to the domestic price level ($P$) multiplied by the nominal unofficial exchange
rate, rupees per dollar. $E_O$ is the official real exchange rate defined as the ratio of
$P^*$ to $P$ multiplied by the nominal official exchange rate. $Q$ is official forex reserves
in US million dollars. $I$ is the ratio of foreign to domestic interest rates.1 The error
term $u_t$ is normally and identically distributed. All data are in natural logarithms
except the ratio of foreign to domestic interest rates. Our sample comprises annual
data from 1967 to 1996. The data source for $E_U$ is Pick's/World Currency Yearbook
(various years). For the remaining variables, data are gathered from the International
Financial Statistics Yearbook (IMF: various years).
According to our model, $E_U$ evolves positively with $E_O$, i.e. $\beta > 0$. An increase in
$E_O$ reduces (raises) the flow supply of forex in the BMs (official markets). This
decrease in supply requires an increase in $E_U$ to keep the premiums unchanged and
push the flow supply up to retain equilibrium in BMs (Agénor, 1990). This
prediction is also consistent with the findings of Baghestani & Noer (1993) and
Siddiki (2000) in the case of India.
We expect a negative sign of the coefficient of $Q$, i.e. official forex reserves. A
decline in $Q$ increases the excess demand of forex in the official markets. This
excess demand is met in the BM with a market determined rate. Note that a low
level of official forex reserves in India is associated with more restrictive trade
policies (Joshi & Little, 1994; Siddiki, 2000). Thus, a low value of $Q$ signals
expected future depreciations in $E_O$, which in turn causes a depreciation in $E_U$.
This argument is consistent with the findings of various devaluation episodes in
LDCs, which confirm that a low level of official forex reserves is associated with a
high level of BM rates (Kamin, 1993).
The sign of the coefficient of $I$ depends on whether the BM in India is a monetary
or a portfolio phenomenon. As is explained above, the MABM predicts a negative
sign, i.e. a fall in $I$ causes a depreciation of BM rates while the PABM postulates a
positive sign, i.e. a fall in $I$ causes an appreciation of BM rates (see Section 3
above).
As described in the Appendix, we follow a two-step procedure of the ARDL
method to estimate equation (1): see Pesaran and Pesaran (1997). In the first step,
we carried out 'stability tests' for examining the existence of the long-run
relationship among $E_U$, $E_O$, $Q$ and $I$. The F-test for examining this relationship from the
EC model with $E_U$ as a dependent variable is denoted by $F_{E_U}(E_U \mid E_O, Q, I)$ (see
equation (4) and the discussion on it). The calculated $F_{E_U}(\cdot \mid \cdot) = 5.03$ is higher
than the upper bound critical value 4.378 at a 5% significance level2 (the number
of lags chosen in all EC models is two). Therefore we reject the null of no
long-run relationship with $E_U$ as a dependent variable. Similarly, F-tests in EC models
with $E_O$, $Q$, $I$ as dependent variables are indicated by $F_{E_O}(E_O \mid E_U, Q, I)$,
$F_Q(Q \mid E_U, E_O, I)$ and $F_I(I \mid E_U, E_O, Q)$, respectively. Calculated $F_{E_O}(\cdot \mid \cdot) = 2.4058$,
$F_Q(\cdot \mid \cdot) = 2.8841$, $F_I(\cdot \mid \cdot) = 2.7578$. These F statistics are lower than the lower
bound of the critical value 3.219 at a 5% significance level. Our results show
that only $F_{E_U}(\cdot \mid \cdot)$ is significant and the remaining F-statistics are insignificant.
Therefore, there exists a unique and stable long-run relationship with $E_U$ as the
dependent variable and $E_O$, $Q$ and $I$ as independent variables.
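The decision rule applied to these F-statistics can be written out as a short sketch. The function name and the labels it returns are illustrative only; the bounds and the statistics are the values reported in the text (5% level).

```python
def bounds_test_decision(f_stat, lower, upper):
    """Classify a bounds-test F-statistic against the two critical bounds."""
    if f_stat > upper:
        return "long-run relationship"      # reject the null of no relationship
    if f_stat < lower:
        return "no long-run relationship"   # cannot reject the null
    return "inconclusive"                   # order of integration matters here


# Critical bounds and F-statistics as reported in the text.
LOWER_5PCT, UPPER_5PCT = 3.219, 4.378
f_stats = {"E_U": 5.03, "E_O": 2.4058, "Q": 2.8841, "I": 2.7578}

decisions = {
    name: bounds_test_decision(f, LOWER_5PCT, UPPER_5PCT)
    for name, f in f_stats.items()
}
for name, decision in decisions.items():
    print(name, decision)
```

Only the equation normalised on $E_U$ clears the upper bound, which is the sense in which the long-run relationship is described as unique.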
Having found a unique relationship, in the next step the following ARDL
(2, 0, 2, 1), with lag lengths determined by the Akaike Information Criterion (AIC),
is estimated (t values in parentheses):
$$\begin{aligned}
E_{U,t} ={}& \underset{(0.04)}{0.013} + \underset{(3.58)}{0.60743^{**}}\,E_{U,t-1} - \underset{(-1.50)}{0.23862}\,E_{U,t-2} + \underset{(5.03)}{0.73443^{**}}\,E_{O,t} - \underset{(-0.2)}{0.009}\,Q_t \\
& + \underset{(0.44)}{0.029}\,Q_{t-1} - \underset{(-1.53)}{0.075}\,Q_{t-2} + \underset{(0.58)}{0.042}\,I_t + \underset{(1.93)}{0.143}\,I_{t-1}
\end{aligned} \qquad (2)$$
$\bar R^2 = 0.90609$; DW = 2.1327; S.E. of regression = 0.088067; F(8, 21) = 35.9747
[0.000]; residual sum of squares = 0.16287; AR2-F(2, 9) = 0.37 [0.693];
AR2-$\chi^2$(2) = 1.137 [0.566]; RESET-F(1, 20) = 3.574 [0.073]; NOR-$\chi^2$(2) =
0.70 [0.705]; H-$\chi^2$(1) = 1.9 [0.168]; H-F(1, 28) = 1.893
[0.18]; $E_U$-F(2, 21) = 7.11 [0.004]; $E_O$-F(1, 21) = 25.34 [0.000]; $Q$-F(3, 21) =
1.67 [0.19]; $I$-F(2, 21) = 3.52 [0.048].
* and ** represent 5% and 1% levels of significance respectively. None of the
tests reveals any mis-specification.3 For all of the tests in F-form, the degrees of
freedom are given in brackets, (.), while the probability is given in square brackets
[.]. All variables are significant at a 5% significance level except $Q$, which is
significant at a 19% level. The theoretical arguments, the real situation in India,
and empirical findings on long-run analysis strongly support the inclusion of $Q$.
Thus, the exclusion of $Q$ based on only the level of significance in the short-run
would not be justified. Moreover, $Q$ is significant at a 6% significance level in the
static long-run equation (see equation (3) below).
We also carried out tests on the stability of the model. Both the CUSUM test
and CUSUM of squares test suggest that the model is stable over the sample period
(results are available on request). Therefore, we reject the possibility of parameter
instability. It is apparent that the overall fit of the model is very good and it passes
all diagnostic tests. The corresponding long-run model estimated from equation
(2) is as follows:
$$E_U = \underset{(0.04)}{0.021} + \underset{(6.40)}{1.1633^{**}}\,E_O - \underset{(-1.97)}{0.08}\,Q + \underset{(2.15)}{0.29^{*}}\,I \qquad (3)$$
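As a check on how the long-run equation follows from the ARDL estimates, the standard ARDL long-run formula divides the sum of each regressor's lag coefficients by one minus the sum of the autoregressive coefficients. The sketch below applies that formula to the coefficients reported in equation (2); the variable names are illustrative, and agreement with equation (3) holds only up to the rounding of the published figures.

```python
# Coefficients as reported in equation (2).
ar_lags = [0.60743, -0.23862]            # on E_U(t-1), E_U(t-2)
lag_coefficients = {
    "const": [0.013],
    "E_O": [0.73443],                    # E_O(t)
    "Q": [-0.009, 0.029, -0.075],        # Q(t), Q(t-1), Q(t-2)
    "I": [0.042, 0.143],                 # I(t), I(t-1)
}

# One minus the sum of the autoregressive coefficients.
adjustment = 1.0 - sum(ar_lags)

# Long-run coefficient = sum of a regressor's lag coefficients / adjustment.
long_run = {
    name: sum(coeffs) / adjustment
    for name, coeffs in lag_coefficients.items()
}

# The error-correction coefficient is minus the same adjustment term.
ecm_coefficient = -adjustment

for name, value in long_run.items():
    print(name, round(value, 4))
print("ECM", round(ecm_coefficient, 4))
```

The computed values are close to the published long-run coefficients, and the implied error-correction coefficient is close to the value of about -0.63 reported for the ECM term.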
The error correction (EC) representation of our ARDL model is as follows:
$$\begin{aligned}
\Delta E_{U,t} ={}& \underset{(0.04)}{0.013} + \underset{(1.50)}{0.24}\,\Delta E_{U,t-1} + \underset{(5.03)}{0.73^{**}}\,\Delta E_{O,t} - \underset{(-0.02)}{0.009}\,\Delta Q_t + \underset{(1.53)}{0.08}\,\Delta Q_{t-1} \\
& + \underset{(0.58)}{0.04}\,\Delta I_t - \underset{(-4.69)}{0.63^{**}}\,ECM_{t-1}
\end{aligned} \qquad (4)$$
$\bar R^2 = 0.51354$; S.E. of regression = 0.088067; F-stat. F(6, 23) = 6.4357 [0.000];
S.D. of dependent variable = 0.12627; residual sum of squares = 0.16287;
DW-statistic = 2.1327.
$\Delta$ is the difference operator and ECM is the error correction mechanism, which
is statistically significant and negative, implying that there is a mechanism in the
model that prevents the error terms from enlarging (Engle & Granger, 1987).
Our estimated long-run model (equation (3)) reveals that the coefficient of $E_O$ is
positive and statistically significant, implying that official exchange rates have a
strong influence on BM rates. This result supports the view that an official
depreciation is associated with a similar depreciation in the BM rate. An official
depreciation generally reduces the BM premiums and the flow supply of forex to
the BM since the official depreciation reduces under-invoicing of exports and
over-invoicing of imports. This fall in (flow) supply of forex requires a depreciation in
the parallel rate to maintain the equilibrium (Agénor, 1990).
The coefficient of $I$ is positive and statistically significant. This result implies
that an increase in foreign interest rates (returns to foreign money), relative to
domestic interest rates (returns to domestic money), boosts the demand for foreign
money. This increase in demand causes an increase in BM rates. This finding
supports the prediction that the higher the interest rates differential, the greater
the expectations that the domestic currency will be depreciated in the future.
Therefore, the demand and price of foreign currencies will be higher in the BMs
(Dornbusch et al., 1983).
Our estimated long-run coefficient of $Q$ is negative and statistically significant at
a 6% level. This result is in accordance with the fact that one of the main reasons
for the existence of BMs in India is the excess demand in the official markets
caused by the scarcity of official forex reserves. Thus, a low level of official reserves
is associated with a high level of excess demand that increases BM rates.
In terms of short-run dynamics only $\Delta E_O$ is statistically significant (equation
(4)). However, the inclusion of the other variables is justified according to the AIC
criterion. The statistically significant coefficient of $\Delta E_O$ implies that, in the
short-run, BM rates respond positively to the official rates.
5 Virtual exchange rates

According to the definition, the virtual exchange rate (VER) equates $E_U$ to $E_O$
given the existing constraints on $Q$ and $I$. Using the ARDL version of our empirical
model (equation (1) estimated in equation (2)) and following Charemza (1990),
we can relate $E_U$ to the short-run VER ($VER_S$) as follows:
$$\begin{aligned}
& E_{U,t} = \alpha_0 + \sum_{i=1}^{m} \hat\alpha_i E_{U,t-i} + \sum_{j=0}^{n} \beta_j E_{O,t-j} + \sum_{i=0}^{p} \gamma_i Q_{t-i} + \sum_{i=0}^{q} \delta_i I_{t-i} + u_t \\
\Rightarrow\ & E_{U,t} = \alpha_0 + VER_S + \sum_{i=0}^{2} \gamma_i Q_{t-i} + \sum_{i=0}^{1} \delta_i I_{t-i} + u_t \\
\Rightarrow\ & VER_S = \pi_S \times E_{O,t}; \quad \pi_S = \left(\sum_{i=1}^{2} \hat\alpha_i + \sum_{j=0}^{0} \beta_j\right)
\end{aligned} \qquad (5)$$
The $VER_S$ can be calculated from equation (2):
$$VER_S = \pi_S \times E_{O,t} = (0.60743 - 0.23862 + 0.73443)\,E_{O,t} = 1.10324 \times E_{O,t} \qquad (6)$$
The long-run VER ($VER_L$) can be obtained from the estimated long-run
equation (3):
$$E_U = \alpha + VER_L + \gamma Q + \delta I \qquad (7)$$
where $VER_L = \pi_L \times E_O$ with $\pi_L = \beta$ (see equation (3)), i.e. $VER_L$ is equal to the
long-run coefficient of $E_O$ multiplied by $E_O$, which is $1.1633 \times E_O$ in our case.
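The two multipliers can be reproduced directly from the reported estimates. In the sketch below, `virtual_rate` is a hypothetical helper and the value passed to it is purely illustrative, not data from the paper; reading the multipliers as percentage gaps over the official rate follows the paper's own interpretation.

```python
# Short-run multiplier: sum of the E_U lag coefficients plus the E_O
# coefficient from equation (2).
pi_short = 0.60743 - 0.23862 + 0.73443

# Long-run multiplier: the long-run coefficient of E_O in equation (3).
pi_long = 1.1633


def virtual_rate(multiplier, official_rate):
    """VER as the multiplier times the (real, official) exchange rate."""
    return multiplier * official_rate


# Implied gap of the VER over the official rate, in per cent.
gap_short = (pi_short - 1.0) * 100
gap_long = (pi_long - 1.0) * 100

print(round(pi_short, 5), round(gap_short, 1))
print(round(pi_long, 4), round(gap_long, 1))
print(virtual_rate(pi_short, 2.0))  # illustrative input only
```

The gaps of roughly 10% and 16% correspond to the short-run and long-run figures quoted in the abstract and conclusions.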
We can also obtain the short and long-run VER by taking the weighted average
of $E_O$ and $E_U$:
$$VER_i = \pi_i\,(\sigma E_{O,t} + \lambda E_{U,t}) \qquad (8)$$
where $i = S$ and $L$, and $\sigma$ and $\lambda$ are weights, such that $\sigma + \lambda = 1$, given to $E_O$ and $E_U$.
Therefore, VERs would be about 10% higher in the short-run and 16% higher
in the long-run than the official rates. More interestingly, the VERs are lower than
the BM rates, indicating that risk premiums are associated with the BM rates. The
difference between the VER and BM rates is thought to be positively influenced
by the risks associated with BMs. The risks include the probability of
detection plus legal and moral problems (Sheik, 1976).
6 Conclusions
In this paper, we applied the ARDL approach to cointegration analysis developed
by Pesaran & Shin (1998) for estimating the VERs in India using annual data from
1967 to 1996. We find a multivariate cointegrated relationship where the real BM
exchange rate ($E_U$) is the dependent variable and the real official exchange rate ($E_O$),
official forex reserves ($Q$) and the ratio of foreign to domestic interest rates ($I$) are
explanatory variables. Results reveal that an increase in $E_O$ causes a depreciation
in $E_U$. This result supports the view that an official depreciation generally reduces
the BM premiums and the flow supply of forex to the BM by reducing
under-invoicing of exports and over-invoicing of imports. This fall in (flow) supply of
forex requires a depreciation in the BM rate to maintain the equilibrium.
We also conclude that an increase in $I$ causes a depreciation in $E_U$. Note that a
rise in $I$, i.e. an increase in foreign interest rates (returns on foreign money) relative
to domestic interest rates (returns on domestic money), boosts the demand for
foreign money. This rise in demand causes an increase in $E_U$. Finally, we found
that an increase in $Q$ causes an appreciation in $E_U$. A reduction in $Q$ raises the
excess demand in the official markets, which in turn causes a depreciation in $E_U$.
Contrary to the other available methods of modelling equilibrium exchange
rates, a structural relationship is considered in estimating the VERs. Our results
show that the VER would be higher than the official exchange rates by about 10%
in the short-run, and 16% in the long-run. As the VER is lower than the BM rates,
distortions in exchange rates are not severe and the government can gradually
adjust the exchange rates without facing serious difficulties. The reason for BM
rates being higher than the VERs may be the risks associated with the BMs.
Acknowledgements
We are grateful to an anonymous referee for constructive comments on an earlier
version of this paper. This paper has also benefited from the comments of Professor
Kate Phylaktis and the participants of the ESRC conference in Birmingham
University (UK) and the IIDS conference at Central Michigan University (USA).
We are thankful to Stephen Wheatly Price and Chris Stewart for helpful comments.
The usual disclaimer applies.
Notes
1. The foreign interest rate is proxied by the London-based Euro Dollar Rate and the domestic interest
rate is proxied by the Bank Rate, the discount rate given by the central bank to commercial banks.
2. The lower and upper bounds are appropriate for I(0) and I(1) variables, respectively. If the computed
F-statistics fall outside both bounds, as in our case, no knowledge is required regarding whether variables
are I(0) or I(1). However, if a computed statistic falls within the band, knowledge of the order
of integration of the variables is needed.
3. AR2-F and AR2-$\chi^2$(2) are the F and chi-square statistics, respectively, for joint autocorrelation of the
residuals up to order two. RESET-F is the F statistic
for functional mis-specification. NOR-$\chi^2$(2) is the chi-square statistic for testing normality. H-F and
H-$\chi^2$ are the F and chi-square statistics, respectively, for testing heteroscedasticity. $E_U$-F, $E_O$-F, $Q$-F,
$I$-F are F tests for the joint significance of the particular variables (contemporaneous and lagged) in
the model.
REFERENCES
Agénor, P. R. (1990) Stabilization policies in developing countries with a parallel market for foreign
exchange: a formal framework, IMF Staff Papers, 37(3), pp. 560-592.
Baghestani, H. & Noer, J. (1993) Cointegration analysis of the black market and official exchange
rates in India, Journal of Macroeconomics, 15(4), pp. 709-721.
Biswas, B. & Nandi, S. (1986) The black market exchange rate in a developing economy: the case of
India, The Indian Economic Journal, 33(3), pp. 23-34.
Blejer, M. L. (1978) Exchange restrictions and the monetary approach to the exchange rate. In: J. A.
Frenkel and H. G. Johnson (Eds) The Economics of Exchange Rates: Selected Studies (Reading, MA).
Bhagwati, J. (1979) The New International Economic Order (Boston, MIT Press).
Bhagwati, J. & Desai, M. (1970) India: Planning for Industrialisation (London, Oxford University Press).
Charemza, W. W. (1990) Parallel markets, excess demand and virtual prices: an empirical approach,
European Economic Review, 34, pp. 331-339.
Charemza, W. W. & Ghatak, S. (1990) Demand for money in dual-currency quantity constrained
economy: Hungary and Poland, 1956-85, The Economic Journal, 100, pp. 1159-1172.
Dornbusch, R., Dantas, D. V., Pechman, C., Rocha, R. R. & Simoes, D. (1983) The black market
for dollars in Brazil, Quarterly Journal of Economics, 98, pp. 25-40.
Engle, R. F. & Granger, C. W. J. (1987) Cointegration and error correction: representation, estimation
and testing, Econometrica, 55, pp. 251-276.
Ghatak, A. & Ghatak, S. (1996) Budgetary deficits and Ricardian equivalence: the case of India,
Journal of Public Economics, 60, pp. 267-282.
Gupta, S. (1980) An application of the monetary approach to black market exchange rates,
Weltwirtschaftliches Archiv, 116, pp. 235-252.
International Monetary Fund (1997) Exchange Arrangements and Exchange Restrictions: Annual
Report 1997 (Washington, DC).
Joshi, V. & Little, I. M. D. (1994) India: Macroeconomics and Political Economy, 1964-1991
(Washington, DC, The World Bank).
Kamin, S. B. (1993) Devaluation, exchange controls, and black markets for foreign exchange for
developing countries, Journal of Development Economics, 40, pp. 151-169.
Khan, M. S. & Ostry, J. D. (1992) Response of equilibrium real exchange rate to real disturbances
in developing countries, World Development, 20, pp. 1325-1334.
Kiguel, M. A., Lizondo, J. S. & O'Connell, S. A. (1997) Parallel Exchange Rates in Developing
Countries (London, Macmillan Press).
Neary, P. & Roberts (1980) The theory of household behaviour under rationing, European Economic
Review, 13, pp. 25-42.
Pesaran, H. M. & Pesaran, B. (1997) Microfit 4.0 (Oxford University Press).
Pesaran, H. M. & Shin, Y. (1998) An autoregressive distributed lag modelling approach to cointegration
analysis, chapter 11 in S. Strøm (Ed) The Econometrics and Economic Theory in the 20th Century
(Cambridge, Cambridge University Press).
Pesaran, H. M., Shin, Y. & Smith, R. J. (1996) Testing the existence of a long-run relationship. DAE
Working Paper Series, 9622, Cambridge University, Department of Applied Economics.
Phylaktis, K. (1992) The black market for dollars in Chile, Journal of Development Economics, 37,
pp. 155-172.
Rothbarth, E. (1940) The measurement of changes in real income under conditions of rationing,
Review of Economic Studies, 8, pp. 100-107.
Sheikh, M. A. (1976) Black market for foreign exchange, capital flows and smuggling, Journal of
Development Economics, 3, pp. 9-26.
Siddiki, J. U. (1999) Economic liberalisation and growth in Bangladesh: 1974-95, PhD Thesis,
Kingston University, UK.
Siddiki, J. U. (2000) Black market exchange rates in India: an empirical analysis, Empirical Economics,
25(2), pp. 297-313.
Siddiki, J. U. & Daly, V. (1999) Trade and financial liberalisation and economic growth in India.
Discussion Paper No. 99/7, Kingston University, UK.
Tobin, J. & Houthakker, H. S. (1950) The effects of rationing on demand elasticities, Review of
Economic Studies, 18, pp. 140-153.
World Bank (1994) Trends in Developing Countries (Washington, DC).
Appendix
Methodology: the Autoregressive Distributed Lag (ARDL) method

The Engle-Granger (EG) (1987) representation theorem asserts that whenever the
levels of a set of I(1) variables are constrained by one or more cointegrating
relationships, their data generating process may be expressed as an error
correction model (ECM). However, at one level, an ECM is simply one possible
(constrained) parameterization of a vector autoregression (VAR). Since the separate
equations of a VAR are individually autoregressive distributed lag (ARDL) regressions,
the representation theorem may be taken as a hint that cointegrating
relationships may be investigated via estimation of ARDL regressions.
We are particularly interested in the case of a single cointegrating relationship,
which we might sketch as
LR: $z_t \equiv y_t - \theta' x_t \sim I(0)$, (M.1)

which implies by the representation theorem an error correction model

ECM: $\Delta y_t = \sum_{i=1} \alpha_i \Delta y_{t-i} + \sum_{i=0} \beta_i' \Delta x_{t-i} + \gamma z_{t-1}$ (M.2)

Re-parameterization as an ARDL model gives

ARDL: $y_t = \sum_{i=1} a_i y_{t-i} + \sum_{i=0} b_i' x_{t-i}$ (M.3)
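As an illustration of how the re-parameterization works, consider the simplest case of a scalar x and one lag of each variable. This special case is an assumption made only for the example and is not a restriction of the method:

```latex
\begin{align*}
y_t &= a_1 y_{t-1} + b_0 x_t + b_1 x_{t-1} \\
\Delta y_t &= b_0 \Delta x_t - (1 - a_1)\bigl[\, y_{t-1} - \theta x_{t-1} \bigr],
\qquad \theta = \frac{b_0 + b_1}{1 - a_1}
\end{align*}
```

The second line follows by subtracting $y_{t-1}$ from both sides and writing $b_0 x_t = b_0 \Delta x_t + b_0 x_{t-1}$. In this case the error correction coefficient in (M.2) equals $-(1 - a_1)$, and the long-run coefficient is the sum of the x-coefficients divided by one minus the sum of the lagged y-coefficients.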
The EG method estimates the long-run relationship directly by ordinary least
squares (OLS), which is known to be (super) consistent when LR exists. There are
problems with this approach. First, the asymptotic properties of OLS now involve
the Dickey-Fuller distribution. Secondly, the long-run relationship omits the
short-run dynamics of ECM and additionally ignores any correlation between the
innovations in ECM and the innovations that generate the data for x. These
omissions may induce serial correlation in the innovation of LR and endogeneity
of its regressors, which can be a significant issue in finite samples.
Pesaran & Shin (1998) argue that unmodified OLS has desirable asymptotic
properties when applied to ARDL, provided that the lag lengths are sufficient to
proxy for the serial correlation and endogeneity. They further suggest that the
choice of estimator for small-sample investigations should be based on Monte Carlo
assessment and offer evidence to support a 'two-step' strategy in which lag lengths
are first determined by the Schwarz Bayesian criterion or by the Akaike information
criterion and OLS is then applied. Recovery of the coefficients of the long-run
model is a re-parameterization exercise and therefore purely computational.
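The two-step strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in Microfit: it assumes a single regressor x, an intercept, simulated data, and a hypothetical helper `fit_ardl`. Lag orders are chosen by the Schwarz criterion over a common estimation sample, and the long-run coefficient is then recovered from the OLS estimates.

```python
import numpy as np

def ardl_design(y, x, p, q, start):
    """Regressors for y_t: constant, y_{t-1..t-p}, x_{t..t-q}, for t >= start."""
    n = len(y)
    cols = [np.ones(n - start)]
    cols += [y[start - i:n - i] for i in range(1, p + 1)]
    cols += [x[start - j:n - j] for j in range(q + 1)]
    return np.column_stack(cols), y[start:]

def fit_ardl(y, x, max_lag=4):
    """Step 1: pick (p, q) by the Schwarz criterion. Step 2: OLS and long-run coefficient."""
    best = None
    for p in range(1, max_lag + 1):
        for q in range(max_lag + 1):
            # The same sample (t >= max_lag) is used for every candidate model
            X, target = ardl_design(y, x, p, q, start=max_lag)
            coef, *_ = np.linalg.lstsq(X, target, rcond=None)
            resid = target - X @ coef
            n_obs, k = X.shape
            bic = n_obs * np.log(resid @ resid / n_obs) + k * np.log(n_obs)
            if best is None or bic < best["bic"]:
                best = {"bic": bic, "p": p, "q": q, "coef": coef}
    a = best["coef"][1:1 + best["p"]]      # coefficients on lagged y
    b = best["coef"][1 + best["p"]:]       # coefficients on current and lagged x
    best["long_run"] = b.sum() / (1.0 - a.sum())
    return best

# Simulated example (assumed data generating process with long-run coefficient 1.0)
rng = np.random.default_rng(0)
n = 500
x = np.cumsum(rng.normal(size=n))          # I(1) regressor
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * x[t] + 0.2 * x[t - 1] + rng.normal(scale=0.5)

result = fit_ardl(y, x)
print(result["p"], result["q"], round(result["long_run"], 3))
```

In the simulated process the true long-run coefficient is (0.3 + 0.2) / (1 - 0.5) = 1.0, so the printed estimate should lie close to one.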
Pesaran et al. (1996) offer a procedure for identifying the dependent variable in
a system containing a single cointegrating relationship. This procedure involves
computation of standard hypothesis tests, albeit with non-standard critical
values, applied to an unrestricted version of ECM (UECM):
UECM: $\Delta y_t = \sum_{i=1} \alpha_i \Delta y_{t-i} + \sum_{i=0} \beta_i' \Delta x_{t-i} + \phi y_{t-1} + \delta' x_{t-1}$ (M.4)
The joint hypothesis $\phi = 0$, $\delta' = 0$ asserts that no ECM and therefore no long-run
relationship exists. An F-test of this hypothesis is carried out using the
non-standard critical values developed by Pesaran et al. (1996). The UECM is
normalized upon a particular selection of dependent variable by omitting the current
change of this variable from the right-hand side; applying the F-test to all such
normalizations constitutes a search for the direction of causation.
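The F-statistic for the exclusion of the lagged levels can be computed from restricted and unrestricted OLS fits. The sketch below is illustrative only: it assumes a scalar x, an intercept, one lag of each differenced variable, simulated data, and a hypothetical helper `uecm_f_stat`. The non-standard critical value bounds are not reproduced here and would have to be taken from the tables of Pesaran et al. (1996).

```python
import numpy as np

def rss(X, target):
    """Residual sum of squares from an OLS fit of target on X."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return float(resid @ resid)

def uecm_f_stat(y, x, lags=1):
    """F-statistic for excluding y_{t-1} and x_{t-1} from the UECM."""
    dy, dx = np.diff(y), np.diff(x)
    m = len(dy)
    target = dy[lags:]
    n_obs = len(target)
    short_run = [np.ones(n_obs)]
    short_run += [dy[lags - i:m - i] for i in range(1, lags + 1)]   # lagged changes in y
    short_run += [dx[lags - j:m - j] for j in range(lags + 1)]      # current and lagged changes in x
    levels = [y[lags:-1], x[lags:-1]]                               # y_{t-1}, x_{t-1}
    X_r = np.column_stack(short_run)
    X_u = np.column_stack(short_run + levels)
    rss_r, rss_u = rss(X_r, target), rss(X_u, target)
    n_restr = len(levels)
    return ((rss_r - rss_u) / n_restr) / (rss_u / (n_obs - X_u.shape[1]))

# Simulated data: y is cointegrated with x, w is an unrelated random walk
rng = np.random.default_rng(1)
n = 400
x = np.cumsum(rng.normal(size=n))
w = np.cumsum(rng.normal(size=n))
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * x[t] + 0.2 * x[t - 1] + rng.normal(scale=0.5)

f_coint = uecm_f_stat(y, x)
f_indep = uecm_f_stat(w, x)
print(round(f_coint, 2), round(f_indep, 2))
```

Swapping the roles of the two series, i.e. calling `uecm_f_stat(x, y)`, gives the alternative normalization described above.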

Time Series in Economics and Finance
Tomas Cipra
Faculty of Mathematics and Physics
Charles University
Prague, Czech Republic
ISBN 978-3-030-46346-5 ISBN 978-3-030-46347-2 (eBook)

https://doi.org/10.1007/978-3-030-46347-2
Mathematics Subject Classification: 62M10, 91B84, 62M20, 62P20, 91B25, 91B30
© Springer Nature Switzerland AG 2020

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents
1 Introduction
Part I Subject of Time Series
2 Random Processes
2.1 Random Processes as Models for Time Series
2.2 Specific Problems of Time Series Analysis
2.2.1 Problems of Economic and Financial Data Observed in Time
2.2.2 Methodological Problems
2.2.3 Problems with Construction of Predictions
2.3 Random Processes with Discrete States in Discrete Time
2.3.1 Binary Process
2.3.2 Random Walk
2.3.3 Branching Process
2.3.4 Markov Chain
2.4 Random Processes with Discrete States in Continuous Time
2.4.1 Poisson Process
2.4.2 Markov Process
2.5 Random Processes with Continuous States in Continuous Time
2.5.1 Goniometric Function with Random Amplitude and Phase
2.5.2 Wiener Process
2.6 Exercises
Part II Decomposition of Economic Time Series
3 Trend
3.1 Trend in Time Series
3.1.1 Subjective Methods of Elimination of Trend
3.1.2 Trend Modeling by Mathematical Curves
3.2 Method of Moving Averages
3.2.1 Construction of Moving Averages by Local Polynomial Fitting
3.2.2 Other Types of Moving Averages
3.3 Exponential Smoothing
3.3.1 Simple Exponential Smoothing
3.3.2 Double Exponential Smoothing
3.3.3 Holt’s Method
3.4 Exercises
4 Seasonality and Periodicity
4.1 Seasonality in Time Series
4.1.1 Simple Approaches to Seasonality
4.1.2 Regression Approaches to Seasonality
4.1.3 Holt–Winters’ Method
4.1.4 Schlicht’s Method
4.2 Tests of Periodicity
4.3 Transformations of Time Series
4.3.1 Box–Cox Transformation
4.3.2 Transformation Based on Differencing
4.4 Exercises
5 Residual Component
5.1 Tests of Randomness
5.1.1 Test Based on Signs of Differences
5.1.2 Test Based on Turning Points
5.1.3 Test Based on Kendall Rank Correlation Coefficient τ
5.1.4 Test Based on Spearman Rank Correlation Coefficient ρ
5.1.5 Test Based on Numbers of Runs Above and Below Median
5.2 Exercises
Part III Autocorrelation Methods for Univariate Time Series
6 Box–Jenkins Methodology
6.1 Autocorrelation Properties of Time Series
6.1.1 Stationarity
6.1.2 Autocovariance and Autocorrelation Function
6.1.3 Estimated Autocovariance and Autocorrelation Function
6.1.4 Partial Autocorrelation Function and Its Estimate
6.2 Basic Processes of Box–Jenkins Methodology
6.2.1 Linear Process
6.2.2 Moving Average Process MA
6.2.3 Autoregressive Process AR
6.2.4 Mixed Process ARMA
6.3 Construction of Models by Box–Jenkins Methodology
6.3.1 Identification of Model
6.3.2 Estimation of Model
6.3.3 Verification of Model
6.4 Stochastic Modeling of Trend
6.4.1 Tests of Unit Root
6.4.2 Process ARIMA
6.5 Stochastic Modeling of Seasonality
6.6 Predictions in Box–Jenkins Methodology
6.7 Long Memory Process
6.8 Exercises
7 Autocorrelation Methods in Regression Models
7.1 Dynamic Regression Model
7.2 Linear Regression Model with Autocorrelated Residuals
7.2.1 Durbin–Watson Test
7.2.2 Breusch–Godfrey Test
7.2.3 Construction of Linear Regression Model with ARMA Residuals
7.3 Distributed Lag Model
7.3.1 Geometric Distributed Lag Model
7.3.2 Polynomial Distributed Lag Model
7.4 Autoregressive Distributed Lag Model
7.4.1 Intervention Analysis
7.4.2 Outliers
7.5 Exercises
Part IV Financial Time Series
8 Volatility of Financial Time Series
8.1 Characteristic Features of Financial Time Series
8.2 Classification of Nonlinear Models of Financial Time Series
8.3 Volatility Modeling
8.3.1 Historical Volatility and EWMA Models
8.3.2 Implied Volatility
8.3.3 Autoregressive Models of Volatility
8.3.4 ARCH Models
8.3.5 GARCH Models
8.3.6 Various Modifications of GARCH Models
8.4 Exercises
9 Other Methods for Financial Time Series
9.1 Models Nonlinear in Mean Value
9.1.1 Bilinear Models
9.1.2 Threshold Models SETAR
9.1.3 Asymmetric Moving Average Models
9.1.4 Autoregressive Models with Random Coefficients RCA
9.1.5 Double Stochastic Models
9.1.6 Switching Regimes Models MSW
9.2 Further Models for Financial Time Series
9.2.1 Nonparametric Models
9.2.2 Neural Networks
9.3 Tests of Linearity
9.4 Duration Modeling
9.5 Exercises
10 Models of Development of Financial Assets
10.1 Financial Modeling in Continuous Time
10.1.1 Diffusion Process
10.1.2 Ito’s Lemma and Stochastic Integral
10.1.3 Exponential Wiener Process
10.2 Black–Scholes Formula
10.3 Modeling of Term Structure of Interest Rates
10.4 Exercises
11 Value at Risk
11.1 Financial Risk Measures
11.1.1 VaR
11.1.2 Other Risk Measures
11.2 Calculation of VaR
11.3 Extreme Value Theory
11.3.1 Block Maxima
11.3.2 Threshold Excesses
11.4 Exercises
Part V Multivariate Time Series
12 Methods for Multivariate Time Series
12.1 Generalization of Methods for Univariate Time Series
12.2 Vector Autoregression VAR
12.3 Tests of Causality
12.4 Impulse Response and Variance Decomposition
12.5 Cointegration and EC Models
12.6 Exercises
13 Multivariate Volatility Modeling
13.1 Multivariate Models EWMA
13.2 Implied Mutual Volatility
13.3 Multivariate GARCH Models
13.3.1 Models of Conditional Covariance Matrix
13.3.2 Models of Conditional Variances and Correlations
13.3.3 Factor Models
13.3.4 Estimation of Multivariate GARCH Models
13.4 Conditional Value at Risk
13.5 Exercises
14 State Space Models of Time Series
14.1 Kalman Filter
14.1.1 Recursive Estimation of Multivariate GARCH Models
14.2 State Space Model Approach to Exponential Smoothing
14.3 Exercises
References
Index
Chapter 1
Introduction
Most data in economics and finance are observed in time (sometimes even online in
real time) so that they have the character of time series. This monograph presents
methods currently used for analysis of data in this context. Such methods are
available not only in many monographs, textbooks, or papers but also in various
journals or working papers, case studies, or guides to the corresponding software
systems. This text tries to bring together as many methods as possible to cover the
most recommended instruments for analysis and prediction of dynamic data in
economics and finance.
The objective of this book is practical applicability. Therefore, it centers on
the description of methods used in practice (both simple and computationally
complex ones). Their derivations are often concise or, particularly in
more complicated cases, omitted, but the text always refers to easily available sources. In any
case, many numerical examples illustrate the theory by means of real data, which are
usually chosen to be characteristic of the presented methodology.
Selected parts of the text are suitable for university programs (undergraduate,
graduate, or doctoral) in econometrics or computational finance as study,
training, or reference materials. Moreover, because it surveys current
methods and approaches, the book can serve as a reference text in research work. On
the other hand, it can also be recommended to practitioners dealing with the analysis of data
in economics and finance (banks, exchanges, energy planning, currency and
commodity markets, insurance, statistical offices, demography, and others).
The presented material requires mostly the application of suitable software.
Fortunately, the corresponding programs are easily available since they can be
found in libraries of common statistical or financial software systems (R Statistical
Software, MATLAB, EViews, and others can be recommended). There are several
reasons supporting ready-made software: (1) calculations (e.g., in Excel) are usually
troublesome (particularly for users with superficial knowledge of programming);
(2) software manuals are usually helpful in various individual situations, and,
moreover, the parameters of programs are preset as default values suitable for the
immediate (routine) application; and (3) when browsing through the offer of
software systems, one discovers other methods or modifications which can be useful
for the problem at hand. On the other hand, the potential user should not be merely a
software consumer who inherits all the drawbacks of the given software product. Moreover,
qualified users should be capable of interpreting the computer outputs properly,
which requires understanding the principles of the chosen methods.
The monograph consists of several parts divided into particular chapters:
Part I (Subject of time series, Chap. 2) deals with the subject of time series which
are looked upon as trajectories of random processes.
Part II (Decomposition of economic time series, Chaps. 3–5) is devoted to the
classical approach decomposing economic time series into trend, periodic (seasonal
and cyclical), and residual components. Some more advanced methods are also
addressed, e.g., tests of periodicity or randomness.
Part III (Autocorrelation methods for univariate time series, Chaps. 6 and 7)
summarizes the so-called Box–Jenkins methodology based on (linear) ARMA models
and their modifications (ARIMA, seasonal ARMA, long memory processes) for
univariate time series. Some more recent topics are also mentioned in this context
(e.g., information criteria or tests of unit root). Finally, dynamic regression models
are presented in Part III, including distributed lag models.
Part IV (Financial time series, Chaps. 8–11) confines itself to financial time series,
which require special (namely nonlinear) models and instruments due to the typical
volatility of financial data. Models nonlinear in mean and in variance are distinguished,
including tests of linearity and duration modeling. Further, Part IV addresses
the modeling of financial assets by means of diffusion processes, including the
Black–Scholes formula and modeling of the term structure of interest rates. Chapter 11
presents the highly topical subject of risk measures (value at risk and others). Extreme value
theory is also mentioned in this context, namely block maxima and threshold
excesses.
Part V (Multivariate time series, Chaps. 12–14) concludes the monograph
by considering multivariate time series. First, the popular vector
autoregression (VAR) model is presented, including tests of causality, impulse
response, variance decomposition, cointegration, and EC models. Multivariate
volatility modeling is also described, including multivariate EWMA and GARCH
models with a practical application to conditional value at risk. Finally, the
(multivariate) state space models as the background of Kalman filtering are discussed,
including the state space model approach to exponential smoothing.
Some parts of this monograph serve as lecture notes for courses on time series
analysis and econometrics at the Faculty of Mathematics and Physics of Charles
University in Prague (which is also why some real data used in practical
examples are taken from the Czech economy and finance).
Acknowledgment The author thanks Dr. Radek Hendrych for various forms of help. The
research work contained in the monograph was supported by the grant 19-28231X provided by
the Grant Agency of the Czech Republic.
Part I
Subject of Time Series
Chapter 2
Random Processes
2.1 Random Processes as Models for Time Series
Data typical for economic and financial practice are time data, i.e., values of an
economic variable (or variables in the multivariate case) observed in a time interval with
a given frequency of records (each trading day, at moments of transactions, monthly,
etc.). The frequency of records is understood either as the length of the intervals
between particular observations (e.g., calendar months) or as the regularity of
observations (e.g., each trading day). As to regularity, financial data are often
irregularly observed (irregularly spaced data): e.g., stock prices on stock
exchanges are usually quoted at moments of transactions from the opening to the closing
time of the trading day, the frequency of transactions usually being lower in the morning
after opening, during lunch time, and later in the afternoon before closing
(a possible approach in such a situation assigns the closing or prevailing price to
this day). The important property of time data is that they are ordered
chronologically in time.
The term time series denotes any sequence of data y_1, . . ., y_n ordered
chronologically in time. This could justify the simplifying view of a time series as a set of
numbers ordered in time (historically this was the case, e.g., for astronomic
observations). However, a very important aspect of time series is not only their dynamics but
also their randomness. To be adequate, the analysis of time series should
apply models that are based on stochastic principles (i.e., on probability theory)
and are capable of generating time sequences that are stochastically similar
to the trajectory we observe. Such models are called random
processes and can be viewed as specific algorithms based on random number
generators. Knowing the algorithm that generated the observed time series
as one output among many possible realizations may be highly useful for examining our
specific data.
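The view of a random process as an algorithm driven by a random number generator can be illustrated with a short sketch. The symmetric random walk, the seed, and the function name `random_walk` are assumptions chosen for the example; any other generating mechanism could be substituted.

```python
import numpy as np

def random_walk(n_steps, rng):
    """One realization of a symmetric random walk started at zero."""
    steps = rng.choice([-1, 1], size=n_steps)      # independent +1/-1 increments
    return np.concatenate(([0], np.cumsum(steps)))

rng = np.random.default_rng(42)
# The same algorithm produces a different time series on every run
paths = np.array([random_walk(250, rng) for _ in range(5)])
print(paths.shape)
print(paths[:, -1])   # terminal values differ across realizations
```

Each row of `paths` plays the role of one observed time series, while the function `random_walk` plays the role of the underlying random process.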
Random process (or stochastic process) {Y_t, t ∈ T} is a set (or family) of random
variables on the same probability space (Ω, ℑ, P) indexed by values t from
T (T ⊂ R), where t is interpreted as time. According to the form of the index set T,
which is a subset of the real line R, one distinguishes:
• Random process in continuous time: T is an interval in the real line, e.g., T = [0, ∞),
i.e., {Y_t, t ≥ 0}.
• Random process in discrete time: T is formed by discrete real values, e.g., T = N_0,
i.e., {Y_0, Y_1, Y_2, . . .}.
According to the states of random variables Yt (i.e., according to the state space S)
one also distinguishes:
• Random process with discrete states: e.g., counting process Yt 2 N0 for all t 2 T
that registers the number of specified events in time.
• Random process with continuous states: e.g., real process Yt 2 R or nonnegative
process Yt 2 h0, 1) for all t 2 T.
• Multivariate random process: Yt is an m-variate random vector Yt for all t 2 T.
In any case, one can observe only trajectories (realizations) of random processes.
Such a trajectory arises by a choice of an elementary event ω 2 Ω and is a
deterministic function of time {Yt (ω), t 2 T} observable due to this specific choice.
One denotes trajectories simply as {yt, t 2 T} in discrete time and {y(t), t 2 T} in
continuous time.
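The view of a random process as an algorithm driven by a random number generator can be sketched in a few lines. This is only an illustration: the chosen process (a counting process in discrete time built from Poisson event counts), the function name counting_trajectory, the rate 0.3, and the use of the generator seed as a stand-in for the elementary event ω are assumptions made for the example.

```python
import numpy as np

def counting_trajectory(seed: int, n: int = 50, rate: float = 0.3) -> np.ndarray:
    """Return one trajectory {y_t, t = 1..n} of a counting process.

    The seed stands in for the elementary event omega (an assumption for
    illustration): once it is fixed, the trajectory is a deterministic
    function of time.
    """
    rng = np.random.default_rng(seed)
    events = rng.poisson(lam=rate, size=n)  # number of events in each time step
    return np.cumsum(events)                # cumulative count, values in N0

# Three realizations of the same random process
paths = {seed: counting_trajectory(seed) for seed in (1, 2, 3)}
```

Each entry of paths is one observable realization; the random process itself is the rule that produces them.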
Remark 2.1 Unfortunately in the literature (and also in this text), it is common that
the term time series is interpreted sometimes as the trajectory and sometimes as the
random process. The real meaning follows from the context.
⋄
2.2 Specific Problems of Time Series Analysis
The general objective of time series analysis, including applications in economics
and finance, is to construct an adequate model of the underlying random process.
Such a model usually enables:
• to understand the mechanism (or algorithm) that has generated the observed time
series
• to test hypotheses of a priori expectations and conjectures in a statistically
credible way (e.g., whether the stock markets show a long-term growth)
• to predict (or to forecast or to extrapolate) the future development of the system
(e.g., which interest or currency rates one can expect the next month with a given
confidence)
• to control and optimize the dynamics of a system including adjustment of
parameters, initial conditions, self-regulations, and the like (e.g., how to set up
parameters of pension reform)
In any case, data in the form of time series have many specific features. These
features can help in analyzing such data, but they can also cause complications that
must be overcome by suitable procedures and adjustments. The next section will
present examples of specific problems that are typical for time series analysis.
2.2.1 Problems of Economic and Financial Data Observed in Time
Economic and financial data observed in time typically feature problems implied just
by their time character:
2.2.1.1 Problems Due to Choice of Observation Time Points
Time series in discrete time that prevail in economics and finance usually arise in the
following ways:
• They are discrete by nature (e.g., daily interbank LIBOR rates).
• One discretizes time series in continuous time (e.g., closing quotations on stock
exchanges assigned to the ends of particular trading days in the context of
continuous trading).
• One accumulates (aggregates) values over given time intervals (e.g., accumu-
lated sums of insurance benefits paid out in particular quarters); often one pro-
duces averages instead of aggregates.
In some cases, one cannot select the time points of observations oneself.
However, if such a possibility exists, one should pay careful attention to it. It often
means that one must find a trade-off among contradictory requirements: on the one
hand, because of numerical complexity, the records of a continuous process should not
be too dense (see, e.g., ultra-high-frequency data UHFD in finance); on the other
hand, the data should not be so sparse that one cannot identify some
characteristic features of the given process (e.g., if we are interested in seasonal
fluctuations, we need at least several observations during each year). As far as
the distance between neighboring observations is concerned, it is common to
observe data regularly at equidistant time points. In finance, by contrast, irregularly
spaced data are not unusual due to irregularities in market trading (see
Sect. 2.1).
2.2.1.2 Problems Due to Calendar
Nature is responsible for a minor part of the problems caused by the calendar (e.g.,
the number of days in one solar year is not an integer, and various geographic zones
require time shifts). However, the major part of calendar problems is due to human
conventions, which give us, e.g.,
• Different lengths of calendar months
• Four or five weekends monthly
• Different numbers of working or trading days monthly
• Moving holidays (e.g., Easter once in the first quarter and next time in the
second one)
• Wintertime or summertime
Such irregularities must be taken into account in an adequate way, e.g., differ-
ences in security trading or in quality of produced cars at the beginning, middle, and
end of particular weeks, different times to maturities quoted on some security
exchanges as the third Friday of particular months, and others. In practice, one
usually applies simple methods eliminating these undesirable phenomena. Several
examples follow:
• Calendar conventions are common in the framework of simple interest and
discount models (e.g., the calendar Euro-30/360 introduces months with
30 days and years with 360 days).
• If comparing the monthly production of some product (e.g., cars), the volumes are
adjusted using a so-called standard month with 30 days: in such a case, one should
multiply the January production by the coefficient 30/31, the February production in
a common year by 30/28 and in a leap year by 30/29, etc. Similarly, if
comparing securities traded monthly, one should multiply the January volume
by (21/actual number of January trading days), the February volume by (21/actual
number of February trading days), etc., as the average annual number of trading
days is 252, i.e., 21 monthly.
• Some short-term calendar irregularities can be eliminated by means of accumu-
lation. For instance, if it suffices to analyze data accumulated annually instead of
original quarterly data, then some calendar problems (e.g., seasonal fluctuations,
moving Easter, and others) can be reduced in this natural way.
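The standard-month and trading-day adjustments described above can be sketched as follows. The function names and the production figures are hypothetical and serve only as an illustration; the coefficients 30/(days in month) and 21/(trading days in month) are those given in the text.

```python
import calendar

def standard_month_factor(year: int, month: int, standard_days: int = 30) -> float:
    """Coefficient that rescales a monthly total to a standard month of 30 days."""
    days_in_month = calendar.monthrange(year, month)[1]
    return standard_days / days_in_month

def trading_day_factor(actual_trading_days: int, standard_trading_days: int = 21) -> float:
    """Coefficient that rescales a monthly traded volume to 21 trading days."""
    return standard_trading_days / actual_trading_days

# Hypothetical monthly production figures (illustrative numbers only)
production = {(2023, 1): 3100.0, (2023, 2): 2800.0, (2024, 2): 2900.0}

# January is scaled by 30/31, February by 30/28 (common year) or 30/29 (leap year)
adjusted = {key: value * standard_month_factor(*key) for key, value in production.items()}
```

With these illustrative figures, all three months correspond to the same volume per standard month, although the raw totals differ.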
In addition to calendar problems, one must frequently face irregularities in
time series that are consequences of operational risk (blackouts, web outages,
human failures including fraud, etc.). Irregularities of this type are
classified as outliers, and statistical methods for time series with outliers should be
robustified to become insensitive to such outlying values. Another type of
irregularities are jumps resulting from interventions (e.g., a successful advertising
campaign, a decision of the bank council to decrease key interest rates, new
legislation, and so on).
2.2.1.3 Problems Due to Length of Time Series
The length of time series is the number n of observations of the given time series (not
the time range between the beginning and the end of time series). Therefore, e.g., a
monthly time series over 10 years has the length of 120. It is logical that the volume of
information available for analysis increases with the increasing length of time series.
However, the length of time series is not the only measure of information contained
in the time series (e.g., doubling the length of a time series by halving the original time
intervals between neighboring points of observation usually does not double the
information on this time series): one must also consider the inner
structure of the given time series.
As far as the length of time series is concerned, a reasonable trade-off is usually
necessary in practice. On the one hand, some time series methods require a sufficient
length of series (e.g., the routine application of Box–Jenkins methodology is not
recommended for time series shorter than 50). On the other hand, characteristic
features of long time series usually change in time so that the construction of
an adequate model becomes more complex with increasing length of time series.
Similarly, the typical problem in longer time series originates due to the fact that
the measurements in the beginning of the given time series need not be comparable
with the ones in its end, e.g., due to inflation, price growth, technical development,
and the like. In such a case, one should adjust data by means of a suitable index
(in practice, it can be not only the inflation rate but also the salary growth for time
series used in formulas of pay-as-you-go pension systems and the like).
2.2.2 Methodological Problems
The choice of a suitable time series method depends on various factors, e.g.,
• Objective of analysis, mainly the identification of the generating model, hypothesis
testing, prediction, control, and optimization (see the introduction to
Sect. 2.2); in this context, it is also relevant how the results of the analysis will be
exploited in practice, what the costs of the analysis will be, what the volume of
analyzed data is, and the like.
• Type of time series, since some methods are not suitable universally for all time
series (e.g., it makes no sense to apply a Box–Jenkins model to an economic time
series of ten annual observations that show an apparent linear growth).
• Experience of the analyst who is responsible for the analysis, and the software that
will be used for the analysis.
The most popular methods and procedures of time series analysis are the
following ones:
2.2.2.1 Decomposition of Time Series
Reality shows that time series of economic character can usually be decomposed into
several specific components, namely
• Trend component Trt (see Chap. 3)
• Seasonal component It (see Sect. 4.1)
• Cyclical component Ct (see Sect. 4.2)
• Residual (random, irregular) component Et (see Chap. 5)
This decomposition is motivated by the expectation that particular components
will show some regular features more distinctly than the original (compound) time
series. The classical decomposition regards the trend, seasonal, and cyclical components
as deterministic functions of time and the residual component as a stochastic
function of time (i.e., as a random process). These unobservable functions have
distinctive features:
Trend presents long-term changes in the level of time series (e.g., a long-term
increase or decrease). One can imagine that the trend component originates as a
consequence of forces acting in the same direction. For instance, the interrelated
forces causing the growing mortgage volumes are higher demands of some segments
of population, salary movements, higher market rents, changes in real estate market,
and the like. The trend component has a relative character: the climate changes that
economists perceive as long-term movements are from the point of view of clima-
tologists only short-term deviations.
Seasonal component describes periodic changes in time series that take place within
one calendar year and repeat themselves each year. These changes are caused by the
rotation of seasons, and they affect significantly most economic activities (typical
seasonal phenomena are, e.g., agricultural production, unemployment, accident rate
of cars, sale volumes, deposit withdrawals, and the like). Mainly monthly and
quarterly data are typical for seasonal analysis of economic time series. The semi-
annual observations present the lowest frequency (denoted as Nyquist frequency in
spectral analysis of time series) that enables the statistical identification of season-
ality. The seasonal structure varies in time, e.g., the global warming reduces the
winter drops in building industry. From the practical point of view, the seasonal
elimination is usually necessary for obtaining inferences from economic time series;
the government statistical offices (in the EU, USA, and elsewhere) must publish time
series important for national economics both before the seasonal elimination and
after it. Special software products deal professionally with seasonality (e.g., the
software systems X-12-ARIMA or X-13ARIMA-SEATS used by the U.S. Census
Bureau).
Cyclical component is the most controversial component of time series. Some
authors avoid denoting this component as cyclical (or even periodic), and they speak
rather of fluctuations around the trend, where increase phases (booms) alternate with
decrease phases (recessions). The length of particular cycles, i.e., the distance
between the neighboring upper turning points (i.e., local maxima) or between the
neighboring lower turning points (i.e., local minima), is usually variable, and also the
intensity of particular phases of each cycle can vary in time. The cyclical behavior
can be caused by evident external effects, but sometimes its causes are difficult to
find. The typical representative of this component is the so-called business cycle, which
is a (regular) alternation of booms and recessions (see above); the length of
business cycles generally ranges from 5 to 7 years. The elimination of the cyclical
component is usually complex both for factual reasons (it is not easy to find
the causes of its origin) and for computational reasons (its character can vary in time,
as in the case of the seasonal component). Sometimes the seasonal and cyclical
components are collectively called periodic components of time series.
Residual component (called also random or irregular component) remains in
time series after eliminating trend and periodic components. It is formed by random
movements (or fluctuations) of time series which have no recognizable systematic
character. Therefore, it is not included among the systematic components described
above. The residual component also covers the measurement and rounding errors
and the errors made when modeling the given time series. In order to justify some
statistical procedures used for the classical decomposition, one usually assumes that
the residual component is so-called white noise (or even the normally distributed
white noise). Here the term white noise denotes a sequence {εt} of uncorrelated
random variables with zero mean value and constant (finite) variance σ² > 0:
$$E(\varepsilon_t) = 0, \quad \mathrm{var}(\varepsilon_t) = \sigma^2 > 0, \quad \mathrm{cov}(\varepsilon_s, \varepsilon_t) = 0 \ \text{ for } s \neq t \tag{2.1}$$
(some authors demand for the white noise even stronger assumptions written usually
as εt ~ iid(0, σ²), where εt are independent and identically distributed random
variables with zero mean value and constant variance). The name “white noise” derives
from the spectral analysis and refers to the property of a constant spectrum with
equal magnitude at all frequencies (or wavelengths) similarly as in the white light in
optics. The values εt are also called innovations as they correspond to unpredictable
movements (shocks) in time series.
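The white noise properties (2.1) can be checked empirically on simulated data. The sketch below assumes Gaussian iid noise (a special case of white noise), a standard deviation of 2, and a sample of 10,000 values; all of these choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0                                   # assumed standard deviation
eps = rng.normal(0.0, sigma, size=10_000)     # Gaussian iid noise, a special case of (2.1)

sample_mean = eps.mean()                      # should be close to 0
sample_var = eps.var()                        # should be close to sigma**2 = 4

# Sample autocorrelation at lag 1; close to zero for uncorrelated values
centered = eps - sample_mean
r1 = np.sum(centered[1:] * centered[:-1]) / np.sum(centered ** 2)
```

The sample mean, variance, and lag-1 autocorrelation only approximate the theoretical values 0, σ², and 0; the deviations shrink as the sample grows.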
Obviously, one can look upon the given economic time series as a trend linked
with periodic components (i.e., seasonal and cyclical ones) and white noise. More-
over, the decomposition can be either additive or multiplicative:
Additive decomposition has the form
$$y_t = Tr_t + C_t + I_t + E_t. \tag{2.2}$$
In the additive decomposition, all components are measured in the units of the
time series yt, i.e., all components are absolute ones (not relative ones measured, e.g.,
as percent of the trend).
Multiplicative decomposition has the form
$$y_t = Tr_t \, C_t \, I_t \, E_t. \tag{2.3}$$
In the multiplicative decomposition, only the trend component is usually measured
in the units of the time series yt, i.e., it is the absolute one. The others are then
considered relative to the trend. For example, I1 = 1.15 means that the value of
time series explained by the trend and seasonal components is 1.15Tr1 in time
t = 1. Obviously, the logarithmic transformation converts the multiplicative
decomposition to the additive one, and vice versa, by means of the exponential
transformation (one must pay attention to the changes of statistical properties of the
transformed residual component in such a case).
Fig. 2.1 Additive decomposition of time series (panels: Trt; Ct; Trt + Ct; It; Trt + Ct + It; Et; Trt + Ct + It + Et; horizontal axis: time t)
If the observations are y1, . . ., yn, then
$$\hat{y}_t = \widehat{Tr}_t + \hat{C}_t + \hat{I}_t \quad \text{or} \quad \hat{y}_t = \widehat{Tr}_t \, \hat{C}_t \, \hat{I}_t \tag{2.4}$$
is the smoothed time series (for t ≤ n), or the prediction of time series (for t > n),
based on the calculated values of systematic components, or on the extrapolated
ones, respectively. Obviously, some systematic components can be missing in the
decomposition of various economic time series, e.g., the series yt ¼ Trt + Et does not
contain any periodic components at all. Figure 2.1 shows the scheme of additive
decomposition in a graphical way.
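A small numerical sketch may help to see how the logarithm turns a multiplicative decomposition into an additive one and how smoothed values arise from estimated systematic components. Everything specific here is an assumption for illustration: the series is synthetic, it is quarterly, it has no cyclical component, its trend is exponential, and the components are estimated by ordinary least squares with seasonal dummy variables (one of several possible estimation methods).

```python
import numpy as np

rng = np.random.default_rng(1)
n, period = 48, 4                              # 12 years of quarterly data (assumed)
t = np.arange(n)

trend = 100.0 * np.exp(0.01 * t)               # Tr_t in the units of the series
seasonal = np.tile([1.15, 0.95, 0.90, 1.00], n // period)  # I_t relative to the trend
noise = np.exp(rng.normal(0.0, 0.02, n))       # multiplicative E_t
y = trend * seasonal * noise                   # multiplicative form without C_t

# Logarithms turn the product into a sum: log y = log Tr + log I + log E
log_y = np.log(y)

# Regress log y on a linear trend and one dummy variable per quarter
dummies = np.eye(period)[t % period]
X = np.column_stack([t, dummies])
coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)

log_fit = X @ coef                             # smoothed values on the additive (log) scale
y_smoothed = np.exp(log_fit)                   # smoothed series on the original scale
residual = log_y - log_fit                     # estimate of log E_t
```

The first entry of coef estimates the growth rate of the trend on the log scale; the smoothed series y_smoothed plays the role of the multiplicative variant in (2.4) for t ≤ n.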
2.2.2.2 Box–Jenkins Methodology
The decomposition methods are based mainly on the analysis of systematic components
of time series (i.e., the trend, seasonal, and cyclical components), and they
regard the particular observations as uncorrelated. The typical statistical instrument
here is regression analysis. In contrast, Box–Jenkins methodology takes
the residual component (i.e., the component of random character) as the basis for
the construction of time series models. The key statistical instrument here is
correlation analysis, and therefore, this methodology can deal successfully with
mutually correlated observations (see Chap. 6). Its formal principles were formulated
by Box and Jenkins (1970).
For instance, one of the simplest models of Box–Jenkins methodology is the so-called
moving average process of the first order denoted as MA(1) [see also (6.10)]. It is
suitable for time series where all observations are mutually uncorrelated except
for direct neighbors. This model can have the following concrete form:
$$y_t = \varepsilon_t + 0.7\,\varepsilon_{t-1}, \tag{2.5}$$
where yt is the modeled time series and εt is the white noise (2.1). Other types of
models applied in the framework of Box–Jenkins methodology are the so-called
autoregressive processes AR [see (6.31)] and processes ARMA [see (6.45)].
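The statement that only direct neighbors are correlated in the MA(1) model (2.5) can be illustrated by simulation. The sketch assumes Gaussian white noise with unit variance and a sample of 20,000 values; the helper acf is a hypothetical name introduced here. For an MA(1) process with coefficient θ, the theoretical lag-1 autocorrelation is θ/(1 + θ²), while the autocorrelations at higher lags are zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta = 20_000, 0.7
eps = rng.normal(0.0, 1.0, n + 1)       # white noise (2.1), Gaussian by assumption
y = eps[1:] + theta * eps[:-1]          # y_t = eps_t + 0.7 * eps_{t-1}

def acf(x: np.ndarray, lag: int) -> float:
    """Sample autocorrelation of x at a positive lag."""
    c = x - x.mean()
    return float(np.sum(c[lag:] * c[:-lag]) / np.sum(c ** 2))

rho1_theory = theta / (1.0 + theta ** 2)   # about 0.47 for theta = 0.7
r1 = acf(y, 1)                             # sample value, close to rho1_theory
r2 = acf(y, 2)                             # sample value, close to zero
```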
At first sight it could seem that the attention devoted by this methodology to the
random component is excessive and that one loses the possibility to model
nonstationary time series with evident trend or seasonal character (the so-called
stationarity of a time series means that the behavior of this series is stable in a
specific way; see Sect. 6.1). However, Box–Jenkins methodology can also
handle these cases by means of so-called integrated processes ARIMA and
seasonal processes SARIMA, where the trend or seasonal components are modeled
in a stochastic way (in contrast to the deterministic modeling used in the
classical decomposition approach). For instance, in the very simple model ARIMA(0, 1, 0)
$$y_t = y_{t-1} + \varepsilon_t, \tag{2.6}$$
the stochastic trend can be characterized in such a way that its increments over
particular observation intervals are random in the form of white noise (which
explains why the process (2.6) is called the random walk). Due to this stochastic
approach, Box–Jenkins methodology is very flexible and can model satisfactorily
even non-standard time series that are unmanageable by the classical decomposition
approach.
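The random walk (2.6) is easy to simulate, and the simulation shows directly why first differencing removes its stochastic trend. The sketch assumes Gaussian white noise and the starting value y0 = 0, neither of which is fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
eps = rng.normal(0.0, 1.0, n)     # white noise, Gaussian by assumption
y = np.cumsum(eps)                # y_t = y_{t-1} + eps_t with y_0 = 0 (assumed start)

# First differences y_t - y_{t-1} recover the white noise increments
increments = np.diff(y)
```

The series y wanders without returning to a fixed level, while its increments behave like the white noise (2.1); this is the sense in which the trend is modeled stochastically.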
2.2.2.3 Analysis of Multivariate Time Series
In the analysis of multivariate time series, one models several time series simultaneously,
including relations and correlations among them (see Chap. 12). Causality
relations among various economic variables modeled dynamically in time can then be
addressed in this context. Another important phenomenon here is the so-called
cointegration, when particular (univariate) time series from a multivariate model
have a common stochastic trend that can be eliminated completely by combining
the particular time series in a suitable way (see Sect. 12.5). A popular instrument for
modeling multivariate time series is the process VAR (vector autoregression; see
Sect. 12.2).
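The idea of a common stochastic trend that disappears in a suitable combination can be illustrated on synthetic data. In the sketch below, two series are built from the same random walk, so the combination y − 2x is free of the trend by construction. The coefficients 1 and 2, the noise levels, and the variable names are assumptions for the example; in practice, the eliminating combination is unknown and must be estimated.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000
common_trend = np.cumsum(rng.normal(0.0, 1.0, n))   # shared stochastic trend (random walk)

x = common_trend + rng.normal(0.0, 0.5, n)          # first series
y = 2.0 * common_trend + rng.normal(0.0, 0.5, n)    # second series, twice the trend

# The combination y - 2x cancels the common trend and leaves only stationary noise
spread = y - 2.0 * x
```

Both x and y drift far from their starting level, whereas spread fluctuates around zero with a stable variance.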
2.2.2.4 Spectral Analysis of Time Series
The three approaches presented above can be summarized as time series analysis in the
time domain. A distinct approach, which regards the examined time series as an
(infinite) mixture of sine and cosine waves with different amplitudes and frequencies
(according to the Wiener–Khinchin theorem for stationary time series), is
time series analysis in the spectral domain, called briefly the spectral analysis of time
series (sometimes one also speaks more generally of Fourier analysis). Applying
special statistical instruments, e.g., the periodogram or spectral density, one can obtain
in this context the distribution of intensities of particular
frequencies in the examined time series (the so-called spectrum of time series), identify
which of its frequencies are the most intensive ones, estimate the
corresponding periodic components, etc.
The spectral analysis is important for applications in engineering (vibrograms,
technical diagnostics, seismograms) and biology (electrocardiograms). On the other
hand, it is not usual for economic time series [except for the tests of periodicity (see
Sect. 4.2) or the investigation of cycles in economics (see Hatanaka 1996)]. In any
case, a deeper study of theoretical backgrounds of this approach to time series
demands special references, e.g., monographs Koopmans (1995) or Priestley (2001).
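As a minimal illustration of the spectral view, the following sketch computes a raw periodogram with the fast Fourier transform and reads off the most intensive frequency. The synthetic monthly series with a 12-month cosine wave, the noise level, and the scaling of the periodogram (conventions differ between references) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(5)
n, period = 240, 12                       # 20 years of monthly data (assumed)
t = np.arange(n)
y = 2.0 * np.cos(2.0 * np.pi * t / period) + rng.normal(0.0, 1.0, n)

# Raw periodogram at the Fourier frequencies k/n, k = 0, ..., n/2
centered = y - y.mean()
spectrum = np.abs(np.fft.rfft(centered)) ** 2 / n
freqs = np.fft.rfftfreq(n, d=1.0)         # in cycles per observation

peak = int(np.argmax(spectrum))           # index of the most intensive frequency
dominant_period = 1.0 / freqs[peak]       # length of the corresponding cycle
```

The peak of the periodogram lies at the frequency 1/12 cycles per month, i.e., it recovers the 12-month periodic component hidden in the noisy series.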
2.2.2.5 Special Methods of Time Series Analysis
There exist plenty of methods concerning special types or aspects of time series, e.g.:
• Nonlinear models of time series: for instance, threshold models are suitable for
time series that change their character after exceeding particular threshold levels;
asymmetric models are applied for time series whose momentary development is
revised according to their previous development and, moreover, such a revision is
asymmetric in dependence on the previous growth or decline.
• Models of financial time series: these time series have various typical features,
e.g., so-called leptokurtic or heavy-tailed distribution, extreme values appearing
in clusters, high frequency of records, and others; therefore, very specific
nonlinear models are necessary for time series used in finance (e.g., models
ARCH or GARCH with conditional heteroscedasticity whose variance called
usually volatility depends in the given time on the previous behavior of time
series; see Chap. 8).
• Recursive methods in time series: these methods provide results for a new time
step (estimates, smoothed values, predictions, and others) using results from
previous time periods and adjusting them by means of new observations (in
particular, one can make use of the Kalman filter here as a formal recursive
methodology; see Chap. 14).
• Methods for time series with missing or irregular observations: in such series,
some observations are either missing (e.g., they are unobservable, erroneous,
outlying, or confidential) or are observed at irregular time intervals (e.g., due to time
irregularities in trading on markets, one must also model so-called durations
between neighboring values; see Sect. 9.4).
• Robust analysis of time series: here one identifies and eliminates the influence of
outliers that contaminate analyzed records and distort results of classical methods
(a very simple example how to robustify a classical statistical method in order to
be insensitive to outliers is to replace the arithmetic average by the median when
estimating the average level of a time series).
• Intervention analysis of time series: this analysis examines one-off impacts from
outside that can influence significantly the course of time series (e.g., an intervention
of the central bank, a successful advertising campaign, and others; see Sect. 7.4).
• Plenty of other special methods.
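The simple robustification mentioned in the list above, replacing the arithmetic average by the median, can be demonstrated in a few lines. The data values are hypothetical; the last one mimics a gross outlier such as a recording error.

```python
import statistics

# Hypothetical observations with one gross outlier (illustrative numbers only)
values = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 100.0]

mean_level = statistics.mean(values)      # pulled far upward by the single outlier
median_level = statistics.median(values)  # stays at the level of the bulk of the data
```

The median remains at 10.0, whereas the mean exceeds 20 although six of the seven values lie close to 10.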
2.2.3 Problems with Construction of Predictions
The construction of predictions is one of the important objectives of time series
analysis. Some analysts compare particular models solely according to the accuracy
of generated predictions (other criteria need not be satisfactory for preferred
models in such a case).
Predictions are particularly important in finance. Financial
management often deals with long-term liabilities and investments whose results will be
known only in the distant future, and therefore acceptable predictions play the key role
in this context, e.g.,
• Prediction of profitability and risk of a given investment portfolio in the next year
• Prediction of volatility of bond yields over the next 5 years
• Prediction of stock prices for the next trading day on stock exchanges
• Short-term prediction of correlations among American and European stock
markets
• Long-term prediction of volumes of credit defaults in commercial banks
• Prediction of real estate prices according to their characteristics
In this section, we mention only some general aspects concerning predictions in
time series. Specific prediction methods will be described later after introducing
particular time series models.
2.2.3.1 Point Prediction and Interval Prediction
This classification of predictions holds not only for time series, but it is also
common, e.g., in the econometric regression analysis:
Point prediction is the quantity that presents a numerical estimate of a future value
of time series which is optimal in a certain sense (i.e., the estimate of the time series
value at a so far unobserved future time point). For instance, the point prediction of
the EUR/USD exchange rate three months ahead, made now, is 1.0635.
Obviously, a point prediction is always subject to error, so it must be treated
with caution.
Interval prediction is the prediction interval which is quite analogous to the
confidence interval used in mathematical statistics; the only difference consists in
the fact that one estimates an unknown (future) value of time series instead of an
unknown parameter in this case. For instance, the 95% prediction interval presents
the lower and upper bounds for the range in which the corresponding future value of
time series will lie with the probability of 0.95. Let us consider again the previous
example with the exchange rate EUR/USD: if the corresponding 95% prediction
interval is (1.0605; 1.0665), then, e.g., a European company can expect with high
confidence that it obtains for each euro at least 1.0605 dollars. From the practical
point of view, the interval predictions seem to be more useful for users than the point
predictions.
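How a prediction interval relates to a point prediction can be sketched under a common simplifying assumption, namely normally distributed prediction errors with a known standard error. The function name prediction_interval and the standard error 0.00153 are hypothetical; the latter was chosen only so that the result matches the EUR/USD example above.

```python
from statistics import NormalDist

def prediction_interval(point: float, std_error: float, level: float = 0.95):
    """Symmetric prediction interval, assuming normally distributed prediction errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for level = 0.95
    return point - z * std_error, point + z * std_error

# Point prediction from the text; the standard error is a hypothetical value
lower, upper = prediction_interval(1.0635, 0.00153)
```

Rounded to four decimals, the bounds are 1.0605 and 1.0665. A larger standard error or a higher level widens the interval around the same point prediction.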
2.2.3.2 Quantitative Prediction and Qualitative Prediction
This classification has nothing in common with the classification of variables (a
quantitative variable can be measured on a numeric scale, e.g.,
stock quotes, while a qualitative variable has no natural or logical order, e.g., a
variable with values spring, summer, autumn, and winter denoting seasons of the year).
Roughly speaking, the classification of quantitative and qualitative predictions
depends on the objectivity of their construction:
Quantitative prediction methods provide predictions based on the statistical
analysis of observed data, i.e., such predictions are usually based on objective
mathematical methods of statistics. Of course it does not guarantee that one obtains
the best prediction results in this way: although the quantitative predictions are
constructed applying objective methods, their worth depends significantly on the
assumption that in the (future) prediction horizon the character of the given time
series remains unchanged so that the model constructed using current and past data
remains valid. One produces here only a technical extrapolation (autoprojection or
mathematical extension) of past and current observations to the future. This fact
should be kept in mind in the following chapters, where we confine ourselves entirely to
quantitative predictions (so that the term extrapolation could be more appropriate
even if it is not common in economic applications).
Fig. 2.2 S-curve plotting the sale of a new product
Qualitative prediction methods are based usually on opinions of experts (one calls
them sometimes the “expert predictions”), and therefore in practice they have rather
subjective character. Sometimes one is forced to apply these methods when histor-
ical data are missing, e.g., when one introduces a new bank product. Since we avoid
these methods in the following chapters with regard to the character of this text,
some simple examples of qualitative predictions will be given below to have an idea
of this approach to predicting (sometimes the qualitative predictions are even better
than the purely mathematical ones):
• Subjective fitting by curve is a (graphical) method when experts strive to estimate
the future behavior of particular time series using their experience with time series
of similar type. For instance, the graphical plot describing the sale of a new
product (e.g., a new car make) has frequently the form of so-called S-curve shown
in Fig. 2.2: after the starting stage, the sale accrues during the growth stage in
dependence on the intensity of advertising campaign till the stable stage is
achieved (later usually the drop of sale follows only). Therefore, the experts
can suggest the prediction just according to a specific S-curve applying their
subjective opinion on its form (e.g., on the length of particular stages).
• Delphi method is the prediction method based on the enquiry in an expert group
and the gradual mediation of consensus for given prediction problem. This
methodology has been developed by large-scale multinational corporations to
forecast development in science, engineering, production, consumption, and the
like. In particular, its application consists in several stages of anonymous
enquiring where each of addressed experts presents his or her opinion on the
given prediction. In each stage, one adds to the enquiry form the statistical results
of previous stages so that experts can adjust their previous opinions and
converge gradually to the group opinion which is declared as the final prediction.
It should be stressed that the results of particular stages are communicated only in
a global statistical form for the whole group (no individual answers are provided).
In the following Example 2.1, one uses only very simple statistical instruments in
this context, namely the mean values, standard deviations, and lower and upper
quartile.
Table 2.1 Data for Example 2.1 (Delphi method)
Proportion of renewable energy resources 25 years forward    Number of positive answers
5% and less                                                   0
10%                                                           1
15%                                                           7
20%                                                           11
25%                                                           12
30%                                                           10
35%                                                           5
40%                                                           4
45% and more                                                  0
Σ                                                             50
Example 2.1 (Delphi method). In Table 2.1, one summarizes enquiries by
50 experts on the proportion of renewable energy resources predicted for a given
region 25 years forward in the first stage of Delphi method.
From the statistical sample in Table 2.1, one calculates the mean value
$$\frac{1 \cdot 10 + \ldots + 4 \cdot 40}{50} = 25.4\,\%$$
and the standard deviation
$$\sqrt{\frac{1 \cdot (10 - 25.4)^2 + \ldots + 4 \cdot (40 - 25.4)^2}{50}} \approx 7.5\,\%.$$
Further one finds the lower and upper quartile. The lower (or upper) quartile is the
bound separating one-quarter of the lowest (or highest) observations, respectively. In
our case, one-quarter of the number of observations is 50/4 = 12.5, and therefore, the
lower quartile is 20% and the upper quartile is 30%. In the next stage of this
prediction method, one informs particular participants of the expert group on the
statistical results obtained in the first stage (but not on the answers of particular
experts) and so on.
⋄
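The summary statistics of Example 2.1 can be reproduced with a short script. The data are taken from Table 2.1; reading the quartiles as the 13th and 38th smallest answers (the first positions beyond 12.5 and 37.5) is an assumption about the quartile convention, chosen to match the reasoning of the example.

```python
import math

# Answers from Table 2.1: proportion (%) -> number of experts
counts = {10: 1, 15: 7, 20: 11, 25: 12, 30: 10, 35: 5, 40: 4}
n = sum(counts.values())                                   # 50 experts

mean = sum(x * k for x, k in counts.items()) / n
std = math.sqrt(sum(k * (x - mean) ** 2 for x, k in counts.items()) / n)

# Expand the table to a sorted sample and read off the quartiles by position
sample = sorted(x for x, k in counts.items() for _ in range(k))
lower_quartile = sample[math.ceil(n / 4) - 1]       # 13th smallest answer
upper_quartile = sample[math.ceil(3 * n / 4) - 1]   # 38th smallest answer
```

The script gives the mean 25.4%, the standard deviation of about 7.5%, and the quartiles 20% and 30%, in agreement with the example.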
2.2.3.3 Prediction in Structural Model and Prediction in Time Series Model
Prediction in structural (econometric) model relates the future value of explained
variable (which is the prediction task in this case) to future values of relevant
explanatory variables. Such models may be successful even for long-term predic-
tions since the long-term relations are frequent in economic and financial practice,
e.g., due to efficiency of financial markets and arbitrage-free principle. However, in
such a case one must construct predictions of future values for all participating
explanatory variables in the model.
Prediction in time series model is constructed as the autoprojection of past and
current values of the time series to future. Such a prediction often makes use
explicitly of calculated values of error terms in the model. Sometimes the difference
between the prediction based on structural model and the prediction based on time
series model is not quite clear (e.g., in the vector autoregression VAR; see Sect.
12.2).
2.2.3.4 In-Sample Prediction and Out-of-Sample Prediction
In-sample prediction is that generated for the same set of data that was used to
estimate the model’s parameters. Obviously, one can expect good prediction results
since one only recalculates selected values of the original sample by means of the
constructed model so that all model assumptions remain valid in the “prediction
horizon.” Nevertheless, this procedure can serve as a very simple test of in-sample fit of
the model.
Out-of-sample prediction is on the contrary the prediction of time series values
which have not participated in the construction of prediction model at all: either they
were not available at that time (i.e., they were future values from the point of view of
that time), or they were deleted from the sample on purpose (it is common in the
situation when one tries to evaluate the prediction ability of a time series model: then
the data deleted artificially are denoted as the hold-out sample). Obviously, pre-
dictions of hold-out sample represent a better evaluation of the prediction model than
an examination of its in-sample fit (see above).
As a simple example let us consider the time series of 120 monthly observations
in the period 2008M1–2017M12. The objective is to construct a model for this time
series and to assess its quality (in particular, its prediction abilities). Two solutions
are possible in this case: either (1) to construct the model using the whole time series
2008M1–2017M12 (and possibly to generate in-sample predictions) or (2) to con-
struct the model using only the shorter time series 2008M1–2016M12 and to
generate the out-of-sample predictions for 2017M1–2017M12 (which is the hold-
out sample here; see Fig. 2.3) and to compare them with the real values from the
hold-out sample. The second approach is more correct (of course, a suitable length of
hold-out sample must be chosen appropriately) since here the data information on
2017M1–2017M12 is not used for the construction of prediction model.
[Fig. 2.3 Example of out-of-sample predictions: the observed time series covers 2008M1–2017M12; the model is estimated on 2008M1–2016M12, and the out-of-sample predictions are made for the hold-out sample 2017M1–2017M12]
2.2.3.5 Single-Prediction and Multi-prediction
Single-prediction is the prediction constructed for a single time (usually for the next
one), e.g., at time n for time n + 1 denoted as $\hat{y}_{n+1}(n)$, but also, e.g., at time n for time
n + 5 denoted as $\hat{y}_{n+5}(n)$.
Multi-prediction is the prediction constructed simultaneously for several (future) times
(e.g., at time n for times n + 1, ..., n + h). Obviously, one obtains a vector of several single
predictions which are constructed at the same time (on the other hand, the sequence
of one-step-ahead predictions constructed for times n + 1, ..., n + h always after
receiving particular observations $y_n, \dots, y_{n+h-1}$ cannot be called multi-prediction).
The example in Fig. 2.3 may show some problems connected with multi-predictions,
e.g., with assessment of their quality. Let an examined prediction
technique applied at the end of 2016 for the hold-out sample 2017 provide good
result only for the first month 2017M1 (i.e., the short-term prediction result) and bad
results for the remaining months 2017M2–2017M12 (i.e., the long-term prediction
results). The multi-prediction from time 2016M12 for times 2017M1–2017M12 is
not enough to assess the quality of applied prediction methodology: one should
generate a set of multi-predictions, and it is possible to do it in a systematic way
using so-called prediction windows. Moreover, two types of prediction windows can
be used (see Table 2.2, which for simplicity predicts only the three closest future values):
• Rolling windows: the samples used for prediction (observable in the rolling
windows) have a fixed length (e.g., 108 in Table 2.2), but their beginning is
shifted.
• Recursive windows: the samples used for prediction (observable in the recursive
windows) have a fixed beginning (e.g., 2008M1 in Table 2.2), but their length
increases.
In both cases, ten multi-predictions are obtained (see Table 2.2) so that conclusions
on the prediction methodology may be reliable.
Table 2.2 Rolling and recursive windows to assess the quality of multi-predictions (see Fig. 2.3)

Multi-prediction          Multi-predictions based on samples provided by
constructed for times     Rolling windows        Recursive windows
2017M1, M2, M3            2008M1–2016M12         2008M1–2016M12
2017M2, M3, M4            2008M2–2017M1          2008M1–2017M1
2017M3, M4, M5            2008M3–2017M2          2008M1–2017M2
2017M4, M5, M6            2008M4–2017M3          2008M1–2017M3
2017M5, M6, M7            2008M5–2017M4          2008M1–2017M4
2017M6, M7, M8            2008M6–2017M5          2008M1–2017M5
2017M7, M8, M9            2008M7–2017M6          2008M1–2017M6
2017M8, M9, M10           2008M8–2017M7          2008M1–2017M7
2017M9, M10, M11          2008M9–2017M8          2008M1–2017M8
2017M10, M11, M12         2008M10–2017M9         2008M1–2017M9
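The windows of Table 2.2 can be generated mechanically. The following is only a minimal sketch in Python: the helper names month_labels and prediction_windows are hypothetical (they do not come from any package), and it assumes 120 monthly observations starting in 2008M1, a first sample of 108 months, and three predicted values per window.

```python
def month_labels(start_year, n_months):
    """Labels such as '2008M1' for n_months consecutive months from January."""
    return [f"{start_year + k // 12}M{k % 12 + 1}" for k in range(n_months)]


def prediction_windows(n_obs, first_len, horizon, kind):
    """Return (sample_start, sample_end, last_predicted) index triples.

    Indices are 0-based and inclusive. kind="rolling" keeps the sample
    length fixed; kind="recursive" keeps the sample beginning fixed.
    """
    windows = []
    end = first_len - 1                      # last index of the first sample
    while end + horizon <= n_obs - 1:        # all predicted times must exist
        start = end - first_len + 1 if kind == "rolling" else 0
        windows.append((start, end, end + horizon))
        end += 1
    return windows


labels = month_labels(2008, 120)             # 2008M1 ... 2017M12
rolling = prediction_windows(120, 108, 3, "rolling")
recursive = prediction_windows(120, 108, 3, "recursive")

for (rs, re, last), (cs, ce, _) in zip(rolling, recursive):
    print(f"{labels[re + 1]}..{labels[last]}: "
          f"rolling {labels[rs]}-{labels[re]}, recursive {labels[cs]}-{labels[ce]}")
```

In an actual evaluation, one would re-estimate the model on each sample and store the three predictions of each window for the accuracy measures.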
2.2.3.6 Static Prediction and Dynamic Prediction
This classification of predictions is used in cases when we explain the predicted
variable by means of explanatory variables, and among these explanatory variables
there are lagged (i.e., delayed) values of the predicted variable (this situation is
common, e.g., in autoregressive models):
Static prediction makes use of the observed lagged values of predicted variable if
they are available for prediction. For instance, predicting for time t + 2 from time t by
means of the estimated model $y_t = 0.64\,y_{t-1} + \varepsilon_t$ (so-called autoregression of the first
order; see Sect. 6.2) with value $y_{t+1}$ known at time t of prediction, the corresponding
static prediction will be
$$\hat{y}_{t+2}(t) = 0.64\,y_{t+1}. \qquad (2.7)$$
Dynamic prediction does not exploit values of predicted variable lying in the
prediction horizon (even if these values are known), but replaces them by
corresponding predictions. Therefore, in the situation described above, the dynamic
prediction will be
$$\hat{y}_{t+2}(t) = 0.64\,\hat{y}_{t+1}(t), \qquad (2.8)$$
i.e., the value $y_{t+1}$ in (2.7) is replaced by the one-step-ahead prediction $\hat{y}_{t+1}(t)$
ignoring the possibility that the value $y_{t+1}$ may be known at time t. Obviously, the
dynamic predictions are not as accurate as the static predictions (if the static
predictions are feasible).
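The difference between (2.7) and (2.8) can be illustrated by a short Python sketch. It assumes the estimated model with coefficient 0.64 from the text; the observations in y and the function names are made up for illustration only.

```python
PHI = 0.64  # AR(1) coefficient of the estimated model from the text


def static_predictions(y, start):
    """Predictions of y[start:], each using the observed previous value."""
    return [PHI * y[t - 1] for t in range(start, len(y))]


def dynamic_predictions(y, start):
    """Predictions of y[start:], each using the previous prediction instead
    of the observed value (only y[start - 1] is taken from the data)."""
    preds = []
    last = y[start - 1]
    for _ in range(start, len(y)):
        last = PHI * last
        preds.append(last)
    return preds


y = [1.0, 0.5, 0.8, 0.2, 0.6]          # made-up observations
static = static_predictions(y, 2)
dynamic = dynamic_predictions(y, 2)
print(static)
print(dynamic)
```

Both sequences share the first value, since at the first predicted time the lagged value is observed in either case; they differ from the second value on.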
2.2.3.7 Measures of Prediction Accuracy
The important aspect of prediction consists in the measuring of its accuracy based on
the error of prediction. The error $e_t$ of prediction $\hat{y}_t$ (when predicting value $y_t$) is
defined as
$$e_t = y_t - \hat{y}_t. \qquad (2.9)$$
The error of prediction cannot be calculated until the time when we know the actual
value yt (this value has been unknown at the time of prediction). However in
practice, when assessing the quality of prediction, one sometimes “predicts”
known values of time series to compare these predictions with the known actual
values (see the hold-out sample described above).
The main source of prediction errors consists in the residual component of time
series since it represents unpredictable (unsystematic) fluctuations in data. If the
participation of this component in time series is significant, then the possibility of
construction of reliable predictions is limited. On the other hand, the size of
prediction error depends also on the quality of predictions for systematic compo-
nents of time series. Therefore, significant prediction errors may indicate either an
extraordinary participation of residual component or inappropriateness of prediction
methodology.
In any case, the examination of error of prediction is useful. If the prediction
technique masters predictions of systematic components, then the prediction errors
reflect the influence of residual component only (see Fig. 2.4a). On the contrary,
Fig. 2.4b–d shows the cases when the prediction technique failed due to inappropri-
ate prediction of trend, seasonal, and cyclical components, respectively.
The measures of prediction accuracy assess the development of predictions in time.
We will give below the usual measures of this type for a simple situation when one
assesses in total the accuracy of predictions $\hat{y}_{n+1}, \dots, \hat{y}_{n+h}$ of values $y_{n+1}, \dots, y_{n+h}$
(here it does not matter if we assess a multi-stage prediction or a sequence of
one-step-ahead predictions, static or dynamic predictions, or other types of predictions):
1. Sum of squared errors SSE:
$$\mathrm{SSE} = \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2 = \sum_{t=n+1}^{n+h} e_t^2. \qquad (2.10)$$
SSE is the analogy of the criterion of least squares in regression models.

2. Mean squared error MSE:
$$\mathrm{MSE} = \frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2 = \frac{1}{h} \sum_{t=n+1}^{n+h} e_t^2. \qquad (2.11)$$
[Fig. 2.4 Graphical plots of errors of prediction: panels (a)–(d) plot the prediction error e_t against time t (years)]
MSE is a popular quadratic loss function. Some software decomposes MSE into three
components:
$$\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2 = \left(\bar{\hat{y}} - \bar{y}\right)^2 + \left(s_{\hat{y}} - s_y\right)^2 + 2\left(1 - r_{\hat{y}y}\right) s_{\hat{y}}\, s_y \qquad (2.12)$$
($\bar{\hat{y}}$, $\bar{y}$, $s_{\hat{y}}$, $s_y$ are the corresponding sample means and (biased) sample standard
deviations of values $\hat{y}$ and $y$; $r_{\hat{y}y}$ is the sample correlation coefficient between $\hat{y}$
and $y$). Usually, one uses relative values of these components, namely
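(a) Proportional bias:
$$\frac{\left(\bar{\hat{y}} - \bar{y}\right)^2}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} \qquad (2.13)$$
(b) Proportional variance:
$$\frac{\left(s_{\hat{y}} - s_y\right)^2}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} \qquad (2.14)$$
(c) Proportional covariance:
$$\frac{2\left(1 - r_{\hat{y}y}\right) s_{\hat{y}}\, s_y}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} \qquad (2.15)$$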
These proportional components have obviously the unit sum, in which the
proportional bias indicates the distance of average of predictions from average
of future values, the proportional variance indicates the distance of variance of
predictions from variance of future values, and the proportional covariance covers
the remaining unsystematic part of prediction error (any “good” prediction
technique has proportional bias and proportional variance small so that the
unsystematic component prevails in such a case).
3. Root mean squared error RMSE:
$$\mathrm{RMSE} = \sqrt{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2} = \sqrt{\frac{1}{h} \sum_{t=n+1}^{n+h} e_t^2}. \qquad (2.16)$$
RMSE modifies MSE in order to be measured in the same units as the given time
series.
4. Mean absolute error MAE:
$$\mathrm{MAE} = \frac{1}{h} \sum_{t=n+1}^{n+h} |y_t - \hat{y}_t| = \frac{1}{h} \sum_{t=n+1}^{n+h} |e_t|. \qquad (2.17)$$
MAE does not penalize large prediction errors as strictly as MSE. Therefore, it is
recommended for assessment of prediction accuracy in time series with outliers
(see Sect. 2.2.1).
The measures of prediction accuracy given above depend on the scale of the
given time series. Therefore, they are applicable only if one mutually compares
various prediction techniques in similar time series. Now we will present further
measures which do not depend on the time series scale:
5. Mean absolute percentage error MAPE:
$$\mathrm{MAPE} = \frac{100}{h} \sum_{t=n+1}^{n+h} \left| \frac{y_t - \hat{y}_t}{y_t} \right|. \qquad (2.18)$$
MAPE usually ranges from 0 to 100%. This measure is preferred in practice by
some authors [see, e.g., Makridakis (1993)]. A result less than 100% means that
the given prediction model is better than the random walk model (the random
walk model predicts the zero level permanently, i.e., MAPE = 100% constantly).
MAPE is not reliable for time series ranging in vicinity of zero.
6. Adjusted mean absolute percentage error AMAPE:
$$\mathrm{AMAPE} = \frac{100}{h} \sum_{t=n+1}^{n+h} \left| \frac{y_t - \hat{y}_t}{(y_t + \hat{y}_t)/2} \right|. \qquad (2.19)$$
AMAPE rectifies the asymmetry of the criterion MAPE in (2.18): it
provides the same result even if one swaps the real value and its prediction (e.g.,
real value 0.7 and prediction 0.9 give the same value in (2.19) as real value 0.9
and prediction 0.7).
7. Theil’s U-statistic [see Theil (1966)]:
$$U = \frac{\sqrt{\sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2}}{\sqrt{\sum_{t=n+1}^{n+h} \hat{y}_t^2} + \sqrt{\sum_{t=n+1}^{n+h} y_t^2}}. \qquad (2.20)$$
U lies always between 0 and 1 (e.g., U = 0 means the perfect coincidence of
prediction with reality).
Further group of measures of prediction accuracy only indicates whether the
model predicts correct signs of future values (i.e., whether these values will be
positive or negative) or predicts correct direction changes (i.e., whether an
increase changes to a decrease and the like). From the strategic point of view,
such predictions are often more important than numerical predictions:
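8. Percentage of correct sign predictions:
$$\frac{100}{h} \sum_{t=n+1}^{n+h} z_t, \quad \text{where } z_t = \begin{cases} 1 & \text{for } y_t \cdot \hat{y}_t > 0, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.21)$$
9. Percentage of correct direction change predictions:
$$\frac{100}{h} \sum_{t=n+1}^{n+h} z_t, \quad \text{where } z_t = \begin{cases} 1 & \text{for } (y_t - y_{t-1}) \cdot (\hat{y}_t - y_{t-1}) > 0, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.22)$$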
Remark 2.2 One should stress once more that the given measures concern the
statistical accuracy of predictions only. In no way do they justify an economic
or financial adequacy of predictions. For instance, small values of MSE do not mean
that we have at our disposal a successful scheme for predicting future market strategies (e.g.,
sometimes it can be desirable from the strategic point of view to underestimate or
overestimate the future development and the like).
⋄
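Most of the measures (2.10)–(2.21) reduce to a few lines of code. The sketch below is one possible Python implementation using only the standard library; the function names and the sample values are illustrative assumptions, and the direction-change measure (2.22) is omitted because it also needs the last observed value before the prediction horizon.

```python
import math


def accuracy_measures(y, y_hat):
    """Measures (2.10), (2.11), (2.16)-(2.21) for actual values y and predictions y_hat."""
    h = len(y)
    errors = [a - p for a, p in zip(y, y_hat)]
    sse = sum(e * e for e in errors)
    mse = sse / h
    return {
        "SSE": sse,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / h,
        "MAPE": 100 / h * sum(abs(e / a) for e, a in zip(errors, y)),
        "AMAPE": 100 / h * sum(abs(e / ((a + p) / 2))
                               for e, a, p in zip(errors, y, y_hat)),
        "U": math.sqrt(sse) / (math.sqrt(sum(p * p for p in y_hat))
                               + math.sqrt(sum(a * a for a in y))),
        "sign_pct": 100 / h * sum(1 for a, p in zip(y, y_hat) if a * p > 0),
    }


def mse_proportions(y, y_hat):
    """Proportional bias, variance and covariance, (2.13)-(2.15).

    Assumes that neither y nor y_hat is constant (nonzero standard deviations).
    """
    h = len(y)
    mean_y, mean_p = sum(y) / h, sum(y_hat) / h
    s_y = math.sqrt(sum((a - mean_y) ** 2 for a in y) / h)       # biased
    s_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_hat) / h)   # biased
    r = sum((a - mean_y) * (p - mean_p) for a, p in zip(y, y_hat)) / (h * s_y * s_p)
    mse = sum((a - p) ** 2 for a, p in zip(y, y_hat)) / h
    return ((mean_p - mean_y) ** 2 / mse,
            (s_p - s_y) ** 2 / mse,
            2 * (1 - r) * s_p * s_y / mse)


y = [2.0, 4.0, 5.0, 4.0]          # made-up actual values
y_hat = [2.5, 3.5, 5.5, 3.0]      # made-up predictions
measures = accuracy_measures(y, y_hat)
proportions = mse_proportions(y, y_hat)
print(measures)
print(proportions, sum(proportions))
```

The three proportions are divided by the MSE computed directly from the errors, so their unit sum is a numerical check of the decomposition (2.12).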
2.2.3.8 Prediction Combinations
As far as the accuracy of predictions is concerned, a surprisingly positive effect is
achieved by combining several predictions constructed in different ways (i.e., by different
approaches, by employing different information sources, and the like) into a final
summary prediction [see, e.g., Clements and Harvey (2011), Timmermann (2006),
and others]. Moreover, the predictions based on different modeling approaches may
have similar predictive accuracy so that it is difficult to identify a single best forecast.
This philosophy based on mixing various predictions has been successfully applied
in practice, e.g., to forecast interest rates, currency market volatility and exchange
rates, inflation, money supply, stock returns, meteorological data, city populations,
outcomes of football games, and many others.
Let us consider a simple case when we combine two predictions in the form
$$\hat{y}_t = w\,\hat{y}_{1t} + (1 - w)\,\hat{y}_{2t}. \qquad (2.23)$$
The weights w and 1 − w (w ≥ 0) should be chosen to minimize the mean
squared error
$$\mathrm{var}(e) = \mathrm{var}(y_t - \hat{y}_t) = \mathrm{var}\big(w\,(y_t - \hat{y}_{1t}) + (1 - w)\,(y_t - \hat{y}_{2t})\big) = \mathrm{var}\big(w\,e_1 + (1 - w)\,e_2\big). \qquad (2.24)$$
If both the combined predictions are unbiased (i.e., $E(e_1) = 0$ and $E(e_2) = 0$) with
finite mean squared errors denoted as $\sigma_1^2 = E(e_1^2)$ and $\sigma_2^2 = E(e_2^2)$ and covariance
denoted as $\sigma_{12} = \mathrm{cov}(e_1, e_2) = E(e_1 e_2)$, then the corresponding optimal weights are
$$w = \frac{\sigma_2^2 - \sigma_{12}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}, \qquad 1 - w = \frac{\sigma_1^2 - \sigma_{12}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}. \qquad (2.25)$$
According to (2.25), greater weights are assigned to more precise models with
smaller $\sigma_1^2$ or $\sigma_2^2$. Moreover, the weights can be negative if $\sigma_{12} > \sigma_2^2$ or $\sigma_{12} > \sigma_1^2$
(the negative weight assigned to a prediction component means that this
component is replaced in the prediction combination by other prediction components
with lower prediction errors). The weakly correlated prediction errors enable us to
rewrite the weights as functions of the relative variance $\sigma_2^2 / \sigma_1^2$:
$$w \approx \frac{\sigma_2^2 / \sigma_1^2}{1 + \sigma_2^2 / \sigma_1^2}, \qquad 1 - w \approx \frac{1}{1 + \sigma_2^2 / \sigma_1^2}.$$
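The optimal weight in (2.25) is easy to check numerically. The following Python sketch uses hypothetical function names and made-up error variances; it evaluates the weight and the variance of the combined error, and also shows a case in which one weight becomes negative.

```python
def optimal_weight(var1, var2, cov12):
    """Weight w of the first prediction according to (2.25)."""
    return (var2 - cov12) / (var1 + var2 - 2 * cov12)


def combined_variance(w, var1, var2, cov12):
    """Variance of the combined error w*e1 + (1 - w)*e2, cf. (2.24)."""
    return w * w * var1 + (1 - w) ** 2 * var2 + 2 * w * (1 - w) * cov12


# Made-up error variances and covariance of two unbiased predictions.
var1, var2, cov12 = 1.0, 4.0, 0.5
w = optimal_weight(var1, var2, cov12)
v = combined_variance(w, var1, var2, cov12)
print(w, v)

# With cov12 larger than var1, the weight 1 - w of the second prediction is negative.
w_neg = optimal_weight(1.0, 4.0, 1.5)
print(w_neg, 1 - w_neg)
```

In this example, the combined variance is below the smaller of the two individual variances, which is the effect that motivates prediction combinations.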
The result (2.25) can be generalized easily if one combines m predictions. In the
literature and in software systems (R packages, EViews, and others), many strategies
for combining predictions have been suggested, e.g.:
• Equal-weighted predictions:
$$\hat{y}_t = \frac{1}{m} \sum_{i=1}^{m} \hat{y}_{it}. \qquad (2.26)$$
• Median prediction:
$$\hat{y}_t = \mathrm{med}(\hat{y}_{1t}, \dots, \hat{y}_{mt}). \qquad (2.27)$$
• Trimmed mean of predictions: One orders (increasingly) the predictions
$\hat{y}_{1t} \le \dots \le \hat{y}_{mt}$ and then trims λ·100% predictions at the top and λ·100%
predictions at the bottom:
$$\hat{y}_t = \frac{1}{m(1 - 2\lambda)} \sum_{i=\lfloor \lambda m + 1 \rfloor}^{\lfloor (1 - \lambda) m \rfloor} \hat{y}_{it} \qquad (2.28)$$
(the symbol $\lfloor \dots \rfloor$ denotes the integer part).
• Predictions weighted in inverse proportional way to MSE:
$$\hat{y}_t = \frac{1}{\sum_{j=1}^{m} \mathrm{MSE}_j^{-1}} \sum_{i=1}^{m} \mathrm{MSE}_i^{-1}\, \hat{y}_{it}, \qquad (2.29)$$
where $\mathrm{MSE}_i = E(e_i^2)$ is the mean squared error of prediction $\hat{y}_{it}$.
• Predictions weighted in inverse proportional way to ranking:
$$\hat{y}_t = \frac{1}{\sum_{j=1}^{m} R_j^{-1}} \sum_{i=1}^{m} R_i^{-1}\, \hat{y}_{it}, \qquad (2.30)$$
where $R_i$ is the rank of prediction $\hat{y}_{it}$ (e.g., the smallest prediction has the rank 1).
This weighting scheme which weights predictions inversely to their rank seems to
be surprisingly robust [see Timmermann (2006)].
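The combination strategies (2.26)–(2.30) can be sketched in Python as follows. The function names and the sample predictions are illustrative assumptions. The trimmed mean follows (2.28) literally, so the normalization m(1 − 2λ) equals the number of retained predictions only when λm is an integer; λ is passed as an exact fraction to keep the integer parts free of rounding effects. One function serves both (2.29) and (2.30), depending on whether the scores are MSE values or ranks.

```python
import math
from fractions import Fraction
from statistics import mean, median


def trimmed_mean(preds, lam):
    """Trimmed mean (2.28); lam is the trimmed share on each side."""
    m = len(preds)
    ordered = sorted(preds)
    first = math.floor(lam * m + 1)       # 1-based lower summation index
    last = math.floor((1 - lam) * m)      # 1-based upper summation index
    return sum(ordered[first - 1:last]) / float(m * (1 - 2 * lam))


def inverse_weighted(preds, scores):
    """Weights inversely proportional to scores: MSE in (2.29), ranks in (2.30)."""
    inv = [1.0 / s for s in scores]
    return sum(w * p for w, p in zip(inv, preds)) / sum(inv)


preds = [2.0, 2.4, 2.2, 3.0, 1.9]              # made-up predictions of one value
equal = mean(preds)                            # (2.26)
med = median(preds)                            # (2.27)
trimmed = trimmed_mean(preds, Fraction(1, 5))  # (2.28) with lambda = 0.2
by_mse = inverse_weighted([1.0, 3.0], [1.0, 3.0])
print(equal, med, trimmed, by_mse)
```

With five predictions and λ = 0.2, the trimmed mean drops the smallest and the largest prediction and averages the remaining three.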
2.3 Random Processes with Discrete States in Discrete Time
The majority of methods of time series analysis in this publication concerns time
series that are modeled as random processes with continuous states in discrete time
(see Sect. 2.1). Therefore, the term “time series” means here usually the trajectory of
values y1, . . ., yn from a continuous interval on the real line which are observed in
(regular) discrete moments. For the sake of completeness, the remaining Sects. 2.3–
2.5 of this chapter are devoted to examples of time series with different character
(e.g., to time series which are modeled as random processes with discrete states in
continuous time; see Sect. 2.4) in order to get an idea on further possibilities in this
modeling framework.
Let us start with several examples of random processes with discrete states in
discrete time:
2.3.1 Binary Process
Binary process is the two-valued random process in discrete time
$$\{Y_t,\ t = 1, 2, \dots\}, \quad \text{where } Y_t \text{ iid}, \quad P(Y_t = 1) = P(Y_t = -1) = 1/2 \qquad (2.31)$$
(the definition can be more general with asymmetric probabilities p and q). Its
trajectory may be interpreted as a record of results when tossing an ideal coin
(e.g., {1, 1, 1, 1, 1, 1, 1, . . .}; see Fig. 2.5).
2.3.2 Random Walk
Random walk (RW) on line is the integer-valued random process in discrete time
$$\{Y_t,\ t = 0, 1, \dots\}, \quad \text{where } Y_0 = 0; \quad Y_t = \sum_{i=1}^{t} X_i, \quad t = 1, 2, \dots; \quad X_t \text{ iid}; \quad P(X_t = 1) = p, \quad P(X_t = -1) = q \quad (p + q = 1). \qquad (2.32)$$
[Fig. 2.5 Trajectory of binary process: values y_t equal to 1 or −1 plotted against time t = 1, ..., 7]
[Fig. 2.6 Principle of random walk on real line: from each integer position the particle moves to the right with probability p and to the left with probability q]
Its trajectory may be interpreted as a record of movement of particle which moves
across integer values on real line: the particle is at time 0 in the origin and during
each time unit moves to the right with probability p and to the left with probability
q (see Fig. 2.6). In the case with p = q = 1/2, one calls it the symmetric random walk.
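A trajectory of the random walk (2.32) can be simulated directly from its definition. The Python sketch below uses the standard random module; the function name, the seed, and the number of steps are arbitrary choices made for illustration.

```python
import random


def random_walk(n_steps, p, rng):
    """Trajectory Y_0, ..., Y_n of (2.32): steps +1 with probability p, else -1."""
    path = [0]
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path


rng = random.Random(0)                 # fixed seed for reproducibility
path = random_walk(1000, 0.5, rng)     # symmetric random walk
print(path[:10])
```

The individual steps of such a trajectory form a realization of the binary process (2.31) when p = 1/2.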
2.3.3 Branching Process
Branching process (or Galton–Watson process) is the integer-valued nonnegative
random process in discrete time
$$\{Y_t,\ t = 0, 1, \dots\}, \quad \text{where } Y_0 = 1; \quad Y_{t+1} = \sum_{j=1}^{Y_t} Z_{tj}, \quad t = 0, 1, \dots; \quad Z_{tj} \text{ iid random variables with values } 0, 1, \dots. \qquad (2.33)$$
Here Yt describes the random number of members of tth generation: the initial 0th
generation has only one member, and the jth member of tth generation gives rise to a
random number Ztj of members of (t +1)th generation (see Fig. 2.7). For example, if
in a pyramid game each player finds further three participants, then the cor-
responding trajectory is {1, 3, 9, 27, . . .}.
[Fig. 2.7 Principle of branching process: each of the Y_t members of generation t (numbered 1, 2, ..., Y_t) gives rise to Z_t1, Z_t2, ..., Z_tYt members of generation t + 1]
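The recursion (2.33) translates into a short simulation. In the Python sketch below, the function name and the offspring distributions are assumptions chosen for illustration; the deterministic case with three offspring reproduces the pyramid game mentioned above.

```python
import random


def branching_process(n_generations, offspring, rng):
    """Generation sizes Y_0, ..., Y_n of (2.33); offspring(rng) draws one Z_tj."""
    sizes = [1]                                   # Y_0 = 1
    for _ in range(n_generations):
        sizes.append(sum(offspring(rng) for _ in range(sizes[-1])))
    return sizes


# Pyramid game: every member recruits exactly three new participants.
pyramid = branching_process(3, lambda rng: 3, random.Random(0))
print(pyramid)

# Random case: 0, 1, or 2 offspring with equal probabilities (an arbitrary choice).
sizes = branching_process(10, lambda rng: rng.choice([0, 1, 2]), random.Random(0))
print(sizes)
```

Once a generation has no members, all following generations are empty as well, since the sum in (2.33) is then taken over no terms.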
2.3.4 Markov Chain
Markov chain is a general scheme frequently used in practice to model random
processes with discrete states in discrete time (the binary process, the random walk,
and the branching process are its special cases). Let for simplicity the possible states
of this process be integer numbers i = ..., −1, 0, 1, ... . The process can move
across these states in particular discrete times t = 0, 1, ... with given transition
probabilities. Moreover, the so-called Markov property is fundamental here:
$$P(Y_{t+1} = j \mid Y_t = i, Y_{t-1} = i_{t-1}, \dots, Y_0 = i_0) = P(Y_{t+1} = j \mid Y_t = i) \qquad (2.34)$$
for all t = 0, 1, ... and $i, j, i_0, \dots, i_{t-1}$ = ..., −1, 0, 1, ... (in other words, the
probability of moving to the next state depends only on the present state and not on
the previous states). The probability on the right-hand side of (2.34) is the so-called
transition probability from state i at time t to state j at time t + 1. The important
special case is the homogenous Markov chain whose transition probabilities do not
depend on time, i.e.,
$$p_{ij} = P(Y_{t+1} = j \mid Y_t = i) \qquad (2.35)$$
for all t. For example, the symmetric random walk (see above) is the homogenous
Markov chain with starting value $Y_0 = 0$ and with transition probabilities
$$p_{ij} = \begin{cases} 1/2 & \text{for } j = i \pm 1, \\ 0 & \text{otherwise} \end{cases} \quad \text{for all } i, j = \dots, -1, 0, 1, \dots. \qquad (2.36)$$
The maximum likelihood estimate (MLE) of transition probabilities (2.35) has a
simple form
$$\hat{p}_{ij} = \frac{n_{ij}}{\sum_{k=-\infty}^{\infty} n_{ik}}, \qquad (2.37)$$
where $n_{ij}$ is the number of transitions from state i to state j during time unit using a
sample of observed trajectories.
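The estimate (2.37) amounts to counting transitions. A minimal Python sketch follows; the function name and the two short trajectories are made up for illustration, and states that are never left in the sample simply receive no estimated row.

```python
from collections import Counter, defaultdict


def estimate_transition_probabilities(trajectories):
    """MLE (2.37): relative frequencies of the observed one-step transitions."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for i, j in zip(traj, traj[1:]):
            counts[i][j] += 1                     # n_ij
    return {i: {j: n / sum(row.values()) for j, n in row.items()}
            for i, row in counts.items()}


# Two made-up observed trajectories over the states 0, 1, 2.
trajectories = [[0, 1, 2, 2, 1, 2], [0, 0, 1, 2]]
p_hat = estimate_transition_probabilities(trajectories)
print(p_hat)
```

Each estimated row sums to one, as required for a row of a transition matrix.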
Further, one introduces in this context the n-step transition probabilities defined
(for simplicity, we constrain ourselves to homogenous Markov chains) as
$$p_{ij}(n) = P(Y_{t+n} = j \mid Y_t = i), \quad n = 1, 2, \dots \qquad (2.38)$$
and the probabilities representing the probability distribution of Markov chain at
time t
$$p_i(t) = P(Y_t = i). \qquad (2.39)$$
The following matrix symbols are usually applied in this context:
$$\mathbf{P} = (p_{ij}); \quad \mathbf{P}(n) = (p_{ij}(n)); \quad \mathbf{p}(n) = (\dots, p_{-1}(n), p_0(n), p_1(n), \dots)' \qquad (2.40)$$
for so-called transition matrices (P(0) = I, P(1) = P) and distribution vectors of
Markov chain. It holds when multiplying (infinite) transition matrices
$$\mathbf{P}(n) = \mathbf{P}^n; \quad \mathbf{p}(n)' = \mathbf{p}(0)'\,\mathbf{P}^n. \qquad (2.41)$$
Some homogenous Markov chains can have so-called stationary distribution π
$$\lim_{n \to \infty} p_i(n) = \pi_i > 0, \qquad (2.42)$$
which corresponds to a stable limit behavior of Markov chain. The vector π fulfills
$$\boldsymbol{\pi}' = \boldsymbol{\pi}'\,\mathbf{P}. \qquad (2.43)$$
Example 2.2 (Markov chain). A bonus system in motor car (Casco) insurance has
three bonus levels denoted as 0, 1, 2 (presenting, e.g., 100 %, 80 %, and 60 % of
basic insurance premiums): if the clients report no claims in the given year, their
bonus improves next year by one level or they remain at the best level 2; if reporting
one or more claims, their bonus worsens next year by one level or they remain at the
worst level 0 (i.e., no malus level is introduced). The insurance company has a
stable insurance portfolio with 10,000 clients: 5000 are “good” drivers with
estimated probability of loss-free year about 0.9 and 5000 are “bad” drivers with
estimated probability of loss-free year about 0.8. The objective is to estimate the
stabilized numbers of clients in particular bonus levels.
[Fig. 2.8 Development of bonus system of a client in Example 2.2: bonus premium in % of basic premium (100 %, 80 %, 60 %) plotted against time t (years)]
The behavior of clients can be described by homogenous Markov chain with
annual time units and with three possible states (then the trajectory for an individual
client is an annual time series jumping across particular levels 100 %, 80 %, and
60 %; see Fig. 2.8).
Obviously, the transition matrices of good or bad drivers are
$$\begin{pmatrix} 0.1 & 0.9 & 0 \\ 0.1 & 0 & 0.9 \\ 0 & 0.1 & 0.9 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 0.2 & 0.8 & 0 \\ 0.2 & 0 & 0.8 \\ 0 & 0.2 & 0.8 \end{pmatrix}. \qquad (2.44)$$
The stationary distribution of good drivers must fulfill according to (2.43) the
system of linear equations
$$(\pi_0 \ \ \pi_1 \ \ \pi_2) = (\pi_0 \ \ \pi_1 \ \ \pi_2) \begin{pmatrix} 0.1 & 0.9 & 0 \\ 0.1 & 0 & 0.9 \\ 0 & 0.1 & 0.9 \end{pmatrix}. \qquad (2.45)$$
Its solution (under the condition $\pi_0 + \pi_1 + \pi_2 = 1$) is $\pi_0$ = 0.010989, $\pi_1$ = 0.098901,
and $\pi_2$ = 0.890109 so that 5000 · 0.010989, i.e., about 55 clients have the bonus level 0;
similarly 494.5 clients have the bonus level 1 and 4450.5 clients have the bonus level 2
among 5000 good drivers in the portfolio.
Quite analogously we get that 238.1 clients have the bonus level 0, 952.4 clients
have the bonus level 1, and 3809.5 clients have the bonus level 2 among 5000 bad
drivers in the portfolio. Hence in limit, the majority of good and also bad drivers will
achieve the best bonus level 2 (although this number is significantly lower among
bad drivers than among good drivers). In any case, the given bonus system is very
favorable for insured.
⋄
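The stationary distributions in Example 2.2 can be verified numerically by iterating the relation (2.41) until the distribution vector stops changing. This is only one possible approach (solving the linear system (2.45) directly would work as well); the function name and the number of iterations in the Python sketch below are arbitrary choices.

```python
def stationary_distribution(P, n_iter=1000):
    """Approximate the stationary distribution by iterating p(n)' = p(n-1)' P."""
    n = len(P)
    pi = [1.0 / n] * n                        # arbitrary initial distribution
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


P_good = [[0.1, 0.9, 0.0], [0.1, 0.0, 0.9], [0.0, 0.1, 0.9]]
P_bad = [[0.2, 0.8, 0.0], [0.2, 0.0, 0.8], [0.0, 0.2, 0.8]]

pi_good = stationary_distribution(P_good)
pi_bad = stationary_distribution(P_bad)
print([round(5000 * x, 1) for x in pi_good])
print([round(5000 * x, 1) for x in pi_bad])
```

The exact solutions are (1/91, 9/91, 81/91) for good drivers and (1/21, 4/21, 16/21) for bad drivers, which agrees with the numbers of clients stated in the example up to rounding.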
2.4 Random Processes with Discrete States in Continuous Time
Let us present the following well-known examples of random processes with
discrete states in continuous time:
2.4.1 Poisson Process
Poisson process (with intensity λ) is the integer-valued nonnegative random process
in continuous time $\{N_t,\ t \ge 0\}$, where
$$\begin{cases} \text{(i)} & N_0 = 0; \\ \text{(ii)} & N_{t_2} - N_{t_1}, \dots, N_{t_n} - N_{t_{n-1}} \text{ are independent for arbitrary } 0 \le t_1 < \dots < t_n; \\ \text{(iii)} & N_t - N_s \sim \text{Poisson distribution with intensity } \lambda\,(t - s) \text{ for arbitrary } 0 \le s < t. \end{cases} \qquad (2.46)$$
Here $N_t$ at time t ≥ 0 describes the number of occurrences of an observed event
during the time interval ⟨0, t⟩. In particular, Poisson process is suitable for modeling
the occurrences of rare events in economy and finance (e.g., insurance claims, credit
defaults, turbulences in stock prices, and the like). For instance, Fig. 2.9 plots the
number $n_t$ of mortgage defaults in a bank credit portfolio from the beginning of
accounting year to time t modeled as Poisson process.
In particular, the (random) number $N_t$ of occurrences of given event to time t has
Poisson distribution
$$P(N_t = i) = e^{-\lambda t}\,\frac{(\lambda t)^i}{i!} \quad \text{for } i = 0, 1, \dots \qquad (2.47)$$
with mean value λt. Hence it follows (without any additional assumptions) that the
periods $T_1, T_2, \dots$ between particular occurrences of events are iid random variables
with exponential distribution and mean value 1/λ (this conclusion has a logical
interpretation: the mean number of occurrences per time unit is λ · 1 so that the
mean period between two occurrences must be 1/λ). The efficient estimate of the
intensity λ is $\hat{\lambda} = n/T$, where n is the observed number of occurrences during
period T.
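The relation between the exponential periods and the estimate of the intensity can be illustrated by simulation. The Python sketch below generates the event times by accumulating exponential periods with mean 1/λ; the function name, the seed, the intensity, and the length of the observation period are arbitrary assumptions.

```python
import random


def simulate_poisson_process(lam, horizon, rng):
    """Event times in the interval (0, horizon] built from iid exponential periods."""
    times = []
    t = rng.expovariate(lam)                  # first period T_1
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)             # next period between events
    return times


rng = random.Random(42)                       # fixed seed for reproducibility
lam, horizon = 2.0, 5000.0
events = simulate_poisson_process(lam, horizon, rng)
lam_hat = len(events) / horizon               # estimate n / T
print(len(events), lam_hat)
```

With a long observation period, the estimate should lie close to the intensity used in the simulation.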
[Fig. 2.9 Numbers of mortgage defaults in a bank credit portfolio from the beginning of year to time t modeled as Poisson process (T1, T2, ... are random periods between defaults); the plot shows the number n_t of defaults to time t, with E(N_t) = λ·t and E(T1) = E(T2) = E(T3) = 1/λ]
2.4.2 Markov Process
Markov process (similarly as Markov chain in Sect. 2.3) is a general scheme for
random processes with discrete states in continuous time. Let for simplicity the
possible states of this process be again integer numbers i = ..., −1, 0, 1, ... , and the
process can move across these states in any positive times with given transition
probabilities. Again the Markov property must hold
$$P(Y_{s+t} = j \mid Y_s = i, Y_{t_n} = i_n, \dots, Y_{t_1} = i_1) = P(Y_{s+t} = j \mid Y_s = i) \qquad (2.48)$$
for all times $0 \le t_1 < \dots < t_n < s \le s + t$ and $i, j, i_1, \dots, i_n$ = ..., −1, 0, 1, ... . The
probability on the right-hand side of (2.48) is the transition probability from state i at
time s to state j at time s + t. In the case of homogenous Markov process, it depends
only on time t
$$p_{ij}(t) = P(Y_{s+t} = j \mid Y_s = i) \quad \text{for all } s \ge 0. \qquad (2.49)$$
Similarly the probabilities representing the probability distribution of Markov
process at time t are
$$p_i(t) = P(Y_t = i). \qquad (2.50)$$
Other important concepts here are so-called transition intensities
$$q_{ii} = \lim_{h \to 0+} \frac{p_{ii}(h) - 1}{h}; \quad q_{ij} = \lim_{h \to 0+} \frac{p_{ij}(h)}{h}, \quad \text{where } i \ne j, \qquad (2.51)$$
i.e.,
$$p_{ii}(h) = 1 + q_{ii}\,h + o(h); \quad p_{ij}(h) = q_{ij}\,h + o(h), \quad \text{where } i \ne j. \qquad (2.52)$$
Markov process can be defined directly by transition intensities: in such a case the
transition probabilities and the probability distribution of Markov process are
obtained by solving so-called Kolmogorov differential equations.
Poisson process with intensity λ > 0 (see above) is the special case of homogenous
Markov process with
$$p_{ij}(h) = \begin{cases} \lambda h + o(h) & \text{for } j = i + 1; \\ 1 - \lambda h + o(h) & \text{for } j = i; \\ o(h) & \text{for } j > i + 1; \\ 0 & \text{for } j < i, \end{cases} \qquad (2.53)$$
i.e., in the interval of small length h the given event occurs just once with probability
λh + o(h) (which is proportional approximately to the length of this interval) and
more than once with probability o(h).
Analogously one can define continuous Markov process [i.e., the Markov prop-
erty holds for continuous states in continuous time; see, e.g., Malliaris and Brock
(1982)].
2.5 Random Processes with Continuous States in Continuous Time
Finally, two examples of random processes with continuous states in continuous
time will be presented:
2.5.1 Goniometric Function with Random Amplitude and Phase
For instance, the sinusoid with random amplitude and phase is the random process
with continuous states in continuous time $\{Y_t,\ t \ge 0\}$ defined as
$$Y_t(\omega) = A(\omega) \sin(\nu t + \Phi(\omega)), \qquad (2.54)$$
where A is nonnegative random variable and Φ is random variable with uniform
distribution on ⟨0, 2π), both A and Φ being mutually independent.
The realization $\omega_0$ with denotation $y_t = Y_t(\omega_0)$, $a = A(\omega_0)$, $\varphi = \Phi(\omega_0)$ gives the
trajectory in the form of (deterministic) sinusoid
$$y_t = a \sin(\nu t + \varphi). \qquad (2.55)$$

2.5.2 Wiener Process
Wiener process (or also Brownian motion) is the random process with continuous
states in continuous time $\{W_t,\ t \ge 0\}$, where
$$\begin{cases} \text{(i)} & W_0 = 0; \\ \text{(ii)} & \text{particular trajectories are continuous in time;} \\ \text{(iii)} & W_{t_2} - W_{t_1}, \dots, W_{t_n} - W_{t_{n-1}} \text{ are independent for arbitrary } 0 \le t_1 < \dots < t_n; \\ \text{(iv)} & W_t - W_s \sim N(0, t - s) \text{ for arbitrary } 0 \le s < t. \end{cases} \qquad (2.56)$$
In particular, the increments $W_{t+h} - W_t$ have the normal distribution N(0, h), and the
covariance structure of this process fulfills
$$\mathrm{cov}(W_s, W_t) = \min(s, t), \quad \mathrm{var}(W_t) = t. \qquad (2.57)$$
Further, more sophisticated properties of Wiener process (valid with probability one)
are, e.g.,
• The particular trajectories are continuous but not differentiable functions of time
(i.e., the derivatives do not exist at any time point).
• The particular trajectories attain any real value infinitely many times.
• The particular trajectories have the fractal form (i.e., they “look similar at any
zoom”).
Wiener process is the basic concept of majority of financial models. After
transforming (to achieve a necessary trend, volatility, and the like), one can apply
it to model continuous movements of interest rates or asset prices (when jumps can
occur, one must combine Wiener process with Poisson process from Sect. 2.4).
Important modifications in practice are the following processes (they are described in
more details later in Chap. 10):
1. Wiener process with drift μ and volatility (or diffusion coefficient) σ:
$$\{Y_t = \mu t + \sigma W_t,\ t \ge 0\}, \qquad (2.58)$$
where $E(Y_t) = \mu t$ and $\mathrm{var}(Y_t) = \sigma^2 t$.
2. Exponential Wiener process (or also geometric Brownian motion):
$$\{Y_t = e^{X_t} = e^{\mu t + \sigma W_t},\ t \ge 0\}, \qquad (2.59)$$
where $E(Y_t) = \exp\{(\mu + \sigma^2/2)\,t\}$ and $\mathrm{var}(Y_t) = \exp[(2\mu + \sigma^2)\,t]\,[\exp(\sigma^2 t) - 1]$.
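The definition (2.56) and the moments of the exponential Wiener process (2.59) can be checked by a simple Monte Carlo experiment. The Python sketch below is only an illustration: the function names, the seed, the parameter values, and the sample size are arbitrary assumptions, and the path is a discrete approximation on a grid with step dt.

```python
import math
import random


def wiener_path(n_steps, dt, rng):
    """Values W_0, W_dt, ..., W_(n*dt) built from independent N(0, dt) increments."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


def exponential_wiener_sample(mu, sigma, t, n, rng):
    """n independent draws of Y_t = exp(mu*t + sigma*W_t), using W_t ~ N(0, t)."""
    return [math.exp(mu * t + sigma * rng.gauss(0.0, math.sqrt(t)))
            for _ in range(n)]


rng = random.Random(1)                        # fixed seed for reproducibility
path = wiener_path(100, 0.01, rng)

mu, sigma, t = 0.05, 0.2, 1.0
sample = exponential_wiener_sample(mu, sigma, t, 100_000, rng)
mc_mean = sum(sample) / len(sample)
theory_mean = math.exp((mu + sigma ** 2 / 2) * t)
print(mc_mean, theory_mean)
```

The Monte Carlo mean should be close to the theoretical mean from (2.59); note that it exceeds exp(μt) because of the term σ²/2.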
Remark 2.3 One can refer to further examples of random processes with discrete
states or in continuous time which are more complex so that a specialized literature
should be consulted, e.g.:
• Binary process originating by clipping a stationary process (with continuous
states) where simply the values of this stationary process higher or equal to
zero are replaced by the value “1” and the values lower than zero are replaced
by “0” (a general threshold can be used instead of zero; see Kedem (1980)).
• Counting process of nonnegative integer random variables usually correlated
over time that modifies the Box–Jenkins methodology for integer-valued
processes:
– DARMA process (i.e., discrete mixed process) models a general stationary
series of counts with a given marginal distribution (binomial, geometric,
Poisson); see, e.g., Jacobs and Lewis (1983), McKenzie (1988). Sometimes
Markov chains present a suitable model scheme for such processes; see
MacDonald and Zucchini (1997).
– INAR process (i.e., integer autoregressive process) generates integer-valued
time series in a manner similar to the autoregressive recursive scheme for
continuous random variables; see, e.g., Al-Osh and Alzaid (1987), Kedem and
Fokianos (2002), Weiss (2018). In this context, one makes use of the so-called
thinning operator
$$p \circ X = \sum_{i=1}^{X} Y_i, \qquad (2.60)$$
where $\{Y_t,\ t = 1, 2, \dots\}$ are iid Bernoulli (i.e., zero-one) random variables with
the probability of success equal to p
$$P(Y_t = 1) = 1 - P(Y_t = 0) = p. \qquad (2.61)$$
• CARMA process (i.e., continuous-time ARMA process) extends the classical
Box–Jenkins methodology over continuous time. It has various applications in
financial time series, e.g., for the pricing of options; see Brockwell (2009).
⋄
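The thinning operator (2.60) can be illustrated by a short simulation. The sketch below assumes the simplest integer autoregressive scheme $X_t = p \circ X_{t-1} + e_t$ with Poisson innovations $e_t$; this recursion, the Poisson generator, and all parameter values are assumptions made for illustration only (see the cited literature for the precise INAR definitions).

```python
import math
import random

def thin(p, x, rng):
    """Binomial thinning p∘x: sum of x iid Bernoulli(p) variables, cf. (2.60)."""
    return sum(1 for _ in range(x) if rng.random() < p)

def poisson(lam, rng):
    """Poisson(lam) draw by Knuth's multiplication method (adequate for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_inar1(p, lam, n, x0=0, seed=0):
    """Assumed INAR(1) recursion: X_t = p∘X_{t-1} + e_t with e_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    path, x = [], x0
    for _ in range(n):
        x = thin(p, x, rng) + poisson(lam, rng)
        path.append(x)
    return path

path = simulate_inar1(p=0.5, lam=2.0, n=5000)
print(path[:20])
print("sample mean:", sum(path) / len(path))   # stationary mean is lam / (1 - p)
```

All simulated values are nonnegative integers, which is the point of replacing the usual multiplication by thinning.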
2.6 Exercises
Exercise 2.1 Carry out in practice (e.g., in a group of students) the Delphi method to
predict some topical economic or financial issue.
Exercise 2.2 Repeat the calculation from Example 2.2 (the bonus system in motor
car insurance), but for five bonus levels (e.g., 100%, 90%, 80%, 70%, and 60% of
basic insurance premiums). Moreover, apply to this bonus system the modified rule:
clients who report one or more claims move two levels worse in the next year, or they
remain at the worst level 0 (with 100% of basic insurance premiums).
Part II
Decomposition of Economic Time Series
Chapter 3
Trend
This chapter and Chaps. 4 and 5 describe various methods of additive and multipli-
cative decomposition which result in the elimination of particular components of
time series. In practice, it can have various motivations:
1. First and foremost, the analysis of separated (eliminated) components of time
series is useful from the practical point of view since one can detect in such a way
various patterns in behavior of time series, identify particular external effects
influencing records, and compare several time series and the like (e.g., using the
trend, securities dealers can compare the growth rate of various stocks, or using
the seasonal component, banks can assess the demand for commercial credits
during particular years).
2. Important objectives of decomposition are also predictions of future development
of particular components (e.g., which will be the growth rate of contracted
mortgages) or predictions of the (non-decomposed) time series constructed by
compounding predictions of particular components (which are relatively simple
and accurate predictions).
3. Sometimes due to the character of solved problems it is convenient to reveal the
behavior of given time series adjusted by removing some components. For
example, the economic and financial time series are frequently seasonally
adjusted (this seasonal adjustment is even demanded for economic time series
reported officially by government statistical offices; see also Sect. 2.2.2).
The methods of elimination of particular components of time series differ by
various levels of objectivity, accuracy, and computational complexity. The choice of
relevant method depends on the motivation for decomposition and on the type of
analyzed time series. The methods based on the regression approach are often very
popular in this context mostly under the assumption that the residual component is
uncorrelated and homoscedastic in time [see the concept of white noise in (2.1)].
Moreover, the normal distribution of the residual component is sometimes assumed
(and justified by the Central Limit Theorem since the residuals are resultants of many
random effects). If the time series is contaminated by outliers, then one should use
robust decomposition methods which are insensitive to outliers (e.g., one should
apply the median instead of the sample average when estimating the constant level of
time series).

https://doi.org/10.1007/978-3-030-46347-2_3
3.1 Trend in Time Series
This chapter is devoted to methods suggested to eliminate the trend component from
time series and to extrapolate this component to the future. In this context, one
speaks of smoothing of time series since the seasonal (sometimes even periodic) and
random fluctuations of time series are damped down simultaneously. While in Sect.
3.1 we will deal with the classical methods of elimination of trend, in Sects. 3.2 and
3.3 we will present the adaptive approaches that take into account local changes in
the character of trend (e.g., the changes in the slope of linear trend).
3.1.1 Subjective Methods of Elimination of Trend
These methods include simple approaches to the elimination of trend based on
(computer) graphics (in general, the graphical methods used in time series analysis
are mostly supported by special software available for this purpose).
A simple method of this type is based on upper and lower turning points (see
Fig. 3.1). Here one connects by polygonal lines firstly the upper turning points (i.e.,
the local maxima of given trajectory) and then the lower turning points (i.e., the local
minima of trajectory), and finally it suffices to plot the middles between the upper
and lower lines as the smoothing result for each time point. The method is subjective
since one can make suitable corrections subjectively (e.g., one can ignore outliers or
irrelevant local turning points which are not characteristic for cycle identification;
see Fig. 3.1).
[Fig. 3.1 Method based on upper and lower turning points. Plotted against t: original time series y_t, upper and lower turning points, ignored turning points, and smoothed time series.]
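The turning-point method can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions: turning points are taken as strict local extrema, the polygonal lines are held constant outside the first and last turning point, and the optional argument `ignore` (a hypothetical name) lists indices of turning points that the analyst decides to disregard.

```python
def turning_points(y):
    """Indices of strict local maxima and strict local minima of the series y."""
    upper = [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] > y[i + 1]]
    lower = [i for i in range(1, len(y) - 1) if y[i - 1] > y[i] < y[i + 1]]
    return upper, lower

def polyline(knots, y, n):
    """Polygonal line through the points (i, y[i]), i in knots, evaluated at 0..n-1.
    Outside the first/last knot the nearest knot value is held (an assumption)."""
    out = []
    for t in range(n):
        if t <= knots[0]:
            out.append(y[knots[0]])
        elif t >= knots[-1]:
            out.append(y[knots[-1]])
        else:
            j = max(k for k in range(len(knots) - 1) if knots[k] <= t)
            a, b = knots[j], knots[j + 1]
            out.append(y[a] + (y[b] - y[a]) * (t - a) / (b - a))
    return out

def smooth_by_turning_points(y, ignore=()):
    """Middle between the upper and lower polygonal lines at each time point."""
    upper, lower = turning_points(y)
    upper = [i for i in upper if i not in ignore]
    lower = [i for i in lower if i not in ignore]
    hi = polyline(upper, y, len(y))
    lo = polyline(lower, y, len(y))
    return [(h + l) / 2 for h, l in zip(hi, lo)]

y = [1, 3, 2, 4, 3, 5, 4, 6, 5, 7]      # artificial zigzag series
smoothed = smooth_by_turning_points(y)
print(turning_points(y))
print(smoothed)
```

The subjective element of the method enters through `ignore`: passing the index of an outlier removes it from the corresponding polygonal line.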
3.1.2 Trend Modeling by Mathematical Curves
Section 3.1.2 describes methods that express the trend analytically by simple
mathematical curves (e.g., by a straight line or a logarithmic curve). Such estimated
curves make it possible to calculate their future values in a natural way, i.e.,
to construct predictions of the trend component (under the assumption that its
character will persist in the future).
Using this philosophy, one usually assumes that the analyzed time series can be
modeled as
$$y_t = Tr_t + E_t \tag{3.1}$$
(or one has transformed the time series to this form by methods described in the
following chapters, e.g., by means of the seasonal adjustment). Moreover, the
residual component $E_t$ in (3.1) is assumed to have the properties of white noise.
These assumptions permit one to apply suitable regression methods when estimating the
parameters of trend curves, and then to take the corresponding regression
extrapolations of $Tr_t$ directly as the predictions of $y_t$.
The choice of type of the most appropriate mathematical curve for particular time
series is based on a preliminary analysis, usually by means of graphical records of
time series or by using expected properties of the trend component following, e.g.,
from the economic theory (however, it is obvious that one cannot suppress
completely subjective impacts here). Several reference tests for the choice of the
most appropriate mathematical curves for given trajectory y1, . . ., yn are shown in
Table 3.6. There also exist systematic typologies where the controlled movement
along particular knots of the typological tree offers the most appropriate curve
according to answers to selecting questions (e.g., “the analyzed trajectory is/isn’t
symmetric around the point of inflection ?”).
Now we will survey favorite trend curves including the formulas for estimation of
their parameters and for construction of their (point and interval) predictions in the
time series models of the type (3.1):
3.1.2.1 Linear Trend
It is a simple trend in the form of a straight line
$$Tr_t = \beta_0 + \beta_1 t, \quad t = 1, \dots, n. \tag{3.2}$$
The OLS estimates $b_0$ and $b_1$ of parameters $\beta_0$ and $\beta_1$ fulfill the system of normal
equations
$$\begin{aligned}
b_0 n + b_1 \sum_{t=1}^{n} t &= \sum_{t=1}^{n} y_t, \\
b_0 \sum_{t=1}^{n} t + b_1 \sum_{t=1}^{n} t^2 &= \sum_{t=1}^{n} t y_t
\end{aligned} \tag{3.3}$$
with solution given by the formulas
$$\begin{aligned}
b_1 &= \frac{\sum_{t=1}^{n} t y_t - \bar{t} \sum_{t=1}^{n} y_t}{\sum_{t=1}^{n} t^2 - n \bar{t}^2}
     = \frac{\sum_{t=1}^{n} t y_t - \frac{n+1}{2} \sum_{t=1}^{n} y_t}{n(n^2 - 1)/12}, \\
b_0 &= \bar{y} - \bar{t}\, b_1 = \bar{y} - \frac{n+1}{2}\, b_1.
\end{aligned} \tag{3.4}$$
The prediction $\hat{y}_T$ of future value $y_T$ has the form
$$\hat{y}_T = b_0 + b_1 T \tag{3.5}$$
and the $(1 - p)100\%$ prediction interval for this value (e.g., the 95% interval if $p = 0.05$;
see Sect. 2.2.3) under the normality assumption (or normality achieved
asymptotically) is
$$\left( b_0 + b_1 T - t_{1-p/2}(n-2)\, s\, f_T,\ \ b_0 + b_1 T + t_{1-p/2}(n-2)\, s\, f_T \right), \tag{3.6}$$
where
$$\begin{aligned}
s &= \sqrt{\frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{n-2}} = \sqrt{\frac{\sum_{t=1}^{n} y_t^2 - \sum_{t=1}^{n} \hat{y}_t^2}{n-2}}, \\
f_T &= \sqrt{1 + \frac{1}{n} + \frac{\left(T - \frac{n+1}{2}\right)^2}{n(n^2-1)/12}}.
\end{aligned} \tag{3.7}$$
Example 3.1 (linear trend). Table 3.1 and Fig. 3.2 present the elimination of linear
trend in the time series $y_t$ of the Swiss gross national income (at current prices in
billions of CHF) for particular years 1980–2015 ($t = 1, \dots, 36$). The spreadsheet of
EViews 7 in Table 3.2 and the predictions for particular years 2016–2025 in
Table 3.1 coincide with the calculations according to the formulas (3.4) and (3.5):
Table 3.1 Annual data 1980–2015, eliminated linear trend, and predictions for years 2016–2025 in
Example 3.1 (Swiss gross national income in bn CHF)
t Year y_t (bn CHF) ŷ_t (bn CHF) t Year y_t (bn CHF) ŷ_t (bn CHF)
1 1980 203.9 211.7 24 2003 505.9 517.5
2 1981 220.7 225.0 25 2004 520.5 530.8
3 1982 231.4 238.3 26 2005 550.8 544.1
4 1983 239.4 251.6 27 2006 579.2 557.4
5 1984 257.7 264.9 28 2007 577.4 570.7
6 1985 273.2 278.2 29 2008 559.0 584.0
7 1986 284.6 291.5 30 2009 599.1 597.3
8 1987 294.9 304.8 31 2010 642.8 610.6
9 1988 315.3 318.1 32 2011 624.3 623.9
10 1989 339.4 331.4 33 2012 637.6 637.2
11 1990 365.8 344.7 34 2013 649.6 650.5
12 1991 382.4 358.0 35 2014 649.8 663.8
13 1992 389.2 371.3 36 2015 660.3 677.1
14 1993 399.6 384.6 37 2016 690.4
15 1994 406.1 397.9 38 2017 703.7
16 1995 414.5 411.2 39 2018 717.0
17 1996 419.0 424.5 40 2019 730.3
18 1997 435.0 437.8 41 2020 743.6
19 1998 448.9 451.0 42 2021 756.9
20 1999 460.4 464.3 43 2022 770.2
21 2000 489.3 477.6 44 2023 783.5
22 2001 488.9 490.9 45 2024 796.7
23 2002 482.4 504.2 46 2025 810.0
Source: AMECO (European Commission Annual Macro-Economic Database). (https://ec.europa.
eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
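$$\begin{aligned}
b_1 &= \frac{\sum_{t=1}^{36} t y_t - \frac{36+1}{2} \sum_{t=1}^{36} y_t}{36(36^2 - 1)/12} = \frac{51\,655.23}{3\,885} = 13.29607, \\
b_0 &= \bar{y} - \frac{36+1}{2}\, b_1 = 444.4009 - 18.5 \cdot 13.29607 = 198.4236, \\
\hat{y}_{37} &= b_0 + b_1 \cdot 37 = 198.4236 + 13.29607 \cdot 37 = 690.4,
\end{aligned}$$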
etc. Further according to (3.6) and (3.7), we calculated the 95% prediction intervals,
e.g.:
[Fig. 3.2 Annual data 1980–2015, eliminated linear trend, and predictions for years 2016–2025 in Example 3.1 (Swiss gross national income in bn CHF). Plotted: Swiss gross national income at current prices (bn CHF), point prediction, and 95% prediction interval; horizontal axis: years 1980–2025.]
Table 3.2 Spreadsheet of EViews 7 for the Swiss gross national income from Example 3.1 (elimination of linear trend)
Dependent variable: M1
Method: Least squares
Sample: 1980–2015
Included observations: 36
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          198.4236      4.469807     44.39198      0.0000
T          13.29607      0.210670     63.11334      0.0000
R-squared 0.991537   S.E. of regression 13.13100
$$\begin{aligned}
s &= \sqrt{\frac{\sum_{t=1}^{36} y_t^2 - \sum_{t=1}^{36} \hat{y}_t^2}{34}} = 13.13100, \\
f_{37} &= \sqrt{1 + \frac{1}{36} + \frac{\left(37 - \frac{36+1}{2}\right)^2}{36(36^2-1)/12}} = 1.056, \\
\hat{y}_{37} \pm t_{0.975}(36-2)\, s\, f_{37} &= 690.4 \pm 2.0322 \cdot 13.13100 \cdot 1.056 = 690.4 \pm 28.2,
\end{aligned}$$
i.e.,
$$(662.2;\ 718.6),$$
etc. (see Fig. 3.2). ⋄
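The calculations of Example 3.1 can be retraced with a short script. The sketch below applies the formulas (3.4)–(3.7) to the values of Table 3.1; the function names are illustrative, and the quantile 2.0322 is simply taken from the example. Since the table reports rounded data, the results should agree with the text only approximately.

```python
import math

# Swiss gross national income 1980-2015 (bn CHF), copied from Table 3.1
y = [203.9, 220.7, 231.4, 239.4, 257.7, 273.2, 284.6, 294.9, 315.3, 339.4,
     365.8, 382.4, 389.2, 399.6, 406.1, 414.5, 419.0, 435.0, 448.9, 460.4,
     489.3, 488.9, 482.4, 505.9, 520.5, 550.8, 579.2, 577.4, 559.0, 599.1,
     642.8, 624.3, 637.6, 649.6, 649.8, 660.3]

def fit_linear_trend(y):
    """OLS estimates b0, b1 of the linear trend by the closed formulas (3.4)."""
    n = len(y)
    t_bar = (n + 1) / 2
    numerator = sum(t * yt for t, yt in enumerate(y, start=1)) - t_bar * sum(y)
    b1 = numerator / (n * (n * n - 1) / 12)
    b0 = sum(y) / n - t_bar * b1
    return b0, b1

def prediction_interval(y, T, t_quantile):
    """Point prediction (3.5) and prediction interval (3.6)-(3.7) for time T."""
    n = len(y)
    b0, b1 = fit_linear_trend(y)
    fitted = [b0 + b1 * t for t in range(1, n + 1)]
    s = math.sqrt(sum((yt - ft) ** 2 for yt, ft in zip(y, fitted)) / (n - 2))
    f_T = math.sqrt(1 + 1 / n + (T - (n + 1) / 2) ** 2 / (n * (n * n - 1) / 12))
    point = b0 + b1 * T
    half_width = t_quantile * s * f_T
    return point - half_width, point, point + half_width

b0, b1 = fit_linear_trend(y)
lower, point, upper = prediction_interval(y, T=37, t_quantile=2.0322)
print("b0 =", b0, "b1 =", b1)
print("prediction for 2016:", point, "interval:", (lower, upper))
```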
Remark 3.1 As far as polynomial trends of higher order are concerned, the quadratic
trend can also be found in economic and financial applications
$$Tr_t = \beta_0 + \beta_1 t + \beta_2 t^2, \quad t = 1, \dots, n. \tag{3.8}$$
The prediction $\hat{y}_T$ of future value $y_T$ has the form
$$\hat{y}_T = b_0 + b_1 T + b_2 T^2, \tag{3.9}$$
and the corresponding $(1-p)100\%$ prediction interval is
$$\left( \hat{y}_T - t_{1-p/2}(n-3)\, s\, f_T,\ \ \hat{y}_T + t_{1-p/2}(n-3)\, s\, f_T \right), \tag{3.10}$$
where
$$\begin{aligned}
s &= \sqrt{\frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{n-3}} = \sqrt{\frac{\sum_{t=1}^{n} y_t^2 - \sum_{t=1}^{n} \hat{y}_t^2}{n-3}}, \\
f_T &= \sqrt{1 + (1, T, T^2)\,(X'X)^{-1}\,(1, T, T^2)'}, \qquad
X = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ \vdots & \vdots & \vdots \\ 1 & n & n^2 \end{pmatrix}.
\end{aligned} \tag{3.11}$$
⋄
3.1.2.2 Exponential Trend
It is the two-parametric trend in the form of an exponential
$$Tr_t = \alpha\beta^t, \quad t = 1, \dots, n \quad (\beta > 0) \tag{3.12}$$
(the parameters are denoted as α and β). This trend has two typical characteristics,
namely that both its coefficient of growth (i.e., the ratio of neighboring values
$Tr_{t+1}/Tr_t$) and the ratio of neighboring differences
$$\frac{Tr_{t+2} - Tr_{t+1}}{Tr_{t+1} - Tr_t} \tag{3.13}$$
are constant in time with value β. If α > 0, then the exponential trend is increasing
for β > 1 and decreasing for 0 < β < 1. Both parameters of the exponential trend can
be estimated by taking its logarithm, which transforms this trend to the linear one
$$\ln Tr_t = \ln\alpha + t \ln\beta. \tag{3.14}$$
Then it is sufficient to find the antilogarithm of the estimated parameters ln α and ln
β (in any case, if one conjectures that an analyzed trend could be exponential, one
should plot the corresponding time series using the logarithmic scale). Moreover, by
taking the antilogarithm of the prediction intervals in the linear model (3.14), one can
also construct the prediction intervals in the original exponential model. On the other
hand, practical experience with the exponential trend (3.12) (and also with other
models of nonlinear regression which can be transformed to linear regression
by a suitable transformation) shows that more consistent estimation results can be
obtained using the weighted least squares method (WLS) with weights which are
obtained by a suitable transformation of the original weights, since one cannot
assume the multiplicative form and the log-normal distribution of the residual
component in the original model (3.12) before transformation. In particular, for the
exponential trend the WLS method consists in minimizing the expression
$$\sum_{t=1}^{n} v_t \left( y_t - \alpha\beta^t \right)^2, \tag{3.15}$$
where the weights $v_t$ are chosen in advance. However, instead of the expression
(3.15), one minimizes the weighted sum of squares of the form
$$\sum_{t=1}^{n} w_t \left( \ln y_t - \ln\alpha - t \ln\beta \right)^2, \tag{3.16}$$
where the weights $w_t$ are constructed in dependence on the original weights $v_t$ in such
a way that the minimization of (3.15) and (3.16) provides nearly identical estimates of α
and β. It can be shown that in our case of logarithmic transformation one can put
$$w_t = y_t^2 v_t, \quad t = 1, \dots, n. \tag{3.17}$$
Since the most usual choice of original weights is $v_t = 1$, $t = 1, \dots, n$ (if there is a
priori no reason to prefer some of the given observations), the transformed weights are
simply $w_t = y_t^2$, $t = 1, \dots, n$. Minimizing the expression (3.16) with the weights
(3.17), one obtains the following system of normal equations (all sums run over $t = 1, \dots, n$):
$$\begin{aligned}
\left(\sum y_t^2\right) \ln\alpha + \left(\sum t y_t^2\right) \ln\beta &= \sum y_t^2 \ln y_t, \\
\left(\sum t y_t^2\right) \ln\alpha + \left(\sum t^2 y_t^2\right) \ln\beta &= \sum t y_t^2 \ln y_t
\end{aligned}$$
with explicit solution
$$\begin{aligned}
\ln a &= \frac{\sum t^2 y_t^2 \sum y_t^2 \ln y_t - \sum t y_t^2 \sum t y_t^2 \ln y_t}{\sum y_t^2 \sum t^2 y_t^2 - \left(\sum t y_t^2\right)^2}, \\
\ln b &= \frac{\sum y_t^2 \sum t y_t^2 \ln y_t - \sum t y_t^2 \sum y_t^2 \ln y_t}{\sum y_t^2 \sum t^2 y_t^2 - \left(\sum t y_t^2\right)^2}.
\end{aligned} \tag{3.18}$$
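The explicit solution (3.18) is easy to program. The following minimal sketch evaluates it for the usual choice $v_t = 1$ (i.e., weights $w_t = y_t^2$) and checks it on an artificial, noise-free exponential series, for which the estimates must reproduce the true parameters. The function name and the test series are assumptions made for illustration.

```python
import math

def fit_exponential_wls(y):
    """Estimates a, b of Tr_t = a * b**t by (3.18), i.e., WLS with weights y_t**2."""
    s0 = s1 = s2 = c0 = c1 = 0.0
    for t, yt in enumerate(y, start=1):
        w = yt * yt                    # transformed weight (3.17) with v_t = 1
        s0 += w                        # sum of y^2
        s1 += t * w                    # sum of t y^2
        s2 += t * t * w                # sum of t^2 y^2
        c0 += w * math.log(yt)         # sum of y^2 ln y
        c1 += t * w * math.log(yt)     # sum of t y^2 ln y
    det = s0 * s2 - s1 * s1
    ln_a = (s2 * c0 - s1 * c1) / det
    ln_b = (s0 * c1 - s1 * c0) / det
    return math.exp(ln_a), math.exp(ln_b)

# Artificial exponential series 100 * 1.05**t, t = 1, ..., 20 (illustrative)
y = [100.0 * 1.05 ** t for t in range(1, 21)]
a, b = fit_exponential_wls(y)
print("a =", a, "b =", b)
```

For real data the weights $y_t^2$ emphasize the large observations, so the fitted curve follows the later (higher) part of a growing series more closely than the unweighted fit of (3.14) would.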
Example 3.2 (exponential trend). Table 3.3 and Fig. 3.3 present the elimination of
exponential trend in the time series $y_t$ of the US gross national income (at current
prices in billions of USD) for particular years 1960–2016 ($t = 1, \dots, 57$). The
auxiliary results for formulas (3.18) given in Table 3.4 make it possible to calculate the
estimated parameters:
$$\begin{aligned}
\ln a &= 7.20385, \quad \text{i.e.}\ a = 1\,344.60, \\
\ln b &= 0.047816, \quad \text{i.e.}\ b = 1.04898
\end{aligned}$$
so that the eliminated exponential trend (regarded as the smoothed time series in
practice) can be calculated in Table 3.3 as
$$\hat{y}_t = 1\,344.60 \cdot 1.04898^t.$$
Figure 3.3 also plots the modified exponential trend (see (3.19) below), which obviously
fits the given time series better than the exponential trend. ⋄
3.1.2.3 Modified Exponential Trend
This trend of the type
$$Tr_t = \gamma + \alpha\beta^t, \quad t = 1, \dots, n \quad (\beta > 0) \tag{3.19}$$
is the three-parametric generalization of the exponential trend (the parameters are
denoted as α, β, and γ). It is suitable to model trends with constant ratio of
Table 3.3 Annual data 1960–2016 and eliminated exponential trend in Example 3.2 (US gross national income in bn USD)
Year t y_t (bn USD) ŷ_t (bn USD) Year t y_t (bn USD) ŷ_t (bn USD) Year t y_t (bn USD) ŷ_t (bn USD)
1960 1 542.7 1410.5 1979 20 2619.3 3498.8 1998 39 9167.6 8679.1
1961 2 562.0 1479.5 1980 21 2852.8 3670.1 1999 40 9725.3 9104.2
1962 3 604.7 1552.0 1981 22 3207.1 3849.9 2000 41 10,421.2 9550.1
1963 4 638.1 1628.0 1982 23 3374.7 4038.5 2001 42 10,788.6 10,017.8
1964 5 686.4 1707.7 1983 24 3621.0 4236.3 2002 43 11,098.9 10,508.5
1965 6 744.6 1791.4 1984 25 4038.3 4443.7 2003 44 11,591.4 11,023.2
1966 7 815.9 1879.1 1985 26 4320.9 4661.4 2004 45 12,372.6 11,563.0
1967 8 862.4 1971.2 1986 27 4530.4 4889.7 2005 46 13,221.8 12,129.4
1968 9 943.3 2067.7 1987 28 4847.2 5129.2 2006 47 14,140.8 12,723.4
1969 10 1020.7 2169.0 1988 29 5275.8 5380.4 2007 48 14,585.8 13,346.6
1970 11 1076.9 2275.2 1989 30 5618.3 5643.9 2008 49 14,791.2 14,000.3
1971 12 1165.9 2386.6 1990 31 5922.9 5920.3 2009 50 14,494.5 14,686.0
1972 13 1283.9 2503.5 1991 32 6117.2 6210.3 2010 51 15,121.1 15,405.3
1973 14 1435.1 2626.1 1992 33 6459.4 6514.4 2011 52 15,802.9 16,159.8
1974 15 1557.0 2754.8 1993 34 6758.4 6833.5 2012 53 16,596.1 16,951.3
1975 16 1688.6 2889.7 1994 35 7195.8 7168.2 2013 54 17,073.7 17,781.5
1976 17 1874.0 3031.2 1995 36 7602.3 7519.3 2014 55 17,899.1 18,652.4
1977 18 2087.0 3179.7 1996 37 8075.4 7887.6 2015 56 18,496.1 19,565.9
1978 19 2355.0 3335.4 1997 38 8620.4 8273.9 2016 57 19,041.6 20,524.2
Σ 21,944.2 Σ 101,057.6 Σ 266,430.3
Source: AMECO (European Commission Annual Macro-Economic Database) (https://ec.europa.eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
[Fig. 3.3 Annual data 1960–2016 and eliminated exponential trend in Example 3.2 (US gross national income in bn USD). Plotted: US gross national income at current prices (bn USD), eliminated exponential trend, and eliminated modified exponential trend; horizontal axis: years 1960–2015.]
Table 3.4 Auxiliary results in Example 3.2 (exponential trend)
Σ y_t²            = 4 528 876 399
Σ t y_t²          = 215 566 247 678
Σ t² y_t²         = 10 576 070 915 906
Σ ln y_t          = 475
Σ y_t² ln y_t     = 42 932 861 468
Σ t y_t² ln y_t   = 2 058 612 296 639
neighboring differences, but in addition to this property of exponential trend, the
modified trend can be bounded asymptotically (sometimes one formulates it practically
as a movement to saturation level; see Fig. 3.4).
[Fig. 3.4 Modified exponential trend (α < 0, 0 < β < 1, γ > 0); vertical axis: y_t.]
The approximate parameter estimation for the modified exponential trend
(optionally before using more accurate iterative procedures of nonlinear optimization)
consists in the following approach. We split the observed time series into three
parts of the same length m (if n ≠ 3m, then we omit one or two observations,
preferably at the beginning of the time series) and add up the observations in particular
thirds so that one can write
$$\begin{aligned}
\sum\nolimits_1 y_t \approx \sum\nolimits_1 Tr_t &= m\gamma + \frac{\alpha\beta\,(\beta^m - 1)}{\beta - 1}, \\
\sum\nolimits_2 y_t \approx \sum\nolimits_2 Tr_t &= m\gamma + \frac{\alpha\beta^{m+1}(\beta^m - 1)}{\beta - 1}, \\
\sum\nolimits_3 y_t \approx \sum\nolimits_3 Tr_t &= m\gamma + \frac{\alpha\beta^{2m+1}(\beta^m - 1)}{\beta - 1},
\end{aligned}$$
where, e.g., $\sum_1 y_t$ and $\sum_1 Tr_t$ denote the sum of observed and trend values from the
first third of the time series, respectively. Solving this system of equations, one can
stepwise obtain the estimates b, a, c of parameters β, α, γ as
$$b = \left( \frac{\sum_3 y_t - \sum_2 y_t}{\sum_2 y_t - \sum_1 y_t} \right)^{1/m}, \tag{3.20}$$
$$a = \frac{b - 1}{b\,(b^m - 1)^2} \left( \sum\nolimits_2 y_t - \sum\nolimits_1 y_t \right), \tag{3.21}$$
$$c = \frac{1}{m} \left( \sum\nolimits_1 y_t - \frac{a b\,(b^m - 1)}{b - 1} \right). \tag{3.22}$$
Another approach is also possible: If fixing the value of parameter β, the model
(3.19) will become obviously the linear model in which one estimates simply the
parameters α and γ for various fixed values β and chooses finally the variant
minimizing SSE.
Example 3.3 (modified exponential trend). Table 3.5 and Fig. 3.5 present the
elimination of the modified exponential trend in the time series $y_t$ of the Japan gross
national income (at current prices in billions of JPY) for particular years 1960–2016
($t = 1, \dots, 57$).
The data are divided into three groups ($m = 19$), for which particular sums are
calculated (see Table 3.5). Then the formulas (3.20)–(3.22) provide stepwise the
following results:
Table 3.5 Annual data 1960–2016 and eliminated modified exponential trend in Example 3.3
(Japan gross national income in bn JPY)
Year t yt Year t yt Year t yt
1960 1 16,421 1979 20 227,692 1998 39 519,390
1961 2 19,817 1980 21 246,449 1999 40 511,280
1962 3 22,480 1981 22 264,544 2000 41 516,340
1963 4 25,717 1982 23 278,328 2001 42 513,933
1964 5 30,225 1983 24 289,723 2002 43 507,189
1965 6 33,640 1984 25 308,153 2003 44 507,117
1966 7 39,080 1985 26 331,538 2004 45 513,112
1967 8 45,807 1986 27 346,885 2005 46 515,652
1968 9 54,222 1987 28 361,520 2006 47 521,152
1969 10 63,708 1988 29 388,725 2007 48 530,313
1970 11 75,124 1989 30 419,067 2008 49 518,002
1971 12 82,724 1990 31 452,267 2009 50 484,216
1972 13 94,845 1991 32 479,613 2010 51 495,651
1973 14 115,496 1992 33 492,078 2011 52 486,254
1974 15 137,541 1993 34 495,227 2012 53 490,386
1975 16 152,089 1994 35 499,681 2013 54 496,725
1976 17 170,819 1995 36 505,821 2014 55 506,607
1977 18 190,438 1996 37 517,710 2015 56 522,127
1978 19 209,883 1997 38 530,218 2016 57 525,062
Σ 1,580,076 Σ 7,435,239 Σ 9,680,508
Source: AMECO (European Commission Annual Macro-Economic Database) (https://ec.europa.
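$$\begin{aligned}
b &= \left( \frac{\sum_3 y_t - \sum_2 y_t}{\sum_2 y_t - \sum_1 y_t} \right)^{1/m}
   = \left( \frac{9\,680\,508 - 7\,435\,239}{7\,435\,239 - 1\,580\,076} \right)^{1/19} = 0.950804, \\
a &= \frac{b - 1}{b\,(b^m - 1)^2} \left( \sum\nolimits_2 y_t - \sum\nolimits_1 y_t \right)
   = \frac{0.950804 - 1}{0.950804\,(0.950804^{19} - 1)^2}\,(7\,435\,239 - 1\,580\,076) = -797\,015, \\
c &= \frac{1}{m} \left( \sum\nolimits_1 y_t - \frac{a b\,(b^m - 1)}{b - 1} \right)
   = \frac{1}{19} \left( 1\,580\,076 - \frac{(-797\,015) \cdot 0.950804 \cdot (0.950804^{19} - 1)}{0.950804 - 1} \right) = 583\,001.
\end{aligned}$$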
The eliminated modified exponential trend (regarded again as the smoothed time
series) can be calculated as
$$\hat{y}_t = 583\,001 - 797\,015 \cdot 0.950804^t$$
and it is plotted in Fig. 3.5 (moreover, the saturation “insurmountable” level for the
Japan gross national income according to this model should be approximately
583 000 bn JPY). ⋄
[Fig. 3.5 Annual data 1960–2016 and eliminated modified exponential trend in Example 3.3 (Japan gross national income in bn JPY). Plotted: Japanese gross national income at current prices (bn JPY) and eliminated modified exponential trend; horizontal axis: years 1960–2015.]
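The three-sums estimation of Example 3.3 can be retraced by the following minimal sketch, which applies the formulas (3.20)–(3.22) to the data of Table 3.5. The function name is illustrative. If the series length is not a multiple of three, the sketch drops the leading observations; the estimate a then refers to the re-indexed (shortened) series.

```python
# Japan gross national income 1960-2016 (bn JPY), copied from Table 3.5
y = [16421, 19817, 22480, 25717, 30225, 33640, 39080, 45807, 54222, 63708,
     75124, 82724, 94845, 115496, 137541, 152089, 170819, 190438, 209883,
     227692, 246449, 264544, 278328, 289723, 308153, 331538, 346885, 361520,
     388725, 419067, 452267, 479613, 492078, 495227, 499681, 505821, 517710,
     530218, 519390, 511280, 516340, 513933, 507189, 507117, 513112, 515652,
     521152, 530313, 518002, 484216, 495651, 486254, 490386, 496725, 506607,
     522127, 525062]

def fit_modified_exponential(y):
    """Three-sums estimates (3.20)-(3.22) of Tr_t = c + a * b**t."""
    m = len(y) // 3
    y = y[len(y) - 3 * m:]             # omit leading observations if n != 3m
    s1, s2, s3 = sum(y[:m]), sum(y[m:2 * m]), sum(y[2 * m:])
    b = ((s3 - s2) / (s2 - s1)) ** (1 / m)                   # (3.20)
    a = (b - 1) / (b * (b ** m - 1) ** 2) * (s2 - s1)        # (3.21)
    c = (s1 - a * b * (b ** m - 1) / (b - 1)) / m            # (3.22)
    return a, b, c

a, b, c = fit_modified_exponential(y)
print("a =", a, "b =", b, "c =", c)
fitted_1960 = c + a * b ** 1
print("fitted value for 1960:", fitted_1960)
```

The fitted value for 1960 comes out negative, in line with the comment made in Example 3.4 on the beginning of the series.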
3.1.2.4 Logistic Trend
This three-parametric trend has the form
$$Tr_t = \frac{\gamma}{1 + \alpha\beta^t}, \quad t = 1, \dots, n \quad (\beta > 0,\ \gamma > 0). \tag{3.23}$$
It is plotted schematically in Fig. 3.6.
The logistic trend has the inflection (i.e., the change of convex course to concave
one and vice versa) in the point $t = -\ln\alpha/\ln\beta$. It can also be bounded asymptotically
(see again the saturation level γ in Fig. 3.6). Differentiating this trend with respect to time
t (time is regarded in this context as a continuous variable), one obtains
$$\frac{dTr_t}{dt} = -\frac{\ln\beta}{\gamma}\, Tr_t\, (\gamma - Tr_t). \tag{3.24}$$
It is another important indicator of the growth of trend curves (in general, the first
derivative of a trend curve is usually called the growth function). According to
(3.24), the velocity of growth of the logistic trend is directly proportional to the achieved
level $Tr_t$ and to the distance of the achieved level from the saturation level, i.e.,
$\gamma - Tr_t$; see Fig. 3.6. Moreover, the first derivative (3.24) is symmetric around the
inflection point $-\ln\alpha/\ln\beta$. Hence the logistic trend can be classified as a so-called
S-curve symmetric around the inflection point (S-curves have been discussed in Sect. 2.2.3,
e.g., as a suitable instrument for modeling sales of new products; see Fig. 2.2).
[Fig. 3.6 Logistic trend (α > 1, 0 < β < 1, γ > 0); vertical axis: y_t, horizontal axis: t, with the inflection point at t = −ln α / ln β.]
As far as the estimation of the logistic trend is concerned, its parameters can be estimated
by means of various methods. For example, the logistic trend can be regarded as the
reciprocal value of the modified exponential trend so that one can apply the formulas
(3.20)–(3.22) to the time series with values $1/y_t$. Another approach consists in the
so-called difference parametric estimation, which is based on the time series of the
first differences $y_{t+1} - y_t$. Here we approximate the trend component $Tr_t$ in (3.24) by
the real observations $y_t$ so that one can write
$$\frac{dy_t}{dt} \approx -\frac{\ln\beta}{\gamma}\, y_t\, (\gamma - y_t). \tag{3.25}$$
If we approximate further
$$\frac{dy_t}{dt} \approx \frac{y_{t+1} - y_t}{(t+1) - t} = y_{t+1} - y_t = d_t, \tag{3.26}$$
where $d_t$ denotes the time series of the first differences, then it follows from (3.25)
$$\frac{d_t}{y_t} \approx -\ln\beta + \frac{\ln\beta}{\gamma}\, y_t. \tag{3.27}$$
Using the classical least squares method in the linear regression model
$$\frac{d_t}{y_t} = -\ln\beta + \frac{\ln\beta}{\gamma}\, y_t + \varepsilon_t \tag{3.28}$$
one obtains the OLS estimates of $-\ln\beta$ and $\ln\beta/\gamma$ and hence the estimates of
parameters β and γ. In order to obtain the estimate of α, we finally approximate $Tr_t$
by $y_t$ in (3.23)
$$\alpha\beta^t \approx \frac{\gamma}{y_t} - 1. \tag{3.29}$$
After taking the logarithm and summing over $t = 1, \dots, n$ one gets the so-called
Rhodes formula
$$\ln\alpha = -\frac{(n+1)\ln\beta}{2} + \frac{1}{n} \sum_{t=1}^{n} \ln\left( \frac{\gamma}{y_t} - 1 \right), \tag{3.30}$$
which makes it possible to estimate the parameter α.
Example 3.4 (logistic trend). Fig. 3.7 presents the elimination of logistic trend for
the data from Example 3.3 (Japan gross national income at current prices in billions
of JPY) estimated as
$$\hat{y}_t = \frac{518\,158}{1 + 37.1437 \cdot 0.845748^t}.$$
Obviously the logistic trend fits the given time series better than the modified
exponential trend in Fig. 3.5 (e.g., the estimated values of modified exponential
trend have turned out negative at the beginning of the given time series). The
saturation level for the Japan gross national income is approximately 518 160 bn JPY
in this case. ⋄
[Fig. 3.7 Annual data 1960–2016 and eliminated logistic trend in Example 3.4 (Japan gross national income in bn JPY). Plotted: Japanese gross national income at current prices (bn JPY) and eliminated logistic trend; horizontal axis: years 1960–2015.]
3.1.2.5 Gompertz Trend
The trend in the form of this curve arises similarly to the logistic trend by
transforming the modified exponential trend. In this case, one puts
$$\ln Tr_t = \gamma + \alpha\beta^t, \quad t = 1, \dots, n \quad (\beta > 0) \tag{3.31}$$
or equivalently
$$Tr_t = \exp\left( \gamma + \alpha\beta^t \right), \quad t = 1, \dots, n \quad (\beta > 0). \tag{3.32}$$
If applying the parameter values from Fig. 3.8, then the Gompertz trend has the
inflection in the point $t = -\ln(-\alpha)/\ln\beta$ and is bounded asymptotically. However,
the first derivative of this curve (i.e., the growth function) is not symmetric around the
inflection point, but it is skewed to the right. Hence the Gompertz trend is classified as
an S-curve asymmetric around the inflection point. The estimation procedure is similar
to that for the modified exponential trend using the time series with values $\ln y_t$.
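The remark on the estimation procedure can be illustrated by a minimal sketch: the three-sums formulas (3.20)–(3.22) are applied to the logarithms $\ln y_t$. The function names and the artificial test series (γ = 5, α = −3, β = 0.8) are assumptions made for illustration; for such a noise-free series of length 3m the method reproduces the parameters up to rounding errors.

```python
import math

def three_sums(z):
    """Three-sums estimates (3.20)-(3.22) for z_t = c + a * b**t, t = 1, ..., 3m."""
    m = len(z) // 3
    z = z[len(z) - 3 * m:]            # omit leading observations if n != 3m
    s1, s2, s3 = sum(z[:m]), sum(z[m:2 * m]), sum(z[2 * m:])
    b = ((s3 - s2) / (s2 - s1)) ** (1 / m)
    a = (b - 1) / (b * (b ** m - 1) ** 2) * (s2 - s1)
    c = (s1 - a * b * (b ** m - 1) / (b - 1)) / m
    return a, b, c

def fit_gompertz(y):
    """Estimates of alpha, beta, gamma in (3.32) and the implied saturation level."""
    a, b, c = three_sums([math.log(v) for v in y])
    return a, b, c, math.exp(c)       # exp(c) is the limit if a < 0 and 0 < b < 1

# Artificial Gompertz series, t = 1, ..., 30 (illustrative parameters)
y = [math.exp(5.0 - 3.0 * 0.8 ** t) for t in range(1, 31)]
a, b, c, saturation = fit_gompertz(y)
print("alpha =", a, "beta =", b, "gamma =", c, "saturation level =", saturation)
```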
Example 3.5 (Gompertz trend). Fig. 3.9 presents the elimination of Gompertz trend
for the data from Example 3.3 (Japan gross national income at current prices in
billions of JPY) estimated as
$$\hat{y}_t = \exp\left( 13.2004 - 489.071 \cdot 0.909765^t \right).$$
The model implies a saturation level for the Japan gross national income of
approximately 540 580 bn JPY.
Examples 3.1–3.5 demonstrate that time series of the same type (in our case the
gross national incomes) can be modeled using different trend curves. The choice of
the appropriate curve may depend on the economic or financial hypotheses: the
national income will not be saturated in the future (then, e.g., the linear or even
exponential trend) or there will be different levels of saturation (in the Japanese case
the lowest level 518 160 bn JPY applying the logistic trend or the highest one
583 000 bn JPY applying the modified exponential trend). ⋄
[Fig. 3.8 Gompertz trend (α < −1, 0 < β < 1); horizontal axis: t, with the inflection point at t = −ln(−α) / ln β.]
[Fig. 3.9 Annual data 1960–2016 and eliminated Gompertz trend in Example 3.5 (Japan gross national income in bn JPY). Plotted: Japanese gross national income at current prices (bn JPY) and eliminated Gompertz trend; horizontal axis: years 1960–2015.]
Remark 3.2 The examples of trend curves given in this section can be parametrized
in other ways, e.g., logistic trend (3.23) as
$$Tr_t = \frac{\beta_0}{1 + \exp(\beta_1 + \beta_2 t)}, \quad t = 1, \dots, n, \tag{3.33}$$
or Gompertz trend (3.32) as
$$Tr_t = \frac{\beta_0}{\exp\left( \beta_1 \beta_2^t \right)}, \quad t = 1, \dots, n. \tag{3.34}$$
Moreover, there are plenty of other trend curves, e.g., logarithmic trend
$$Tr_t = \beta_0 + \beta_1 \ln t, \quad t = 1, \dots, n, \tag{3.35}$$
which seems to resemble the modified exponential trend in Fig. 3.4 except for the
fact that it grows indefinitely (not to a saturation level), or Johnson trend
$$Tr_t = \frac{\beta_0}{\exp\left( \dfrac{\beta_1}{\beta_2 + t} \right)}, \quad t = 1, \dots, n. \tag{3.36}$$
⋄
3.1.2.6 Splines
Sometimes the trend changes its character in time and cannot be modeled by means
of a single mathematical curve over the whole range of observations (or only in a
complicated way). In such a case, one can use the technique of so-called spline
functions. Here instead of applying sophisticated mathematical functions, one splits
the given time series to several segments and estimates the trends in particular
segments by simpler functions linked to each other. Moreover, the joint curve
must be sufficiently smooth which can be guaranteed, e.g., by means of conditions
for the existence of two-sided derivatives of appropriate orders in joint points.
Splines consist frequently of piecewise polynomials with pieces defined by a
sequence of knots where the pieces join smoothly.
As an example, Fuller (1976) used for the time series of average wheat yields
(in the USA in years 1908–1971) the trend which is compounded from the following
curves ($t = 1$ corresponds to the year 1908):
$$\begin{aligned}
Tr_t &= 13.97, & t &= 1, \dots, 25, \\
Tr_t &= 13.97 + 0.0123\,(t - 25)^2, & t &= 25, \dots, 54, \\
Tr_t &= 24.314 + 0.664\,(t - 54), & t &= 54, \dots, 64.
\end{aligned}$$
In the first joint point $t = 25$, there exists the two-sided derivative of the first order,
while in the second joint point $t = 54$, the corresponding one-sided first derivatives
obviously differ from each other.
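The behavior of this compound trend in the joint points can be verified directly. The following minimal sketch stores each segment together with its analytic first derivative and reports the jumps in level and in slope at the two knots; the data structure and the function name are illustrative assumptions.

```python
# Each segment: (first t, last t, trend function, first derivative)
segments = [
    (1, 25, lambda t: 13.97, lambda t: 0.0),
    (25, 54, lambda t: 13.97 + 0.0123 * (t - 25) ** 2,
             lambda t: 2 * 0.0123 * (t - 25)),
    (54, 64, lambda t: 24.314 + 0.664 * (t - 54), lambda t: 0.664),
]

def jumps_at_knot(left, right, knot):
    """Return (jump in level, jump in first derivative) at the given knot."""
    return (right[2](knot) - left[2](knot), right[3](knot) - left[3](knot))

level_25, slope_25 = jumps_at_knot(segments[0], segments[1], 25)
level_54, slope_54 = jumps_at_knot(segments[1], segments[2], 54)
print("t = 25: level jump", level_25, "slope jump", slope_25)
print("t = 54: level jump", level_54, "slope jump", slope_54)
```

At t = 25 both jumps vanish. At t = 54 the levels agree up to the rounding of the published coefficients, while the one-sided slopes (about 0.713 from the left and 0.664 from the right) differ.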
The simplest case of the spline function is a piecewise linear function that is
linear in all segments but with different slopes. However, such a function is
not flexible enough and, moreover, is not smooth in knots. In practice it is
common to use cubic splines with such cubic polynomials in particular segments
that the two-sided derivatives of the second order exist in the corresponding
knots (higher order polynomials can have erratic behavior at the boundaries of
the domain).
Penalized splines present a different approach to this issue [see, e.g., Eilers and
Marx (2010), Durbin and Koopman (2012)]. Suppose that we wish to approximate a
time series y1, . . ., yT by a relatively smooth function Trt. The penalized spline
method chooses Trt by minimizing
$$\sum_{t=1}^{T} (y_t - Tr_t)^2 + \lambda \sum_{t=3}^{T} \left( \Delta^2 Tr_t \right)^2 \tag{3.37}$$
with respect to $Tr_t$ for given λ > 0. The penalty is based on the level of variation in
$Tr_t$ measured by the second difference $\Delta^2 Tr_t = Tr_t - 2Tr_{t-1} + Tr_{t-2}$ [see (3.61)]. If λ
is small, the values of $Tr_t$ will be close to the values of $y_t$ but $Tr_t$ may not be smooth
enough. If λ is large, the $Tr_t$ series will be smooth but the values of $Tr_t$ may not be
close enough to the values of $y_t$.
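Since (3.37) is quadratic in the unknown values $Tr_1, \dots, Tr_T$, setting its gradient to zero leads to the linear system $(I + \lambda D'D)\,Tr = y$, where $D$ denotes the matrix of second differences. This derivation and the following minimal sketch are additions for illustration; the sketch solves the system by plain Gaussian elimination, which is adequate only for short series.

```python
def penalized_trend(y, lam):
    """Minimizer of sum (y_t - Tr_t)^2 + lam * sum (second difference of Tr)^2."""
    n = len(y)
    # A = I + lam * D'D; row r of D has entries (1, -2, 1) in columns r, r+1, r+2
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for i in range(3):
            for j in range(3):
                A[r + i][r + j] += lam * stencil[i] * stencil[j]
    # Gaussian elimination without pivoting (A is symmetric positive definite)
    rhs = [float(v) for v in y]
    for k in range(n):
        for i in range(k + 1, n):
            factor = A[i][k] / A[k][k]
            if factor != 0.0:
                for j in range(k, n):
                    A[i][j] -= factor * A[k][j]
                rhs[i] -= factor * rhs[k]
    # Back substitution
    tr = [0.0] * n
    for i in range(n - 1, -1, -1):
        tr[i] = (rhs[i] - sum(A[i][j] * tr[j] for j in range(i + 1, n))) / A[i][i]
    return tr

def roughness(x):
    """Sum of squared second differences, i.e., the penalty term without lam."""
    return sum((x[t] - 2 * x[t - 1] + x[t - 2]) ** 2 for t in range(2, len(x)))

linear = [2.0 * t + 1.0 for t in range(10)]
zigzag = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
fit_linear = penalized_trend(linear, lam=100.0)
fit_zigzag = penalized_trend(zigzag, lam=10.0)
print(fit_linear)
print(roughness(zigzag), roughness(fit_zigzag))
```

A linear series has zero second differences and is therefore reproduced for any λ, while the zigzag series is smoothed more strongly as λ grows.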
Remark 3.3 In order to choose the appropriate trend curve for the given time series,
one can make use of simple reference tests based on characteristic features of
particular curves. A survey of such tests is given in Table 3.6. ⋄
Table 3.6 Reference tests for choice of appropriate trend curves
Trend                  Reference test
Linear                 First differences $y_{t+1} - y_t$ are approximately constant
Quadratic              Second differences $y_{t+2} - 2y_{t+1} + y_t$ are approximately constant
Exponential            Ratios of neighboring values $y_{t+1}/y_t$ (or first differences of logarithmic values $\ln y_{t+1} - \ln y_t$) are approximately constant
Modified exponential   Ratios of neighboring first differences $(y_{t+2} - y_{t+1})/(y_{t+1} - y_t)$ are approximately constant
Logistic               (1) Histogram of first differences $y_{t+1} - y_t$ looks like the density of N(0, 1)
                       (2) Ratios of neighboring first differences of reciprocal values $(1/y_{t+2} - 1/y_{t+1})/(1/y_{t+1} - 1/y_t)$ are approximately constant
Gompertz               Ratios of neighboring first differences of logarithmic values $(\ln y_{t+2} - \ln y_{t+1})/(\ln y_{t+1} - \ln y_t)$ are approximately constant
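The reference tests of Table 3.6 can be turned into a rough numerical screening. The sketch below computes the diagnostic series of each row (except the histogram test for the logistic trend) and measures how far it is from being constant by a relative spread. This measure, the function names, and the artificial modified exponential series are assumptions made for illustration; the table itself does not prescribe any particular measure of constancy.

```python
import math

def spread(values):
    """Relative spread (max - min) / |mean|; small values mean 'approximately constant'."""
    mean = sum(values) / len(values)
    if mean == 0:
        return float("inf")
    return (max(values) - min(values)) / abs(mean)

def reference_tests(y):
    """Relative spread of the diagnostic series from Table 3.6 (y must be positive)."""
    d1 = [b - a for a, b in zip(y, y[1:])]                 # first differences
    d2 = [b - a for a, b in zip(d1, d1[1:])]               # second differences
    dinv = [1 / b - 1 / a for a, b in zip(y, y[1:])]       # differences of 1/y
    dlog = [math.log(b) - math.log(a) for a, b in zip(y, y[1:])]
    return {
        "linear": spread(d1),
        "quadratic": spread(d2),
        "exponential": spread([b / a for a, b in zip(y, y[1:])]),
        "modified exponential": spread([b / a for a, b in zip(d1, d1[1:])]),
        "logistic": spread([b / a for a, b in zip(dinv, dinv[1:])]),
        "Gompertz": spread([b / a for a, b in zip(dlog, dlog[1:])]),
    }

# Artificial modified exponential series 500 - 400 * 0.9**t, t = 1, ..., 20
y = [500.0 - 400.0 * 0.9 ** t for t in range(1, 21)]
result = reference_tests(y)
best = min(result, key=result.get)
for name, value in result.items():
    print(name, value)
print("smallest spread:", best)
```

Note that special cases can satisfy several tests at once (e.g., an exactly exponential series also has constant ratios of neighboring first differences), so the screening only narrows down the choice.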
3.2 Method of Moving Averages
The method of moving averages is a representative of the so-called adaptive approaches
to the trend component (as is the exponential smoothing in Sect. 3.3). The
convenient property of such approaches is that they can handle systematic
components (e.g., trend) whose character varies in time (in particular, it is
not possible to apply mathematical curves with fixed parameters in such a case). On
the other hand, one assumes that in local segments of the time series the trend can be
eliminated by means of mathematical curves, provided that these curves are
constructed using different parameters in different segments. In other words, only a
local elimination of trend is acceptable. For instance, a time series cannot be
smoothed by means of the linear trend
$$\beta_0 + \beta_1 \tau, \quad \tau = 1, \dots, n, \tag{3.38}$$
but for short segments centered at particular times t one can apply local trends
$$\beta_0(t) + \beta_1(t)\,\tau, \quad \tau = \dots, t-1, t, t+1, \dots \tag{3.39}$$
Obviously, the process of trend elimination according to (3.39) adapts itself to the
actual local run of the time series, and, moreover, the intensity of this adaptation can be
controlled. Another advantage of adaptive methods is the numerical simplicity and
the construction of predictions which respond flexibly to possible changes in the
character of the time series.
As far as moving averages are concerned, this term denotes linear combinations of
time series values with weights that sum to one, e.g.,
$$\frac{1}{8} \left( y_{t-2} + 2y_{t-1} + 2y_t + 2y_{t+1} + y_{t+2} \right). \tag{3.40}$$
The construction of such a linear combination is, remarkably, equivalent to the local
elimination of trends by means of specific mathematical curves, as shown in
Sect. 3.2.1 (we still assume the basic form (3.1) of time series models).
3.2.1 Construction of Moving Averages by Local Polynomial Fitting
This approach is based on the premise that each “reasonable” function can be
approximated in an acceptable way by a polynomial. Following the previous
discussion, let us first fit a suitable polynomial to the initial time series segment
of length 2m + 1 and take the value of this polynomial at time t = m + 1 (i.e., in the
middle of this segment) as the smoothed value $\hat{y}_{m+1}$ of the given time series at this time.
In order to obtain the smoothed value at time t = m + 2, we similarly use the
observations $y_2, \ldots, y_{2m+2}$, etc. It will be shown in this section that such an algorithm
is formally equivalent to the construction of smoothed values as linear combinations
of original observations with fixed weights, i.e., to the construction of moving
averages.
For instance, let us successively fit a polynomial of the third order (i.e., a cubic
parabola) to the time series segments of length 2m + 1 = 5 denoted formally
as

$$y_{t+\tau}, \quad \tau = -2, -1, 0, 1, 2. \tag{3.41}$$

The parameters of this polynomial can be estimated by means of the least squares
method (i.e., as OLS estimates) minimizing the expression

$$\sum_{\tau=-2}^{2} \left(y_{t+\tau} - \beta_0 - \beta_1 \tau - \beta_2 \tau^2 - \beta_3 \tau^3\right)^2. \tag{3.42}$$

Differentiating with respect to the particular parameters, we obtain the system of four normal
equations for the estimates $b_0, b_1, b_2, b_3$ of parameters $\beta_0, \beta_1, \beta_2, \beta_3$ written as

$$\sum_{\tau=-2}^{2} y_{t+\tau}\,\tau^j - b_0 \sum_{\tau=-2}^{2} \tau^j - b_1 \sum_{\tau=-2}^{2} \tau^{j+1} - b_2 \sum_{\tau=-2}^{2} \tau^{j+2} - b_3 \sum_{\tau=-2}^{2} \tau^{j+3} = 0, \quad j = 0, 1, 2, 3. \tag{3.43}$$
Since it holds for each odd i that

$$\sum_{\tau=-2}^{2} \tau^i = 0 \tag{3.44}$$

(this is one of the reasons for choosing time series segments with an odd number
2m + 1 of observations), this system of equations simplifies to the form

$$\begin{aligned}
5b_0 + 10b_2 &= \sum y_{t+\tau}, \\
10b_1 + 34b_3 &= \sum \tau\, y_{t+\tau}, \\
10b_0 + 34b_2 &= \sum \tau^2 y_{t+\tau}, \\
34b_1 + 130b_3 &= \sum \tau^3 y_{t+\tau}.
\end{aligned} \tag{3.45}$$
However, we are interested only in the estimate $b_0$ since it is the value of the fitted
polynomial $b_0 + b_1\tau + b_2\tau^2 + b_3\tau^3$ at the point $\tau = 0$. Therefore, $b_0$ is taken in our
method as the smoothed value of the time series in the middle of the investigated
segment $y_{t-2}, \ldots, y_{t+2}$. Obviously, it is sufficient to use only the first and third equation
of system (3.45) with solution

$$b_0 = \frac{1}{35}\left(17 \sum y_{t+\tau} - 5 \sum \tau^2 y_{t+\tau}\right) = \frac{1}{35}\left(-3y_{t-2} + 12y_{t-1} + 17y_t + 12y_{t+1} - 3y_{t+2}\right), \tag{3.46}$$

so that the fitted trend component, which is simultaneously the smoothed
value of the time series at time t, is also equal to

$$\hat{y}_t = \frac{1}{35}\left(-3y_{t-2} + 12y_{t-1} + 17y_t + 12y_{t+1} - 3y_{t+2}\right). \tag{3.47}$$

It can be written symbolically as

$$\hat{y}_t = \frac{1}{35}(-3, 12, 17, 12, -3)\,y_t = \frac{1}{35}(-3, 12, 17, \ldots)\,y_t. \tag{3.48}$$
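The weights in (3.47) can be checked numerically by solving the least squares problem (3.42) directly. The following sketch assumes NumPy; the function name `local_poly_weights` and its parameter `k` (the point $\tau = k$ at which the fitted polynomial is evaluated) are illustrative choices, not notation from the text.

```python
import numpy as np

def local_poly_weights(m, r, k=0):
    """Weights w such that the order-r polynomial fitted by OLS to
    y_{t-m}, ..., y_{t+m} takes the value sum(w * y) at tau = k."""
    tau = np.arange(-m, m + 1)
    # Design matrix with columns 1, tau, tau^2, ..., tau^r
    X = np.vander(tau, r + 1, increasing=True).astype(float)
    # OLS estimates are b = pinv(X) @ y; the fitted value at tau = k is x_k @ b
    x_k = float(k) ** np.arange(r + 1)
    return x_k @ np.linalg.pinv(X)

w = local_poly_weights(m=2, r=3)   # cubic fit to five points, value at tau = 0
scaled = 35 * w                    # approximately (-3, 12, 17, 12, -3)
```

With `k=0` the function returns the row of the pseudoinverse that produces $b_0$, which is exactly the linear combination derived in (3.46).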
Example 3.6 In this example one applies the formula (3.47) to smooth the time
series given in Table 3.7. The smoothed value at time t = 3 is

$$\hat{y}_3 = \frac{1}{35}\left(-3 \cdot 1 + 12 \cdot 8 + 17 \cdot 27 + 12 \cdot 64 - 3 \cdot 125\right) = 27 = y_3.$$

Analogously

$$\hat{y}_4 = 64 = y_4,$$

etc. This result corresponds to the fact that one smoothes the cubic time series by the
cubic polynomial in this example (one would obtain the same results for any
polynomials with the order higher than three).
⋄

Table 3.7 Cubic time series from Example 3.6

t      1   2   3    4    5    6    7    8    9    10
y_t    1   8   27   64   125  216  343  512  729  1000
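The computation of Example 3.6 can be reproduced for all interior time points at once. This is a minimal sketch assuming NumPy, with `np.convolve` used as one convenient way to slide the weights of (3.47) along the series; the variable names are illustrative.

```python
import numpy as np

weights = np.array([-3, 12, 17, 12, -3]) / 35   # moving average (3.47)
y = np.arange(1, 11) ** 3                       # cubic series of Table 3.7

# mode="valid" keeps only positions where all five observations exist,
# i.e. t = 3, ..., 8; the weights are reversed because convolution flips them
smoothed = np.convolve(y, weights[::-1], mode="valid")
```

The result coincides with the original values $y_3, \ldots, y_8$ (up to rounding error), and the first two and last two observations remain unsmoothed.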
In general, we can fit segments of length 2m + 1 by polynomials of order r to
obtain the moving averages of length 2m + 1 and order r. The smoothed value $\hat{y}_t$ at
time t is the linear combination of expressions

$$\sum_{\tau=-m}^{m} \tau^j y_{t+\tau} \tag{3.49}$$

with even j ($j \le r$), which can be derived by generalizing the system of equations
(3.45). After rearrangement it gives a linear combination of values $y_{t-m}, \ldots, y_{t+m}$ with
fixed coefficients called weights of the moving average. One can easily verify the
following properties of moving averages:
1. The sum of weights of a moving average is equal to one (if one applies the moving
average to any series of constant values, then obviously the smoothed values must
again be the original constant values).
2. The weights are symmetric around the middle value (since for even j the values
$y_{t-\tau}$ and $y_{t+\tau}$ in the expressions of type (3.49) have symmetric coefficients).
3. If r is even, then the moving averages of orders r and r + 1 with the same length
2m + 1 are identical (looking, e.g., at the system of equations (3.45),
its solution for the unknown $b_0$ obviously does not depend on whether the
unknown $b_3$ is included in this system or not).
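The three properties above can be checked numerically for several lengths and orders. The sketch below assumes NumPy; the helper `ma_weights` is a hypothetical name and obtains the weights from the pseudoinverse of the polynomial design matrix, which is one possible way of solving the normal equations.

```python
import numpy as np

def ma_weights(m, r):
    """Weights of the moving average of length 2m + 1 and order r."""
    tau = np.arange(-m, m + 1)
    X = np.vander(tau, r + 1, increasing=True).astype(float)
    return np.linalg.pinv(X)[0]        # row producing b0 from the observations

checks = []
for m in range(2, 7):                  # lengths 5, 7, ..., 13
    for r in (2, 4):                   # even orders
        w = ma_weights(m, r)
        checks.append((
            np.isclose(w.sum(), 1.0),              # property 1: unit sum
            np.allclose(w, w[::-1]),               # property 2: symmetry
            np.allclose(w, ma_weights(m, r + 1)),  # property 3: r and r + 1 agree
        ))
```

Every tuple in `checks` should consist of three `True` values, in line with properties 1 to 3.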
Let us note that the described moving averages produce only the smoothed values
$\hat{y}_{m+1}, \ldots, \hat{y}_{n-m}$ (i.e., m values at the beginning and m values at the end remain
unsmoothed).
Another note concerns the case when it is desirable to smooth a time series using
segments with an even length 2m; then the smoothed values would be positioned
exactly in the middle between two neighboring time points, which has no reasonable practical
interpretation. We will solve both mentioned problems later (see, e.g., the centered
moving averages in Sect. 3.2.2).
Table 3.8 Weights of moving averages

Length   Order 2 and 3                             Order 4 and 5
3        (0, 1, 0)                                 (0, 1, 0)
5        1/35 (-3, 12, 17, ...)                    (0, 0, 1, ...)
7        1/21 (-2, 3, 6, 7, ...)                   1/231 (5, -30, 75, 131, ...)
9        1/231 (-21, 14, 39, 54, 59, ...)          1/429 (15, -55, 30, 135, 179, ...)
11       1/429 (-36, 9, 44, 69, 84, 89, ...)       1/429 (18, -45, -10, 60, 120, 143, ...)
13       1/143 (-11, 0, 9, 16, 21, 24, 25, ...)    1/2431 (110, -198, -135, 110, 390, 600, 677, ...)
Table 3.8 summarizes the weights of moving averages of various lengths and
orders (r = 2, ..., 5). Since the moving averages are symmetric, only the
first half of the weights is given (the last weight listed is the middle one). The weights for the second and
third order or for the fourth and fifth order are equal (see above). The moving
averages of order zero and one are omitted since they have the form of arithmetic
averages

$$\frac{y_{t-m} + \ldots + y_{t+m}}{2m + 1}.$$

However, for the sake of completeness, this table includes, e.g., the moving averages
of length 3 and order 3 in spite of the fact that $\hat{y}_t = y_t$ holds in such a case.
We have stressed above that the application of moving averages of length
2m + 1 delivers neither the smoothed values for the first m and the last
m observations nor any predictions. Let us go back to fitting five
neighboring observations by the cubic parabola (see above), and let the fitted
segment be the last one with values $y_{n-4}, \ldots, y_n$. In contrast to the previous
construction, now we are interested in the values of the cubic parabola fitted to the
last segment for $\tau = 1$ and 2 (these values have been ignored before). Therefore, in
addition we also need the estimates of parameters $\beta_1$, $\beta_2$, and $\beta_3$ in the parabola
model (before it has been sufficient to estimate only $\beta_0$). Solving the system of
equations (3.45), one can easily find these estimates in the form

$$\begin{aligned}
b_1 &= \frac{1}{72}\left(65 \sum_{\tau=-2}^{2} \tau\, y_{t+\tau} - 17 \sum_{\tau=-2}^{2} \tau^3 y_{t+\tau}\right), \\
b_2 &= \frac{1}{14}\left(\sum_{\tau=-2}^{2} \tau^2 y_{t+\tau} - 2 \sum_{\tau=-2}^{2} y_{t+\tau}\right), \\
b_3 &= \frac{1}{72}\left(5 \sum_{\tau=-2}^{2} \tau^3 y_{t+\tau} - 17 \sum_{\tau=-2}^{2} \tau\, y_{t+\tau}\right).
\end{aligned} \tag{3.50}$$
Together with the value (3.46) of $b_0$ one obtains for the last two observations $y_{n-1}$
and $y_n$ the following smoothed values:

$$\begin{aligned}
\hat{y}_{n-2+k} &= b_0 + b_1 k + b_2 k^2 + b_3 k^3 \\
&= \frac{1}{35}(-3, 12, 17, 12, -3)\,y_{n-2} + \frac{k}{12}(1, -8, 0, 8, -1)\,y_{n-2} \\
&\quad + \frac{k^2}{14}(2, -1, -2, -1, 2)\,y_{n-2} + \frac{k^3}{12}(-1, 2, 0, -2, 1)\,y_{n-2}, \quad k = 1, 2.
\end{aligned} \tag{3.51}$$

When substituting k = 1 and k = 2, one gets explicitly

$$\hat{y}_{n-1} = \frac{1}{35}(2, -8, 12, 27, 2)\,y_{n-2}, \qquad \hat{y}_n = \frac{1}{70}(-1, 4, -6, 4, 69)\,y_{n-2}. \tag{3.52}$$

Due to the apparent symmetry we can also immediately rewrite it for the first and
second value at the beginning of the time series

$$\hat{y}_1 = \frac{1}{70}(69, 4, -6, 4, -1)\,y_3, \qquad \hat{y}_2 = \frac{1}{35}(2, 27, 12, -8, 2)\,y_3. \tag{3.53}$$

Moreover, this approach enables us to construct predictions in the given time series:
e.g., the prediction of the value $y_{n+1}$ can be constructed by substituting k = 3 into
(3.51), i.e.,

$$\hat{y}_{n+1}(n) = \frac{1}{5}(-4, 11, -4, -14, 16)\,y_{n-2}. \tag{3.54}$$

Apparently it can be recommended only for short-term predictions: longer prediction
horizons significantly worsen the reliability of predictions.
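The end and prediction weights follow from evaluating the same fitted cubic at $\tau = k$ for k = 1, 2, 3. The sketch below assumes NumPy; `local_poly_weights` is a hypothetical helper that evaluates the OLS polynomial fitted to the last 2m + 1 observations at an arbitrary point k.

```python
import numpy as np

def local_poly_weights(m, r, k):
    """Weights of the order-r OLS polynomial on tau = -m..m evaluated at tau = k."""
    tau = np.arange(-m, m + 1)
    X = np.vander(tau, r + 1, increasing=True).astype(float)
    x_k = float(k) ** np.arange(r + 1)     # (1, k, k^2, ..., k^r)
    return x_k @ np.linalg.pinv(X)

w_end_1 = local_poly_weights(2, 3, 1)      # smoothed value at time n - 1
w_end_2 = local_poly_weights(2, 3, 2)      # smoothed value at time n
w_pred = local_poly_weights(2, 3, 3)       # one-step-ahead prediction
w_begin_1 = w_end_2[::-1]                  # first observation, by symmetry
```

Multiplying `w_end_1`, `w_end_2` and `w_pred` by 35, 70 and 5 recovers the integer weights of (3.52) and (3.54), and reversing the end weights gives the beginning weights of (3.53).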
The previous moving averages are naturally called beginning moving averages
[see, e.g., (3.53)], end moving averages [see, e.g., (3.52)], and prediction moving
averages [see, e.g., (3.54)]. One should stress that these moving averages lose the
advantageous properties of the ones in Table 3.8: their weights are
not symmetric around the middle value and the weights, e.g., for the second and third
order are not identical. On the other hand, the weights of these moving averages are also
commonly tabulated in the literature [see, e.g., Kendall (1976)]. Table 3.9 shows the
beginning moving averages of the second and third order with various lengths.
Tables 3.10, 3.11, and 3.12 summarize the one-step-ahead moving averages of the
first, second, and third order with various lengths (i.e., they predict the value $y_{n+1}$ at time
n). For instance, according to the first row of Table 3.10, the one-step-ahead
prediction using the moving averages of the first order with length 3 has the form

$$\hat{y}_{n+1}(n) = \frac{1}{3}\left(-2y_{n-2} + y_{n-1} + 4y_n\right).$$
There is an important question concerning the moving average methods, namely
which length and order of moving averages to choose for given observations. In
practice, we usually make subjective decisions preferring simple averages of low
orders. For example, the length of moving averages depends on the desired level of
time series smoothing (obviously, the longer the moving average, the
higher the level of smoothing). These are practical problems which deserve
some further comments.
As far as the choice of length of moving averages is concerned, one of the important
principles states that this length should correspond to the period of seasonal or
cyclical fluctuations that are to be eliminated from the time series. Fig. 3.10 shows
schematically applications of moving averages of inadequate lengths. Here one
has used the moving averages of lengths 3 and 5 to eliminate a biennial cycle. In the
first case (see Fig. 3.10a), the smoothing causes the “inverse cycle” since in each
group of three neighboring observations there are either two upper turning points and
one lower turning point, or vice versa. In the second case (see Fig. 3.10b) just the
opposite situation occurs: the smoothed time series follows the original time series
upward to the upper turning points and downward to the lower turning points.

Table 3.9 Beginning moving averages of the second and third order (each row lists the
weights applied to the first observations $y_1, y_2, \ldots$ of the segment of the given length)

Order r = 2
Length 5
  $\hat{y}_1$   1/35 (31, 9, -3, -5, 3)
  $\hat{y}_2$   1/35 (9, 13, 12, 6, -5)
  $\hat{y}_3$   1/35 (-3, 12, 17, 12, -3)
Length 7
  $\hat{y}_1$   1/42 (32, 15, 3, -4, -6, -3, 5)
  $\hat{y}_2$   1/14 (5, 4, 3, 2, 1, 0, -1)
  $\hat{y}_3$   1/14 (1, 3, 4, 4, 3, 1, -2)
  $\hat{y}_4$   1/21 (-2, 3, 6, 7, 6, 3, -2)
Length 9
  $\hat{y}_1$   1/165 (109, 63, 27, 1, -15, -21, -17, -3, 21)
  $\hat{y}_2$   1/330 (126, 92, 63, 39, 20, 6, -3, -7, -6)
  $\hat{y}_3$   1/2310 (378, 441, 464, 447, 390, 293, 156, -21, -238)
  $\hat{y}_4$   1/2310 (14, 273, 447, 536, 540, 459, 293, 42, -294)
  $\hat{y}_5$   1/231 (-21, 14, 39, 54, 59, 54, 39, 14, -21)

Order r = 3
Length 5
  $\hat{y}_1$   1/70 (69, 4, -6, 4, -1)
  $\hat{y}_2$   1/35 (2, 27, 12, -8, 2)
  $\hat{y}_3$   1/35 (-3, 12, 17, 12, -3)
Length 7
  $\hat{y}_1$   1/42 (39, 8, -4, -4, 1, 4, -2)
  $\hat{y}_2$   1/42 (8, 19, 16, 6, -4, -7, 4)
  $\hat{y}_3$   1/42 (-4, 16, 19, 12, 2, -4, 1)
  $\hat{y}_4$   1/21 (-2, 3, 6, 7, 6, 3, -2)
Length 9
  $\hat{y}_1$   1/99 (85, 28, -2, -12, -9, 0, 8, 8, -7)
  $\hat{y}_2$   1/198 (56, 65, 56, 36, 12, -9, -20, -14, 16)
  $\hat{y}_3$   1/1386 (-28, 392, 515, 432, 234, 12, -143, -140, 112)
  $\hat{y}_4$   1/462 (-56, 84, 144, 145, 108, 54, 4, -21, 0)
  $\hat{y}_5$   1/231 (-21, 14, 39, 54, 59, 54, 39, 14, -21)
Table 3.10 One-step-ahead moving averages of the first order (weights applied to the
last observations, ending with $y_n$)

Length   $\hat{y}_{n+1}(n)$
3        1/3 (-2, 1, 4)
5        1/10 (-4, -1, 2, 5, 8)
7        1/7 (-2, -1, 0, 1, 2, 3, 4)
9        1/36 (-8, -5, -2, 1, 4, 7, 10, 13, 16)
11       1/55 (-10, -7, -4, -1, 2, 5, 8, 11, 14, 17, 20)
13       1/26 (-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8)

Table 3.11 One-step-ahead moving averages of the second order

Length   $\hat{y}_{n+1}(n)$
5        1/5 (3, -3, -4, 0, 9)
7        1/7 (3, -1, -3, -3, -1, 3, 9)
9        1/42 (14, 0, -9, -13, -12, -6, 5, 21, 42)
11       1/165 (45, 9, -17, -33, -39, -35, -21, 3, 37, 81, 135)
13       1/143 (33, 11, -6, -18, -25, -27, -24, -16, -3, 15, 38, 66, 99)

Table 3.12 One-step-ahead moving averages of the third order

Length   $\hat{y}_{n+1}(n)$
5        1/5 (-4, 11, -4, -14, 16)
7        1/7 (-4, 6, 4, -3, -8, -4, 16)
9        1/126 (-56, 49, 64, 24, -36, -81, -76, 14, 224)
11       1/66 (-24, 12, 24, 19, 4, -14, -28, -31, -16, 24, 96)
13       1/143 (-44, 11, 36, 38, 24, 1, -24, -44, -52, -41, -4, 66, 176)

Fig. 3.10 Applications of moving averages of non-adequate lengths (panels (a) and (b):
original time series and smoothed time series)

In contrast, the order of moving averages can be chosen
by means of the following objective criterion based on differencing the time
series (see also Remark 3.4). Let the given time series $y_t$ fulfill the model (3.1), where
$Tr_t$ is a polynomial of the rth order (we denote it as $\beta_0 + \beta_1 t + \ldots + \beta_r t^r$) and $E_t$ is
white noise (we denote it for simplicity as $\varepsilon_t$ and its variance as $\sigma^2$). The
corresponding criterion, which should identify r as the order of the moving averages in
question, consists in successive differencing of the analyzed time series. When
differencing $y_t$, we decrease the order of its polynomial trend by one in each step:
e.g., the order of the polynomial

$$(\beta_0 + \beta_1 t + \ldots + \beta_r t^r) - (\beta_0 + \beta_1 (t-1) + \ldots + \beta_r (t-1)^r)$$

is r − 1, etc. It is important in our context that $\Delta^{r+1} Tr_t = 0$, i.e., the polynomial trend
$Tr_t$ can be eliminated completely after applying r + 1 successive differences (only
r differences are not sufficient since they produce a constant which may be nonzero).
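As far as differencing the white noise $\varepsilon_t$ is concerned, its kth difference

$$\Delta^k \varepsilon_t = \varepsilon_t - \binom{k}{1}\varepsilon_{t-1} + \binom{k}{2}\varepsilon_{t-2} - \ldots + (-1)^k \varepsilon_{t-k} \tag{3.55}$$

has zero mean value and variance fulfilling

$$\mathrm{var}\left(\Delta^k \varepsilon_t\right) = \left(1 + \binom{k}{1}^2 + \binom{k}{2}^2 + \ldots + 1\right)\sigma^2 = \binom{2k}{k}\sigma^2. \tag{3.56}$$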
If we denote

$$V_k = \frac{\sum_{t=k+1}^{n} \left(\Delta^k y_t\right)^2}{(n-k)\binom{2k}{k}}, \tag{3.57}$$

then obviously $V_k$ for $k \ge r + 1$ is the estimate of variance $\sigma^2$ of $\varepsilon_t$ which is consistent
in routine situations. These conclusions hold even if we approximate the trend
component by polynomials of order r only locally, as is the case in our approach
to constructing moving averages. Let us calculate the values $V_1, V_2, \ldots$ successively until
we note that these values start to converge to a constant. In such a case, if the values
$V_{r+1}, V_{r+2}, \ldots$ are close to this constant value, then our criterion speaks in favor of the
order r of the corresponding moving averages (moreover, the mentioned constant
represents the estimate of variance of the residual component of the given time series).
However, one must be very cautious when applying this criterion. Usually the values $V_k$
are not mutually independent and they tend to grow or decline without converging
manifestly to a constant. Sometimes the sequence of $V_k$ is decreasing, or it
decreases to a minimal value $V_{r+1}$ and then starts to grow slowly to an asymptotic
level (in the latter case, one can take r as the upper limit for the order of the
corresponding moving averages). Recently, numerically simple methods have been
suggested that look for the order of moving averages by minimizing a suitable
criterion function.
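The statistics $V_k$ of (3.57) can be computed as follows. This is a minimal sketch assuming NumPy and the standard library function `math.comb`; the function name `v_statistics` and the test series are hypothetical.

```python
import numpy as np
from math import comb

def v_statistics(y, k_max):
    """Values V_1, ..., V_{k_max} according to (3.57)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    values = []
    for k in range(1, k_max + 1):
        dk = np.diff(y, n=k)                    # k-th differences, n - k values
        values.append(np.sum(dk ** 2) / ((n - k) * comb(2 * k, k)))
    return np.array(values)

# Noise-free quadratic trend: differences of order three and higher vanish
v_quadratic = v_statistics(np.arange(20.0) ** 2, k_max=4)

# Pure white noise with unit variance: every V_k estimates sigma^2 = 1
rng = np.random.default_rng(1)
v_noise = v_statistics(rng.normal(size=5000), k_max=4)
```

For the quadratic series the values drop to zero from k = 3 on, which points to the order r = 2; for white noise all $V_k$ stay near the variance, as (3.56) implies.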
Remark 3.4 In general, a very important operation in time series analysis is the
construction of so-called dth differences. Theoretically, they can be based on
the concept of operators which simplify various time series models (see, e.g.,
Sect. 6.4.2):
• lag operator B delays a variable defined in time by one time unit:

$$B y_t = y_{t-1}; \tag{3.58}$$

• jth power of lag operator B delays a variable defined in time by j time units:

$$B^j y_t = B^{j-1}(B y_t) = B^{j-1} y_{t-1} = \ldots = y_{t-j}; \tag{3.59}$$

• difference operator $\Delta$ produces the first difference:

$$\Delta y_t = y_t - y_{t-1} = (1 - B) y_t; \tag{3.60}$$

• second power $\Delta^2$ of difference operator produces the second difference:

$$\Delta^2 y_t = \Delta(\Delta y_t) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2} = (1 - B)^2 y_t; \tag{3.61}$$

• dth power $\Delta^d$ of difference operator produces the dth difference:

$$\Delta^d y_t = \Delta^{d-1}(\Delta y_t) = y_t - \binom{d}{1} y_{t-1} + \binom{d}{2} y_{t-2} - \ldots + (-1)^d y_{t-d} = (1 - B)^d y_t; \tag{3.62}$$

• seasonal difference operator $\Delta_s$ for the length of season s (e.g., s = 12 for
monthly observations) produces the seasonal difference:

$$\Delta_s y_t = y_t - y_{t-s} = (1 - B^s) y_t. \tag{3.63}$$

⋄
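The operators of Remark 3.4 have direct counterparts in array computations. The following sketch assumes NumPy; the function names are illustrative, and the binomial version is included only to confirm the expansion (3.62) against repeated differencing.

```python
import numpy as np
from math import comb

def difference(y, d=1):
    """d-th difference (1 - B)^d y_t, defined for t = d + 1, ..., n."""
    return np.diff(np.asarray(y, dtype=float), n=d)

def seasonal_difference(y, s):
    """Seasonal difference (1 - B^s) y_t = y_t - y_{t-s}."""
    y = np.asarray(y, dtype=float)
    return y[s:] - y[:-s]

def difference_by_binomial(y, d):
    """d-th difference written out with binomial coefficients as in (3.62)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # The slice y[d - j : n - j] holds the lagged values y_{t-j} for t = d + 1..n
    return sum((-1) ** j * comb(d, j) * y[d - j : n - j] for j in range(d + 1))

rng = np.random.default_rng(2)
y = rng.normal(size=30).cumsum()               # hypothetical series
monthly_pattern = np.tile(np.arange(12.0), 3)  # exactly periodic, s = 12
```

Here `difference(y, 3)` and `difference_by_binomial(y, 3)` agree, and `seasonal_difference(monthly_pattern, 12)` is identically zero because the pattern repeats every 12 observations.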
Example 3.7 Table 3.13 and Fig. 3.11 present smoothing and predicting by moving
averages in the time series $y_t$ of the US nominal short-term interest rates (in % p.a.)
for particular years 1961–2015 (t = 1, ..., 55).
The values $V_k$ calculated in Table 3.14 according to (3.57) indicate that the upper
limit for the order of corresponding moving averages is probably equal to 3. Therefore,
we will apply moving averages of order r = 3 to this time series.
Table 3.13 and Fig. 3.11 present the calculated moving averages of this order which
have lengths 5 and 9 (i.e., m = 2 and m = 4). Figure 3.11 shows that the moving
averages of length 5 follow the original observations very closely so that the
eliminated trend includes some periodic and random fluctuations that should be
left aside from the trend component. On the contrary, the moving averages of length
9 smooth such short-term fluctuations in a sufficient way and, therefore, they should
be preferred for the trend elimination in our case. For smoothing one has applied the
weights from Table 3.8 so that, e.g., the moving averages of length 5 give

$$\hat{y}_3 = \frac{1}{35}\left(-3 \cdot 2.37 + 12 \cdot 2.77 + 17 \cdot 3.17 + 12 \cdot 3.57 - 3 \cdot 3.97\right) = 3.17\%.$$

At the beginning and end of the time series we have used the beginning and end moving
averages, respectively: e.g., applying again the moving averages of length 5 one gets
according to (3.53)

$$\hat{y}_1 = \frac{1}{70}\left(69 \cdot 2.37 + 4 \cdot 2.77 - 6 \cdot 3.17 + 4 \cdot 3.57 - 1 \cdot 3.97\right) = 2.37\%.$$

Table 3.13 also contains the one-step-ahead predictions of US nominal short-term
interest rates for year 2016. For example, using the moving averages of order 3 and
length 9 (see Table 3.12), one obtains the following negative interest rate:

$$\hat{y}_{56}(55) = \frac{1}{126}\left(-56 \cdot 5.30 + 49 \cdot 2.91 + \ldots + 224 \cdot 0.32\right) = -0.76\%.$$

Obviously for short-term predictions, the “short” moving averages should be
preferred: e.g., the moving averages of order 3 and length 5 give a quite different
prediction 0.83% p.a. in this case. In general, the predictions based on moving
averages cannot be regarded as highly credible.
⋄

Table 3.13 Annual data 1961–2015 and smoothing and predicting by moving averages in
Example 3.7 (US nominal short-term interest rates in % p.a.); the columns $\hat{y}_t$ are the
moving averages with r = 3, m = 2 and with r = 3, m = 4

Year   y_t     m = 2   m = 4     Year   y_t    m = 2   m = 4
1961   2.37    2.37    2.26      1989   9.28   8.95    7.98
1962   2.77    2.77    2.88      1990   8.28   8.24    7.26
1963   3.17    3.17    3.30      1991   5.98   5.98    6.05
1964   3.57    3.53    3.62      1992   3.83   3.93    5.05
1965   3.97    4.18    3.91      1993   3.30   3.51    4.35
1966   4.86    4.43    4.46      1994   4.75   4.71    4.43
1967   4.30    4.67    5.21      1995   6.04   5.68    5.00
1968   5.35    5.43    5.63      1996   5.51   5.83    5.49
1969   6.74    6.52    5.43      1997   5.74   5.60    6.03
1970   6.28    6.03    5.34      1998   5.56   5.49    6.06
1971   4.32    4.49    5.79      1999   5.41   5.96    5.60
1972   4.18    4.76    5.97      2000   6.53   5.69    4.83
1973   7.19    6.77    5.95      2001   3.77   4.12    3.62
1974   7.89    7.49    5.94      2002   1.79   1.88    2.57
1975   5.77    6.15    5.99      2003   1.22   1.13    2.08
1976   5.00    4.93    6.07      2004   1.62   1.83    2.40
1977   5.33    5.47    6.19      2005   3.56   3.51    3.48
1978   7.37    7.45    7.80      2006   5.20   5.18    4.09
1979   10.11   9.75    9.90      2007   5.30   4.99    3.94
1980   11.56   12.33   11.14     2008   2.91   2.99    3.13
1981   13.97   12.77   11.70     2009   0.69   0.97    1.88
1982   10.60   11.10   11.39     2010   0.34   0.23    0.74
1983   8.67    9.20    10.24     2011   0.34   0.35    0.11
1984   9.54    8.99    8.87      2012   0.43   0.37    0.22
1985   8.38    8.32    7.66      2013   0.27   0.30    0.42
1986   6.83    7.15    7.71      2014   0.23   0.21    0.48
1987   7.19    7.06    8.03      2015   0.32   0.32    0.16
1988   7.98    8.23    8.09      2016          0.83    -0.76

Source: AMECO (European Commission Annual Macro-Economic Database)
(https://ec.europa.eu/economy_finance/ameco/user/serie/SelectSerie.cfm)

Fig. 3.11 Annual data 1961–2015 and smoothing and predicting by moving averages in Example
3.7 (US nominal short-term interest rates in % p.a.; plotted series: original data, moving
averages with m = 2, r = 3, and moving averages with m = 4, r = 3)

Table 3.14 Choice of order r of moving averages in Example 3.7

k     V_k
1     1.029
2     0.474
3     0.309
4     0.243
5     0.209
6     0.190
7     0.176
8     0.166
9     0.158
10    0.150
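The two predictions for 2016 can be approximately reproduced from the rounded values in Table 3.13. The sketch below assumes NumPy and uses the third-order one-step-ahead weights of lengths 5 and 9 with the signs implied by the cubic fit; because the table reports the data to two decimals only, the results agree with the values quoted in the example only up to rounding.

```python
import numpy as np

# Observations for 2007-2015 as printed in Table 3.13 (rounded to two decimals)
y_2007_2015 = np.array([5.30, 2.91, 0.69, 0.34, 0.34, 0.43, 0.27, 0.23, 0.32])

# One-step-ahead moving averages of the third order (lengths 5 and 9)
w5 = np.array([-4, 11, -4, -14, 16]) / 5
w9 = np.array([-56, 49, 64, 24, -36, -81, -76, 14, 224]) / 126

pred_len5 = w5 @ y_2007_2015[-5:]   # uses 2011-2015 only
pred_len9 = w9 @ y_2007_2015        # uses 2007-2015
```

The short average gives roughly 0.84 and the long one roughly -0.75, which illustrates how strongly such predictions depend on the chosen length.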
3.2.2 Other Types of Moving Averages
3.2.2.1 Arithmetic Moving Averages
In practice, simpler moving averages are popular. The simplest ones are the
arithmetic moving averages. For instance, the arithmetic moving averages of length 5 are

$$\bar{y}_t^{(5)} = \frac{1}{5}(1, 1, 1, 1, 1)\,y_t = \frac{1}{5}\left(y_{t-2} + y_{t-1} + y_t + y_{t+1} + y_{t+2}\right). \tag{3.64}$$

In general, the arithmetic moving averages of length 2m + 1 have the form

$$\bar{y}_t^{(2m+1)} = \frac{1}{2m+1}\left(y_{t-m} + y_{t-m+1} + \ldots + y_{t+m}\right). \tag{3.65}$$

They correspond to the moving averages from Sect. 3.2.1 with order 0 or 1 and the
same length 2m + 1 (i.e., the time series segments of length 2m + 1 are fitted using
a constant or linear trend). Therefore, it holds, e.g., for the length 5 and order 0

$$\bar{y}_{n-1}^{(5)} = \bar{y}_n^{(5)} = \hat{y}_{n+\tau}(n) = \frac{1}{5}\left(y_{n-4} + y_{n-3} + \ldots + y_n\right) \tag{3.66}$$

(for any $\tau > 0$) and for the length 5 and order 1

$$\begin{aligned}
\bar{y}_{n-1}^{(5)} &= \frac{1}{10}\left(y_{n-3} + 2y_{n-2} + 3y_{n-1} + 4y_n\right), \\
\bar{y}_n^{(5)} &= \frac{1}{5}\left(-y_{n-4} + y_{n-2} + 2y_{n-1} + 3y_n\right), \\
\hat{y}_{n+1}(n) &= \frac{1}{10}\left(-4y_{n-4} - y_{n-3} + 2y_{n-2} + 5y_{n-1} + 8y_n\right), \ldots
\end{aligned} \tag{3.67}$$
3.2.2.2 Centered Moving Averages
The centered moving averages modify the arithmetic moving averages in order to be
applicable when one smoothes economic time series over particular seasons with an
even number of observations (usually 4 for the quarterly data or 12 for the monthly
data). In such a situation, a methodological problem appears, namely where
to allocate the particular averages: e.g., the arithmetic average of the values over
January till December belongs to the midpoint between time points for June and July
values. However, when averaging two such neighboring moving averages (the first
one corresponds to the center of interval “June–July” and the second one to the
center of interval “July–August”), the result can be unambiguously allocated to the
time point “July”. In other words, we construct moving averages of the type

$$\begin{aligned}
\bar{y}_t^{(12)} &= \frac{1}{2}\left[\frac{1}{12}\left(y_{t-6} + y_{t-5} + \ldots + y_t + \ldots + y_{t+5}\right) + \frac{1}{12}\left(y_{t-5} + y_{t-4} + \ldots + y_t + \ldots + y_{t+6}\right)\right] \\
&= \frac{1}{24}\left(y_{t-6} + 2y_{t-5} + 2y_{t-4} + \ldots + 2y_{t+5} + y_{t+6}\right)
\end{aligned} \tag{3.68}$$

(obviously with length 13). That is, when calculating, e.g., the July value, one
exploits the February till December values of the given year (all with the weights
1/12) and the January values of the present and future year (both with weights 1/24).
In general, one can write

$$\bar{y}_t^{(2m)} = \frac{1}{4m}\left(y_{t-m} + 2y_{t-m+1} + \ldots + 2y_{t+m-1} + y_{t+m}\right). \tag{3.69}$$

These values are denoted as the centered moving averages (quarterly ones $\bar{y}_t^{(4)}$ for
m = 2 or monthly ones $\bar{y}_t^{(12)}$ for m = 6).
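The centered moving average (3.69) is a weighted average with weights (1, 2, ..., 2, 1)/(4m). The sketch below assumes NumPy; the function name and the quarterly test series (linear trend plus a seasonal pattern summing to zero) are hypothetical.

```python
import numpy as np

def centered_moving_average(y, m):
    """Centered moving average (3.69) for an even season length 2m."""
    y = np.asarray(y, dtype=float)
    w = np.full(2 * m + 1, 2.0)
    w[0] = w[-1] = 1.0                 # half weight for the two end points
    w /= 4 * m
    # Values for the interior time points t = m + 1, ..., n - m
    return np.convolve(y, w, mode="valid")

# Hypothetical quarterly series: linear trend plus a zero-sum seasonal pattern
t = np.arange(1.0, 25.0)
seasonal = np.tile([1.0, -1.0, 2.0, -2.0], 6)
y = t + seasonal

trend = centered_moving_average(y, m=2)

# The same values as the mean of two neighboring 4-term arithmetic averages
four_term = np.convolve(y, np.ones(4) / 4, mode="valid")
two_step = (four_term[:-1] + four_term[1:]) / 2
```

Because each quarter receives the total weight 1/4 within the window, the seasonal pattern cancels and `trend` recovers the linear trend at the interior time points.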
3.2.2.3 Robust Moving Averages
Robust moving averages are moving averages that are capable of restraining or
completely filtering out the outliers (i.e., the outlying observations) in time series.
A simple example is the moving medians of odd length 2m + 1

$$\operatorname{med} y_t^{(2m+1)} = \operatorname{med}\left(y_{t-m}, y_{t-m+1}, \ldots, y_{t+m}\right). \tag{3.70}$$

Figure 3.12 compares the moving medians with the arithmetic moving averages in
the case of one or two outliers (obviously the length 3 of moving averages is
insufficient in the second case with two outliers so that it has been necessary to
extend it to 5).

Fig. 3.12 Application of robust moving averages (left panel: moving medians and arithmetic
moving averages of length 3; right panel: moving medians and arithmetic moving averages of
length 5)
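The behavior described for Fig. 3.12 can be imitated on a small artificial series. This is a minimal sketch assuming NumPy version 1.20 or later (for `sliding_window_view`); the function names and the data with one or two adjacent outliers are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def moving_median(y, m):
    """Moving medians (3.70) of odd length 2m + 1 for t = m + 1, ..., n - m."""
    windows = sliding_window_view(np.asarray(y, dtype=float), 2 * m + 1)
    return np.median(windows, axis=1)

def moving_mean(y, m):
    """Arithmetic moving averages (3.65) of length 2m + 1."""
    windows = sliding_window_view(np.asarray(y, dtype=float), 2 * m + 1)
    return windows.mean(axis=1)

one_outlier = np.full(10, 100.0)
one_outlier[4] = 1000.0
two_outliers = np.full(10, 100.0)
two_outliers[4:6] = 1000.0

med3_one = moving_median(one_outlier, 1)    # outlier filtered out completely
mean3_one = moving_mean(one_outlier, 1)     # outlier smeared over three points
med3_two = moving_median(two_outliers, 1)   # length 3 is not sufficient
med5_two = moving_median(two_outliers, 2)   # length 5 removes both outliers
```

A median of length 2m + 1 removes a cluster of outliers only if the cluster contains at most m observations, which is why two adjacent outliers require the length 5.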
Remark 3.5 In practice, various software systems offer plenty of other moving
averages, frequently presented as filters. For instance in macroeconomics, especially
in real business cycle theory, the so-called Hodrick-Prescott filter is popular to remove
the cyclical component (see, e.g., EViews). This filter constructs the smoothed
values $\hat{y}_1, \ldots, \hat{y}_n$ for the given time series $y_1, \ldots, y_n$ by minimizing the expression

$$\sum_{t=1}^{n} (y_t - \hat{y}_t)^2 + \lambda \sum_{t=2}^{n-1} \left((\hat{y}_{t+1} - \hat{y}_t) - (\hat{y}_t - \hat{y}_{t-1})\right)^2. \tag{3.71}$$

The positive constant $\lambda$ controls the intensity of smoothing of the given time series
(obviously if $\lambda \to \infty$, then the smoothed values approach a linear trend). Another example
is the moving averages based on OWA operators (ordered weighted averaging)
which calculate weighted averages of ordered values in particular segments of
time series. They can be of interest when we want to over- or underestimate the
results; see Merigó and Yager (2013).
⋄
3.3 Exponential Smoothing
The exponential smoothing is another adaptive approach to the trend component that
is frequently used in practice (see also the introduction to Sect. 3.2). It is a special
case of the moving averages, in which the values observed up to the present period
get weights that decrease exponentially with the age of particular observations. Such
moving averages $\hat{y}_t$ are constructed by minimizing the expressions of the form

$$(y_t - \hat{y}_t)^2 + (y_{t-1} - \hat{y}_{t-1})^2 \beta + (y_{t-2} - \hat{y}_{t-2})^2 \beta^2 + \ldots, \tag{3.72}$$

where $\beta$ ($0 < \beta < 1$) is a fixed discount constant. The discounting of weights in (3.72)
can be interpreted in a reasonable way: the observations more distant in the past have
lower weights. At first glance this approach may seem complicated but from the
numerical point of view it is easily implemented, in particular if one uses recursive
formulas. In this section, we again assume that time series have the form (3.1)
(i.e., the time series model consists only of trend and additive residual component).
More details can be found, e.g., in Abraham and Ledolter (1983), Bowerman and
O’Connell (1987), Montgomery and Johnson (1976), and others.
3.3.1 Simple Exponential Smoothing
The simple exponential smoothing is recommended for time series in which the
trend can be viewed as locally constant (i.e., constant in short segments of time
series)

$$Tr_t = \beta_0. \tag{3.73}$$

A natural objective is to find an estimate of parameter $\beta_0$. However, since the
exponential smoothing approach is declared as adaptive, this estimate will depend on
the time moment of its construction. Let us denote by $b_0(t)$ the estimate of parameter
$\beta_0$ constructed at time t, i.e., using observations $y_t, y_{t-1}, y_{t-2}, \ldots$ known at time t.
Then the estimate $b_0(t)$ will represent both the fitted trend at time t and the smoothed
value $\hat{y}_t$ of the given time series. Because of (3.72), we will find it by minimizing the
expression

$$\sum_{j=0}^{\infty} \left(y_{t-j} - \beta_0\right)^2 \beta^j, \tag{3.74}$$

where $\beta$ ($0 < \beta < 1$) is a fixed discount constant. It should be pointed out that the sum
in the minimized expression (3.74) is infinite, although in real situations we always
know only a finite number of values $y_1, \ldots, y_t$. However, the hypothetical extension of
the time series to the past significantly simplifies the corresponding formulas due to
simpler limit results. In any case, the numerical calculations based on this abstraction
exploit only the observed values $y_1, \ldots, y_t$ of the given time series (see below).
If we differentiate (3.74) with respect to $\beta_0$ and set this derivative equal to zero, then
due to the convexity of the minimized function we get the estimate $b_0(t)$ of parameter
$\beta_0$ at time t as

$$\hat{y}_t = (1 - \beta) \sum_{j=0}^{\infty} \beta^j y_{t-j}. \tag{3.75}$$

Hence one can see that the smoothed value of the time series at time t is the weighted
average of values of this time series up to time t with weights decreasing exponentially
to the past

$$1 - \beta, \; (1 - \beta)\beta, \; (1 - \beta)\beta^2, \ldots \tag{3.76}$$

Since the formula (3.75) is not convenient for practical calculations, it is rewritten
in the recursive form

$$\hat{y}_t = \alpha y_t + (1 - \alpha)\hat{y}_{t-1}, \tag{3.77}$$

where $\alpha = 1 - \beta$ ($0 < \alpha < 1$) is the so-called smoothing constant. The recursive formula
(3.77) also clearly demonstrates the merits of exponential smoothing: (1) the
calculations are simple; (2) the method is parsimonious since little storage is
needed: it suffices to save only the previous smoothed value instead of the whole
history of the time series; (3) it is possible to control the intensity of smoothing: the
smoothing with a higher constant $\alpha$ (e.g., $\alpha = 0.3$) responds quickly to changes in the
character of data so that its smoothing ability is lower since the role of the first
summand is dominant in (3.77), while the smoothing ability of (3.77) is enhanced for
a lower constant $\alpha$ (e.g., $\alpha = 0.1$).
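The recursion (3.77) takes only a few lines of code. The sketch below assumes NumPy; the function name `simple_exp_smoothing`, the default initial value $\hat{y}_0 = y_1$ and the short test series are illustrative assumptions.

```python
import numpy as np

def simple_exp_smoothing(y, alpha, init=None):
    """Smoothed values from the recursion (3.77); init plays the role of y-hat_0."""
    y = np.asarray(y, dtype=float)
    prev = y[0] if init is None else float(init)   # assumption: y-hat_0 = y_1
    smoothed = np.empty_like(y)
    for t, obs in enumerate(y):
        prev = alpha * obs + (1 - alpha) * prev    # only the last value is stored
        smoothed[t] = prev
    return smoothed

y = np.array([3.0, 5.0, 4.0, 6.0])
alpha = 0.3
smoothed = simple_exp_smoothing(y, alpha)

# The last smoothed value written as an explicit weighted average, cf. (3.75):
# exponentially decreasing weights plus the discounted initial value
beta = 1 - alpha
explicit = alpha * sum(beta ** j * y[-1 - j] for j in range(len(y))) + beta ** len(y) * y[0]
```

With a finite history the explicit form carries the extra term $\beta^t \hat{y}_0$, which fades out as more observations arrive; this is the finite-sample counterpart of the infinite sum in (3.75).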
When we apply the simple exponential smoothing to construct predictions in time
series, then due to (3.73) one can put

$$\hat{y}_{t+\tau}(t) = \hat{y}_t \tag{3.78}$$

for any $\tau > 0$. In particular, the most usual prediction in practice is

$$\hat{y}_{n+\tau}(n) = \hat{y}_n. \tag{3.79}$$

In addition to the formulas (3.75) and (3.77), there exists a third form of the smoothing
formula making use of (3.78), namely

$$\hat{y}_t = \hat{y}_{t-1} + \alpha\left(y_t - \hat{y}_{t-1}\right) = \hat{y}_{t-1} + \alpha\left(y_t - \hat{y}_t(t-1)\right) = \hat{y}_{t-1} + \alpha\, e_t. \tag{3.80}$$

The form (3.80) is sometimes denoted as the “error” formula: in order to correct the
previous smoothed value $\hat{y}_{t-1}$ one exploits the (reduced) one-step-ahead error $e_t$ of the
prediction $\hat{y}_t(t-1)$ constructed at time t − 1 as soon as the value $y_t$ is observed.
The practical realization of the recursive formula (3.77) of simple exponential
smoothing requires the choice of the initial value $\hat{y}_0$ and of the smoothing constant $\alpha$:
1. The choice of $\hat{y}_0$: One usually uses the arithmetic average of a short initial
segment of the time series (e.g., $y_1, \ldots, y_6$) or directly $y_1$ (then it is obviously $\hat{y}_1 = y_1$).
2. The choice of $\alpha$: First of all, the interval $0 < \alpha \le 0.3$ is recommended in practice,
and the following choices of $\alpha$ from this interval are possible:
(a) The fixed choice $\alpha = 0.1$ or $\alpha = 0.2$.
(b) The choice

$$\alpha = \frac{1}{m+1}, \tag{3.81}$$

where 2m + 1 is the length of arithmetic moving averages which would be
adequate for the given time series. This choice is derived by comparing the so-called
mean age of arithmetic moving averages of this length, i.e.,

$$\sum_{k=0}^{2m} \frac{k}{2m+1} = m, \tag{3.82}$$

and the mean age of weights of simple exponential smoothing, i.e.,

$$\sum_{k=0}^{\infty} k\,\alpha(1-\alpha)^k = \frac{1-\alpha}{\alpha}, \tag{3.83}$$

where the mean ages (3.82) and (3.83) must coincide, and hence $\alpha$ expressed
as (3.81) follows. However, this approach is handicapped by the fact that
one must first decide on an adequate length of moving averages.
(c) The estimate of $\alpha$: one admits the grid points 0.01, 0.02, ..., 0.30 as possible
values of the smoothing constant and chooses the value of $\alpha$ from this grid
that gives the best predictions with minimum SSE [see (2.10)] in the given
time series. This approach is included in many software systems (see, e.g.,
EViews).
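The grid search of item (c) can be sketched as follows, assuming NumPy. The function names, the initial value $\hat{y}_1 = y_1$ and the simulated series are assumptions made for illustration; this is not the procedure of any particular software package.

```python
import numpy as np

def sse_one_step(y, alpha):
    """Sum of squared one-step-ahead prediction errors of simple exponential smoothing."""
    y = np.asarray(y, dtype=float)
    level = y[0]                       # assumption: initial smoothed value equals y_1
    sse = 0.0
    for obs in y[1:]:
        err = obs - level              # error of the prediction made one step earlier
        sse += err ** 2
        level += alpha * err           # error form (3.80) of the recursion
    return sse

def choose_alpha(y, grid=None):
    """Return the grid value of alpha with minimum SSE, together with that SSE."""
    if grid is None:
        grid = np.arange(1, 31) / 100  # 0.01, 0.02, ..., 0.30
    sses = np.array([sse_one_step(y, a) for a in grid])
    best = int(np.argmin(sses))
    return float(grid[best]), float(sses[best])

# Hypothetical series: locally constant level plus noise
rng = np.random.default_rng(3)
y = 10.0 + rng.normal(scale=0.5, size=200)
alpha_hat, sse_hat = choose_alpha(y)
```

The grid is deliberately restricted to the recommended interval (0, 0.3]; an optimum at the upper end of the grid suggests that a locally constant trend may not be adequate for the series.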
Example 3.8 Figure 3.13 presents the simple exponential smoothing in the time
series $y_t$ of annual averages of exchange rates USD/EUR(ECU) for particular years
1960–2016 with t = 1, ..., 57 (the former basket currency ECU of the European
Community was introduced as late as in 1979, but the series was formally extended back to
1960). The smoothed values for $\alpha$ = 0.01, 0.02, and 0.3 including the
one-step-ahead prediction for year 2017 are plotted in the figure. As far as the
estimation of $\alpha$ is concerned (see above), EViews has found a value close to one
($\alpha$ = 0.999), which means that the smoothed time series nearly coincides with the
original time series.
⋄

Fig. 3.13 Annual data 1960–2016 and simple exponential smoothing in Example 3.8 (annual
averages of exchange rates USD/EUR(ECU); plotted series: original data and simple exponential
smoothing with alpha = 0.1). Source: AMECO (European Commission Annual
Macro-Economic Database) (https://ec.europa.eu/economy_finance/ameco/user/serie/SelectSerie.cfm)
Remark 3.6 Assuming a normal distribution of the residual components, one can
construct by means of exponential smoothing not only point predictions but also
interval predictions [(3.84) is only approximate without this assumption]. For
example, the $(1-p)100\%$ prediction interval (i.e., the 95% interval if $p = 0.05$) is
recommended in the form
$$\left(\hat{y}_{n+\tau}(n) - u_{1-p/2}\, d_\tau\, \mathrm{MAE},\;\; \hat{y}_{n+\tau}(n) + u_{1-p/2}\, d_\tau\, \mathrm{MAE}\right), \tag{3.84}$$
where for any $\tau > 0$ one substitutes
$u_{1-p/2}$: the $(1-p/2)$-quantile of the standard normal distribution $N(0, 1)$;
$d_\tau$: the constant $\sqrt{\pi/2} \approx 1.25$, which rescales MAE to the scale of the standard deviation of prediction errors (for normal errors, the standard deviation equals $\sqrt{\pi/2}$ times the mean absolute error);
MAE: the mean absolute deviation [see (2.17)]: $\mathrm{MAE} = (1/n)\sum_{t=1}^{n} |y_t - \hat{y}_t(t-1)|$.
⋄
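The interval (3.84) is simple to evaluate once the one-step-ahead prediction errors are available. The following Python sketch illustrates it under the normality assumption of Remark 3.6; the function name `ses_prediction_interval` and its arguments are hypothetical.

```python
import math
from statistics import NormalDist

def ses_prediction_interval(point_forecast, one_step_errors, p=0.05):
    """(1 - p)100% prediction interval of the form (3.84).

    one_step_errors are the observed errors y_t - yhat_t(t-1);
    names and signature are illustrative assumptions.
    """
    mae = sum(abs(e) for e in one_step_errors) / len(one_step_errors)
    u = NormalDist().inv_cdf(1 - p / 2)      # (1 - p/2)-quantile of N(0, 1)
    d = math.sqrt(math.pi / 2)               # approximately 1.25
    half_width = u * d * mae
    return point_forecast - half_width, point_forecast + half_width
```

With `p = 0.05` the half-width is roughly 2.46 times MAE, so the interval is slightly wider than the familiar "plus or minus two MAE" rule of thumb.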
Remark 3.7 The exponential smoothing algorithm can be controlled by means of a
so-called adaptive control process, which
• indicates that the applied type of exponential smoothing has stopped being adequate for
the given time series;
• automatically adjusts the values of the smoothing constants when necessary (this
methodology can then be looked upon as a special case of so-called stochastic
control).
Here the indicators of “default” are, e.g., significantly high values $I_t(\alpha)$ constructed
online as
$$I_t(\alpha) = \frac{|Y_t(\alpha)|}{D_t(\alpha)}, \tag{3.85}$$
where
$$Y_t(\alpha) = \sum_{j=1}^{t} e_j(\alpha), \qquad D_t(\alpha) = \frac{1}{t}\sum_{j=1}^{t} |e_j(\alpha)|. \tag{3.86}$$
The symbol $e_j(\alpha)$ denotes the error of prediction of the value $y_j$ (the prediction is
constructed at time $j - 1$, i.e., one-step-ahead, applying the smoothing constant
equal to $\alpha$). The indicator $I_t(\alpha)$ is usually calculated online, and values
exceeding a given boundary $K$ indicate that a change of $\alpha$ or even a change of the
type of exponential smoothing is necessary (e.g., one should apply the double
exponential smoothing from Sect. 3.3.2 instead of the simple exponential smoothing).
The boundary $K$ is usually fixed in the range from 4 to 6 (it can be constructed
in a similar way to the critical value of statistical tests with fixed significance levels).
One of the methods to change the smoothing constant $\alpha$ automatically runs
three procedures of exponential smoothing in
parallel with three different values of this constant: one can use, e.g., the values
$\alpha - 0.05$, $\alpha$, $\alpha + 0.05$. Only the procedure using at time $t$ the “middle” smoothing
constant $\alpha$ delivers the output results for users, and (at each time point) the
methodology compares the values $D_t(\alpha - 0.05)$, $D_t(\alpha)$, $D_t(\alpha + 0.05)$ calculated according
to (3.86). If it holds that
$$D_t(\alpha) \le \min\left(D_t(\alpha - 0.05),\, D_t(\alpha + 0.05)\right), \tag{3.87}$$
then the algorithm goes on without changes. However, if, e.g.,
$$D_t(\alpha + 0.05) < D_t(\alpha), \tag{3.88}$$
then the “middle” process transfers $\alpha$ to $\alpha + 0.05$, and the algorithm goes on using the
triplet of smoothing constants $\alpha$, $\alpha + 0.05$, $\alpha + 0.10$, respectively.
⋄
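The indicator (3.85) can be computed online alongside the smoothing recursion. The Python sketch below is an illustration only: it assumes that $D_t(\alpha)$ in (3.86) is the mean absolute one-step error, uses the average of the first `init_len` observations as the initial value, and the name `tracking_signal` is hypothetical.

```python
import numpy as np

def tracking_signal(y, alpha, init_len=6):
    """Indicator I_t(alpha) = |Y_t(alpha)| / D_t(alpha) for simple exponential smoothing.

    Y_t is the cumulative one-step prediction error and D_t the mean
    absolute one-step error up to time t (assumed reading of (3.86)).
    """
    y = np.asarray(y, dtype=float)
    smoothed = y[:init_len].mean()       # assumed initial value
    cum_err = 0.0                        # Y_t(alpha)
    cum_abs = 0.0                        # t * D_t(alpha)
    signal = np.empty(len(y))
    for t, obs in enumerate(y, start=1):
        err = obs - smoothed             # one-step-ahead error e_t(alpha)
        cum_err += err
        cum_abs += abs(err)
        mad = cum_abs / t
        signal[t - 1] = abs(cum_err) / mad if mad > 0 else 0.0
        smoothed = alpha * obs + (1 - alpha) * smoothed
    return signal
```

Errors that keep the same sign (e.g., when a trend is smoothed by simple exponential smoothing) accumulate in $Y_t$ and push the indicator above the boundary $K$, whereas errors of alternating sign largely cancel.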
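3.3.2 Double Exponential Smoothing
Double exponential smoothing (sometimes called Brown’s method) is
suitable for time series in which the trend can be viewed as locally linear (i.e., linear
in short segments):
$$\mathrm{Tr}_{t-j} = \beta_0 + \beta_1(-j). \tag{3.89}$$
The estimates $b_0(t)$ and $b_1(t)$ constructed at time $t$ for the parameters $\beta_0$ and $\beta_1$ are
obtained by minimizing the expression
$$\sum_{j=0}^{\infty} \left[y_{t-j} - (\beta_0 + \beta_1(-j))\right]^2 \beta^j, \tag{3.90}$$
where again $\beta$ ($0 < \beta < 1$) is a fixed discount constant. If we put the partial
derivatives of (3.90) with respect to $\beta_0$ and $\beta_1$ both equal to zero, we get the system
of normal equations
$$\sum_{j=0}^{\infty} \beta^j y_{t-j} - \beta_0 \sum_{j=0}^{\infty} \beta^j + \beta_1 \sum_{j=0}^{\infty} j\beta^j = 0, \qquad \sum_{j=0}^{\infty} j\beta^j y_{t-j} - \beta_0 \sum_{j=0}^{\infty} j\beta^j + \beta_1 \sum_{j=0}^{\infty} j^2\beta^j = 0, \tag{3.91}$$
which can be simplified by means of the formulas
$$\sum_{j=0}^{\infty} \beta^j = \frac{1}{1-\beta}, \qquad \sum_{j=0}^{\infty} j\beta^j = \frac{\beta}{(1-\beta)^2}, \qquad \sum_{j=0}^{\infty} j^2\beta^j = \frac{\beta(1+\beta)}{(1-\beta)^3} \tag{3.92}$$
to the form
$$\beta_0 - \frac{\beta}{1-\beta}\,\beta_1 = (1-\beta)\sum_{j=0}^{\infty} \beta^j y_{t-j}, \qquad \beta\,\beta_0 - \frac{\beta(1+\beta)}{1-\beta}\,\beta_1 = (1-\beta)^2 \sum_{j=0}^{\infty} j\beta^j y_{t-j}. \tag{3.93}$$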
In order to simplify the notation, one introduces the so-called simple smoothing
statistics
$$S_t = (1-\beta)\sum_{j=0}^{\infty} \beta^j y_{t-j} \tag{3.94}$$
(due to (3.75), this $S_t$ corresponds to the value of the time series smoothed at time $t$ by
simple exponential smoothing). Therefore, it holds according to (3.77) that
$$S_t = \alpha y_t + (1-\alpha) S_{t-1} \tag{3.95}$$
(again we put $\alpha = 1 - \beta$). Similarly one introduces the so-called double smoothing
statistics
$$S_t^{[2]} = (1-\beta)\sum_{j=0}^{\infty} \beta^j S_{t-j} \tag{3.96}$$
(the relation (3.96) is analogous to (3.94), but the values $y_t$ are replaced by $S_t$). Hence
the analogy to the recursive relation (3.95) gives
$$S_t^{[2]} = \alpha S_t + (1-\alpha) S_{t-1}^{[2]}. \tag{3.97}$$
The smoothing statistics introduced above make it possible to rewrite the system of normal equations
(3.93) in the form
$$\beta_0 - \frac{\beta}{1-\beta}\,\beta_1 = S_t, \qquad \beta\,\beta_0 - \frac{\beta(1+\beta)}{1-\beta}\,\beta_1 = S_t^{[2]} - (1-\beta) S_t. \tag{3.98}$$
Solving these equations, one gets the corresponding estimates as
$$b_0(t) = 2S_t - S_t^{[2]}, \qquad b_1(t) = \frac{1-\beta}{\beta}\left(S_t - S_t^{[2]}\right). \tag{3.99}$$
Then the prediction of the value $y_{t+\tau}$ constructed at time $t$ has the natural form
$$\hat{y}_{t+\tau}(t) = b_0(t) + b_1(t)\,\tau = \left(2 + \frac{\alpha\tau}{1-\alpha}\right) S_t - \left(1 + \frac{\alpha\tau}{1-\alpha}\right) S_t^{[2]}. \tag{3.100}$$
The special case $\tau = 0$ delivers the smoothed value of the time series, i.e.,
$$\hat{y}_t = 2S_t - S_t^{[2]}. \tag{3.101}$$
The statistics $S_t$ and $S_t^{[2]}$ are calculated recursively according to (3.95) and (3.97).
Again the practical realization of the recursive formulas of double exponential
smoothing assumes the choice of the initial values $S_0$ and $S_0^{[2]}$ and of the smoothing constant $\alpha$:
1. The initial values $S_0$ and $S_0^{[2]}$ can be set up using the relations (3.99) for
$t = 0$, where $b_0(0)$ and $b_1(0)$ are chosen as the classical regression estimates of the
parameters $\beta_0$ and $\beta_1$ fitting a line through a short initial segment of the time series
(e.g., $y_1, \ldots, y_6$). Then the values $S_0$ and $S_0^{[2]}$ can be expressed from (3.99)
explicitly as
$$S_0 = b_0(0) - \frac{1-\alpha}{\alpha}\, b_1(0), \qquad S_0^{[2]} = b_0(0) - \frac{2(1-\alpha)}{\alpha}\, b_1(0). \tag{3.102}$$
2. For the choice of the smoothing constant $\alpha$ in practice, one again recommends the
interval $0 < \alpha \le 0.3$, in which (similarly to simple exponential smoothing) we
can use:
(a) The fixed choice $\alpha = 0.1$ or $\alpha = 0.2$.
(b) The choice
$$\alpha = \sqrt{\frac{1}{m+1}}, \tag{3.103}$$
where $2m + 1$ is the length of the arithmetic moving averages that would be
adequate in this case (it follows again by comparing the mean age of weights
of arithmetic moving averages and the mean age of weights of double
exponential smoothing).
(c) The estimate of $\alpha$: one admits the grid points 0.01, 0.02, ..., 0.30 as possible
values of the smoothing constant and chooses the value of $\alpha$ from this grid
that gives the best predictions with minimum SSE.
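The recursions (3.95) and (3.97) together with the initial values (3.102) translate directly into code. The following Python sketch is a minimal illustration; the function name `brown_double_smoothing` and the parameter `init_len` are hypothetical, and the initial line is assumed to be fitted by ordinary least squares to the first six observations at times $t = 1, \ldots, 6$.

```python
import numpy as np

def brown_double_smoothing(y, alpha, init_len=6):
    """Double exponential smoothing via the statistics S_t and S_t^[2].

    Returns the smoothed values (3.101) and the final level b0(n) and
    slope b1(n), so that the prediction (3.100) is level + slope * tau.
    """
    y = np.asarray(y, dtype=float)
    t0 = np.arange(1, init_len + 1)
    b1, b0 = np.polyfit(t0, y[:init_len], 1)     # slope and intercept (value at t = 0)
    s1 = b0 - (1 - alpha) / alpha * b1           # S_0 from (3.102)
    s2 = b0 - 2 * (1 - alpha) / alpha * b1       # S_0^[2] from (3.102)
    smoothed = np.empty(len(y))
    for t, obs in enumerate(y):
        s1 = alpha * obs + (1 - alpha) * s1      # (3.95)
        s2 = alpha * s1 + (1 - alpha) * s2       # (3.97)
        smoothed[t] = 2 * s1 - s2                # (3.101)
    level = 2 * s1 - s2                          # b0(n) from (3.99)
    slope = alpha / (1 - alpha) * (s1 - s2)      # b1(n) from (3.99), beta = 1 - alpha
    return smoothed, level, slope
```

A useful sanity check of such an implementation is an exactly linear series: with the initial values (3.102) the smoothed values reproduce the series and the estimated slope equals the true slope.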
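Remark 3.8 Using the same notation as in Remark 3.6, the $(1-p)100\%$ prediction
interval can be constructed in the form
$$\left(\hat{y}_{n+\tau}(n) - u_{1-p/2}\, d_\tau\, \mathrm{MAE},\;\; \hat{y}_{n+\tau}(n) + u_{1-p/2}\, d_\tau\, \mathrm{MAE}\right), \tag{3.104}$$
where for any $\tau > 0$ one substitutes (with $\beta = 1 - \alpha$)
$$d_\tau \approx 1.25\left(\frac{1 + \dfrac{1-\beta}{(1+\beta)^3}\left[1 + 4\beta + 5\beta^2 + 2(1-\beta)(1+3\beta)\tau + 2(1-\beta)^2\tau^2\right]}{1 + \dfrac{1-\beta}{(1+\beta)^3}\left[1 + 4\beta + 5\beta^2 + 2(1-\beta)(1+3\beta) + 2(1-\beta)^2\right]}\right)^{1/2}. \tag{3.105}$$
⋄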
Remark 3.9 A natural extension of simple and double exponential smoothing is
triple exponential smoothing (a local quadratic trend requires introducing the
triple smoothing statistics $S_t^{[3]}$ in addition to $S_t$ and $S_t^{[2]}$). Even though exponential
smoothing of a general order $r$ is possible, the order $r = 3$ is the highest that is
used in practice.
⋄
3.3.3 Holt’s Method
Holt’s method generalizes double exponential smoothing by introducing two
smoothing constants: $\alpha$ (it smooths the level $L_t$ of the time series) and $\gamma$ (it smooths the
slope $T_t$ of the same time series). Both constants lie between zero and one ($0 < \alpha, \gamma < 1$):
$$L_t = \alpha y_t + (1-\alpha)(L_{t-1} + T_{t-1}), \tag{3.106}$$
$$T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) T_{t-1}, \tag{3.107}$$
$$\hat{y}_t = L_t, \tag{3.108}$$
$$\hat{y}_{t+\tau}(t) = L_t + T_t\,\tau \quad (\tau > 0). \tag{3.109}$$
The initial values are recommended as $L_0 = y_1$ and $T_0 = y_2 - y_1$. It is interesting that
Holt’s method was originally suggested ad hoc, on the basis of logical considerations only. For example, the
estimate of the level $L_t$ of the given time series according to (3.106) is constructed as a
convex combination of the present value $y_t$ at time $t$ and the prediction of this value
calculated at the previous time $t - 1$ (simply as $L_{t-1} + T_{t-1} \cdot 1$). It was proved later
that double exponential smoothing with smoothing constant $\alpha$ is a
special case of Holt’s method with smoothing constants $\alpha_{\mathrm{Holt}}$ and $\gamma_{\mathrm{Holt}}$ of the form
$$\alpha_{\mathrm{Holt}} = \alpha(2-\alpha), \qquad \gamma_{\mathrm{Holt}} = \frac{\alpha}{2-\alpha}. \tag{3.110}$$
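The relation (3.110) can be checked numerically. The Python sketch below implements the recursions (3.106)–(3.108) and a bare-bones double exponential smoothing for comparison; the function names `holt` and `brown_levels` and the optional arguments `level0`, `slope0` are hypothetical. The two methods coincide only if the initial values are matched, here by taking $L_0 = 2S_0 - S_0^{[2]}$ and $T_0 = \frac{\alpha}{1-\alpha}(S_0 - S_0^{[2]})$.

```python
import numpy as np

def holt(y, alpha, gamma, level0=None, slope0=None):
    """Holt's method; defaults follow the recommended L0 = y1, T0 = y2 - y1."""
    y = np.asarray(y, dtype=float)
    level = y[0] if level0 is None else level0
    slope = y[1] - y[0] if slope0 is None else slope0
    levels = np.empty(len(y))
    for t, obs in enumerate(y):
        prev = level
        level = alpha * obs + (1 - alpha) * (prev + slope)     # (3.106)
        slope = gamma * (level - prev) + (1 - gamma) * slope   # (3.107)
        levels[t] = level                                      # (3.108)
    return levels, level, slope      # prediction (3.109): level + slope * tau

def brown_levels(y, alpha, s1, s2):
    """Smoothed values 2*S_t - S_t^[2] of double exponential smoothing."""
    out = []
    for obs in np.asarray(y, dtype=float):
        s1 = alpha * obs + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
        out.append(2 * s1 - s2)
    return np.array(out)
```

Running `holt(y, a * (2 - a), a / (2 - a), level0, slope0)` with the matched initial values should reproduce `brown_levels(y, a, s1, s2)` up to rounding for any series `y`.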
Example 3.9 Table 3.15 and Fig. 3.14 present double exponential smoothing (with
the fixed choice of smoothing constant $\alpha = 0.15$) and Holt’s method (with the fixed
choice of smoothing constants $\alpha = 0.1$ and $\gamma = 0.2$) in the time series $y_t$ of the US
nominal short-term interest rates (in % p.a.) for the years 1961–2015 ($t = 1, \ldots, 55$).
The smoothed values and the one-step-ahead prediction for year 2016 have been
obtained by EViews and can be compared with the corresponding results by moving
averages in Example 3.7 for the same data (see Fig. 3.11).
⋄
Table 3.15 Annual data 1961–2015 and smoothing and predicting by double exponential smoothing
and Holt’s method in Example 3.9 (US nominal short-term interest rates in % p.a.)
Year t y_t Doub. exp. (α = 0.15) Holt (α = 0.1, γ = 0.2) Year t y_t Doub. exp. (α = 0.15) Holt (α = 0.1, γ = 0.2)
1961 1 2.37 2.37 2.26 1989 29 9.28 8.95 7.98
1962 2 2.77 2.77 2.88 1990 30 8.28 8.24 7.26
1963 3 3.17 3.17 3.30 1991 31 5.98 5.98 6.05
1964 4 3.57 3.53 3.62 1992 32 3.83 3.93 5.05
1965 5 3.97 4.18 3.91 1993 33 3.30 3.51 4.35
1966 6 4.86 4.43 4.46 1994 34 4.75 4.71 4.43
1967 7 4.30 4.67 5.21 1995 35 6.04 5.68 5.00
1968 8 5.35 5.43 5.63 1996 36 5.51 5.83 5.49
1969 9 6.74 6.52 5.43 1997 37 5.74 5.60 6.03
1970 10 6.28 6.03 5.34 1998 38 5.56 5.49 6.06
1971 11 4.32 4.49 5.79 1999 39 5.41 5.96 5.60
1972 12 4.18 4.76 5.97 2000 40 6.53 5.69 4.83
1973 13 7.19 6.77 5.95 2001 41 3.77 4.12 3.62
1974 14 7.89 7.49 5.94 2002 42 1.79 1.88 2.57
1975 15 5.77 6.15 5.99 2003 43 1.22 1.13 2.08
1976 16 5.00 4.93 6.07 2004 44 1.62 1.83 2.40
1977 17 5.33 5.47 6.19 2005 45 3.56 3.51 3.48
1978 18 7.37 7.45 7.80 2006 46 5.20 5.18 4.09
1979 19 10.11 9.75 9.90 2007 47 5.30 4.99 3.94
1980 20 11.56 12.33 11.14 2008 48 2.91 2.99 3.13
1981 21 13.97 12.77 11.70 2009 49 0.69 0.97 1.88
1982 22 10.60 11.10 11.39 2010 50 0.34 0.23 0.74
1983 23 8.67 9.20 10.24 2011 51 0.34 0.35 0.11
1984 24 9.54 8.99 8.87 2012 52 0.43 0.37 0.22
1985 25 8.38 8.32 7.66 2013 53 0.27 0.30 0.42
1986 26 6.83 7.15 7.71 2014 54 0.23 0.21 0.48
1987 27 7.19 7.06 8.03 2015 55 0.32 0.32 0.16
1988 28 7.98 8.23 8.09 2016 56 0.83 0.76
Source: AMECO (European Commission Annual Macro-Economic Database) (https://ec.europa.
3.4 Exercises
Exercise 3.1 Repeat the analysis from Example 3.1 (the linear trend in the Swiss
gross national income) only for data since 1990 (hint: $343.3481 + 12.56640\,t$, $\hat{y}_{27} = 682.6$,
(652.5; 712.8)).
Exercise 3.2 Repeat the analysis from Example 3.2 (the exponential trend in US
gross national income) only for data since 1970 (hint: $2466.88 \cdot 1.04805^{t}$).
[Figure 3.14: time plot, horizontal axis 1965–2015; plotted series: US nominal short-term interest rates; double exponential smoothing: alpha=0.15; Holt's method: alpha=0.1, gamma=0.2]
Fig. 3.14 Annual data 1961–2015 and smoothing and predicting by double exponential smoothing
and Holt’s method in Example 3.9 (US nominal short-term interest rates in % p.a.)
Exercise 3.3 Repeat the analysis from Example 3.3 (the modified exponential trend
in Japan gross national income) only for data since 1970 (hint: $511354 - 770662 \cdot 0.862404^{t}$,
saturation at 511 360).
Exercise 3.4 Eliminate numerically the logistic trend in Example 3.4 (the Japan
gross national income).
Exercise 3.5 Eliminate numerically Gompertz trend in Example 3.5 (the Japan
gross national income).
Exercise 3.6 Repeat the analysis from Example 3.7 (the moving averages for the
US nominal short-term interest rates) only for data since 1981.
Exercise 3.7 Derive the formulas (3.66) and (3.67) in the case of arithmetic moving
averages of length 5.
Exercise 3.8 Repeat the analysis from Example 3.8 (the simple exponential
smoothing for the annual averages of exchange rates USD/EUR(ECU)) only for
data since 1980 (these data are not presented numerically in the monograph; hint:
α ¼ 0.999).
Exercise 3.9 Repeat the analysis from Example 3.9 (the double exponential
smoothing and Holt’s method for the US nominal short-term interest rates) only
for data since 1981.
Chapter 4
Seasonality and Periodicity
4.1 Seasonality in Time Series
This section deals with the elimination of the seasonal component, which describes periodic changes
in time series that take place within one calendar year and repeat themselves each
year. Even if the moving averages from Sect. 3.2 are capable of eliminating the
seasonality to a large extent (e.g., the monthly centered moving averages (3.68) have
such an effect in the case of monthly seasonal observations), an effective seasonal
analysis should moreover deliver the so-called seasonal indices $I_1, I_2, \ldots, I_s$ ($s$ denotes
the length of season, i.e., $s = 12$ in the case of monthly observations). These indices
model the seasonality in particular seasons, and they can be used not only
to eliminate the seasonal phenomenon but also to construct predictions. However,
they make sense only under the assumption that the seasonality is really regular, so
that its modeling by repeating seasonal indices is justified for the given time series.
The seasonal indices have the following properties:
• The units in which the seasonal indices are measured depend on the type of
decomposition. When the decomposition is
– additive (i.e., $y_t = \mathrm{Tr}_t + \mathrm{Sz}_t + E_t$): the index is measured in the same units as the
corresponding time series $y_t$ (e.g., a December seasonal index of retail sales
amounting to EUR 45m means that the seasonality manifests itself by a
December increase of the time series by EUR 45m above the average trend
behavior);
– multiplicative (i.e., $y_t = \mathrm{Tr}_t \cdot \mathrm{Sz}_t \cdot E_t$): the index is a relative variable (e.g., a December
seasonal index of retail sales amounting to 1.38 means that the seasonality
manifests itself by a December increase of the time series by 38% above the
trend).
• It is typical for the multiplicative decomposition that the seasonal fluctuations
increase (decrease) with increasing (decreasing) trend, respectively, even if the
seasonal indices repeat themselves regularly in particular seasons (in the case of
additive decomposition, the seasonal fluctuation does not depend on the trend
monotonicity; see Fig. 4.1).
https://doi.org/10.1007/978-3-030-46347-2_4
[Figure 4.1: two panels, "additive decomposition" and "multiplicative decomposition"]
Fig. 4.1 Characteristic shape of seasonal fluctuations for additive and multiplicative decomposition
• The relation of the trend and the seasonal component is not determined unambiguously:
one of them can be shifted upward in an arbitrary way if this is offset by shifting the
other one downward, and vice versa. This ambiguity is removed when the
seasonal indices are normalized. The normalization of seasonal indices again differs
according to the type of decomposition. When the decomposition is
– additive: the usual normalization rule demands that the sum of seasonal
indices over each season be equal to zero; e.g., monthly observations
must fulfill for each $i \ge 0$
$$I_{1+12i} + I_{2+12i} + \ldots + I_{12+12i} = 0; \tag{4.1}$$
– multiplicative: the usual normalization rule demands that the product of
seasonal indices over each season be equal to one; e.g., monthly observations
must fulfill for each $i \ge 0$
$$I_{1+12i} \cdot I_{2+12i} \cdot \ldots \cdot I_{12+12i} = 1 \tag{4.2}$$
(obviously, after taking logarithms this rule takes the form (4.1)), or
occasionally
$$I_{1+12i} + I_{2+12i} + \ldots + I_{12+12i} = 12. \tag{4.3}$$
4.1.1 Simple Approaches to Seasonality
In practice (especially in various software systems) one prefers approaches to
seasonality that are computationally as simple as possible. If the
seasonal indices are (nearly) fixed across seasons, then the following algorithms can be recommended
(e.g., in EViews); they are described here only
for the case of monthly observations, but they are quite analogous for quarterly
observations:
4.1.1.1 Additive Decomposition
1. One constructs the centered moving averages $\bar{y}_t^{(12)}$ (in the case of quarterly
observations, one should construct the centered moving averages $\bar{y}_t^{(4)}$). At the
beginning and at the end of the time series, one can repeat the first and the last
calculable centered moving average, respectively (if necessary).
2. The centered moving averages can be looked upon as a raw estimate of the trend
component, which makes it possible to eliminate the trend from the data:
$$y_t^{*} = y_t - \bar{y}_t^{(12)}. \tag{4.4}$$
3. One constructs the (non-normalized) seasonal indices $I_1^{*}, I_2^{*}, \ldots, I_{12}^{*}$, where the
seasonal index $I_j^{*}$ for the $j$th month is estimated as the arithmetic average of all
values $y_t^{*}$ that correspond to the $j$th month over all years included in the time series
($j = 1, \ldots, 12$).
4. One normalizes the values $I_1^{*}, I_2^{*}, \ldots, I_{12}^{*}$ by subtracting their arithmetic mean
$$I_j = I_j^{*} - \bar{I}^{*} = I_j^{*} - \frac{I_1^{*} + \ldots + I_{12}^{*}}{12}, \quad j = 1, \ldots, 12, \tag{4.5}$$
so that the normalization rule (4.1) is fulfilled.
5. One accomplishes the final seasonal elimination, obtaining
$$\hat{y}_t^{(12)} = y_t - I_j, \tag{4.6}$$
where the index $t$ corresponds to the $j$th month of the year.
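The five steps of the additive algorithm can be sketched in Python as follows. This is an illustration rather than the routine used by any software package: the function name `additive_seasonal_adjust` is hypothetical, the season length `s` is assumed to be even, the series is assumed to start in the first season of a year, and the centered moving average is assumed to use the weights $1/(2s), 1/s, \ldots, 1/s, 1/(2s)$.

```python
import numpy as np

def additive_seasonal_adjust(y, s=12):
    """Additive seasonal elimination following steps 1-5 (assumes even s)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Step 1: centered moving averages with weights 1/(2s), 1/s, ..., 1/s, 1/(2s);
    # the first and last calculable values are repeated at the ends.
    weights = np.r_[0.5, np.ones(s - 1), 0.5] / s
    core = np.convolve(y, weights, mode="valid")          # length n - s
    half = s // 2
    trend = np.r_[np.full(half, core[0]), core, np.full(half, core[-1])]
    # Step 2: eliminate the trend, cf. (4.4).
    detrended = y - trend
    # Step 3: non-normalized indices = average of detrended values per season.
    raw = np.array([detrended[j::s].mean() for j in range(s)])
    # Step 4: normalize by subtracting the arithmetic mean, cf. (4.5).
    indices = raw - raw.mean()
    # Step 5: final seasonal elimination, cf. (4.6).
    adjusted = y - indices[np.arange(n) % s]
    return indices, adjusted
```

The multiplicative variant is analogous: differences are replaced by quotients and the arithmetic mean in step 4 by the geometric mean.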

4.1.1.2 Multiplicative Decomposition
1. One constructs the centered moving averages $\bar{y}_t^{(12)}$ similarly as in the case of
additive decomposition (see above).
2. One eliminates the trend from the data:
$$y_t^{*} = \frac{y_t}{\bar{y}_t^{(12)}}. \tag{4.7}$$
3. One constructs the (non-normalized) seasonal indices $I_1^{*}, I_2^{*}, \ldots, I_{12}^{*}$, where the
seasonal index $I_j^{*}$ for the $j$th month is estimated as the arithmetic average of all
values $y_t^{*}$ that correspond to the $j$th month over all years included in the time series
($j = 1, \ldots, 12$).
4. One normalizes the values $I_1^{*}, I_2^{*}, \ldots, I_{12}^{*}$ by dividing by their geometric mean
$$I_j = \frac{I_j^{*}}{\hat{I}^{*}} = \frac{I_j^{*}}{\sqrt[12]{I_1^{*} \cdot \ldots \cdot I_{12}^{*}}}, \quad j = 1, \ldots, 12, \tag{4.8}$$
so that the normalization rule (4.2) is fulfilled.
5. One accomplishes the final seasonal elimination, obtaining
$$\hat{y}_t^{(12)} = \frac{y_t}{I_j}, \tag{4.9}$$
where the index $t$ corresponds to the $j$th month of the year.

Remark 4.1 A well-known software for seasonal time series denoted as X-ARIMA
(e.g., X-12-ARIMA or X-13ARIMA-SEATS) has been developed by the U.S.
Census Bureau (see, e.g., Dagum and Bianconcini (2016) or EViews). It consists
in the gradual application of several special moving averages in the form of compound
filters. Moreover, it also includes procedures for calendar irregularities, seasonal
Box–Jenkins methodology, and the like.
⋄
Example 4.1 Table 4.1 and Fig. 4.2 present the additive elimination of seasonality
in the time series $y_t$ of the Czech construction production index for the quarters
2009Q1–2016Q4 ($t = 1, \ldots, 32$). Table 4.1 also shows numerically the estimated
seasonal indices $I_1, \ldots, I_4$ according to formulas (4.4)–(4.6).
⋄
Example 4.2 Figure 4.3 presents the multiplicative elimination of seasonality in the
time series $y_t$ of the job applicants kept in the Czech labor office register for
the months 2005M1–2016M12 ($t = 1, \ldots, 144$; see also Table 4.4). The
output of EViews in Table 4.2 shows the estimated seasonal indices $I_1, \ldots, I_{12}$
according to formulas (4.7)–(4.9).
⋄
Table 4.1 Quarterly data 2009Q1–2016Q4 and the simple approach to additive seasonal elimination
in Example 4.1 (Czech construction production index)
Quarter y_t I_j ŷ_t^(4) Quarter y_t I_j ŷ_t^(4)
2009Q1 72.06 −37.50 109.55 2013Q1 47.02 −37.50 84.52
2009Q2 110.55 −0.89 111.44 2013Q2 79.61 −0.89 80.50
2009Q3 124.02 15.95 108.07 2013Q3 98.65 15.95 82.70
2009Q4 125.56 22.44 103.13 2013Q4 107.21 22.44 84.77
2010Q1 55.70 −37.50 93.19 2014Q1 53.28 −37.50 90.77
2010Q2 101.24 −0.89 102.14 2014Q2 84.08 −0.89 84.97
2010Q3 120.37 15.95 104.41 2014Q3 101.47 15.95 85.51
2010Q4 122.69 22.44 100.25 2014Q4 107.91 22.44 85.48
2011Q1 58.85 −37.50 96.35 2015Q1 58.35 −37.50 95.85
2011Q2 95.91 −0.89 96.81 2015Q2 94.39 −0.89 95.29
2011Q3 109.37 15.95 93.42 2015Q3 108.92 15.95 92.97
2011Q4 121.59 22.44 99.15 2015Q4 109.53 22.44 87.10
2012Q1 52.94 −37.50 90.44 2016Q1 53.17 −37.50 90.67
2012Q2 90.20 −0.89 91.10 2016Q2 84.53 −0.89 85.43
2012Q3 102.61 15.95 86.65 2016Q3 99.26 15.95 83.31
2012Q4 110.69 22.44 88.25 2016Q4 105.93 22.44 83.50
Source: Czech Statistical Office (https://www.czso.cz/csu/czso/stavebnictvi-casove-rady-archiv-baze-2010)
[Figure 4.2: time plot, horizontal axis 2009–2016, vertical axis 40–130; plotted series: Czech construction production index; additive seasonal elimination by simple approach]
Fig. 4.2 Quarterly data 2009Q1–2016Q4 and the simple approach to the additive seasonal
elimination in Example 4.1 (Czech construction production index)
[Figure 4.3: time plot, horizontal axis 2005–2016, vertical axis 280000–640000; plotted series: job applicants kept in the Czech labour office register; multiplicative seasonal elimination]
Fig. 4.3 Monthly data 2005M1–2016M12 and the simple approach to multiplicative seasonal
elimination in Example 4.2 (job applicants kept in the Czech labor office register); see Table 4.4.
Source: Czech Statistical Office
Table 4.2 Monthly data 2005M1–2016M12 and the simple approach to multiplicative seasonal
elimination in Example 4.2 (job applicants kept in the Czech labor office register)
Sample: 2005M01–2016M12
Ratio to Moving Average
Original Series: CZ_UNEMPL
Adjusted Series: SEASON_MULTIPL
Scaling Factors:
1 1.079644
2 1.083429
3 1.057297
4 1.006022
5 0.968020
6 0.951058
7 0.972616
8 0.973159
9 0.971287
10 0.957589
11 0.965881
12 1.026710
Source: calculated by EViews
4.1.2 Regression Approaches to Seasonality
The regression approaches differ from the simple approaches of Sect. 4.1.1 only by
estimating the seasonal indices using more sophisticated regression models.
4.1.2.1 Seasonality Modeled by Dummies
The additive seasonality can be conveniently modeled by means of regression with
$s - 1$ zero-one regressors, usually denoted in econometrics as dummies ($s$ is the
length of season). As an example, let us consider the additive seasonal decomposition
with linear trend for a quarterly time series $y_t$ (i.e., $s = 4$). Then the
corresponding regression model can be chosen as
$$y_t = \beta_0 + \beta_1 t + \alpha_2 x_{t2} + \alpha_3 x_{t3} + \alpha_4 x_{t4} + \varepsilon_t, \tag{4.10}$$
where the dummies $x_2, x_3, x_4$ are defined by the following table:
t x_t2 x_t3 x_t4
1 0 0 0
2 1 0 0
3 0 1 0
4 0 0 1
5 0 0 0
6 1 0 0
7 0 1 0
8 0 0 1
⋮ ⋮ ⋮ ⋮
The estimated model with OLS estimates $b_0, b_1, a_2, a_3, a_4$ can be used to construct
the point and interval predictions, for which the future values of dummies are
obtained by natural extension of the previous table up to the corresponding prediction
horizon. Moreover, if one needs the seasonal indices explicitly, then their normalization
(4.1) is possible in the form
$$\widehat{\mathrm{Tr}}_t = (b_0 + \bar{a}) + b_1 t, \quad \hat{I}_1 = -\bar{a}, \quad \hat{I}_2 = a_2 - \bar{a}, \quad \hat{I}_3 = a_3 - \bar{a}, \quad \hat{I}_4 = a_4 - \bar{a}, \tag{4.11}$$
where
$$\bar{a} = \frac{a_2 + a_3 + a_4}{4}. \tag{4.12}$$
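The regression (4.10) and the normalization (4.11)–(4.12) can be sketched with ordinary least squares in NumPy. The sketch is illustrative only: the function name `seasonal_dummy_fit` is hypothetical, and it is assumed that the first observation falls into the first season.

```python
import numpy as np

def seasonal_dummy_fit(y, s=4):
    """OLS fit of linear trend plus s - 1 seasonal dummies, cf. (4.10),
    followed by the normalization (4.11)-(4.12).

    Returns the normalized trend intercept b0 + a_bar, the slope b1 and
    the seasonal indices I_1, ..., I_s (which sum to zero).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    t = np.arange(1, n + 1)
    season = (t - 1) % s                                   # 0 for the first season
    dummies = (season[:, None] == np.arange(1, s)).astype(float)   # x_t2, ..., x_ts
    X = np.column_stack([np.ones(n), t, dummies])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b0, b1, a = coef[0], coef[1], coef[2:]
    a_bar = a.sum() / s                                    # (4.12)
    indices = np.r_[0.0, a] - a_bar                        # (4.11)
    return b0 + a_bar, b1, indices
```

Omitting the dummy of the first season avoids perfect collinearity with the intercept; the normalization afterwards redistributes the seasonal effect so that the indices sum to zero.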
Table 4.3 Quarterly data 2009Q1–2016Q4 and the regression approach to additive seasonal
elimination in Example 4.3 (Czech construction production index); see Table 4.1
Quarter y_t I_j ŷ_t^(4) Quarter y_t I_j ŷ_t^(4)
2009Q1 72.06 −37.38 109.44 2013Q1 47.02 −37.38 84.40
2009Q2 110.55 −0.47 111.02 2013Q2 79.61 −0.47 80.08
2009Q3 124.02 15.73 108.29 2013Q3 98.65 15.73 82.92
2009Q4 125.56 22.12 103.44 2013Q4 107.21 22.12 85.09
2010Q1 55.70 −37.38 93.08 2014Q1 53.28 −37.38 90.66
2010Q2 101.24 −0.47 101.71 2014Q2 84.08 −0.47 84.55
2010Q3 120.37 15.73 104.64 2014Q3 101.47 15.73 85.74
2010Q4 122.69 22.12 100.57 2014Q4 107.91 22.12 85.79
2011Q1 58.85 −37.38 96.23 2015Q1 58.35 −37.38 95.73
2011Q2 95.91 −0.47 96.38 2015Q2 94.39 −0.47 94.86
2011Q3 109.37 15.73 93.64 2015Q3 108.92 15.73 93.19
2011Q4 121.59 22.12 99.47 2015Q4 109.53 22.12 87.41
2012Q1 52.94 −37.38 90.32 2016Q1 53.17 −37.38 90.55
2012Q2 90.20 −0.47 90.67 2016Q2 84.53 −0.47 85.00
2012Q3 102.61 15.73 86.88 2016Q3 99.26 15.73 83.53
2012Q4 110.69 22.12 88.57 2016Q4 105.93 22.12 83.81
Source: Czech Statistical Office
Example 4.3 Table 4.3 and Fig. 4.4 present the additive elimination of seasonality
in the time series $y_t$ of the Czech construction production index for the quarters
2009Q1–2016Q4 ($t = 1, \ldots, 32$) using the regression approach to seasonality (see
also Example 4.1 for the same data but applying the simple approach to additive
seasonality from Sect. 4.1.1). Again Table 4.3 shows numerically the seasonal
indices $I_1, \ldots, I_4$ estimated according to formulas (4.10)–(4.12), with the only
difference that a model with quadratic trend has been applied instead of the linear
trend used in (4.10), i.e.,
$$y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \alpha_2 x_{t2} + \alpha_3 x_{t3} + \alpha_4 x_{t4} + \varepsilon_t.$$
This model has been estimated as
$$\hat{y}_t = 74.85 - 2.15\,t + 0.04\,t^2 + 36.91\,x_{t2} + 53.11\,x_{t3} + 59.50\,x_{t4},$$
so that the normalization (4.11) with $\bar{a} = 37.38$ gives
$$\widehat{\mathrm{Tr}}_t = 112.23 - 2.15\,t + 0.04\,t^2; \quad \hat{I}_1 = -37.38; \quad \hat{I}_2 = -0.47; \quad \hat{I}_3 = 15.73; \quad \hat{I}_4 = 22.12.$$
Table 4.3 and Fig. 4.4 again present the values $\hat{y}_t^{(4)}$ after seasonal elimination (4.6).
Finally, the predictions for year 2017 are shown in Fig. 4.4, e.g., for 2017Q1
calculated as
$$\hat{y}_{33}(32) = 74.85 - 2.15 \cdot 33 + 0.04 \cdot 33^2 + 36.91 \cdot 0 + 53.11 \cdot 0 + 59.50 \cdot 0 = 51.98$$
(the reported value 51.98 presumably reflects unrounded coefficients; the rounded coefficients displayed here give 47.46).
⋄
[Figure 4.4: time plot, horizontal axis 2009–2017, vertical axis 40–130; plotted series: Czech construction production index and point predictions for 2017; additive seasonal elimination by regression approach]
Fig. 4.4 Quarterly data 2009Q1–2016Q4 and the regression approach to the additive seasonal
elimination including predictions for data 2017Q1–2017Q4 in Example 4.3 (Czech construction
production index)
4.1.2.2 Seasonality Modeled by Goniometric Functions
The goniometric functions enable us to model the seasonality with length of season
$s$ explicitly by means of models of the form
$$y_t = \beta_0 + \beta_1 t + \beta_2 \sin\frac{2\pi t}{s} + \beta_3 \cos\frac{2\pi t}{s} + \varepsilon_t \tag{4.13}$$
(linear trend with additive seasonality);
$$y_t = \beta_0 + \beta_1 t + \beta_2\, t \sin\frac{2\pi t}{s} + \beta_3\, t \cos\frac{2\pi t}{s} + \varepsilon_t \tag{4.14}$$
(linear trend with multiplicative seasonality) and the like.
Table 4.4 Monthly data 2005M1–2016M12 and the multiplicative seasonal elimination by
Holt–Winters’ method including predictions for data 2017M1–2017M12 in Example 4.4 (job applicants
kept in the Czech labor office register)
obs y_t ŷ_t obs y_t ŷ_t obs y_t ŷ_t
2005M1 561 662 561 401 2009M5 457 561 415 928 2013M9 557 058 560 632
2005M2 555 046 560 999 2009M6 463 555 436 915 2013M10 556 681 557 446
2005M3 540 456 543 236 2009M7 485 319 471 992 2013M11 565 313 566 554
2005M4 512 557 514 050 2009M8 493 751 490 025 2013M12 596 833 606 851
2005M5 494 576 492 728 2009M9 500 812 500 613 2014M1 629 274 636 620
2005M6 489 744 483 847 2009M10 498 760 500 469 2014M2 625 390 639 169
2005M7 500 325 496 289 2009M11 508 909 512 454 2014M3 608 315 619 771
2005M8 505 254 497 258 2009M12 539 136 554 066 2014M4 574 908 585 569
2005M9 503 396 498 453 2010M1 574 226 586 434 2014M5 549 973 560 130
2005M10 491 878 492 879 2010M2 583 135 594 621 2014M6 537 179 546 537
2005M11 490 779 496 296 2010M3 572 824 585 490 2014M7 541 364 554 316
2005M12 510 416 524 791 2010M4 540 128 562 720 2014M8 535 225 549 561
2006M1 531 235 549 598 2010M5 514 779 539 075 2014M9 529 098 545 301
2006M2 528 154 539 345 2010M6 500 500 526 087 2014M10 519 638 533 373
2006M3 514 759 519 664 2010M7 505 284 534 095 2014M11 517 508 531 787
2006M4 486 163 489 954 2010M8 501 494 526 799 2014M12 541 914 556 343
2006M5 463 042 468 218 2010M9 500 481 517 533 2015M1 556 191 579 989
2006M6 451 106 456 462 2010M10 495 161 502 021 2015M2 548 117 567 196
2006M7 458 270 461 290 2010M11 506 640 503 480 2015M3 525 315 540 863
2006M8 458 729 458 063 2010M12 561 551 536 489 2015M4 491 585 502 268
2006M9 454 182 453 279 2011M1 571 853 590 852 2015M5 465 689 473 425
2006M10 439 788 442 873 2011M2 566 896 589 323 2015M6 451 395 456 122
2006M11 432 573 441 332 2011M3 547 762 568 253 2015M7 456 341 456 930
2006M12 448 545 460 103 2011M4 513 842 533 396 2015M8 450 666 450 923
2007M1 465 458 479 713 2011M5 489 956 504 445 2015M9 441 892 446 274
2007M2 454 737 469 777 2011M6 478 775 488 996 2015M10 430 432 435 493
2007M3 430 474 448 495 2011M7 485 584 495 685 2015M11 431 364 431 793
2007M4 402 932 414 079 2011M8 481 535 491 332 2015M12 453 118 451 977
2007M5 382 599 388 422 2011M9 475 115 485 599 2016M1 467 403 469 766
2007M6 370 791 374 060 2011M10 470 618 471 305 2016M2 461 254 463 410
2007M7 376 608 374 935 2011M11 476 404 473 334 2016M3 443 109 444 681
2007M8 372 759 370 801 2011M12 508 451 505 004 2016M4 414 960 415 571
2007M9 364 978 363 714 2012M1 534 089 531 607 2016M5 394 789 393 370
2007M10 348 842 351 060 2012M2 541 685 532 930 2016M6 384 328 381 009
2007M11 341 438 345 060 2012M3 525 180 522 306 2016M7 392 667 384 149
2007M12 354 878 356 754 2012M4 497 322 496 235 2016M8 388 474 381 810
2008M1 364 544 370 824 2012M5 482 099 476 241 2016M9 378 258 379 133
2008M2 355 033 361 174 2012M6 474 586 469 617 2016M10 366 244 370 435
2008M3 336 297 342 850 2012M7 485 597 482 137 2016M11 362 755 367 779
2008M4 316 118 317 381 2012M8 486 693 483 795 2016M12 381 373 382 580
2008M5 302 507 298 817 2012M9 493 185 483 997 2017M1 393 632
2008M6 297 880 289 208 2012M10 496 762 481 615 2017M2 388 287
2008M7 310 058 293 601 2012M11 508 498 493 515 2017M3 372 641
2008M8 312 333 295 350 2012M12 545 311 534 816 2017M4 348 250
2008M9 314 558 295 268 2013M1 585 809 564 588 2017M5 329 616
2008M10 311 705 291 838 2013M2 593 683 579 507 2017M6 318 508
2008M11 320 299 296 627 2013M3 587 768 572 841 2017M7 320 201
2008M12 352 250 321 184 2013M4 565 228 553 061 2017M8 313 932
2009M1 398 061 347 520 2013M5 547 463 541 313 2017M9 306 686
2009M2 428 848 368 210 2013M6 540 473 538 308 2017M10 297 872
2009M3 448 912 385 651 2013M7 551 096 555 683 2017M11 295 628
2009M4 456 726 398 596 2013M8 551 731 558 583 2017M12 308 814
Source: Czech Statistical Office (https://www.czso.cz/csu/czso/casove-rady-zakladnich-ukazatelu-
statistiky-prace-leden-2020)
If the form of seasonality is more complex, one can add components of the form
$$\beta_4 \sin\frac{4\pi t}{s} + \beta_5 \cos\frac{4\pi t}{s}. \tag{4.15}$$
4.1.3 Holt–Winters’ Method
This method extends Holt’s method from Sect. 3.3.3 so that it captures adaptively
not only the local linear trend but also the seasonality. Therefore, both versions of
Holt–Winters’ method (additive and multiplicative) use three smoothing
constants: $\alpha$ to smooth the level $L_t$, $\gamma$ to smooth the slope $T_t$, and $\delta$ to smooth the
seasonal index $I_t$ of the given time series with length of season $s$ ($0 < \alpha, \gamma, \delta < 1$); see,
e.g., Abraham and Ledolter (1983), Bowerman and O’Connell (1987), Montgomery
and Johnson (1976), and others.
4.1.3.1 Additive Holt–Winters’ Method
Recursive formulas of the additive Holt–Winters’ method have the form
$$L_t = \alpha (y_t - I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1}), \tag{4.16}$$
$$T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) T_{t-1}, \tag{4.17}$$
$$I_t = \delta (y_t - L_t) + (1-\delta) I_{t-s}, \tag{4.18}$$
$$\hat{y}_t = L_t + I_t, \tag{4.19}$$
$$\hat{y}_{t+\tau}(t) = \begin{cases} L_t + T_t\,\tau + I_{t+\tau-s} & \text{for } \tau = 1, \ldots, s, \\ L_t + T_t\,\tau + I_{t+\tau-2s} & \text{for } \tau = s+1, \ldots, 2s, \\ \quad\vdots \end{cases} \tag{4.20}$$
Like Holt's method in Sect. 3.3.3, this method was originally suggested ad hoc using logical arguments only. For example, the eliminated seasonal index $I_t$ of the given time series at time t is constructed according to (4.18) as a convex combination of two items, namely (1) an estimate of this seasonal index constructed at time t by removing the trend component from the observed value $y_t$ (i.e., $y_t - L_t$) and (2) an estimate of this seasonal index constructed at time t − 1 using the most recent estimated value $I_{t-s}$ from the previous season. One proceeds in a similar way in (4.16), where the seasonal component is removed from the observed value $y_t$ (i.e., $y_t - I_{t-s}$), again using the most recent estimated value $I_{t-s}$ from the previous season, and in (4.20) for prediction (here one must distinguish particular future seasons, bearing in mind that forecasts for horizons too remote in the future may be unreliable).
To start the recursive formulas of additive Holt–Winters' method, one must choose initial values $L_0, T_0, I_{-s+1}, I_{-s+2}, \dots, I_0$ and smoothing constants α, γ, δ:
1. Suitable initial values can be found simply if one models the seasonality by dummies as in Sect. 4.1.2 (the normalization is not necessary here)
\[ y_t = \beta_0 + \beta_1 t + \alpha_2 x_{t2} + \dots + \alpha_s x_{ts} + \varepsilon_t \tag{4.21} \]
with dummy variables $x_2, \dots, x_s$, so that using the OLS estimates $b_0, b_1, a_2, \dots, a_s$ one can put
\[ L_0 = b_0, \quad T_0 = b_1, \quad I_{-s+1} = 0, \quad I_{-s+2} = a_2, \ \dots, \ I_0 = a_s. \tag{4.22} \]
2. Suitable smoothing constants α, γ, δ can be found using
(a) Fixed choice: in routine situations one recommends α = δ = 0.4 and γ = 0.1.
(b) Estimates of α, γ, δ: one proceeds as for exponential smoothing by minimizing SSE (see, e.g., Sect. 3.3.1).
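The additive recursions can be illustrated by a short Python sketch. It is only a minimal illustration under stated assumptions: the function names `holt_winters_additive` and `forecast_additive` are hypothetical, the slope update is taken in the usual Holt form $T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) T_{t-1}$, and the initial values are supplied by the caller rather than estimated by OLS as in (4.22).

```python
def holt_winters_additive(y, s, alpha=0.4, gamma=0.1, delta=0.4,
                          level0=0.0, trend0=0.0, seasonal0=None):
    """Run the additive recursions over y_1, ..., y_n.

    seasonal0 holds the initial indices I_{-s+1}, ..., I_0 (zeros if omitted,
    which is an assumption of this sketch, not a recommendation of the text).
    Returns the final level, final slope, all seasonal indices, and the
    smoothed values L_t + I_t.
    """
    seasonal = list(seasonal0) if seasonal0 is not None else [0.0] * s
    level, trend = level0, trend0
    smoothed = []
    for i, obs in enumerate(y):          # i = t - 1
        old_index = seasonal[i]          # I_{t-s}
        new_level = alpha * (obs - old_index) + (1 - alpha) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend  # assumed Holt form
        level = new_level
        seasonal.append(delta * (obs - level) + (1 - delta) * old_index)  # I_t
        smoothed.append(level + seasonal[-1])
    return level, trend, seasonal, smoothed


def forecast_additive(level, trend, seasonal, s, horizon):
    """Predictions for tau = 1, ..., horizon, reusing the last s seasonal indices."""
    last = seasonal[-s:]                 # I_{n-s+1}, ..., I_n
    return [level + trend * tau + last[(tau - 1) % s]
            for tau in range(1, horizon + 1)]


# Noise-free check: a linear trend plus a fixed seasonal pattern is reproduced exactly
# when the initial values are exact.
pattern = [1.0, -1.0, 2.0, -2.0]
y = [10 + 2 * t + pattern[(t - 1) % 4] for t in range(1, 13)]
level, trend, seasonal, smoothed = holt_winters_additive(
    y, 4, level0=10.0, trend0=2.0, seasonal0=pattern)
predictions = forecast_additive(level, trend, seasonal, 4, 5)
print(predictions)
```

Indexing the seasonal list by position keeps $I_{t-s}$ available without modular arithmetic inside the recursion; only the forecast step cycles through the last s indices.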
4.1.3.2 Multiplicative Holt–Winters’ Method
Recursive formulas of multiplicative Holt–Winters’ method have the form
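\[ L_t = \alpha (y_t / I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1}), \tag{4.23} \]
\[ T_t = \gamma (L_t - L_{t-1}) + (1-\gamma)\, T_{t-1}, \tag{4.24} \]
\[ I_t = \delta (y_t / L_t) + (1-\delta)\, I_{t-s}, \tag{4.25} \]
\[ \hat{y}_t = L_t I_t, \tag{4.26} \]
\[ \hat{y}_{t+\tau}(t) = \begin{cases} (L_t + T_t \tau)\, I_{t+\tau-s} & \text{for } \tau = 1, \dots, s, \\ (L_t + T_t \tau)\, I_{t+\tau-2s} & \text{for } \tau = s+1, \dots, 2s, \\ \quad \vdots \end{cases} \tag{4.27} \]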
In comparison with the previous additive Holt–Winters’ method, one only replaces
sums and differences within brackets in (4.16)–(4.20) by products and quotients,
respectively.
To start the recursive formulas of multiplicative Holt–Winters’ method, one must
again choose initial values L0, T0, I-s+1, I-s+2, . . ., I0 and smoothing constants α, γ, δ:
1. Suitable initial values can be found by means of simple formulas
\[ T_0 = \frac{\bar{y}_m - \bar{y}_1}{(m-1)s}, \qquad L_0 = \bar{y}_1 - \frac{s+1}{2}\, T_0, \]
\[ I_{j-s} = \frac{1}{m} \sum_{i=0}^{m-1} \frac{y_{j+si}}{\bar{y}_{i+1} - \left( \frac{s+1}{2} - j \right) T_0}, \quad j = 1, \dots, s, \tag{4.28} \]
where $\bar{y}_i$ is the arithmetic average of observations over the ith season (of length s) and m is the total number of these seasons.
2. Suitable smoothing constants α, γ, δ can be found in the same way as in the case
of additive Holt–Winters’ method.
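Remark 4.2 Using a notation similar to that in Remark 3.6, it is recommended to construct the (1 − p)100% prediction interval in the form
\[ \left( \hat{y}_{n+\tau}(n) - u_{1-p/2}\, d_\tau\, \mathrm{MAE}, \ \hat{y}_{n+\tau}(n) + u_{1-p/2}\, d_\tau\, \mathrm{MAE} \right), \tag{4.29} \]
where one substitutes for any τ > 0
\[ d_\tau \approx 1.25 \left( \frac{1 + \frac{\theta}{(1+\nu)^3} \left[ (1 + 4\nu + 5\nu^2) + 2\theta (1 + 3\nu) \tau + 2\theta^2 \tau^2 \right]}{1 + \frac{\theta}{(1+\nu)^3} \left[ (1 + 4\nu + 5\nu^2) + 2\theta (1 + 3\nu) + 2\theta^2 \right]} \right)^{1/2} \tag{4.30} \]
(one denotes θ = max{α, γ, δ} and ν = 1 − θ), and for additive Holt–Winters' method
\[ \mathrm{MAE} = \frac{1}{n-s} \sum_{t=s+1}^{n} \left| y_t - L_{t-1} - T_{t-1} - I_{t-s} \right|, \tag{4.31} \]
or for multiplicative Holt–Winters' method
\[ \mathrm{MAE} = \frac{1}{n-s} \sum_{t=s+1}^{n} \left| \frac{y_t}{I_{t-s}} - L_{t-1} - T_{t-1} \right|. \tag{4.32} \]
⋄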
Example 4.4 Table 4.4 and Fig. 4.5 present the multiplicative elimination of seasonality by Holt–Winters' method (with the fixed choice of smoothing constants α = δ = 0.4 and γ = 0.1) in the time series $y_t$ of job applicants kept in the Czech labor office register for the months 2005M1–2016M12 (t = 1, …, 144), including predictions for 2017M1–2017M12. The smoothed values and predictions have been obtained by EViews and can be compared with the corresponding results of the simple approach to multiplicative seasonal elimination in Example 4.2.
Fig. 4.5 Monthly data 2005M1–2016M12 and the multiplicative seasonal elimination by Holt–Winters' method in Example 4.4 (job applicants kept in the Czech labor office register), see Table 4.4. Plotted series: job applicants kept in the Czech labour office register; multiplicative seasonal elimination by Holt–Winters' method. Source: Czech Statistical Office
4.1.4 Schlicht’s Method
Schlicht's method approaches the problem of additive decomposition by optimizing specific criteria. If y is the vector of observed values of the given time series, then these criteria characterize the vector of trend values x, the vector of seasonal values z, and the vector of residual values u in the decomposition
\[ y = x + z + u \tag{4.33} \]
(all these vectors are column vectors of type T × 1). The criteria that characterize the particular decomposition components are recommended in the following form (see Schlicht (1982)):
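1. Criterion minimizing the measure of trend smoothness $f: \mathbb{R}^T \to \mathbb{R}$ (over $x \in \mathbb{R}^T$):
\[ f(x) = \alpha \sum_{t=3}^{T} \left( \Delta^2 x_t \right)^2 = \alpha \sum_{t=3}^{T} (x_t - 2x_{t-1} + x_{t-2})^2. \tag{4.34} \]
2. Criterion minimizing the measure of seasonal stability $g: \mathbb{R}^T \to \mathbb{R}$ with length of season s (over $z \in \mathbb{R}^T$):
\[ g(z) = \beta \sum_{t=s+1}^{T} (z_t - z_{t-s})^2 + \gamma \sum_{t=s}^{T} \left( \sum_{\tau=0}^{s-1} z_{t-\tau} \right)^2. \tag{4.35} \]
3. Criterion minimizing the measure SSE of the residual component $h: \mathbb{R}^T \to \mathbb{R}$ (over $u \in \mathbb{R}^T$):
\[ h(u) = u'u = \sum_{t=1}^{T} u_t^2. \tag{4.36} \]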
The positive coefficients α, β, γ control the relative intensity of the particular components in the decomposition; e.g., the choice of a higher α will result in a smoother trend. One must also respect the fact that x + z + u = y. Therefore, the decomposition of y is transformed into the optimization problem
\[ \min_{x,\, z \in \mathbb{R}^T} \{ f(x) + g(z) + h(y - x - z) \}. \tag{4.37} \]
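It is equivalent to solving the following system of 2T linear equations with 2T unknowns x and z:
\[ H \begin{pmatrix} x \\ z \end{pmatrix} = \begin{pmatrix} y \\ y \end{pmatrix}, \tag{4.38} \]
where
\[ H = \begin{pmatrix} \alpha P'P + I & I \\ I & \beta Q'Q + \gamma R'R + I \end{pmatrix} \tag{4.39} \]
with matrices P of type (T − 2) × T, Q of type (T − s) × T, and R of type (T − s + 1) × T. These are band matrices whose elements are zero with the exception of the main diagonal and several upper diagonals, each formed by the same elements (e.g., by −2 and 1 on the upper diagonals in the case of P), so that Px, Qz, and Rz collect the terms $x_t - 2x_{t-1} + x_{t-2}$, $z_t - z_{t-s}$, and $\sum_{\tau=0}^{s-1} z_{t-\tau}$ appearing in (4.34) and (4.35).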
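The linear system of Schlicht's method can be assembled directly. The following NumPy sketch is an illustration only: the function name `schlicht_decompose` is hypothetical, and the band matrices P, Q, and R are built from the differences and seasonal sums in the three criteria (second differences, seasonal differences, and sums over s consecutive seasonal values), which is an assumption about their exact layout.

```python
import numpy as np


def schlicht_decompose(y, s, alpha=1.0, beta=1.0, gamma=1.0):
    """Solve the 2T x 2T system for trend x and seasonal z; return (x, z, u)."""
    y = np.asarray(y, dtype=float)
    T = y.size
    # P x collects x_t - 2 x_{t-1} + x_{t-2}, t = 3, ..., T
    P = np.zeros((T - 2, T))
    for i in range(T - 2):
        P[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Q z collects z_t - z_{t-s}, t = s+1, ..., T
    Q = np.zeros((T - s, T))
    for i in range(T - s):
        Q[i, i] = -1.0
        Q[i, i + s] = 1.0
    # R z collects the sums of s consecutive seasonal values, t = s, ..., T
    R = np.zeros((T - s + 1, T))
    for i in range(T - s + 1):
        R[i, i:i + s] = 1.0
    eye = np.eye(T)
    H = np.block([[alpha * P.T @ P + eye, eye],
                  [eye, beta * Q.T @ Q + gamma * R.T @ R + eye]])
    solution = np.linalg.solve(H, np.concatenate([y, y]))
    x, z = solution[:T], solution[T:]
    return x, z, y - x - z


# Noise-free check: a linear trend plus a zero-sum seasonal pattern makes all three
# criteria vanish, so the decomposition should recover both components.
pattern = np.array([1.0, -1.0, 2.0, -2.0])
t = np.arange(1, 25)
y = 5.0 + 0.5 * t + pattern[(t - 1) % 4]
x, z, u = schlicht_decompose(y, s=4)
print(np.round(z[:4], 6), np.abs(u).max())
```

For long series the dense matrices become wasteful; a sparse solver would be the natural replacement, but that goes beyond this sketch.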
4.2 Tests of Periodicity
In the decomposition model (2.2.2), the cyclical component Ct can sometimes play
an important role. It is the periodic component with periodicity longer than one year,
e.g., the five-year business cycle from Sect. 2.2.2 (the annual periodicity is classified
as seasonality). Sometimes there are even several such periodicities compounded in
the given time series. Their elimination is complex since one must decide on the
number and length of corresponding periodicities (e.g., quarterly one in combination
with annual and five-year periodicities using monthly observations). As objective
instruments in such situations one can apply various tests of periodicity which are
usually based on a so-called periodogram.
The periodogram (as well as the spectral density) is an important instrument of
spectral analysis of time series (see Sect. 2.2.2). Spectral analysis transfers the time
domain (which looks upon the given time series as a sequence of observations in
time) to the spectral domain (which looks upon the given time series as an (infinite)
mixture of periodic components and calculates their intensities in this mixture).
More specifically, the periodogram I(ω) of a time series $y_1, y_2, \dots, y_n$ is a function of the frequency ω (such functions are typical for the spectral domain, while functions of time are used in the time domain). The frequency ω is usually measured in radians per time unit (this time unit corresponds to the time interval between neighboring observations, e.g., one year for an annual time series). Then ω/2π is the number of cycles per time unit. For example, the five-year periodicity in an annual time series, where one-fifth of a cycle occurs per year, has the frequency 2π/5. It is obvious that by observing a given time series one can statistically recognize only frequencies up to π radians per time unit, i.e., up to one-half cycle per time unit (this upper limit is called the Nyquist frequency); the "quicker" frequencies remain hidden from the point of view of the grid of observed values (e.g., the "quick" frequency 2π radians per time unit of the time series $y_t = \sin(2\pi t)$ observed at times t = 1, …, n, with one cycle per time unit, obviously cannot be identified from the observed zero values $y_t = 0$ for t = 1, …, n).
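Numerically, the periodogram I(ω) is defined as
\[ I(\omega) = \frac{1}{4\pi} \left( a^2(\omega) + b^2(\omega) \right), \quad 0 \le \omega \le \pi, \tag{4.40} \]
where
\[ a(\omega) = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \cos(\omega t), \qquad b(\omega) = \sqrt{\frac{2}{n}} \sum_{t=1}^{n} y_t \sin(\omega t). \tag{4.41} \]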
The key property of the periodogram, which motivates the introduction of this instrument, is its behavior when it is applied to a periodic time series of the form
\[ y_t = \mu + \sum_{i=1}^{k} \delta_i \cos(\omega_i t + \phi_i) + \varepsilon_t = \mu + \sum_{i=1}^{k} \left( \alpha_i \cos(\omega_i t) + \beta_i \sin(\omega_i t) \right) + \varepsilon_t, \quad t = 1, \dots, n, \tag{4.42} \]
where μ denotes the level of this time series, $\omega_1, \dots, \omega_k$ are the mutually different (in general unknown) frequencies from the interval (0, π) for the k periodic components contained in (4.42), $\phi_1, \dots, \phi_k$ are the corresponding phases (i.e., the shifts of the cosinusoids from the origin for the particular periodic components), and $\varepsilon_t$ is the residual component in the form of white noise with variance $\sigma^2$. Then the periodogram of the time series (4.42) fluctuates around the constant $\sigma^2 / 2\pi$ with the exception of the frequencies $\omega_1, \dots, \omega_k$, at which the periodogram rockets to local extremes comparable with the size of n (i.e., they are of order O(n)). Therefore, the periodogram can indicate by its "bursts" the position of the frequencies $\omega_1, \dots, \omega_k$.
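This burst behavior is easy to see numerically. The sketch below is a minimal pure-Python illustration: the function name `periodogram` and the noise-free test series with a 7-step period are assumptions chosen for the demonstration, and the grid of frequencies 2πj/n is used only because it makes the peak fall exactly on a grid point.

```python
import math


def periodogram(y, omega):
    """I(omega) = (a^2 + b^2) / (4 pi) with a, b scaled by sqrt(2 / n)."""
    n = len(y)
    scale = math.sqrt(2.0 / n)
    a = scale * sum(obs * math.cos(omega * t) for t, obs in enumerate(y, start=1))
    b = scale * sum(obs * math.sin(omega * t) for t, obs in enumerate(y, start=1))
    return (a * a + b * b) / (4.0 * math.pi)


# Hypothetical series: level 3, one cosinusoid with amplitude 2 and period 7, no noise.
n = 70
series = [3.0 + 2.0 * math.cos(2.0 * math.pi * t / 7.0 + 0.5) for t in range(1, n + 1)]

grid = [2.0 * math.pi * j / n for j in range(1, (n - 1) // 2 + 1)]
values = [periodogram(series, omega) for omega in grid]
peak_index = max(range(len(values)), key=values.__getitem__) + 1

# The peak sits at j = n / period, and its height grows proportionally to n.
print(peak_index, n / peak_index, values[peak_index - 1])
```

At the peak the value equals $n \delta^2 / (8\pi)$ for amplitude δ, which is the O(n) growth mentioned above; at the other grid frequencies the noise-free series contributes essentially nothing.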
In practice, a graphical search for the local extremes of the periodogram may be subjective. Moreover, a practical realization of the periodogram typically fluctuates strongly because the periodogram is not a consistent estimator of the spectrum (i.e., its variance need not decrease with the increasing length of the time series). Therefore, in practice one should prefer suitable statistical tests. The best-known test of this type is Fisher's test of periodicity, which tests the null hypothesis
\[ H_0: \ y_t = \mu + \varepsilon_t, \quad t = 1, \dots, n \tag{4.43} \]
with normally distributed white noise $\{\varepsilon_t\}$ against the alternative hypothesis (4.42) at a given significance level α. The test statistic is constructed using the periodogram values over the grid of frequencies
\[ \omega_j = \frac{2\pi j}{n}, \quad j = 1, \dots, m, \tag{4.44} \]
where $m = \left[ \frac{n-1}{2} \right]$ is the integer part of (n − 1)/2, and has the form
\[ W = \max_{j=1,\dots,m} Y_j = Y_{j^*} = \max_{j=1,\dots,m} \frac{I(\omega_j)}{I(\omega_1) + \dots + I(\omega_m)} \tag{4.45} \]
(i.e., the test statistic equals the maximum standardized value of the periodogram over the grid (4.44), achieved for the index value denoted as $j^*$).
The critical region of Fisher's test of periodicity with significance level α is then
\[ W \ge g_\alpha, \tag{4.46} \]
where $g_\alpha$ is the critical value of this test (see the tabulated values in Table 4.5). When the inequality (4.46) holds, we have simultaneously found the frequency of the periodic component that causes the rejection of the null hypothesis (4.43): it is the grid point (4.44) at which the maximum in (4.45) is achieved. Repeating the test after removing this frequency (i.e., for m − 1), we can find further frequencies in the analyzed time series. The practical realization of Fisher's test of periodicity is described in Example 4.5.
Table 4.5 Critical values of Fisher's test
m     g0.05   g0.01
5     0.684   0.789
10    0.445   0.536
15    0.335   0.407
20    0.270   0.330
25    0.228   0.278
30    0.198   0.241
35    0.175   0.216
40    0.157   0.192
50    0.131   0.160
Remark 4.3 The distribution of the test statistic under the null hypothesis in Fisher's test is complex. However, for m ≤ 50 a simple approximation holds, namely
\[ P(W > x) \approx m (1 - x)^{m-1}, \quad 0 \le x \le 1. \tag{4.47} \]
There exist various modifications of Fisher's test, which have been suggested to improve the power of this test, in particular when the periodicity is compounded with k > 1 (e.g., Siegel's test (1980), Bølviken's test (1983), and others).
⋄
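Fisher's statistic and the approximate tail probability from Remark 4.3 can be sketched in a few lines of Python. The function names `fisher_test` and `fisher_p_approx` and the test series are hypothetical; constant factors of the periodogram are dropped because they cancel in the ratio W, and the cap of the approximate probability at 1 is a convenience of this sketch.

```python
import math


def fisher_test(y):
    """Return (W, j_star, m) over the grid omega_j = 2 pi j / n, j = 1, ..., m."""
    n = len(y)
    m = (n - 1) // 2
    power = []
    for j in range(1, m + 1):
        omega = 2.0 * math.pi * j / n
        c = sum(obs * math.cos(omega * t) for t, obs in enumerate(y, start=1))
        s = sum(obs * math.sin(omega * t) for t, obs in enumerate(y, start=1))
        power.append(c * c + s * s)      # proportional to the periodogram value
    largest = max(power)
    return largest / sum(power), power.index(largest) + 1, m


def fisher_p_approx(w, m):
    """Approximation P(W > w) ~ m (1 - w)^(m - 1), capped at 1."""
    return min(1.0, m * (1.0 - w) ** (m - 1))


# Hypothetical daily series of three weeks: weekly cosinusoid plus a small alternating term.
series = [3.0 + 2.0 * math.cos(2.0 * math.pi * t / 7.0) + (0.3 if t % 2 else -0.3)
          for t in range(1, 22)]
w, j_star, m = fisher_test(series)
print(w, j_star, m, fisher_p_approx(w, m))
```

To look for a second periodicity, the same ratio would be recomputed with the largest value removed from both numerator and denominator and with m − 1 in place of m.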

If in the time series $y_1, \dots, y_n$ the frequencies $\hat{\omega}_1, \dots, \hat{\omega}_k \in \{\omega_1, \dots, \omega_m\}$ have been indicated by Fisher's test, then the OLS estimates in the resulting model (4.42) are very simple due to the orthogonality of its regressors, namely
\[ \hat{\mu} = \frac{1}{n} \sum_{t=1}^{n} y_t, \qquad \hat{\alpha}_j = \frac{2}{n} \sum_{t=1}^{n} y_t \cos(\hat{\omega}_j t), \qquad \hat{\beta}_j = \frac{2}{n} \sum_{t=1}^{n} y_t \sin(\hat{\omega}_j t), \quad j = 1, \dots, k. \tag{4.48} \]
Example 4.5 Table 4.6 presents observed numbers of defective pieces (in thousands) in the daily production of a production unit during 21 days (i.e., three weeks). One investigates whether this time series in Fig. 4.6 involves periodicities (e.g., the weekly periodicity due to the "bad" days at the beginning and end of particular weeks, or the three-day periodicity due to the rotation of workers' shifts, or others).

Table 4.6 Data for Example 4.5 (numbers of defective pieces in daily production in thousands)
t    yt      t    yt      t    yt
1    3.69    8    3.34    15   3.35
2    4.05    9    2.20    16   3.88
3    1.40    10   2.39    17   2.75
4    2.53    11   2.49    18   0.77
5    1.87    12   1.53    19   3.92
6    2.57    13   4.18    20   2.13
7    5.16    14   4.59    21   4.24

Since n = 21 and m = 10, one calculates (in Table 4.7) ten periodogram values $I(\omega_j)$ at the grid points $\omega_j = 2\pi j / 21$ (j = 1, …, 10) according to (4.40) and (4.41).

Table 4.7 Periodogram values in Example 4.5
j     I(ωj)
1     0.0285
2     0.0219
3     1.0838
4     0.0190
5     0.0344
6     0.0203
7     0.5136
8     0.0177
9     0.1646
10    0.2607
Σ     2.1645

Then the value of the corresponding test statistic (4.45) is
\[ W = \max_{j=1,\dots,10} Y_j = Y_3 = \frac{I(\omega_3)}{I(\omega_1) + \dots + I(\omega_{10})} = \frac{1.0838}{2.1645} = 0.5007. \tag{4.49} \]
Using the tabulated critical values from Table 4.5, it holds for m = 10
\[ W = 0.5007 > g_{0.05} = 0.445, \tag{4.50} \]
so that Fisher's test of periodicity confirms at the significance level of 5% the presence of the periodic component with frequency $\omega_3 = 2\pi \cdot 3 / 21 = 2\pi / 7$, which can be interpreted in the given time series of daily observations as a weekly periodicity.
Since the null hypothesis (4.43) of non-periodicity has been rejected in the previous step, one should continue to look for further potential periodicities and repeat Fisher's test with the periodogram values from Table 4.7 for m = 9, deleting the value $I(\omega_3)$, i.e.,
\[ W = \frac{I(\omega_7)}{I(\omega_1) + I(\omega_2) + I(\omega_4) + I(\omega_5) + \dots + I(\omega_{10})} = \frac{0.5136}{1.0807} = 0.4752. \tag{4.51} \]
The approximation (4.47) can replace the tabulated critical value (since m = 9 ≤ 50):
\[ P(W > 0.4752) \approx 9\, (1 - 0.4752)^8 = 0.0518, \tag{4.52} \]
so that the presence of a further periodic component with frequency $\omega_7 = 2\pi \cdot 7 / 21 = 2\pi / 3$ cannot be confirmed at the significance level of 5% (obviously, this component would model a 3-day periodicity).
Finally, by (4.48), the model (4.42) can be estimated in the form (see also Fig. 4.6)
\[ \hat{y}_t = 3.01 + 1.06 \cos\frac{2\pi t}{7} + 0.42 \sin\frac{2\pi t}{7} = 3.01 + 1.14 \cos\left( \frac{2\pi t}{7} - 0.3772 \right). \tag{4.53} \]
Fig. 4.6 Numbers of defective pieces in daily production (in thousands) and estimated periodicity by means of (4.53) in Example 4.5
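The computations of Example 4.5 can be retraced from the printed data. The sketch below is only an illustration with hypothetical variable names; because Table 4.6 lists the observations rounded to two decimals, the numbers it produces may differ slightly from those reported in the example.

```python
import math

# Data of Table 4.6 (defective pieces in thousands), t = 1, ..., 21
y = [3.69, 4.05, 1.40, 2.53, 1.87, 2.57, 5.16,
     3.34, 2.20, 2.39, 2.49, 1.53, 4.18, 4.59,
     3.35, 3.88, 2.75, 0.77, 3.92, 2.13, 4.24]
n = len(y)
m = (n - 1) // 2


def fourier_sums(series, omega):
    """Return sum of y_t cos(omega t) and sum of y_t sin(omega t)."""
    c = sum(obs * math.cos(omega * t) for t, obs in enumerate(series, start=1))
    s = sum(obs * math.sin(omega * t) for t, obs in enumerate(series, start=1))
    return c, s


# Periodogram values on the grid omega_j = 2 pi j / n
periodogram_values = []
for j in range(1, m + 1):
    c, s = fourier_sums(y, 2.0 * math.pi * j / n)
    periodogram_values.append((2.0 / n) * (c * c + s * s) / (4.0 * math.pi))

W = max(periodogram_values) / sum(periodogram_values)
j_star = periodogram_values.index(max(periodogram_values)) + 1

# OLS estimates for the single detected frequency
omega_hat = 2.0 * math.pi * j_star / n
c, s = fourier_sums(y, omega_hat)
mu_hat = sum(y) / n
alpha_hat = 2.0 * c / n
beta_hat = 2.0 * s / n
# alpha cos(wt) + beta sin(wt) = amplitude * cos(wt - phase)
amplitude = math.hypot(alpha_hat, beta_hat)
phase = math.atan2(beta_hat, alpha_hat)

print(j_star, round(W, 4))
print(round(mu_hat, 3), round(alpha_hat, 3), round(beta_hat, 3),
      round(amplitude, 3), round(phase, 4))
```

The maximum falls at the grid index corresponding to a period of n / j* = 7 days, in line with the weekly periodicity found in the example.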
4.3 Transformations of Time Series
In practice, the analyzed time series are sometimes transformed in a suitable way to
simplify the decomposition of the time series after transformation. Moreover, such
transformations may be useful not only in the framework of decomposition. Two
examples will be given here: Box–Cox transformation and the transformation based
on differencing.
4.3.1 Box–Cox Transformation
Box–Cox transformation has some appealing properties:

• It homogenizes the variance of the given time series (including the seasonal variance), so that it becomes (approximately) constant in time.
• It makes a skewed distribution of the given time series symmetric (or even normal, which makes it easy, e.g., to construct prediction intervals).
• It linearizes a given model of the time series (frequently in the framework of Box–Jenkins methodology; see Chap. 6).
The usual form of Box–Cox transformation is
\[ y_t^{(\lambda)} = \begin{cases} \dfrac{(y_t + c)^\lambda - 1}{\lambda} & \text{for } \lambda \ne 0, \\ \ln(y_t + c) & \text{for } \lambda = 0. \end{cases} \tag{4.54} \]
Here the level parameter c > 0 may be fixed in such a way that $y_t + c > 0$ holds, while the type parameter $\lambda \in \mathbb{R}$ plays the key role in this transformation; e.g., it obviously holds
\[ \lim_{\lambda \to 0} \frac{(y_t + c)^\lambda - 1}{\lambda} = \ln(y_t + c) \tag{4.55} \]
(therefore the index λ appears in the symbol $y_t^{(\lambda)}$ denoting the transformed time series). Even though the parameter value λ that homogenizes the variance of the given time series can be estimated by the maximum likelihood method, practical applications frequently prefer more subjective approaches based on considerations of the following type. Since it holds
\[ \operatorname{var}\left( y_t^{(\lambda)} \right) \approx \left( \left. \frac{\mathrm{d} y_t^{(\lambda)}}{\mathrm{d} y_t} \right|_{y_t = \bar{y}} \right)^2 \operatorname{var}(y_t) \tag{4.56} \]
and since we want to achieve the variance homogeneity
\[ \operatorname{var}\left( y_t^{(\lambda)} \right) = k^2 = \text{const}, \tag{4.57} \]
one obtains from (4.56) and (4.57), using $\mathrm{d} y_t^{(\lambda)} / \mathrm{d} y_t = (y_t + c)^{\lambda - 1}$ and neglecting c, the following important relation for the sample standard deviation $s_y$ of the time series $y_t$:
\[ s_y \approx k\, \bar{y}^{\,1-\lambda}. \tag{4.58} \]
For example, the logarithmic transformation (4.55) with λ = 0 will make the considered time series homogeneous if the relation $s_y \approx k \bar{y}$ holds approximately between the sample standard deviation $s_y$ and the sample mean $\bar{y}$, and similarly for other values of λ.
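The transformation itself is a one-liner in most languages. The following Python sketch is a minimal illustration (the function name `box_cox` is hypothetical); it also checks numerically that the power form approaches the logarithm as λ tends to zero.

```python
import math


def box_cox(y, lam, c=0.0):
    """Box-Cox transform of the values in y with type parameter lam and level parameter c."""
    if any(value + c <= 0 for value in y):
        raise ValueError("y_t + c must be positive for all observations")
    if lam == 0:
        return [math.log(value + c) for value in y]
    return [((value + c) ** lam - 1.0) / lam for value in y]


values = [1.0, math.e, 4.0]
print(box_cox(values, 0))        # logarithmic transformation
print(box_cox(values, 0.5))      # shifted and scaled square root
print(box_cox([5.0], 1e-8)[0], math.log(5.0))   # small lam is close to the logarithm
```

Testing `lam == 0` exactly is adequate when λ is chosen from a short list such as 1, 1/2, 0, −1/2; a numerically estimated λ very close to zero might call for a tolerance instead.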
Fig. 4.7 Choice of type parameter λ in Box–Cox transformation (curves of $s_y(j)$ plotted against $\bar{y}(j)$ for λ < 0, λ = 0, 0 < λ < 1, and λ = 1)
Table 4.8 Choice of type parameter λ in Box–Cox transformation
Shape of curve   Values of λ   Simplified choice of λ   Type of transformation
Constant         λ = 1         λ = 1                    $y_t^{(\lambda)} = y_t$
Concave          0 < λ < 1     λ = 1/2                  $y_t^{(\lambda)} = \sqrt{y_t}$
Line             λ = 0         λ = 0                    $y_t^{(\lambda)} = \ln y_t$
Convex           λ < 0         λ = −1/2                 $y_t^{(\lambda)} = 1/\sqrt{y_t}$
Therefore, in practice it is recommended to divide the given time series into short segments of the same length (logically, the length of segments may be 4 for quarterly time series and 12 for monthly time series). In each segment, the sample mean $\bar{y}(j)$ and the sample standard deviation $s_y(j)$ are calculated (j denotes the jth segment), and one plots a point with these coordinates in the plane (i.e., one has a system of points corresponding to the particular segments). Finally, a smooth curve is fitted subjectively to this system of points in the plane (see Fig. 4.7). According to the shape of such a curve, one selects a value for the parameter λ and decides in this way on the corresponding form of Box–Cox transformation for the given time series (see Table 4.8). The power transformation with λ > 1 gives a hyperbolic shape of the corresponding curve (this case is not usual in practice and is ignored here).
Fig. 4.8 Monthly data 2014M1–2016M12 in Example 4.6 (job applicants kept in the Czech labor office register): (a) before transformation; (b) after logarithmic transformation
Fig. 4.9 Choice of type parameter λ in Box–Cox transformation in Example 4.6 (horizontal axis: average; vertical axis: standard deviation)
Fig. 4.10 Histogram of job applicants kept in the Czech labor office register in Example 4.6: (a) before transformation; (b) after logarithmic transformation
Example 4.6 Let us consider the time series $y_t$ of job applicants kept in the Czech labor office register for the months 2014M1–2016M12 (t = 1, …, 36); see Table 4.4 and Fig. 4.8a. Figure 4.9 for segments of 12 monthly observations indicates that the logarithmic transformation is desirable (i.e., Box–Cox transformation with type parameter λ = 0; see Fig. 4.8b). In this example, Fig. 4.8a, b demonstrates the homogenization of variance after this transformation, and the histograms in Fig. 4.10a, b show that the logarithmic transformation really rectified a skewed distribution to an approximately symmetric one.
⋄
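The segment-based choice of λ can also be supported numerically. The sketch below replaces the subjective curve fitting by a straight-line fit of log standard deviation on log mean across segments, whose slope estimates 1 − λ when $s_y \approx k \bar{y}^{1-\lambda}$ holds; this regression shortcut, the function names `segment_stats` and `suggest_lambda`, and the artificial series are assumptions of the illustration, not the procedure of the text.

```python
import math
from statistics import mean, stdev


def segment_stats(y, seg_len):
    """Sample mean and standard deviation of consecutive segments of length seg_len."""
    segments = [y[i:i + seg_len] for i in range(0, len(y) - seg_len + 1, seg_len)]
    return [(mean(seg), stdev(seg)) for seg in segments]


def suggest_lambda(y, seg_len):
    """Estimate lambda as 1 - slope of log(sd) regressed on log(mean) over segments."""
    points = segment_stats(y, seg_len)
    xs = [math.log(m) for m, _ in points]
    ys = [math.log(s) for _, s in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (v - y_bar) for x, v in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 - slope


pattern = [1.0, 2.0, 3.0, 4.0]
# Standard deviation proportional to the mean: the logarithm (lambda = 0) is indicated.
proportional = [scale * p for scale in (1.0, 2.0, 4.0, 8.0) for p in pattern]
# Standard deviation independent of the mean: no transformation (lambda = 1) is indicated.
constant = [offset + p for offset in (10.0, 20.0, 40.0) for p in pattern]
print(suggest_lambda(proportional, 4), suggest_lambda(constant, 4))
```

The fitted value would normally be rounded to one of the simplified choices 1, 1/2, 0, or −1/2, and the sketch requires positive segment means and standard deviations.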
4.3.2 Transformation Based on Differencing
Other transformations frequently used for time series consist of suitable differencing (see Remark 3.4), which simplifies the decomposition components of the original time series (such transformations can be looked upon as special cases of moving averages from Sect. 3.2). Usually, only a constant remains in the transformed time series, as is the case in the following examples with various decomposition structures:
(a) Linear trend $y_t = \beta_0 + \beta_1 t + \varepsilon_t$:
\[ (1 - B)\, y_t = \Delta y_t \approx \beta_1. \tag{4.59} \]
(b) Polynomial trend $y_t = \beta_0 + \beta_1 t + \dots + \beta_k t^k + \varepsilon_t$:
\[ (1 - B)^k y_t = \Delta^k y_t \approx k!\, \beta_k. \tag{4.60} \]
(c) Additive seasonality $y_t = Sz_t + \varepsilon_t$ (s is the length of season):
\[ (1 - B^s)\, y_t = \Delta_s y_t \approx 0. \tag{4.61} \]
(d) Polynomial trend and additive seasonality $y_t = \beta_0 + \beta_1 t + \dots + \beta_k t^k + Sz_t + \varepsilon_t$ (s is the length of season):
\[ (1 - B)^{k-1} (1 - B^s)\, y_t = \Delta^{k-1} \Delta_s y_t \approx \text{const}. \tag{4.62} \]
(e) Polynomial trend and multiplicative seasonality $y_t = (\beta_0 + \beta_1 t + \dots + \beta_k t^k)\, Sz_t + \varepsilon_t$ (s is the length of season):
\[ (1 - B^s)^{k+1} y_t = (\Delta_s)^{k+1} y_t \approx 0. \tag{4.63} \]
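A quick numerical check of two of these cases on noise-free series may be helpful. The sketch is only an illustration: the helper name `difference` and the artificial series are hypothetical. It uses the facts that seasonal differencing of a linear trend with slope β₁ leaves the constant sβ₁, and that the second difference of a quadratic with leading coefficient β₂ equals 2β₂.

```python
def difference(y, lag=1):
    """Apply (1 - B^lag): return y_t - y_{t-lag} for all available t."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


s = 4
pattern = [1.0, -1.0, 2.0, -2.0]

# Linear trend plus additive seasonality: one seasonal difference leaves a constant.
linear_seasonal = [2.0 + 0.5 * t + pattern[t % s] for t in range(1, 25)]
seasonal_diff = difference(linear_seasonal, lag=s)

# Quadratic trend: two ordinary differences leave a constant.
quadratic = [1.0 + 2.0 * t + 3.0 * t * t for t in range(1, 25)]
second_diff = difference(difference(quadratic))

print(seasonal_diff[:3], second_diff[:3])
```

With observed data the residual component survives differencing as a moving average of the original noise, so the results fluctuate around these constants instead of equalling them.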
4.4 Exercises
Exercise 4.1 Repeat the analysis from Example 4.1 (the simple approach to additive seasonal elimination for the Czech construction production index) only for data since 2013 (hint: I1 = 33.76, I2 = 1.00, I3 = 15.01, I4 = 19.75).
Exercise 4.2 Repeat the analysis from Example 4.2 (the multiplicative elimination of seasonality for the job applicants kept in the Czech labor office register) only for data since 2013 (hint: I1 = 1.084, I2 = 1.082, I3 = 1.053, I4 = 1.000, I5 = 0.962, I6 = 0.948, I7 = 0.970, I8 = 0.969, I9 = 0.970, I10 = 0.964, I11 = 0.976, I12 = 1.035).
Exercise 4.3 In Example 4.3 (the additive elimination of seasonality for the Czech construction production index using the regression approach with dummies) construct the prediction intervals for the year 2017.
Exercise 4.4 Repeat the analysis from Example 4.4 (the multiplicative Holt–Winters' method for the job applicants kept in the Czech labor office register) only for data since 2010 (hint: predictions for 2017: 395068; 390412; 375287; 350837; 332122; 320818; 322058; 315104; 307204; 298323; 296419; 310548).
Exercise 4.5 Apply Fisher's test of periodicity to the time series in Table 5.1 (hint: no significant periodicities at the significance level of 5%, $\hat{\mu}$ = 0.167).
Chapter 5
Residual Component
5.1 Tests of Randomness
Sometimes it seems visually that the analyzed time series does not indicate the presence of any systematic component, so that it is white noise only (even if this white noise may be shifted to a nonzero level). For example, the graphical record of the monthly time series in Table 5.1 plotted for 2015–2017 (t = 1, …, 36) in Fig. 5.1 seems to be white noise. Moreover, sometimes one must assess whether the elimination of systematic components from a decomposed time series has been perfect, i.e., whether some remnants of systematic behavior persist in the estimated residuals (e.g., patterns of trend, seasonality, and the like).
However, a visual decision can be subjective, so that objective statistical tests with fixed significance levels are desirable to test the null hypothesis
\[ H_0: \ y_t \sim \text{iid}. \tag{5.1} \]
This hypothesis is stronger than the white-noise hypothesis since it requires independence and identical distribution (iid). On the other hand, (5.1) does not require a zero level (as is the case for white noise).
In general, tests of this type are called tests of randomness, and they are mostly nonparametric. We will describe some of them briefly. For all of them, it is recommended before initiating the test procedure to adjust the given time series so that in each group of equal neighboring observations (equal approximately in the sense of the applied rounding) one keeps only one observation (the other equal observations in the group are deleted). In each of the following tests, let $y_1, \dots, y_n$ denote the tested time series after this adjustment (i.e., $y_t \ne y_{t+1}$ for all t = 1, …, n − 1). To be on the safe side, we recall that we deal exclusively with time series with continuous states.

https://doi.org/10.1007/978-3-030-46347-2_5
Table 5.1 Time series to be analyzed by tests of randomness (see also Fig. 5.1)
    2015           2016           2017
t    yt        t    yt        t    yt
1    4         13   5         25   2
2    0         14   3         26   8
3    5         15   4         27   5
4    13        16   1         28   5
5    5         17   6         29   21
6    4         18   4         30   3
7    7         19   14        31   4
8    6         20   8         32   4
9    3         21   0         33   11
10   2         22   4         34   2
11   5         23   12        35   5
12   9         24   4         36   8
Fig. 5.1 Time series to be analyzed by tests of randomness (see also Table 5.1)
5.1.1 Test Based on Signs of Differences
This test is based on the number of positive first differences of given time series, i.e.,
on the number of points in which this time series grows (so-called points of growth;
see also the growth function in Sect. 3.1.2.4).
Let $V_t$ be random variables defined as
\[ V_t = \begin{cases} 1 & \text{for } y_t < y_{t+1}, \\ 0 & \text{for } y_t > y_{t+1} \end{cases} \tag{5.2} \]
(the case $y_t = y_{t+1}$ is excluded due to the preliminary adjustment). Under the null hypothesis (5.1), the mean value of the number of positive first differences k (or equivalently the number of points of growth) is then obviously equal to
\[ E(k) = E\left( \sum_{t=1}^{n-1} V_t \right) = \sum_{t=1}^{n-1} \left( 1 \cdot \frac{1}{2} + 0 \cdot \frac{1}{2} \right) = \frac{n-1}{2}, \tag{5.3} \]
since under the null hypothesis both possible orderings of two neighboring values $y_t$ and $y_{t+1}$ have the same probability 1/2. One can derive analogously that the variance of k under the null hypothesis (5.1) fulfills
\[ \operatorname{var}(k) = \frac{n+1}{12}. \tag{5.4} \]
Even if under (5.1) one can tabulate the exact (non-asymptotic) distribution of the random variable k, in practice one prefers the asymptotic version of the test, which is acceptable for higher n. Its critical region is
\[ \frac{|k - (n-1)/2|}{\sqrt{(n+1)/12}} \ge u_{1-\alpha/2}, \tag{5.5} \]
where $u_{1-\alpha/2}$ is the (1 − α/2)-quantile of the standard normal distribution N(0, 1).
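A minimal Python sketch of this test, including the preliminary adjustment of equal neighboring observations, might look as follows. The function names and the default critical value 1.96 (the 0.975-quantile of N(0, 1), i.e., a 5% test) are assumptions of the illustration.

```python
import math


def drop_equal_neighbors(y):
    """Keep one observation from each group of equal neighboring observations."""
    kept = [y[0]]
    for value in y[1:]:
        if value != kept[-1]:
            kept.append(value)
    return kept


def sign_statistic(k, n):
    """Standardized distance of k from its mean (n - 1) / 2 under the null hypothesis."""
    return abs(k - (n - 1) / 2) / math.sqrt((n + 1) / 12)


def difference_sign_test(y, critical=1.96):
    """Return (k, statistic, reject) for the test based on signs of differences."""
    y = drop_equal_neighbors(y)
    n = len(y)
    k = sum(1 for previous, current in zip(y, y[1:]) if previous < current)
    statistic = sign_statistic(k, n)
    return k, statistic, statistic >= critical


# A strictly increasing series has the maximal number of points of growth.
print(difference_sign_test(list(range(36))))
# Statistic for k = 16 points of growth in a series of length 36.
print(round(sign_statistic(16, 36), 3))
```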
5.1.2 Test Based on Turning Points
Let r denote the total number of upper and lower turning points in the tested time series (see Sect. 3.1.1). Analogously to the previous test, one can derive that under the null hypothesis (5.1) it holds
\[ E(r) = \frac{2(n-2)}{3}, \qquad \operatorname{var}(r) = \frac{16n - 29}{90}. \tag{5.6} \]
In practice, one again applies the asymptotic version of the corresponding test with the critical region
\[ \frac{|r - 2(n-2)/3|}{\sqrt{(16n - 29)/90}} \ge u_{1-\alpha/2}. \tag{5.7} \]
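The turning-point count and the standardized statistic can be sketched in Python as follows. The function names are hypothetical, and a turning point is taken here to be an interior observation strictly larger or strictly smaller than both of its neighbors.

```python
import math


def turning_point_statistic(r, n):
    """Standardized distance of r from its mean 2 (n - 2) / 3 under the null hypothesis."""
    return abs(r - 2 * (n - 2) / 3) / math.sqrt((16 * n - 29) / 90)


def turning_point_test(y, critical=1.96):
    """Return (r, statistic, reject) for the test based on turning points."""
    n = len(y)
    r = sum(1 for a, b, c in zip(y, y[1:], y[2:])
            if (b > a and b > c) or (b < a and b < c))
    statistic = turning_point_statistic(r, n)
    return r, statistic, statistic >= critical


# A strictly alternating series turns at every interior point.
zigzag = [t % 2 for t in range(36)]
print(turning_point_test(zigzag))
# Statistic for r = 25 turning points in a series of length 36.
print(round(turning_point_statistic(25, 36), 3))
```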
5.1.3 Test Based on Kendall Rank Correlation Coefficient τ
This test makes use of the Kendall rank correlation coefficient τ (or briefly Kendall's tau), which was originally suggested as a measure of ordinal association between two observed quantities. In our context, it has the form
\[ \tau = \frac{4v}{n(n-1)} - 1, \tag{5.8} \]
where v denotes the number of pairs $y_s$ and $y_t$ in the given time series $y_1, \dots, y_n$ fulfilling $y_s < y_t$ for s < t. The formula (5.8) for τ standardizes v in such a way that −1 ≤ τ ≤ 1, and under the null hypothesis (5.1) it holds
\[ E(\tau) = 0, \qquad \operatorname{var}(\tau) = \frac{2(2n+5)}{9n(n-1)}. \tag{5.9} \]
In practice, one applies mainly the asymptotic version of the corresponding test with critical region
\[ \frac{|\tau|}{\sqrt{\frac{2(2n+5)}{9n(n-1)}}} \ge u_{1-\alpha/2}. \tag{5.10} \]
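The count v and the resulting statistic can be sketched in Python as follows. The function names are hypothetical, and the pair count uses a plain double loop, which is adequate for short series.

```python
import math


def kendall_statistic(v, n):
    """Return (tau, standardized statistic) for v concordant pairs in a series of length n."""
    tau = 4 * v / (n * (n - 1)) - 1
    return tau, abs(tau) / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))


def kendall_test(y, critical=1.96):
    """Return (v, tau, statistic, reject) for the test based on Kendall's tau."""
    n = len(y)
    v = sum(1 for s in range(n) for t in range(s + 1, n) if y[s] < y[t])
    tau, statistic = kendall_statistic(v, n)
    return v, tau, statistic, statistic >= critical


print(kendall_test(list(range(10))))          # increasing series: tau = 1
print(kendall_test(list(range(10, 0, -1))))   # decreasing series: tau = -1
# Statistic for v = 297 such pairs in a series of length 36.
print(kendall_statistic(297, 36))
```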
5.1.4 Test Based on Spearman Rank Correlation Coefficient ρ
Let $q_1, \dots, q_n$ denote the ranks of the values of the given time series. For example, if $y_1 = 10$, $y_2 = -6$, $y_3 = 2$, $y_4 = -6$, then $q_1 = 4$, $q_2 = 1$, $q_3 = 3$, $q_4 = 2$ (sometimes one uses fractional ranks with rank averages for equal values, i.e., $q_1 = 4$, $q_2 = 1.5$, $q_3 = 3$, $q_4 = 1.5$). Then the Spearman rank correlation coefficient ρ (or briefly Spearman's rho), suggested similarly to τ to measure the statistical dependence between the rankings of two observed variables, can be calculated as
\[ \rho = 1 - \frac{6}{n(n^2 - 1)} \sum_{i=1}^{n} (i - q_i)^2 \tag{5.11} \]
(in our context, one of the rankings is obviously the natural one 1, 2, …, n). Even if tabulated critical values $r_{1-\alpha/2}$ fulfilling $P(|\rho| \ge r_{1-\alpha/2}) \le \alpha$ under the null hypothesis (5.1) are easily available nowadays, in practice the asymptotic version of the corresponding test is again preferred, with critical region
\[ \sqrt{n-1}\, |\rho| \ge u_{1-\alpha/2}. \tag{5.12} \]
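A small Python sketch of the rank computation and of the test statistic follows. The function names are hypothetical, and ties are broken by order of appearance (integer ranks), which is one of the two conventions mentioned above.

```python
import math


def ranks(y):
    """Integer ranks 1, ..., n of the values in y; ties are broken by order of appearance."""
    order = sorted(range(len(y)), key=lambda i: y[i])   # sorted() is stable
    q = [0] * len(y)
    for rank, index in enumerate(order, start=1):
        q[index] = rank
    return q


def spearman_statistic(rho, n):
    """Asymptotic test statistic sqrt(n - 1) * |rho|."""
    return math.sqrt(n - 1) * abs(rho)


def spearman_test(y, critical=1.96):
    """Return (rho, statistic, reject) for the test based on Spearman's rho."""
    n = len(y)
    q = ranks(y)
    rho = 1 - 6 * sum((i - qi) ** 2 for i, qi in enumerate(q, start=1)) / (n * (n * n - 1))
    statistic = spearman_statistic(rho, n)
    return rho, statistic, statistic >= critical


print(ranks([10, -6, 2, -6]))
print(spearman_test([5, 4, 3, 2, 1]))   # perfectly decreasing series: rho = -1
# Statistic for |rho| = 0.051 in a series of length 36.
print(round(spearman_statistic(0.051, 36), 3))
```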
5.1.5 Test Based on Numbers of Runs Above and Below Median
In this test, one must construct the sample median M of the observations in the given time series (therefore it is sometimes called the median test). Graphically, this means that we look for a line parallel to the time axis such that the numbers of observations above and below it are the same (see Fig. 5.2).
Sometimes several observations of the given time series must lie on this line. In other situations (see, e.g., Fig. 5.2), such a line cannot even be constructed: then it is recommended to shift arbitrary observations from the line to the region (above or below) with the smaller number of observations so that the line becomes the correct median (in Fig. 5.2, we have to shift one observation downward). Now we ignore all observations on the line and pool the others into groups called runs in such a way that all neighboring observations lying above or below the line create one particular run (see Fig. 5.2). Let us denote the number of runs by u and the number of observations above (or equivalently below) the median line by m.
Even if the critical values of this test are tabulated, in practice one usually makes use of the asymptotic version of this test with the following critical region:
\[ \frac{|u - (m+1)|}{\sqrt{\frac{m(m-1)}{2m-1}}} \ge u_{1-\alpha/2}. \tag{5.13} \]
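The runs count can be sketched in Python as follows. The function names are hypothetical, and the sketch handles only the simple situation in which the numbers of observations above and below the sample median coincide after the observations on the median line are ignored; the shifting of observations described above is not implemented and triggers an error instead.

```python
import math
from statistics import median


def runs_statistic(u, m):
    """Standardized distance of the number of runs u from m + 1."""
    return abs(u - (m + 1)) / math.sqrt(m * (m - 1) / (2 * m - 1))


def median_runs_test(y, critical=1.96):
    """Return (M, m, u, statistic, reject) for the runs test around the median."""
    med = median(y)
    above = [value > med for value in y if value != med]   # observations on the line are ignored
    m = sum(above)
    if m != len(above) - m:
        raise ValueError("unequal counts above and below the median; shift observations first")
    u = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    statistic = runs_statistic(u, m)
    return med, m, u, statistic, statistic >= critical


# A series alternating around its median has the maximal number of runs.
print(median_runs_test([1, 9, 2, 8, 3, 7, 4, 6]))
# Statistic for u = 23 runs with m = 17 observations on each side.
print(round(runs_statistic(23, 17), 3))
```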
Fig. 5.2 Median test (runs of observations below (−) and above (+) the median line along the time axis t: − − + + − + − + − +)

Example 5.1 The time series from Table 5.1 and Fig. 5.1 has the length n = 36 (obviously the preliminary adjustment recommended in the previous text is not necessary). The particular tests of randomness (each of them with the significance level of 5% and the corresponding critical value $u_{0.975} = 1.96$) give the following results:
• Test based on signs of differences: one obtains according to (5.5) with k = 16
\[ \frac{|16 - (36-1)/2|}{\sqrt{(36+1)/12}} = 0.854 < 1.96; \]
• Test based on turning points: one obtains according to (5.7) with r = 25
\[ \frac{|25 - 2(36-2)/3|}{\sqrt{(16 \cdot 36 - 29)/90}} = 0.946 < 1.96; \]
• Test based on τ: here v = 297 gives
\[ \tau = \frac{4 \cdot 297}{36 \cdot 35} - 1 = -0.057, \]
so that according to (5.10) one obtains
\[ \frac{|-0.057|}{\sqrt{\frac{2(2 \cdot 36 + 5)}{9 \cdot 36\,(36-1)}}} = 0.489 < 1.96; \]
• Test based on ρ: here ρ = 0.051, so that according to (5.12) one obtains
\[ \sqrt{36-1}\; |0.051| = 0.302 < 1.96; \]
• Median test: the given time series has M = 2 (here $y_{10} = y_{34} = 2$, so that there is no need to shift observations), m = 17 and u = 23, which implies according to (5.13)
\[ \frac{|23 - (17+1)|}{\sqrt{\frac{17(17-1)}{2 \cdot 17 - 1}}} = 1.742 < 1.96. \]
Obviously the null hypothesis that the observations in Table 5.1 are iid could not be rejected at the significance level of 5% by any of the applied tests of randomness.
⋄
Remark 5.1 The recommended choice of a test may be subjective, depending on our suspicion of which systematic behavior survives in the residuals, e.g.:
• If we suspect that a linear trend remains in the residuals, then the test based on signs of differences, the test based on τ, and the test based on ρ are recommended.
• If we suspect that a periodicity remains in the residuals, then the test based on turning points and the median test are recommended (e.g., the test based on signs of differences is not suitable for residuals with remaining periodicity since in such a case obviously k ≈ n/2, so that the test statistic (5.5) lies close to zero and the test has low power).
5.2 Exercises
Exercise 5.1 Derive the formula (5.4) for the variance of the test statistic k in the test of randomness based on signs of differences. Hint: under $H_0$ it holds $\operatorname{var}(V_t) = 1/4$ and $\operatorname{cov}(V_t, V_{t+1}) = -1/12$ in the variance of k:
\[ \operatorname{var}(k) = \operatorname{var}\left( \sum_{t=1}^{n-1} V_t \right) = \sum_{t=1}^{n-1} \operatorname{var}(V_t) + 2 \sum_{t=1}^{n-2} \operatorname{cov}(V_t, V_{t+1}). \]
Exercise 5.2 Simulate white noise N(0, 1) with length of 100 and apply five tests of
randomness from Sect. 5.1 to it.
Part III
Autocorrelation Methods for Univariate Time Series
Chapter 6
Box–Jenkins Methodology
This chapter is devoted to the so-called Box–Jenkins methodology, which applies special stochastic models (ARMA, ARIMA, SARIMA, and others) to time series analysis (e.g., to time series prediction). It enables us to model satisfactorily time series with general courses that cannot be handled by the classical decomposition approach (see also Sect. 2.2.2). The methodology is named after the well-known monograph by Box and Jenkins (1970), whose authors summarized the knowledge on this issue available at that time and transferred theoretical results to algorithmic form. The methodology has some typical features: in particular, it prefers (auto)correlation analysis as the main instrument of time series analysis and it models the trend and seasonality in a stochastic way. It follows that time series with strongly (auto)correlated observations can be studied using this approach. Indeed, linear models such as ARMA offer the most popular approach to routine correlation among observations in time (however, financial time series require specific nonlinear modifications of linear models; see, e.g., GARCH models in Sect. 8.3, even if the basic principles are the same). In this chapter, we describe the given issue in a systematic way. We start by introducing some pros and cons of Box–Jenkins methodology:
(+) Stochastic models of the ARMA type are flexible enough to model time
series of a general form.
(+) One can document plenty of successful applications of this approach.
(+) Software based on Box–Jenkins methodology is nowadays easily available in most
econometric and statistical packages.
(+) So far, no better routine instruments exist for the analysis of time-dependent
observations.
(–) Box–Jenkins methodology requires longer time series (a minimal length of fifty
observations is recommended, which is not usually a problem in the case of
financial time series).
(–) Box–Jenkins methodology cannot be applied to real data without access to
relevant software and instructions.

https://doi.org/10.1007/978-3-030-46347-2_6
(–) The interpretation of the constructed models is often not easy; typically, laymen
ask how it is possible that their data are modeled by combining random shocks;
numerical outputs (e.g., predictions) may serve as acceptable arguments in such
cases.
References for more comprehensive study are, e.g., Brockwell and Davis (1993,
1996), Hamilton (1994), and others.
6.1 Autocorrelation Properties of Time Series
6.1.1 Stationarity
Generally speaking, the stationarity of a time series {yt} means that the behavior of
this series is stable in a specific way. One usually distinguishes two cases:
• Strict stationarity means that the probability behavior of the corresponding
stochastic process is invariant to shifts in time, i.e., the probability distribution
of the random vector $(y_{t_1}, \ldots, y_{t_k})$ is the same as the distribution of the vector
$(y_{t_1+h}, \ldots, y_{t_k+h})$ for arbitrary h.
• (Weak) stationarity is not as restrictive as strict stationarity, since the invariance
to time shifts is required only for the first and second moments, i.e., it must hold
for each s and t
$$E(y_t) = \mu = \text{const}, \tag{6.1}$$
$$\operatorname{cov}(y_s, y_t) = E(y_s - \mu)(y_t - \mu) = \operatorname{cov}(y_{s+h}, y_{t+h}) \quad \text{for arbitrary } h, \tag{6.2}$$
i.e., particularly
$$\operatorname{var}(y_t) = \sigma_y^2 = \text{const}. \tag{6.3}$$
In other words, the level and variance of stationary time series are constant in time. A
trend, seasonality or non-constant variance (volatility) is incompatible with
stationarity and should be removed from time series to make it stationary. Also the
covariance structure of stationary time series must be invariable in time (e.g., the
character of dependence between the first and second quarter of stationary quarterly
series must be the same in all years).
Remark 6.1 If finite second moments of a given process exist, then obviously
strict stationarity implies the weak one. Moreover, if such a process is normal (i.e.,
each finite sample from this process has a joint normal distribution), then both
types of stationarity are equivalent.
⋄
This text deals only with weak stationarity, which will be referred to simply as
stationarity. When introducing Box–Jenkins methodology, it is suitable to start with
models of stationary time series. The text respects this methodological recommendation,
which remains valid until stated otherwise. The concepts of autocovariance and
autocorrelation functions will also be introduced only for stationary time series.
6.1.2 Autocovariance and Autocorrelation Function
A typical feature of time series is frequently a strong correlatedness among
observations in time. For instance, if the value of 3-month LIBOR for a given day
was 1.45 % p.a., then it will probably range between 1.40 and 1.50 % in the next days
(and not somewhere around 3 %). The common instruments to describe this phenomenon
quantitatively are the autocovariance and autocorrelation functions:
Autocovariance function for lag k (simply autocovariance for lag k) is defined as
$$\gamma_k = \operatorname{cov}(y_t, y_{t-k}) = E(y_t - \mu)(y_{t-k} - \mu), \quad k = \ldots, -1, 0, 1, \ldots. \tag{6.4}$$
Analogously, autocorrelation function for lag k (simply autocorrelation for lag
k, usually abbreviated in software systems as ACF) is defined as
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\gamma_k}{\sigma_y^2}, \quad k = \ldots, -1, 0, 1, \ldots. \tag{6.5}$$
Remark 6.2 The term “autocorrelation” for $\rho_k$ in (6.5) is correct, since due to
stationarity one can write
$$\rho_k = \frac{\gamma_k}{\sigma_y^2} = \frac{\operatorname{cov}(y_t, y_{t-k})}{\sqrt{\operatorname{var}(y_t)}\,\sqrt{\operatorname{var}(y_{t-k})}} = \operatorname{corr}(y_t, y_{t-k}). \tag{6.6}$$
Further, the autocovariance and autocorrelation functions are obviously even
functions (i.e., $\gamma_k = \gamma_{-k}$ and $\rho_k = \rho_{-k}$), so that their domain of definition can be
restricted to $k \ge 0$. It always holds that $\rho_0 = 1$ and $|\rho_k| \le 1$. The graphical plot of $\rho_k$
against k is called the correlogram (see, e.g., Fig. 6.1). In fact, the correlogram
describes by means of several values (e.g., $\rho_1, \ldots, \rho_{10}$ from Fig. 6.1) the short-term
dynamics of the given time series (on the contrary, the long-term dynamics may be
reflected by the trend). In the stationary model, the value $y_t$ is explained just by several
preceding values $y_{t-k}$ ($k \ge 1$) through their correlations $\rho_k$ with $y_t$. This stands in
contrast to the classical model of linear regression, where the regressand y is explained
by means of other (exogenous) variables x; the classical OLS estimate (see, e.g.,
Sect. 3.1.2) includes the correlation between x and y.
⋄
Fig. 6.1 Example of correlogram (ρk plotted against lags k = 0, 1, ..., 10)
6.1.3 Estimated Autocovariance and Autocorrelation Function
For a given stationary time series, one usually constructs the estimated mean value
$$\bar{y} = \frac{1}{n}\sum_{t=1}^{n} y_t, \tag{6.7}$$
the estimated autocovariance function
$$c_k = \frac{1}{n}\sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y}), \quad k = 0, 1, \ldots, n-1, \tag{6.8}$$
and the estimated autocorrelation function
$$r_k = \frac{c_k}{c_0}, \quad k = 0, 1, \ldots, n-1. \tag{6.9}$$
Remark 6.3 The estimates (6.7)–(6.9) are applicable if empirical recommendations
are fulfilled, namely n > 50 and k < n/4 (however, these requirements are not
usually respected in practice). Sometimes, the divisor n in formula (6.8) is replaced
by n – k: in this way, one achieves a lower bias of the estimate $c_k$ (i.e., $E(c_k)$ will be
closer to the theoretical value $\gamma_k$), but on the other hand, the mean squared error of
$c_k$ defined as $E(c_k - \gamma_k)^2$ will increase. In any case, for n going to infinity, both $E(c_k)$
and $E(r_k)$ approach $\gamma_k$ and $\rho_k$, respectively, so that these estimates are
asymptotically unbiased.
⋄
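The estimates (6.7)–(6.9) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sample_acf` and the toy series are assumptions made here, and NumPy is used for the arithmetic.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Estimated autocorrelations r_0, ..., r_max_lag following (6.7)-(6.9)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()                      # deviations from the estimated mean (6.7)
    # c_k with divisor n as in (6.8); the sum runs over t = k+1, ..., n
    c = np.array([np.sum(d[k:] * d[:n - k]) / n for k in range(max_lag + 1)])
    return c / c[0]                       # r_k = c_k / c_0 as in (6.9)

# Toy series (hypothetical data, far shorter than the recommended n > 50)
r = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
print(r)
```

Replacing the divisor `n` by `n - k` inside the list comprehension would give the lower-bias variant mentioned in Remark 6.3.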
In the framework of Box–Jenkins methodology, the behavior of autocorrelation
function is an important instrument that helps to indicate which type of model is
suitable for the given time series; one says that this behavior identifies the
corresponding model. In addition, it is also important to find a value k0 such that
all $\rho_k$ are zero for $k > k_0$ ($k_0$ is then called the truncation point), or to conclude that such
a point $k_0$ does not exist at all. For example, in a model of the form
$$y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} \tag{6.10}$$
($\varepsilon_t$ is white noise (see (2.1)) and $\theta_1$ is a parameter (see Sect. 6.2)) it holds
$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2}, \quad \rho_k = 0 \ \text{for } k > 1, \tag{6.11}$$
so that $k_0 = 1$ (in Box–Jenkins methodology this truncation point identifies the
model (6.10) denoted as MA(1)). However, the theoretical autocorrelations $\rho_k$ are
unknown for observed time series. Therefore, in practice one must replace $\rho_k$ by $r_k$,
which are easily estimated according to (6.8) and (6.9). In particular, it is important to decide
how close to zero a small $r_k$ must lie for the null hypothesis
$\rho_k = 0$ not to be rejected (with a given significance level). For such a test, the so-called Bartlett’s
approximation is recommended: if $\rho_k = 0$ for $k > k_0$, then under specific assumptions
it holds (asymptotically with growing length n of the time series; see, e.g., Brockwell
and Davis (1993), Theorem 7.2.2)
$$r_k \sim N\left(0, \frac{1}{n}\left(1 + 2\sum_{j=1}^{k_0} r_j^2\right)\right) \quad \text{for } k > k_0. \tag{6.12}$$
6.1.4 Partial Autocorrelation Function and Its Estimate
In addition to the autocorrelation function $\rho_k$, Box–Jenkins methodology also
makes use of the partial autocorrelation function denoted as $\rho_{kk}$ (and abbreviated
as PACF in software). The value $\rho_{kk}$ is defined as the partial correlation coefficient
between $y_t$ and $y_{t-k}$ under fixed values $y_{t-k+1}, \ldots, y_{t-1}$ (e.g., the symbol $\rho_{22}$ could
be replaced by the more correct though tedious notation $\rho_{13\cdot 2}$). Obviously, it holds
$\rho_{00} = 1$ and $\rho_{11} = \rho_1$.
Due to the definition of partial autocorrelation $\rho_{kk}$, its natural estimate $r_{kk}$ is the
estimated parameter $\hat{\varphi}_{kk}$ in the model
$$y_t = \delta + \varphi_{k1} y_{t-1} + \varphi_{k2} y_{t-2} + \ldots + \varphi_{kk} y_{t-k} + \varepsilon_t. \tag{6.13}$$
However, in practice one usually applies the following (Durbin–Levinson) recursive
algorithm:
$$r_{11} = r_1, \quad r_{kk} = \frac{r_k - \sum_{j=1}^{k-1} r_{k-1,j}\, r_{k-j}}{1 - \sum_{j=1}^{k-1} r_{k-1,j}\, r_j} \quad \text{for } k > 1, \tag{6.14}$$
where
$$r_{kj} = r_{k-1,j} - r_{kk}\, r_{k-1,k-j} \quad \text{for } j = 1, \ldots, k-1. \tag{6.15}$$
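The recursion (6.14)–(6.15) translates almost literally into code. The following is a minimal sketch, assuming the autocorrelations are supplied as an array with `r[0] = 1`; the function name `pacf_durbin_levinson` is a label chosen here, not taken from any package.

```python
import numpy as np

def pacf_durbin_levinson(r, max_lag):
    """Partial autocorrelations r_00, r_11, ..., r_KK from autocorrelations r[0..K]."""
    r = np.asarray(r, dtype=float)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    prev = np.zeros(0)                    # holds r_{k-1,1}, ..., r_{k-1,k-1}
    for k in range(1, max_lag + 1):
        j = np.arange(1, k)               # j = 1, ..., k-1 (empty for k = 1)
        num = r[k] - np.sum(prev * r[k - j])
        den = 1.0 - np.sum(prev * r[j])
        rkk = num / den                   # (6.14); reduces to r_1 for k = 1
        # (6.15): r_{kj} = r_{k-1,j} - r_kk * r_{k-1,k-j}, then append r_kk
        prev = np.append(prev - rkk * prev[::-1], rkk)
        pacf[k] = rkk
    return pacf

# Autocorrelations rho_k = 0.6**k (the pattern derived later for AR(1))
print(pacf_durbin_levinson(0.6 ** np.arange(6), 5))
```

Feeding in estimated values $r_k$ gives the estimated partial correlogram; feeding in theoretical values $\rho_k$ gives the theoretical one.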
Similarly to the autocorrelation function, truncation points can exist for the
partial autocorrelation function as well (e.g., for autoregressive processes), so that
$\rho_{kk}$ is again an important identifying instrument. In this case, one can apply the so-called
Quenouille’s approximation: if $\rho_{kk} = 0$ for $k > k_0$, then under specific assumptions
it again holds (asymptotically with growing length n)
$$r_{kk} \sim N\left(0, \frac{1}{n}\right) \quad \text{for } k > k_0. \tag{6.16}$$
6.2 Basic Processes of Box–Jenkins Methodology
6.2.1 Linear Process
The theoretical basis of Box–Jenkins methodology (though not important for
mastering it practically, so that practitioners can skip this theoretical concept) is
the so-called linear process defined as
$$y_t = \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \ldots = \left(1 + \psi_1 B + \psi_2 B^2 + \ldots\right)\varepsilon_t = \psi(B)\varepsilon_t, \tag{6.17}$$
where $\{\varepsilon_t\}$ is white noise (i.e., a sequence of uncorrelated random variables
with zero mean values and constant (finite) variances $\sigma^2 > 0$; see (2.1)) and B is the lag
operator (see Remark 3.4: the transcription of the models of Box–Jenkins methodology
by means of the operators B and Δ is popular due to its simplicity, e.g., in
(6.17) one constructs a power series ψ(B) applying formally the operator B as if it were
the variable z in the classical power series ψ(z)). Moreover, one assumes that
$$\psi(z) \ \text{converges for } |z| \le 1 \ \text{(i.e., on and inside the unit circle in the complex plane)} \tag{6.18}$$
(see Brockwell and Davis (1996)). One can show under this assumption that the
infinite series of random variables (6.17) for particular times t converge in the sense
of convergence in mean square and the limits form a stationary process with zero
mean value ($E(y_t) = 0$). Some authors (see, e.g., Davidson (2000)) assume more
strongly that $\varepsilon_t \sim \text{iid}(0, \sigma^2)$ in the linear process.
Another expression of the linear process (6.17), which can be useful especially
when constructing predictions, is possible for the so-called invertible process. In this
case, one can rewrite (6.17) in the form
$$y_t = \pi_1 y_{t-1} + \pi_2 y_{t-2} + \ldots + \varepsilon_t, \quad \text{i.e.,} \quad \varepsilon_t = y_t - \pi_1 y_{t-1} - \pi_2 y_{t-2} - \ldots = \pi(B) y_t. \tag{6.19}$$
The sufficient condition of invertibility is analogous to the assumption (6.18),
namely the power series
$$\pi(z) \ \text{converges for } |z| \le 1 \ \text{(i.e., on and inside the unit circle in the complex plane)}. \tag{6.20}$$
Remark 6.4 There are many reasons why models based on the principle of the
linear process are suitable to model reality. For instance, let us consider a stationary
process $\{y_t\}$ with zero mean value and let us predict the value $y_t$ on the basis of the past
values $Y_{t-1} = \{y_{t-1}, y_{t-2}, \ldots\}$. Then the optimal prediction (in the sense of minimal
mean squared error MSE in (2.11)) is $E(y_t \mid Y_{t-1})$. The error of this prediction
$$e_t = y_t - E(y_t \mid Y_{t-1}) \tag{6.21}$$
has the properties of white noise. One calls it the innovation (this name is logical since the
innovation process $\{e_t\}$ corresponds to unpredictable movements in the values $\{y_t\}$).
Moreover, if the process $\{y_t\}$ is normal, then the conditional mean value $E(y_t \mid Y_{t-1})$
has the form of a linear combination of the values $y_{t-1}, y_{t-2}, \ldots$, and (6.21) can be
rewritten as
$$e_t = y_t - \pi_1 y_{t-1} - \pi_2 y_{t-2} - \ldots. \tag{6.22}$$
Obviously, it is just the inverted form (6.19) of the linear process.

⋄
Remark 6.5 Since $\varepsilon_t = \pi(B) y_t = \pi(B)\psi(B)\varepsilon_t$, it must hold
$$\psi(B)\,\pi(B) = 1, \tag{6.23}$$
i.e., $\psi_1 - \pi_1 = 0$, $\psi_2 - \psi_1\pi_1 - \pi_2 = 0$, etc. These relations transform the parameters
$\{\psi_j\}$ to $\{\pi_j\}$, and vice versa. Formally, one can also write $\pi(B) = \psi(B)^{-1}$.
⋄
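The relations in Remark 6.5 can be applied recursively: comparing the coefficients of $B^k$ in (6.23) gives $\pi_k = \psi_k - \sum_{j=1}^{k-1} \psi_{k-j}\pi_j$. A small sketch follows; the function name `pi_weights` and the numeric example (weights $\psi_1 = 0.5$, $\psi_j = 0$ for $j > 1$) are assumptions for illustration only.

```python
def pi_weights(psi, m):
    """First m weights pi_1, ..., pi_m of the inverted form (6.19).

    psi is the list [psi_1, psi_2, ...]; weights beyond its end are taken as zero.
    """
    def psi_at(j):                        # psi_j for j >= 1
        return psi[j - 1] if j <= len(psi) else 0.0

    pi = []
    for k in range(1, m + 1):
        # coefficient of B^k in psi(B) * pi(B) = 1, see (6.23)
        pi.append(psi_at(k) - sum(psi_at(k - j) * pi[j - 1] for j in range(1, k)))
    return pi

# psi_1 = 0.5 and all further psi_j = 0 (hypothetical example)
print(pi_weights([0.5], 4))
```

For this example the weights alternate in sign and shrink geometrically, which is the pattern one expects whenever the series $\pi(z)$ converges as required in (6.20).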
6.2.2 Moving Average Process MA
It must be stressed at the outset that MA models have nothing to do with the method of
moving averages for trend elimination (see Sect. 3.2). The moving average process of
order q denoted as MA(q) has the form
$$y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \ldots + \theta_q \varepsilon_{t-q} = \theta(B)\varepsilon_t, \tag{6.24}$$
where $\theta_1, \ldots, \theta_q$ are parameters and $\theta(B) = 1 + \theta_1 B + \ldots + \theta_q B^q$ is the moving average
operator (obviously, MA(q) originates by truncating the linear process (6.17) after
the lag q).
The process MA(q) is always stationary with zero mean value, variance
$$\sigma_y^2 = \left(1 + \theta_1^2 + \ldots + \theta_q^2\right)\sigma^2 \tag{6.25}$$
and autocorrelation function
$$\rho_k = \begin{cases} \dfrac{\theta_k + \theta_1\theta_{k+1} + \ldots + \theta_{q-k}\theta_q}{1 + \theta_1^2 + \ldots + \theta_q^2} & \text{for } k = 1, \ldots, q, \\[2ex] 0 & \text{for } k > q \end{cases} \tag{6.26}$$
(apparently the autocorrelation function has the truncation point $k_0$ equal to the
model order q). The partial autocorrelation function $\rho_{kk}$ of the process MA(q) has no
truncation point, but it is bounded by a linear combination of geometrically decreasing
sequences and sinusoids with geometrically decreasing amplitudes.
The process MA(q) is invertible if all roots z1, ..., zq of polynomial θ(z) lie outside
the unit circle in complex plane (i.e., |z1|, ..., |zq| > 1, since then the assumption (6.20)
is fulfilled).
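Both the autocorrelation formula (6.26) and the root condition for invertibility are easy to evaluate numerically. The sketch below uses NumPy; the helper names `ma_acf` and `ma_is_invertible` and the parameter values are illustrative assumptions.

```python
import numpy as np

def ma_acf(theta, max_lag):
    """Theoretical autocorrelations rho_0, ..., rho_max_lag of MA(q), see (6.26)."""
    t = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # theta_0 = 1
    q = len(t) - 1
    denom = np.sum(t ** 2)                # 1 + theta_1^2 + ... + theta_q^2
    rho = np.zeros(max_lag + 1)           # zeros beyond lag q: truncation point k0 = q
    for k in range(min(q, max_lag) + 1):
        rho[k] = np.sum(t[: q + 1 - k] * t[k:]) / denom
    return rho

def ma_is_invertible(theta):
    """True if all roots of theta(z) = 1 + theta_1 z + ... + theta_q z^q lie outside the unit circle."""
    coeffs = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    roots = np.roots(coeffs[::-1])        # np.roots expects the highest power first
    return bool(np.all(np.abs(roots) > 1.0))

print(ma_acf([0.5, 0.3], 4))              # hypothetical MA(2) parameters
print(ma_is_invertible([0.5, 0.3]), ma_is_invertible([2.0]))
```

The second call illustrates a non-invertible case: for $\theta_1 = 2$ the only root of $1 + 2z$ is $z = -0.5$, which lies inside the unit circle.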
Remark 6.6 The process MA(1) (see (6.10)) has the autocorrelation function (6.11)
with truncation point $k_0 = 1$. Its partial autocorrelation function (without truncation
point) is
$$\rho_{kk} = \frac{(-1)^{k-1}\,\theta_1^k\left(1 - \theta_1^2\right)}{1 - \theta_1^{2(k+1)}} \quad \text{for } k = 1, 2, \ldots, \tag{6.27}$$
so that in the case of process invertibility, $\rho_{kk}$ is really bounded by a geometrically
decreasing sequence ($|\rho_{kk}| < |\theta_1|^k$). Indeed, the invertibility condition (6.20) has
the simple form $|\theta_1| < 1$ here. Since $\rho_1 = \theta_1/(1 + \theta_1^2)$, it must be $|\rho_1| < 1/2$ for an
invertible MA(1) process (this inequality holds even for all $|\theta_1| \ne 1$).
⋄
Remark 6.7 The process MA(2)
$$y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} \tag{6.28}$$
has the autocorrelation function
$$\rho_k = \begin{cases} \dfrac{\theta_1(1 + \theta_2)}{1 + \theta_1^2 + \theta_2^2} & \text{for } k = 1, \\[2ex] \dfrac{\theta_2}{1 + \theta_1^2 + \theta_2^2} & \text{for } k = 2, \\[2ex] 0 & \text{for } k > 2 \end{cases} \tag{6.29}$$
with truncation point $k_0 = 2$. The invertibility condition (6.20) for the MA(2) process
has the form
$$\theta_1 + \theta_2 > -1, \quad \theta_2 - \theta_1 > -1, \quad -1 < \theta_2 < 1, \tag{6.30}$$
so that the invertibility region of MA(2) (in the plane with horizontal axis for
values θ1 and vertical axis for values θ2) is the interior of the triangle with vertices
(–2, 1), (0, –1), and (2, 1).
⋄
6.2.3 Autoregressive Process AR
The autoregressive process of order p denoted as AR(p) has the form
$$y_t = \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t, \quad \text{i.e.,} \quad y_t - \varphi_1 y_{t-1} - \ldots - \varphi_p y_{t-p} = \varphi(B) y_t = \varepsilon_t, \tag{6.31}$$
where $\varphi_1, \ldots, \varphi_p$ are parameters and $\varphi(B) = 1 - \varphi_1 B - \ldots - \varphi_p B^p$ is the autoregressive
operator (it originates by truncating the inverted linear process (6.19) after the
lag p).
The process AR(p) is stationary if all roots $z_1, \ldots, z_p$ of the polynomial $\varphi(z)$ lie
outside the unit circle in the complex plane (i.e., $|z_1|, \ldots, |z_p| > 1$, since then the
assumption (6.18) is fulfilled). In such a case the process has zero mean value and
variance
$$\sigma_y^2 = \frac{\sigma^2}{1 - \varphi_1\rho_1 - \ldots - \varphi_p\rho_p} \tag{6.32}$$
and its autocorrelation function fulfills the following difference equation:
$$\rho_k = \varphi_1\rho_{k-1} + \varphi_2\rho_{k-2} + \ldots + \varphi_p\rho_{k-p} \quad \text{for } k > 0 \tag{6.33}$$
(to derive (6.33) it suffices to multiply all terms in the equality (6.31) by the value
$y_{t-k}/\sigma_y^2$ and to calculate mean values on both sides; moreover, $E(y_{t-k}\varepsilon_t) = 0$
for k > 0 since the stationary process AR(p) can be expressed as the linear
process). Due to the theory of difference equations (see Brockwell and Davis
(1993), Section 3.6) the solution of (6.33) can be expressed in the form
$$\rho_k = \alpha_1 z_1^{-k} + \alpha_2 z_2^{-k} + \ldots + \alpha_p z_p^{-k} \quad \text{for } k \ge 0, \tag{6.34}$$
where $z_1, \ldots, z_p$ are mutually distinct roots of the polynomial $\varphi(z)$ ($|z_1|, \ldots, |z_p| > 1$; see
above) and $\alpha_1, \ldots, \alpha_p$ are fixed coefficients (if the roots $z_i$ and $z_j$ are complex
conjugate, then the corresponding pair of terms can be replaced by a single term of the type
$\alpha d^k \sin(\lambda k + \phi)$ with 0 < d < 1; similarly, if the roots $z_1, \ldots, z_p$ are not mutually distinct, then all terms with
a multiple root $z_i$ of multiplicity r must be replaced in (6.34) by a more complex term
$(\beta_0 + \beta_1 k + \ldots + \beta_{r-1} k^{r-1}) z_i^{-k}$, whose behavior is always dominated by
the term $z_i^{-k}$ for higher k). In any case, the autocorrelation function of the process AR(p) can
be approximated by a linear combination of geometrically decreasing sequences and
sinusoids with geometrically decreasing amplitudes (see, e.g., Fig. 6.4).
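Remark 6.8 If we write (6.33) only for k = 1, ..., p, then we obtain the so-called system
of Yule-Walker equations for the unknown parameters $\varphi_1, \ldots, \varphi_p$ by means of the
autocorrelations $\rho_1, \ldots, \rho_p$ (or vice versa)
$$\begin{aligned} \rho_1 &= \varphi_1 + \varphi_2\rho_1 + \ldots + \varphi_p\rho_{p-1}, \\ \rho_2 &= \varphi_1\rho_1 + \varphi_2 + \ldots + \varphi_p\rho_{p-2}, \\ &\ \ \vdots \\ \rho_p &= \varphi_1\rho_{p-1} + \varphi_2\rho_{p-2} + \ldots + \varphi_p. \end{aligned} \tag{6.35}$$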
⋄
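The system (6.35) is linear in $\varphi_1, \ldots, \varphi_p$, with a coefficient matrix whose (i, j) entry is $\rho_{|i-j|}$. A minimal sketch of solving it with NumPy follows; the function name `yule_walker_phi` and the AR(2) parameter values are assumptions for illustration.

```python
import numpy as np

def yule_walker_phi(rho):
    """Solve the Yule-Walker system (6.35) for phi_1..phi_p given rho_1..rho_p."""
    rho = np.asarray(rho, dtype=float)
    p = len(rho)
    full = np.concatenate(([1.0], rho))            # rho_0, rho_1, ..., rho_p
    # Matrix entry (i, j) equals rho_{|i-j|}
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return np.linalg.solve(full[idx], rho)

# Autocorrelations implied by hypothetical AR(2) parameters phi = (0.5, 0.3):
rho1 = 0.5 / (1 - 0.3)                             # from the first equation of (6.35)
rho2 = 0.5 * rho1 + 0.3                            # from the second equation of (6.35)
print(yule_walker_phi([rho1, rho2]))
```

Applied to estimated autocorrelations $r_1, \ldots, r_p$ instead of theoretical ones, the same computation yields moment-type estimates of the autoregressive parameters.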
The partial autocorrelation function $\rho_{kk}$ of the process AR(p) has the truncation
point $k_0$ equal to the model order p (it follows directly from the definition of the partial
autocorrelation function that an autoregressive process of order p fulfills $\rho_{kk} = 0$ for
all k > p; see (6.13)). This property makes the partial autocorrelation function an
important instrument for the identification of autoregressive processes.
The process AR(p) is always invertible since (6.31) is directly the inverted form
of this model.
Remark 6.9 The process AR(1)
$$y_t = \varphi_1 y_{t-1} + \varepsilon_t \tag{6.36}$$
is stationary for $|\varphi_1| < 1$. In such a case, it has zero mean value, variance
$$\sigma_y^2 = \frac{\sigma^2}{1 - \varphi_1^2} \tag{6.37}$$
and autocorrelation function
$$\rho_k = \varphi_1^k \quad \text{for } k \ge 0 \tag{6.38}$$
in the form of a geometrically decreasing sequence (oscillating for negative $\varphi_1$)
without any truncation point. In particular, it holds for k = 1
$$\rho_1 = \varphi_1, \tag{6.39}$$
i.e., the first autocorrelation of the process AR(1) equals its autoregressive parameter.
Hence the sign of the parameter $\varphi_1$ plays an important role here: a positive $\varphi_1 > 0$
(so-called positive correlatedness) induces inertia in the signs of neighboring
values of the corresponding time series (see Fig. 6.2(a) with a relatively rare crossing
of the time axis), while on the contrary a negative $\varphi_1 < 0$ (so-called negative
correlatedness) induces frequent changes of the signs of neighboring values of the
corresponding time series (see Fig. 6.2(b) with a relatively dense crossing of the time
axis).
Fig. 6.2 (a) Positive correlatedness for $y_t = 0.8 y_{t-1} + \varepsilon_t$ (ρ > 0) and (b) negative correlatedness for
$y_t = -0.8 y_{t-1} + \varepsilon_t$ (ρ < 0)
The partial autocorrelation function of the process AR(1) has the form
$$\rho_{11} = \varphi_1, \quad \rho_{kk} = 0 \ \text{for } k > 1 \tag{6.40}$$
with truncation point $k_0 = 1$.

⋄
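The effect of the sign of the autoregressive parameter on the crossings of the time axis can be reproduced by simulation. The sketch below assumes Gaussian white noise with unit variance, starts the recursion at $y_0 = \varepsilon_0$, and uses an arbitrary seed; none of these choices comes from the text.

```python
import numpy as np

def simulate_ar1(phi1, n, rng):
    """Simulate y_t = phi1 * y_{t-1} + eps_t with eps_t ~ N(0, 1) and y_0 = eps_0."""
    eps = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = eps[0]
    for t in range(1, n):
        y[t] = phi1 * y[t - 1] + eps[t]
    return y

def sign_changes(y):
    """Number of times two neighboring values have opposite signs."""
    return int(np.sum(y[:-1] * y[1:] < 0))

rng = np.random.default_rng(0)            # seed chosen arbitrarily
pos = sign_changes(simulate_ar1(0.8, 1000, rng))
neg = sign_changes(simulate_ar1(-0.8, 1000, rng))
print(pos, neg)                           # fewer crossings for 0.8 than for -0.8
```

The series with parameter 0.8 changes sign rarely, while the series with parameter -0.8 changes sign at most steps.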
Remark 6.10 The process AR(2)
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \varepsilon_t \tag{6.41}$$
is stationary for
$$\varphi_1 + \varphi_2 < 1, \quad \varphi_2 - \varphi_1 < 1, \quad -1 < \varphi_2 < 1, \tag{6.42}$$
so that the stationarity region of AR(2) (in the plane with horizontal axis for values
φ1 and vertical axis for values φ2) is the interior of the triangle with vertices (–2, –1),
(0, 1), and (2, –1). In such a case, the process AR(2) has zero mean value, variance
$$\sigma_y^2 = \frac{\sigma^2}{1 - \varphi_1\rho_1 - \varphi_2\rho_2} \tag{6.43}$$
and autocorrelation function
$$\rho_k = \frac{z_1^{-1}\left(1 - z_2^{-2}\right) z_1^{-k} - z_2^{-1}\left(1 - z_1^{-2}\right) z_2^{-k}}{\left(z_1^{-1} - z_2^{-1}\right)\left(1 + z_1^{-1} z_2^{-1}\right)} \quad \text{for } k \ge 0, \tag{6.44}$$
where $z_1$ and $z_2$ are mutually distinct roots of the polynomial $\varphi(z)$ ($|z_1|, |z_2| > 1$; in the
case of a double root the form of the autocorrelation function is analogous); $\rho_k$ has no
truncation point and has the form of a linear combination of two geometrically
decreasing sequences or the form of a sinusoid with geometrically decreasing
amplitude.
The partial autocorrelation function of the process AR(2) has the truncation point
$k_0 = 2$.
⋄
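The triangle condition (6.42) and the general root condition for AR(p) can be compared numerically. The following sketch uses NumPy's polynomial root finder; the helper names and the parameter pairs are illustrative assumptions.

```python
import numpy as np

def ar_is_stationary(phi):
    """True if all roots of phi(z) = 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    coeffs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))  # 1, -phi_1, ..., -phi_p
    roots = np.roots(coeffs[::-1])        # np.roots expects the highest power first
    return bool(np.all(np.abs(roots) > 1.0))

def ar2_in_triangle(phi1, phi2):
    """Stationarity region of AR(2) according to (6.42)."""
    return (phi1 + phi2 < 1) and (phi2 - phi1 < 1) and (-1 < phi2 < 1)

# Hypothetical parameter pairs, chosen away from the boundary of the triangle
for phi1, phi2 in [(0.5, 0.3), (0.5, 0.6), (0.0, -0.9), (-1.5, -0.6)]:
    print((phi1, phi2), ar_is_stationary([phi1, phi2]), ar2_in_triangle(phi1, phi2))
```

For each pair the two checks give the same answer; points exactly on the boundary are avoided here because floating-point root finding cannot decide them reliably.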
6.2.4 Mixed Process ARMA
The mixed process of orders p and q denoted as ARMA(p, q) has the form
$$y_t = \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}, \quad \text{i.e.,} \quad \varphi(B) y_t = \theta(B)\varepsilon_t, \tag{6.45}$$
where the operators φ(B) and θ(B) have been defined in the context of the processes
AR(p) and MA(q), respectively. The condition of stationarity and the condition of
invertibility of the process ARMA(p, q) correspond to the condition of
stationarity of AR(p) and the condition of invertibility of MA(q), respectively.
The stationary process ARMA(p, q) has zero mean value, and its autocorrelation
function fulfills the following difference equation:
$$\rho_k = \varphi_1\rho_{k-1} + \varphi_2\rho_{k-2} + \ldots + \varphi_p\rho_{k-p} \quad \text{for } k > q \tag{6.46}$$
with solution of the form
$$\rho_k = \alpha_1 z_1^{-k} + \alpha_2 z_2^{-k} + \ldots + \alpha_p z_p^{-k} \quad \text{for } k \ge \max(0, q - p + 1), \tag{6.47}$$
where $z_1, \ldots, z_p$ are mutually distinct roots of the polynomial $\varphi(z)$ ($|z_1|, \ldots, |z_p| > 1$).
Hence the autocorrelation function of the process ARMA(p, q) has no truncation
point and can be approximated by a linear combination of geometrically decreasing
sequences and sinusoids of various frequencies with geometrically decreasing
amplitudes, except for the initial values $\rho_0, \rho_1, \ldots, \rho_{q-p}$ (this exception is non-empty
only in the case of $q \ge p$).
The partial autocorrelation function of the process ARMA(p, q) has no truncation
point either, and it is bounded by a linear combination of geometrically decreasing
sequences and sinusoids of various frequencies with geometrically decreasing
amplitudes, except for the initial values $\rho_{00}, \ldots, \rho_{p-q,\,p-q}$ (this exception is non-empty
only in the case of $p \ge q$).
Remark 6.11 The process ARMA(1, 1)
$$y_t = \varphi_1 y_{t-1} + \varepsilon_t + \theta_1\varepsilon_{t-1} \tag{6.48}$$
is stationary for $|\varphi_1| < 1$. In such a case, it has zero mean value, variance
$$\sigma_y^2 = \frac{1 + \theta_1^2 + 2\varphi_1\theta_1}{1 - \varphi_1^2}\,\sigma^2 \tag{6.49}$$
and autocorrelation function
$$\rho_1 = \frac{(1 + \varphi_1\theta_1)(\varphi_1 + \theta_1)}{1 + \theta_1^2 + 2\varphi_1\theta_1}, \quad \rho_k = \varphi_1\rho_{k-1} \ \text{for } k > 1 \tag{6.50}$$
in the form of a geometrically decreasing sequence (except for $\rho_0$) without any truncation
point.
The condition of invertibility of the process ARMA(1, 1) is $|\theta_1| < 1$.
The partial autocorrelation function of ARMA(1, 1) is bounded (in absolute
values) by a geometrically decreasing sequence starting from $\rho_{11}$.
⋄
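The variance (6.49) can be checked through the linear-process form (6.17). The sketch below assumes the ψ-weights of the stationary ARMA(1, 1) process, which are not stated in the text above but follow from expanding $(1 - \varphi_1 B)^{-1}(1 + \theta_1 B)$, and it uses the fact that the $\varepsilon_t$ are uncorrelated with variance $\sigma^2$:

```latex
\begin{aligned}
y_t &= \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}, \qquad
  \psi_0 = 1, \quad \psi_j = (\varphi_1 + \theta_1)\,\varphi_1^{\,j-1} \ \ (j \ge 1), \\
\sigma_y^2 &= \sigma^2 \sum_{j=0}^{\infty} \psi_j^2
  = \sigma^2\left(1 + \frac{(\varphi_1 + \theta_1)^2}{1 - \varphi_1^2}\right)
  = \frac{1 + \theta_1^2 + 2\varphi_1\theta_1}{1 - \varphi_1^2}\,\sigma^2 .
\end{aligned}
```

The geometric series converges because $|\varphi_1| < 1$, which is exactly the stationarity condition of the remark.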
Remark 6.12 The stationary processes introduced in this section have zero
mean value. It is natural to generalize them to the case of a nonzero mean value (constant in
time). For example, the process MA(q) with the mean value μ has the form
$$y_t = \mu + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}. \tag{6.51}$$
More generally, the process ARMA(p, q) with the mean value μ has the form
$$y_t - \mu = \varphi_1(y_{t-1} - \mu) + \ldots + \varphi_p(y_{t-p} - \mu) + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}, \tag{6.52}$$
or equivalently
$$y_t = \alpha + \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}, \quad \text{where } \alpha = (1 - \varphi_1 - \ldots - \varphi_p)\,\mu. \tag{6.53}$$
In special cases, nonlinear modifications of ARMA models with thresholds can
also be useful; see, e.g., the so-called SETAR (or TAR) model (9.18).
⋄
6.3 Construction of Models by Box–Jenkins Methodology
The construction of models by Box–Jenkins methodology is recommended in three
steps:
1. Identification of model: e.g., for time series y1, ..., yn one identifies the model
AR(1).
2. Estimation of model: e.g., the estimated model is $y_t = 0.68\, y_{t-1} + \varepsilon_t$ with $\hat{\sigma} = 11.24$.
3. Verification of model: e.g., the model from step 2 is verified with a significance
level of 5 % or one assesses its predictive ability.
If the diagnostic results from step 3 are not satisfactory, one should repeat all
three steps for another competing model (in such a case, one frequently corrects or
modifies the rejected model, and the previous identification offers guidance on how
to do it).
6.3.1 Identification of Model
6.3.1.1 Identification Based on Autocorrelation and Partial Autocorrelation Function
General findings on the form of the autocorrelation and partial autocorrelation functions
of stationary and invertible processes AR(p), MA(q), and ARMA(p, q) described in
Sect. 6.2 are summarized in Table 6.1.
The corresponding identification method consists in examining the graphical
plot of the estimated correlogram and partial correlogram of the modeled time series
and striving to find the most relevant model for this time series according to the
characteristics from Table 6.1. If there are any doubts, we can test the potential
truncation point $k_0$ by means of Bartlett’s approximation (6.12) with the approximate
(asymptotic) critical region (applying the significance level of 5 %)
$$|r_k| \ge 2\sqrt{\frac{1}{n}\left(1 + 2\sum_{j=1}^{k_0} r_j^2\right)} \quad \text{for some } k > k_0 \tag{6.54}$$
or by means of Quenouille’s approximation (6.16) with the critical region (applying
again the significance level of 5 %)
$$|r_{kk}| \ge 2\sqrt{\frac{1}{n}} \quad \text{for some } k > k_0. \tag{6.55}$$
Example 6.1 Table 6.2 presents values $y_t$ of the 3-month interbank interest rate (in % p.a.)
in Germany (Dreimonatsgeld; see Deutsche Bundesbank) for particular years
1960–1999 (t = 1, ..., 40). Since the series plotted in Fig. 6.3 can be regarded
as stationary in this time period (see also Example 6.4 in Sect. 6.3.3), one has
estimated the corresponding correlogram and partial correlogram (see Table 6.3
and Fig. 6.4).
Applying the characteristics from Table 6.1, the most suitable model for this time
series seems to be the process AR(4): the correlogram $r_k$ corresponds to a sinusoid
Table 6.1 Form of autocorrelation and partial autocorrelation function of stationary and invertible
processes AR(p), MA(q), and ARMA(p, q) (U denotes a curve in the form of a linear combination
of geometrically decreasing sequences and sinusoids with geometrically decreasing amplitudes)
|     | AR(p) | MA(q) | ARMA(p, q) |
| ρk  | Non-existent k0; ρk in form of curve U | k0 = q | Non-existent k0; ρk in form of curve U excepting values ρ0, ρ1, ..., ρq–p |
| ρkk | k0 = p | Non-existent k0; ρkk bounded by curve U | Non-existent k0; ρkk bounded by curve U excepting values ρ00, ..., ρp–q, p–q |
Table 6.2 Annual data 1960–1999 in Example 6.1 (3-month interbank interest rate in Germany in
% p.a.—Dreimonatsgeld)
t Year yt t Year yt t Year yt t Year yt
1 1960 5.10 11 1970 9.41 21 1980 9.54 31 1990 8.43
2 1961 3.59 12 1971 7.15 22 1981 12.11 32 1991 9.18
3 1962 3.42 13 1972 5.61 23 1982 8.88 33 1992 9.46
4 1963 3.98 14 1973 12.14 24 1983 5.78 34 1993 7.24
5 1964 4.09 15 1974 9.90 25 1984 5.99 35 1994 5.31
6 1965 5.14 16 1975 4.96 26 1985 5.44 36 1995 4.48
7 1966 6.63 17 1976 4.25 27 1986 4.60 37 1996 3.27
8 1967 4.27 18 1977 4.37 28 1987 3.99 38 1997 3.30
9 1968 3.81 19 1978 3.70 29 1988 4.28 39 1998 3.52
10 1969 5.79 20 1979 6.69 30 1989 7.07 40 1999 2.94
Source: OECD (https://data.oecd.org/interest/short-term-interest-rates.htm#indicator-chart)
Fig. 6.3 Annual data 1960–1999 in Example 6.1 (3-month interbank interest rate in Germany in
% p.a., Dreimonatsgeld)
with geometrically decreasing amplitude and the partial correlogram $r_{kk}$ evidently
has the truncation point $k_0 = 4$; indeed, the statistical test (6.55) gives
$$|r_{kk}| < 2\sqrt{\frac{1}{n}} = 2\sqrt{\frac{1}{40}} = 0.316 \quad \text{for } k > 4$$
(and it is not true for k = 4). One could try as an alternative the process MA(1) with
the truncation point $k_0 = 1$ in the correlogram $r_k$, since it holds according to (6.54)
Table 6.3 Correlogram and partial correlogram in Example 6.1 (Dreimonatsgeld) estimated
by means of EViews
k    AC      PAC      k    AC      PAC
1    0.612   0.612    11   0.256   0.083
2    0.150   0.360    12   0.004   0.128
3    0.028   0.124    13   0.195   0.004
4    0.228   0.392    14   0.241   0.119
5    0.400   0.082    15   0.193   0.154
6    0.318   0.023    16   0.139   0.102
7    0.037   0.208    17   0.137   0.206
8    0.147   0.015    18   0.062   0.053
9    0.162   0.048    19   0.039   0.018
10   0.249   0.189    20   0.048   0.037
Fig. 6.4 Correlogram (autocorrelations ACF) and partial correlogram (partial autocorrelations PACF)
for lags 1–20 in Example 6.1 (Dreimonatsgeld) estimated by means of EViews
$$|r_k| < 2\sqrt{\frac{1}{n}\left(1 + 2r_1^2\right)} = 2\sqrt{\frac{1}{40}\left(1 + 2 \cdot 0.612^2\right)} = 0.418 \quad \text{for } k > 1,$$
but the value $|r_5| = 0.400$ is relatively close to this border (moreover, it would
evidently be difficult to look for a curve U bounding the partial correlogram $r_{kk}$).
⋄
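The two bounds used in Example 6.1 can be recomputed directly from (6.54) and (6.55). In the sketch below the function names are labels chosen here; the inputs n = 40 and r1 = 0.612 are the values quoted in the example.

```python
import math

def bartlett_bound(r, k0, n):
    """Approximate 5 % bound for |r_k|, k > k0, from (6.54); r[j-1] holds r_j."""
    return 2.0 * math.sqrt((1.0 + 2.0 * sum(x * x for x in r[:k0])) / n)

def quenouille_bound(n):
    """Approximate 5 % bound for |r_kk|, k > k0, from (6.55)."""
    return 2.0 * math.sqrt(1.0 / n)

print(round(quenouille_bound(40), 3))            # 0.316
print(round(bartlett_bound([0.612], 1, 40), 3))  # 0.418
```

An estimated value whose absolute value exceeds the relevant bound for some lag beyond the candidate $k_0$ speaks against that truncation point.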
To check the correctness of identified model, one sometimes makes use of the
inequalities for estimated autocorrelations rk that should hold theoretically under the
assumption of stationarity and invertibility of process (see Table 6.5: e.g., according
to Remark 6.6 it holds |ρ1| < 1/2 in the process MA(1)).
6.3.1.2 Identification Based on Information Criteria
This advanced approach to model identification enables (at least theoretically) a
fully automatic identification excluding any subjective interference of analysts. The
problem of identification of the process ARMA(p, q) for a given time series is addressed
here as the problem of estimating the unknown parameters p and q by means of the
optimization
$$(\hat{p}, \hat{q}) = \arg\min_{(k,\,l)} A(k, l), \tag{6.56}$$
where A(k, l) is a suitable criterion constructed by estimating the process ARMA(k, l)
for the given time series. The minimization is performed over a grid of values
k = 0, 1, ..., K and l = 0, 1, ..., L chosen a priori.
One possible choice of the criterion A could be to set $A(k, l) = \hat{\sigma}_{k,l}$ using the
estimated standard deviation of white noise of the process ARMA(k, l), i.e., for
a given time series one prefers the model with the lowest residual error in such a case
(thus in Example 6.1 one would prefer the process AR(4) with the residual standard
deviation 1.756 to the process MA(1) with the residual standard deviation 1.782).
However, a more adequate approach is to apply information theory and to
penalize unnecessarily high orders k and l (simultaneously with the minimization of
the residual standard deviation), achieving the consistency of the estimates (6.56) in this
way. The popular information criteria based on this idea are:
1. Criterion AIC (Akaike information criterion):
$$\mathrm{AIC}(k, l) = \ln \hat{\sigma}_{k,l}^2 + \frac{2(k + l + 1)}{n}. \tag{6.57}$$
2. Criterion BIC (Bayes information criterion or Schwarz information criterion):
$$\mathrm{BIC}(k, l) = \ln \hat{\sigma}_{k,l}^2 + \frac{(k + l + 1)\ln n}{n}. \tag{6.58}$$
The value $\hat{\sigma}_{k,l}^2$ in (6.57) and (6.58) denotes the estimated variance of white noise in
the process ARMA(k, l) (more correctly, one should use the maximized value of the
logarithmic likelihood multiplied by the coefficient (–2/n) instead of the first term in
(6.57) and (6.58); see, e.g., EViews), the numerator of the second term contains
the number of estimated parameters (including the level parameter μ) so as to
penalize unnecessarily high orders k and l, and n is the length of the given time series.
The criterion BIC produces a strongly consistent estimator of the model order (i.e.,
this estimator converges to the true order with probability one for increasing n), but it
may have a high variance (i.e., it lacks efficiency). The properties of the criterion
AIC are just the opposite: the corresponding estimator of the model order is not consistent,
but it is efficient.
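The criteria (6.57) and (6.58) and the minimization (6.56) can be sketched as follows. The residual variances in the dictionary are made-up numbers for illustration only (they are not the Dreimonatsgeld results), and the function names are labels chosen here.

```python
import math

def aic(sigma2, k, l, n):
    """AIC(k, l) as in (6.57); sigma2 is the estimated white-noise variance."""
    return math.log(sigma2) + 2.0 * (k + l + 1) / n

def bic(sigma2, k, l, n):
    """BIC(k, l) as in (6.58)."""
    return math.log(sigma2) + (k + l + 1) * math.log(n) / n

def select_order(sigma2_by_order, n, criterion):
    """Order (k, l) minimizing the criterion over the supplied grid, see (6.56)."""
    return min(sigma2_by_order,
               key=lambda kl: criterion(sigma2_by_order[kl], kl[0], kl[1], n))

# Hypothetical residual variances for ARMA(k, 0) models fitted to n = 40 observations
sigma2 = {(0, 0): 4.0, (1, 0): 2.0, (2, 0): 1.87, (3, 0): 1.86}
print(select_order(sigma2, 40, aic), select_order(sigma2, 40, bic))
```

With these numbers AIC picks the larger order (2, 0) and BIC the smaller order (1, 0), because the BIC penalty per parameter, ln(n)/n, exceeds the AIC penalty 2/n as soon as n is at least 8.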
Table 6.4 Values of information criteria AIC and BIC from Example 6.2 (Dreimonatsgeld) calculated
by means of EViews (the minimal values are marked by an asterisk)
              AIC      BIC
White noise   4.678    4.721
AR(1)         4.257    4.342
AR(2)         4.170    4.299*
AR(3)         4.225    4.399
AR(4)         4.093*   4.313
AR(5)         4.159    4.426
AR(6)         4.251    4.565
Example 6.2 The values of information criteria AIC and BIC calculated by means
of EViews for the time series from Example 6.1 (Dreimonatsgeld) are shown in
Table 6.4, where one examines autoregressions up to the order six. The identified
process is AR(4), since the process AR(2) according to BIC is nested into the process
AR(4) according to AIC.
⋄
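The order selection in Example 6.2 can be sketched in a few lines of Python. The functions `aic` and `bic` simply transcribe (6.57) and (6.58); their names, the dictionary `table_6_4`, and the variables `best_by_aic` and `best_by_bic` are illustrative choices, and the numbers are copied from Table 6.4 rather than recomputed from the data.

```python
import math

def aic(sigma2, k, l, n):
    """AIC(k, l) as in (6.57) for an ARMA(k, l) fit with residual variance sigma2."""
    return math.log(sigma2) + 2 * (k + l + 1) / n

def bic(sigma2, k, l, n):
    """BIC(k, l) as in (6.58); the penalty uses ln(n) instead of the constant 2."""
    return math.log(sigma2) + (k + l + 1) * math.log(n) / n

# Criterion values copied from Table 6.4 as (AIC, BIC) pairs.
table_6_4 = {
    "White noise": (4.678, 4.721),
    "AR(1)": (4.257, 4.342),
    "AR(2)": (4.170, 4.299),
    "AR(3)": (4.225, 4.399),
    "AR(4)": (4.093, 4.313),
    "AR(5)": (4.159, 4.426),
    "AR(6)": (4.251, 4.565),
}

# The preferred model minimizes the respective criterion.
best_by_aic = min(table_6_4, key=lambda model: table_6_4[model][0])
best_by_bic = min(table_6_4, key=lambda model: table_6_4[model][1])
print(best_by_aic, best_by_bic)
```

Since ln n exceeds 2 as soon as n is at least 8, BIC penalizes additional parameters more heavily than AIC, which is why it tends to choose the lower order here.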
Remark 6.13 If two models are acceptable in the identification step and one of
them is nested in the other (e.g., AR(2) is nested in AR(4); see above), then
one can decide on the proper identification by means of statistical tests (F-test or
Lagrange Multiplier (LM) test), which test whether the parameters distinguishing
the two models are zero (see, e.g., EViews).
⋄
6.3.2 Estimation of Model
Simple models of Box–Jenkins methodology (up to order two) can be estimated
by means of moment estimates that exploit the relations between the parameters of
the identified model and its autocorrelations (see Table 6.5; e.g., according to (6.39)
it holds $\varphi_1 = \rho_1$ in the process AR(1)). However, such estimates are usually
regarded as preliminary and are used in practice as initial values for more
complex (iterative) procedures.
The estimation procedures that deliver the final estimates (not only the
initial ones) in particular models are in practice left to software. For the process
AR(p)
$$y_t = \alpha + \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t \tag{6.59}$$
one can use the classical OLS estimation (including the classical estimation of its
variance matrix). Under the stationarity assumption, this estimate is consistent due to
the orthogonality of regressors to residuals in (6.59) (the orthogonality
$\mathrm{cov}(y_{t-1}, \varepsilon_t) = \ldots = \mathrm{cov}(y_{t-p}, \varepsilon_t) = 0$ can be shown by expressing the process AR(p) in the form
of the linear process (6.17)).
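Table 6.5 Moment estimates of simple stationary and invertible models of Box–Jenkins methodology and check inequalities for estimated autocorrelations $r_k$

AR(1)
  Moment estimates: $\hat{\varphi}_1 = r_1$, $\hat{\sigma}^2 = \hat{\sigma}_y^2 (1 - \hat{\varphi}_1 r_1)$
  Check inequalities: $|r_1| < 1$
AR(2)
  Moment estimates: $\hat{\varphi}_1 = \dfrac{r_1 (1 - r_2)}{1 - r_1^2}$, $\hat{\varphi}_2 = \dfrac{r_2 - r_1^2}{1 - r_1^2}$, $\hat{\sigma}^2 = \hat{\sigma}_y^2 (1 - \hat{\varphi}_1 r_1 - \hat{\varphi}_2 r_2)$
  Check inequalities: $|r_2| < 1$, $r_1^2 < \dfrac{1 + r_2}{2}$
MA(1)
  Moment estimates: $\hat{\theta}_1 = \dfrac{1 - \sqrt{1 - 4 r_1^2}}{2 r_1}$, $\hat{\sigma}^2 = \dfrac{\hat{\sigma}_y^2}{1 + \hat{\theta}_1^2}$
  Check inequalities: $|r_1| < 1/2$
MA(2)
  Moment estimates: $\hat{\theta}_1 = \hat{\theta}_2 \approx 0.1$, $\hat{\sigma}^2 = \dfrac{\hat{\sigma}_y^2}{1 + \hat{\theta}_1^2 + \hat{\theta}_2^2}$
  Check inequalities: $r_1 + r_2 > -1/2$, $r_2 - r_1 > -1/2$, $r_1^2 < 4 r_2 (1 - 2 r_2)$
ARMA(1, 1)
  Moment estimates: $\hat{\varphi}_1 = \dfrac{r_2}{r_1}$, $\hat{\theta}_1 = \dfrac{b \pm \sqrt{b^2 - 4}}{2}$ with $|\hat{\theta}_1| < 1$, where $b = \dfrac{1 - 2 r_2 + \hat{\varphi}_1^2}{r_1 - \hat{\varphi}_1}$, $\hat{\sigma}^2 = \dfrac{\hat{\sigma}_{\tilde{y}}^2}{1 + \hat{\theta}_1^2}$
  Check inequalities: $2 r_1^2 - |r_1| < r_2 < |r_1|$

Here $\hat{\sigma}_y^2 = \dfrac{1}{n} \sum_{t=1}^{n} (y_t - \bar{y})^2$ and $\tilde{y}_t = y_t - \hat{\varphi}_1 y_{t-1}$.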
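The closed-form moment estimates of Table 6.5 are easy to evaluate directly. The following minimal sketch covers AR(2), MA(1), and ARMA(1, 1), assuming the sign convention of (6.60), in which the moving average terms enter with a plus sign; the function names are hypothetical, and the inputs are sample autocorrelations $r_1$ and $r_2$ that the user has already computed.

```python
import math

def moments_ar2(r1, r2):
    """Moment (Yule-Walker type) estimates of phi_1, phi_2 in AR(2)."""
    phi1 = r1 * (1 - r2) / (1 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1 - r1 ** 2)
    return phi1, phi2

def moments_ma1(r1):
    """Invertible moment estimate of theta_1 in MA(1); requires |r1| < 1/2."""
    if abs(r1) >= 0.5:
        raise ValueError("check inequality |r1| < 1/2 is violated")
    if r1 == 0:
        return 0.0
    return (1 - math.sqrt(1 - 4 * r1 ** 2)) / (2 * r1)

def moments_arma11(r1, r2):
    """Moment estimates of (phi_1, theta_1) in ARMA(1, 1)."""
    phi1 = r2 / r1
    b = (1 - 2 * r2 + phi1 ** 2) / (r1 - phi1)
    if abs(b) <= 2:
        raise ValueError("no real invertible solution for theta_1")
    root = math.sqrt(b * b - 4)
    # Of the two solutions (b +/- root)/2, keep the one with modulus below one.
    theta1 = (b - root) / 2 if b > 0 else (b + root) / 2
    return phi1, theta1

# Theoretical autocorrelations of AR(2) with phi_1 = 0.5, phi_2 = 0.3.
rho1 = 0.5 / (1 - 0.3)
rho2 = 0.5 * rho1 + 0.3
print(moments_ar2(rho1, rho2))
```

Feeding in theoretical autocorrelations, as in the last lines, returns the parameters that generated them, which is a convenient sanity check before applying the formulas to estimated $r_k$.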
In the case of the stationary and invertible process ARMA(p, q) (with zero mean
value for simplicity)
$$y_t = \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \ldots + \theta_q \varepsilon_{t-q} \tag{6.60}$$
one usually uses the NLS estimates (nonlinear least squares), which are computed by
means of iterative algorithms (Gauss–Newton and others; see, e.g., EViews). The
corresponding NLS procedures consist mostly in the minimization of the sum of squares
(which is nonlinear in the parameters $\varphi_1, \ldots, \theta_q$)
$$\min_{\varphi_1, \ldots, \theta_q} \sum_{t=p+1}^{n} e_t^2(\varphi_1, \ldots, \theta_q), \tag{6.61}$$
where the residuals $e_t(\varphi_1, \ldots, \theta_q)$ are constructed recursively by means of the relation
$$e_t(\varphi_1, \ldots, \theta_q) = e_t = y_t - \varphi_1 y_{t-1} - \ldots - \varphi_p y_{t-p} - \theta_1 e_{t-1} - \ldots - \theta_q e_{t-q} \quad \text{for } t = p + 1, \ldots, n \tag{6.62}$$
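with suitable initial values $e_{p-q+1}, \ldots, e_p$. Finally, the estimate of the variance of white
noise $\sigma^2$ is obtained by dividing the minimal value of (6.61) by the length of time
series n. Under the normality assumption and for higher n, these estimates are very
close to the ML estimates constructed by maximizing the logarithmic likelihood
function
$$L(\varphi_1, \ldots, \theta_q, \sigma^2) = -\frac{n - p}{2} \ln(2\pi) - \frac{n - p}{2} \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{t=p+1}^{n} e_t^2(\varphi_1, \ldots, \theta_q). \tag{6.63}$$

Table 6.6 Approximate standard deviations of estimated parameters of simple stationary and invertible models of Box–Jenkins methodology

Model         Standard deviations of estimated parameters
AR(1)         $\hat{\sigma}(\hat{\varphi}_1) \approx \left( \dfrac{1 - \hat{\varphi}_1^2}{n} \right)^{1/2}$
AR(2)         $\hat{\sigma}(\hat{\varphi}_1) \approx \hat{\sigma}(\hat{\varphi}_2) \approx \left( \dfrac{1 - \hat{\varphi}_2^2}{n} \right)^{1/2}$
MA(1)         $\hat{\sigma}(\hat{\theta}_1) \approx \left( \dfrac{1 - \hat{\theta}_1^2}{n} \right)^{1/2}$
MA(2)         $\hat{\sigma}(\hat{\theta}_1) \approx \hat{\sigma}(\hat{\theta}_2) \approx \left( \dfrac{1 - \hat{\theta}_2^2}{n} \right)^{1/2}$
ARMA(1, 1)    $\hat{\sigma}(\hat{\varphi}_1) \approx \left( \dfrac{(1 - \hat{\varphi}_1^2)(1 + \hat{\varphi}_1 \hat{\theta}_1)^2}{n (\hat{\varphi}_1 + \hat{\theta}_1)^2} \right)^{1/2}$, $\hat{\sigma}(\hat{\theta}_1) \approx \left( \dfrac{(1 - \hat{\theta}_1^2)(1 + \hat{\varphi}_1 \hat{\theta}_1)^2}{n (\hat{\varphi}_1 + \hat{\theta}_1)^2} \right)^{1/2}$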
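The residual recursion (6.62) and the sum of squares in (6.61) can be sketched as follows. This is only an illustration of the objective function, not of the Gauss–Newton iterations used by packages such as EViews; the function names `css_residuals` and `css`, the default of zero initial residuals, and the simulated ARMA(1, 1) series with parameters 0.6 and 0.3 are assumptions made for the example.

```python
import random

def css_residuals(y, phi, theta, e_init=None):
    """Residuals e_t from recursion (6.62) for t = p+1, ..., n (1-based time)."""
    p, q = len(phi), len(theta)
    # past holds e_{t-q}, ..., e_{t-1}; zeros are an assumed default start.
    past = list(e_init) if e_init is not None else [0.0] * q
    residuals = []
    for t in range(p, len(y)):  # 0-based positions of y_{p+1}, ..., y_n
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(p))
        ma_part = sum(theta[j] * past[q - 1 - j] for j in range(q))
        e = y[t] - ar_part - ma_part
        residuals.append(e)
        if q:
            past = past[1:] + [e]
    return residuals

def css(y, phi, theta):
    """Sum of squared residuals, i.e., the objective minimized in (6.61)."""
    return sum(e * e for e in css_residuals(y, phi, theta))

# Simulated ARMA(1, 1) series (hypothetical parameters, fixed seed).
rng = random.Random(0)
n, phi_true, theta_true = 400, 0.6, 0.3
eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [eps[0]]
for t in range(1, n):
    y.append(phi_true * y[t - 1] + eps[t] + theta_true * eps[t - 1])

s_true = css(y, [phi_true], [theta_true])
s_zero = css(y, [0.0], [0.0])
sigma2_hat = s_true / n  # variance estimate: minimal sum of squares divided by n
```

An optimizer would search over the parameters for the smallest value of `css`; the comparison of `s_true` with `s_zero` merely shows that the objective is much smaller near the generating parameters than far from them.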
Remark 6.14 Table 6.6 enables us to evaluate the errors of estimated parameters by
means of their approximate standard deviations (the applicability and derivations can
be found in Box and Jenkins (1970, Chapter 7)). For instance in the process AR(1),
one can evaluate the corresponding error as
$$\hat{\sigma}(\hat{\varphi}_1) = \left( \frac{1 - \hat{\varphi}_1^2}{n} \right)^{1/2}. \tag{6.64}$$
⋄
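As a small numerical companion to Remark 6.14, the approximate standard deviations for AR(1) and ARMA(1, 1) can be evaluated as below. The function names are hypothetical, the ARMA(1, 1) expression assumes the plus-sign convention of (6.60) for the moving average term, and the parameter values in the example are arbitrary.

```python
import math

def se_ar1(phi1, n):
    """Approximate standard deviation (6.64) of the AR(1) estimate."""
    return math.sqrt((1 - phi1 ** 2) / n)

def se_arma11(phi1, theta1, n):
    """Approximate standard deviations of (phi_1, theta_1) in ARMA(1, 1)."""
    common = (1 + phi1 * theta1) ** 2 / (n * (phi1 + theta1) ** 2)
    return (math.sqrt((1 - phi1 ** 2) * common),
            math.sqrt((1 - theta1 ** 2) * common))

print(se_ar1(0.5, 100))
print(se_arma11(0.6, 0.3, 400))
```

The ARMA(1, 1) errors grow without bound as $\hat{\varphi}_1 + \hat{\theta}_1$ approaches zero, which reflects the near cancellation of the autoregressive and moving average factors in that case.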
Example 6.3 In Table 6.7, the time series from Example 6.1 (Dreimonatsgeld)
identified as the process AR(4) (see also Example 6.2) is estimated by means of
EViews as
$$y_t = 6.203 + 0.950 y_{t-1} - 0.726 y_{t-2} + 0.542 y_{t-3} - 0.452 y_{t-4} + \varepsilon_t, \quad \hat{\sigma} = 1.756.$$
⋄
Table 6.7 Estimation of the process AR(4) from Example 6.3 (Dreimonatsgeld) calculated by means of EViews

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           6.203253     0.427508     14.51027      0.0000
AR(1)       0.950188     0.163212      5.821793     0.0000
AR(2)      –0.725744     0.216765     –3.348060     0.0021
AR(3)       0.541991     0.215576      2.514154     0.0173
AR(4)      –0.451544     0.165381     –2.730320     0.0103

R-squared             0.567557    Mean dependent var    6.186667
Adjusted R-squared    0.511758    S.D. dependent var    2.513598
Sum squared resid     95.62875    Schwarz criterion     4.312543
Log likelihood       –68.66697    F-statistic           10.17145
Durbin–Watson stat    2.057756    Prob(F-statistic)     0.000022
Inverted AR Roots     0.70–0.49i   0.70+0.49i   –0.22+0.76i   –0.22–0.76i
Source: Calculated by EViews
6.3.3 Verification of Model
The verification step of Box–Jenkins methodology is relatively elaborate. Applying
various diagnostic instruments, one should verify the compatibility of the estimated
model with the analyzed data. It is usually done by checking various properties of
the constructed model:
6.3.3.1 Check of Stationarity
In this case, one checks whether the estimated model fulfills the condition of
stationarity, i.e., whether the roots of the estimated autoregressive polynomial lie outside
the unit circle in the complex plane (or equivalently, whether their inverted values,
which are the roots of the autoregressive polynomial written with the opposite order of
powers $z^p - \varphi_1 z^{p-1} - \ldots - \varphi_p$, lie inside this circle; see Example 6.4). In particular,
this check of stationarity is important in the cases in which the estimation method is
strongly based on the stationarity assumption (e.g., for the estimates based on the
Yule–Walker equations; see Remark 6.8). It is also possible to separate several
segments in the given time series and to test the coincidence of estimated levels,
variances, and autocorrelations (or higher moments such as skewness and others)
among particular segments.
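The root condition can be checked numerically. The sketch below takes the AR(4) coefficients of Example 6.3 and, as starting points, the two-decimal inverted roots reported in Table 6.7, refines them by a few Newton steps on the polynomial $z^p - \varphi_1 z^{p-1} - \ldots - \varphi_p$, and inspects their moduli. Using Newton refinement from reported roots (rather than a general root finder) and the names `poly`, `dpoly`, and `is_stationary` are choices made only for this illustration.

```python
# AR(4) coefficients phi_1, ..., phi_4 from Example 6.3.
phi = [0.950188, -0.725744, 0.541991, -0.451544]

def poly(z):
    """z^p - phi_1 z^(p-1) - ... - phi_p, whose roots are the inverted AR roots."""
    p = len(phi)
    return z ** p - sum(phi[i] * z ** (p - 1 - i) for i in range(p))

def dpoly(z):
    """Derivative of poly with respect to z."""
    p = len(phi)
    return p * z ** (p - 1) - sum(
        (p - 1 - i) * phi[i] * z ** (p - 2 - i) for i in range(p - 1)
    )

# Rounded inverted roots as reported in Table 6.7 (starting values).
starts = [0.70 - 0.49j, 0.70 + 0.49j, -0.22 + 0.76j, -0.22 - 0.76j]

roots = []
for z in starts:
    for _ in range(20):  # Newton iterations: z <- z - f(z)/f'(z)
        z = z - poly(z) / dpoly(z)
    roots.append(z)

moduli = [abs(z) for z in roots]
is_stationary = all(m < 1 for m in moduli)
print(is_stationary, [round(m, 3) for m in moduli])
```

All four moduli stay clearly below one, in line with the conclusion of Example 6.4 that the inverted roots lie inside the unit circle.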
Another approach (so-called impulse response) consists in analyzing the response
to an impulse that enters the estimated model either at a single time moment or
repeatedly from this moment on and that influences the subsequent values of the
process (such an impulse is mostly standardized to the size of the standard deviation
of the corresponding white noise or to a multiple of this standard deviation). For
example, the estimated ARMA structure is transformed to the form of the linear process
(6.17), into which one substitutes (from a given time moment on) an “artificial”
innovation process {εt} either with a single nonzero value at this time or with
fixed nonzero values from this moment on. If the given time series is stationary, then
with increasing time distance from the initial moment of the impulse (1) the response to a
single impulse should fade away gradually to the zero level and (2) the response to
repeated impulses should stabilize at an appropriate (nonzero) level (see
Example 6.4).
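Both kinds of response can be computed from the coefficients of the linear-process form. The sketch below uses the AR(4) estimates of Example 6.3 and assumes an impulse equal to one estimated standard deviation of white noise (1.756); the function name `psi_weights` and the horizon of 200 steps are illustrative choices.

```python
# AR(4) coefficients and residual standard deviation from Example 6.3.
phi = [0.950188, -0.725744, 0.541991, -0.451544]
sigma = 1.756

def psi_weights(phi, horizon):
    """Coefficients psi_0, ..., psi_horizon of the linear-process form of AR(p)."""
    psi = [1.0]
    for j in range(1, horizon + 1):
        psi.append(sum(phi[i] * psi[j - 1 - i] for i in range(min(len(phi), j))))
    return psi

psi = psi_weights(phi, 200)

# (1) Response to a single impulse of size sigma at time 0.
single = [sigma * w for w in psi]

# (2) Response to an impulse of size sigma repeated at every time point.
repeated, total = [], 0.0
for w in psi:
    total += sigma * w
    repeated.append(total)

# Under stationarity the accumulated response converges to sigma / (1 - sum(phi)).
limit = sigma / (1 - sum(phi))
print(round(single[24], 4), round(repeated[24], 4), round(limit, 3))
```

With an impulse of one standard deviation, the limiting level is about 2.56, which is the level at which the response to repeated impulses stabilizes in Example 6.4.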
6.3.3.2 Check of ARMA Structure
This check means first of all the coincidence of the correlation structure estimated
from the data (i.e., the functions rk and rkk) with the correlation structure derived
from the estimated model that is to be verified (see Example 6.4). Another check of
structure of model is based on testing the uncorrelatedness (e.g., by means of Q-tests;
see below) in the white noise that has been estimated using the tested model.
6.3.3.3 Graphical Examination of Estimated White Noise
An important diagnostic instrument is the white noise $\{\hat{\varepsilon}_t\}$ constructed from the
estimated model of the given time series (similarly to the residuals calculated using an
estimated regression model). The graphical record of this estimated white noise (and
its estimated correlogram, histogram, etc.) can indicate possible flaws of the model
(in standard situations the estimated white noise is expected to show zero
mean value, constant variance, uncorrelatedness, and normality; see Example 6.4).
6.3.3.4 Tests of Uncorrelatedness for Estimated White Noise
The uncorrelatedness of the estimated white noise (see above) can be tested under
the normality assumption directly by means of the test based on Bartlett’s
approximation (6.54), where we use the estimated autocorrelations $r_k(\hat{\varepsilon}_t)$ of the
estimated white noise. The null hypothesis of uncorrelatedness then has the critical
region (applying the significance level of 5 %)
$$|r_k(\hat{\varepsilon}_t)| \ge 2 \sqrt{\frac{1}{n}} \quad \text{for } k = 1, 2, \ldots. \tag{6.65}$$
However, so-called Q-tests (or equivalently portmanteau tests) are also
frequently used; they test cumulatively the significance of the K initial autocorrelations
of the estimated white noise (the integer K must be chosen in advance with the
recommended size $K \approx \sqrt{n}$, where n is the length of the given time series). In this way
one verifies simultaneously the structure ARMA(p, q) used, since the corresponding
Q-statistic of this test has the asymptotic distribution $\chi^2(K - p - q)$ (under the null
hypothesis that the original time series can be modeled by ARMA(p, q)). As far as
the Q-statistics are concerned, in practice one uses mainly the Box–Pierce statistic with
the critical region (applying the significance level α) of the form
$$Q = n \sum_{k=1}^{K} \left( r_k(\hat{\varepsilon}_t) \right)^2 \ge \chi^2_{1-\alpha}(K - p - q) \tag{6.66}$$
or the statistically more powerful Ljung–Box statistic with the critical region (applying
again the significance level α) of the form
$$Q = n(n + 2) \sum_{k=1}^{K} \frac{1}{n - k} \left( r_k(\hat{\varepsilon}_t) \right)^2 \ge \chi^2_{1-\alpha}(K - p - q). \tag{6.67}$$
Example 6.4 The stationarity of model AR(4) estimated in Example 6.3 (Drei-
monatsgeld) was checked at first in the framework of verification: in Table 6.7 and
Fig. 6.5, one can see the inverted roots of estimated autoregressive polynomial
which lie distinctly inside the unit circle in complex plane. Further in Fig. 6.6 one
shows the response corresponding to impulse (standardized to the double size of
estimated standard deviation of white noise) and to repeated impulses. In the first
case, the response fades away gradually reaching the zero level finally and in the
second case it stabilizes to the level of 2.56. Obviously, none of performed checks
reject stationarity.
Fig. 6.5 Inverted roots of estimated autoregressive polynomial from Example 6.4 (Dreimonatsgeld) calculated by means of EViews
Fig. 6.6 Response to impulse standardized to the double size of (estimated) standard deviation of
white noise in the case of (a) single impulse (upper graph, “Response to impulse ± 2 S.E.”) and (b) repeated impulses (lower
graph, “Response to repeated impulses ± 2 S.E.”) from Example 6.4 (Dreimonatsgeld) calculated by means of EViews
In Fig. 6.7, one compares the correlation structure estimated from data (i.e., the
functions rk and rkk) with the correlation structure derived from estimated model (i.e.,
the functions ρk and ρkk corresponding to the estimated model). The achieved
coincidence testifies to the adequacy of constructed model AR(4).
Finally, Table 6.8 shows the estimated autocorrelations of estimated white noise
and the results of a Q-test. According to (6.65) one gets the critical region
$$|r_k(\hat{\varepsilon}_t)| \ge 2 \sqrt{\frac{1}{40}} = 0.316 \quad \text{for } k = 1, 2, \ldots, \tag{6.68}$$
which none of the estimated autocorrelations in Table 6.8 attains. In addition, using
the Q-test based on the Ljung–Box statistic (6.67) for various K = 5, 6,
..., one cannot reject (applying the significance level 5 %) the null hypothesis of
uncorrelatedness of white noise (i.e., the null hypothesis of adequacy of the constructed
model AR(4)).
⋄
Fig. 6.7 Coincidence of the correlation structure estimated from data (i.e., the functions $r_k$ and $r_{kk}$)
with the correlation structure derived from estimated model (i.e., the functions $\rho_k$ and $\rho_{kk}$) in
Example 6.4 (Dreimonatsgeld) calculated by means of EViews (upper panel: ACF; lower panel: PACF; lags up to 24)
Table 6.8 Estimated autocorrelations of estimated white noise and results of Q-test based on Ljung–Box statistics in Example 6.4 (Dreimonatsgeld) calculated by means of EViews

Lag    AC        Q-Stat    Prob
1     –0.056     0.1240
2     –0.016     0.1344
3      0.028     0.1663
4     –0.064     0.3439
5     –0.009     0.3475    0.556
6     –0.117     0.9681    0.616
7     –0.054     1.1053    0.776
8      0.147     2.1594    0.706
9     –0.107     2.7413    0.740
10     0.018     2.7579    0.839
11     0.281     7.0829    0.420
12    –0.068     7.3437    0.500
13    –0.060     7.5545    0.580
14    –0.200    10.050     0.436
15    –0.103    10.736     0.466
16     0.094    11.338     0.500
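The Q-statistics (6.66) and (6.67) can be recomputed from the autocorrelations listed in Table 6.8. In the sketch below, the effective sample size n = 36 (40 observations less the four initial values used by the AR(4) regression) is an assumption; it reproduces the Q-Stat column up to rounding of the autocorrelations. The function names are illustrative, and 21.026 is the 95 % quantile of the chi-square distribution with 16 – 4 = 12 degrees of freedom.

```python
import math

# Autocorrelations of the estimated white noise, lags 1..16, from Table 6.8.
r = [-0.056, -0.016, 0.028, -0.064, -0.009, -0.117, -0.054, 0.147,
     -0.107, 0.018, 0.281, -0.068, -0.060, -0.200, -0.103, 0.094]
n = 36  # assumed effective sample size (40 observations minus 4 lags)

def box_pierce(r, n, K):
    """Box-Pierce statistic (6.66) based on the first K autocorrelations."""
    return n * sum(rk ** 2 for rk in r[:K])

def ljung_box(r, n, K):
    """Ljung-Box statistic (6.67) based on the first K autocorrelations."""
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, K + 1))

q16 = ljung_box(r, n, 16)
chi2_95_df12 = 21.026          # critical value for K - p - q = 16 - 4 - 0 = 12
reject = q16 >= chi2_95_df12   # critical region of (6.67) at the 5 % level

# Bartlett bound of (6.68) for individual autocorrelations (n = 40 as in the text).
bartlett_bound = 2 / math.sqrt(40)
exceeds_bound = [k + 1 for k, rk in enumerate(r) if abs(rk) >= bartlett_bound]
print(round(q16, 3), reject, exceeds_bound)
```

Neither the cumulative test nor the individual bounds signal remaining autocorrelation, which agrees with the conclusion of Example 6.4.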
6.4 Stochastic Modeling of Trend
Nevertheless, the majority of time series in practice (in particular, in economics and
finance) are nonstationary, i.e., such time series do not fulfill even the minimal
assumption of time invariance of the mean value, variance, and correlation structure. There are
frequently significant differences in the behavior of stationary and nonstationary time
series, with important consequences for their analysis and modeling instruments, e.g.,
• In Sect. 6.3.3.1, we have discussed the impulse response analysis: while in the
stationary environment the influence of an impulse dies out gradually (i.e., the impulse
realized in time t has mostly smaller “power” in time t + τ2 than in time t + τ1 for
0 < τ1 < τ2), this does not hold in the nonstationary environment, where the persistence
of the impulse can be unlimited with a nondecreasing power of response.
• The presence of nonstationary data can lead to so-called spurious regression: if two
stationary time series are mutually independent, then the regression of the first
one on the second one (or vice versa) does not usually exhibit attributes of a
significant regression relation (there are usually nonsignificant t-ratios for the
corresponding regression parameter and small coefficients of determination R²).
In the nonstationary situation, the regression may give highly significant results
(particularly when the series contain trend components), but such regression
relations may be only spurious without any rational reason.
• Moreover, under nonstationarity one cannot rely in the previous point on the
routine properties according to which the t-ratio has (asymptotically) the t-distribution
or the F-statistic has (asymptotically) the F-distribution.
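The spurious regression effect is easy to reproduce by simulation. The following sketch regresses one independent series on another and counts how often the usual t-ratio of the slope exceeds 1.96 in absolute value, once for pairs of random walks and once for pairs of white noise series. The series length 100, the 300 replications, the seed, and all function names are arbitrary choices made for the illustration.

```python
import math
import random

def t_ratio_slope(x, y):
    """Classical t-ratio of the slope in the OLS regression of y on x with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    rss = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    return beta / math.sqrt(rss / (n - 2) / sxx)

def white_noise(rng, n):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def random_walk(rng, n):
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def rejection_rate(generator, n=100, reps=300, seed=1):
    """Share of independent pairs whose slope looks 'significant' at about 5 %."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x, y = generator(rng, n), generator(rng, n)
        if abs(t_ratio_slope(x, y)) > 1.96:
            hits += 1
    return hits / reps

rate_rw = rejection_rate(random_walk)
rate_wn = rejection_rate(white_noise)
print(rate_rw, rate_wn)
```

For white noise the share of "significant" slopes stays near the nominal 5 %, whereas for independent random walks it is typically well above one half, although no relation between the series exists.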
In addition, one must distinguish two types of nonstationarity:
1. The first type of nonstationarity can be demonstrated simply by means of the
following linear trend model:
$$y_t = \alpha + \beta t + \varepsilon_t, \tag{6.69}$$
where εt is a white noise (see, e.g., Fig. 3.2). It is an example of so-called
deterministic nonstationarity, caused for instance by a deterministic trend (in our
case by a straight line); when the trend is eliminated, the time series becomes stationary
(e.g., white noise in our case).
2. The second type of nonstationarity is represented, e.g., by the model
$$y_t = \alpha + y_{t-1} + \varepsilon_t, \tag{6.70}$$
where εt is again the white noise with variance σ² (see, e.g., Fig. 6.8(a)), though
here one usually assumes in addition that εt ~ iid. It is an example of
stochastic nonstationarity, which can be modeled in some specific situations by
special (stochastic) models and then also made stationary by exploiting these
models in a suitable way. More specifically, the model (6.70) is the so-called random
walk with drift, and in this case, the corresponding time series can be
“stationarized” simply by transforming it to the time series of first differences
Δyt, since from the model (6.70) one easily obtains
$$\Delta y_t = \alpha + \varepsilon_t. \tag{6.71}$$
Fig. 6.8 (a) Index PX in the year 2016 (values for 251 trading days) from Example 6.5. (b) First
differences of time series from (a)
The right-hand side of (6.71), i.e., the white noise shifted to level α, is trivially
a stationary time series. The principle of stochastic nonstationarity in the case of
model (6.70) can be presented better if one rewrites it in the form
$$y_t = y_1 + \alpha (t - 1) + \sum_{\tau=2}^{t} \varepsilon_\tau. \tag{6.72}$$
The time series obviously has not only the deterministic trend (namely, the linear
trend with the slope α), but also a stochastic trend consisting in progressive
cumulation of white noise. From the interpretation point of view, the conditional
moments are also interesting (under the assumption of mutual independence of εt):
$$E(\Delta y_t \mid y_{t-1}, \ldots, y_1) = \alpha, \tag{6.73}$$
$$E(y_t \mid y_1) = y_1 + \alpha (t - 1), \quad \mathrm{var}(y_t \mid y_1) = \sigma^2 (t - 1), \quad \mathrm{corr}(y_{t-k}, y_t \mid y_1) = \sqrt{\frac{t - k - 1}{t - 1}}. \tag{6.74}$$
According to (6.73), this time series tends not to revert to the original level;
on the contrary, it drifts to higher values for α > 0 (or to lower values for α < 0),
since the growth rate of the mean value is O(t), while for the
standard deviation it is only O(√t). Even in the case of zero slope (α = 0), the
random walk without drift (in contrast to the white noise) intersects the horizontal
axis (i.e., the zero level) only rarely. Moreover, the relations (6.74) imply that the
mean level and variance (volatility) are unbounded, while the autocorrelation
function has values close to one and decreases to zero at a rate slower than linear.
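The conditional variance and correlation in (6.74) follow directly from the representation (6.72). A short derivation, assuming only that the εt are uncorrelated with common variance σ², can be written as:

```latex
\begin{align*}
y_t - E(y_t \mid y_1) &= \sum_{\tau=2}^{t} \varepsilon_\tau
  \quad\Longrightarrow\quad
  \mathrm{var}(y_t \mid y_1) = \sigma^2 (t - 1), \\
\mathrm{cov}(y_{t-k}, y_t \mid y_1)
  &= E\Bigl( \sum_{\tau=2}^{t-k} \varepsilon_\tau \sum_{\tau=2}^{t} \varepsilon_\tau \Bigr)
   = \sigma^2 (t - k - 1), \\
\mathrm{corr}(y_{t-k}, y_t \mid y_1)
  &= \frac{\sigma^2 (t - k - 1)}{\sqrt{\sigma^2 (t - k - 1)}\,\sqrt{\sigma^2 (t - 1)}}
   = \sqrt{\frac{t - k - 1}{t - 1}} .
\end{align*}
```

For a fixed lag k the correlation tends to one as t grows, which explains the very slow decline of the estimated correlogram of a random walk.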
Remark 6.15 Let us rewrite the relation (6.70) in a more general form
$$y_t = \alpha + \varphi_1 y_{t-1} + \varepsilon_t \tag{6.75}$$
(the relation (6.70) is the special case for $\varphi_1 = 1$). If $|\varphi_1| < 1$, then (6.75) is
obviously the stationary process AR(1) with nonzero mean value $\mu = \alpha / (1 - \varphi_1)$
$$y_t - \mu = \varphi_1 (y_{t-1} - \mu) + \varepsilon_t \tag{6.76}$$
(see Remark 6.12), which can be rewritten as $\Delta y_t = (\varphi_1 - 1)(y_{t-1} - \mu) + \varepsilon_t$. Then the
conditional mean value (6.73) of such a stationary process AR(1) obviously fulfills
$$E(\Delta y_t \mid y_{t-1}, \ldots, y_1) < 0 \ \text{ for } y_{t-1} > \mu, \qquad E(\Delta y_t \mid y_{t-1}, \ldots, y_1) > 0 \ \text{ for } y_{t-1} < \mu, \tag{6.77}$$
i.e., now in contrast to the random walk with drift, the process {yt} does not drift, but
it reverts to the previous level (so-called mean reversion). Finally, the remaining case
of $|\varphi_1| > 1$ is a very special one, since then {yt} is an explosive process comparable
with the powers $\varphi_1^k$ (e.g., the process $y_t = 2 y_{t-1} + \varepsilon_t$ behaves for large t like
the deterministic sequence $2^t$ regardless of the size of white noise εt).
⋄
Remark 6.16 Once more let us stress the distinction in stationarization for the
models described above:
• The stationarity of the model (6.69) with deterministic trend can be achieved
simply by regression methods eliminating trend. The stationarization based on
differences should not be used since it may lead to models with residuals in the
form of a (noninvertible) MA process
$$\Delta y_t = \beta + \varepsilon_t - \varepsilon_{t-1}. \tag{6.78}$$
• To achieve stationarity in the model (6.70) of random walk with drift it is
sufficient to construct the (first) differences. As far as the possibility of a regression
elimination of stochastic trend is concerned, it is not clear what we should
eliminate. If we extend the model (6.75) to the form
$$\Delta y_t = \alpha + \beta t + (\varphi_1 - 1) y_{t-1} + \varepsilon_t \quad \text{or} \quad y_t = \alpha + \beta t + \varphi_1 y_{t-1} + \varepsilon_t \tag{6.79}$$
with both deterministic and stochastic trend, then the effort to eliminate the trend
by means of regression methods would face the problem that, e.g., the t-ratio may not
have (not even asymptotically) the t-distribution. The model (6.79) can also be
rewritten as
$$y_t - \beta_0 - \beta_1 t = \varphi_1 \left( y_{t-1} - \beta_0 - \beta_1 (t - 1) \right) + \varepsilon_t, \quad \text{where } \beta_1 = \frac{\beta}{1 - \varphi_1}, \ \beta_0 = \frac{\alpha - \varphi_1 \beta_1}{1 - \varphi_1}, \tag{6.80}$$
i.e., if $|\varphi_1| < 1$, then (6.80) is in fact the stationary process AR(1) with linear
trend.
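The warning against differencing a trend-stationary series can be illustrated by simulation. In the sketch below a series following (6.69) is both differenced and detrended by an OLS line: the differences behave like the noninvertible MA(1) process (6.78), whose lag-one autocorrelation equals –1/2, while the regression residuals remain close to white noise. The parameter values, the seed, and the function names are arbitrary choices for the example.

```python
import random

def lag1_autocorr(x):
    """Sample autocorrelation of x at lag one."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    c1 = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    return c1 / c0

def detrend(y):
    """Residuals of the OLS regression of y_t on a constant and t = 1, ..., n."""
    n = len(y)
    ts = list(range(1, n + 1))
    mt, my = sum(ts) / n, sum(y) / n
    beta = sum((t - mt) * (v - my) for t, v in zip(ts, y)) / sum((t - mt) ** 2 for t in ts)
    alpha = my - beta * mt
    return [v - alpha - beta * t for t, v in zip(ts, y)]

rng = random.Random(42)
n, alpha, beta = 2000, 1.0, 0.05
y = [alpha + beta * t + rng.gauss(0.0, 1.0) for t in range(1, n + 1)]

dy = [y[t] - y[t - 1] for t in range(1, n)]   # over-differenced series (6.78)
r1_diff = lag1_autocorr(dy)
r1_resid = lag1_autocorr(detrend(y))
print(round(r1_diff, 3), round(r1_resid, 3))
```

The strong negative autocorrelation of the differences is an artifact introduced by the transformation itself and not a feature of the data.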
6.4.1 Tests of Unit Root
The possibility to stationarize the analyzed time series by means of differencing can
be considered as evidence of the existence of (nearly) unit roots of the
autoregressive operator of the given model (e.g., the autoregressive operator of the
model (6.70) obviously has a single root equal to one). In accord with the
previous discussion, the decision on the existence of such a unit root (or multiple
unit roots) is often the key point of the corresponding analysis, even if the form
of the estimated correlogram can indicate the presence of such a root (namely, a very
slow decline starting from one to zero, since the particular estimated
autocorrelations converge to 1 with the increasing length of the nonstationary time series).
However, a subjective inspection of estimated correlograms cannot usually
distinguish nonstationary models of the type $y_t = y_{t-1} + \varepsilon_t$ from stationary ones with
a nearly unit root of the type $y_t = 0.95 y_{t-1} + \varepsilon_t$, so that the application of a statistical
test with a prescribed significance level is recommended here.
6.4.1.1 Dickey–Fuller Test
DF test (see Dickey and Fuller (1979, 1981)) was the pioneering one among the tests
of the unit root. In particular, Dickey and Fuller suggested three variants, all denoted
as τ-tests:
(1) τ-test: H0: $y_t = y_{t-1} + \varepsilon_t$ versus H1: $y_t = \varphi_1 y_{t-1} + \varepsilon_t$ for $\varphi_1 < 1$, i.e., the
one-tailed test of random walk versus stationary AR(1) process (the possible
nonstationarity caused by $\varphi_1 \le -1$ is not important in practice);
(2) $\tau_\mu$-test: H0: $y_t = y_{t-1} + \varepsilon_t$ versus H1: $y_t = \alpha + \varphi_1 y_{t-1} + \varepsilon_t$ for $\varphi_1 < 1$, i.e., the
one-tailed test of random walk versus stationary AR(1) process with (nonzero)
level;
(3) $\tau_\tau$-test: H0: $y_t = y_{t-1} + \varepsilon_t$ versus H1: $y_t = \alpha + \beta t + \varphi_1 y_{t-1} + \varepsilon_t$ for $\varphi_1 < 1$, i.e.,
the one-tailed test of random walk versus stationary AR(1) process with linear
trend.
The null hypothesis in each of the three cases can be written simply as
$$H_0: \Delta y_t = \psi y_{t-1} + \varepsilon_t \quad \text{for } \psi = 0, \tag{6.81}$$
while the alternative generally as
$$H_1: \Delta y_t = \alpha + \beta t + \psi y_{t-1} + \varepsilon_t \quad \text{for } \psi < 0, \tag{6.82}$$
where $\psi = \varphi_1 - 1$, with α = β = 0 in variant (1) and β = 0 in variant (2). One should stress that all
alternatives consist in the inequality ψ < 0 only, and the equalities α = β = 0 in the
alternative (1) or β = 0 in the alternative (2) are not investigated at all (nor are the
numerical values of the intercept α or the slope β, whose correctness is not guaranteed
under nonstationarity caused by ψ = 0 anyhow).
The test statistic in each of the three variants of the DF test is the classical t-ratio (we test
simply the significance of the regression parameter ψ in the model (6.81))
$$DF = \frac{\hat{\psi}}{\hat{\sigma}(\hat{\psi})} \tag{6.83}$$
using the estimates constructed by means of the methodology from Sect. 6.3.2 and
with the critical region
$$DF \le t_\alpha(n). \tag{6.84}$$
However, under the null hypothesis ψ = 0, the test statistic DF does not have the
t-distribution (not even asymptotically or under the assumption of εt ~ iid) as is the
case of the classical t-ratio, but a nonstandard distribution, for which one must
calculate the critical value in (6.84) by means of simulations separately for particular
tests (1), (2), and (3) and for particular lengths n of time series (see selected critical
values for the asymptotic case n → ∞ in Table 6.9). It holds generally that this
distribution has heavier tails than the corresponding t-distribution, so that its critical
values are much higher in absolute value than for the t-distribution (e.g., the
critical value –3.41 for 5 % and n → ∞ is in absolute value about twice as large as the
critical value –1.645 for the classical t-test, i.e., one needs a more significant value of the
t-ratio to reject the null hypothesis of ψ = 0). The reason consists in the fact that one
applies a nonstationary regressor (see also the introduction of Sect. 6.4). Although
the critical values were originally calculated by Dickey and Fuller, nowadays software
systems use more sophisticated algorithms delivering directly the corresponding
p-values; see, e.g., MacKinnon (1996).

Table 6.9 Selected critical values of DF tests (for the asymptotic case of n → ∞)

Significance level                            10 %     5 %      1 %
Critical values for τ-test (n → ∞)            –1.62    –1.95    –2.58
Critical values for $\tau_\mu$-test (n → ∞)   –2.57    –2.86    –3.43
Critical values for $\tau_\tau$-test (n → ∞)  –3.12    –3.41    –3.96
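The simulation of critical values mentioned above can be sketched for the τ-test (variant (1), no constant and no trend). The code computes the t-ratio (6.83) from the regression of Δyt on yt–1 and approximates the 5 % critical value by the empirical quantile over simulated random walks. The series length 200, the 2000 replications, the seed, and the function names are assumptions made for the illustration; software such as EViews relies on MacKinnon's response surfaces instead.

```python
import math
import random

def df_statistic(y):
    """t-ratio of psi in the regression  Delta y_t = psi * y_{t-1} + eps_t."""
    lagged = y[:-1]
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]
    sxx = sum(v * v for v in lagged)
    psi = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    rss = sum((d - psi * v) ** 2 for v, d in zip(lagged, diffs))
    s2 = rss / (len(diffs) - 1)       # residual variance, one estimated parameter
    return psi / math.sqrt(s2 / sxx)

def simulate_ar1(rng, phi1, n):
    """AR(1) path without intercept started at zero; phi1 = 1 gives a random walk."""
    y = [0.0]
    for _ in range(n):
        y.append(phi1 * y[-1] + rng.gauss(0.0, 1.0))
    return y

rng = random.Random(123)
null_stats = sorted(df_statistic(simulate_ar1(rng, 1.0, 200)) for _ in range(2000))
crit_5 = null_stats[int(0.05 * len(null_stats))]   # empirical 5 % quantile under H0

stat_stationary = df_statistic(simulate_ar1(rng, 0.5, 200))
print(round(crit_5, 2), round(stat_stationary, 2))
```

The simulated quantile should lie near the value –1.95 of Table 6.9, well below the quantile –1.645 of the standard normal distribution, and a clearly stationary series produces a statistic far inside the critical region.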
6.4.1.2 Augmented Dickey-Fuller Test
The previous DF test is applicable only in the case that the residual component εt has
the form of independent white noise. However, if the model (6.81) explaining the
dependent variable Δyt exhibits autocorrelation that is not reflected correctly,
then the type-one error of the DF test (i.e., the probability of rejection of a valid H0) is
higher than the declared α. Mainly for this case, the so-called augmented DF test
(ADF test) has been suggested, which has the null hypothesis of the form
$$H_0: \Delta y_t = \psi y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta y_{t-i} + \varepsilon_t \quad \text{for } \psi = 0 \tag{6.85}$$
instead of (6.81). The test statistic and the critical values for particular variants (1),
(2), and (3) (i.e., for the τ-test, $\tau_\mu$-test, and $\tau_\tau$-test) are the same as before the
augmentation (the test concerns again the parameter ψ only). The added autoregressive terms
in (6.85) absorb the dynamic structure explaining the dependent variable. For the
identification of the order p of added autoregressive terms, it is recommended to apply the
information criteria (see Sect. 6.3.1.2).
6.4.1.3 Phillips–Perron Test
PP test (see Phillips and Perron (1988)) is similar to the ADF test, except that it models
the possible autocorrelation of residuals not by adding autoregressive terms as
in (6.85), but directly by correcting the estimated standard deviation in the
denominator of the original DF statistic (6.83). Essentially, it is the HAC approach
(heteroscedasticity and autocorrelation consistent covariances) to the linear regression
model with autocorrelated residuals; see, e.g., Newey and West (1987).
6.4.1.4 KPSS Test
KPSS test (see Kwiatkowski et al. (1992)) responds to the sometimes low power of
the DF test. For instance, one should reject the null hypothesis
of unit root for the theoretical model $y_t = 0.95 y_{t-1} + \varepsilon_t$. If this is not the case, then
either the model is really nonstationary or we do not have sufficient
information to reject the null hypothesis (e.g., only a short segment of the time series
$y_t = 0.95 y_{t-1} + \varepsilon_t$ is observed). Therefore, the KPSS test was designed in such a way that the
hypotheses H0 and H1 are just the opposite of those for the ADF test (i.e., the null hypothesis H0
represents stationarity versus nonstationarity in the alternative H1). Moreover, it is
recommended to carry out the ADF test and the KPSS test always together, with the
following conclusions: (a) if $H_0^{ADF}$ is rejected and simultaneously $H_0^{KPSS}$ cannot be
rejected, then the stationarity is confirmed; (b) if $H_0^{ADF}$ cannot be rejected and
simultaneously $H_0^{KPSS}$ is rejected, then the nonstationarity is confirmed; and (c) both
remaining combinations are regarded as inconclusive. To summarize the topic, the
previous tests (and others) can be found in modern software systems recommended
for time series analysis.
Nowadays the topic of unit root testing is very extensive, so that other
references should also be consulted for a deeper understanding (see, e.g., Brockwell
and Davis 1996, Section 6.3 or Heij et al. 2004, Section 7.3.3).
Example 6.5 Figure 6.8(a) and Table 6.10 show the values of index PX of Prague
Exchange (i.e., the time series {PXt}) for 251 trading days of the year 2016. This
time series seems to be the random walk $PX_t = PX_{t-1} + \varepsilon_t$ (see (6.70) for α = 0).
Table 6.11 shows the results of the DF test of type τ (i.e., with H1: $y_t = \varphi_1 y_{t-1} + \varepsilon_t$ for
$\varphi_1 < 1$): the null hypothesis of nonstationarity with one unit root is not rejected even
when applying the significance level of 10 %. This test coincides here with the
ADF test, since according to Table 6.11 the system EViews chooses the order p of
autoregressive terms in (6.85) automatically by means of the
information criterion SIC (the Schwarz information criterion, sometimes also denoted
as BIC; see Sect. 6.3.1.2), and the selected lag length is zero.
After transferring to the first differences ΔPXt, the previous DF test of type τ
rejects the null hypothesis of nonstationarity (i.e., the existence of second unit root in
the original time series PXt) applying the significance level of 1 % (see Table 6.12),
so that the construction of first differences is sufficient in order to make the series PXt
stationary.
Finally, Table 6.13 presents the results of KPSS test by means of EViews that
rejects significantly the null hypothesis of stationarity of the (nondifferenced) time
series PXt even when applying the significance level of 1 %. This result together with
the previous ADF test confirms unambiguously the nonstationarity of PXt.
⋄
Table 6.10 Index PX in year 2016 (values for 251 trading days written in columns) from Example 6.5 (see also Fig. 6.8(a))
1 2 3 4 5 6 7 8 9 10
1 938.23 872.53 913.94 916.94 890.28 808.21 849.79 879.83 915.33 886.31
2 941.07 852.97 909.99 918.60 891.30 816.91 862.37 870.08 922.50 888.14
3 936.17 863.74 910.20 919.60 890.47 824.43 856.79 868.29 921.35 885.13
4 916.09 847.23 898.32 912.35 892.68 826.29 859.18 864.48 928.28 879.33
5 924.04 845.92 914.85 914.89 893.76 814.58 861.20 860.81 935.36 881.22
6 918.64 875.81 908.44 913.71 888.21 811.26 863.87 861.83 934.05 885.05
7 919.62 861.37 901.14 909.01 888.41 820.31 861.30 865.53 919.18 887.20
8 914.73 877.55 890.47 916.04 879.51 827.31 856.30 864.77 925.46 886.66
9 898.62 878.51 888.93 909.34 892.27 826.17 850.80 875.87 921.78 894.86
10 881.12 871.22 893.17 896.67 895.20 844.90 850.98 874.05 908.80 894.24
11 867.85 886.64 900.82 886.94 874.05 863.55 847.60 869.10 902.89 899.57
12 873.98 879.15 899.91 886.83 867.79 870.26 846.29 866.34 909.66 900.71
13 855.92 855.43 892.92 867.79 840.05 876.21 850.72 874.57 893.82 905.43
14 859.87 865.67 896.85 864.20 818.38 882.07 858.06 863.58 899.00 911.09
15 886.77 865.37 889.53 869.46 808.20 887.50 855.79 868.59 897.95 903.02
16 883.85 857.61 884.39 866.93 817.58 891.37 853.19 875.13 897.76 911.98
17 892.81 871.89 884.30 871.43 815.86 892.44 852.94 881.09 901.69 917.59
18 902.56 879.32 899.33 873.87 831.21 889.42 858.83 889.38 900.99 912.46
19 909.43 883.06 893.56 882.66 838.94 893.42 859.14 885.72 905.11 917.48
20 921.07 889.69 887.01 869.31 842.31 887.26 866.37 891.00 904.68 917.55
21 914.71 888.01 895.62 873.38 852.05 881.74 875.71 894.57 889.62 917.53
22 902.20 892.56 895.62 873.17 855.26 880.08 880.84 890.74 884.41 916.75
23 886.72 886.71 906.63 875.19 819.58 876.28 882.42 886.64 884.00 920.35
24 897.48 896.15 904.65 874.83 790.09 857.86 881.41 899.16 892.29 923.54
25 904.79 907.56 915.23 876.22 806.43 856.10 884.80 906.97 888.72 919.58
26 921.61
Source: Prague Stock Exchange (https://www.pse.cz/en/indices/index-values/detail/XC0009698371?tab=detail-history)

Table 6.11 DF test of time series of index PX from Example 6.5 by means of EViews
Null Hypothesis: PX2016 has a unit root
Exogenous: None
Lag Length: 0 (Automatic - based on SIC, maxlag=15)
                                           t-Statistic    Prob.(a)
Augmented Dickey–Fuller test statistic     –0.202174      0.6126
Test critical values:    1% level          –2.574245
                         5% level          –1.942099
                         10% level         –1.615852
(a) MacKinnon (1996) one-sided p-values
Table 6.12 DF test of first differences of index PX from Example 6.5 by means of EViews
Null Hypothesis: D(PX2016) has a unit root
Exogenous: None
Lag Length: 0 (Automatic - based on SIC, maxlag=15)
                                           t-Statistic    Prob.(a)
Augmented Dickey–Fuller test statistic     –14.49788      0.0000
Test critical values:    1% level          –2.574282
                         5% level          –1.942104
                         10% level         –1.615849
(a) MacKinnon (1996) one-sided p-values
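The comparison of the test statistic with the critical values behind Tables 6.11 and 6.12 is a left-tailed check of the form (6.84). A minimal sketch, with the hypothetical helper `rejected_levels` and the numbers copied from the two tables, is:

```python
def rejected_levels(statistic, critical_values):
    """Significance levels at which a left-tailed test rejects H0."""
    return [level for level, crit in critical_values.items() if statistic <= crit]

# Index PX (Table 6.11) and its first differences (Table 6.12).
levels_px = rejected_levels(
    -0.202174, {"1%": -2.574245, "5%": -1.942099, "10%": -1.615852})
levels_dpx = rejected_levels(
    -14.49788, {"1%": -2.574282, "5%": -1.942104, "10%": -1.615849})

print(levels_px)    # no level rejects the unit root in PX
print(levels_dpx)   # every level rejects a unit root in the differences
```

The unit root is thus retained for the original series and rejected for its first differences, so that a single differencing is sufficient.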
Table 6.13 KPSS test of time series of index PX from Example 6.5 by means of EViews
Null Hypothesis: PX2016 is stationary
Exogenous: Constant
Bandwidth: 11 (Newey–West automatic) using Bartlett kernel
Kwiatkowski–Phillips–Schmidt–Shin test statistic    0.284476
Asymptotic critical values:    1% level             0.739000
                               5% level             0.463000
                               10% level            0.347000
6.4.2 Process ARIMA
Time series with a stochastic trend of type (6.70), which can be made stationary by differencing, are modeled in Box–Jenkins methodology as ARIMA processes. The integrated mixed process of order p, d, q, denoted ARIMA(p, d, q), has the form
$$\varphi(B)\,w_t = \alpha + \theta(B)\,\varepsilon_t, \tag{6.86}$$
where
$$w_t = \Delta^d y_t \tag{6.87}$$
is the dth difference of the original time series $y_t$ (see also (3.62)), which is modeled as a stationary (and invertible) ARMA(p, q) process in (6.86). In other words, the principle of ARIMA processes is as follows: (1) the modeled time series is first made stationary by suitable differencing, and then (2) the resulting stationary time series (denoted $w_t$ in (6.86) and (6.87)) is modeled by a mixed ARMA process. One usually writes this compactly as
$$\varphi(B)\,\Delta^d y_t = \alpha + \theta(B)\,\varepsilon_t. \tag{6.88}$$
An important special case is the integrated process I(d), mostly presented in the simple form
$$\Delta^d y_t = \varepsilon_t, \tag{6.89}$$
which can be constructed by “integrating” white noise; e.g., for d = 1 it holds
$$y_t = y_1 + \sum_{\tau=2}^{t} \varepsilon_\tau. \tag{6.90}$$
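The relation between (6.89) and (6.90) for d = 1 can be illustrated by a minimal sketch in plain Python. The seed, the sample size, the starting value, and the helper names `integrate` and `difference` are assumptions made only for this illustration: white noise is cumulatively summed into an I(1) series, and first differencing should then recover the noise.

```python
import random

def integrate(noise, y1=0.0):
    """Build y_t = y_1 + sum_{tau=2}^{t} eps_tau from eps_2, ..., eps_n (case d = 1)."""
    y = [y1]
    for eps in noise:
        y.append(y[-1] + eps)
    return y

def difference(y, d=1):
    """Apply the difference operator Delta d times."""
    for _ in range(d):
        y = [b - a for a, b in zip(y[:-1], y[1:])]
    return y

rng = random.Random(0)                           # fixed seed, illustrative only
eps = [rng.gauss(0.0, 1.0) for _ in range(200)]  # white noise eps_2, ..., eps_201
y = integrate(eps, y1=100.0)                     # random walk started at y_1 = 100
recovered = difference(y, d=1)                   # Delta y_t should reproduce eps_t
max_gap = max(abs(a - b) for a, b in zip(recovered, eps))
print(len(y), max_gap < 1e-9)
```

The same `difference` helper with d = 2 gives the second differences used when a series needs to be differenced twice.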
Remark 6.17 The drift parameter α serves to model a possible nonzero level of the process $w_t$, i.e., a deterministic trend in the form of a polynomial of order d for the original time series $\{y_t\}$. If d > 0, then the ARIMA model of the time series $y_t$ is invariant to shifting the time series by an arbitrary constant. Therefore, it makes no sense to center such a series by subtracting its sample mean before the analysis.
⋄
Remark 6.18 The operator $\varphi(B)\Delta^d$ on the left-hand side of model (6.88) is called the generalized autoregressive operator. It is characterized by the property that the corresponding polynomial $\varphi(z)(1-z)^d$ has p roots lying outside the unit circle in the complex plane and, in addition, a unit root of multiplicity d. More general types are the ARUMA processes, which have at least one root different from one but lying on the unit circle, and the explosive processes (see also Remark 6.15), which have at least one root inside the unit circle.
⋄
[Figure: estimated ACF (upper panel) and PACF (lower panel) for lags 1 to 24]
Fig. 6.9 Estimated correlogram and partial correlogram from Example 6.6 (index PX in the year 2016)
The construction of an ARIMA model is based on the construction of a stationary ARMA model for appropriate differences of the original time series (possibly after an initial transformation which linearizes the given time series before differencing it; see Sect. 4.3). In practice, the order of differencing d does not exceed two (routine time series of economic or financial character usually have d = 1; in particular, time series of consumer price indices or nominal salaries may require d = 2). There are various ways to find the order of differencing d for the modeled time series, e.g.:
• Unit root tests (see Sect. 6.4.1).
• A subjective examination of plots of the time series $y_t$, $\Delta y_t$, $\Delta^2 y_t$, ... including their estimated correlograms and partial correlograms (in particular, a slow (linear) decrease of the estimated autocorrelations indicates that further differencing is necessary; see Fig. 6.9).
• A comparison of the sample standard deviations (volatilities) of the time series $y_t$, $\Delta y_t$, $\Delta^2 y_t$, ... (one chooses the order of differencing that shows the lowest volatility; on the other hand, one must beware of overdifferencing, since the volatilities can increase again for higher d).
• The application of information criteria (see Sect. 6.3.1.2), which can be modified for ARIMA models.
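The volatility comparison from the list above can be sketched as follows. This is only an illustration on a simulated random walk (the seed, the length 500, and the function names are assumptions, not part of the original text); for real data one would pass the observed series instead.

```python
import random
import statistics

def difference(y):
    """First difference: Delta y_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(y[:-1], y[1:])]

def volatility_by_order(y, max_d=2):
    """Sample standard deviations of y, Delta y, ..., Delta^max_d y."""
    out = {}
    series = list(y)
    for d in range(max_d + 1):
        out[d] = statistics.stdev(series)
        series = difference(series)
    return out

# Simulated random walk (true order of differencing d = 1), illustrative only.
rng = random.Random(1)
y, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0.0, 1.0)
    y.append(level)

vol = volatility_by_order(y)
best_d = min(vol, key=vol.get)   # order with the lowest sample volatility
print(vol, best_d)
```

For a random walk the volatility typically drops sharply from d = 0 to d = 1 and rises again for d = 2, which is the overdifferencing effect mentioned in the list.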
Example 6.6 For index PX in the year 2016, Example 6.5 identified by means of unit root tests the random walk $PX_t = PX_{t-1} + \varepsilon_t$ (the linear decrease of the estimated correlogram of $\{PX_t\}$ in Fig. 6.9 also indicates the need to transform this time series by differencing). In Table 6.14, $\{PX_t\}$ was estimated by means of EViews in the form
$$\Delta PX_t = \varepsilon_t, \qquad \hat\sigma = 9.26$$
(the intercept (or drift parameter) α in the model $\Delta PX_t = \alpha + \varepsilon_t$ is highly insignificant; see Table 6.14).
⋄
Remark 6.19 One should correctly understand the meaning of constants in time series models. In stationary models, the constant is related to the mean value (i.e., the level) of the corresponding process. For example, for the MA(1) process $y_t = \alpha + \varepsilon_t + \theta_1\varepsilon_{t-1}$ it is directly $\mu = E(y_t) = \alpha$. For the stationary AR(1) process $y_t = \alpha + \varphi_1 y_{t-1} + \varepsilon_t$ it holds $\mu = \alpha/(1-\varphi_1)$. Finally, for the random walk with drift $y_t = \alpha + y_{t-1} + \varepsilon_t$, the constant α represents the slope of the linear trend of the process (even though this trend is significantly overlaid by the integrated random walk).
⋄
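The different roles of the constant can be checked numerically. The sketch below (parameter values and function names are illustrative assumptions) iterates the expected value $E(y_t) = \alpha + \varphi_1 E(y_{t-1})$: for a stationary AR(1) the level settles at $\alpha/(1-\varphi_1)$, while for $\varphi_1 = 1$ (random walk with drift) it grows linearly with slope α.

```python
def ar1_mean(alpha, phi):
    """Unconditional mean of a stationary AR(1): mu = alpha / (1 - phi)."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    return alpha / (1.0 - phi)

def expected_path(alpha, phi, start, steps):
    """Iterate E(y_t) = alpha + phi * E(y_{t-1}) from a given starting level."""
    level, path = start, []
    for _ in range(steps):
        level = alpha + phi * level
        path.append(level)
    return path

alpha, phi = 2.0, 0.6                                          # illustrative values
mu = ar1_mean(alpha, phi)                                      # 2 / 0.4 = 5
stationary = expected_path(alpha, phi, start=0.0, steps=60)    # converges to mu
drift_walk = expected_path(alpha, 1.0, start=0.0, steps=60)    # grows by alpha per step
print(mu, stationary[-1], drift_walk[-1])
```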
Table 6.14 Estimation of random walk from Example 6.6 (index PX in the year 2016) calculated by means of EViews
Dependent Variable: DPX2016
DPX2016 = C(1)
          Coefficient   Std. Error   t-Statistic   Prob.
C(1)      -0.066480     0.585607     -0.113523     0.9097
R-squared             0.000000     Mean dependent var      -0.066480
Sum squared resid     21,347.75    Schwarz criterion       7.307204
Log likelihood        -910.6397    Hannan–Quinn criter.    7.298787
Durbin–Watson stat    1.834323
Remark 6.20 In financial practice, we frequently model the time series of logarithmic rates of return $r_t$ (so-called log returns)
$$r_t = \ln\frac{P_t}{P_{t-1}} = \ln P_t - \ln P_{t-1} = p_t - p_{t-1} \tag{6.91}$$
for various financial assets (e.g., stocks or commodities) or price indices. These time series usually have a small positive constant mean value with added white noise
$$r_t = \mu + \varepsilon_t. \tag{6.92}$$
Then the corresponding time series of logarithmic prices $p_t = \ln P_t$ fulfills
$$p_t = \mu + p_{t-1} + \varepsilon_t, \tag{6.93}$$
which can be looked upon as a random walk with drift increasing approximately as $\mu t$.
⋄
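A short sketch of (6.91) in plain Python; the price path is hypothetical and serves only to show that the log returns are the first differences of the log prices, so that summing them rebuilds the log price series.

```python
import math

def log_returns(prices):
    """r_t = ln(P_t / P_{t-1}) = p_t - p_{t-1} with p_t = ln P_t."""
    return [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]

prices = [100.0, 102.0, 101.0, 105.0, 107.1]   # hypothetical price path
r = log_returns(prices)

# Rebuild the log prices by cumulating the log returns from p_1.
log_prices = [math.log(p) for p in prices]
rebuilt = [log_prices[0]]
for rt in r:
    rebuilt.append(rebuilt[-1] + rt)

total = sum(r)   # equals ln(P_n / P_1)
print(r, total)
```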
6.5 Stochastic Modeling of Seasonality
In addition to the stochastic modeling of trend, the methodology described in this chapter also enables us to model seasonality in a stochastic way (by means of the so-called seasonal models of Box–Jenkins methodology).
Let us consider, e.g., seasonality in monthly observations $y_t$ (i.e., s = 12). In such a case, one first constructs the following model for the time series of January observations only:
$$\Phi(B^{12})\,\Delta_{12}^{D}\,y_t = \Theta(B^{12})\,\eta_t, \tag{6.94}$$
where the time index skips across the January periods. The symbols used in formula (6.94) are as follows:
$$\Phi(B^{12}) = 1 - \Phi_1 B^{12} - \Phi_2 B^{24} - \ldots - \Phi_P B^{12P} \tag{6.95}$$
is the seasonal autoregressive operator of order P,
$$\Theta(B^{12}) = 1 + \Theta_1 B^{12} + \Theta_2 B^{24} + \ldots + \Theta_Q B^{12Q} \tag{6.96}$$
is the seasonal moving average operator of order Q, and
$$\Delta_{12} = 1 - B^{12} \tag{6.97}$$
is the seasonal difference operator; e.g., it holds
$$\Delta_{12}^{2}\,y_t = (1 - B^{12})^2 y_t = (1 - 2B^{12} + B^{24})\,y_t = y_t - 2y_{t-12} + y_{t-24} \tag{6.98}$$
(see also (3.63)). The model (6.94) can be looked upon as an ARIMA process describing the development of January observations. Similar models are constructed for the time series that skips across the February observations only, and so on. Let us suppose now that the models for particular months are approximately the same. However, the random components $\eta_t$ in these models should be mutually correlated in time since there can exist, e.g., a relation between January and February values. Therefore, let us assume that the time series $\{\eta_t\}$ is also described by an ARIMA model of the form
$$\varphi(B)\,\Delta^d \eta_t = \theta(B)\,\varepsilon_t, \tag{6.99}$$
where $\varepsilon_t$ is finally white noise and the time index runs in the classical way. Obviously, the models (6.94) and (6.99) can be linked together into a single model of the form
$$\varphi(B)\,\Phi(B^{12})\,\Delta^d \Delta_{12}^{D}\,y_t = \theta(B)\,\Theta(B^{12})\,\varepsilon_t. \tag{6.100}$$
The model (6.100) is called the multiplicative seasonal process of order (p, d, q) × (P, D, Q)_12 and is usually denoted by the acronym SARIMA (the adjective “multiplicative” expresses the fact that the operators of models (6.94) and (6.99) are multiplied with each other). For example, the process SARIMA (0, 1, 1) × (0, 1, 1)_12 has the form
$$(1 - B)(1 - B^{12})\,y_t = (1 + \theta_1 B)(1 + \Theta_1 B^{12})\,\varepsilon_t \tag{6.101}$$
or equivalently
$$y_t - y_{t-1} - y_{t-12} + y_{t-13} = \varepsilon_t + \theta_1\varepsilon_{t-1} + \Theta_1\varepsilon_{t-12} + \theta_1\Theta_1\varepsilon_{t-13}. \tag{6.102}$$
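The step from (6.101) to (6.102) is a multiplication of polynomials in the lag operator B. A minimal sketch (the parameter values and the helper names `poly_mul` and `seasonal_poly` are assumptions for illustration) represents each operator as a list of coefficients indexed by the power of B and multiplies them out.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists (index = power of B)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def seasonal_poly(coef, s):
    """Return 1 + coef * B^s as a coefficient list."""
    return [1.0] + [0.0] * (s - 1) + [coef]

s = 12
theta1, Theta1 = 0.4, 0.6   # illustrative MA parameters

ar_side = poly_mul([1.0, -1.0], seasonal_poly(-1.0, s))       # (1 - B)(1 - B^12)
ma_side = poly_mul([1.0, theta1], seasonal_poly(Theta1, s))   # (1 + theta1 B)(1 + Theta1 B^12)

# Keep only the nonzero coefficients: {power of B: coefficient}.
nonzero_ar = {k: c for k, c in enumerate(ar_side) if c != 0.0}
nonzero_ma = {k: c for k, c in enumerate(ma_side) if c != 0.0}
print(nonzero_ar)
print(nonzero_ma)
```

The nonzero lags 0, 1, 12, and 13 on both sides correspond to the terms of (6.102); replacing s = 12 by s = 4 gives the quarterly analogue.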
Clearly, the number twelve is replaced by four in the case of quarterly seasonality. There are also additive seasonal processes, but they are applied only rarely in practice: compared with (6.104), a simple example of an additive seasonal process is
$$y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_{12}\varepsilon_{t-12} + \theta_{13}\varepsilon_{t-13}. \tag{6.103}$$
The previous SARIMA models can be completed by a deterministic trend (see Example 6.7).
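The construction of seasonal models is realized in three steps similarly as for ARIMA models (see Sect. 6.3). Here the first (identification) step is more difficult since the shapes of correlograms and partial correlograms may be more complex due to seasonality. For example, the process SARIMA (0, 0, 1) × (0, 0, 1)_12, i.e.,
$$y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \Theta_1\varepsilon_{t-12} + \theta_1\Theta_1\varepsilon_{t-13}, \tag{6.104}$$
has the following variance:
$$\sigma_y^2 = \left(1 + \theta_1^2 + \Theta_1^2 + \theta_1^2\Theta_1^2\right)\sigma^2 = \left(1 + \theta_1^2\right)\left(1 + \Theta_1^2\right)\sigma^2 \tag{6.105}$$
and the following autocorrelation function:
$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2}, \quad \rho_{12} = \frac{\Theta_1}{1 + \Theta_1^2}, \quad \rho_{11} = \rho_{13} = \frac{\theta_1\Theta_1}{\left(1 + \theta_1^2\right)\left(1 + \Theta_1^2\right)}, \quad \rho_k = 0 \ \text{for } k \neq 1, 11, 12, 13. \tag{6.106}$$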
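The autocorrelations in (6.106) can be verified directly from the moving average coefficients of (6.104), since for any finite MA process the autocovariance at lag k is $\sigma^2\sum_j \psi_j\psi_{j+k}$. The sketch below uses illustrative parameter values and a helper name (`ma_autocorrelations`) chosen only for this example.

```python
def ma_autocorrelations(psi, max_lag):
    """ACF of y_t = sum_j psi[j] * eps_{t-j} (psi[0] = 1) for lags 1..max_lag."""
    gamma0 = sum(c * c for c in psi)
    acf = {}
    for k in range(1, max_lag + 1):
        # range() is empty once k exceeds the MA order, so gamma_k is then 0
        gamma_k = sum(psi[j] * psi[j + k] for j in range(len(psi) - k))
        acf[k] = gamma_k / gamma0
    return acf

theta1, Theta1 = 0.5, 0.8          # illustrative parameters
psi = [0.0] * 14                   # coefficients at lags 0..13, see (6.104)
psi[0], psi[1], psi[12], psi[13] = 1.0, theta1, Theta1, theta1 * Theta1

acf = ma_autocorrelations(psi, max_lag=15)

# Closed forms from (6.106)
rho1 = theta1 / (1 + theta1 ** 2)
rho12 = Theta1 / (1 + Theta1 ** 2)
rho11 = theta1 * Theta1 / ((1 + theta1 ** 2) * (1 + Theta1 ** 2))
print(acf[1], acf[11], acf[12], acf[13])
```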
In general, the process SARIMA (0, 0, q) × (0, 0, Q)_12 has nonzero autocorrelations only for arguments 1, ..., q, 12 – q, ..., 12 + q, 24 – q, ..., 24 + q, ..., 12Q – q, ..., 12Q + q (and analogously for the partial autocorrelations of the process SARIMA (p, 0, 0) × (P, 0, 0)_12). In practice, one usually chooses among several alternatives offered by suitable software (see, e.g., EViews).
Example 6.7 Figure 6.10 and Table 4.4 show the time series $y_t$ (t = 1, ..., 96) of the job applicants kept in the Czech labor office register for particular months 2009M1-2016M12 (the same data with a different time range were analyzed in Sect. 4.1 applying decomposition methods; see Examples 4.2 and 4.4).
By means of EViews (see Table 6.15) one has constructed the multiplicative seasonal model SARIMA (1, 0, 0) × (0, 1, 0)_12 (with deterministic trend) of the form
$$(1 - 0.93B)(1 - B^{12})\,y_t = -56\,506.21 + 51.11\,t + \varepsilon_t, \qquad \hat\sigma = 9\,658.47$$
or equivalently
$$y_t - 0.93\,y_{t-1} - y_{t-12} + 0.93\,y_{t-13} = -56\,506.21 + 51.11\,t + \varepsilon_t, \qquad \hat\sigma = 9\,658.47.$$
⋄
Remark 6.21 Seasonal time series in financial practice are frequently modeled using the so-called airline model, which is the process SARIMA (0, 1, 1) × (0, 1, 1)_s
$$(1 - B)(1 - B^{s})\,y_t = (1 + \theta_1 B)(1 + \Theta_1 B^{s})\,\varepsilon_t. \tag{6.107}$$
⋄
[Figure: line plot for 2009 to 2016, vertical axis from 350,000 to 650,000; series: job applicants, job applicants estimated by SARIMA]
Fig. 6.10 Monthly data 2009M1-2016M12 and the values estimated by model SARIMA in Example 6.7 (job applicants kept in Czech labor office register); see Table 4.4. Source: Czech Statistical Office
Table 6.15 Estimation of the process SARIMA (1, 0, 0) × (0, 1, 0)_12 (with deterministic trend) from Example 6.7 (job applicants kept in Czech labor office register) calculated by means of EViews
Dependent Variable: JOBAPPLICANTS-JOBAPPLICANTS(-12)
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -56,506.21    70,358.24    -0.803121     0.4243
T          51.11415      845.0047     0.060490      0.9519
AR(1)      0.931628      0.023134     40.27022      0.0000
R-squared             0.971096     Mean dependent var      -12,300.01
Adjusted R-squared    0.970373     S.D. dependent var      56,113.27
Sum squared resid     7.46E+09     Schwarz criterion       21.31196
Log likelihood        -877.8181    Hannan–Quinn criter.    21.25966
F-statistic           1343.879     Durbin–Watson stat      0.865019
Prob(F-statistic)     0.000000
6.6 Predictions in Box–Jenkins Methodology
One of the convenient features of Box–Jenkins methodology is the easy construction of predictions. We shall demonstrate the basic principles of the corresponding forecasting philosophy by means of a stationary and invertible ARMA(p, q) process with zero mean value
$$y_t = \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}. \tag{6.108}$$
Similarly as in the previous chapters, the symbol $\hat y_{t+k}(t)$ will denote the prediction of the value $y_{t+k}$ constructed at time t, i.e., the prediction for time t + k made at time t (k-step-ahead prediction).
For simplicity, we shall construct the linear prediction, i.e., the prediction which is a linear function of the values $y_t, y_{t-1}, \ldots$ or equivalently a linear function of $\varepsilon_t, \varepsilon_{t-1}, \ldots$ (since we assume stationarity and invertibility). In addition, the mean square error of the constructed prediction
$$MSE = E\left(y_{t+k} - \hat y_{t+k}(t)\right)^2 \tag{6.109}$$
should be minimal over all linear predictions. If one takes such a prediction in the form
$$\hat y_{t+k}(t) = \psi_k^{*}\,\varepsilon_t + \psi_{k+1}^{*}\,\varepsilon_{t-1} + \ldots, \tag{6.110}$$
and keeps the usual form of linear processes
$$y_{t+k} = \varepsilon_{t+k} + \psi_1\varepsilon_{t+k-1} + \psi_2\varepsilon_{t+k-2} + \ldots + \psi_{k-1}\varepsilon_{t+1} + \psi_k\varepsilon_t + \ldots \tag{6.111}$$
(see (6.17)), then obviously one should look for coefficients $\psi_k^{*}, \psi_{k+1}^{*}, \ldots$ which minimize the expression
$$\left(1 + \psi_1^2 + \ldots + \psi_{k-1}^2 + \sum_{j=k}^{\infty}\left(\psi_j - \psi_j^{*}\right)^2\right)\sigma^2. \tag{6.112}$$
Evidently, the expression (6.112) attains its minimum for
$$\psi_j^{*} = \psi_j, \qquad j = k, k+1, \ldots. \tag{6.113}$$
Hence we have derived that
$$\hat y_{t+k}(t) = \psi_k\varepsilon_t + \psi_{k+1}\varepsilon_{t-1} + \ldots. \tag{6.114}$$
Moreover, the error of the prediction $\hat y_{t+k}(t)$ defined as
$$e_{t+k}(t) = y_{t+k} - \hat y_{t+k}(t) \tag{6.115}$$
can obviously be expressed as
$$e_{t+k}(t) = \varepsilon_{t+k} + \psi_1\varepsilon_{t+k-1} + \ldots + \psi_{k-1}\varepsilon_{t+1} \tag{6.116}$$
with zero mean value and variance
$$\mathrm{var}\left(e_{t+k}(t)\right) = \left(1 + \psi_1^2 + \ldots + \psi_{k-1}^2\right)\sigma^2. \tag{6.117}$$
In particular, it holds
$$e_t(t-1) = y_t - \hat y_t(t-1) = \varepsilon_t, \tag{6.118}$$
i.e., the white noise can be looked upon as the one-step-ahead prediction errors (this fact also justifies why one sometimes calls the white noise an innovation; see also Remark 6.4).
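The variance formula (6.117) only needs the ψ-weights of the linear process representation. A minimal sketch follows, assuming the sign convention of (6.108); the function names and the parameter values are illustrative. The weights are obtained by the standard recursion $\psi_j = \theta_j + \sum_{i=1}^{\min(j,p)} \varphi_i\psi_{j-i}$ with $\psi_0 = 1$.

```python
def psi_weights(phi, theta, n):
    """psi_0..psi_n of an ARMA process written as in (6.108), with psi_0 = 1."""
    psi = [1.0]
    for j in range(1, n + 1):
        value = theta[j - 1] if j <= len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            value += phi[i - 1] * psi[j - i]
        psi.append(value)
    return psi

def forecast_error_variance(phi, theta, k, sigma2=1.0):
    """var(e_{t+k}(t)) = (1 + psi_1^2 + ... + psi_{k-1}^2) * sigma^2, see (6.117)."""
    psi = psi_weights(phi, theta, k - 1)
    return sigma2 * sum(c * c for c in psi)

phi, theta = [0.5], [0.3]   # illustrative ARMA(1,1)
variances = [forecast_error_variance(phi, theta, k) for k in range(1, 6)]
print(variances)
```

For a stationary process these variances are nondecreasing in k and approach the process variance, here $(1 + 2\varphi_1\theta_1 + \theta_1^2)\sigma^2/(1-\varphi_1^2)$ for the ARMA(1,1) case.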
So far we have dealt with theoretical features of predictions. Now we will show how to construct predictions according to Box–Jenkins methodology in practice. As it holds
$$y_{t+k} = \varphi_1 y_{t+k-1} + \ldots + \varphi_p y_{t+k-p} + \varepsilon_{t+k} + \theta_1\varepsilon_{t+k-1} + \ldots + \theta_q\varepsilon_{t+k-q}, \tag{6.119}$$
one can write (due to the linearity of predictions)
$$\hat y_{t+k}(t) = \varphi_1\hat y_{t+k-1}(t) + \ldots + \varphi_p\hat y_{t+k-p}(t) + \hat\varepsilon_{t+k}(t) + \theta_1\hat\varepsilon_{t+k-1}(t) + \ldots + \theta_q\hat\varepsilon_{t+k-q}(t). \tag{6.120}$$
The relation (6.120) is the basic one for practical calculations of predictions since one can substitute
$$\hat y_{t+j}(t) = y_{t+j} \quad \text{for } j \le 0, \tag{6.121}$$
$$\hat\varepsilon_{t+j}(t) = \begin{cases} 0 & \text{for } j > 0, \\ \varepsilon_{t+j} = y_{t+j} - \hat y_{t+j}(t+j-1) & \text{for } j \le 0. \end{cases} \tag{6.122}$$
Considering the previous relations, it is possible to formulate the following algorithm for practical calculations of predictions (this algorithm can also be used in the case of nonstationary models of the type ARIMA and SARIMA):
(a) One proceeds recursively from a suitable time t, i.e., at first we construct one-step-ahead predictions
$$\hat y_{t+1}(t),\ \hat y_{t+2}(t+1),\ \ldots,$$
then two-step-ahead predictions (by means of one-step-ahead predictions)
$$\hat y_{t+2}(t),\ \hat y_{t+3}(t+1),\ \ldots$$
etc., until we reach the prediction horizon and the prediction time (mostly the end t = n of the observed time series) which correspond to our real prediction problem.
(b) To realize (a), one makes use of formula (6.120) with estimated parameters, substituting relations (6.121) and (6.122) (in order to start the recursive calculations, one must choose initial values, e.g., in the model MA(q) one can start with $\varepsilon_1 = \varepsilon_2 = \ldots = \varepsilon_q = 0$).
(c) One can also construct interval predictions. For example, assuming normality, the 95 % prediction interval can be approximated by means of (6.117) as
$$\left(\hat y_{t+k}(t) - 2\hat\sigma\Big(1 + \sum_{j=1}^{k-1}\hat\psi_j^2\Big)^{1/2},\ \ \hat y_{t+k}(t) + 2\hat\sigma\Big(1 + \sum_{j=1}^{k-1}\hat\psi_j^2\Big)^{1/2}\right). \tag{6.123}$$
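Steps (a) and (b) of the algorithm above can be sketched for a zero-mean ARMA model as follows. This is only an illustrative implementation, not the procedure of any particular software: the function name `arma_forecasts`, the start-up rule (the first max(p, q) residuals are set to zero), and the numerical inputs are assumptions of this sketch.

```python
def arma_forecasts(y, phi, theta, horizon):
    """k-step-ahead forecasts (k = 1..horizon) from the end of y for a zero-mean ARMA.

    Residuals are built recursively as eps_t = y_t - yhat_t(t-1), cf. (6.122).
    Start-up assumption of this sketch: the first max(p, q) residuals are set to 0,
    and len(y) >= max(p, q).
    """
    p, q = len(phi), len(theta)
    start = max(p, q)
    eps = [0.0] * len(y)
    for t in range(start, len(y)):
        one_step = sum(phi[i] * y[t - 1 - i] for i in range(p))
        one_step += sum(theta[j] * eps[t - 1 - j] for j in range(q))
        eps[t] = y[t] - one_step
    hist_y, hist_e = list(y), list(eps)
    out = []
    for _ in range(horizon):
        t = len(hist_y)
        pred = sum(phi[i] * hist_y[t - 1 - i] for i in range(p))
        pred += sum(theta[j] * hist_e[t - 1 - j] for j in range(q))
        out.append(pred)
        hist_y.append(pred)   # (6.121): a future value is replaced by its forecast
        hist_e.append(0.0)    # (6.122): a future shock is replaced by zero
    return out

# AR(1) with phi_1 = 0.5: forecasts decay geometrically from the last observation.
print(arma_forecasts([1.0, 2.0], [0.5], [], 3))

# Seasonal MA written as MA(5): y_t = eps_t - 0.4 eps_{t-1} + 0.5 eps_{t-4} - 0.2 eps_{t-5}
ma = [-0.4, 0.0, 0.0, 0.5, -0.2]
print(arma_forecasts([1.0, 2.0, 3.0, 4.0, 5.0, 10.0], [], ma, 4))
```

Interval predictions as in step (c) would additionally need the ψ-weights of the fitted model and an estimate of σ.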
Remark 6.22 Let us stress once more that in practice one substitutes the estimated parameters into the prediction formulas (see, e.g., (6.123)). Fortunately, in routine situations the predictions remain acceptable after such a substitution (particularly for longer time series).
⋄
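Example 6.8 This example demonstrates how to construct predictions in three estimated models of different types:
1. Stationary AR(1) process with deterministic linear trend:
$$y_t - 9.58 - 1.75\,t = 0.48\left(y_{t-1} - 9.58 - 1.75\,(t-1)\right) + \varepsilon_t$$
(see (6.79) and (6.80)); one calculates step by step:
$$\hat y_2(1) = 9.58 + 1.75\cdot 2 + 0.48\left(y_1 - 9.58 - 1.75\cdot 1\right) = 7.64 + 0.48\,y_1$$
$$\hat y_3(2) = 9.58 + 1.75\cdot 3 + 0.48\left(y_2 - 9.58 - 1.75\cdot 2\right) = 8.55 + 0.48\,y_2$$
⋮
$$\hat y_3(1) = 9.58 + 1.75\cdot 3 + 0.48\left(\hat y_2(1) - 9.58 - 1.75\cdot 2\right) = 8.55 + 0.48\,\hat y_2(1)$$
$$\hat y_4(2) = 9.58 + 1.75\cdot 4 + 0.48\left(\hat y_3(2) - 9.58 - 1.75\cdot 3\right) = 9.46 + 0.48\,\hat y_3(2)$$
⋮
2. Process ARIMA(0, 1, 1):
$$y_t = y_{t-1} + \varepsilon_t + 0.39\,\varepsilon_{t-1};$$
one again calculates step by step (for $\varepsilon_1 = 0$):
$$\hat y_2(1) = y_1 + 0.39\,\varepsilon_1 = y_1$$
$$\hat y_3(2) = y_2 + 0.39\left(y_2 - \hat y_2(1)\right)$$
⋮
$$\hat y_3(1) = \hat y_2(1)$$
$$\hat y_4(2) = \hat y_3(2)$$
(obviously, it holds in general $\hat y_{t+k}(t) = \hat y_{t+1}(t)$ for $k \ge 1$).
3. Process SARIMA (0, 0, 1) × (0, 0, 1)_4:
$$y_t = \varepsilon_t - 0.4\,\varepsilon_{t-1} + 0.5\,\varepsilon_{t-4} - 0.2\,\varepsilon_{t-5};$$
one calculates step by step (for $\varepsilon_1 = \ldots = \varepsilon_5 = 0$):
$$\hat y_6(5) = 0$$
$$\hat y_7(6) = -0.4\left(y_6 - \hat y_6(5)\right) = -0.4\,y_6$$
$$\hat y_8(7) = -0.4\left(y_7 - \hat y_7(6)\right) = -0.4\,y_7 - 0.16\,y_6$$
⋮
$$\hat y_7(5) = \hat y_8(6) = \hat y_9(7) = 0$$
$$\hat y_{10}(8) = 0.5\left(y_6 - \hat y_6(5)\right) = 0.5\,y_6$$
⋮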
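As far as interval predictions are concerned, e.g., in the second example (i.e., for ARIMA(0, 1, 1)) one can write
$$y_{t+k} - y_t = \varepsilon_{t+k} + (1 + 0.39)\left(\varepsilon_{t+k-1} + \ldots + \varepsilon_{t+1}\right) + 0.39\,\varepsilon_t.$$
The last term $0.39\,\varepsilon_t$ is known at time t and therefore belongs to the prediction $\hat y_{t+k}(t) = y_t + 0.39\,\varepsilon_t$ rather than to its error, so that in accordance with (6.117)
$$\hat\sigma\left(e_{t+k}(t)\right) = \left(1 + (k-1)(1 + 0.39)^2\right)^{1/2}\hat\sigma,$$
and hence the 95 % prediction interval can be approximated by
$$\left(\hat y_{t+k}(t) - 2\,\hat\sigma\left(e_{t+k}(t)\right),\ \ \hat y_{t+k}(t) + 2\,\hat\sigma\left(e_{t+k}(t)\right)\right).$$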
⋄
[Figure: two panels. (a) Point prediction (Dreimonatsgeld) with interval prediction (± 2 S.E.), years 2000 to 2004, vertical axis 0 to 14. (b) Point prediction of job applicants with interval prediction (± 2 S.E.), months 01 to 12 of 2017, vertical axis 240,000 to 440,000]
Fig. 6.11 Point and 95 % interval predictions from Example 6.9 for (a) 3-month interbank interest rate (Dreimonatsgeld in % p.a.) in Germany for years 2000–2004 (see also Example 6.1); (b) job applicants kept in the Czech labor office register for particular months 2017M1-2017M12 (see also Example 6.7) calculated by means of EViews
Remark 6.23 The behavior of predictions is different for stationary and nonstationary processes. If the prediction horizon increases in a stationary process, then the prediction converges to the mean value of the process (mean reversion) and the variance of the prediction error converges to the variance of the process: $\hat y_{t+k}(t) \to E(y_t) = \mu$ in the sense of convergence in mean square (it follows from (6.114) for $k \to \infty$, where μ = 0), and $\mathrm{var}(e_{t+k}(t)) \to \mathrm{var}(y_t) = \sigma_y^2$ in a nondecreasing way (it follows from (6.116), again for $k \to \infty$). On the contrary, in nonstationary processes (e.g., ARIMA), if the prediction horizon increases, then the width of the prediction interval grows to infinity: $\mathrm{var}(e_{t+k}(t)) \to \infty$. Therefore, the applicability of these predictions is more and more dubious for $k \to \infty$, and the (unconditional) variance of such processes is also infinite (i.e., $y_t$ can attain any real value for sufficiently large t). In any case, the prediction band composed of the particular prediction intervals has a “funnel” shape (see, e.g., Fig. 6.11).
⋄
Remark 6.24 If we write an estimated model ARIMA(0, 1, 1) as
$$\Delta y_t = \varepsilon_t - \hat\theta\,\varepsilon_{t-1}, \tag{6.124}$$
then the predictions have the form
$$\hat y_{t+1}(t) = \left(1 - \hat\theta\right)\sum_{j=0}^{\infty}\hat\theta^{\,j}\,y_{t-j}, \tag{6.125}$$
i.e., they coincide with predictions according to simple exponential smoothing if one chooses the discount constant $\beta = \hat\theta$ (i.e., the smoothing constant $\alpha = 1 - \hat\theta$; see (3.75)).
⋄
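The coincidence with simple exponential smoothing can be checked numerically. In the sketch below, the data, the value of θ, and the initial forecast `init` are hypothetical; with a finite sample the weighted sum in (6.125) is truncated, so the remaining weight $\theta^t$ is placed on the initial value.

```python
def arima011_one_step(y, theta, init):
    """One-step forecast for Delta y_t = eps_t - theta * eps_{t-1}.

    yhat_{t+1}(t) = y_t - theta * eps_t with eps_t = y_t - yhat_t(t-1);
    `init` is the (assumed) forecast of the first observation.
    """
    pred = init
    for obs in y:
        eps = obs - pred
        pred = obs - theta * eps
    return pred

def exponential_smoothing(y, alpha, init):
    """Simple exponential smoothing: level = alpha * y_t + (1 - alpha) * level."""
    level = init
    for obs in y:
        level = alpha * obs + (1.0 - alpha) * level
    return level

def weighted_sum(y, theta, init):
    """(1 - theta) * sum_j theta^j * y_{t-j}, plus the weight left on the initial value."""
    t = len(y)
    total = sum((1.0 - theta) * theta ** j * y[t - 1 - j] for j in range(t))
    return total + theta ** t * init

y = [10.0, 12.0, 11.0, 13.0, 12.5]   # hypothetical observations
theta, init = 0.3, 10.0
a = arima011_one_step(y, theta, init)
b = exponential_smoothing(y, alpha=1.0 - theta, init=init)
c = weighted_sum(y, theta, init)
print(a, b, c)
```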
Example 6.9 Figures 6.11(a) and (b) plot the point and 95 % interval predictions for (a) the 3-month interbank interest rate (Dreimonatsgeld in % p.a.) in Germany for years 2000–2004 (see Example 6.1); (b) job applicants kept in the Czech labor office register for particular months 2017M1-2017M12 (see Example 6.7).
For example, one can see in Fig. 6.11(a) that the predictions of the 3-month interbank interest rate stabilize with increasing prediction horizon at the level of 8 %, the unconditional mean value of the given process.
⋄
6.7 Long Memory Process
Some financial (but also economic or hydrologic) data remain autocorrelated even over very long time distances. Strictly speaking, their estimated correlogram and partial correlogram decrease hyperbolically (at a polynomial rate; see (6.132)). Such a rate lies between the very slow linear decrease for ARIMA processes with some characteristic roots nearly on the border of the unit circle and the very fast exponential decrease for stationary ARMA processes. Time series of this type are usually called long-memory processes (or persistent processes). A successful way to model them is the so-called fractional differencing (see, e.g., Hurst (1951) in hydrology and Granger (1980) in economics).
The simplest example of a long-memory process is the fractionally integrated process of order d (d is a non-integer), denoted by the acronym FI(d), of the form
$$(1 - B)^d y_t = \varepsilon_t \quad \text{or} \quad \Delta^d y_t = \varepsilon_t. \tag{6.126}$$
For this process, one should distinguish the following cases:
1. d < 0.5: The process is (weakly) stationary with representation in the form of a linear process
$$y_t = \varepsilon_t + \sum_{i=1}^{\infty}\psi_i\,\varepsilon_{t-i}, \tag{6.127}$$
where
$$\psi_i = \frac{d(d+1)\cdots(d+i-1)}{i!} \tag{6.128}$$
(it follows from the expansion of $(1-z)^{-d}$ into a power series). The process is nonstationary for $d \ge 0.5$.
2. d > –0.5: The process is invertible with representation in the form of an (infinite) autoregressive process
$$y_t = \sum_{i=1}^{\infty}\pi_i\,y_{t-i} + \varepsilon_t, \tag{6.129}$$
where
$$\pi_i = (-1)^{i+1}\,\frac{d(d-1)\cdots(d-i+1)}{i!} \tag{6.130}$$
(it follows from the expansion of $(1-z)^{d}$ into a power series). The process is noninvertible for $d \le -0.5$.
3. –0.5 < d < 0.5: The process is stationary and invertible with autocorrelation and partial autocorrelation function of the form
$$\rho_k = \frac{d(d+1)\cdots(d+k-1)}{(1-d)(2-d)\cdots(k-d)}, \qquad \rho_{kk} = \frac{d}{k-d}, \qquad k = 1, 2, \ldots. \tag{6.131}$$
In particular, for large lags k it holds
$$\psi_k = O\left(k^{d-1}\right), \qquad \pi_k = O\left(k^{-d-1}\right), \qquad \rho_k = O\left(k^{2d-1}\right). \tag{6.132}$$
4. 0 < d < 0.5: The autocorrelation function is positive and fulfills
$$\sum_{k=1}^{\infty}\rho_k = \infty. \tag{6.133}$$
Particularly in this case one explicitly uses the term long-memory process (or persistent process). The partial sums $y_1 + \ldots + y_t$ grow at a quicker rate than the linear one.
5. –0.5 < d < 0: The autocorrelation function fulfills
$$\sum_{k=1}^{\infty}|\rho_k| < \infty. \tag{6.134}$$
In this case, one uses the term intermediate-memory process or antipersistent process. The partial sums $y_1 + \ldots + y_t$ grow at a slower rate than the linear one.
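The coefficients of the FI(d) process are easy to generate by one-term recursions that follow directly from the product forms in (6.128) and (6.131). The sketch below is illustrative only (d = 0.3 and the function names are assumptions); it also expands $(1-z)^d$ itself and checks that multiplying the two power series $(1-z)^{-d}$ and $(1-z)^{d}$ gives 1.

```python
def fi_psi_weights(d, n):
    """psi_0..psi_n of (1 - z)^(-d): psi_i = psi_{i-1} * (d + i - 1) / i, cf. (6.128)."""
    psi = [1.0]
    for i in range(1, n + 1):
        psi.append(psi[-1] * (d + i - 1) / i)
    return psi

def fi_diff_coefficients(d, n):
    """Coefficients c_0..c_n of (1 - z)^d: c_i = c_{i-1} * (i - 1 - d) / i."""
    c = [1.0]
    for i in range(1, n + 1):
        c.append(c[-1] * (i - 1 - d) / i)
    return c

def fi_autocorrelations(d, n):
    """rho_1..rho_n of FI(d): rho_k = rho_{k-1} * (d + k - 1) / (k - d), cf. (6.131)."""
    rho, value = [], 1.0
    for k in range(1, n + 1):
        value *= (d + k - 1) / (k - d)
        rho.append(value)
    return rho

d = 0.3                                  # illustrative long-memory parameter
psi = fi_psi_weights(d, 10)
c = fi_diff_coefficients(d, 10)
# Cauchy product of the two series: should be 1, 0, 0, ...
conv = [sum(psi[j] * c[m - j] for j in range(m + 1)) for m in range(11)]
rho = fi_autocorrelations(d, 2000)       # rho[k - 1] holds rho_k
print(psi[:4], rho[:4])
```

The slow hyperbolic decay of `rho` (roughly proportional to $k^{2d-1}$ for large k) is what distinguishes these processes from stationary ARMA models with exponentially decaying autocorrelations.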
In general, one can apply the model ARFIMA(p, d, q) (autoregressive fractionally integrated moving average process of order p, d, q) of the form
$$\varphi(B)\,w_t = \theta(B)\,\varepsilon_t \tag{6.135}$$
(i.e., a stationary and invertible ARMA(p, q) model for $w_t$), where $w_t$ arises from the original process $y_t$ as a fractionally integrated process of order d
$$w_t = (1 - B)^d y_t. \tag{6.136}$$
Remark 6.25 Sometimes processes of the random walk type
$$y_t = y_1 + \sum_{\tau=2}^{t}\varepsilon_\tau \tag{6.137}$$
(see (6.90)) are called strong-memory processes since they “remember” all past shocks $\varepsilon_t, \varepsilon_{t-1}, \ldots$.
⋄
Remark 6.26 A sudden structural break (i.e., an abrupt change of the model) within an observed time series may lead, when one fits a model, to pseudo-long-memory behavior. Even a change in the mean of this type already mimics long-memory behavior in the sample (partial) correlograms. In particular, this is typical for some financial time series which are modeled as long-memory processes because a structural break has not been taken into account.
⋄
6.8 Exercises
Exercise 6.1 Repeat the analysis from Examples 6.1–6.4 and 6.9(a) (the stationary process of “Dreimonatsgeld”), but only for data since 1965 (hint: $y_t = 6.466 + 0.924\,y_{t-1} - 0.735\,y_{t-2} + 0.508\,y_{t-3} - 0.510\,y_{t-4} + \varepsilon_t$, $\hat\sigma = 1.785$).
Exercise 6.2 Repeat the analysis from Examples 6.5 and 6.6 (the nonstationary index PX), but only for the last 150 observations (hint: $\hat\sigma = 8.33$).
Exercise 6.3 Repeat the analysis from Examples 6.7 and 6.9(b) (the seasonal process of job applicants), but only for data since 2010 (hint: $y_t - 0.98\,y_{t-1} - y_{t-12} + 0.98\,y_{t-13} = 193\,593.5 - 2\,684.2\,t + \varepsilon_t$, $\hat\sigma = 7\,267.96$).
Chapter 7
Autocorrelation Methods in Regression Models
7.1 Dynamic Regression Model
Box–Jenkins methodology is often applied in the context of so-called dynamic regression models with dynamics of explanatory variables on the right-hand side of the model (including lagged values of the explained variable y from the left-hand side of the model) and the residual component u with a dynamic correlation structure (including ARMA structure), e.g.,
$$\Delta y_t = \beta_1 + \beta_2\,\Delta x_{t2} + \beta_3\,x_{t3} + \beta_4\,x_{t-1,2} + \beta_5\,y_{t-1} + u_t. \tag{7.1}$$
Formally, one can write the (linear) dynamic regression model for the explained variable $y_t$ as
$$y_t = \beta_1 + \beta_2 x_{t2} + \ldots + \beta_k x_{tk} + u_t, \tag{7.2}$$
where all explanatory variables $x_{ti}$ are orthogonal to the residual $u_t$ at the same time t, i.e., $\mathrm{cov}(x_{ti}, u_t) = 0$ (so-called simultaneous uncorrelatedness). Such a condition of orthogonality guarantees some useful statistical properties of the model, e.g., OLS estimates of the regression parameters in (7.2) are consistent under this condition. Dynamic regression models are broadly used in econometric modeling, where the orthogonal explanatory variables $x_{ti}$ may be either (strictly) exogenous (i.e., originating outside the model (7.2)) or predetermined (i.e., originating within the model, but at a past time viewed from the perspective of the present time t, e.g., originating at time t − 1).
More specifically, one usually assumes that the residual component ut is modeled
as an ARMA process

https://doi.org/10.1007/978-3-030-46347-2_7
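$$u_t = \varphi_1 u_{t-1} + \ldots + \varphi_p u_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q}, \tag{7.3}$$
where $\varepsilon_t$ is white noise with variance $\sigma^2$ and $\beta_1, \ldots, \beta_k, \varphi_1, \ldots, \varphi_p, \theta_1, \ldots, \theta_q, \sigma^2$ are (unknown) parameters.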
One should stress that the typical feature of dynamic regression models is the use of time-lagged (delayed) variables. This has practical reasons, e.g.:
• Decelerated responses to changes: Some economic and financial variables change slowly so that a response to changes of this type (e.g., changes in the structure of financial markets, government policy, bank strategy) is measurable with a substantial time delay and not within the same time period. In the economic and financial context, the reasons for such a delay can be manifold:
– Psychological: e.g., market subjects at first disbelieve new messages or underestimate their consequences.
– Technological: e.g., the speed of transactions depends on the technical facilities of financial exchanges.
– Due to liquidity: e.g., new investment positions cannot be opened until the old ones are closed (or sold), or until the necessary capital is available.
Moreover, the speed and intensity of responses depend on the character of the changes, e.g., whether the changes are permanent or transient. In any case, a complex dynamic structure complicates the model interpretation.
• Overreaction to changes: Sometimes pessimistic economic prognoses cause immediate decreases of prices, which stabilize later (i.e., with a time gap) as soon as the real results are announced.
• Modeling autocorrelated residuals: The application of dynamic models instead
of static ones can sometimes remove the problem of autocorrelated residuals (see
Sect. 7.2).
In this chapter, we deal with several special cases of dynamic regression models
which are important for economic and financial time series, namely:
• Linear regression model with autocorrelated residuals: does not contain any
lagged variables (neither explanatory nor explained), but the delay is comprised
in the residual component (see Sect. 7.2).
• Distributed lag model: contains the lagged explanatory variables but no lagged
explained variable (see Sect. 7.3).
• Autoregressive distributed lag model: contains the lagged explained variable (and possibly also lagged explanatory variables; see Sect. 7.4).
7.2 Linear Regression Model with Autocorrelated Residuals
The linear regression model with autocorrelated residuals is usually the classical regression
$$y_t = \beta_1 + \beta_2 x_{t2} + \ldots + \beta_k x_{tk} + u_t \tag{7.4}$$
with the ARMA structure (7.3) of residuals $u_t$, but in contrast to (7.2) the explanatory variables $x_{ti}$ are looked upon as deterministic regressors. It is a popular generalization of linear regression, where the residual component has the form of uncorrelated white noise, to the case with correlated observations. Such correlatedness must be taken into account since it is usual in practice (e.g., the delayed values of some variables, which should be included among the regressors of (7.4), are present only in the residuals $u_t$, causing correlatedness in time).
The simplest type of correlatedness, covering the majority of routine situations, consists in modeling the residual component $u_t$ by means of the stationary autoregressive model of the first order (see the process AR(1) in Remark 6.9) written as
$$u_t = \rho\,u_{t-1} + \varepsilon_t, \tag{7.5}$$
where the autoregressive parameter ρ (−1 < ρ < 1) equals the first autocorrelation of the process $u_t$ (it is denoted ρ instead of $\rho_1$ for simplicity) and $\varepsilon_t$ is white noise. The sign of ρ plays an important role here: a positive ρ > 0 (so-called positive correlatedness, plotted for a trajectory of residuals $u_t$ in the scatterplot on the left-hand side of Fig. 7.1) induces inertia in the signs of neighboring values $u_t$ (see the right-hand side of Fig. 7.1 with relatively rare crossings of the time axis), while on the contrary a negative ρ < 0 (so-called negative correlatedness, plotted in the scatterplot on the left-hand side of Fig. 7.2) induces frequent changes of the signs of neighboring values $u_t$ (see the right-hand side of Fig. 7.2 with relatively dense crossings of the time axis).
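[Figure: left panel, scatterplot of u_t against u_{t−1}; right panel, u_t plotted against time t]
Fig. 7.1 Positive autocorrelatedness (ρ > 0)
[Figure: left panel, scatterplot of u_t against u_{t−1}; right panel, u_t plotted against time t]
Fig. 7.2 Negative autocorrelatedness (ρ < 0)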
7.2.1 Durbin–Watson Test
The Durbin–Watson test (also called the test of autocorrelatedness of residuals) is one of the most frequent tests in regression analysis. In contrast to the subjective graphical instruments in Figs. 7.1 and 7.2 (with the estimated OLS residuals $\hat u_t$, i.e., the residuals calculated by the method of ordinary least squares in the linear regression model (7.4)), it is a statistical test with the null hypothesis H0: ρ = 0 in (7.5). Its test statistic has the form
$$DW = \frac{\sum_{t=2}^{n}\left(\hat u_t - \hat u_{t-1}\right)^2}{\sum_{t=1}^{n}\hat u_t^2}. \tag{7.6}$$
Obviously, it can be approximated as
$$DW \approx 2\,(1 - \hat\rho), \tag{7.7}$$
where
$$\hat\rho = \frac{\sum_{t=2}^{n}\hat u_{t-1}\hat u_t}{\sum_{t=1}^{n}\hat u_t^2} \tag{7.8}$$
is the estimate of the first autocorrelation ρ (see $r_1$ according to (6.9) with $\bar u = 0$). The relation (7.7) implies:
• If $\hat\rho \approx 0$ (i.e., the neighboring residuals are uncorrelated), then DW ≈ 2.
• If $\hat\rho \approx 1$ (i.e., the neighboring residuals are extremely positively correlated), then DW ≈ 0.
• If $\hat\rho \approx -1$ (i.e., the neighboring residuals are extremely negatively correlated), then DW ≈ 4.
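The statistic (7.6) and the estimate (7.8) take only a few lines of plain Python. The residual values below are hypothetical; the sketch also makes the approximation (7.7) explicit, since expanding the squares in the numerator of (7.6) gives the exact identity $DW = 2(1-\hat\rho) - (\hat u_1^2 + \hat u_n^2)/\sum_t \hat u_t^2$, where the last term is negligible for longer series.

```python
def durbin_watson(resid):
    """DW = sum_{t=2}^{n} (u_t - u_{t-1})^2 / sum_{t=1}^{n} u_t^2, cf. (7.6)."""
    num = sum((b - a) ** 2 for a, b in zip(resid[:-1], resid[1:]))
    den = sum(u * u for u in resid)
    return num / den

def first_autocorrelation(resid):
    """rho_hat = sum_{t=2}^{n} u_{t-1} u_t / sum_{t=1}^{n} u_t^2, cf. (7.8)."""
    num = sum(a * b for a, b in zip(resid[:-1], resid[1:]))
    den = sum(u * u for u in resid)
    return num / den

# Hypothetical OLS residuals with visible inertia in their signs.
resid = [0.8, 1.1, 0.9, 0.4, -0.2, -0.7, -1.0, -0.6, -0.1, 0.5]
dw = durbin_watson(resid)
rho_hat = first_autocorrelation(resid)
# End effect that separates DW from 2 * (1 - rho_hat).
end_effect = (resid[0] ** 2 + resid[-1] ** 2) / sum(u * u for u in resid)
print(dw, 2 * (1 - rho_hat), end_effect)
```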
[Figure: decision regions on the DW axis from 0 to 4. From 0 to dL: rejection of H0 (positive autocorrelatedness); from dL to dU: inconclusive test; from dU to 4−dU: H0 cannot be rejected (uncorrelatedness); from 4−dU to 4−dL: inconclusive test; from 4−dL to 4: rejection of H0 (negative autocorrelatedness)]
Fig. 7.3 Conclusions of Durbin–Watson test corresponding to particular values of statistics DW
The statistic DW does not have any standard probability distribution. However, if one assumes normality of the white noise $\varepsilon_t$, then DW has two critical values $d_L$ (lower) and $d_U$ (upper), which depend only on the number of observations n and the number of regressors k (but not on the values of these regressors). Nowadays, the DW test is mainly used as an informal instrument indicating the possible existence of autocorrelated residuals. Its conclusions on the null hypothesis H0: ρ = 0 are summarized in Fig. 7.3 (however, the test is inconclusive for some values of the statistic DW, so that no conclusion is possible in such a case).
The critical values $d_L$ and $d_U$ can be found in statistical tables, or they are calculated by means of simulations directly in software systems (in the form of p-values). Moreover, in practice one can apply simplified rules (“rules of thumb”): e.g., if one has a larger number of observations n (more than fifty) and the number of regressors k is not too high, then a value of DW lower than 1.5 usually implies positive autocorrelatedness (see Example 7.1).
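The decision regions of Fig. 7.3 can be written as a small function. This is only a sketch: the function name, the handling of the boundary points, and the critical values used in the call are assumptions for illustration (in practice $d_L$ and $d_U$ must be taken from tables for the given n and k).

```python
def dw_decision(dw, d_lower, d_upper):
    """Map a DW value to the regions of Fig. 7.3 for given critical values dL < dU."""
    if not 0.0 <= dw <= 4.0:
        raise ValueError("DW must lie in [0, 4]")
    if dw < d_lower:
        return "reject H0: positive autocorrelatedness"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4.0 - d_upper:
        return "H0 cannot be rejected"
    if dw <= 4.0 - d_lower:
        return "inconclusive"
    return "reject H0: negative autocorrelatedness"

d_lower, d_upper = 1.50, 1.59   # illustrative critical values, not taken from the text
for value in (0.87, 1.55, 2.00, 2.45, 3.60):
    print(value, dw_decision(value, d_lower, d_upper))
```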
7.2.2 Breusch–Godfrey Test
Later, more general tests were suggested that can detect autocorrelations of orders
higher than the first order covered by the DW test (of course, one could try
sequentially various differences of non-neighboring OLS residuals in the numerator of
the DW statistic (7.6), but such an approach is too tedious). Another alternative is
to apply procedures suggested originally for the verification of Box–Jenkins models,
mainly Q-tests (e.g., the Box–Pierce test or the Ljung–Box test; see (6.66) and (6.67)).
Nowadays, for regression models of the type (7.4), econometric software systems
offer in particular the Breusch–Godfrey (BG) test of autocorrelated residuals. It was
suggested for the alternative hypothesis that the residual component ut follows an
autoregressive model AR(p) of order p ≥ 1. The BG test proceeds in the following way:
1. One calculates the OLS residuals $\hat{u}_t$ in the model (7.4) (i.e., by the classical method of
least squares in the same way as for the DW test).
2. One estimates the auxiliary model
$$ \hat{u}_t = \gamma_1 + \gamma_2 x_{t2} + \ldots + \gamma_k x_{tk} + \varphi_1 \hat{u}_{t-1} + \varphi_2 \hat{u}_{t-2} + \ldots + \varphi_p \hat{u}_{t-p} + \varepsilon_t, \tag{7.9} $$
where εt is white noise.
3. One tests
$$ H_0: \varphi_1 = \varphi_2 = \ldots = \varphi_p = 0 \quad \text{against} \quad H_1: \varphi_1 \neq 0 \ \text{or} \ \varphi_2 \neq 0 \ \text{or} \ \ldots \ \text{or} \ \varphi_p \neq 0 \tag{7.10} $$
applying the classical F-test in the auxiliary model (7.9). In particular, the critical value of
this test at significance level α is the quantile $F_{1-\alpha}(p, n-k-p)$ of the F-distribution.
Other tests instead of the F-test, e.g., the LM (Lagrange multiplier) test, are also possible
(see Example 7.1).
The problematic point of the BG test is the choice of the autoregressive order p.
A simple recommendation is to choose it according to the frequency of the data (e.g.,
p = 4 for quarterly observations, p = 12 for monthly observations, etc.), since the
residual component is usually correlated mainly with the residual component for the
same seasonal period of the previous year. On the other hand, if the model is adequate
from the statistical point of view, then no autocorrelations should be significant for
an arbitrary choice of p.
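The three steps of the BG test can be sketched in plain Python as follows. The snippet uses the data of Table 7.1 from Example 7.1 with p = 1; the helper names `ols` and `residuals` are illustrative, and setting the presample residual to zero (so that all n observations enter the auxiliary regression) is an assumption about how the software treats the first observation. The LM statistic is taken in its usual form n·R² of the auxiliary regression.

```python
# Data of Table 7.1 (quarterly yields in % p.a., 1990-1994).
AAA = [9.37, 9.26, 9.56, 9.05, 8.93, 9.01, 8.61, 8.31, 8.20, 8.22,
       7.92, 7.98, 7.58, 7.33, 6.66, 6.93, 7.48, 7.97, 8.34, 8.46]
TBILL = [7.76, 7.77, 7.49, 7.02, 6.05, 5.59, 5.41, 4.58, 3.91, 3.72,
         3.13, 3.08, 2.99, 2.98, 3.02, 3.08, 3.25, 4.04, 4.51, 5.28]


def ols(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def residuals(X, y, b):
    return [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]


n, p = len(AAA), 1

# Step 1: OLS residuals of the original regression.
X = [[1.0, x] for x in TBILL]
u = residuals(X, AAA, ols(X, AAA))

# Step 2: auxiliary regression (7.9); presample residuals set to zero (assumption).
lagged = [[u[t - j] if t - j >= 0 else 0.0 for j in range(1, p + 1)] for t in range(n)]
X_aux = [row + lag for row, lag in zip(X, lagged)]
e = residuals(X_aux, u, ols(X_aux, u))

# Step 3: R^2 of the auxiliary regression, LM = n * R^2, and the F statistic.
k_aux = len(X_aux[0])  # regressors of the original model plus p lagged residuals
r2 = 1.0 - sum(v * v for v in e) / sum(v * v for v in u)
lm_stat = n * r2
f_stat = (r2 / p) / ((1.0 - r2) / (n - k_aux))

print(f"LM = {lm_stat:.3f}, F = {f_stat:.3f}")
```

Under these assumptions the two statistics should agree closely with the values reported in Table 7.3; the F statistic is compared with the quantile of F(p, n − k − p) and the LM statistic with the chi-square quantile with p degrees of freedom.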
7.2.3 Construction of Linear Regression Model with ARMA Residuals
Let us return to the construction of the model (7.4) with autocorrelated residuals. If the DW
test has confirmed a correlation structure in the form of the process AR(1) (i.e., the
simplest case (7.5) of autocorrelation), then one can estimate the linear regression
(7.4) by means of the Cochrane–Orcutt method, which has been very popular in
practice. This method makes use of the so-called Koyck transformation, in which one
subtracts from the regression equation (7.4) at time t the same equation at time
t – 1 multiplied by the constant ρ:
$$ y_t - \rho y_{t-1} = (1-\rho)\beta_1 + \beta_2 (x_{t2} - \rho x_{t-1,2}) + \ldots + \beta_k (x_{tk} - \rho x_{t-1,k}) + \varepsilon_t \tag{7.11} $$
(obviously, εt = ut – ρut–1; see (7.5)). If the value of ρ were known, then the Koyck
transformation would yield the classical linear regression
$$ y_t^* = \beta_1^* + \beta_2 x_{t2}^* + \ldots + \beta_k x_{tk}^* + \varepsilon_t, \tag{7.12} $$
where $y_t^* = y_t - \rho y_{t-1}$, $\beta_1^* = (1-\rho)\beta_1$, $x_{t2}^* = x_{t2} - \rho x_{t-1,2}$, ..., $x_{tk}^* = x_{tk} - \rho x_{t-1,k}$.
This fact is the principle of the Cochrane–Orcutt method, which proceeds in the following steps:
1. One calculates the OLS residuals $\hat{u}_t$ in the model (7.4).
2. One constructs the estimate $\hat{\rho}$ of the parameter ρ according to (7.8).
3. One constructs the OLS estimate in the model (7.12), replacing the parameter ρ by
the estimate $\hat{\rho}$.
4. The procedure goes on iteratively by repeating steps 1 to 3, where in step 1
the residuals of the model (7.4) are recalculated using the parameter estimates from the
previous step 3. The procedure stops according to a suitable stopping rule (e.g., if
the change in the estimated value of ρ between neighboring iteration cycles drops
under a limit fixed in advance).
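The iteration can be sketched as follows for the case of a single regressor. The snippet is only an illustration under stated assumptions: the function names are hypothetical, the intercept of the original model is recovered from the transformed one by dividing by (1 − ρ̂), and the data are those of Table 7.1 from Example 7.1.

```python
# Data of Table 7.1 (quarterly yields in % p.a., 1990-1994).
AAA = [9.37, 9.26, 9.56, 9.05, 8.93, 9.01, 8.61, 8.31, 8.20, 8.22,
       7.92, 7.98, 7.58, 7.33, 6.66, 6.93, 7.48, 7.97, 8.34, 8.46]
TBILL = [7.76, 7.77, 7.49, 7.02, 6.05, 5.59, 5.41, 4.58, 3.91, 3.72,
         3.13, 3.08, 2.99, 2.98, 3.02, 3.08, 3.25, 4.04, 4.51, 5.28]


def simple_ols(x, y):
    """Intercept and slope of the regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    b1, b2 = simple_ols(x, y)  # starting values: plain OLS in the original model
    rho = 0.0
    for _ in range(max_iter):
        # Step 1: residuals of the original model with the current coefficients.
        u = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
        # Step 2: estimate of rho according to (7.8).
        rho_new = sum(u[t - 1] * u[t] for t in range(1, len(u))) / sum(v * v for v in u)
        # Step 3: OLS in the transformed model (7.12); the first observation is lost.
        y_star = [y[t] - rho_new * y[t - 1] for t in range(1, len(y))]
        x_star = [x[t] - rho_new * x[t - 1] for t in range(1, len(x))]
        b1_star, b2 = simple_ols(x_star, y_star)
        b1 = b1_star / (1.0 - rho_new)  # back to the intercept of the original model
        # Step 4: stop when rho no longer changes.
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return b1, b2, rho


b1, b2, rho = cochrane_orcutt(TBILL, AAA)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, rho = {rho:.4f}")
```

The results should lie close to the estimates in Table 7.4, although the software may use a slightly different estimator of ρ and a different stopping rule.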
The disadvantage of the Cochrane–Orcutt method consists mainly in the fact that it
delivers the estimated parameters of the transformed model (7.12). Some software
systems (e.g., EViews in Example 7.1) make it possible to return to the estimated parameters
of the original model (7.4) (i.e., before the Koyck transformation) by testing the
parametric constraints following from this transformation.
If we consider the linear regression model (7.4) with a general ARMA structure
(7.3) of the residuals ut (and not specifically AR(1)), then there are various sophisticated
methods for its construction (e.g., two-stage estimation procedures using the
concept of instrumental variables or other methods; see EViews).
On the other hand, the opposite approach, which ignores the residual correlations, is also
possible. Namely, if we apply the classical OLS methodology directly to the model
(7.4) (i.e., ignoring the fact that ut need not be white noise), then the corresponding
OLS estimates of the parameters β1, . . ., βk are not the best linear unbiased estimates, but
they remain consistent (i.e., for large n they are near the true theoretical values of these
parameters with a high probability). The only weak point of the OLS estimates used
in models with autocorrelated residuals is that their standard errors are
underestimated (which can, e.g., impair the t-tests of parameter significance). The remedy for this
weakness is the Newey–West estimate of the error matrix of the OLS
estimates of the parameters β1, . . ., βk, denoted in software as HAC (heteroscedasticity
and autocorrelation consistent covariances; see, e.g., EViews in Example 7.1).
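A minimal sketch of the Newey–West correction for a regression with a single regressor is given below. It assumes Bartlett weights 1 − j/(L + 1) with lag truncation L and a degrees-of-freedom factor n/(n − k); both are assumptions about the software implementation rather than statements from the text, and the function name is hypothetical. The data are those of Table 7.1, and L = 2 as in Table 7.5.

```python
# Data of Table 7.1 (quarterly yields in % p.a., 1990-1994).
AAA = [9.37, 9.26, 9.56, 9.05, 8.93, 9.01, 8.61, 8.31, 8.20, 8.22,
       7.92, 7.98, 7.58, 7.33, 6.66, 6.93, 7.48, 7.97, 8.34, 8.46]
TBILL = [7.76, 7.77, 7.49, 7.02, 6.05, 5.59, 5.41, 4.58, 3.91, 3.72,
         3.13, 3.08, 2.99, 2.98, 3.02, 3.08, 3.25, 4.04, 4.51, 5.28]


def hac_simple_regression(x, y, lags):
    n, k = len(x), 2
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b2 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b1 = my - b2 * mx
    u = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]

    # Rows of (X'X)^{-1} X' belonging to the slope and to the intercept.
    c_slope = [(a - mx) / sxx for a in x]
    c_const = [1.0 / n - mx * c for c in c_slope]

    def hac_variance(c):
        g = [ci * ui for ci, ui in zip(c, u)]  # weighted residuals
        total = sum(v * v for v in g)
        for j in range(1, lags + 1):
            w = 1.0 - j / (lags + 1.0)  # Bartlett weight (assumption)
            total += 2.0 * w * sum(g[t] * g[t - j] for t in range(j, n))
        return total * n / (n - k)  # degrees-of-freedom factor (assumption)

    s2 = sum(v * v for v in u) / (n - k)  # classical residual variance
    return {
        "b1": b1,
        "b2": b2,
        "se_slope_ols": (s2 / sxx) ** 0.5,
        "se_slope_hac": hac_variance(c_slope) ** 0.5,
        "se_const_hac": hac_variance(c_const) ** 0.5,
    }


res = hac_simple_regression(TBILL, AAA, lags=2)
for name, value in res.items():
    print(f"{name} = {value:.6f}")
```

The coefficient estimates are the OLS ones; only the standard errors change, which is why the t-ratios in Table 7.5 are lower than those in Table 7.2.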
Example 7.1 Table 7.1 presents the values AAAt and TBILLt of (average) yields to
maturity (YTM in % p.a.) of corporate bonds of the highest quality AAA and of
3-month T-bills according to S&P in the USA for particular quarters 1990–1994
(t = 1, ..., 20). These are two alternative ways of risk-free investing, so that one
should expect that the yields AAAt depend positively on the short-term interest rates TBILLt
in time. This expectation is confirmed in Table 7.2 for the model
$$ AAA_t = \beta_1 + \beta_2\, TBILL_t + \varepsilon_t, \quad t = 1, \ldots, 20 \tag{7.13} $$
(the estimated model is highly significant according to the t-ratios and the F-test for the
coefficient of determination R2, with a significantly positive estimate b2 = 0.426 of the
parameter β2; see Table 7.2). The strong positive autocorrelation between the neighboring
residuals εt–1 and εt is also demonstrated by the scatterplot in Fig. 7.4 (the corresponding
correlation coefficient is 0.602).
As far as statistical tests are concerned, the rule of thumb for the Durbin–Watson
test confirms the positive autocorrelation, since DW = 0.778 is much
lower than 1.5 (see Table 7.2). The results of the Breusch–Godfrey test with the residual
autoregressive model of order p = 1 in Table 7.3 (both in the form of the F-test and in
the form of the LM test) give the same conclusion.
The final estimate of the identified model
$$ AAA_t = \beta_1 + \beta_2\, TBILL_t + u_t, \quad u_t = \rho u_{t-1} + \varepsilon_t \tag{7.14} $$
has been obtained by the Cochrane–Orcutt method (see Table 7.4).
Table 7.5 presents the estimate of the same model obtained by applying the HAC
approach of Newey–West. In comparison with Table 7.2, the estimated standard
deviations of the OLS estimates b1 = 6.243 and b2 = 0.426 are much higher (compare
0.377 with 0.230 for b1 and 0.061 with 0.046 for b2), so that the corresponding
t-ratios are lower according to Newey–West.
⋄
Table 7.1 Quarterly data 1990–1994 in Example 7.1 (yields to maturity of corporate bonds of the highest quality AAA and three-month T-bills in the USA in % p.a.)
t    Year   Quarter   AAAt   TBILLt
1    1990   1         9.37   7.76
2    1990   2         9.26   7.77
3    1990   3         9.56   7.49
4    1990   4         9.05   7.02
5    1991   1         8.93   6.05
6    1991   2         9.01   5.59
7    1991   3         8.61   5.41
8    1991   4         8.31   4.58
9    1992   1         8.20   3.91
10   1992   2         8.22   3.72
11   1992   3         7.92   3.13
12   1992   4         7.98   3.08
13   1993   1         7.58   2.99
14   1993   2         7.33   2.98
15   1993   3         6.66   3.02
16   1993   4         6.93   3.08
17   1994   1         7.48   3.25
18   1994   2         7.97   4.04
19   1994   3         8.34   4.51
20   1994   4         8.46   5.28
Source: FRED (Federal Reserve Bank of St. Louis) (https://fred.stlouisfed.org/graph/?id=AAA, https://fred.stlouisfed.org/graph/?id=TB3MS,TB3MA)
Table 7.2 Estimation of the model (7.13) from Example 7.1 (yields to maturity of corporate bonds
AAA) calculated by means of EViews
Dependent Variable: AAA
Sample: 1 20
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          6.242529      0.230299     27.10622      0.0000
TBILL      0.425939      0.045876     9.284506      0.0000
R-squared            0.827259    F-statistic          86.20205
S.E. of regression   0.343246    Prob(F-statistic)    0.000000
[Figure 7.4: scatterplot of OLS-RESID (vertical axis) against OLS-RESID(-1) (horizontal axis).]
Fig. 7.4 Scatterplot demonstrating positive autocorrelatedness of residuals in Example 7.1
(yields to maturity of corporate bonds AAA)
Table 7.3 Breusch–Godfrey test of the model (7.13) with p = 1 from Example 7.1 (yields to
maturity of corporate bonds AAA) calculated in EViews
Breusch–Godfrey Serial Correlation F- and LM Test:
F-statistic      9.759497    Prob. F(1,17)         0.006178
Obs*R-squared    7.294230    Prob. Chi-Square(1)   0.006918
Table 7.4 Estimation of the model (7.14) from Example 7.1 by means of Cochrane–Orcutt method
(yields to maturity of corporate bonds AAA) calculated in EViews
Sample (adjusted): 2 20
Included observations: 19 after adjustments
Convergence achieved after 10 iterations
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          6.246693      0.487125     12.82358      0.0000
TBILL      0.429785      0.105027     4.092135      0.0009
AR(1)      0.601729      0.199094     3.022333      0.0081
Table 7.5 Estimation of the model (7.14) from Example 7.1 including Newey–West estimate of
the error matrix of estimated parameters (yields to maturity of corporate bonds AAA) calculated by
means of EViews
Dependent Variable: AAA
Sample: 1 20
Newey-West HAC Standard Errors and Covariance (lag truncation = 2)
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          6.242529      0.376704     16.57143      0.0000
TBILL      0.425939      0.061072     6.974403      0.0000
7.3 Distributed Lag Model
Distributed lag model (or DL model) contains lagged explanatory variables but no
lagged explained variable (obviously, it thus fulfills the condition that the
explanatory variables are orthogonal to the residual at the same time, the lagged
regressors being predetermined at the previous times). For simplicity, we confine
ourselves to the case with a single explanatory variable x and a residual component
in the form of white noise (the generalization to more lagged explanatory variables
does not cause serious complications):
$$ y_t = \alpha + \sum_{i=0}^{\infty} \beta_i x_{t-i} + \varepsilon_t. \tag{7.15} $$
In such a model, the influence of the explanatory variable is distributed over a large
number of past time moments. In this context, one can distinguish:
• Short-run effect in consequence of an immediate change of the explanatory
variable: it is described by the parameter β0 (so-called impact multiplier or
short-run multiplier).
• Cumulative effect in consequence of changes of the explanatory variable up to a lag
τ: it is described by the finite sum of parameters
$$ \beta(\tau) = \sum_{i=0}^{\tau} \beta_i. \tag{7.16} $$
• Long-run effect in consequence of changes of the explanatory variable: it is
described by the infinite sum of parameters (so-called equilibrium multiplier or
long-run multiplier reflecting the influence of the explanatory variable after its
transition to an equilibrium, i.e., to a balanced state)
$$ \beta = \sum_{i=0}^{\infty} \beta_i. \tag{7.17} $$
Moreover, one usually also calculates other characteristics of lagged effects, e.g.:
$$ \text{median lag} = \text{smallest } q \text{ such that } \frac{\sum_{i=0}^{q} \beta_i}{\sum_{i=0}^{\infty} \beta_i} \geq 0.5 \tag{7.18} $$
and
$$ \text{mean lag} = \frac{\sum_{i=0}^{\infty} i\, \beta_i}{\sum_{i=0}^{\infty} \beta_i}. \tag{7.19} $$
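The characteristics (7.16) to (7.19) can be illustrated by a short sketch for a finite list of lag coefficients (so that the infinite sums reduce to finite ones). The coefficient values and the function name are hypothetical, and the median lag is meaningful only if the coefficients are nonnegative.

```python
def lag_summary(beta, tau):
    """Summary of a finite lag distribution beta[0], beta[1], ..., beta[k]."""
    long_run = sum(beta)                  # long-run multiplier (7.17)
    cumulative = sum(beta[: tau + 1])     # cumulative effect up to lag tau (7.16)
    running, median_lag = 0.0, None
    for q, b in enumerate(beta):
        running += b
        if running / long_run >= 0.5:     # first q reaching half of the total (7.18)
            median_lag = q
            break
    mean_lag = sum(i * b for i, b in enumerate(beta)) / long_run  # (7.19)
    return {
        "impact": beta[0],
        "cumulative": cumulative,
        "long_run": long_run,
        "median_lag": median_lag,
        "mean_lag": mean_lag,
    }


# Hypothetical lag coefficients beta_0, ..., beta_3.
summary = lag_summary([0.8, 0.6, 0.4, 0.2], tau=1)
print(summary)
```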
7.3.1 Geometric Distributed Lag Model
The DL model of the form (7.15) is too general to be applied in practice, since it
contains infinitely many parameters. Therefore, various modifications have been
suggested. The geometric distributed lag model (or GDL model) is a very pragmatic
solution since it uses only three parameters α, β, and λ:
$$ y_t = \alpha + \beta \sum_{i=0}^{\infty} (1-\lambda)\lambda^i x_{t-i} + \varepsilon_t, \quad 0 < \lambda < 1. \tag{7.20} $$
In this case, the long-run effect (7.17) equals directly the parameter β since
$$ \beta \sum_{i=0}^{\infty} (1-\lambda)\lambda^i = \beta. \tag{7.21} $$
Applying the Koyck transformation, in which one subtracts from the regression equation
(7.20) at time t the same equation at time t – 1 multiplied by the constant λ (compare
with (7.11)), one obtains
$$ y_t = \alpha(1-\lambda) + \beta(1-\lambda) x_t + \lambda y_{t-1} + \eta_t, \quad \text{where } \eta_t = \varepsilon_t - \lambda \varepsilon_{t-1}. \tag{7.22} $$
Even though the model (7.22) presents a substantial simplification in comparison
with (7.20), the MA(1) structure of the residual ηt (see (7.22)) can complicate the
estimation procedures: since ηt contains εt–1, it is correlated with the regressor yt–1,
so that the OLS estimates need not be consistent.
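The equivalence of (7.20) and (7.22) can be checked numerically in the noise-free case. The sketch below assumes that the explanatory variable is zero before the first observation, so that the infinite sum in (7.20) becomes finite; the parameter values and the series x are hypothetical.

```python
alpha, beta, lam = 1.0, 2.0, 0.6                 # hypothetical parameters, 0 < lam < 1
x = [1.0, 3.0, 2.0, 5.0, 4.0, 4.5, 3.5, 6.0]     # hypothetical series; x_t = 0 for t < 0


def gdl(t):
    """Right-hand side of (7.20) without noise; lags beyond t vanish since x is zero there."""
    return alpha + beta * sum((1 - lam) * lam ** i * x[t - i] for i in range(t + 1))


y = [gdl(t) for t in range(len(x))]

# Koyck form (7.22) without noise: y_t = alpha(1 - lam) + beta(1 - lam) x_t + lam y_{t-1}.
koyck = [alpha * (1 - lam) + beta * (1 - lam) * x[t] + lam * y[t - 1]
         for t in range(1, len(x))]

max_gap = max(abs(a - b) for a, b in zip(y[1:], koyck))
weight_sum = sum((1 - lam) * lam ** i for i in range(200))   # geometric weights, cf. (7.21)

print(f"largest difference between (7.20) and (7.22): {max_gap:.2e}")
print(f"sum of the first 200 geometric weights: {weight_sum:.6f}")
```

With noise present, the same subtraction produces the MA(1) term ηt, which is the source of the estimation problem mentioned above.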
7.3.2 Polynomial Distributed Lag Model
Polynomial distributed lag model (or PDL model) was suggested by Almon (1965)
as a special case of the DL model in which the coefficients are expressed in a simplified
way by means of polynomials. This approach can substantially reduce the number of parameters
which must be estimated when constructing the “trimmed” distributed lag model
$$ y_t = \alpha + \sum_{i=0}^{k} \beta_i x_{t-i} + \varepsilon_t \tag{7.23} $$
(obviously, one must choose a priori an adequate length k of trimming). The PDL
models suppose the possibility of the approximation
$$ \beta_i = \alpha_0 + \alpha_1 i + \alpha_2 i^2 + \ldots + \alpha_r i^r, \quad i = 0, 1, \ldots, k, \tag{7.24} $$
where the order r of the approximating polynomial is much lower than the maximum lag
k (r << k). After substituting (7.24) into (7.23), one obtains
$$ y_t = \alpha + \alpha_0 \sum_{i=0}^{k} x_{t-i} + \alpha_1 \sum_{i=0}^{k} i\, x_{t-i} + \ldots + \alpha_r \sum_{i=0}^{k} i^r x_{t-i} + \varepsilon_t = \alpha + \alpha_0 z_{0t} + \alpha_1 z_{1t} + \ldots + \alpha_r z_{rt} + \varepsilon_t, \tag{7.25} $$
where $z_{jt} = \sum_{i=0}^{k} i^j x_{t-i}$, i.e., each zjt is a linear combination of the actual value and
k lagged values xt, xt–1, ..., xt–k. In the model (7.25), one can mostly use the classical OLS
methodology without problems and then find the corresponding estimates of the
original parameters βi according to (7.24). Moreover, one can also include the constraint
$$ \beta_{-1} = \alpha_0 + \alpha_1 (-1) + \alpha_2 (-1)^2 + \ldots + \alpha_r (-1)^r = 0 \tag{7.26} $$
ensuring the null influence of xt+1 on yt (i.e., the null influence from the future time),
or the constraint
$$ \beta_{k+1} = \alpha_0 + \alpha_1 (k+1) + \alpha_2 (k+1)^2 + \ldots + \alpha_r (k+1)^r = 0 \tag{7.27} $$
ensuring the null influence of xt–k–1 on yt (i.e., the null influence from the past time
beyond the used trimming).
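The substitution leading from (7.23) to (7.25) can be illustrated by a small numerical sketch: the regressors are built as sums of lagged x weighted by powers of the lag, and the two forms of the systematic part are compared. All numerical values (k = 4, r = 2, the polynomial coefficients, and the series x) are hypothetical.

```python
k, r = 4, 2
alphas = [0.5, 0.3, -0.1]   # hypothetical polynomial coefficients alpha_0, ..., alpha_r
x = [2.0, 1.5, 3.0, 2.5, 4.0, 3.5, 5.0, 4.5, 6.0, 5.5]   # hypothetical series

# Lag coefficients implied by the polynomial (7.24).
beta = [sum(a * i ** j for j, a in enumerate(alphas)) for i in range(k + 1)]


def z(j, t):
    """Constructed regressor: sum over lags i = 0..k of i^j * x_{t-i}."""
    return sum(i ** j * x[t - i] for i in range(k + 1))


gaps = []
for t in range(k, len(x)):
    direct = sum(beta[i] * x[t - i] for i in range(k + 1))   # systematic part of (7.23)
    via_z = sum(alphas[j] * z(j, t) for j in range(r + 1))   # systematic part of (7.25)
    gaps.append(abs(direct - via_z))

print("lag coefficients:", [round(b, 3) for b in beta])
print("largest difference between the two forms:", max(gaps))
```

Only r + 1 = 3 coefficients have to be estimated here instead of k + 1 = 5; the saving becomes substantial for longer trimming lags.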
Example 7.2 Table 7.6 presents the values of money supply M1t and gross domestic
product GDPt (in billions of USD) in the USA for particular quarters 1950–2000
(t = 1, ..., 204). For these data, we shall estimate the DL model explaining the gross
domestic product by means of lagged money supplies (since there is usually
inertia in the effect of M1):
$$ \ln GDP_t = \alpha + \sum_{i=0}^{4} \beta_i \ln M1_{t-i} + \varepsilon_t, \quad t = 5, \ldots, 204. \tag{7.28} $$
The model (7.28), trimmed beyond the lag of four quarters, is estimated in Table 7.7
(the coefficient of determination R2 is relatively high, but the statistic DW near
zero indicates strong positive autocorrelation). Hence the long-run effect of
money supply on gross domestic product is
$$ \beta_{DL} = \sum_{i=0}^{4} \beta_i = 1.314 + \ldots + (-0.021) = 0.579. $$
Alternatively, the corresponding geometric distributed lag model
$$ \ln GDP_t = \alpha + \beta \sum_{i=0}^{\infty} (1-\lambda)\lambda^i \ln M1_{t-i} + \varepsilon_t \tag{7.29} $$
rewritten by means of the Koyck transformation to the form
$$ \ln GDP_t = \alpha(1-\lambda) + \beta(1-\lambda) \ln M1_t + \lambda \ln GDP_{t-1} + \eta_t, \quad \text{where } \eta_t = \varepsilon_t - \lambda \varepsilon_{t-1}, \tag{7.30} $$
is estimated in Table 7.8. In this case, the long-run effect of M1 on GDP works out as
$$ \beta_{GDL} = \frac{\beta_{GDL}(1-\lambda)}{1-\lambda} = \frac{0.004530}{1 - 0.990601} = 0.482. $$
Finally, Table 7.9 presents the results of applying the PDL approach (7.25) to the
model (7.28) with a higher trimming lag k = 12 and a lower order r = 3 of the
approximating polynomial. In the first part of Table 7.9, one estimates the model (7.25)
(e.g., PDL01 is the regressor z0t, etc.). In the second part of Table 7.9, one calculates
the estimates of the original parameters βi according to (7.24) (up to the lag of 12),
including their graphical plot.
Table 7.6 Quarterly data 1950–2000 in Example 7.2 (money supply M1 and gross domestic product GDP in the USA in billions of USD)
Quarter t Year GDP M1 t Year GDP M1 t Year GDP M1
1 1 1950 1610.5 110.20 69 1967 3291.8 174.80 137 1984 5402.3 530.80
2 2 1950 1658.8 111.75 70 1967 3289.7 177.00 138 1984 5493.8 540.50
3 3 1950 1723.0 112.95 71 1967 3313.5 180.70 139 1984 5541.3 543.90
4 4 1950 1753.9 113.93 72 1967 3338.3 183.30 140 1984 5583.1 551.20
1 5 1951 1773.5 115.08 73 1968 3406.2 185.50 141 1985 5629.7 565.70
2 6 1951 1803.7 116.19 74 1968 3464.8 189.40 142 1985 5673.8 582.90
3 7 1951 1839.8 117.76 75 1968 3489.2 192.70 143 1985 5758.6 604.40
4 8 1951 1843.3 119.89 76 1968 3504.1 197.40 144 1985 5806.0 619.10
1 9 1952 1864.7 121.31 77 1969 3558.3 200.00 145 1986 5858.9 632.60
2 10 1952 1866.2 122.37 78 1969 3567.6 201.30 146 1986 5883.3 661.20
3 11 1952 1878.0 123.64 79 1969 3588.3 202.10 147 1986 5937.9 688.40
4 12 1952 1940.2 124.72 80 1969 3571.4 203.90 148 1986 5969.5 724.00
1 13 1953 1976.0 125.33 81 1970 3566.5 205.70 149 1987 6013.3 732.80
2 14 1953 1992.2 126.05 82 1970 3573.9 207.60 150 1987 6077.2 743.50
3 15 1953 1979.5 126.22 83 1970 3605.2 211.90 151 1987 6128.1 748.50
4 16 1953 1947.8 126.37 84 1970 3566.5 214.30 152 1987 6234.4 749.40
1 17 1954 1938.1 126.54 85 1971 3666.1 218.70 153 1988 6275.9 761.10
2 18 1954 1941.0 127.18 86 1971 3686.2 223.60 154 1988 6349.8 778.80
3 19 1954 1962.0 128.38 87 1971 3714.5 226.60 155 1988 6382.3 784.60
4 20 1954 2000.9 129.72 88 1971 3723.8 228.20 156 1988 6465.2 786.10
1 21 1955 2058.1 131.07 89 1972 3796.9 234.20 157 1989 6543.8 782.70
2 22 1955 2091.0 131.88 90 1972 3883.8 236.80 158 1989 6579.4 773.90
3 23 1955 2118.9 132.40 91 1972 3922.3 243.30 159 1989 6610.6 782.00
4 24 1955 2130.1 132.64 92 1972 3990.5 249.10 160 1989 6633.5 792.10
1 25 1956 2121.0 133.11 93 1973 4092.3 251.50 161 1990 6716.3 800.80
2 26 1956 2137.7 133.38 94 1973 4133.3 256.90 162 1990 6731.7 809.70
3 27 1956 2135.3 133.48 95 1973 4117.0 258.00 163 1990 6719.4 821.10
4 28 1956 2170.4 134.09 96 1973 4151.1 262.70 164 1990 6664.2 823.90
1 29 1957 2182.7 134.29 97 1974 4119.3 266.50 165 1991 6631.4 838.00
2 30 1957 2177.7 134.36 98 1974 4130.4 268.60 166 1991 6668.5 857.40
3 31 1957 2198.9 134.26 99 1974 4084.5 271.30 167 1991 6684.9 871.20
4 32 1957 2176.0 133.48 100 1974 4062.0 274.00 168 1991 6720.9 895.90
1 33 1958 2117.4 133.72 101 1975 4010.0 276.20 169 1992 6783.3 935.80
2 34 1958 2129.7 135.22 102 1975 4045.2 282.70 170 1992 6846.8 954.50
3 35 1958 2177.5 136.64 103 1975 4115.4 286.00 171 1992 6899.7 988.70
4 36 1958 2226.5 138.48 104 1975 4167.2 286.80 172 1992 6990.6 1024.00
1 37 1959 2273.0 139.70 105 1976 4266.1 292.40 173 1993 6988.7 1038.10
2 38 1959 2332.4 141.20 106 1976 4301.5 296.40 174 1993 7031.2 1075.30
3 39 1959 2331.4 141.00 107 1976 4321.9 300.00 175 1993 7062.0 1105.20
4 40 1959 2339.1 140.00 108 1976 4357.4 305.90 176 1993 7168.7 1129.20
1 41 1960 2391.0 139.80 109 1977 4410.5 313.60 177 1994 7229.4 1140.00
2 42 1960 2379.2 139.60 110 1977 4489.8 319.00 178 1994 7330.2 1145.60
3 43 1960 2383.6 141.20 111 1977 4570.6 324.90 179 1994 7370.2 1152.10
4 44 1960 2352.9 140.70 112 1977 4576.1 330.50 180 1994 7461.1 1149.80
1 45 1961 2366.5 141.90 113 1978 4588.9 336.60 181 1995 7488.7 1146.50
2 46 1961 2410.8 142.90 114 1978 4765.7 347.10 182 1995 7503.3 1144.10
3 47 1961 2450.4 143.80 115 1978 4811.7 352.70 183 1995 7561.4 1141.90
4 48 1961 2500.4 145.20 116 1978 4876.0 356.90 184 1995 7621.9 1126.20
1 49 1962 2544.0 146.00 117 1979 4888.3 362.10 185 1996 7676.4 1122.00
2 50 1962 2571.5 146.60 118 1979 4891.4 373.60 186 1996 7802.9 1115.00
3 51 1962 2596.8 146.30 119 1979 4926.2 379.70 187 1996 7841.9 1095.80
4 52 1962 2603.3 147.80 120 1979 4942.6 381.40 188 1996 7931.3 1080.50
1 53 1963 2634.1 149.20 121 1980 4958.9 388.10 189 1997 8016.4 1072.00
2 54 1963 2668.4 150.40 122 1980 4857.8 389.40 190 1997 8131.9 1066.20
3 55 1963 2719.6 152.00 123 1980 4850.3 405.40 191 1997 8216.6 1065.30
4 56 1963 2739.4 153.30 124 1980 4936.6 408.10 192 1997 8272.9 1073.40
1 57 1964 2800.5 154.50 125 1981 5032.5 418.70 193 1998 8396.3 1080.30
2 58 1964 2833.8 155.60 126 1981 4997.3 425.50 194 1998 8442.9 1077.60
3 59 1964 2872.0 158.70 127 1981 5056.8 427.50 195 1998 8528.5 1076.20
4 60 1964 2879.5 160.30 128 1981 4997.1 436.20 196 1998 8667.9 1097.00
1 61 1965 2950.1 161.50 129 1982 4914.3 442.40 197 1999 8733.5 1102.20
2 62 1965 2989.9 162.20 130 1982 4935.5 447.90 198 1999 8771.2 1099.80
3 63 1965 3050.7 164.90 131 1982 4912.1 457.50 199 1999 8871.5 1093.40
4 64 1965 3123.6 167.80 132 1982 4915.6 474.30 200 1999 9049.9 1124.80
1 65 1966 3201.1 170.50 133 1983 4972.4 490.20 201 2000 9102.5 1113.70
2 66 1966 3213.2 171.60 134 1983 5089.8 504.40 202 2000 9229.4 1105.30
3 67 1966 3233.6 172.00 135 1983 5180.4 513.40 203 2000 9260.1 1096.00
4 68 1966 3261.8 172.00 136 1983 5286.8 520.80 204 2000 9303.9 1088.10
Source: FRED (Federal Reserve Bank of St. Louis) (http://www.economagic.com/fedstl.htm#GDP, http://www.economagic.com/em-cgi/data.exe/frbH6/m1)
Table 7.7 Estimation of distributed lag model (7.28) from Example 7.2 (gross domestic product
GDP explained by lagged money supply M1) calculated by means of EViews
Dependent Variable: LOG(GDP)
Variable       Coefficient   Std. Error   t-Statistic   Prob.
C              4.946307      0.056813     87.06268      0.0000
LOG(M1)        1.313582      0.867809     1.513677      0.1317
LOG(M1(-1))    –0.406028     1.489386     –0.272614     0.7854
LOG(M1(-2))    –0.055179     1.493922     –0.036936     0.9706
LOG(M1(-3))    –0.252407     1.490153     –0.169383     0.8657
LOG(M1(-4))    –0.021173     0.878340     –0.024106     0.9808
Table 7.8 Estimation of geometric distributed lag model (7.29) after Koyck transformation (7.30)
from Example 7.2 (gross domestic product GDP explained by lagged money supply M1) calculated
by means of EViews
Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               0.060444      0.030959     1.952384      0.0523
LOG(M1)         0.004530      0.003765     1.203104      0.2304
LOG(GDP(-1))    0.990601      0.006235     158.8895      0.0000
The sum of the estimated parameters corresponding to the actual value and the lagged
values of ln M1 in Table 7.9 (i.e., the long-run effect of money supply on gross
domestic product) is βPDL = 0.565.
⋄
Table 7.9 Estimation of polynomial distributed lag model from Example 7.2 (gross domestic
product GDP explained by lagged money supply M1) calculated by means of EViews
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          5.041623      0.054881     91.86544      0.0000
PDL01      –0.193079     0.073166     –2.638932     0.0090
PDL02      0.058884      0.090832     0.648271      0.5176
PDL03      0.016896      0.005232     3.229455      0.0015
PDL04      –0.003285     0.003699     –0.887938     0.3757
Durbin–Watson stat 0.016460    Prob(F-statistic) 0.000000
Lag Distribution of LOG(M1) [graphical plot of the lag distribution not reproduced]
i             Coefficient   Std. Error   t-Statistic
0             0.77136       0.27859      2.76880
1             0.34548       0.06241      5.53531
2             0.05194       0.12823      0.40505
3             –0.12898      0.17439      –0.73960
4             –0.21699      0.15996      –1.35647
5             –0.23178      0.10951      –2.11655
6             –0.19308      0.07317      –2.63893
7             –0.12058      0.11146      –1.08182
8             –0.03400      0.16172      –0.21026
9             0.04695       0.17525      0.26793
10            0.10258       0.12791      0.80200
11            0.11317       0.06468      1.74959
12            0.05902       0.28403      0.20778
Sum of Lags   0.56508       0.00954      59.2126
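The three long-run effects of Example 7.2 can be recomputed from the reported coefficients as a plausibility check. The sketch below takes the numbers from Tables 7.7, 7.8, and 7.9. For the PDL part it assumes that the software centres the polynomial at c = k/2 = 6, i.e., that the lag coefficient at lag i equals the sum of the PDL coefficients multiplied by powers of (i − c); this centring is an assumption about the software convention and differs from the uncentred form (7.24), but it matches the lag distribution printed in Table 7.9.

```python
# Table 7.7: coefficients of LOG(M1) at lags 0 to 4 in the unrestricted DL model.
dl = [1.313582, -0.406028, -0.055179, -0.252407, -0.021173]
beta_dl = sum(dl)

# Table 7.8: in the Koyck form (7.30) the coefficient of LOG(M1) is beta(1 - lambda)
# and the coefficient of LOG(GDP(-1)) is lambda.
b_m1, lam = 0.004530, 0.990601
beta_gdl = b_m1 / (1.0 - lam)

# Table 7.9: PDL01, ..., PDL04; centred cubic polynomial with c = k / 2 (assumption).
gamma = [-0.193079, 0.058884, 0.016896, -0.003285]
k = 12
c = k // 2
lag_coefs = [sum(g * (i - c) ** j for j, g in enumerate(gamma)) for i in range(k + 1)]
beta_pdl = sum(lag_coefs)

print(f"beta_DL = {beta_dl:.3f}, beta_GDL = {beta_gdl:.3f}, beta_PDL = {beta_pdl:.3f}")
print("lag coefficients:", [round(b, 5) for b in lag_coefs])
```

Small differences from the printed lag distribution are due to rounding of the reported coefficients.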
7.4 Autoregressive Distributed Lag Model
Autoregressive distributed lag model (or ADL model) contains both lagged explanatory
variables and the lagged explained variable. It can be looked upon as a special
(linear) filtering scheme (therefore, it is sometimes also called a transfer function
model, since transfer functions are typical concepts in filtering theories). This
model can be formally written by means of operators used in the Box–Jenkins
methodology (see Sect. 6.2) as
$$ \varphi(B) y_t = \alpha + \beta(B) x_t + u_t, \tag{7.31} $$
where $\varphi(B) = 1 - \varphi_1 B - \ldots - \varphi_p B^p$ is the autoregressive operator, $\beta(B) = \beta_0 + \beta_1 B + \ldots + \beta_k B^k$
is the operator of distributed lags of the explanatory variable x, and ut is the
residual in the form of a stationary process ARMA(r, s). In addition, more explanatory
variables can be included, but then the model (7.31) must be extended, e.g., for two
explanatory variables x1 and x2 to the form
$$ \varphi(B) y_t = \alpha + \beta_1(B) x_{t1} + \beta_2(B) x_{t2} + u_t. \tag{7.32} $$
If the autoregressive operator φ(B) is stationary (i.e., its roots lie outside the unit
circle in the complex plane), then (7.31) can be rewritten as
$$ y_t = \mu + \frac{\beta(B)}{\varphi(B)} x_t + \eta_t, \tag{7.33} $$
where $\mu = \alpha / (1 - \varphi_1 - \ldots - \varphi_p)$ and the residual component $\eta_t = \varphi(B)^{-1} u_t$ is
a stationary process ARMA(p + r, s). According to (7.33), the process {yt} originates
by filtering the process {xt} (i.e., one can indeed look upon the ADL model as
the “model based on the principle of filters”; see above).
Remark 7.1 Obviously, other models described in this chapter can also be
presented as ADL models:
• Linear regression models with ARMA residuals (7.4) from Sect. 7.2 (see the ADL
model written as (7.32) with φ(B) = β1(B) = β2(B) = ... = 1).
• DL models (7.15) from Sect. 7.3 (see the ADL model written as (7.33)).
• GDL models after the Koyck transformation (7.22) from Sect. 7.3.1 (see the ADL model
written as (7.31) with φ(B) = 1 – λB, β(B) = β(1 – λ), and ut = εt – λεt–1).
⋄
The construction of an ADL model is analogous to the classical procedures in the Box–Jenkins
methodology and supposes the application of suitable software (see, e.g.,
EViews). Moreover, the models usually have specific forms since they are
constructed for specific situations. We describe here two specific cases for which
the ADL models seem to be useful:
7.4.1 Intervention Analysis
An important application of ADL models is the so-called intervention analysis, which is
suitable for situations when the course of a time series is evidently disturbed at a time
t0 by a one-off external event that significantly changes the
course of this time series. For instance, the sale of a product suddenly jumps up due
to a successful advertisement, the financial market is influenced by a change in
legislation, and the like. Such interferences are denoted in time series analysis as
interventions.
A recommended method of constructing a model of intervention analysis is to
apply the scheme of ADL models (7.31), in which the explanatory variable xt has the
form of jumps St or pulses Pt defined as
$$ S_t = \begin{cases} 0 & \text{for } t < t_0, \\ 1 & \text{for } t \geq t_0, \end{cases} \qquad P_t = \begin{cases} 0 & \text{for } t \neq t_0, \\ 1 & \text{for } t = t_0, \end{cases} \tag{7.34} $$
with the moment of intervention t0 (jumps St and pulses Pt are obviously special
examples of so-called dummy variables (or dummies), which are popular in
econometric modeling). Choosing a jump or a pulse and a suitable form of the ADL
scheme (7.31), one can model various ways in which the intervention fades away (such
an analysis is sometimes also denoted as the impulse response). For example, an
immediate dynamic change in a given time series can be modeled using (7.33) in
the form
$$ y_t = \mu + \frac{\beta_0}{1 - \varphi_1 B} S_t + \eta_t = \mu + \beta_0 \left( S_t + \varphi_1 S_{t-1} + \varphi_1^2 S_{t-2} + \ldots \right) + \eta_t. \tag{7.35} $$
The response to such an intervention corresponds to shifting the time series by
the value $\beta_0 (1 + \varphi_1 + \ldots + \varphi_1^h)$ at each time t0 + h (h = 0, 1, ...). If |φ1| < 1, then this
shift asymptotically approaches the value β0/(1 – φ1). If φ1 = 1, then the level of the time
series changes linearly with an increment of β0 during each time unit. Models for other
modes of intervention changes, including practical applications, are described, e.g., in
Box and Tiao (1975).
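The shift implied by (7.35) can be traced numerically. The following sketch evaluates the shift at times t0 + h by a simple recursion; the function name and the parameter values are hypothetical.

```python
def step_response(beta0, phi1, horizon):
    """Shift of the series at times t0 + h, h = 0, ..., horizon, implied by (7.35)."""
    shifts, level = [], 0.0
    for _ in range(horizon + 1):
        # The recursion reproduces beta0 * (1 + phi1 + ... + phi1^h).
        level = phi1 * level + beta0
        shifts.append(level)
    return shifts


damped = step_response(beta0=2.0, phi1=0.5, horizon=40)
limit = 2.0 / (1.0 - 0.5)                                    # beta0 / (1 - phi1)
unit_root = step_response(beta0=2.0, phi1=1.0, horizon=3)    # linear growth by beta0

print("first shifts for phi1 = 0.5:", damped[:4], "limit:", limit)
print("shifts for phi1 = 1:", unit_root)
```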
7.4.2 Outliers
Another possible application of ADL schemes consists in modeling outliers (on the
other hand, outliers can also be handled by other approaches, e.g., by
robustifying statistical methods so that they are insensitive to outlying values; see Sect.
2.2.1.2). The approach based on ADL modeling may be convenient, especially if
the aim is to predict the time series. In general, two types of outliers should be
distinguished:
1. Additive outlier (abbreviated as AO) is linked additively to the basic (e.g.,
stationary) process at time t0, i.e.,
$$ y_t = z_t + \delta P_t, \tag{7.36} $$
where zt is a stationary process, Pt is the pulse according to (7.34), and δ is the size
of the modeled outlier. In particular, in the case of a stationary autoregressive process of
the form φ(B)zt = α + εt, it holds for the observed (contaminated) time series yt
(simply substituting zt = yt – δPt into this autoregressive model) that
$$ y_t = \alpha + \sum_{j=1}^{p} \varphi_j y_{t-j} + \delta P_t - \sum_{j=1}^{p} \delta \varphi_j P_{t-j} + \varepsilon_t, \tag{7.37} $$
where the dummy variable Pt–j is unit at time t0 + j and zero otherwise. Neglecting
the outlier (i.e., applying the classical autoregressive model directly to yt) is
incorrect and can cause substantial estimation and prediction errors. In addition,
if an outlier at time t0 is suspected, one can test the significance of the parameter
δ in (7.37) by means of the classical t-test.
2. Innovation outlier (abbreviated as IO) is generated in the innovation process so
that, e.g., in the case of a stationary autoregressive structure of the observed process yt,
one should write
$$ y_t = \sum_{j=1}^{p} \varphi_j y_{t-j} + \delta P_t + \varepsilon_t. \tag{7.38} $$
Such an innovation irregularity has its main impact only at time t0, and then its
influence decays, so that ignoring it is not as dangerous for estimation or
prediction as in the case of AO. The test for IO is analogous to that for AO, i.e., by
means of the classical t-test of significance of the parameter δ in (7.38). However,
if the observed process {yt} has a nonstationary ARIMA structure, then the
influence of the innovation outlier persists over long time horizons.
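The different behaviour of the two outlier types can be illustrated on a noise-free AR(1) process started at zero. The sketch below is only an illustration under these simplifying assumptions; the function name and the parameter values are hypothetical.

```python
def simulate_ar1_outlier(phi, n, t0, delta, kind):
    """Noise-free AR(1) path around zero with one outlier of size delta at time t0."""
    z = [0.0] * n   # uncontaminated process; without noise it stays at zero
    y = [0.0] * n
    for t in range(1, n):
        pulse = delta if t == t0 else 0.0
        if kind == "AO":
            z[t] = phi * z[t - 1]           # the basic process is not affected
            y[t] = z[t] + pulse             # (7.36)
        else:                               # "IO"
            y[t] = phi * y[t - 1] + pulse   # (7.38) with zero innovations
    return y


phi, n, t0, delta = 0.8, 12, 5, 3.0
ao = simulate_ar1_outlier(phi, n, t0, delta, "AO")
io = simulate_ar1_outlier(phi, n, t0, delta, "IO")

# What an AR(1) model applied directly to the AO series sees as its "innovations":
# delta at time t0 and -delta * phi at time t0 + 1, in agreement with (7.37).
ao_innov = [ao[t] - phi * ao[t - 1] for t in range(1, n)]   # ao_innov[t - 1] belongs to time t

print("AO series:", ao)
print("IO series:", [round(v, 4) for v in io])
print("implied innovations of the AO series:", [round(v, 4) for v in ao_innov])
```

The additive outlier distorts two consecutive one-step errors of the autoregression, whereas the innovation outlier enters as a single shock whose effect decays geometrically.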
7.5 Exercises
Exercise 7.1 Repeat the analysis from Example 7.1 (the yields to maturity of
corporate bonds of the highest quality AAA), but only for data since 1991 (hint:
AAAt = 5.84 + 0.535 TBILLt + εt, DW = 0.770).
Exercise 7.2 Repeat the analysis from Example 7.2 (the gross domestic product
GDP in the USA), but only for data since 1980 (hint: βDL = 0.469, βGDL = 0.907,
βPDL = 0.421).
Part IV
Financial Time Series
Chapter 8
Volatility of Financial Time Series
https://doi.org/10.1007/978-3-030-46347-2_8
8.1 Characteristic Features of Financial Time Series
The models introduced in previous chapters can mostly be considered as linear
models (e.g., the linear process from Sect. 6.2 is a linear function of white noise
values) or can be linearized by a simple transformation (e.g., the logarithmic
transformation). However, many relations in economics and particularly in finance
are principally nonlinear (e.g., the dependence of the volatility of financial time series on
previous time series values). Therefore, various nonlinear models are preferred in
finance, since they fit the substance of financial data better.
Financial time series are usually derived from prices Pt of financial assets at
time t (e.g., stocks or commodities, but also indices) and price variations $\Delta P_t = P_t - P_{t-1}$.
However, one prefers to analyze returns (relative prices), since in contrast to the
prices they do not depend on monetary units, which facilitates comparisons among
assets. The most usual relative prices are the so-called log returns (namely continuously
compounded returns) defined as
$$ r_t = \ln \frac{P_t}{P_{t-1}} = \ln P_t - \ln P_{t-1} = p_t - p_{t-1} \tag{8.1} $$
(see also (6.91)), where pt = ln Pt are logarithmic prices at time t. Sometimes also
relative price variations or simply returns are used (even if sometimes the term
“return” denotes the log return (8.1))
$$ \text{return}_t = \frac{P_t - P_{t-1}}{P_{t-1}}. \tag{8.2} $$
Obviously, for smaller | rt | one can approximate
$$ r_t = \ln (1 + \text{return}_t) \approx \text{return}_t \tag{8.3} $$
(more exactly, it holds that $r_t = \text{return}_t + O(\text{return}_t^2)$).
[Figure 8.1: time plot of daily log returns of index PX (in year 2016).]
Fig. 8.1 Daily log returns of index PX in 2016 (251 trading days)
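A short numerical illustration of (8.1) to (8.3) follows. The price values are hypothetical; the snippet compares log returns with simple returns and also shows that log returns add up over time.

```python
import math

prices = [100.0, 101.0, 99.5, 102.0, 101.2]   # hypothetical daily prices

# Log returns (8.1) and simple returns (8.2) between neighboring prices.
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
simple_returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

for r, s in zip(log_returns, simple_returns):
    # By (8.3) the difference is of the order of the squared return.
    print(f"log return {r:+.6f}  simple return {s:+.6f}  difference {r - s:+.2e}")

# Log returns are additive: their sum is the log return over the whole period.
total_log_return = sum(log_returns)
print(f"sum of log returns = {total_log_return:.6f}")
```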
In particular, the linear models from Chaps. 6 and 7 are not capable of covering
some typical properties of financial time series:
• Nonstationarity of prices and stationarity of (log) returns: The time series {Pt}
are usually nonstationary in the form of random walk without intercept (see, e.g.,
the daily values of index PX during the year 2016 in Example 6.6 including
Fig. 6.8(a)). On the contrary, the log returns {rt} (or price variations ΔPt) can be
mostly considered as (weakly) stationary (i.e., with the first and second moments
invariant in time): (1) {rt} oscillate around zero; (2) even if the oscillations vary in
magnitude, their averages over longer periods are almost constant (see daily log
returns of index PX during 2016 in Fig. 8.1); (3) {rt} are almost uncorrelated in
time (see below).
• Uncorrelated log returns: The time series {rt} of log returns generally display
relatively small autocorrelations so that they are close to white noise (due to their
weak stationarity; see above). It is demonstrated in Fig. 8.2(a) with estimated
autocorrelations of log returns of index PX during 2016 (there are no significance
bands in Fig. 8.2, since the classical Bartlett’s approximation (6.12) cannot be
applied here). On the other hand, for intraday series with very small time intervals
between observations (measured in minutes or seconds), significant autocorrelations
can be observed due to the so-called microstructure effect.
• Correlated log returns squared: In contrast to the previous property, the time
series $\{r_t^2\}$ of log returns squared (or $\{|r_t|\}$ of log returns in absolute values) are
usually strongly correlated (see Fig. 8.2(b)). This property is important for the
construction of GARCH models based on the principle of conditional
heteroscedasticity (see Sect. 8.3).
Fig. 8.2 Estimated autocorrelations of (a) log returns and (b) log returns squared of index PX in
2016 (251 trading days)
Fig. 8.3 Probability density of N(0, 1) (dotted line) versus probability density of a leptokurtic
distribution with zero mean value, unit variance, and kurtosis $\gamma_2 > 0$ (solid line)
• Leptokurtic (or heavy-tailed or fat-tailed) distributions: Log returns of financial
assets mostly have probability distributions which are sharply peaked at zero,
have fat tails (decreasing to zero more slowly than $\exp(-x^2/2)$), and have narrow
shoulders (one can identify “narrower waist and heavier tails”; see Figs. 8.3 and
8.4). The typical characteristic of such distributions is a significantly positive
kurtosis coefficient (it is a measure of tail thickness).
Fig. 8.4 Leptokurtic distribution (particularly with higher kurtosis coefficient compared with the
corresponding normal distribution) for daily log returns of index PX in 2016 from Example 8.1.
Histogram statistics for the series of log returns of PX (in 2016): Observations 250; Mean -7.15e-05;
Median 0.000228; Maximum 0.034724; Minimum -0.042614; Std. Dev. 0.010647; Skewness -0.650275;
Kurtosis 4.997044; Jarque-Bera 59.16266; Probability 0.000000.
• Volatility clustering: Large absolute log returns | rt | tend to appear in clusters, i.e.,
turbulent (high-volatility) subperiods are followed by quiet (low-volatility)
periods, since high (low) deviations of returns can be expected after high (low)
previous deviations, respectively (see Fig. 6.8(b) for first differences and Fig. 8.1
for log returns of daily PX in 2016). The subperiods of volatility bursts are
recurrent, but they do not appear periodically.
• Leverage effect: This effect involves an asymmetry of the impact of past positive
and negative log returns on the current volatility (obviously, positive log returns
correspond to increases of prices, while negative log returns correspond to
decreases of prices). More exactly, previous negative returns (i.e., price
decreases) tend to increase volatility by a larger amount than positive returns
(i.e., price increases) of the same magnitude. Empirically, a positive correlation is
often detected between $r_t^+ = \max(r_t, 0)$ and $|r_{t+h}|$ for positive h, but this correlation
is generally less than that between $r_t^- = \max(-r_t, 0)$ and $|r_{t+h}|$ (e.g., for
log returns of daily PX in 2016 one has $\mathrm{corr}(r_t^+, |r_{t+1}|) = 0.377$, while
$\mathrm{corr}(r_t^-, |r_{t+1}|) = 0.527$).
• Calendar effects: Various calendar effects should be also mentioned in the
context of financial time series: the day of week, the proximity of holidays,
seasonality, and other factors may have significant effects on returns. Following
a period of market closure, volatility tends to increase, reflecting the information
cumulated during this break. Similar effects appear in intraday series as well.
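Several of the features listed above can be quantified with a few lines of code. The following is a minimal sketch using only the Python standard library; the return series is made up for illustration (it is not the PX data), and all function names are introduced here as assumptions.

```python
import math

def mean(x):
    return sum(x) / len(x)

def autocorr(x, lag):
    """Sample autocorrelation at a positive lag."""
    m = mean(x)
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t + lag] - m) for t in range(len(x) - lag))
    return num / denom

def excess_kurtosis(x):
    """Sample kurtosis minus 3 (zero for a normal distribution)."""
    m = mean(x)
    m2 = mean([(v - m) ** 2 for v in x])
    m4 = mean([(v - m) ** 4 for v in x])
    return m4 / m2 ** 2 - 3.0

def corr(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def leverage_correlations(r, h=1):
    """Return corr(r_t^+, |r_{t+h}|) and corr(r_t^-, |r_{t+h}|)."""
    pos = [max(v, 0.0) for v in r[:-h]]    # r_t^+ = max(r_t, 0)
    neg = [max(-v, 0.0) for v in r[:-h]]   # r_t^- = max(-r_t, 0)
    future_abs = [abs(v) for v in r[h:]]
    return corr(pos, future_abs), corr(neg, future_abs)

# Made-up log returns, for illustration only
r = [0.010, -0.020, 0.025, -0.004, 0.003, -0.030, 0.028, 0.002, -0.001, 0.005]
print("ACF(1) of r     :", autocorr(r, 1))
print("ACF(1) of r^2   :", autocorr([v * v for v in r], 1))
print("excess kurtosis :", excess_kurtosis(r))
print("leverage corrs  :", leverage_correlations(r))
```

On real daily data one would expect a small first value, a clearly positive second value, positive excess kurtosis, and a larger second leverage correlation than the first; such a short artificial series need not show any of these.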
One can see that the concept of volatility is very important for financial analysis.
In general, volatility can be looked upon as the spread of all possible outcomes of an
uncertain variable. Volatility is related to, but not exactly the same as, risk. Risk is
associated with undesirable outcome, whereas volatility (as the measure strictly for
uncertainty) can occur due to a positive outcome. In any case, the volatility is an
input to core applications in finance such as investing, portfolio construction, option

pricing, hedging, and risk management. Even volatility indices are constructed
nowadays so that futures and options on the observed volatility can be traded on
exchanges (e.g., the index VIX on the derivative exchange CBOE).
Typically, in financial markets, one is often concerned with the spread of asset
returns and measures their volatility simply as the sample variance, e.g.:
$$\hat\sigma^2 = \frac{1}{T-1}\sum_{t=1}^{T}(r_t - \hat\mu)^2, \tag{8.4}$$
where $r_t$ is the log return on day t, and $\hat\mu$ is the average return over a longer period of
T trading days (the standard deviation $\hat\sigma$ may sometimes be preferred to the variance
in (8.4), since it has the same unit of measure as the observations in the sample). As the
volatility does not remain constant through time, the conditional volatility $\hat\sigma_t^2$ may be
a better instrument for asset pricing and risk management at time t, e.g.:
$$\hat\sigma_t^2 = \frac{1}{k-1}\sum_{\tau=t-k}^{t-1}(r_\tau - \hat\mu_t)^2, \quad \text{where } \hat\mu_t = \frac{1}{k}\sum_{\tau=t-k}^{t-1} r_\tau \tag{8.5}$$
(it is constructed conditionally using information relevant for time t, e.g., data over
several days for risk management, over several months for option pricing, or over
several years for investment analysis).
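The difference between the unconditional estimate (8.4) and the rolling estimate (8.5) can be sketched as follows. This is an illustrative implementation only; the function names and the sample values are assumptions, not part of the original text.

```python
def sample_variance(r):
    """Unconditional estimate (8.4) with divisor T - 1."""
    T = len(r)
    mu = sum(r) / T
    return sum((v - mu) ** 2 for v in r) / (T - 1)

def rolling_variance(r, k):
    """Conditional estimate (8.5).

    Entry number i of the result is the estimate for time t = k + i,
    computed from the k preceding observations r[t-k], ..., r[t-1].
    """
    if k < 2 or k > len(r):
        raise ValueError("window length k must satisfy 2 <= k <= len(r)")
    return [sample_variance(r[t - k:t]) for t in range(k, len(r) + 1)]

# Made-up log returns: a quiet stretch followed by a turbulent one
r = [0.001, -0.002, 0.001, 0.000, 0.020, -0.030, 0.025, -0.015]
print("overall variance:", sample_variance(r))
print("rolling (k = 4) :", rolling_variance(r, 4))
```

The rolling values rise once the window reaches the turbulent observations, while the single overall figure averages both regimes.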
The volatility is obviously a latent (i.e., non-observable) quantity. In Sect. 8.3,
several models are given that make it possible to estimate it. However, besides model
approaches to volatility one can also use so-called proxy approaches, which are
based on replacing the non-observable volatility by an observable proxy (see,
e.g., Poon (2005)):
• The most usual proxy for the volatility in time t is the square of log return in this
time, i.e., $r_t^2$ (surprisingly, taking deviations around zero instead of centering
them by means of the sample mean typically increases the accuracy of volatility
prediction; see Poon (2005)).
• Another approach consists in applying the H-L measure (the so-called high-low measure
by Parkinson (1980))
$$\hat\sigma_t^2 = \frac{(\ln H_t - \ln L_t)^2}{4 \ln 2}, \tag{8.6}$$
where $H_t$ and $L_t$ denote, respectively, the highest and the lowest prices on day t
(the H-L proxy assumes that the price process follows a geometric Brownian
motion).
• If one has intraday data at short intervals such as 5 or 15 min (so-called
tick data) at one's disposal, then the realized volatility can be constructed by summing squared
log returns, i.e.:
$$RV_{t+1} = \sum_{j=1}^{m} r_{m,t+j/m}^2 \tag{8.7}$$
(there are m log returns in one unit of time). If the log returns are serially
uncorrelated, then one can show (see, e.g., Karatzas and Shreve (1988)) that
$$\operatorname*{plim}_{m\to\infty}\left(\int_t^{t+1}\sigma_s^2\,ds - \sum_{j=1}^{m} r_{m,t+j/m}^2\right) = 0, \tag{8.8}$$
i.e., the realized volatility converges in probability to the so-called integrated
volatility, which is based on the concept of volatility $\sigma_s^2$ continuous in time.
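The three proxies above (squared return, high-low measure (8.6), and realized volatility (8.7)) can be sketched in code as follows. The daily high, low, and intraday returns are hypothetical numbers chosen for illustration, and the function names are assumptions.

```python
import math

def squared_return_proxy(r_t):
    """Squared log return as the volatility proxy at time t."""
    return r_t ** 2

def parkinson_proxy(high, low):
    """High-low measure (8.6) from the day's highest and lowest price."""
    return (math.log(high) - math.log(low)) ** 2 / (4.0 * math.log(2.0))

def realized_variance(intraday_log_returns):
    """Sum of the m squared intraday log returns of one day, as in (8.7)."""
    return sum(x * x for x in intraday_log_returns)

# Hypothetical data for a single trading day
daily_log_return = 0.004
high, low = 101.8, 99.6
intraday = [0.002, -0.003, 0.001, 0.004, -0.002, 0.002]

print("squared return :", squared_return_proxy(daily_log_return))
print("high-low proxy :", parkinson_proxy(high, low))
print("realized var.  :", realized_variance(intraday))
```

The three numbers generally differ: the squared daily return uses only the open-to-close move, whereas the other two proxies also react to price movement within the day.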
8.2 Classification of Nonlinear Models of Financial Time Series
If we confine ourselves to purely stochastic models of time series (i.e., without
deterministic trends, periodicities, and other deterministic components), then the
general nonlinear model of time series can be written as
$$y_t = f(e_t, e_{t-1}, e_{t-2}, \ldots), \tag{8.9}$$
where f is a (nonlinear) function of uncorrelated random variables $e_t$ with zero mean
value and constant variance (a stronger assumption requiring that the variables $e_t$ are
even iid is often applied, guaranteeing the existence of models of this type; see, e.g.,
Wu (2005)). The random variables $e_t$ have the meaning of prediction errors
(or equivalently deviations from the conditional mean value) in (8.9) (some authors
also call them shocks or innovations). The model (8.9) is a natural nonlinear
extension of the linear process (6.17) from Sect. 6.2
$$y_t = \varepsilon_t + \sum_{i=1}^{\infty} \psi_i \varepsilon_{t-i} \tag{8.10}$$
(the linear process is a general scheme of linear models, which in comparison to (8.9)
uses explicitly white noise values $\varepsilon_t$ as the corresponding generators).
The nonlinear process in the form (8.9) is too general to be applied practically.
Therefore, one prefers a more specific form of it written by means of the first and
second conditional moments. This should not be surprising, since, e.g., the simple
stationary process AR(1) introduced in (6.36) can be written by means of the
conditional mean value as
$$E(y_t \mid y_{t-1}) = \varphi_1 y_{t-1}. \tag{8.11}$$
Generally, one can condition in time t by the entire information $\Omega_{t-1}$ known till
time t − 1. More specifically, we can imagine that the past information is generated
by all past values $\{y_{t-1}, y_{t-2}, \ldots\}$ and $\{e_{t-1}, e_{t-2}, \ldots\}$ using a suitable function of these
values (one usually uses the term σ-algebra in such a situation). Due to the restriction
to the first and second moments only, one usually models the conditional mean value
$\mu_t$ and the conditional variance $\sigma_t^2$ by means of simple (nonlinear) functions of
information in $\Omega_{t-1}$
$$\mu_t = E(y_t \mid \Omega_{t-1}) = g(\Omega_{t-1}), \qquad \sigma_t^2 = h_t = \mathrm{var}(y_t \mid \Omega_{t-1}) = h(\Omega_{t-1}), \tag{8.12}$$
where g and h are suitable functions ($h(\cdot) > 0$). Although the time index should
distinguish the conditional moments from the unconditional ones, it would be more
correct to write, e.g., $\mu_{t|t-1}$ and $\sigma_{t|t-1}^2$ instead of the simplified symbols $\mu_t$ and $\sigma_t^2$ in (8.12)
(in fact, these are one-step-ahead predictions of mean value and variance of the given
process). Then one can write
$$y_t = \mu_t + e_t, \tag{8.13}$$
since $e_t$ are the prediction errors or equivalently the deviations from the conditional
mean value (if $y_t = r_t$, then one also uses for $e_t$ the term mean-corrected returns).
Moreover, as it holds
$$\sigma_t^2 = \mathrm{var}(y_t \mid \Omega_{t-1}) = \mathrm{var}(e_t \mid \Omega_{t-1}), \tag{8.14}$$
one addresses $\sigma_t^2$ as the volatility of the given time series in time t (see also Sect. 8.1). The
final form of the nonlinear process, which is applied in this context most often, is then
$$y_t = \mu_t + \sigma_t \varepsilon_t = \mu_t + \sqrt{h_t}\,\varepsilon_t = g(\Omega_{t-1}) + \sqrt{h(\Omega_{t-1})}\,\varepsilon_t, \tag{8.15}$$
where $\varepsilon_t$ are iid random variables with zero mean value and unit variance. Obviously,
it holds
$$e_t = \sigma_t \varepsilon_t. \tag{8.16}$$
It is worth noting that the random variables $e_t$ are uncorrelated, but in contrast to $\varepsilon_t$
they need not be independent in general.
Overall, the considered model is given by the two equations in (8.12): the first one is
the mean equation and the second one is the volatility equation. According to the
type of these equations, nonlinear processes can be classified as
• Nonlinear in mean: they have the nonlinear function g.

• Nonlinear in variance: they have the nonlinear function h; this function is often
invariant in time so that such processes are frequently addressed as the processes
with conditional heteroscedasticity.
Both categories can be combined and further divided into many more specific processes
(see below). The linear Box–Jenkins models from Chap. 6 present a special case of
(8.15) for a linear function g and a constant function h.
8.3 Volatility Modeling
Modeling and forecasting of volatility is nowadays a very important topic of
financial analysis (both theoretical and practical). This is in no way surprising,
because volatility, considered as the standard deviation of various indicators of
profit and loss rates, represents the basic measure of risk; see, e.g., the methodology
of capital adequacy for banks denoted as Basel III and the solvency regime for insurance
companies denoted as Solvency II, which are based on the concept of value-at-risk
(VaR) and on similar measures of risk (including commercial software products of
the type RiskMetrics (1996)).
In this section, we shall present various methods for estimating and predicting
volatility in financial time series (even if approaches based on proxies are also
possible; see Sect. 8.1).
8.3.1 Historical Volatility and EWMA Models
Historical volatility is the original approach to volatility, estimating it simply as the
sample variance or the sample standard deviation over a chosen historical period
(hence historical volatility), i.e., in the simplest case as
$$\hat\sigma_t^2 = \frac{\sum_{\tau=t-k}^{t-1}(y_\tau - \hat\mu_t)^2}{k-1}, \quad \text{where } \hat\mu_t = \frac{\sum_{\tau=t-k}^{t-1} y_\tau}{k} \tag{8.17}$$
for a suitable length of sample period k (see also (8.5)). Moreover, the value (8.17) is
often used in practice as the prediction constructed in time t for short prediction
horizons. Even though previously the historical volatility has been applied broadly in
practice (e.g., in order to estimate the volatility of underlying assets when calculating
option premiums according to Black–Scholes formula), nowadays its meaning is
reduced to determination of benchmarks when assessing the effectiveness of more
complex models of volatility.
A pragmatic extension of the historical volatility approach is the class of EWMA models.
The most frequent EWMA (exponentially weighted moving average) model is an
analogy of simple exponential smoothing (see Sect. 3.3.1) for volatility. In contrast
to the historical volatility calculation, in EWMA models the averaged squares in
(8.17) are weighted with weights which decrease exponentially into the past. Such a
modification is advantageous in practice:
• In practice, volatility is usually influenced more by current values which are
distinguished by higher weights from the values farther in the past.
• Moreover, in EWMA models the influence of high deviations persists during
longer time periods than in (8.17) with smaller k, where high deviations leaving
the sample range can cause even jumps in the estimated volatility.
Due to the analogy of EWMA models to simple exponential smoothing (see
(3.75) and (3.77)), the volatility can be estimated using EWMA as
$$\hat\sigma_t^2 = (1-\lambda)\sum_{j=0}^{\infty}\lambda^j (y_{t-1-j} - \bar y)^2 = (1-\lambda)(y_{t-1} - \bar y)^2 + \lambda\hat\sigma_{t-1}^2, \tag{8.18}$$
where the estimated volatility $\hat\sigma_t^2$ presents the volatility prediction from time t − 1, $\bar y$
is an average level of the given time series, and λ (0 < λ < 1) is a discount constant
chosen in advance. If one calculates the volatility of a time series of log returns $r_t$ (see
(8.1)), then the average return is often nearly zero (particularly for higher frequencies
of observations, e.g., for daily returns), so that (8.18) takes the form
$$\hat\sigma_t^2 = (1-\lambda)\sum_{j=0}^{\infty}\lambda^j r_{t-1-j}^2 = (1-\lambda)r_{t-1}^2 + \lambda\hat\sigma_{t-1}^2. \tag{8.19}$$
In financial practice (see, e.g., RiskMetrics (1996)), owing to broad experience with
volatility estimation, the value 0.94 is routinely recommended for the constant λ.
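The recursion (8.19) translates directly into a short loop. The following is a minimal sketch; the function name `ewma_variance` and the zero starting value are assumptions made here, and the inputs are the magnitudes of the first five entries in column 1 of Table 8.1 as printed (only their squares enter (8.19)).

```python
def ewma_variance(r, lam=0.94, initial=0.0):
    """EWMA volatility via the recursion (8.19).

    sigma2[0] is the chosen initial value; sigma2[t] for t >= 1 is the
    prediction made after observing r[t-1], so the result has
    len(r) + 1 entries.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("the discount constant must satisfy 0 < lam < 1")
    sigma2 = [initial]
    for x in r:
        sigma2.append((1.0 - lam) * x * x + lam * sigma2[-1])
    return sigma2

# Magnitudes of the first five values in column 1 of Table 8.1
r = [0.0030, 0.0052, 0.0217, 0.0086, 0.0059]
for t, s2 in enumerate(ewma_variance(r)):
    print(t, f"{s2:.8f}")
```

With a zero initial value the first estimates are biased downward, because the weights applied to the few available observations do not yet sum to one; the effect fades as more observations enter the recursion.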
Example 8.1 Table 8.1 and Fig. 8.1 show daily log returns $r_t$ of index PX in 2016
(250 values for 251 trading days calculated as differences of logarithmic index
values $r_t = \ln PX_t - \ln PX_{t-1}$).
This time series $\{r_t\}$ shows typical features of financial time series (see Sect. 8.1):
• Volatility clustering (see the volatility bunches at the beginning and in the middle
of the time series in Fig. 8.1)
• Leptokurtic distribution (see the histogram, the kurtosis coefficient 4.997 − 3 =
1.997 > 0, and the Jarque–Bera test of normality in Fig. 8.4).
By means of the recursive formula (8.19) of the EWMA model with zero initial
value, the corresponding volatility has been estimated (see the graphical plot in
Fig. 8.5). The EWMA estimate confirms the previous subjective conclusion
(namely the occurrence of increased volatility at the beginning and in the middle
of $\{r_t\}$).
⋄
Table 8.1 Daily log returns of index PX in 2016 (250 values for 251 trading days written in columns) from Example 8.1 (see also Fig. 8.1 and Table 6.10)
1 2 3 4 5 6 7 8 9 10
1 0.0030 0.0227 0.0043 0.0018 0.0011 0.0107 0.0147 0.0111 0.0078 0.0021
2 0.0052 0.0125 0.0002 0.0011 0.0009 0.0092 0.0065 0.0021 0.0012 0.0034
3 0.0217 0.0193 0.0131 0.0079 0.0025 0.0023 0.0028 0.0044 0.0075 0.0066
4 0.0086 0.0015 0.0182 0.0028 0.0012 0.0143 0.0023 0.0043 0.0076 0.0021
5 0.0059 0.0347 0.0070 0.0013 0.0062 0.0041 0.0031 0.0012 0.0014 0.0043
6 0.0011 0.0166 0.0081 0.0052 0.0002 0.0111 0.0030 0.0043 0.0160 0.0024
7 0.0053 0.0186 0.0119 0.0077 0.0101 0.0085 0.0058 0.0009 0.0068 0.0006
8 0.0178 0.0011 0.0017 0.0073 0.0144 0.0014 0.0064 0.0128 0.0040 0.0092
9 0.0197 0.0083 0.0048 0.0140 0.0033 0.0224 0.0002 0.0021 0.0142 0.0007
10 0.0152 0.0175 0.0085 0.0109 0.0239 0.0218 0.0040 0.0057 0.0065 0.0059
11 0.0070 0.0085 0.0010 0.0001 0.0072 0.0077 0.0015 0.0032 0.0075 0.0013
12 0.0209 0.0274 0.0078 0.0217 0.0325 0.0068 0.0052 0.0095 0.0176 0.0052
13 0.0046 0.0119 0.0044 0.0041 0.0261 0.0067 0.0086 0.0126 0.0058 0.0062
14 0.0308 0.0003 0.0082 0.0061 0.0125 0.0061 0.0026 0.0058 0.0012 0.0089
15 0.0033 0.0090 0.0058 0.0029 0.0115 0.0044 0.0030 0.0075 0.0002 0.0099
16 0.0101 0.0165 0.0001 0.0052 0.0021 0.0012 0.0003 0.0068 0.0044 0.0061
17 0.0109 0.0085 0.0169 0.0028 0.0186 0.0034 0.0069 0.0094 0.0008 0.0056
18 0.0076 0.0042 0.0064 0.0100 0.0093 0.0045 0.0004 0.0041 0.0046 0.0055
19 0.0127 0.0075 0.0074 0.0152 0.0040 0.0069 0.0084 0.0059 0.0005 0.0001
20 0.0069 0.0019 0.0097 0.0047 0.0115 0.0062 0.0107 0.0040 0.0168 0.0000
21 0.0138 0.0051 0.0000 0.0002 0.0038 0.0019 0.0058 0.0043 0.0059 0.0009
22 0.0173 0.0066 0.0122 0.0023 0.0426 0.0043 0.0018 0.0046 0.0005 0.0039
23 0.0121 0.0106 0.0022 0.0004 0.0366 0.0212 0.0011 0.0140 0.0093 0.0035
24 0.0081 0.0127 0.0116 0.0016 0.0205 0.0021 0.0038 0.0086 0.0040 0.0043
25 0.0363 0.0070 0.0019 0.0159 0.0022 0.0074 0.0056 0.0092 0.0027 0.0022
Fig. 8.5 Volatility of daily log returns of index PX in 2016 (250 values for 251 trading days)
estimated by means of EWMA model in Example 8.1
8.3.2 Implied Volatility
In finance one exploits some relations using the volatility as one of the explanatory
factors. The best known in this context is the Black–Scholes formula mentioned in
Sect. 8.3.1, which expresses analytically the call or put option premium as a function
of five factors: $S_t$ (spot price of underlying asset, e.g., a stock), X (exercise price of
option), T − t (time to maturity of option), σ (volatility of underlying asset), and
i (risk-free interest rate in the given capital environment). For example, the premium
of a European call option $C_t$ at time t is
$$C_t = S_t\,\Phi(d_1) - X e^{-i(T-t)}\,\Phi(d_2), \tag{8.20}$$
$$\text{where } d_1 = \frac{\ln(S_t/X) + (i + \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}, \quad d_2 = d_1 - \sigma\sqrt{T-t},$$
and Φ(·) is the distribution function of N(0, 1) (see, e.g., Hull (1993)). If we observe the prices
of traded (i.e., quoted) options for known values of the explanatory factors (excluding the
volatility), then we are capable of applying a suitable numerical procedure to
calculate just this volatility factor, which is then usually called implied volatility
(more exactly, it is the prediction of volatility of the underlying asset price at time t with
prediction horizon equal to the time T − t to the maturity of the option).
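The "suitable numerical procedure" is not specified in the text; bisection is one simple choice, used here as an assumption. The sketch below evaluates (8.20) and inverts it for σ. The function names, the search interval, and the option data are hypothetical.

```python
import math

def norm_cdf(x):
    """Distribution function of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, X, tau, sigma, i):
    """European call premium (8.20); tau = T - t is the time to maturity."""
    d1 = (math.log(S / X) + (i + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-i * tau) * norm_cdf(d2)

def implied_volatility(price, S, X, tau, i, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(sigma) = price by bisection.

    Relies on the call premium being increasing in sigma; the bracket
    [lo, hi] is an arbitrary choice for this sketch.
    """
    if not bs_call(S, X, tau, lo, i) <= price <= bs_call(S, X, tau, hi, i):
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, X, tau, mid, i) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical option: price it with sigma = 0.2, then recover sigma
quoted = bs_call(S=100.0, X=100.0, tau=0.5, sigma=0.2, i=0.02)
print("premium           :", quoted)
print("implied volatility:", implied_volatility(quoted, 100.0, 100.0, 0.5, 0.02))
```

Bisection is slow but robust; in practice a Newton-type iteration is often preferred, at the cost of needing a sensible starting value.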
However, the implied volatility is derived under assumptions that do not need to
be fulfilled in practice (e.g., the lognormal distribution of price of underlying asset),
and therefore, it can be significantly different from the real volatility. The practical
experience shows that the implied volatility is usually higher than the volatility
derived, e.g., by means of the GARCH models (see Sect. 8.3.5).
8.3.3 Autoregressive Models of Volatility
Autoregressive models of volatility were originally introduced as a direct implementation
of Box–Jenkins methodology for volatility (they can be classified as a
stochastic volatility approach; see Sect. 8.3.6):
$$\sigma_t^2 = \beta_0 + \sum_{j=1}^{s}\beta_j\sigma_{t-j}^2 + \varepsilon_t. \tag{8.21}$$
The classical autoregressive scheme AR(s) in (8.21) (with the classical white
noise $\{\varepsilon_t\}$) is used to predict volatility if we replace $\{\sigma_t^2\}$ by a suitable proxy
(usually by $r_t^2$ or by (8.6); see Sect. 8.1). Nowadays, this method is not
recommended in practice since it has several handicaps (e.g., the nonnegativity of
the right-hand side of (8.21) is not guaranteed, even if the logarithmic transformation
can solve this problem; see also (8.79)).
8.3.4 ARCH Models
A significant breakthrough in the systematic modeling of volatility was the
model ARCH (autoregressive conditional heteroscedasticity) applied by Engle
(1982) to model inflation in the UK. The models of this type (and particularly
their generalization to the GARCH models; see Sect. 8.3.5) are apparently one of the
most successful instruments of modeling financial time series (so far without significant
competitors). Their principle is based on two premises, namely
• The models of financial time series are heteroscedastic, i.e., their volatility
changes in time.
• The volatility is a simple quadratic function of past prediction errors $e_t$ (deviations
from the conditional mean value).
Only the second premise needs an explanation (the first one is sufficiently
supported by financial empirical experience): Due to the phenomenon of volatility
clustering, according to which high (low) deviations of returns can be expected rather
after higher (lower) previous deviations, respectively, one can assume that the
particular volatilities are positively correlated and make use of the autoregressive
model as the simplest scheme to model them. Moreover, according to (8.14) it holds
$$\sigma_t^2 = \mathrm{var}(e_t \mid \Omega_{t-1}) = E(e_t^2 \mid \Omega_{t-1}) \approx e_t^2 \tag{8.22}$$
(obviously $E(e_t) = 0$) so that the squared errors $e_t^2$ can be used as natural approximations
of volatilities $\sigma_t^2$. Therefore, if we express the volatility as the following
quadratic function of delayed values $e_t^2$:
$$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2, \tag{8.23}$$
then we obtain a realistic model (when the order r is chosen in a suitable size). It is
worth noting that (8.23) is a “nonstochastic” relation, i.e., without a random residual
component.
Due to the previous discussion and respecting the general form of the nonlinear
model (8.15), one formulates the model ARCH(r) of order r as
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2, \tag{8.24}$$
where $\varepsilon_t$ are iid random variables with zero mean value and unit variance (moreover,
they are frequently assumed to have the normal distribution, i.e., $\varepsilon_t \sim N(0, 1)$, or the
t-distribution which is standardized to have also zero mean value and unit variance). In
any case, increased past values of volatility imply the increased present volatility in
the model (8.24), which can also be rewritten as
$$y_t = e_t, \quad e_t = \sqrt{h_t}\,\varepsilon_t, \quad h_t = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2. \tag{8.25}$$
The conditional mean value $\mu_t$ is modeled by means of a suitable mean equation
that is often linear: one can apply the conditional mean value corresponding to a
linear regression model (sometimes it can be even reduced to the intercept only) or a
process ARMA. More specifically, the following forms may serve as examples
(some models nonlinear in mean value will be described later in Sect. 9.1):
$$y_t = e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2 \tag{8.26}$$
or equivalently
$$y_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + \ldots + \alpha_r y_{t-r}^2 \tag{8.27}$$
(i.e., with zero mean value $\mu_t$) or
$$y_t = \gamma_0 + \gamma_1 x_{t1} + \ldots + \gamma_k x_{tk} + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2 \tag{8.28}$$
(i.e., with exogenous variables $x_1, \ldots, x_k$) or
$$y_t = \varphi_1 y_{t-1} + \ldots + \varphi_p y_{t-p} + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2 \tag{8.29}$$
(i.e., with conditional mean value $\mu_t$ corresponding to the process AR(p)).
Moreover, the parameters of the model ARCH(r) must fulfill the following
constraints:
$$\alpha_0 > 0, \quad \alpha_1 \ge 0, \quad \ldots, \quad \alpha_r \ge 0 \tag{8.30}$$
and
$$\alpha_1 + \ldots + \alpha_r < 1. \tag{8.31}$$
The first constraint (8.30) guarantees that the sign of volatility $\sigma_t^2$ in (8.24) is
positive: it is a sufficient (but not necessary) condition for this natural property of
the model. The second constraint (8.31) is not so clear but is also important: it
guarantees that the model ARCH(r) has constant (finite) unconditional variance (see
its derivation for ARCH(1) in (8.36)).
Remark 8.1 One should stress once more that in general the random variables $e_t$ are
only uncorrelated (see (8.35)), while $\varepsilon_t$ are independent. The graphical plots of
correlogram and partial correlogram of model ARCH correspond to this fact: e.g.,
the estimated correlogram of time series $y_t$ in the model (8.27) should have all values
insignificant as a white noise, while the estimated partial correlogram of squared time
series $y_t^2$ should have the truncation point equal to r.
⋄
Remark 8.2 The model (8.24) can be generalized by means of matrix calculus:
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + (e_{t-1}, \ldots, e_{t-r})\,A\,(e_{t-1}, \ldots, e_{t-r})', \tag{8.32}$$
where the matrix A of unknown parameters must be positive semidefinite and $\alpha_0 > 0$.
The original model (8.24) is a special parsimonious version of (8.32), where the matrix
A is diagonal with nonnegative diagonal elements.
⋄
The main properties of the ARCH models will be derived only for the process
ARCH(1), i.e., for the model
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 \quad (\alpha_0 > 0,\ \alpha_1 \ge 0) \tag{8.33}$$
(the derivation for higher orders is analogous):
1. Zero unconditional mean value of $e_t$:
$$E(e_t) = E\big(E(e_t \mid \Omega_{t-1})\big) = E\big(\sigma_t E(\varepsilon_t \mid \Omega_{t-1})\big) = 0. \tag{8.34}$$
2. Uncorrelated $e_t$ and $e_{t-k}$ (k > 0):
$$\mathrm{cov}(e_t, e_{t-k}) = E(e_t e_{t-k}) = E\big(E(\sigma_t\varepsilon_t e_{t-k} \mid \Omega_{t-1})\big) = E\big(\sigma_t e_{t-k} E(\varepsilon_t \mid \Omega_{t-1})\big) = 0. \tag{8.35}$$
3. Constant unconditional variance of $e_t$:
$$\mathrm{var}(e_t) = E(e_t^2) = E\big(E(e_t^2 \mid \Omega_{t-1})\big) = E(\sigma_t^2) = E(\alpha_0 + \alpha_1 e_{t-1}^2) = \alpha_0 + \alpha_1 \mathrm{var}(e_{t-1}). \tag{8.36}$$
Since the variance of prediction errors $e_t$ should be constant, it must hold
$$\mathrm{var}(e_t) = \frac{\alpha_0}{1 - \alpha_1} \tag{8.37}$$
under the constraint
$$0 \le \alpha_1 < 1 \tag{8.38}$$
(the properties 1–3 mean that the time series $\{e_t\}$ is a white noise (in particular,
weakly stationary) under the sufficient condition (8.38)). Interestingly, despite the
changing conditional variance, i.e., the changing volatility, the unconditional
variance of $\{e_t\}$ remains constant over time.
4. Constant nonnegative kurtosis of $e_t$: If $\varepsilon_t \sim N(0, 1)$, then it holds
$$E(e_t^4) = E\big(E(e_t^4 \mid \Omega_{t-1})\big) = 3E\big(\alpha_0 + \alpha_1 e_{t-1}^2\big)^2 = 3\big(\alpha_0^2 + 2\alpha_0\alpha_1 \mathrm{var}(e_{t-1}) + \alpha_1^2 E(e_{t-1}^4)\big). \tag{8.39}$$
Hence analogously as in 3 one obtains
$$E(e_t^4) = \frac{3\alpha_0^2(1 + \alpha_1)}{(1 - \alpha_1)(1 - 3\alpha_1^2)} \tag{8.40}$$
under the sufficient condition
$$0 \le \alpha_1 < \sqrt{1/3}. \tag{8.41}$$
Finally using (8.37), the (unconditional) kurtosis coefficient of $e_t$ is
$$\gamma_2 = \frac{E(e_t^4)}{(\mathrm{var}(e_t))^2} - 3 = 3\,\frac{1 - \alpha_1^2}{1 - 3\alpha_1^2} - 3 = \frac{6\alpha_1^2}{1 - 3\alpha_1^2} \ge 0, \tag{8.42}$$
which is in accord with modeling leptokurtic distributions (in particular, the
deviations $e_t$ in the conditionally normal model ARCH(1) produce outliers with
higher probabilities than a normal white noise).
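The properties derived above can be illustrated by simulation. The following is a minimal sketch for the conditionally normal ARCH(1) model with zero mean equation; the parameter values, the seed, the burn-in length, and the function names are assumptions chosen for illustration.

```python
import math
import random

def arch1_moments(alpha0, alpha1):
    """Unconditional variance (8.37) and kurtosis coefficient (8.42) of e_t."""
    if not 0.0 <= alpha1 < math.sqrt(1.0 / 3.0):
        raise ValueError("need 0 <= alpha1 < sqrt(1/3), see (8.41)")
    variance = alpha0 / (1.0 - alpha1)
    gamma2 = 6.0 * alpha1 ** 2 / (1.0 - 3.0 * alpha1 ** 2)
    return variance, gamma2

def simulate_arch1(n, alpha0, alpha1, seed=0, burn_in=500):
    """Simulate e_t = sigma_t * eps_t with sigma_t^2 = alpha0 + alpha1 * e_{t-1}^2."""
    rng = random.Random(seed)
    e_prev = 0.0
    path = []
    for t in range(n + burn_in):
        sigma2 = alpha0 + alpha1 * e_prev ** 2
        e_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        if t >= burn_in:          # discard the start-up phase
            path.append(e_prev)
    return path

alpha0, alpha1 = 1.0, 0.3
e = simulate_arch1(20000, alpha0, alpha1)
sample_var = sum(v * v for v in e) / len(e)   # E(e_t) = 0, so no centering
theory_var, theory_gamma2 = arch1_moments(alpha0, alpha1)
print("sample variance     :", sample_var)
print("theoretical variance:", theory_var)
print("theoretical gamma_2 :", theory_gamma2)
```

The sample variance should be close to α0/(1 − α1) even though the conditional variance changes at every step; a sample kurtosis would converge much more slowly, which is why only the theoretical value is printed here.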
Remark 8.3 The previous results can be extended easily to the model ARCH(r).
The sufficient condition of weak stationarity of its deviations $e_t$ demands that all
roots of the autoregressive polynomial $1 - \alpha_1 z - \ldots - \alpha_r z^r$ lie outside the unit circle in
the complex plane. Since the parameters $\alpha_1, \ldots, \alpha_r$ must be nonnegative, this sufficient
condition can be rewritten in the more convenient form (8.31). Under this condition,
the variance of deviations $e_t$ can be expressed as
$$\mathrm{var}(e_t) = \frac{\alpha_0}{1 - \alpha_1 - \ldots - \alpha_r}. \tag{8.43}$$
⋄
After the theoretical description of ARCH models, we can deal briefly with their
practical construction (since the construction of GARCH models is analogous, we
will skip this technical topic in Sect. 8.3.5 devoted to these models). Here for
simplicity, we confine ourselves to the model ARCH(r) in the form (8.26), i.e.,
with zero conditional mean value $\mu_t$, where the deviations $e_t$ are directly observable
(this assumption is fulfilled in practice for the financial time series of log returns $r_t$).
In the opposite case, one first extracts the deviations $e_t$, e.g., in the model
(8.29) as
$$e_t = y_t - \varphi_1 y_{t-1} - \ldots - \varphi_p y_{t-p}. \tag{8.44}$$
8.3.4.1 Identification of Order of Model ARCH
The order r can be identified as the truncation point of the estimated partial correlogram
in the model
$$e_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2 + u_t, \tag{8.45}$$
where $u_t$ is the classical white noise (i.e., in the same way as for the classical AR
model in the framework of Box–Jenkins methodology; see Sect. 6.3.1). If the order
r is too high, then the nonnegativity of a large number of parameters can be a problem
(see the constraint (8.30)). In his seminal work, Engle (1982) suggested applying
a parsimonious model with only two parameters (but with r = 4) instead of (8.23),
namely
$$\sigma_t^2 = \delta_0 + \delta_1\big(0.4 e_{t-1}^2 + 0.3 e_{t-2}^2 + 0.2 e_{t-3}^2 + 0.1 e_{t-4}^2\big). \tag{8.46}$$
8.3.4.2 Estimation of Model ARCH
Due to various reasons, the estimation methods based on the principle of least
squares are not suitable for models with conditional heteroscedasticity. Therefore,
one recommends for these models the method of maximum likelihood. The
corresponding probability density obviously fulfills the relation
$$f(e_1, \ldots, e_n) = f(e_n \mid \Omega_{n-1}) \cdots f(e_{r+1} \mid \Omega_r)\, f(e_1, \ldots, e_r). \tag{8.47}$$
Therefore assuming $\varepsilon_t \sim N(0, 1)$, one can write the (conditional) log likelihood
function as
$$l(\alpha_0, \ldots, \alpha_r) = \sum_{t=r+1}^{n}\left(-\frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\sigma_t^2 - \frac{1}{2}\,\frac{e_t^2}{\sigma_t^2}\right) \tag{8.48}$$
(we have omitted the last factor in (8.47) since we condition by initial values $e_1, \ldots, e_r$
in (8.48)). The values $e_t$ necessary for the construction of (8.48) are calculated
recursively for each choice of arguments $\alpha_0, \ldots, \alpha_r$ including the volatilities
$$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \ldots + \alpha_r e_{t-r}^2, \quad t = r+1, \ldots, n. \tag{8.49}$$
If the normal distribution of $\varepsilon_t$ does not fit heavy tails of modeled financial data
properly, then one can apply other distributions. For instance, if the standardized
t-distribution is a better choice for $\varepsilon_t$ (with the unit variance and the degrees of freedom
v (v > 2)), then (8.48) must be replaced by
$$l(\alpha_0, \ldots, \alpha_r) = -\sum_{t=r+1}^{n}\left(\frac{v+1}{2}\ln\left(1 + \frac{e_t^2}{(v-2)\sigma_t^2}\right) + \frac{1}{2}\ln\sigma_t^2\right) \tag{8.50}$$
(the probability behavior of tails can be controlled by means of v: if $v \to \infty$, then the
applied t-distribution transfers back to the normal one). If even the t-distribution is
not enough for the analyzed heavy-tailed data, then some software systems (e.g.,
EViews) offer the generalized error distribution GED or other possibilities (see, e.g.,
McNeil et al. (2005)).
The maximization of log likelihood function to obtain the final ML estimates is
the concern of software optimization algorithms. As a matter of fact, one usually
estimates the entire model including its mean equation (e.g., including the parameters
$\varphi_1, \ldots, \varphi_p$ in the expression (8.44)) and including the variance matrix of estimated
parameters.
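The Gaussian conditional log likelihood (8.48) with the volatilities (8.49) can be evaluated as in the sketch below. The coarse grid search for ARCH(1) only stands in for a proper optimization routine and is an assumption of this illustration, as are the function names and the made-up data.

```python
import math

def arch_loglik(e, alpha):
    """Conditional Gaussian log likelihood (8.48) of ARCH(r).

    alpha = [alpha0, alpha1, ..., alphar]; the first r values of e serve
    as initial values and do not contribute terms of their own.
    """
    r = len(alpha) - 1
    total = 0.0
    for t in range(r, len(e)):
        # volatility (8.49) built from the r preceding squared deviations
        sigma2 = alpha[0] + sum(alpha[i] * e[t - i] ** 2 for i in range(1, r + 1))
        total += (-0.5 * math.log(2.0 * math.pi)
                  - 0.5 * math.log(sigma2)
                  - 0.5 * e[t] ** 2 / sigma2)
    return total

def grid_search_arch1(e, alpha0_grid, alpha1_grid):
    """Crude maximization over a grid; returns (loglik, alpha0, alpha1)."""
    best = None
    for a0 in alpha0_grid:
        for a1 in alpha1_grid:
            ll = arch_loglik(e, [a0, a1])
            if best is None or ll > best[0]:
                best = (ll, a0, a1)
    return best

# Made-up deviations, for illustration only
e = [0.5, -1.2, 2.0, -0.3, 0.1, 1.5, -2.2, 0.4, -0.2, 0.9]
a0_grid = [0.2 * k for k in range(1, 11)]    # 0.2, 0.4, ..., 2.0 (alpha0 > 0)
a1_grid = [0.1 * k for k in range(0, 10)]    # 0.0, 0.1, ..., 0.9 (0 <= alpha1 < 1)
print(grid_search_arch1(e, a0_grid, a1_grid))
```

Restricting the grids to α0 > 0 and 0 ≤ α1 < 1 keeps every candidate inside the constraints (8.30) and (8.31), so each σ_t² is positive and the logarithm is always defined.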
Remark 8.4 If the assumption of conditional normal distribution (i.e., $\varepsilon_t \sim N(0, 1)$) is
used improperly in a (correctly identified) model ARCH, then the corresponding ML
estimates of its parameters remain consistent, but their estimated variance matrix
should be corrected. For such a case, Bollerslev and Wooldridge (1992) suggested a
possible approach denoted as heteroscedasticity consistent covariances, which is
robust against non-normal distributions (it is based on the QML estimation, i.e.,
quasi-maximum likelihood; see Example 8.2). Nowadays, nonparametric estimation of
conditional heteroscedasticity is also recommended (see, e.g., Fan and Yao (2005)).
⋄
8.3.4.3 Verification of Model ARCH
Most estimation procedures for ARCH models yield “by-products” which
can subsequently be used for the verification of the constructed model, namely:
• The estimated deviation $\hat e_t$ (the one-step-ahead prediction error in the given time
series for time t): e.g., in the model (8.29) it can be estimated as
$$\hat e_t = y_t - \hat\varphi_1 y_{t-1} - \ldots - \hat\varphi_p y_{t-p}. \tag{8.51}$$
• The estimated volatility $\hat\sigma_t^2$ (the variance of deviation $e_t$ estimated in time t − 1 for
time t or equivalently the one-step-ahead prediction of variance in the given time
series, denoted as $\hat\sigma_t^2(t-1)$ using such an interpretation; see (8.53)).
• The estimated standardized deviation $\tilde e_t$:
$$\tilde e_t = \frac{\hat e_t}{\hat\sigma_t}. \tag{8.52}$$
To verify a constructed model ARCH(r), the following procedures are mostly
applied, which explore the properties of the estimated standardized deviation (8.52):
• Verification of estimated mean equation: Q-tests of the type (6.67) for time series
$\{\tilde e_t\}$.
• Verification of estimated volatility equation: Q-tests of the type (6.67) for time
series $\{\tilde e_t^2\}$ or special tests for time series $\{\tilde e_t\}$ (e.g., LM-tests based on Lagrange
multipliers) testing a potential remaining ARCH structure in $\{\tilde e_t\}$.
• Verification of normality of conditional ARCH model: Jarque–Bera test (or the
numerical value of kurtosis coefficient only) for time series $\{\tilde e_t\}$.
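The verification steps can be sketched as follows. The formula (6.67) is not reproduced in this section, so the Ljung–Box form of the Q-statistic used below is an assumption; the function names and the numerical inputs are likewise hypothetical.

```python
import math

def standardized_residuals(e_hat, sigma2_hat):
    """Estimated standardized deviations (8.52)."""
    return [e / math.sqrt(s2) for e, s2 in zip(e_hat, sigma2_hat)]

def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_k^2 / (n - k)."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    q = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

# Hypothetical by-products of an estimated ARCH model
e_hat = [0.5, -1.2, 2.0, -0.3, 0.1, 1.5, -2.2, 0.4]
sigma2_hat = [1.0, 1.1, 1.6, 2.4, 1.0, 1.0, 1.9, 2.8]

z = standardized_residuals(e_hat, sigma2_hat)
print("Q for standardized deviations (mean equation)   :", ljung_box_q(z, 2))
print("Q for their squares (volatility equation)       :", ljung_box_q([v * v for v in z], 2))
```

Each statistic would then be compared with a chi-square quantile; the appropriate degrees of freedom depend on the number of lags and on the estimated parameters, which this sketch does not address.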
8.3.4.4 Prediction of Volatility in Model ARCH
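Volatility can be predicted by means of the relation (8.23) in the same way as
predictions are constructed in the linear models of the Box–Jenkins methodology (see Sect.
6.6), i.e.:
$$\hat{\sigma}_t^2 = \hat{\sigma}_t^2(t-1) = \hat{\alpha}_0 + \hat{\alpha}_1 \hat{e}_{t-1}^2 + \hat{\alpha}_2 \hat{e}_{t-2}^2 + \dots + \hat{\alpha}_r \hat{e}_{t-r}^2, \tag{8.53}$$
$$\hat{\sigma}_{t+1}^2(t-1) = \hat{\alpha}_0 + \hat{\alpha}_1 \hat{\sigma}_t^2(t-1) + \hat{\alpha}_2 \hat{e}_{t-1}^2 + \dots + \hat{\alpha}_r \hat{e}_{t+1-r}^2 \tag{8.54}$$
etc.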
8.3.5 GARCH Models
The model ARCH(r) from the previous section has some drawbacks, e.g.:
• One must often use a high order r to describe the volatility of given time series in
an adequate way.
• If r is high, then it is necessary to estimate a large number of parameters under the
condition of nonnegativeness (8.30) and stationarity (8.31).
• ARCH models cover the volatility clustering but not the leverage effect (i.e.,
asymmetry of the impact of past positive and negative deviations et on the current
volatility).
These drawbacks can be reduced by applying the GARCH (generalized
ARCH) model suggested by Bollerslev (1986). In this model and in its various
modifications (see Sect. 8.3.6), the volatility (i.e., the conditional variance) may also depend
on its previous (lagged) values. In particular, the model GARCH(1,1), which is the
simplest representative of this class of models, is a very popular instrument for
modeling financial time series nowadays: it can capture very general volatility
structures with only three parameters (GARCH models of higher orders
are rarely used in routine practice).
The model GARCH(r, s) has the form
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i e_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2, \tag{8.55}$$
where εt are iid random variables with zero mean value and unit variance (again they
are mostly assumed to have the normal or t-distribution) and the parameters of the model
fulfill
$$\alpha_0 > 0, \quad \alpha_i \ge 0, \quad \beta_j \ge 0, \quad \sum_{i=1}^{\max\{r,s\}} (\alpha_i + \beta_i) < 1 \tag{8.56}$$
(one puts $\alpha_i = 0$ for i > r and $\beta_j = 0$ for j > s; if s = 0, then we return to the model
ARCH(r)). The last inequality in (8.56) is a sufficient condition for the existence of the
variance
$$\operatorname{var}(e_t) = \frac{\alpha_0}{1 - \sum_{i=1}^{\max\{r,s\}} (\alpha_i + \beta_i)}. \tag{8.57}$$
Remark 8.5 If we put $u_t = e_t^2 - \sigma_t^2$ in (8.55), then ut has the property of white noise,
and it holds
$$e_t^2 = \alpha_0 + \sum_{i=1}^{\max\{r,s\}} (\alpha_i + \beta_i) e_{t-i}^2 + u_t - \sum_{j=1}^{s} \beta_j u_{t-j}. \tag{8.58}$$
Hence the volatility equation of the GARCH model can be looked upon as an ARMA
model for the time series of squared deviations $\{e_t^2\}$. As far as the (non-squared) process
$\{e_t\}$ is concerned, it is weakly (second-order) stationary under assumptions such as
(8.31) or (8.56). Strict (distributional) stationarity requires assumptions of a different type
than (8.56), e.g., $E\{\ln(\alpha_1 \varepsilon_t^2 + \beta_1)\} < 0$ for GARCH(1,1) in (8.59);
see Francq and Zakoian (2010).
⋄
⋄
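In particular, the model GARCH(1,1) has a simpler form
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t \varepsilon_t,$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \quad (\alpha_0 > 0, \ \alpha_1, \beta_1 \ge 0, \ \alpha_1 + \beta_1 < 1). \tag{8.59}$$
Its kurtosis coefficient fulfills
$$\gamma_2 = \frac{3\left[1 - (\alpha_1 + \beta_1)^2\right]}{1 - 2\alpha_1^2 - (\alpha_1 + \beta_1)^2} - 3 = \frac{6\alpha_1^2}{1 - 2\alpha_1^2 - (\alpha_1 + \beta_1)^2} \ge 0 \tag{8.60}$$
under the validity of the sufficient condition
$$1 - 2\alpha_1^2 - (\alpha_1 + \beta_1)^2 > 0 \tag{8.61}$$
(compare with (8.41) and (8.42) for the model ARCH(1)).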
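The moment formulas (8.57), (8.60), and (8.61) for GARCH(1,1) are easy to evaluate numerically. The following Python sketch is only an illustration: the function name `garch11_moments` is hypothetical, and the parameter values in the demonstration are the rounded EViews estimates reported in Table 8.2, used here merely as sample inputs.

```python
def garch11_moments(alpha0, alpha1, beta1):
    """Unconditional variance (8.57) and excess kurtosis (8.60) of GARCH(1,1).

    Returns (variance, excess_kurtosis); excess_kurtosis is None when the
    sufficient condition (8.61) is violated.
    """
    persistence = alpha1 + beta1
    if not (alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and persistence < 1):
        raise ValueError("parameters violate the constraints in (8.59)")
    variance = alpha0 / (1.0 - persistence)
    denom = 1.0 - 2.0 * alpha1 ** 2 - persistence ** 2  # left-hand side of (8.61)
    excess_kurtosis = 6.0 * alpha1 ** 2 / denom if denom > 0 else None
    return variance, excess_kurtosis


# Illustrative inputs taken from Table 8.2 (index PX in 2016)
variance, excess_kurtosis = garch11_moments(2.48e-6, 0.125878, 0.852148)
print(variance, excess_kurtosis)
```

Returning `None` instead of a number when (8.61) fails is a design choice of this sketch; it signals that the formula (8.60) is not applicable rather than reporting a misleading value.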

As noted in Sect. 8.3.4, the construction of models of the GARCH type is
analogous to that of ARCH models. For instance, the volatility in
the model GARCH(1,1) (see (8.59)) can be predicted as
$$\hat{\sigma}_t^2(t-1) = \hat{\sigma}_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \tag{8.62}$$
(in practice one uses estimated parameters). Since it holds
$$\sigma_{t+1}^2 = \alpha_0 + (\alpha_1 + \beta_1)\sigma_t^2 + \alpha_1 \sigma_t^2 \left(\varepsilon_t^2 - 1\right) \quad \text{and} \quad E\left(\varepsilon_t^2 - 1 \mid \Omega_{t-1}\right) = 0, \tag{8.63}$$
one can also write
$$\hat{\sigma}_{t+1}^2(t-1) = \alpha_0 + (\alpha_1 + \beta_1)\hat{\sigma}_t^2(t-1) \tag{8.64}$$
and generally
$$\hat{\sigma}_{t+\tau}^2(t) = \alpha_0 + (\alpha_1 + \beta_1)\hat{\sigma}_{t+\tau-1}^2(t), \quad \tau > 1. \tag{8.65}$$
Repeating this procedure, one finally obtains
$$\hat{\sigma}_{t+\tau}^2(t) = \frac{\alpha_0\left[1 - (\alpha_1 + \beta_1)^{\tau-1}\right]}{1 - (\alpha_1 + \beta_1)} + (\alpha_1 + \beta_1)^{\tau-1}\hat{\sigma}_{t+1}^2(t) \to \frac{\alpha_0}{1 - (\alpha_1 + \beta_1)} \tag{8.66}$$
for τ → ∞. Obviously, the volatility prediction converges with increasing prediction
horizon to the unconditional variance of prediction errors et (see (8.57)).
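The recursion (8.65) and its closed form (8.66) can be compared numerically. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the parameters are the rounded estimates from Table 8.2, and the starting value `sigma2_next` (the one-step-ahead prediction) is an arbitrary illustrative number, not a value from the book.

```python
def garch11_forecast_path(alpha0, alpha1, beta1, sigma2_next, horizon):
    """Predictions for tau = 1, ..., horizon obtained by the recursion (8.65).

    sigma2_next is the one-step-ahead prediction (tau = 1).
    """
    phi = alpha1 + beta1
    path = [sigma2_next]
    for _ in range(horizon - 1):
        path.append(alpha0 + phi * path[-1])
    return path


def garch11_forecast_closed(alpha0, alpha1, beta1, sigma2_next, tau):
    """Closed-form prediction (8.66) for horizon tau >= 1 (requires alpha1 + beta1 < 1)."""
    phi = alpha1 + beta1
    return alpha0 * (1.0 - phi ** (tau - 1)) / (1.0 - phi) + phi ** (tau - 1) * sigma2_next


alpha0, alpha1, beta1 = 2.48e-6, 0.125878, 0.852148  # rounded values from Table 8.2
sigma2_next = 4.0e-4                                  # hypothetical starting prediction
path = garch11_forecast_path(alpha0, alpha1, beta1, sigma2_next, horizon=2000)
long_run = alpha0 / (1.0 - alpha1 - beta1)            # unconditional variance (8.57)
print(path[0], path[9], path[-1], long_run)
```

Because the persistence α1 + β1 is close to one for these inputs, the predictions approach the unconditional variance only slowly, which is why a long horizon is used in the demonstration.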
Example 8.2 In Example 8.1, we estimated the volatility of daily log returns rt
of the index PX in 2016 (see the 250 values in Table 8.1) by means of the EWMA model.
For comparison, the model GARCH(1,1) is estimated by means of EViews (see
Table 8.2) for the same data as
$$r_t = e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = 0.1259\, e_{t-1}^2 + 0.8521\, \sigma_{t-1}^2$$
or (since the conditional mean value is insignificant according to Table 8.2)
$$r_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = 0.1259\, r_{t-1}^2 + 0.8521\, \sigma_{t-1}^2$$
(the method by Bollerslev and Wooldridge from Remark 8.4 has been applied to
estimate the variance matrix of estimated parameters to be robust against non-normal
distributions). The results of the verification procedures are not presented here, but
Q tests for the estimated standardized deviation (8.52) and for its square (see Sect.
8.3.4.3) support the constructed model statistically (also the LM test mentioned in
Sect. 8.3.4.3 does not find any remaining ARCH structure in these estimated
deviations). On the other hand, the histogram of the estimated $\{\tilde{e}_t\}$ and the Jarque–Bera
test shown in Fig. 8.6 indicate non-normality with higher kurtosis so that one
should apply t or GED distributions when constructing this GARCH model (see
Sect. 8.3.4.2).
Finally, Fig. 8.7 plots the volatility which is constructed by means of the
estimated model GARCH(1, 1) (one can compare it with its EWMA estimate from
Example 8.1 in Fig. 8.5).
⋄
Table 8.2 Estimation of the process GARCH(1, 1) from Example 8.2 (index PX in year 2016)
Dependent Variable: log returns of PX (in 2016)
Bollerslev–Wooldridge robust standard errors and covariance
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*GARCH(-1)

Variable       Coefficient   Std. Error   z-Statistic   Prob.
C              0.000286      0.000538     0.531451      0.5951
Variance Equation
C              2.48E-06      2.00E-06     1.244526      0.2133
RESID(-1)^2    0.125878      0.036311     3.466680      0.0005
GARCH(-1)      0.852148      0.037461     22.74740      0.0000

[Histogram of standardized residuals with summary statistics: Observations 250;
Mean -0.038325; Median -0.007027; Maximum 2.375025; Minimum -3.470062;
Std. Dev. 1.006280; Skewness -0.532023; Kurtosis 3.667134; Jarque-Bera 16.42982
(Probability 0.000271)]
Fig. 8.6 Histogram of estimated standardized deviation $\tilde{e}_t$ (see (8.52)) for daily log returns of index
PX in 2016 from Example 8.2. Source: calculated by EViews
Fig. 8.7 Volatility of daily log returns of index PX in 2016 (250 values for 251 trading days)
estimated by means of model GARCH(1, 1) in Example 8.2 (compare with its EWMA estimate
from Example 8.1 in Fig. 8.5)
8.3.6 Various Modifications of GARCH Models
The analysis of financial (or nonlinear) time series is a rapidly developing field. The range
of available models is enormous, including a flood of unsystematic acronyms of
the type FIEGARCH with tens of references in various sources each year (therefore,
it makes no sense to survey the bibliography in this section). Typical examples are the
various modifications of GARCH models, motivated mostly by an effort to remedy
particular drawbacks of the classical GARCH models from Sect. 8.3.5: some of them
are briefly described in this section (bearing in mind that practical calculations
mostly require specialized software).
8.3.6.1 IGARCH
Integrated GARCH model denoted as IGARCH(r, s) is the model GARCH(r, s) with

unit roots of the autoregressive polynomial in volatility equation (it is an analogy of
ARIMA models but for volatility modeling). Its typical feature is the so-called
volatility persistence: while the classical process GARCH is stationary in volatility
(so that the volatility prediction converges with increasing horizon to the uncondi-
tional variance of this process; see (8.66)), in the model IGARCH the current
information persists even for very long prediction horizons.
The model IGARCH(r, s) is defined as the model GARCH(r, s) (see (8.55)),
where in addition
$$\sum_{i=1}^{\max\{r,s\}} (\alpha_i + \beta_i) = 1, \tag{8.67}$$
so that the unconditional variance (8.57) of deviations et does not exist.
In particular, the model IGARCH(1,1) has the form
$$\sigma_t^2 = \alpha_0 + (1 - \beta_1) e_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \quad (\alpha_0 > 0, \ 0 \le \beta_1 \le 1). \tag{8.68}$$
Obviously, if $\mu_t = 0$ and $\alpha_0 = 0$, then the model IGARCH(1,1) reduces to the
EWMA scheme in (8.19). The recursive prediction formula (8.65) simplifies in
IGARCH(1,1) to the form
$$\hat{\sigma}_{t+\tau}^2(t) = \alpha_0 + \hat{\sigma}_{t+\tau-1}^2(t), \quad \tau > 1, \tag{8.69}$$
so that
$$\hat{\sigma}_{t+\tau}^2(t) = (\tau - 1)\alpha_0 + \hat{\sigma}_{t+1}^2(t) = (\tau - 1)\alpha_0 + \hat{\sigma}_{t+1}^2, \quad \tau > 1. \tag{8.70}$$
One can see that the influence of current volatilities on predictions of future
volatilities really persists and that these predictions follow a line with the slope α0.
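The linear growth of IGARCH(1,1) predictions in (8.70) can be checked against the recursion (8.69) with a few lines of Python. This is a sketch only: the function name and the numerical inputs are hypothetical and serve purely as an illustration.

```python
def igarch11_forecast(alpha0, sigma2_next, tau):
    """IGARCH(1,1) volatility prediction (8.70) for horizon tau >= 1."""
    return (tau - 1) * alpha0 + sigma2_next


alpha0 = 1.0e-6        # hypothetical intercept
sigma2_next = 2.0e-4   # hypothetical one-step-ahead prediction

# Build the same predictions by the recursion (8.69)
recursive = [sigma2_next]
for _ in range(9):
    recursive.append(alpha0 + recursive[-1])

closed = [igarch11_forecast(alpha0, sigma2_next, tau) for tau in range(1, 11)]
print(closed)
```

With `alpha0 = 0` every prediction equals the one-step-ahead value, which is the constant EWMA-type prediction mentioned in the text.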
8.3.6.2 GJR GARCH
The classical GARCH model is not capable of modeling the leverage effect, i.e., the
asymmetry in the impact of past positive and negative deviations et on the current
volatility (the volatility is prone to increase more after price drops than after price
increases of the same size; see also Sect. 8.1). Glosten et al. (1993) suggested a
successful modification of the GARCH model correcting this drawback (see also
Zakoian (1994)), which is usually denoted as GJR GARCH according to its authors
(sometimes the term threshold GARCH or the acronym TARCH also appears):
$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i e_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2 + \sum_{k=1}^{n} \gamma_k e_{t-k}^2 I_{t-k}^{-}, \quad I_t^{-} = \begin{cases} 1 & \text{for } e_t < 0, \\ 0 & \text{for } e_t \ge 0. \end{cases} \tag{8.71}$$
This model can be interpreted in such a way that the impact of “good news” ($e_{t-i} \ge 0$)
modeled by means of $\alpha_i$ differs from the impact of “bad news” ($e_{t-i} < 0$) modeled by
means of $\alpha_i + \gamma_i$. If $\gamma_i > 0$, then bad news induces a growth of volatility so that
the leverage effect works with delay i. In any case, the model behaves asymmetrically
for $\gamma_i \ne 0$.
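The most frequent form of the model GJR GARCH in practice (see also Example
8.3) is simply
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \gamma_1 e_{t-1}^2 I_{t-1}^{-},$$
$$I_t^{-} = \begin{cases} 1 & \text{for } e_t < 0, \\ 0 & \text{for } e_t \ge 0. \end{cases} \tag{8.72}$$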
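The volatility recursion in (8.72) can be sketched in Python as follows. The function name `gjr_garch11_variance`, the initial value, and all parameter values are hypothetical choices made for illustration; they are not estimates from the book.

```python
import numpy as np


def gjr_garch11_variance(e, alpha0, alpha1, beta1, gamma1, sigma2_init):
    """Conditional variances of GJR GARCH(1,1) according to (8.72).

    e           : deviations e_0, ..., e_{n-1}
    sigma2_init : variance assigned to time 0
    Returns an array of length n + 1; element t is computed from e[t-1]
    and the previous variance, element 0 equals sigma2_init.
    """
    e = np.asarray(e, dtype=float)
    sigma2 = np.empty(e.size + 1)
    sigma2[0] = sigma2_init
    for t in range(1, e.size + 1):
        bad_news = 1.0 if e[t - 1] < 0 else 0.0  # indicator I^-_{t-1}
        sigma2[t] = (alpha0
                     + (alpha1 + gamma1 * bad_news) * e[t - 1] ** 2
                     + beta1 * sigma2[t - 1])
    return sigma2


params = dict(alpha0=1e-5, alpha1=0.05, beta1=0.8, gamma1=0.2, sigma2_init=3e-4)
after_good_news = gjr_garch11_variance([0.02], **params)[-1]
after_bad_news = gjr_garch11_variance([-0.02], **params)[-1]
print(after_good_news, after_bad_news)
```

For deviations of the same absolute size, the two variances differ exactly by γ1 times the squared deviation, which is the asymmetry described in the text.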
8.3.6.3 EGARCH
EGARCH is another approach to asymmetry suggested by Nelson (1991). After
various simplifications (e.g., originally Nelson recommended only the probability
distribution GED for variables εt), the exponential GARCH model (denoted by the
acronym EGARCH) has the form
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \ln \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i \frac{|e_{t-i}|}{\sigma_{t-i}} + \sum_{j=1}^{s} \beta_j \ln \sigma_{t-j}^2 + \sum_{k=1}^{n} \gamma_k \frac{e_{t-k}}{\sigma_{t-k}}. \tag{8.73}$$
The application of logarithmic volatilities enables us to remove the constraints on
parameter signs (e.g., $\alpha_0 > 0$). Further, the leverage effect is exponential (and not
quadratic) in (8.73). Obviously, the asymmetry occurs if $\gamma_i \ne 0$ for a delay
i (particularly, the leverage effect occurs for $\gamma_i < 0$).
The most frequent form of the model EGARCH in practice (see also Example
8.3) is
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \ln \sigma_t^2 = \alpha_0 + \alpha_1 \frac{|e_{t-1}|}{\sigma_{t-1}} + \beta_1 \ln \sigma_{t-1}^2 + \gamma_1 \frac{e_{t-1}}{\sigma_{t-1}}. \tag{8.74}$$
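One step of the EGARCH(1,1) recursion (8.74) can be sketched in Python as follows. The function name and all numerical inputs are hypothetical and chosen only to illustrate how a negative γ1 produces the leverage effect.

```python
import math


def egarch11_log_variance(e_prev, sigma2_prev, alpha0, alpha1, beta1, gamma1):
    """Logarithm of the conditional variance according to (8.74)."""
    z = e_prev / math.sqrt(sigma2_prev)  # standardized previous deviation
    return alpha0 + alpha1 * abs(z) + beta1 * math.log(sigma2_prev) + gamma1 * z


# Hypothetical parameters with gamma1 < 0
pars = dict(alpha0=-1.0, alpha1=0.15, beta1=0.9, gamma1=-0.17)
sigma2_prev = 2.5e-4
shock = 0.02

log_var_good = egarch11_log_variance(+shock, sigma2_prev, **pars)
log_var_bad = egarch11_log_variance(-shock, sigma2_prev, **pars)
var_good, var_bad = math.exp(log_var_good), math.exp(log_var_bad)
print(var_good, var_bad)
```

Since the recursion is written for the logarithm, the resulting variance `exp(...)` is positive for any real parameter values, and the difference of the two log-variances equals −2γ1 times the absolute standardized shock.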
Moreover, before constructing asymmetric models of the type GJR GARCH and
EGARCH, the asymmetry should be tested statistically (see, e.g., Engle and Ng
(1993)). One usually uses the residuals $\hat{e}_t$ obtained from the estimated (symmetric)
GARCH model (see, e.g., (8.51)) and tests by means of classical t, F, or LM tests the
significance of parameters in linear models of the type
$$\hat{e}_t^2 = \delta_0 + \delta_1 S_{t-1}^{-} + u_t, \quad S_{t-1}^{-} = \begin{cases} 1 & \text{for } \hat{e}_{t-1} < 0, \\ 0 & \text{for } \hat{e}_{t-1} \ge 0; \end{cases} \tag{8.75}$$
$$\hat{e}_t^2 = \delta_0 + \delta_1 S_{t-1}^{-} \hat{e}_{t-1} + u_t; \tag{8.76}$$
$$\hat{e}_t^2 = \delta_0 + \delta_1 S_{t-1}^{-} + \delta_2 S_{t-1}^{-} \hat{e}_{t-1} + \delta_3 S_{t-1}^{+} \hat{e}_{t-1} + u_t, \quad S_{t-1}^{+} = 1 - S_{t-1}^{-}, \tag{8.77}$$
where ut is the classical white noise. The significant parameters
• δ1 in the model (8.75) justify the asymmetry of volatility in the given time series.
• δ0 and δ1 in the model (8.76) justify the asymmetry of volatility and the impact of the
size of negative deviations et on volatility in the given time series.
• δ0 and δ3 in the model (8.77) justify the asymmetry of volatility and the impact of the
size of positive and negative deviations et on volatility in the given time series.
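The auxiliary regression (8.75) can be estimated by ordinary least squares. The sketch below uses only NumPy; the function name `sign_bias_regression` is hypothetical, and the input is simulated symmetric noise used as a stand-in for residuals of an estimated GARCH model (an assumption made only to keep the example self-contained).

```python
import numpy as np


def sign_bias_regression(residuals):
    """OLS estimation of the regression (8.75).

    Regresses the squared residual at time t on a constant and on the dummy
    S^-_{t-1} (equal to 1 if the previous residual is negative).
    Returns (coefficients, t_statistics) for (delta0, delta1).
    """
    e = np.asarray(residuals, dtype=float)
    y = e[1:] ** 2
    s_neg = (e[:-1] < 0).astype(float)
    X = np.column_stack([np.ones_like(s_neg), s_neg])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = y.size - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats


rng = np.random.default_rng(0)
demo_residuals = rng.standard_normal(500)  # simulated stand-in for GARCH residuals
coef, t_stats = sign_bias_regression(demo_residuals)
print(coef, t_stats)
```

With a single dummy regressor, the estimate of δ0 is the mean squared residual following nonnegative residuals and δ0 + δ1 is the mean following negative ones, so the t-statistic of δ1 compares these two group means.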
Example 8.3 Table 8.3 and Fig. 8.8 show log returns $r_t = \ln KB_t - \ln KB_{t-1}$ of daily
closing prices of stocks KB (the bank in Société Générale Group) in 2005 (see also
prices KBt in CZK in Fig. 8.9 for 253 trading days).
The model GJR GARCH(1,1) was estimated for rt by means of EViews (see
Table 8.4) as
$$r_t = e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = -0.0291\, e_{t-1}^2 + 0.7622\, \sigma_{t-1}^2 + 0.2727\, e_{t-1}^2 I_{t-1}^{-},$$
$$I_t^{-} = \begin{cases} 1 & \text{for } e_t < 0, \\ 0 & \text{for } e_t \ge 0, \end{cases}$$
where the significantly positive estimate 0.2727 of the parameter γ1 confirms the
occurrence of the leverage effect in the given time series.
Similarly, the model EGARCH(1,1) was estimated for rt by means of EViews
(see Table 8.5) as
$$r_t = e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \ln \sigma_t^2 = -0.9683 + 0.1531 \frac{|e_{t-1}|}{\sigma_{t-1}} + 0.8958 \ln \sigma_{t-1}^2 - 0.1661 \frac{e_{t-1}}{\sigma_{t-1}},$$
where this time the significantly negative estimate –0.1661 of the parameter γ1 confirms
again the occurrence of the leverage effect.
Finally, Figs. 8.10 and 8.11 plot the volatilities which are constructed by means of
the estimated models GJR GARCH(1, 1) and EGARCH(1, 1) (they can be compared
with each other).
⋄
Table 8.3 Daily log returns of stocks KB in 2005 (252 values for 253 trading days written in columns) from Example 8.3 (see also Fig. 8.9)
1 2 3 4 5 6 7 8 9 10 11
1 – 0.0215 –0.0281 0.0041 –0.0268 0.0068 –0.0057 0.0081 –0.0417 –0.0039 –0.0055
2 0.0255 –0.0014 0.0116 –0.0154 –0.0181 –0.0023 –0.0048 0.0526 0.0121 0.0039 0.0128
3 –0.0062 –0.0043 –0.0794 –0.0083 –0.0052 0.0100 –0.0015 –0.0029 0.0000 0.0104 –0.0087
4 –0.0194 0.0043 –0.0073 0.0146 0.0172 0.0096 0.0150 0.0060 –0.0124 –0.0021
5 0.0238 0.0239 0.0285 –0.0098 0.0196 0.0182 –0.0060 0.0082 –0.0129 0.0208
6 –0.0074 0.0179 –0.0303 –0.0013 0.0149 –0.0182 –0.0030 0.0145 0.0029 –0.0138
7 0.0118 –0.0014 0.0422 –0.0341 0.0131 0.0032 –0.0060 0.0179 0.0303 0.0088
8 0.0003 0.0216 –0.0413 –0.0353 –0.0078 –0.0095 –0.0091 0.0173 0.0230 –0.0070
9 –0.0003 –0.0244 0.0294 0.0220 –0.0020 0.0127 0.0119 –0.0110 0.0125 0.0218
10 0.0058 –0.0110 0.0051 0.0277 0.0114 –0.0111 0.0093 –0.0027 0.0210 0.0029
11 0.0176 0.0308 –0.0265 –0.0250 0.0132 0.0079 –0.0060 0.0027 0.0290 –0.0218
12 –0.0060 –0.0016 –0.0225 –0.0384 0.0156 –0.0032 –0.0106 –0.0219 0.0060 0.0088
13 –0.0087 –0.0097 0.0289 0.0290 0.0072 –0.0160 0.0076 0.0227 –0.0086 0.0072
14 –0.0205 –0.0071 –0.0015 0.0101 –0.0041 0.0032 –0.0122 –0.0103 0.0158 0.0066
15 0.0059 0.0257 –0.0107 0.0040 0.0016 –0.0129 0.0003 –0.0055 0.0014 –0.0034
16 –0.0068 –0.0048 0.0227 –0.0178 –0.0238 0.0058 –0.0058 –0.0125 0.0045 0.0012
17 0.0286 0.0075 0.0015 –0.0150 0.0159 0.0290 –0.0006 0.0193 –0.0217 0.0086
18 0.0242 –0.0180 0.0006 –0.0314 0.0281 0.0094 –0.0127 0.0082 –0.0043 –0.0071
19 –0.0343 –0.0145 0.0318 –0.0316 0.0000 –0.0091 0.0019 0.0014 –0.0090 0.0043
20 0.0093 –0.0041 –0.0087 –0.0248 0.0015 0.0141 0.0276 –0.0571 –0.0023 –0.0086
21 –0.0136 0.0296 0.0035 0.0259 –0.0287 0.0068 0.0105 –0.0276 –0.0106 0.0029
22 0.0058 –0.0011 0.0020 0.0568 0.0144 0.0147 0.0228 0.0132 –0.0149 0.0000
23 –0.0087 –0.0081 –0.0218 0.0336 –0.0281 0.0166 –0.0035 0.0116 0.0149 –0.0116
24 0.0186 –0.0081 –0.0256 –0.0064 –0.0178 0.0089 –0.0032 –0.0253 0.0089 0.0029
25 –0.0098 –0.0165 –0.0291 0.0166 0.0033 –0.0119 –0.0272 –0.0415 –0.0172 –0.0017
Source: kurzy.cz (https://akcie-cz.kurzy.cz/akcie/komercni-banka-590/graf_2005)

Fig. 8.8 Daily log returns of stocks KB in 2005 (252 values for 253 trading days) from Example
8.3 (see also Table 8.3). Source: calculated by EViews
Fig. 8.9 Daily closing prices of stocks KB in year 2005 (values in CZK for 253 trading days) from
Example 8.3. Source: kurzy.cz (https://akcie-cz.kurzy.cz/akcie/komercni-banka-590/graf_2005)
Table 8.4 Estimation of the process GJR GARCH(1, 1) from Example 8.3 (daily log returns of
stocks KB in year 2005)
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*RESID(-1)^2*(RESID(-1)<0) + C(5)*GARCH(-1)

Variable                     Coefficient   Std. Error   z-Statistic   Prob.
C                            0.000117      0.001024     0.114515      0.9088
Variance Equation
C                            4.26E-05      1.93E-05     2.210756      0.0271
RESID(-1)^2                  –0.029062     0.039569     –0.734468     0.4627
RESID(-1)^2*(RESID(-1)<0)    0.272706      0.125334     2.175834      0.0296
GARCH(-1)                    0.762199      0.083962     9.077869      0.0000

Table 8.5 Estimation of the process EGARCH(1, 1) from Example 8.3 (daily log returns of stocks
KB in year 2005)
LOG(GARCH) = C(2) + C(3)*ABS(RESID(-1)/@SQRT(GARCH(-1))) + C(4)*RESID(-1)/
@SQRT(GARCH(-1)) + C(5)*LOG(GARCH(-1))

Variable   Coefficient   Std. Error   z-Statistic   Prob.
C          –1.34E-05     0.001040     –0.012899     0.9897
Variance Equation
C(2)       –0.968342     0.415948     –2.328036     0.0199
C(3)       0.153073      0.069119     2.214631      0.0268
C(4)       –0.166100     0.059934     –2.771389     0.0056
C(5)       0.895763      0.048687     18.39837      0.0000

Remark 8.6 In financial applications, one frequently combines the asymmetric
models for volatility with the classical AR models for the mean value. For example,
for the daily log returns rt of stocks IBM from July 1962 to 1999 (9442 values), Tsay
(2002) constructed the model AR(2)-GJR GARCH(1,1) of the form
$$r_t = 0.043 - 0.022\, r_{t-2} + e_t,$$
$$e_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = 0.098\, e_{t-1}^2 + 0.954\, \sigma_{t-1}^2 + \left(0.060 - 0.052\, e_{t-1}^2 - 0.069\, \sigma_{t-1}^2\right) I_{t-1}^{-},$$
$$I_t^{-} = \begin{cases} 1 & \text{for } e_t < 0, \\ 0 & \text{for } e_t \ge 0. \end{cases}$$
⋄
Fig. 8.10 Volatility of daily log returns of stocks KB in 2005 (252 values for 253 trading days)
estimated by means of model GJR GARCH(1, 1) in Example 8.3
Fig. 8.11 Volatility of daily log returns of stocks KB in 2005 (252 values for 253 trading days)
estimated by means of model EGARCH(1, 1) in Example 8.3 (compare with volatility estimated by
means of GJR GARCH(1,1) in Fig. 8.10)
8.3.6.4 GARCH-M
The return on a financial asset often depends on its volatility (e.g., investors are
compensated for higher risk by higher return). Therefore, Engle et al. (1987)
suggested a modification of ARCH models (and later GARCH models), where the
volatility or its square root enters the mean equation (so-called ARCH-M models).
For instance, the model GARCH(1,1)-M (i.e., GARCH-in-mean) has the form
$$y_t = \mu_t + \gamma_1 \sigma_t^2 + e_t \ \ (\text{or } y_t = \mu_t + \gamma_1 \sigma_t + e_t), \quad e_t = \sigma_t \varepsilon_t,$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1 \sigma_{t-1}^2. \tag{8.78}$$
If the parameter γ1 is significantly positive, then increased risk, manifested by
increased volatility, raises the level of the time series (i.e., its mean).
8.3.6.5 Models of Stochastic Volatility SV
The volatility equation of the GARCH model is obviously fully deterministic (in the
sense of conditioning on past information). The term stochastic volatility is used
in this context only when the volatility equation contains an additional
error term which remains random even if one conditions on the past information.
Although simple examples of such models are the autoregressive models of volatility
from Sect. 8.3.3, the general SV model (see, e.g., Taylor (1994)) is presented as
$$y_t = \mu_t + e_t, \quad e_t = \sigma_t \varepsilon_t, \quad \ln \sigma_t^2 = \alpha_0 + \sum_{j=1}^{s} \beta_j \ln \sigma_{t-j}^2 + u_t, \tag{8.79}$$
where {ut} is another white noise (mostly iid with normal distribution) that is
independent of {εt} (the formulation by means of logarithmic volatility makes it possible to
ignore the condition of nonnegativeness, similarly as in the EGARCH model). The
models of the type SV have proved useful, e.g., in the context of option pricing,
where the volatility of the underlying asset enters the famous Black–Scholes formula
(see Sect. 8.3.2). On the other hand, difficult estimation is one of the drawbacks of
these models.
Remark 8.7 We have mentioned in the beginning of this section that there are many
modifications of GARCH (and many new ones probably will appear in future), e.g.:
• FIGARCH (fractionally integrated GARCH) models are FI models (i.e., the long-memory
processes from Sect. 6.7), but for volatility; the model FIEGARCH and others have an
analogous character.
• QGARCH (quadratic GARCH) models (see Sentana (1995)) reflect asymmetry
in such a way that the delayed deviations $e_{t-i}$ appear directly on the right-hand side
of the volatility equation (in addition to the squared delayed deviations $e_{t-i}^2$).
• APARCH (asymmetric power ARCH) models (see Ding et al. (1993)) induce the
long-memory property by means of a suitable power transformation of volatilities
and are capable of expressing well the fat tails, excess kurtosis, and leverage
effects.
⋄
8.4 Exercises
Exercise 8.1 Repeat the analysis from Examples 8.1 and 8.2 (daily log returns of
index PX in 2016), but only for the last 100 values of time series {rt} (hint: $r_t = \sigma_t \varepsilon_t$,
$\sigma_t^2 = 0.0168\, r_{t-1}^2 + 0.5419\, \sigma_{t-1}^2$).
Exercise 8.2 Repeat the analysis from Example 8.3 (daily log returns of stocks
KB in 2005), but only for the last 203 values of time series {rt} (hint: $r_t = e_t$, $e_t = \sigma_t \varepsilon_t$,
$$\sigma_t^2 = 0.0641\, e_{t-1}^2 + 0.8062\, \sigma_{t-1}^2 + 0.2629\, e_{t-1}^2 I_{t-1}^{-}, \quad I_t^{-} = \begin{cases} 1 & \text{for } e_t < 0, \\ 0 & \text{for } e_t \ge 0; \end{cases}$$
$r_t = e_t$, $e_t = \sigma_t \varepsilon_t$,
$$\ln \sigma_t^2 = -0.9576 + 0.0559 \frac{|e_{t-1}|}{\sigma_{t-1}} + 0.8862 \ln \sigma_{t-1}^2 - 0.1811 \frac{e_{t-1}}{\sigma_{t-1}}).$$
Chapter 9
Other Methods for Financial Time Series
9.1 Models Nonlinear in Mean Value
In Sect. 8.2, we presented a general nonlinear scheme
$$y_t = \mu_t + e_t = \mu_t + \sigma_t \varepsilon_t = \mu_t + \sqrt{h_t}\, \varepsilon_t = g(\Omega_{t-1}) + \sqrt{h(\Omega_{t-1})}\, \varepsilon_t, \tag{9.1}$$
where $\varepsilon_t = e_t / \sigma_t$ are standardized shocks et (εt are usually iid in contrast to the
uncorrelated et, which may possibly be dependent). In this framework, we have dealt so
far (see Sect. 8.3) with the volatility equation $\sigma_t^2 = h(\Omega_{t-1})$ only (an exception was
the model GARCH-M). Now, on the contrary, we focus on various nonlinear models
for the (conditional) mean equation $\mu_t = g(\Omega_{t-1})$, even though we shall present only
the most important ones from the point of view of applications in finance (see also
monographs by Priestley (1988), Tong (1990), and others). These models are mostly
specific cases (acceptable from the computational point of view) of the
general model
$$y_t = f(e_t, e_{t-1}, e_{t-2}, \dots) \tag{9.2}$$
(see also (8.9)), where et is an (uncorrelated) white noise with the variance $\sigma_e^2$.
More specifically, the models in this section have been motivated by some
nonlinear characteristics of data from practice (not necessarily from financial prac-
tice only). Examples are the asymmetry between increase and decrease of time
series, the limit cycle (i.e., limit form of the process in regular cycles if one excludes
all random elements), and the dependence of frequency on amplitude in periodic
behavior of some processes (e.g., the frequency increases with decreasing amplitude
and decreases with increasing amplitude, or on the contrary).

https://doi.org/10.1007/978-3-030-46347-2_9
9.1.1 Bilinear Models
While the linear process (6.17) can be viewed as originating from the Taylor
expansion of the function $f(\cdot)$ in (9.2) to the first order, in the case of
bilinear models it is the expansion to the second order
$$y_t = \alpha + \sum_{i=1}^{p} \varphi_i y_{t-i} + \sum_{j=1}^{q} \theta_j e_{t-j} + \sum_{n=1}^{P} \sum_{m=1}^{Q} \beta_{mn} y_{t-n} e_{t-m} + e_t. \tag{9.3}$$
Some special cases of (9.3) belong to the class of models with conditional
heteroscedasticity, usually assuming $e_t \sim$ iid $(0, \sigma_e^2)$: e.g., if we consider the model
$$y_t = \mu + \sum_{m=1}^{Q} \beta_m e_t e_{t-m} + e_t \tag{9.4}$$
then it holds
$$\mu_t = E(y_t \mid \Omega_{t-1}) = \mu, \quad \sigma_t^2 = \operatorname{var}(y_t \mid \Omega_{t-1}) = \left(1 + \sum_{m=1}^{Q} \beta_m e_{t-m}\right)^{2} \sigma_e^2. \tag{9.5}$$
The most frequent model of this type is the so-called completely bilinear model of
the form
$$y_t = \sum_{n=1}^{P} \sum_{m=1}^{Q} \beta_{mn} y_{t-n} e_{t-m} + e_t \tag{9.6}$$
(here again the white noise values et are usually assumed to be independent). When
dealing with models of the type (9.6), the form of the matrix $(\beta_{mn})$ is essential.
Moreover, one distinguishes so-called superdiagonal, diagonal, or subdiagonal
models depending on whether the matrix $(\beta_{mn})$ has nonzero elements only above the
main diagonal, only on the main diagonal, or only under the main diagonal,
respectively.
The detailed theoretical analysis of some special cases of superdiagonal, diagonal,
and subdiagonal models (including conditions of stationarity for these models)
has shown some paradoxical results: e.g., the correlation structures of some bilinear
models coincide with the correlation structures of simple linear ARMA processes
(or even with that of white noise). This has a practical consequence for the
identification of bilinear models, since the correlogram does not distinguish them
from ARMA models. Moreover, the practical calculation of the partial autocorrelation
function of bilinear models is so complex that it cannot be recommended for the
identification of such models. Fortunately, applications show that a sufficient
distinguishing criterion in questionable cases is the form of the autocorrelation function
of the squared time series $\{y_t^2\}$. In order to demonstrate this, we present some
results that have been derived for the simplest types of bilinear models (see, e.g.,
Granger and Andersen (1978)):
1. Example of superdiagonal model:
$$y_t = \beta y_{t-n} e_{t-m} + e_t, \quad m < n, \tag{9.7}$$
where the sufficient and necessary condition of stationarity has the form $\lambda^2 < 1$ for
$\lambda = \beta \sigma_e$. Then the corresponding time series {yt} has the zero mean value, the
variance $\sigma_e^2 / (1 - \lambda^2)$, and the autocorrelation function
$$\rho_k = 0 \quad \text{for } k \ne 0. \tag{9.8}$$
Moreover, the autocorrelation function $\rho_k^{(2)}$ of the squared time series $\{y_t^2\}$ fulfills
$$\rho_k^{(2)} = \lambda^2 \rho_{k-n}^{(2)} \quad \text{for } k > m, \tag{9.9}$$
so that these autocorrelation functions identify the time series {yt} as white noise,
and the time series $\{y_t^2\}$ as the process ARMA(n, m) (see (6.46)).
2. Example of diagonal model:
$$y_t = \beta y_{t-1} e_{t-1} + e_t, \tag{9.10}$$
where the sufficient and necessary condition of stationarity has again the form
$\lambda^2 < 1$. The corresponding time series {yt} has the mean value $\beta \sigma_e^2$, the variance
$\sigma_e^2 (1 + \lambda^2 + \lambda^4) / (1 - \lambda^2)$, and the autocorrelation function
$$\rho_k = \frac{\lambda^2 (1 - \lambda^2)}{1 + \lambda^2 + \lambda^4} \quad \text{for } k = 1, \qquad \rho_k = 0 \quad \text{for } k > 1 \tag{9.11}$$
(under the stronger assumption $e_t \sim$ iid $N(0, \sigma_e^2)$). The autocorrelation function $\rho_k^{(2)}$ of
$\{y_t^2\}$ fulfills
$$\rho_k^{(2)} = \lambda^2 \rho_{k-1}^{(2)} \quad \text{for } k > 1, \tag{9.12}$$
so that these autocorrelation functions identify the time series {yt} as the process
MA(1), and the time series $\{y_t^2\}$ as the process ARMA(1, 1).
3. Example of subdiagonal model:
$$y_t = \beta y_{t-2} e_{t-3} + e_t, \tag{9.13}$$
where the sufficient and necessary condition of stationarity has again the form
$\lambda^2 < 1$. The corresponding time series {yt} has the zero mean value, the variance
$\sigma_e^2 / (1 - \lambda^2)$, and the autocorrelation function
$$\rho_k = 0 \quad \text{for } k \ne 0. \tag{9.14}$$
The autocorrelation function $\rho_k^{(2)}$ of $\{y_t^2\}$ fulfills
$$\rho_k^{(2)} = \lambda^2 \rho_{k-2}^{(2)} \quad \text{for } k > 3, \tag{9.15}$$
so that these autocorrelation functions identify the time series {yt} as white noise,
and the time series $\{y_t^2\}$ as the process ARMA(2, 3). In any case, the analysis of
theoretical properties of subdiagonal models is usually much more complex than for
the superdiagonal and diagonal models.
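The moments of the diagonal model (9.10) can be checked by simulation. The following Python sketch simulates the model with Gaussian noise and compares the sample mean and the first sample autocorrelations with the values βσe² and (9.11). The function names, the value β = 0.4, the seed, and the sample size are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_diagonal_bilinear(beta, n, sigma_e=1.0, burn_in=500, seed=0):
    """Simulate y_t = beta * y_{t-1} * e_{t-1} + e_t with iid N(0, sigma_e^2) noise."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0.0, sigma_e, size=n + burn_in)
    y = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        y[t] = beta * y[t - 1] * e[t - 1] + e[t]
    return y[burn_in:]  # drop the burn-in so the start value has no influence


def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0, ..., max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = x @ x
    return np.array([1.0] + [(x[k:] @ x[:-k]) / denom for k in range(1, max_lag + 1)])


beta, sigma_e = 0.4, 1.0
lam2 = (beta * sigma_e) ** 2                      # lambda^2, must be below 1
y = simulate_diagonal_bilinear(beta, n=100_000, sigma_e=sigma_e)
acf_y = sample_acf(y, 3)

theoretical_mean = beta * sigma_e ** 2
theoretical_rho1 = lam2 * (1 - lam2) / (1 + lam2 + lam2 ** 2)
print(y.mean(), theoretical_mean)
print(acf_y[1], theoretical_rho1, acf_y[2])
```

The sample autocorrelations beyond lag 1 should be close to zero, so the correlogram alone would suggest an MA(1) process, as stated in the text.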
The bilinear models can be estimated similarly to the linear models by applying a
recursive calculation of values et in dependence on the model parameters (see Sect.
6.3.2). The predictions can also be constructed analogously to the linear case. For
example, in the diagonal model (9.10), it is possible to derive a necessary condition of
invertibility in the form (see Granger and Newbold (1986))
$$\frac{\lambda^2 \left(2\lambda^2 + 1\right)}{1 - \lambda^2} < 1 \tag{9.16}$$
(i.e., |λ| < 0.605). The prediction in this model can be constructed as
$$\hat{y}_{t+1}(t) = \hat{\beta} y_t \hat{e}_t, \qquad \hat{y}_{t+\tau}(t) = E(y_{t+\tau}) = \hat{\beta} \hat{\sigma}_e^2 \quad \text{for } \tau > 1. \tag{9.17}$$
Remark 9.1 Granger and Andersen (1978) present the following financial application
of bilinear models. A time series {yt} of stock prices of a big corporation
with a length of 169 observations was originally estimated by means of a linear
model $y_t = e_t + 0.26\, e_{t-1}$ with the variance of white noise estimated as 24.8.
However, the time series {et}, originally regarded as a (linear) white noise, was
identified and estimated as the bilinear model of the form $e_t = 0.02\, e_{t-1} u_{t-1} + u_t$,
where ut denotes a white noise with estimated variance 23.5. Even though the
reduction of white noise variance seems insignificant (from 24.8 to 23.5), the
mean squared error MSE of the one-step-ahead prediction (see (2.11)) calculated
for the last fifteen out-of-sample observations (i.e., h = 15) decreased by 11% when
applying the bilinear scheme.
⋄
9.1.2 Threshold Models SETAR
Threshold models SETAR replace linear relations by a piecewise linear function $f(\cdot)$
in (9.2), where the changes of this function are controlled not from the time space but
from the state space of function values. More specifically, these models are
constructed by applying some critical limits (thresholds) and change when the observed
time series crosses these thresholds (a similar principle has been applied for GJR
GARCH processes from Sect. 8.3.6, but for their conditional variance (volatility)
and not for the conditional mean, as is the case for SETAR models). A more
general framework is provided by regime-switching models, where the particular regimes
can be controlled by fixed (deterministic) thresholds (as in the SETAR models) or in a
stochastic way (see, e.g., MSW models later in this section).
Let us consider a very simple model SETAR of the form
$$y_t = \begin{cases} -1.8\, y_{t-1} + e_t & \text{for } y_{t-1} < 0, \\ 0.5\, y_{t-1} + e_t & \text{for } y_{t-1} \ge 0, \end{cases} \tag{9.18}$$
where et are iid N(0, 1) (obviously, this model has a single threshold at zero, where
the past value $y_{t-1}$ with time delay d = 1 controls the current value yt). Figure 9.1
plots one simulation of this process with length 200 and zero starting value
(trajectories of other simulations are very similar). At first glance, one can see some
interesting properties of this process:
• The process is stationary (even though the first autoregressive polynomial has a
root lying significantly inside the unit circle in the complex plane).
• The process is (geometrically) ergodic, i.e., its sample mean converges (in
a specific way) to the theoretical mean.
• The given realization shows an asymmetry between its upward and downward
jumps: if $y_{t-1} < 0$, then the process tends to turn over immediately to a positive
(i.e., opposite) value due to the significantly negative value of the autoregressive
parameter –1.8, while if $y_{t-1} > 0$, then the turnover to negative (i.e., opposite)
values usually takes more time units. It implies directly that the process attains
more values above the zero threshold than below it and that it shows immediate
significant jumps upward to positive values as soon as it becomes negative.
• The sample mean of the given realization is 0.75 with standard deviation 0.08 of
this sample estimate so that it lies significantly above the zero threshold (the
theoretical mean of the given process is the weighted average of its conditional
mean values for both threshold regions with weights corresponding to the
probabilities of both regions from the point of view of the stationary distribution of the
process).
Fig. 9.1 Simulation of threshold model (9.18) with one zero threshold
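The behavior of the model (9.18) described above can be reproduced by a short simulation. The Python sketch below is only illustrative: the function name `simulate_setar`, the seed, and the sample size are assumptions, so the resulting numbers will differ from the particular realization of length 200 discussed in the text.

```python
import numpy as np


def simulate_setar(n, seed=0, y0=0.0):
    """Simulate the SETAR model (9.18) with iid N(0, 1) noise and start value y0."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    y = np.empty(n)
    prev = y0
    for t in range(n):
        # the regime is selected by the sign of the previous value (delay d = 1)
        coef = -1.8 if prev < 0 else 0.5
        prev = coef * prev + e[t]
        y[t] = prev
    return y


y = simulate_setar(20_000)
share_positive = np.mean(y > 0)
print(y.mean(), share_positive)
```

The sample mean lies above the zero threshold and more than half of the simulated values are positive, in line with the asymmetry noted in the list above.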
In general, the process {yt} usually denoted as the threshold autoregressive
model with r autoregressive regimes of orders pj and controlling delay d (the
acronym SETAR comes from self-exciting threshold AR to stress the self-regulation
of the process) has the form
$$y_t = \alpha^{(j)} + \varphi_1^{(j)} y_{t-1} + \dots + \varphi_{p_j}^{(j)} y_{t-p_j} + e_t^{(j)} \quad \text{for } P_{j-1} \le y_{t-d} < P_j, \quad j = 1, \dots, r, \tag{9.19}$$
where d and r are given natural numbers, thresholds Pj are real numbers fulfilling
inequalities $-\infty = P_0 < P_1 < \dots < P_r = \infty$, and $\{e_t^{(j)}\}$ are mutually independent
white noises usually of the type iid with variances $\sigma_j^2$.
The identification and (simultaneous) estimation of SETAR models is mostly
carried out by applying information criteria of the type AIC from Sect. 6.3.1 (see Tong
(1983, 1990)).
Remark 9.2 Chappell et al. (1996) estimated the model SETAR with one threshold
for the time series {Et} of log returns of daily exchange rate French franc / German
mark (FRF/DEM) in the period from May 1, 1990, to March 30, 1992 (i.e., 450
observations)
$$E_t = \begin{cases} 0.0222 + 0.9962\, E_{t-1} + e_t^{(1)} & \text{for } E_{t-1} < 5.8306, \\ 0.3486 + 0.4394\, E_{t-1} + 0.3057\, E_{t-2} + 0.1951\, E_{t-3} + e_t^{(2)} & \text{for } E_{t-1} \ge 5.8306. \end{cases}$$
The threshold value was estimated a few percent below the upper limit prescribed by
the Exchange Rate Mechanism (ERM) in the Economic and Monetary Union (EMU),
which was in force at that time (this corresponds to reality, since the central banks
of particular states usually intervened some time before the exchange rate reached
the permitted limit).
⋄
Remark 9.3 As the conditional mean value of models SETAR is not continuous
(indeed, the thresholds are points of discontinuity of μt ), one has suggested the
models STAR (smooth transition AR model; see Chan and Tong (1986) and others).
For instance, in the case of model with two regimes it can be
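$$y_t = \alpha^{(1)} + \sum_{i=1}^{p} \varphi_i^{(1)} y_{t-i} + F\!\left(\frac{y_{t-d} - \Delta}{s}\right)\left(\alpha^{(2)} + \sum_{i=1}^{p} \varphi_i^{(2)} y_{t-i}\right) + e_t, \tag{9.20}$$
where parameters Δ and s and a transition function F(·) determine the way of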
transition between both regimes (a usual choice of F in practice is the distribution
function of logistic or exponential distribution). Even though the corresponding
conditional mean value μt is assumed to be differentiable in continuous time, the
consistency of parameter estimation is usually problematic (particularly for the
location Δ and scale s).
⋄
9.1.3 Asymmetric Moving Average Models
Another approach to asymmetry is represented by asymmetric moving average
processes (see Wecker (1981))
$$y_t = e_t^{+} + \theta_1^{+} e_{t-1}^{+} + \dots + \theta_q^{+} e_{t-q}^{+} + e_t^{-} + \theta_1^{-} e_{t-1}^{-} + \dots + \theta_q^{-} e_{t-q}^{-}, \tag{9.21}$$
where $e_t$ is a normal white noise with variance $\sigma_e^2$, $e_t^{+} = \max(0, e_t)$, $e_t^{-} = \min(0, e_t)$,
and $\theta_1^{+}, \dots, \theta_q^{-}$ are parameters. If $\theta_j^{+} = \theta_j^{-}$ ($j = 1, \dots, q$), then (9.21) is the classical
“symmetric” moving average process MA(q) (see (6.24)).
In contrast to the symmetric moving average models, the asymmetric ones may
not have zero mean value, even though their correlation structure is similar to the
symmetric case with truncation point in q. For example, the asymmetric process
MA(1) fulfills
$$\mu = \mathrm{E}(y_t) = \frac{\left(\theta_1^{+} - \theta_1^{-}\right)\sigma_e}{\sqrt{2\pi}}, \tag{9.22}$$
$$\gamma_0 = \mathrm{var}(y_t) = \frac{\left(1 + (\theta_1^{+})^2\right)\sigma_e^2}{2} + \frac{\left(1 + (\theta_1^{-})^2\right)\sigma_e^2}{2} - \mu^2,$$
$$\rho_1 = \frac{\left(\theta_1^{+} + \theta_1^{-}\right)\sigma_e^2}{2\gamma_0}, \qquad \rho_k = 0 \ \text{ for } k > 1. \tag{9.23}$$
If $\theta_1^{+} = -\theta_1^{-}$, then such a process obviously has the same correlation structure as a
white noise.
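The moments of the asymmetric MA(1) process can be checked numerically. The sketch below is illustrative only: the function names and the parameter values used in the usage note are hypothetical, and the lag-one autocorrelation is computed as the lag-one autocovariance $(\theta_1^{+} + \theta_1^{-})\sigma_e^2/2$ divided by $\gamma_0$.

```python
import numpy as np

def asym_ma1_moments(theta_plus, theta_minus, sigma=1.0):
    """Theoretical mean, variance, and lag-one autocorrelation of asymmetric MA(1)."""
    mu = (theta_plus - theta_minus) * sigma / np.sqrt(2.0 * np.pi)
    gamma0 = ((1.0 + theta_plus**2) + (1.0 + theta_minus**2)) * sigma**2 / 2.0 - mu**2
    gamma1 = (theta_plus + theta_minus) * sigma**2 / 2.0   # lag-one autocovariance
    return mu, gamma0, gamma1 / gamma0

def simulate_asym_ma1(n, theta_plus, theta_minus, sigma=1.0, seed=0):
    """Simulate y_t = e_t + theta_plus * e_{t-1}^+ + theta_minus * e_{t-1}^-."""
    rng = np.random.default_rng(seed)
    e = sigma * rng.standard_normal(n + 1)
    e_pos, e_neg = np.maximum(e, 0.0), np.minimum(e, 0.0)  # e^+ and e^-
    # e_t^+ + e_t^- = e_t, so only the lagged terms need the split
    return e[1:] + theta_plus * e_pos[:-1] + theta_minus * e_neg[:-1]
```

Comparing `asym_ma1_moments(0.8, -0.3)` with the sample mean, variance, and lag-one correlation of `simulate_asym_ma1(200000, 0.8, -0.3)` shows, in particular, the nonzero mean of the asymmetric process.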
9.1.4 Autoregressive Models with Random Coefficients RCA
Autoregressive processes with random coefficients RCA were originally suggested to
model more appropriately the conditional mean value of the autoregressive process
{yt} regarding its parameters as random variables (see Nicholls and Quinn (1982)),
but one could classify them also as models with conditional heteroscedasticity. The
basic form is
$$y_t = \alpha + \sum_{i=1}^{p} \left(\varphi_i + \delta_{it}\right) y_{t-i} + e_t, \tag{9.24}$$
where $\{\delta_t\} = \{(\delta_{1t}, \dots, \delta_{pt})'\}$ is a sequence of independent random vectors with zero
(vector) mean and variance matrix $\Sigma_{\delta\delta}$, independent of the white noise {et}. The
conditional mean and variance of (9.24) fulfill
$$\mu_t = \mathrm{E}(y_t \mid \Omega_{t-1}) = \alpha + \sum_{i=1}^{p} \varphi_i y_{t-i},$$
$$\sigma_t^2 = \mathrm{var}(y_t \mid \Omega_{t-1}) = \sigma_e^2 + \left(y_{t-1}, \dots, y_{t-p}\right) \Sigma_{\delta\delta} \left(y_{t-1}, \dots, y_{t-p}\right)'. \tag{9.25}$$
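The conditional heteroscedasticity of the RCA model is perhaps easiest to see by evaluating (9.25) for given past values. The helper below is a small sketch; its name and argument names are hypothetical, and the numbers in the usage note are chosen only for illustration.

```python
import numpy as np

def rca_conditional_moments(y_lags, alpha, phi, sigma_e2, Sigma_delta):
    """Conditional mean and variance (9.25) of an RCA process.

    y_lags      : past values (y_{t-1}, ..., y_{t-p})
    alpha, phi  : intercept and mean AR coefficients
    sigma_e2    : variance of the white noise e_t
    Sigma_delta : p x p variance matrix of the random coefficient part
    """
    y_lags = np.asarray(y_lags, dtype=float)
    mu_t = alpha + np.dot(phi, y_lags)
    # quadratic form in the past values: the variance grows with their size
    var_t = sigma_e2 + y_lags @ np.asarray(Sigma_delta, dtype=float) @ y_lags
    return mu_t, var_t
```

With `y_lags = [2.0, -1.0]`, `alpha = 0.1`, `phi = [0.5, 0.2]`, `sigma_e2 = 1.0`, and a diagonal `Sigma_delta` with entries 0.04 and 0.01, the function returns the conditional mean 0.9 and the conditional variance 1.17.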
9.1.5 Double Stochastic Models
Double stochastic models extend the principle of RCA by modeling the parameters
of an ARMA process (or another classical linear process) by means of other random
processes (see, e.g., Chen and Tsay (1993); Tjøstheim (1986)). Special cases are
functional-coefficient autoregressive processes FAR of the form
$$y_t = f_1(y_{t-1}, \dots, y_{t-k})\, y_{t-1} + \dots + f_p(y_{t-1}, \dots, y_{t-k})\, y_{t-p} + e_t, \tag{9.26}$$
where functions $f_1, \dots, f_p$ should be differentiable to the second order, e.g., the
exponential autoregressive model (see Haggan and Ozaki (1981))
$$y_t = \left(\varphi_1 + \pi_1 \exp\left(-\gamma y_{t-1}^2\right)\right) y_{t-1} + \dots + \left(\varphi_p + \pi_p \exp\left(-\gamma y_{t-1}^2\right)\right) y_{t-p} + e_t, \tag{9.27}$$
where the autoregressive parameters depend exponentially on the amplitude $|y_{t-1}|$:
obviously, these parameters are close to $\varphi_i$ for high amplitudes, while they are close
to $\varphi_i + \pi_i$ for low amplitudes (moreover, the amplitude effect is modified by means of
the parameter γ > 0). Such amplitude-dependent models possess some typical
features of physical processes, e.g., the dependence of frequency on amplitude, the
limit cycle, and others (see introduction to this Sect. 9.1).
9.1.6 Switching Regimes Models MSW
In contrast to the models, the regimes of which are controlled by observable vari-
ables (e.g., by the location of a past value of the given time series between thresholds
of SETAR), the models denoted as MSW change particular regimes in an
unobservable (latent) way, namely by means of a Markov mechanism (MSW is
the acronym for Markov switching).
In particular, the simplest case of the so-called process MSA (Markov-switching
autoregressive) with two regimes has the form
$$y_t = \begin{cases} \alpha^{(1)} + \sum_{i=1}^{p_1} \phi_i^{(1)} y_{t-i} + e_t^{(1)} & \text{for } s_t = 1, \\ \alpha^{(2)} + \sum_{i=1}^{p_2} \phi_i^{(2)} y_{t-i} + e_t^{(2)} & \text{for } s_t = 2, \end{cases} \tag{9.28}$$
where $s_t$ is a Markov chain with values 1 and 2 and transition probabilities
$$P(s_t = 2 \mid s_{t-1} = 1) = w_1, \qquad P(s_t = 1 \mid s_{t-1} = 2) = w_2 \tag{9.29}$$
and $\{e_t^{(1)}\}$ and $\{e_t^{(2)}\}$ are (mutually independent) iid white noises. Obviously, a
small value of the transit probability $w_i$ means that the process remains a longer time
in the state i (the reciprocal value $1/w_i$ is equal to the mean period of stay, i.e., the mean
holding time, in this state). One can see that the process MSA makes use of the
Markov probability mechanism to control the transits among particular conditional
mean values.
Due to the stochastic (i.e., latent) control of regime switching, the construction of
models MSW is not simple (see, e.g., Hamilton (1989, 1994)). One can make use of
some estimation techniques based on simulations, e.g., MCMC method (Markov
Chain Monte Carlo). The construction of prediction is more complex as well,
combining linearly the predictions constructed for particular regimes (in contrast
to predicting in a threshold model, where the observed past value $y_{t-d}$ unambiguously
determines in which regime the prediction will be constructed).
Remark 9.4 In financial practice, the models MSW are popular for modeling time
series of gross domestic products GDPt. For example, Tsay (2002) constructed the
following model for the quarterly (seasonally adjusted) time series of GDP growth
(in %) in the USA in years 1947–1990
$$y_t = \begin{cases} 0.909 + 0.265\,y_{t-1} + 0.029\,y_{t-2} - 0.126\,y_{t-3} - 0.110\,y_{t-4} + e_t^{(1)} & \text{for } s_t = 1, \\ -0.420 + 0.216\,y_{t-1} + 0.628\,y_{t-2} - 0.073\,y_{t-3} - 0.097\,y_{t-4} + e_t^{(2)} & \text{for } s_t = 2, \end{cases}$$
where the standard deviations $\sigma_e^{(1)}$ and $\sigma_e^{(2)}$ of white noises were estimated as 0.816
and 1.017. Hence the mean values of the process {yt} for the first and second state
can be evaluated as 0.965 and –1.288 (evidently, the first state corresponds to the
quarters of economic growth, while the second state to the quarters of economic
decline). Finally, the transit probabilities $w_1$ and $w_2$ were estimated as 0.118 and
0.286, which can be interpreted as follows: to get out of recession is more probable
than to enter it. More specifically, the mean length of recession period is
1/0.286 = 3.50 quarters, i.e., less than 1 year, while the mean length of boom period
is 1/0.118 = 8.47 quarters, i.e., more than 2 years.
⋄
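The figures quoted in Remark 9.4 can be reproduced with a few lines of arithmetic. The sketch below is illustrative: the function names are hypothetical, and the signs of the coefficients are read so that the regime means come out as the quoted values 0.965 and –1.288. The stationary probability of the growth state is an addition not stated in the remark; it follows from the standard balance condition of a two-state Markov chain.

```python
def regime_mean(alpha, phis):
    """Mean of a stationary AR regime: alpha / (1 - sum of AR coefficients)."""
    return alpha / (1.0 - sum(phis))

def mean_holding_time(w):
    """Mean number of periods spent in a state that is left with probability w."""
    return 1.0 / w

growth_mean = regime_mean(0.909, [0.265, 0.029, -0.126, -0.110])
decline_mean = regime_mean(-0.420, [0.216, 0.628, -0.073, -0.097])

w1, w2 = 0.118, 0.286                 # P(1 -> 2) and P(2 -> 1)
boom_quarters = mean_holding_time(w1)
recession_quarters = mean_holding_time(w2)

# long-run share of quarters spent in state 1 (not given in the remark)
stationary_p1 = w2 / (w1 + w2)

print(round(growth_mean, 3), round(decline_mean, 3))
print(round(boom_quarters, 2), round(recession_quarters, 2), round(stationary_p1, 3))
```

The long-run share of about 0.71 for the growth state is consistent with the interpretation that booms last longer on average than recessions.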
9.2 Further Models for Financial Time Series
There are further approaches to nonlinear models of financial time series that cannot
be classified analogously as in Sect. 9.1 (due to their philosophy, due to the type of
analysis, etc.). Two examples of such approaches will be given here as illustrations.
9.2.1 Nonparametric Models
In the simplest case, let two financial variables yt and xt be linked by the relation
$$y_t = m(x_t) + e_t, \tag{9.30}$$
where m(·) is an unknown (nonlinear, but smooth) function and {et} is a white noise.
A natural task is to estimate the function m(·) as accurately as possible by means of
observed data y1, ..., yT and x1, ..., xT.
As the arithmetic average of values e1, ..., eT converges to zero with increasing
T (according to the law of large numbers), it seems that the natural estimate of m(·) is
the arithmetic average of values y1, ..., yT
$$\frac{1}{T} \sum_{t=1}^{T} y_t. \tag{9.31}$$
However, there is a problem consisting in the fact that one should estimate the
function m(x) for a given value of argument x, but the observed values x1, ..., xT may
differ from x significantly. Therefore, it is reasonable to replace (9.31) by the
weighted average
$$\hat{m}(x) = \frac{1}{T} \sum_{t=1}^{T} w_t(x)\, y_t, \tag{9.32}$$
where the weights wt(x) are large (or small) for such indices t, for which the observed
values xt lie close to x (or far from x), respectively.
In other words, one weighs the values yt by means of locally weighted averages
using weights of described properties. A unifying principle in this context consists in
application of a so-called kernel (see, e.g., Härdle (1990)), which is a suitable function
K(·) with properties of a probability density
$$K(x) \ge 0, \qquad \int_{-\infty}^{\infty} K(z)\, dz = 1. \tag{9.33}$$
Specialized statistical software systems offer various choices of kernels K(x)
(in the module usually titled as nonparametric regression). Instead of (9.32), one
then puts
$$\hat{m}(x) = \frac{\sum_{t=1}^{T} K_h(x - x_t)\, y_t}{\sum_{t=1}^{T} K_h(x - x_t)}, \tag{9.34}$$
where h > 0 is the so-called bandwidth and $K_h(x) = (1/h)K(x/h)$ is the kernel
calibrated by the scale h. The estimate (9.34), known in nonparametric regression
as the Nadaraya–Watson estimate, reflects the distances of particular regressors xt from
x in a systematic way.
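The Nadaraya–Watson estimate (9.34) is straightforward to compute directly. The following is a minimal sketch that assumes a Gaussian kernel (one common choice among many); the function names and the bandwidth in the usage note are hypothetical.

```python
import numpy as np

def gaussian_kernel(z):
    """Standard normal density, a kernel satisfying (9.33)."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def nadaraya_watson(x0, x, y, h, kernel=gaussian_kernel):
    """Nadaraya-Watson estimate (9.34) of m at the point(s) x0 with bandwidth h."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # K_h(x0 - x_t) = K((x0 - x_t) / h) / h, one row per evaluation point
    weights = kernel((x0[:, None] - x[None, :]) / h) / h
    return weights @ y / weights.sum(axis=1)
```

For instance, `nadaraya_watson([0.25, 0.5], x, y, h=0.05)` returns the locally weighted averages of `y` around the two points. A smaller bandwidth follows the data more closely, while a larger one smooths more strongly; the factor 1/h cancels in the ratio but is kept to mirror the definition of $K_h$.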
9.2.2 Neural Networks
Neural networks make it possible to parametrize any continuous (nonlinear) function.
Therefore, this approach may be useful also for more complex financial models (see, e.g.,
Ripley (1993); Tsay (2002)).
A simple neural network can be looked upon as a system that connects an input
layer x over possible hidden layers h to an output layer o. Each layer is represented
by a given number of nodes (neurons). A neural network processes information from
one layer to the next one by means of activation functions (it holds for the neural
networks of the type feed-forward, while backward connections are also allowable in
the networks of the type feed-back). For example, the network of the type 3-2-1 has
three nodes in the input layer, two nodes in the hidden layer, and one node in the
output layer. A typical activation function has the form
$$o_j = f_j\!\left(\alpha_{0j} + \sum_{i \to j} w_{ij} x_i\right), \tag{9.35}$$
where $f_j$ is the logistic function of the type
$$f_j(z) = \frac{\exp(z)}{1 + \exp(z)}, \tag{9.36}$$
$x_i$ is the value of the ith input node, $o_j$ is the value of the jth output node, $\alpha_{0j}$ is called
bias, $w_{ij}$ are weights, and the summation $i \to j$ means summing over all input nodes
feeding to j. If a node has an activation function of the form
$$f_j(z) = \begin{cases} 1 & \text{for } z > 0, \\ 0 & \text{for } z \le 0, \end{cases} \tag{9.37}$$
then one calls it a threshold node, with “1” denoting that the node fires (revitalizes)
its message. The final connection from inputs to outputs can be more complex, if
there are hidden layers (due to compounding gradually the activation functions),
e.g.:
$$o = f\!\left(\alpha_0 + \sum_{i \to o} w_i x_i + \sum_{k \to o} w_{ko}\, f_k\!\left(\alpha_{0k} + \sum_{i \to k} w_{ik} x_i\right)\right), \tag{9.38}$$
where not only a direct connection from the input to the output layer is possible, but
also an indirect one by means of a hidden layer (with summing index k in (9.38)).
A typical application of neural networks for financial time series is the following
one. One observes data xt and yt (t = 1, ..., T), where xt is a vector of input values at
time t and yt is an observation of given time series at time t. In addition to it, we have
also model output values ot expressed analytically for particular times t by means of
relations of the type (9.38). Then by minimizing a simple criterion, e.g., the sum of
squares
$$\sum_{t=1}^{T} \left(y_t - o_t\right)^2, \tag{9.39}$$
one estimates the biases α and weights w in (9.38). The neural network calibrated in
this way can be used, e.g., for construction of predictions in the given time series.
Moreover, the hold-out sample approach (see Sect. 2.2.3.4) enables us to evaluate
the prediction qualities of this model.
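A forward pass through a 3-2-1 network of the type (9.38) and the criterion (9.39) can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: all function and argument names are hypothetical, the hidden nodes use the logistic function (9.36), and the output activation defaults to the identity.

```python
import numpy as np

def logistic(z):
    """Logistic activation (9.36): exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + np.exp(-z))

def network_output(x, a0, w_direct, w_hidden, a0_hidden, w_input, f_out=lambda z: z):
    """Output o of (9.38) for one input vector x.

    a0, w_direct      : output bias and weights of the direct input-output links
    w_hidden          : weights w_ko from the hidden nodes to the output
    a0_hidden, w_input: hidden biases and the (hidden x input) weight matrix w_ik
    f_out             : output activation (identity by assumption)
    """
    x = np.asarray(x, dtype=float)
    hidden = logistic(np.asarray(a0_hidden) + np.asarray(w_input) @ x)
    return f_out(a0 + np.dot(w_direct, x) + np.dot(w_hidden, hidden))

def sum_of_squares(y, X, **params):
    """Criterion (9.39): sum of squared deviations between y_t and o_t."""
    outputs = np.array([network_output(x, **params) for x in X])
    return float(np.sum((np.asarray(y, dtype=float) - outputs) ** 2))
```

In practice, `sum_of_squares` would be handed to a numerical optimizer that searches over the biases and weights; the choice of optimizer is left open here.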
Remark 9.5 Tsay (2002) applied the model (9.38) with three nodes in the input
layer, two nodes in the hidden layer, and one node in the output layer for 864 daily
log returns rt of IBM stocks. Choosing the vector of input values as $x_t = (r_{t-1}, r_{t-2}, r_{t-3})$,
one estimated the neural network as
$$\hat{r}_t = 3.22 - 1.81\, f_1(x_t) - 2.28\, f_2(x_t) - 0.09\, r_{t-1} - 0.05\, r_{t-2} - 0.12\, r_{t-3},$$
where
$$f_1(x_t) = \frac{\exp\left(-8.34 - 18.97\, r_{t-1} + 2.17\, r_{t-2} - 19.17\, r_{t-3}\right)}{1 + \exp\left(-8.34 - 18.97\, r_{t-1} + 2.17\, r_{t-2} - 19.17\, r_{t-3}\right)},$$
$$f_2(x_t) = \frac{\exp\left(39.25 - 22.17\, r_{t-1} - 17.34\, r_{t-2} - 5.98\, r_{t-3}\right)}{1 + \exp\left(39.25 - 22.17\, r_{t-1} - 17.34\, r_{t-2} - 5.98\, r_{t-3}\right)}.$$
⋄
9.3 Tests of Linearity
An important part of analysis of financial time series concerns verifying nonlinearity
of analyzed data. It can be done using financial theory that often confirms that
nonlinear relations are the most acceptable ones for given variables. An alternative
approach is based on statistical tests denoted in this context as tests of linearity.
These tests are mostly applied not for the original time series but for residuals
calculated by means of the model that should be verified since it is frequent in
practice that after applying a linear model of the type ARMA or a nonlinear model of
the type GARCH, an unexplained nonlinear structure may remain in the analyzed
data outside the systematic part of the model. Therefore, we shall often use the
symbol {et} for the tested time series (and not {yt}) in the following text.
The null hypothesis of such tests of linearity is mostly the acceptability of a linear
model for the analyzed time series. In particular, one often verifies the null hypoth-
esis that the values {et} are independent or even iid: any violation of independence of
the calculated residuals usually indicates an inadequacy of constructed model
including the assumption of linearity. In practical analysis of (financial) time series,
one recommends particularly the following tests of linearity (see, e.g., Tsay (2002)):
• Q tests using, e.g., Ljung–Box statistics with the critical region of the form
(applying the significance level α)
$$Q = n(n+2) \sum_{k=1}^{K} \frac{1}{n-k} \left(r_k(e_t)\right)^2 \ge \chi^2_{1-\alpha}(K - p - q), \tag{9.40}$$
where {et} are residuals constructed by estimating a model ARMA(p, q) for the
given time series (see (6.67) and the verification of models ARCH in Sect. 8.3.4.3).
• RESET tests were suggested for various regression problems in statistics (Regression
Equation Specification Error Tests, see Ramsey (1969)). When applying
them to test the linearity in time series, one tests, e.g., the null hypothesis of
the form
$$H_0: \beta_1 = 0, \dots, \beta_s = 0 \tag{9.41}$$
in the following model:
$$y_t = \alpha + \varphi_1 y_{t-1} + \dots + \varphi_p y_{t-p} + \beta_1 \hat{y}_t^{\,2} + \dots + \beta_s \hat{y}_t^{\,s+1} + \varepsilon_t, \tag{9.42}$$
where the values $\hat{y}_t$ are calculated in the original model AR(p) for the original
time series {yt} (i.e., under constraints (9.41) in (9.42)).
• BDS test is a widely used (nonparametric) test of independence H0: et ~ iid (it is
called according to its authors Brock, Dechert, and Scheinkman). This test is also
recommended as an effective test of linearity of {et} in the framework of
modeling financial time series. Numerous applications show that the test has a
high power when detecting various violations of independence (these violations
may have different forms of linear or nonlinear dependence of all types, deter-
ministic chaos, etc.). Let us describe this test in more details:
The BDS test starts by choosing a distance ε > 0. If really et ~ iid, then the
probability that the distance $|e_s - e_t|$ does not exceed ε for a pair es and et is the same
for an arbitrary choice of such pairs. Let us denote this probability as c1(ε). The index
1 in this symbol is used due to the fact that we shall consider more generally also
m such pairs ordered in time as $(e_s, e_t), (e_{s+1}, e_{t+1}), \dots, (e_{s+m-1}, e_{t+m-1})$, and in such a
case, the symbol cm(ε) will denote the probability that for all pairs in this group the
corresponding distances do not exceed ε. If the null hypothesis on independence
holds, then it must be
$$c_m(\varepsilon) = \left(c_1(\varepsilon)\right)^m. \tag{9.43}$$
To perform the test in practice, one needs sample versions (estimates) of
these values (they are called correlation integrals in the theory of chaos)
$$c_{m,T}(\varepsilon) = \frac{2}{(T-m+1)(T-m)} \sum_{s=1}^{T-m+1} \sum_{t=s+1}^{T-m+1} \prod_{j=0}^{m-1} I_\varepsilon\!\left(e_{s+j}, e_{t+j}\right), \tag{9.44}$$
where
$$I_\varepsilon(x, y) = \begin{cases} 1 & \text{for } |x - y| \le \varepsilon, \\ 0 & \text{otherwise}. \end{cases} \tag{9.45}$$
Then the test of hypothesis H0: et ~ iid consists in testing whether the deviation
$$b_{m,T}(\varepsilon) = c_{m,T}(\varepsilon) - \left(c_{1,T-m+1}(\varepsilon)\right)^m \tag{9.46}$$
is not significantly different from zero (one removes deliberately m – 1 observations
in the second term on the right-hand side of (9.46) to base its calculation on the same
number of observations as for the first term). Brock et al. (1996) show that (under the
null hypothesis H0: et ~ iid), the following asymptotic result holds for increasing T:
$$\sqrt{T-m+1}\; \frac{b_{m,T}(\varepsilon)}{\sigma_{m,T}(\varepsilon)} \to N(0, 1), \tag{9.47}$$
where
$$\sigma_{m,T}^2(\varepsilon) = 4\left( \left(k_T(\varepsilon)\right)^m + 2\sum_{j=1}^{m-1} \left(k_T(\varepsilon)\right)^{m-j} \left(c_{1,T}(\varepsilon)\right)^{2j} + (m-1)^2 \left(c_{1,T}(\varepsilon)\right)^{2m} - m^2\, k_T(\varepsilon) \left(c_{1,T}(\varepsilon)\right)^{2m-2} \right),$$
$$k_T(\varepsilon) = \frac{2}{T(T-1)(T-2)} \sum_{t=1}^{T} \sum_{s=t+1}^{T} \sum_{r=s+1}^{T} \left( I_\varepsilon(e_t, e_s) I_\varepsilon(e_s, e_r) + I_\varepsilon(e_t, e_r) I_\varepsilon(e_r, e_s) + I_\varepsilon(e_s, e_t) I_\varepsilon(e_t, e_r) \right). \tag{9.48}$$
This result is then used in the (asymptotic) BDS test with a given significance level.
Table 9.1 presents the application of BDS test for 100 simulated values of the type iid
N(0, 1). The test was performed for particular values m = 2, ..., 6 and for the distance
limit ε = 1.378 which is set up optimally by software (moreover, if the sample size
T is smaller, software systems often enable us to calculate the corresponding critical
values of the test with regard to their asymptotic character by means of bootstrap
simulations, see, e.g., Table 9.1). Obviously, due to high p-values in Table 9.1 the null
hypothesis on independence of simulated data cannot be rejected (see also Example
9.1 with real data).

Table 9.1 BDS test for 100 simulated values of type iid N(0, 1)
BDS Test for Y
Dimension   BDS Statistic   Std. Error   z-Statistic   Normal Prob.   Bootstrap Prob.
2           0.001753        0.006313     0.277743      0.7812         0.7352
3           0.002676        0.010054     0.266189      0.7901         0.7128
4           0.003480        0.011994     0.290146      0.7717         0.6772
5           0.009605        0.012523     0.767018      0.4431         0.4236
6           0.013696        0.012097     1.132115      0.2576         0.2848
Raw epsilon 1.378340
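The quantities (9.44)–(9.48) can be computed directly from a residual series. The sketch below is a plain NumPy illustration, not the routine of any particular software package: the function name is hypothetical, the distance ε must be supplied by the user, and the shortened correlation integral $c_{1,T-m+1}$ is computed from the first T – m + 1 observations, which is an assumption about how the m – 1 observations are removed.

```python
import numpy as np

def bds_statistic(e, m, eps):
    """Return (b, sigma, z): deviation (9.46), its scale (9.48), and the
    standardized statistic (9.47) for embedding dimension m and distance eps."""
    e = np.asarray(e, dtype=float)
    T = e.size
    n = T - m + 1                                   # number of m-histories
    # close[s, t] = I_eps(e_s, e_t)
    close = (np.abs(e[:, None] - e[None, :]) <= eps).astype(float)
    # product over j = 0, ..., m-1 of I_eps(e_{s+j}, e_{t+j})
    prod = np.ones((n, n))
    for j in range(m):
        prod *= close[j:j + n, j:j + n]
    upper = np.triu_indices(n, k=1)                 # pairs s < t
    c_m = prod[upper].mean()                        # (9.44)
    c_1_short = close[:n, :n][upper].mean()         # c_{1, T-m+1}
    b = c_m - c_1_short ** m                        # (9.46)

    c_1 = close[np.triu_indices(T, k=1)].mean()     # c_{1, T}
    # each triple t < s < r contributes one product per "middle" point, so the
    # triple sum equals the sum over points of d * (d - 1) / 2, where d is the
    # number of other points within eps
    d = close.sum(axis=1) - 1.0
    k = np.sum(d * (d - 1.0)) / (T * (T - 1) * (T - 2))
    s = sum(k ** (m - j) * c_1 ** (2 * j) for j in range(1, m))
    sigma2 = 4.0 * (k ** m + 2.0 * s + (m - 1) ** 2 * c_1 ** (2 * m)
                    - m ** 2 * k * c_1 ** (2 * m - 2))
    sigma = np.sqrt(sigma2)
    z = np.sqrt(n) * b / sigma                      # (9.47)
    return b, sigma, z
```

Under the null hypothesis, a two-sided asymptotic p-value can be obtained from the returned `z` as `math.erfc(abs(z) / math.sqrt(2))`. The matrix-based computation needs memory of order T², which is acceptable for series of a few thousand observations.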
Example 9.1 Table 9.2 presents the application of BDS test when verifying a model
construction for daily log returns of index PX50 of Prague Exchange in year 2004
(249 values for 250 trading days; see Table 9.2 and Fig. 9.2). The estimated model
GARCH(1,1) has the form
$$r_t = 0.0023 + e_t, \qquad e_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = 0.0707\, e_{t-1}^2 + 0.8496\, \sigma_{t-1}^2.$$
The test was performed again for particular values m = 2, ..., 6 and for the distance
limit ε = 0.013 set up optimally by software. Due to high p-values in Table 9.3, the
null hypothesis on independence of residuals estimated by means of this model
cannot be rejected (more specifically, the test confirms that no unexplained nonlinear
structure is remaining in these residuals).
⋄
Table 9.2 Daily log returns of index PX50 in 2004 (249 values for 250 trading days written in columns) from Example 9.1 (see also Fig. 9.2 and Table 10.1)
         1        2        3        4        5        6        7        8        9       10
 1      –      0.0090  –0.0049   0.0114  –0.0045   0.0047   0.0074  –0.0001  –0.0064   0.0016
 2   0.0123   0.0084  –0.0047   0.0017   0.0137  –0.0034  –0.0131   0.0054  –0.0107   0.0208
 3   0.0057   0.0126   0.0129  –0.0139   0.0076   0.0088  –0.0052   0.0057   0.0250   0.0010
 4  –0.0098   0.0171   0.0076   0.0022   0.0111  –0.0051   0.0089   0.0025   0.0088  –0.0108
 5   0.0036   0.0022   0.0087   0.0067   0.0067  –0.0036  –0.0081  –0.0031  –0.0082  –0.0101
 6  –0.0029   0.0007  –0.0092   0.0111  –0.0100  –0.0044  –0.0037   0.0013  –0.0055   0.0117
 7   0.0175  –0.0043   0.0080   0.0073   0.0040  –0.0029   0.0027   0.0037   0.0022   0.0004
 8   0.0030   0.0040   0.0111  –0.0093  –0.0003   0.0037   0.0075  –0.0119  –0.0014  –0.0089
 9   0.0028   0.0016   0.0053  –0.0295   0.0040   0.0081  –0.0085   0.0057  –0.0148   0.0031
10   0.0082   0.0019  –0.0049  –0.0016   0.0134  –0.0075  –0.0020   0.0056   0.0211  –0.0114
11   0.0050   0.0008  –0.0010  –0.0212  –0.0028  –0.0100   0.0037  –0.0122   0.0054  –0.0317
12   0.0010  –0.0065   0.0132  –0.0065  –0.0113   0.0065  –0.0001  –0.0026   0.0214   0.0113
13  –0.0004   0.0150   0.0004  –0.0067  –0.0027   0.0022   0.0084  –0.0002  –0.0010   0.0019
14   0.0075   0.0029   0.0077  –0.0213  –0.0040   0.0008   0.0070   0.0325  –0.0029   0.0211
15  –0.0014   0.0184   0.0082  –0.0410   0.0000  –0.0082  –0.0013   0.0000   0.0084   0.0042
16  –0.0067   0.0019   0.0103   0.0295   0.0056  –0.0128   0.0063  –0.0038  –0.0080   0.0006
17  –0.0007  –0.0004  –0.0044  –0.0004   0.0052   0.0108   0.0037  –0.0092   0.0058  –0.0070
18   0.0038   0.0012   0.0079  –0.0188   0.0212   0.0065   0.0038   0.0091   0.0112  –0.0034
19   0.0026   0.0103   0.0058   0.0021   0.0036   0.0017   0.0107   0.0074   0.0081  –0.0076
20   0.0003   0.0096  –0.0040  –0.0158  –0.0117   0.0006   0.0020   0.0148  –0.0026   0.0051
21   0.0091   0.0238   0.0048   0.0080  –0.0040   0.0101  –0.0007  –0.0080   0.0037   0.0039
22   0.0071  –0.0019  –0.0227   0.0231   0.0130  –0.0049   0.0026   0.0199   0.0026   0.0046
23   0.0091  –0.0098   0.0036  –0.0021  –0.0065  –0.0037   0.0149  –0.0036   0.0348   0.0047
24   0.0066  –0.0138  –0.0079  –0.0022  –0.0014   0.0089  –0.0030   0.0016   0.0067   0.0101
25   0.0073   0.0037   0.0059   0.0054  –0.0079   0.0009   0.0106  –0.0123  –0.0030   0.0011

[Figure: time plot of daily log returns of index PX50 (in year 2004); horizontal axis: trading day, 25 to 250; vertical axis: log return, –0.05 to 0.04]
Fig. 9.2 Daily log returns of index PX50 in 2004 (249 values for 250 trading days)

Table 9.3 BDS test applied to residuals estimated by means of model GARCH(1,1) from Example
9.1 (daily log returns of index PX50 in year 2004)
BDS Test for RESID
Dimension   BDS Statistic   Std. Error   z-Statistic   Prob.
2           –0.002106       0.005460     –0.385828     0.6996
3           0.003350        0.008682     0.385813      0.6996
4           0.012462        0.010345     1.204570      0.2284
5           0.017166        0.010790     1.590984      0.1116
6           0.017386        0.010413     1.669745      0.0950
Raw epsilon 0.013311

9.4 Duration Modeling
Typical data in finance are transactions data (usually prices or volumes of traded
financial assets) with values observed at times of particular transactions (i.e., at time
ti for the ith transaction). Financial time series originating in this way have some
special features:
• As the non-aggregated information has often the form of high-frequency data in
this context, the timescale must reflect this fact (minutes or even seconds for big
stock and derivative exchanges or multinational foreign exchange markets).
Moreover, the technical support of trading becomes very important including
the corresponding trading rules, e.g., the standardized or minimum admissible
traded volumes, the minimum upward or downward movement in prices
(so-called tick), regulation rules in the case of extreme or critical price changes,
the difference between purchase and sale price (bid–ask spread), and the system
of charges (fee) for trade agents. Moreover, one should eliminate so-called
microstructure noise in such data; see, e.g., Ait-Sahalia and Jacod (2014),
Hautsch (2012), and others.
• Transactions such as stock trading do not occur at equally spaced time intervals,
but typically these are unequally or irregularly spaced data. In such a case, the
time duration between trades becomes important and might contain useful infor-
mation about market microstructure.
• The high-frequency data often exhibit characteristic daily periodicity (diurnal
pattern), where the transactions are heavier at the beginning and closing of the
traded hours and thinner during the lunch hours (typically it results in a U-shape
transaction intensity).
• The price is mostly a discrete-valued variable in transactions data, since the price
change from one transaction to the next occurs only in multiples of tick size (see
above). For example, the NYSE traded successively in eighths, sixteenths, and
decimals of a dollar.
• In periods of heavy trading, more transactions may occur (even with different
prices) within a single second or another very small time unit of transaction
recording (so-called multiple transactions).
In this section, we focus on modeling durations between particular transactions.
To be more specific, let ti denote the (calendar) time measured in seconds since
midnight till the moment of the ith transaction (obviously, the index i describes the
order of transactions in time, not the calendar time). The corresponding duration
between the (i – 1)th and ith transaction will be then denoted as $\Delta t_i = t_i - t_{i-1}$ (and
sometimes for simplicity even in the abbreviated form as $z_i = \Delta t_i$).
One of the most successful approaches to the duration modeling copies the
philosophy of GARCH models, but for time durations zi (and not for values of the
given time series). The corresponding models based on this analogy are called
autoregressive conditional duration processes ACD(r, s) (see Engle and Russell
(1998)):
$$z_i = \tau_i \varepsilon_i, \qquad \tau_i = \alpha_0 + \sum_{j=1}^{r} \alpha_j \tau_{i-j} + \sum_{k=1}^{s} \beta_k z_{i-k}, \tag{9.49}$$
where εi are iid nonnegative random variables generally with unit mean value and
specifically with exponential distribution in the model EACD(r, s), or Weibull
distribution in the model WACD(r, s), or gamma distribution in the model GACD
(r, s).
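Analogously as in the model GARCH, if the following sufficient condition holds
$$\alpha_0 > 0, \quad \alpha_j \ge 0, \quad \beta_k \ge 0, \quad \sum_{j=1}^{\max\{r,s\}} \left(\alpha_j + \beta_j\right) < 1, \tag{9.50}$$
then the model ACD is stationary with mean value of the form
$$\mathrm{E}(z_i) = \frac{\alpha_0}{1 - \sum_{j=1}^{\max\{r,s\}} \left(\alpha_j + \beta_j\right)}. \tag{9.51}$$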
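The mean value of a stationary ACD process can be obtained in two short steps. The derivation below is a sketch that assumes stationarity, the unit mean of $\varepsilon_i$, and the independence of $\varepsilon_i$ from $\tau_i$ (which depends only on the past); it uses the convention $\alpha_j = 0$ for $j > r$ and $\beta_j = 0$ for $j > s$, and the symbol $\mu$ is introduced here only as shorthand.

```latex
% Step 1: since E(eps_i) = 1 and eps_i is independent of tau_i,
\mathrm{E}(z_i) = \mathrm{E}(\tau_i)\,\mathrm{E}(\varepsilon_i) = \mathrm{E}(\tau_i) =: \mu .

% Step 2: take expectations on both sides of the recursion for tau_i
\mu = \alpha_0 + \sum_{j=1}^{r} \alpha_j \mu + \sum_{k=1}^{s} \beta_k \mu
\quad\Longrightarrow\quad
\mu = \frac{\alpha_0}{1 - \sum_{j=1}^{\max\{r,s\}} (\alpha_j + \beta_j)} .
```

The denominator is positive exactly when the sum of all coefficients is below one, which is why this requirement appears in the stationarity condition.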
In particular, let us consider the model EACD(1,1), where εi have standardized
exponential distribution with unit mean value and unit variance. Moreover, let the
following sufficient condition of stationarity be fulfilled:
$$\alpha_1^2 + 2\beta_1^2 + 2\alpha_1\beta_1 < 1. \tag{9.52}$$
Then the process {zi} is stationary with variance
$$\mathrm{var}(z_i) = \left(\frac{\alpha_0}{1 - \alpha_1 - \beta_1}\right)^{2} \frac{1 - \alpha_1^2 - 2\alpha_1\beta_1}{1 - \alpha_1^2 - 2\beta_1^2 - 2\alpha_1\beta_1}. \tag{9.53}$$
The models of the type EACD, WACD, and GACD can be estimated by the
method of maximum likelihood due to specified distributions of εi (see, e.g., Tsay
(2002)).
Remark 9.6 Tsay (2002) constructed the following model WACD(1,1) for durations
in trading the IBM stock during five trading days (in total, 3534 durations after
eliminating the diurnal pattern of daily periodicity were used in this construction)
$$z_i = \tau_i \varepsilon_i, \qquad \tau_i = 0.169 + 0.885\, \tau_{i-1} + 0.064\, z_{i-1},$$
where Weibull distribution of εi (standardized by unit mean value) has the probability
density
$$f(x \mid \lambda) = \begin{cases} \lambda \left[\Gamma\!\left(1 + \frac{1}{\lambda}\right)\right]^{\lambda} x^{\lambda - 1} \exp\left\{ -\left[\Gamma\!\left(1 + \frac{1}{\lambda}\right) x\right]^{\lambda} \right\} & \text{for } x \ge 0, \\ 0 & \text{otherwise} \end{cases} \tag{9.54}$$
with the estimated parameter λ of size 0.879. Then the estimated mean duration
(9.51) after elimination of daily periodicity is 3.31 seconds, which coincides with the
duration estimated directly as the sample mean of the time series of periodically
adjusted durations.
⋄
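The model of Remark 9.6 can be simulated to see how the mean duration arises. The following is only a sketch: the function name, the burn-in length, and the seed are hypothetical choices, and the innovations are drawn from NumPy's standard Weibull generator and rescaled to unit mean so that they follow the density (9.54).

```python
import math
import numpy as np

def simulate_wacd11(n, alpha0, alpha1, beta1, lam, burn=1000, seed=0):
    """Simulate n durations z_i = tau_i * eps_i with
    tau_i = alpha0 + alpha1 * tau_{i-1} + beta1 * z_{i-1}."""
    rng = np.random.default_rng(seed)
    # rng.weibull(lam) has mean Gamma(1 + 1/lam); dividing gives unit mean
    eps = rng.weibull(lam, size=n + burn) / math.gamma(1.0 + 1.0 / lam)
    mean_duration = alpha0 / (1.0 - alpha1 - beta1)   # formula (9.51)
    tau_prev = z_prev = mean_duration                 # start at the mean
    z = np.empty(n + burn)
    for i in range(n + burn):
        tau = alpha0 + alpha1 * tau_prev + beta1 * z_prev
        z[i] = tau * eps[i]
        tau_prev, z_prev = tau, z[i]
    return z[burn:]                                   # drop the burn-in part

z = simulate_wacd11(200_000, 0.169, 0.885, 0.064, 0.879)
print(round(0.169 / (1.0 - 0.885 - 0.064), 2), z.mean())
```

The first printed number is the theoretical mean duration from (9.51); the sample mean of the simulated durations should lie close to it for a long simulated series.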
9.5 Exercises
Exercise 9.1 Apply the BDS test from Sect. 9.3 for (a) 100 simulated values of the
type iid N(0, 1); (b) the daily log returns of index PX in 2016 estimated as GARCH
(1, 1) in Example 8.2.
Chapter 10
Models of Development of Financial Assets
Important relations in modern finance and the corresponding econometric calculations
(e.g., calibrations of models in a given financial environment, simulations for
various financial scenarios, and the like) require modeling the development of
financial assets (prices, volumes, returns) in continuous time (see also random
processes in continuous time in Sects. 2.4 and 2.5). Such models mostly require
a relatively complex theoretical background denoted generally as stochastic calculus,
which is the classical calculus of derivatives and integrals modified for random
variables (including special instruments such as stochastic integrals, martingales,
Ito’s lemma, Wiener process, risk neutral probabilities, and others). In this chapter,
we outline basic principles of this methodology, but only within the scope that
makes it possible to cope with simple practical applications for financial time series (in any
case, there are references with technical and theoretical details, e.g., Campbell et al.
(1997), Cipra (2010), Duffie (1988), Elliot and Kopp (2004), Enders (1995), Franke
et al. (2004), Franses and van Dijk (2000), Gourieroux and Jasiak (2001), Hendry
(1995), Hull (1993), Karatzas and Shreve (1988), Kwok (1998), Lim (2011),
Malliaris and Brock (1982), McNeil et al. (2005), Mills (1993), Musiela and
Rutkowski (2004), Neftci (2000), Poon (2005), Rachev et al. (2007), Ruppert
(2004), Steele (2001), Taylor (1986), Tsay (2002), Wang (2003), and Wilmott
(2000)).
10.1 Financial Modeling in Continuous Time
In Sect. 2.1, we have defined the random (or stochastic) process {Yt, t ∈ T} in
continuous time as a set of random variables in the same probability space (Ω, ℑ, P)
indexed by means of values t from the set T = ⟨0, ∞) interpreted as time. The
continuous time is the necessary assumption for various financial schemes that
model (in a practically acceptable way) the price changes of financial assets, even
though in reality these prices are observed in discrete time moments only (however,
an awareness of the fact that the analyzed prices exist continually and change in time
unceasingly may serve as a motivation for their analysis).
https://doi.org/10.1007/978-3-030-46347-2_10
The models of Box–Jenkins methodology in discrete time from Chap. 6 (e.g., the
linear process (6.17)) are based on unpredictable discrete increments (innovations,
shocks) in the form of white noise. The analogy for models in continuous time can be
based on increments of Wiener process {Wt, t ≥ 0}. The properties (2.56) of Wiener
process can be rewritten by means of its increments $\Delta W_t = W_{t+\Delta t} - W_t$ as
$$\begin{cases} \text{(i)} & W_0 = 0; \\ \text{(ii)} & \text{the particular trajectories are continuous in time;} \\ \text{(iii)} & \Delta W_t = \varepsilon \sqrt{\Delta t} \ \text{ with a random variable } \varepsilon \sim N(0, 1); \\ \text{(iv)} & \Delta W_t \ \text{is independent of } W_s \ \text{for arbitrary } 0 \le s < t. \end{cases} \tag{10.1}$$
The property (i) can be formulated more generally as P(W0 = 0) = 1. If we delete the
property (ii), then such a process is called standard Brownian motion (however, it is
possible to show that every Brownian motion has a modification with continuous
trajectories). The property (iii) can be also written as
$$\Delta W_t \sim N(0, \Delta t). \tag{10.2}$$
In particular, the standard deviation of the process increment is equal to the square
root of the corresponding time increment. Finally, the assumption (iv) is so-called
Markov property (see also (2.48)), i.e., at time t, any future value Wt+h (h > 0)
depends only on the present value Wt, and not on previous values Ws (s < t). It means
consequently that the increments of the process are mutually independent for non-
overlapping time intervals (see Sect. 2.5.2). From the financial interpretation point of
view, this Markov property corresponds to so-called weakly efficient markets. There
are other specific properties of Wiener process suitable for financial modeling, e.g., it
holds (if we consider a “long” time increment from zero to t)
$$W_t \sim N(0, t) \quad \text{for } t \ge 0 \tag{10.3}$$
so that the variance of Wiener process grows linearly with increasing time.
As the trajectories of Wiener process are not differentiable at any point of time
(i.e., they have a derivative nowhere, even though they are continuous), one cannot
integrate them in the classical way and has to make use of so-called stochastic (Ito’s)
calculus.
10.1.1 Diffusion Process
The general scheme of financial models in continuous time is the diffusion process
(Ito’s process). If we denote small changes of a variable x as dx, then the usual form
of this process {Yt, t 0} is
dY t ¼ μðY t , t Þ dt þ σ ðY t , t Þ dW t for t 0, ð10:4Þ
where Wt is Wiener process. This model has the drift component μ(Yt, t)dt for
modeling the trend and the diffusion component σ(Yt, t)dWt for modeling the vola-
tility. Since the drift coefficient μ(Yt, t) and the diffusion coefficient σ(Yt, t) may
change in time (they depend on the time and even on the value of the process),
one has to use integration when solving the differential equation (10.4), i.e.,
Zt Zt
Yt ¼ Y0 þ μðY s , sÞds þ σ ðY s , sÞdW s for t 0, ð10:5Þ
0 0
where the second integral is stochastic (i.e., one integrates with respect to random
processes; see also Sect. 10.1.2) assuming so-called previsibility of process σ(Yt, t)
(i.e., independence on the future in terms of measurability of this process with
respect to the current and past information).
An important special case of Ito’s process is Wiener process with drift μ and
volatility σ (generalized Wiener process, arithmetic Wiener process) of the form
$$dY_t = \mu\,dt + \sigma\,dW_t \quad \text{for } t \ge 0 \tag{10.6}$$
($\sigma > 0$), which has constant drift and diffusion coefficients. It follows that
(assuming $Y_0 = 0$)
$$Y_t = \mu t + \sigma W_t \quad \text{for } t \ge 0 \tag{10.7}$$
and hence according to (10.3)
$$E(Y_t) = \mu t, \quad \operatorname{var}(Y_t) = \sigma^2 t. \tag{10.8}$$
Remark 10.1 When testing various financial scenarios of the development of prices
and returns of financial assets, one often uses simulations based on diffusion
processes. If, e.g., time is measured in years and one chooses the daily step
$\Delta t = 1/252$ (for 252 trading days in 1 year), then one can generate an iid sequence
$\varepsilon_1, \varepsilon_2, \ldots \sim N(0, 1)$ and construct (recursively) a simulated generalized Wiener process
(10.7) as
$$Y_{k\Delta t} = \mu\,k\,\Delta t + \sigma\sqrt{\Delta t}\,\sum_{j=1}^{k} \varepsilon_j \quad \text{for } k = 1, 2, \ldots \tag{10.9}$$
⋄
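The recursion (10.9) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simulate_generalized_wiener`, the seed handling, and the parameter values in the usage line are assumptions made here, not part of the original text.

```python
import math
import random


def simulate_generalized_wiener(mu, sigma, n_steps, dt=1 / 252, seed=None):
    """Simulate Y_{k*dt} = mu*k*dt + sigma*sqrt(dt)*sum_{j<=k} eps_j, as in (10.9).

    Returns the list [Y_0, Y_dt, ..., Y_{n_steps*dt}] with Y_0 = 0.
    """
    rng = random.Random(seed)
    path = [0.0]
    shock_sum = 0.0  # running sum of the iid N(0, 1) draws eps_j
    for k in range(1, n_steps + 1):
        shock_sum += rng.gauss(0.0, 1.0)
        path.append(mu * k * dt + sigma * math.sqrt(dt) * shock_sum)
    return path


# Hypothetical parameters: 10% drift, 20% volatility, one year of daily steps.
path = simulate_generalized_wiener(mu=0.10, sigma=0.20, n_steps=252, seed=1)
```

Setting `sigma=0` reduces the path to the deterministic trend μt, which is a quick sanity check of the drift term.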
Another special case of Ito’s process is the exponential Wiener process (some-
times called geometric Brownian motion; see Sects. 2.5.2 and 10.1.3).
10.1.2 Ito’s Lemma and Stochastic Integral
Stochastic calculus requires replacing the classical derivatives and integrals with
their stochastic variants, which are briefly described below (the theoretical
background of this complex discipline, including a technical construction of the
stochastic integral, can be found in various sources; see, e.g., Baxter and Rennie (1996),
Dupačová et al. (2002), Karatzas and Shreve (1988), Kwok (1998), Malliaris and
Brock (1982), Musiela and Rutkowski (2004), Neftci (2000), Wilmott (2000), and
others):
The basic principle of random differential (or equivalently stochastic differentiation)
is the well-known Ito’s lemma. Let us consider a diffusion process $\{Y_t, t \ge 0\}$
according to (10.4). Further let $f(y, t)$ be a continuous (nonrandom) function of
variable $y$ and time $t$ with continuous partial derivatives $f_y = \partial f/\partial y$, $f_{yy} = \partial^2 f/\partial y^2$,
and $f_t = \partial f/\partial t$. Then the transformed process $f(Y_t, t)$ fulfills (so-called Ito’s lemma):
$$df(Y_t, t) = \left( f_y\,\mu(Y_t, t) + f_t + \frac{1}{2} f_{yy}\,\sigma^2(Y_t, t) \right) dt + f_y\,\sigma(Y_t, t)\,dW_t \quad \text{for } t \ge 0. \tag{10.10}$$
If, e.g., $f(W_t, t) = W_t^2$, then $f_y = 2W_t$, $f_{yy} = 2$, and $f_t = 0$, so that the differential of
the squared Wiener process fulfills
$$dW_t^2 = dt + 2W_t\,dW_t \quad \text{for } t \ge 0, \tag{10.11}$$
which differs significantly from the classical (nonrandom) differential $d(x^2) = 2x\,dx$.
The random integral represents an inverse operation to the random differential
(stochastic differentiation) described above, i.e.,
$$\int_0^t dY_s = Y_t - Y_0. \tag{10.12}$$
In particular, due to the zero initial value of the Wiener process, it holds
$$\int_0^t dW_s = W_t. \tag{10.13}$$
When integrating both sides of relation (10.11), one obtains
$$W_t^2 = t + 2\int_0^t W_s\,dW_s, \tag{10.14}$$
which implies another important formula of stochastic integration of the Wiener process
$$\int_0^t W_s\,dW_s = \frac{1}{2}\left( W_t^2 - t \right) \tag{10.15}$$
(again it differs significantly from the classical (nonrandom) integral $\int_0^t x\,dx = t^2/2$).
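Formula (10.15) can be checked numerically by approximating the stochastic integral with a left-endpoint (Ito) sum on a fine grid. The following is a rough sketch under assumed settings (the helper name `ito_sum_w_dw`, the grid size, and the seed are illustrative choices): the Ito sum should be close to ½(W_t² − t), not to the classical value ½W_t².

```python
import math
import random


def ito_sum_w_dw(n_steps, t=1.0, seed=0):
    """Approximate int_0^t W_s dW_s by the sum of W_{s_i} * (W_{s_{i+1}} - W_{s_i})."""
    rng = random.Random(seed)
    dt = t / n_steps
    w = 0.0        # current value of the simulated Wiener path
    ito_sum = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # increment ~ N(0, dt)
        ito_sum += w * dw                   # integrand taken at the left endpoint
        w += dw
    return ito_sum, w


ito_sum, w_t = ito_sum_w_dw(100_000)
closed_form = 0.5 * (w_t ** 2 - 1.0)  # right-hand side of (10.15) with t = 1
classical = 0.5 * w_t ** 2            # what the nonrandom rule would suggest
```

The gap between `ito_sum` and `closed_form` equals half the difference between the realized sum of squared increments and t, which shrinks as the grid is refined; the gap to `classical` stays near t/2.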
10.1.3 Exponential Wiener Process
One of the most utilized processes for modeling prices $\{P_t, t \ge 0\}$ of financial assets
(e.g., stocks) in continuous time is the exponential Wiener process (geometric Brownian
motion) defined as Ito’s process of the form
$$dP_t = \mu P_t\,dt + \sigma P_t\,dW_t \quad \text{for } t \ge 0 \tag{10.16}$$
($\sigma > 0$). The discretized version of (10.16)
$$\frac{\Delta P_t}{P_t} = \mu\,\Delta t + \sigma\,\Delta W_t \tag{10.17}$$
indicates that one in fact models the returns of the given asset by means of a
(deterministic) drift component $\mu\Delta t$ and a (random) diffusion component $\sigma\Delta W_t \sim N(0, \sigma^2\Delta t)$.
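However, in practice we often model the logarithmic price
$$p_t = \ln P_t, \tag{10.18}$$
since then we easily obtain by discrete differencing the log return $r_t = p_t - p_{t-1}$ (see
(8.1)). Obviously, it holds
$$\frac{\partial \ln P_t}{\partial P_t} = \frac{1}{P_t}, \quad \frac{\partial^2 \ln P_t}{\partial P_t^2} = -\frac{1}{P_t^2}, \quad \frac{\partial \ln P_t}{\partial t} = 0,$$
so that the logarithmic transformation of the process $P_t$ from (10.16) fulfills
according to Ito’s lemma
$$dp_t = d\ln P_t = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma\,dW_t \quad \text{for } t \ge 0. \tag{10.19}$$
Therefore, the logarithmic price has the drift coefficient $\mu - \sigma^2/2$ and the diffusion
coefficient (or volatility) $\sigma$. The differential equation (10.19) may be solved by
integrating both of its sides (see Sect. 10.1.2)
$$\int_0^t dp_s = \int_0^t \left( \mu - \frac{\sigma^2}{2} \right) ds + \int_0^t \sigma\,dW_s, \tag{10.20}$$
i.e.,
$$p_t = p_0 + \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \quad \text{for } t \ge 0 \tag{10.21}$$
or equivalently for the original price $P_t$ (i.e., after removing the logarithms)
$$P_t = \exp\left( p_0 + \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right) = P_0 \exp\left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right) \quad \text{for } t \ge 0. \tag{10.22}$$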
The exponential Wiener process is usually presented just in this exponential form,
which justifies its name (the form (10.22) can be also compared with the
reparameterized version (2.59) of this process).
The formulas (10.21) and (10.22) imply that it holds (conditionally on the price
$p_0 = \ln P_0$ at time $t = 0$)
$$p_t \sim N\left( p_0 + \left( \mu - \frac{\sigma^2}{2} \right) t,\ \sigma^2 t \right), \quad P_t \sim LN\left( p_0 + \left( \mu - \frac{\sigma^2}{2} \right) t,\ \sigma^2 t \right), \tag{10.23}$$
where $LN(\mu, \sigma^2)$ denotes a random variable with lognormal distribution, which has
the form of the exponential function $\exp(X)$ of a random variable $X \sim N(\mu, \sigma^2)$. Then one
can show (again conditionally on the price $p_0 = \ln P_0$) that, e.g.:
$$E(P_t) = P_0 \exp(\mu t), \quad \operatorname{var}(P_t) = P_0^2 \exp(2\mu t)\left( \exp(\sigma^2 t) - 1 \right). \tag{10.24}$$
In particular, the drift coefficient $\mu$ represents the average annual log return due to the
price changes of the given asset (see (8.1) rewritten to the form $P_t = P_{t-1}\exp(r_t)$ for log
returns $\{r_t\}$).
Remark 10.2 The relations (10.21) and (10.22) can be used when simulating the
development of prices $p_t$ or $P_t$ (see also Remark 10.1). Note also according to (10.21)
that if we apply the exponential Wiener process for modeling the price of a given
financial asset, then the log return $r_t = p_t - p_{t-1}$ is white noise with probability
distribution
$$r_t \sim N\left( \mu - \frac{\sigma^2}{2},\ \sigma^2 \right). \tag{10.25}$$
⋄
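As a rough illustration of Remark 10.2, a price path can be built by accumulating normally distributed log returns, which is the discrete counterpart of (10.21) and (10.22) for a time step of length `dt`. The function name `simulate_gbm`, the default step of 1/250, and the usage parameters are assumptions of this sketch.

```python
import math
import random


def simulate_gbm(p0, mu, sigma, n_steps, dt=1 / 250, seed=None):
    """Simulate prices P_0, P_dt, ... of the exponential Wiener process (10.22).

    Each log return over a step dt is N((mu - sigma^2/2)*dt, sigma^2*dt).
    """
    rng = random.Random(seed)
    log_price = math.log(p0)
    prices = [p0]
    for _ in range(n_steps):
        log_return = (mu - 0.5 * sigma ** 2) * dt \
            + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        log_price += log_return
        prices.append(math.exp(log_price))
    return prices


# Hypothetical parameters: start at 100, 8% drift, 25% volatility, one year.
prices = simulate_gbm(p0=100.0, mu=0.08, sigma=0.25, n_steps=250, seed=42)
```

Working on the log scale guarantees positive prices; with `sigma=0` the path collapses to the deterministic growth P_0 exp(μt).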
In financial practice, both unknown parameters $\mu$ and $\sigma^2$ of the exponential Wiener
process can be statistically estimated from observed data (so-called calibration
of the model in a given financial environment):
Let, e.g., $r_1, \ldots, r_n$ be the corresponding log returns measured over regular time
intervals of length $\Delta$, which is a given fraction of a year (mostly daily log returns are
used, i.e., $\Delta \approx 1/250$ for 250 trading days in 1 year). Then, similarly as in (10.25),
it holds
$$E(r_t) = \left( \mu - \frac{\sigma^2}{2} \right) \Delta, \quad \operatorname{var}(r_t) = \sigma^2 \Delta. \tag{10.26}$$
The given data set usually makes it possible to compute the sample mean and sample variance
as
$$\bar{r} = \frac{1}{n} \sum_{t=1}^{n} r_t, \quad s_r^2 = \frac{1}{n-1} \sum_{t=1}^{n} (r_t - \bar{r})^2. \tag{10.27}$$
Comparing the theoretical values (10.26) with the sample values (10.27) for larger $n$ (the
sample values can obviously be used as consistent estimates of the theoretical values),
one finally obtains the estimates of $\mu$ and $\sigma$ as
$$\hat{\sigma} = \frac{s_r}{\sqrt{\Delta}}, \quad \hat{\mu} = \frac{\bar{r}}{\Delta} + \frac{\hat{\sigma}^2}{2} = \frac{\bar{r}}{\Delta} + \frac{s_r^2}{2\Delta} \tag{10.28}$$
(moreover, e.g., the standard deviation of the estimate $\hat{\sigma}$, which serves as an error of
this estimate, can be estimated by $\hat{\sigma}/\sqrt{2n}$).
Example 10.1 In this example, we estimate the exponential Wiener process for
index PX50 (250 trading days in year 2004; see Table 10.1 and Fig. 10.1 on the left).
By means of the formulas (10.27), one easily estimates for the data sample in
Table 10.1
$$\bar{r} = 0.001\,782, \quad s_r = 0.009\,895.$$
If $\Delta = 1/250$, then using the formulas (10.28) one obtains further
$$\hat{\sigma} = \frac{0.009\,895}{\sqrt{1/250}} = 0.156\,454 \approx 15.6\,\%, \quad \hat{\mu} = \frac{0.001\,782}{1/250} + \frac{0.156\,454^2}{2} = 0.457\,739 \approx 45.8\,\%$$
(in particular, the error of the volatility estimate is 0.7%), so that the average annual log
return of the index PX50 was 45.8%.
Using the estimated parameters $\mu$ and $\sigma^2$ in the relation (10.22), simulations of the
index PX50 are possible similarly to those in Remark 10.1. One such simulation is
plotted in Fig. 10.1 on the right (with the true observations on the left for
comparison). It is apparent that in such simulations the high drift coefficient
45.8% prevails over the volatility 15.6% so that the simulated trajectories are often
very distinctly increasing.
⋄
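The arithmetic of Example 10.1 can be retraced with a short sketch of the estimators (10.28). The function name `calibrate_gbm` is an assumption; the inputs are the sample mean and sample standard deviation of daily log returns quoted in the example, and n = 250 is used for the error of the volatility estimate, as the rounded 0.7% in the text suggests.

```python
import math


def calibrate_gbm(mean_log_return, std_log_return, dt):
    """Estimate (mu, sigma) of the exponential Wiener process via (10.28)."""
    sigma_hat = std_log_return / math.sqrt(dt)
    mu_hat = mean_log_return / dt + 0.5 * sigma_hat ** 2
    return mu_hat, sigma_hat


# Sample moments quoted in Example 10.1, daily step of 1/250 year.
mu_hat, sigma_hat = calibrate_gbm(0.001782, 0.009895, 1 / 250)

# Approximate standard error of sigma_hat: sigma_hat / sqrt(2n), with n = 250 assumed.
sigma_error = sigma_hat / math.sqrt(2 * 250)

print(f"sigma_hat = {sigma_hat:.6f}, mu_hat = {mu_hat:.6f}, error = {sigma_error:.4f}")
```

The estimates returned by this sketch could be passed to a simulation of (10.22) to produce trajectories like the one described for Fig. 10.1 on the right.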
10.2 Black–Scholes Formula
The previous continuous-time models make it possible to value so-called financial
derivatives (options, futures, swaps, and others). Financial derivatives
(or derivative securities) are securities, whose values are dependent on (“derived”
from) the values of other more basic underlying variables, which are often traded
prices of so-called underlying assets (stocks or bonds, currencies, commodities,
stock market indices, and the like). For instance, the options are securities in the
form of contracts agreed at time t, which guarantee to their holder the right (but not
the duty) to do a specified asset transaction (e.g., to buy a stock for a fixed price) till a
future date T (t < T ). The well-known Black–Scholes formula is broadly used in
practice to evaluate option premiums (i.e., the prices of options; see, e.g., Cipra
(2010), Hull (1993), and others). The key principle of the corresponding methodol-
ogy (including the derivation of Black–Scholes formula) consists in the assumption
of no arbitrage (the arbitrage consists in opportunities of riskless yield due to
different simultaneous prices of possible buy or sell positions for underlying assets).
One can derive Black–Scholes formula solving the following general problem.
We look for an explicit form of unknown function F evaluating a given financial
derivative (e.g., an option) under the following assumptions:
Table 10.1 Index PX50 in the year 2004 (values for 250 trading days written in columns) from Example 10.1 (see also Fig. 10.1 on the left)
1 2 3 4 5 6 7 8 9 10
1 662.10 726.00 789.70 846.00 759.70 796.20 800.90 837.60 883.80 999.90
2 670.30 732.10 786.00 847.40 770.20 793.50 790.50 842.10 874.40 1020.90
3 674.10 741.40 796.20 835.70 776.10 800.50 786.40 846.90 896.50 1021.90
4 667.50 754.20 802.30 837.50 784.80 796.40 793.40 849.00 904.40 1010.90
5 665.10 748.10 805.20 834.90 786.50 789.90 785.40 856.40 910.50 1017.70
6 663.20 748.60 797.80 844.20 778.70 786.40 782.50 857.50 905.50 1029.70
7 674.90 745.40 804.20 850.40 781.80 784.10 784.60 860.70 907.50 1030.10
8 676.90 748.40 813.20 842.50 781.60 787.00 790.50 850.50 906.20 1021.00
9 678.80 749.60 817.50 818.00 784.70 793.40 783.80 855.40 892.90 1024.20
10 684.40 751.00 813.50 816.70 795.30 787.50 782.20 860.20 911.90 1012.60
11 687.80 751.60 812.70 799.60 793.10 779.70 785.10 849.80 916.80 981.00
12 688.50 746.70 823.50 794.40 784.20 784.80 785.00 847.60 936.60 992.10
13 688.20 758.00 823.80 789.10 782.10 786.50 791.60 847.40 935.70 994.00
14 693.40 760.20 830.20 772.50 779.00 787.10 797.20 875.40 933.00 1015.20
15 692.40 774.30 837.00 741.50 779.00 780.70 796.20 875.40 940.90 1019.50
16 687.80 775.80 845.70 763.70 783.40 770.80 801.20 872.10 933.40 1020.10
17 687.30 775.50 842.00 763.40 787.50 779.20 804.20 864.10 938.80 1013.00
18 689.90 776.40 848.70 749.20 804.40 784.30 807.30 872.00 949.40 1009.60
19 691.70 784.40 853.60 750.80 807.30 785.60 816.00 878.50 957.10 1002.00
20 691.90 792.00 850.20 739.00 797.90 786.10 817.60 891.60 954.60 1007.10
21 698.20 811.10 854.30 744.90 794.70 794.10 817.00 884.50 958.10 1011.00
22 703.20 809.60 835.10 762.30 805.10 790.20 819.10 902.30 960.60 1015.70
23 709.60 801.70 838.10 760.70 799.90 787.30 831.40 899.10 994.60 1020.50
24 714.30 790.70 831.50 759.00 798.80 794.30 828.90 900.50 1001.30 1030.90
25 719.50 793.60 836.40 763.10 792.50 795.00 837.70 889.50 998.30 1032.00
Source: Prague Stock Exchange (https://www.pse.cz/en/indices/index-values/detail/XC0009698371?tab=detail-history)

Fig. 10.1 Index PX50 in the year 2004 (on the left) and its simulation by means of estimated
exponential Wiener process from Example 10.1 (on the right). Panel titles: index PX50 (in year 2004);
simulation of index PX50 (for year 2004)
• F is a function of time t and the price $P_t$ of the underlying asset at time t (e.g., stock,
currency, crude oil, stock index). For simplicity, one writes $F_t = F(P_t, t)$.
• The value FT is determined by a boundary condition at maturity time T (t < T )
according to the type of financial derivative.
For example, let us consider a (European) call option (simply call) which gives its
buyer (holder in long position) the right to buy at maturity time T an underlying asset
for a preset price X (exercise price or strike price) even though the market price of
the given asset at time T is PT. The call option can be purchased at time t for a price
Callt (so-called call premium), where one knows only the asset price Pt at time t, but
not the future price PT at time T. The seller of this call (underwriter in short position)
must sell the underlying asset at maturity time according to the holder’s decision. In
this case, the boundary condition for the function $F_t$ ($= \mathrm{Call}_t$) obviously has the form
$$F_T = \max(P_T - X, 0) \tag{10.29}$$
(European options can be exercised only at the maturity date, while American
options at any time up to the maturity date). Similarly, a (European) put option
(simply put) gives its buyer (holder in long position) the right to sell at maturity time
T an underlying asset for a preset price X. The put option can be purchased at time
t for a price Putt (put premium). The seller of this put (underwriter in short position)
must buy the underlying asset at maturity time according to the holder’s decision. In
this case, the boundary condition for the function $F_t$ ($= \mathrm{Put}_t$) has the form
$$F_T = \max(X - P_T, 0). \tag{10.30}$$
Generally, if the price $P_t$ of the underlying asset behaves as an exponential Wiener
process (10.16), then according to Ito’s lemma (10.10) we get
$$dF_t = \left( \frac{\partial F_t}{\partial P_t}\,\mu P_t + \frac{\partial F_t}{\partial t} + \frac{1}{2} \frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) dt + \frac{\partial F_t}{\partial P_t}\,\sigma P_t\,dW_t. \tag{10.31}$$
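One can rewrite the previous relations in the following discrete form:
$$\Delta P_t = \mu P_t\,\Delta t + \sigma P_t\,\Delta W_t, \tag{10.32}$$
$$\Delta F_t = \left( \frac{\partial F_t}{\partial P_t}\,\mu P_t + \frac{\partial F_t}{\partial t} + \frac{1}{2} \frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) \Delta t + \frac{\partial F_t}{\partial P_t}\,\sigma P_t\,\Delta W_t. \tag{10.33}$$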
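Let us now construct the so-called replicating (“self-financing”) portfolio
$$V_t = -F_t + \frac{\partial F_t}{\partial P_t}\,P_t. \tag{10.34}$$
In the framework of this portfolio, one owns the underlying asset of size $\partial F_t/\partial P_t$
(with price $P_t$ per unit) and simultaneously owes the considered financial derivative
of unit size (with price $F_t$ per unit). It holds
$$\Delta V_t = -\Delta F_t + \frac{\partial F_t}{\partial P_t}\,\Delta P_t = \left( -\frac{\partial F_t}{\partial t} - \frac{1}{2} \frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) \Delta t, \tag{10.35}$$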
i.e., the changes of this portfolio do not include the random component ΔWt.
Therefore, the portfolio $V_t$ is riskless in the framework of small time changes $\Delta t$
and must earn during such time changes the same as other riskless investments with
the riskless interest rate $r_f$ (free of risk), i.e.:
$$\Delta V_t = r_f\,V_t\,\Delta t \tag{10.36}$$
(otherwise one could break the no-arbitrage rule; see above). Substituting (10.34)
and (10.35) into (10.36) one obtains
$$\left( \frac{\partial F_t}{\partial t} + \frac{1}{2} \frac{\partial^2 F_t}{\partial P_t^2}\,\sigma^2 P_t^2 \right) \Delta t = r_f \left( F_t - \frac{\partial F_t}{\partial P_t}\,P_t \right) \Delta t. \tag{10.37}$$
We have thus derived the so-called Black–Scholes differential equation for evaluating (pricing)
financial derivatives
$$\frac{\partial F_t}{\partial t} + r_f P_t \frac{\partial F_t}{\partial P_t} + \frac{1}{2}\,\sigma^2 P_t^2 \frac{\partial^2 F_t}{\partial P_t^2} = r_f F_t. \tag{10.38}$$
The solution of this equation under the boundary condition (10.29) is the well-known
Black–Scholes formula for the (European) call premium (see also (8.20)):
$$\mathrm{Call}_t = P_t\,\Phi(d_+) - X e^{-r_f (T - t)}\,\Phi(d_-), \tag{10.39}$$
where
$$d_+ = \frac{\ln(P_t/X) + \left( r_f + \sigma^2/2 \right)(T - t)}{\sigma\sqrt{T - t}}, \quad d_- = \frac{\ln(P_t/X) + \left( r_f - \sigma^2/2 \right)(T - t)}{\sigma\sqrt{T - t}} = d_+ - \sigma\sqrt{T - t},$$
$\Phi(\cdot)$ is the distribution function of $N(0, 1)$ and the remaining symbols are described
in the previous text. Similarly, the Black–Scholes formula for the (European) put
premium is
$$\mathrm{Put}_t = X e^{-r_f (T - t)}\,\Phi(-d_-) - P_t\,\Phi(-d_+). \tag{10.40}$$
Moreover, the so-called put–call parity holds:
$$\mathrm{Put}_t = \mathrm{Call}_t + X e^{-r_f (T - t)} - P_t. \tag{10.41}$$
Example 10.2 Let us consider a call option to buy a stock with exercise price
50 EUR. Three months before maturity of this European option, the price of the stock is
53 EUR. What is the call premium according to the Black–Scholes formula (the price
volatility of the stock has been estimated as 0.50 p.a. and the corresponding riskless
interest rate is 5% p.a.)?
According to (10.39) for $P_t = 53$, $X = 50$, $T - t = 3/12 = 0.25$, $\sigma = 0.50$ and
$r_f = 0.05$, one obtains
$$d_+ = \frac{\ln(53/50) + \left( 0.05 + 0.50^2/2 \right) \cdot 0.25}{0.50\sqrt{0.25}} = 0.4081, \quad \Phi(0.4081) = 0.6584,$$
$$d_- = 0.4081 - 0.50\sqrt{0.25} = 0.1581, \quad \Phi(0.1581) = 0.5628,$$
so that
$$\mathrm{Call}_t = 53 \cdot 0.6584 - 50 \cdot e^{-0.05 \cdot 0.25} \cdot 0.5628 = 7.10 \text{ EUR}.$$
Analogously, the European put option premium according to (10.40) or (10.41)
would be $\mathrm{Put}_t = 3.48$ EUR.
⋄
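The computation in Example 10.2 can be reproduced with a minimal sketch of formulas (10.39) and (10.40). The helper names `norm_cdf` and `black_scholes` are assumptions of this sketch; the standard normal distribution function is expressed through the error function from Python's `math` module.

```python
import math


def norm_cdf(x):
    """Distribution function of N(0, 1), written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(price, strike, rate, sigma, tau):
    """European call and put premiums per (10.39) and (10.40); tau = T - t in years."""
    d_plus = (math.log(price / strike) + (rate + 0.5 * sigma ** 2) * tau) \
        / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    discount = math.exp(-rate * tau)
    call = price * norm_cdf(d_plus) - strike * discount * norm_cdf(d_minus)
    put = strike * discount * norm_cdf(-d_minus) - price * norm_cdf(-d_plus)
    return call, put


# Inputs of Example 10.2: P_t = 53, X = 50, r_f = 5% p.a., sigma = 0.50, three months.
call, put = black_scholes(price=53.0, strike=50.0, rate=0.05, sigma=0.50, tau=0.25)
print(f"call = {call:.2f} EUR, put = {put:.2f} EUR")
```

The two premiums returned by the function also satisfy the put–call parity (10.41), which gives an independent check of the implementation.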
10.3 Modeling of Term Structure of Interest Rates
The structure of interest rates is usually presented as the behavior of yields to maturity
(YTM) in time for investments that have the generic form of zero-coupon bonds.
Such bonds may be regarded as investment loans, which makes it possible to trade loans as
discount securities.
Even though the structure of interest rates depends on a number of various
factors, here we shall deal with its dependence on the time of maturity only
(so-called term structure of interest rates). In this context, one usually denotes the
relation between the yield to maturity and the time of maturity as a yield curve. The
yield curves in practice are mostly constructed for the whole classes of bonds of
similar characteristics which are mainly the type of bond issuer (e.g., government
bonds) and rating (e.g., AA). Since the yields of government bonds may be usually
regarded as riskless (so-called risk-free yield), the yield curves of “risk” bonds are
often constructed in such a way that one shifts a suitable yield curve of government
bonds by credit spread which is considered as a risk premium corresponding to the
risk rating of the evaluated bond.
Moreover, one distinguishes so-called spot and forward yield curves: spot yield
curve describes the dependence of the yield to maturity on the time of maturity,
where time is measured from the actual (spot) moment. Forward yield curve
measures this dependence starting from a future moment (which must be exactly
specified in advance). In this context, one uses the following denotation:
1. The symbol $P(t, T)$ denotes the price of a zero-coupon bond with time of maturity
$T$ and unit nominal value (i.e., $P(T, T) = 1$) at time $t$ ($t < T$). The continuous yield
to maturity of this bond at time $t$, denoted as $R(t, T)$, is
$$R(t, T) = -\frac{\ln P(t, T)}{T - t}, \tag{10.42}$$
which follows from the formula of continuous discounting (see, e.g., Cipra
(2010)):
$$P(t, T) = e^{-R(t, T)(T - t)}. \tag{10.43}$$
In financial theory, the yield to maturity $R(t, \cdot)$ considered as a function of
argument $T$ ($T \ge t$) is usually called the yield curve at time $t$ (while the price of
the bond $P(t, \cdot)$ considered as a function of argument $T$ is called the discount curve at
time $t$).
2. In particular, the value
$$r_t = R(t, t) = -\left. \frac{\partial \ln P(t, T)}{\partial T} \right|_{T = t} = -\frac{\partial \ln P(t, t)}{\partial T} \tag{10.44}$$
is called the instantaneous interest rate at time $t$ (since $r_t$ is the limit value of $R(t, T)$ for
$T \to t$). The adjective “instantaneous” expresses the fact that if this
interest rate is applied to a capital $K$ at time $t$, then the capital accrual during a short time
interval $\Delta t$ is approximately $\Delta K = K r_t \Delta t$, even if more correctly one should
write the differential relation $dK = K r_t\,dt$. Obviously, the instantaneous rate $r_t$
represents the interest intensity at time $t$ independently of the time of maturity of the
corresponding loan (investment).
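The relations (10.42) and (10.43) are inverse to each other, which a small sketch makes concrete. The function names and the numerical values below are illustrative assumptions only.

```python
import math


def yield_to_maturity(price, t, maturity):
    """Continuous yield R(t, T) of a unit zero-coupon bond, as in (10.42)."""
    return -math.log(price) / (maturity - t)


def zero_coupon_price(yield_rate, t, maturity):
    """Price P(t, T) implied by continuous discounting, as in (10.43)."""
    return math.exp(-yield_rate * (maturity - t))


# Hypothetical bond: 5 years to maturity, priced at 0.80 per unit of nominal value.
bond_yield = yield_to_maturity(0.80, t=0.0, maturity=5.0)
recovered_price = zero_coupon_price(bond_yield, t=0.0, maturity=5.0)
```

Evaluating `yield_to_maturity` for prices of bonds with several maturities observed at the same time t gives discrete points of the yield curve R(t, ·).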
Most models of term structure of interest rates assume that rt is the diffusion
process of the form (see (10.4))
$$dr_t = a(r_t, t)\,dt + b(r_t, t)\,dW_t. \tag{10.45}$$
The models used in practice have specified forms of drift coefficient a(rt, t) and
diffusion coefficient b(rt, t). Their main outputs are explicit formulas for instanta-
neous interest rate rt and bond price P(rt, t, T ) by means of these coefficients and
consequently an explicit formula for yield curve R(rt, t, T ) according to (10.42). If
we succeed in estimating the chosen model (10.45) for observed financial data
(at time t one observes the bond prices P(t, Tk) with various times of maturity Tk;
see also Remark 10.4), then we can:
• Estimate continuous yield curves using only limited volume of data.
• Perform various simulations (for yield curves, instantaneous interest rates, and
the like).
In practice, one constructs (see, e.g., Baxter and Rennie (1996), Cipra (2010),
Hull (1993)):
• Single-factor interest rate models that include only one interest rate factor rt (e.g.,
models by Vasicek, Cox–Ingersoll–Ross, Hull–White, Ho-Lee, Black–Derman–
Toy, Black–Karasinski, and others).
• Binomial tree models (e.g., models by Rendleman–Bartter, Jarrow-Rudd, and
others).
• Multi-factor interest rate models that include several interest rate factors (e.g.,
models by Brennan–Schwartz, Fong–Vasicek, Longstaff–Schwartz, and others).
1. Vasicek Model
The Vasicek model (also Ornstein–Uhlenbeck process or mean-reverting model) is based
on the diffusion equation (10.45) in the form
$$dr_t = \alpha(\gamma - r_t)\,dt + b\,dW_t \tag{10.46}$$
with parameters $\alpha > 0$, $\gamma \in \mathbb{R}$, $b \in \mathbb{R}$. The process $r_t$ fluctuates around a constant level
$\gamma$, but the trend coefficient $\alpha(\gamma - r_t)$ reverts the values $r_t$ that deviated too much from
$\gamma$ back to this level (it is essential for such a behavior of $r_t$ that the volatility $b$ is
constant).
Remark 10.3 Vasicek (1977) showed that the yield curve corresponding to the Vasicek
model has the form
$$R(r_t, t, T) = \gamma + \frac{bq}{\alpha} - \frac{b^2}{2\alpha^2} - \frac{1}{\alpha T}\left( 1 - e^{-\alpha T} \right)\left( \gamma + \frac{bq}{\alpha} - \frac{b^2}{2\alpha^2} - r_t \right) + \frac{b^2}{4\alpha^3 T}\left( 1 - e^{-\alpha T} \right)^2, \tag{10.47}$$
where $q = q(r_t, t)$ is the so-called market price of risk (if no arbitrage opportunities exist,
then $q$ does not depend on the time of maturity $T$). The interpretation of $q$ can be
shown symbolically (not writing arguments of variables for simplicity). If one writes
the stochastic differential equation for $P = P(r_t, t, T)$ symbolically as $dP = \mu P\,dt + \sigma P\,dW$
(applying Ito’s lemma to $P(r_t, t, T)$ and (10.46)), then the market price
of risk is defined as $q = (\mu - r)/\sigma$. The equality $\mu - r = q\sigma$ can be interpreted in
such a way that the expected yield $\mu$ exceeding $r$ compensates the risk $q\sigma$.
Moreover, $R(r_t, t, T)$ increases, or reverses from increase to decrease, or decreases
if $r_t \le \gamma + bq/\alpha - 3b^2/(4\alpha^2)$, or $\gamma + bq/\alpha - 3b^2/(4\alpha^2) < r_t < \gamma + bq/\alpha$, or
$r_t \ge \gamma + bq/\alpha$, respectively.
⋄
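The yield curve of Remark 10.3 can be evaluated with a short function. This is a sketch under two assumptions made here: the symbol T in (10.47) is read as the time to maturity (named `tau` in the code), and the parameter values in the usage lines are purely hypothetical, with q = 0 chosen only for simplicity.

```python
import math


def vasicek_yield(r_t, tau, alpha, gamma, b, q):
    """Yield for time to maturity tau > 0 in the Vasicek model, following (10.47)."""
    # Long-run level of the curve (limit for tau -> infinity).
    r_inf = gamma + b * q / alpha - b ** 2 / (2 * alpha ** 2)
    decay = -math.expm1(-alpha * tau)  # 1 - exp(-alpha*tau), computed stably
    return (r_inf
            - (r_inf - r_t) * decay / (alpha * tau)
            + b ** 2 * decay ** 2 / (4 * alpha ** 3 * tau))


# Hypothetical parameters (not taken from the text).
params = dict(alpha=0.5, gamma=0.05, b=0.02, q=0.0)
curve = [vasicek_yield(0.03, tau, **params) for tau in (0.5, 1, 2, 5, 10, 30)]
```

For very short maturities the value approaches the instantaneous rate r_t, and for very long maturities it approaches γ + bq/α − b²/(2α²), which matches the limits of (10.47).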
2. Cox–Ingersoll–Ross Model
This model, denoted briefly as the CIR model, is based on the diffusion equation (10.45)
in the form
$$dr_t = \alpha(\gamma - r_t)\,dt + b\sqrt{r_t}\,dW_t \tag{10.48}$$
with parameters $\alpha > 0$, $\gamma \in \mathbb{R}$, $b \in \mathbb{R}$. In contrast to the Vasicek model, the volatility in
this model is not constant but proportional to the value $\sqrt{r_t}$ so that the corresponding
solution $P(r_t, t, T)$ cannot attain negative values.
Remark 10.4 In the statistical and econometric literature, various methods have been suggested
for estimating diffusion processes (10.45) directly from discrete data. Several
approaches can be distinguished:
• ML estimate: see, e.g., Lo (1988).
• QML estimate (quasi-maximum likelihood): see, e.g., Kessler (1997).
• Moment estimate: see, e.g., Conley et al. (1997).
• Nonparametric estimate: see, e.g., Ait-Sahalia (1996).
• Semiparametric estimate: see, e.g., Gallant and Long (1997).
• MCMC (Markov chain Monte Carlo) approach by means of simulations: see,
e.g., Elerian et al. (2001), Eraker (2001).
⋄
Fig. 10.2 Simulation of instantaneous interest rate $r_t$ by means of Vasicek model (10.46) and
CIR model (10.48) ($\alpha = 0.1$, $\gamma = 0.05$, $b = 0.03$). Plotted series: Vasicek, CIR
Example 10.3 Figure 10.2 plots one trajectory of simulations of the instantaneous
interest rate $r_t$ using 250 regular time intervals of length $\Delta = 1/250$ (see Remark
10.2) by means of the Vasicek model
$$dr_t = 0.1\,(0.05 - r_t)\,dt + 0.03\,dW_t$$
and the CIR model
$$dr_t = 0.1\,(0.05 - r_t)\,dt + 0.03\sqrt{r_t}\,dW_t.$$
Both trajectories fluctuate around the level 5%, but the trajectory of the CIR model is
much more stable (the trajectory of the Vasicek model might even sink to negative rates
in longer simulations).
⋄
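A simulation of the kind described in Example 10.3 can be sketched with a simple Euler discretization of (10.46) and (10.48). Several points are assumptions of this sketch rather than statements of the text: the discretization scheme itself, the starting value r_0 = 0.05, the seed, the function name `simulate_short_rate`, and the truncation of negative rates inside the square root of the CIR step.

```python
import math
import random


def simulate_short_rate(model, r0, alpha, gamma, b, n_steps=250, dt=1 / 250, seed=None):
    """Euler scheme for dr = alpha*(gamma - r)*dt + diffusion*dW.

    model == "vasicek": diffusion = b; model == "cir": diffusion = b*sqrt(r).
    """
    rng = random.Random(seed)
    rates = [r0]
    for _ in range(n_steps):
        r = rates[-1]
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Wiener increment ~ N(0, dt)
        if model == "vasicek":
            diffusion = b
        elif model == "cir":
            # Truncation at zero keeps the square root defined in the discrete scheme.
            diffusion = b * math.sqrt(max(r, 0.0))
        else:
            raise ValueError("model must be 'vasicek' or 'cir'")
        rates.append(r + alpha * (gamma - r) * dt + diffusion * dw)
    return rates


# Coefficients of Example 10.3; the starting rate 0.05 is an assumption.
vasicek_path = simulate_short_rate("vasicek", 0.05, alpha=0.1, gamma=0.05, b=0.03, seed=7)
cir_path = simulate_short_rate("cir", 0.05, alpha=0.1, gamma=0.05, b=0.03, seed=7)
```

Using the same seed for both models drives them with identical random shocks, so the difference between the two paths comes only from the factor √r_t in the CIR diffusion term.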
10.4 Exercises
Exercise 10.1 Repeat the simulations analysis from Example 10.3 using different
values of coefficients for Vasicek and CIR model and compare their graphs.
Chapter 11
Value at Risk
The VaR (value at risk) methodology and its modifications are common measures of risk in
practice (e.g., it is one of the most widely used approaches to setting capital requirements
when regulating capital adequacy in so-called internal models of banks). More
generally, VaR is the key instrument for financial risk management, e.g., by
means of commercial systems of the type RiskMetrics. This topic is included in
the presented text since some methods of VaR construction make use of the analysis
of financial time series.
In general, the financial risk concerns potential price changes of financial assets,
where the corresponding price change (expressed mainly as the rate of return; see
Remark 6.20) is looked upon as a random event. If the financial risk is measured as
the variance or standard deviation of (log) returns in the form of random process,
then it is usually called (conditional) volatility (see Chap. 8).
Moreover, the financial risk can be classified into several categories, mainly:
1. Market risk is the risk of loss due to changes (variations) of market prices
(of securities, commodities, and others) or market rates (interest rates, rates of
exchange, and others). Accordingly, it can be sorted to more specific risk sub-
categories, e.g.:
• Interest risk
• Currency risk
• Stock risk
• Commodity risk
• Credit spread risk (i.e., the risk of loss due to changes in differences between
the yields of various debt instruments)
• Correlation risk (i.e., the risk of loss due to changes in traditional correlations
between considered risk categories, e.g., between stocks and bonds) and
others.
2. Credit risk is the risk that the creditor (lender) may not receive promised
repayments on outstanding investments (such as loans, credits, bonds) because

https://doi.org/10.1007/978-3-030-46347-2_11
of the default of the debtor (borrower). Defaults may consist in insolvency or

reluctance of the debtor, in his refusal to deliver or to buy underlying assets
according to contracts, and the like.
3. Liquidity risk is the risk of varying levels of convertibility of investments readily
into cash. This risk can force investors to sell some assets under very unfavorable
conditions.
4. Operational risk is the risk of a change in value caused by the fact that actual
losses, incurred for inadequate or failed internal processes, people, and systems,
or from external events (including legal risk), differ from the expected losses.
11.1 Financial Risk Measures
Risk can be measured and quantified by means of various ways. In some cases (e.g.,
in various regulatory systems for banks), one prefers deterministic instruments for
this purpose, e.g., stress tests constructed in accordance with prescribed instructions
without any portion of stochasticity.
In this text, we deal only with stochastic risk measures that respect the random
character of potential losses (or profits). Let random variable X represent loss (if X is
positive) or profit (if X is negative) accumulated during the given holding period
(moreover, X is usually observed in time, i.e., in the form of a time series; see Sect.
11.2). Then a risk measure ρ can be defined as a mapping that assigns real values
ρ(X) to the random variables X. In particular, a risk measure is called coherent if it
possesses the following properties (for bounded random variables X and Y denoting
losses in the same financial environment):
(i) Subadditivity: $\rho(X + Y) \le \rho(X) + \rho(Y)$
(ii) Monotonicity: if $X \le Y$, then $\rho(X) \le \rho(Y)$
(iii) Positive homogeneity: $\rho(\lambda X) = \lambda\rho(X)$ for arbitrary constant $\lambda > 0$
(iv) Translation invariance: $\rho(X + a) = \rho(X) + a$ for arbitrary constant $a > 0$.
11.1.1 VaR
The methodology VaR (value at risk) is based on an estimate of the worst loss that
can occur with a given probability (confidence) in a given future period (alternatively
one can say that with a prescribed confidence α, e.g., 95%, there cannot occur a loss
that is higher than VaR). For example in the context of capital requirements or capital
adequacy of banks, VaR represents the smallest capital amount that guarantees the
bank solvency with a given confidence. VaR is specified by the following factors:
• Holding period is the period in which a potential loss can occur. Accordingly, the
used terms may be the daily VaR (over one business day, e.g., in RiskMetrics) or
the 10-day VaR (over two calendar weeks with 10 business days, e.g., according
to the recommendation of the Basel Committee on Banking Supervision) or
monthly VaR, quarterly VaR, or even yearly VaR (for various credit portfolios).
• Confidence level is the probability that the actual loss does not exceed the value at
risk (during the given holding period). In practice, one applies, e.g., the
confidence 95% (in RiskMetrics), or 99% (according to the Basel Committee on Banking
Supervision).
Fig. 11.1 Graphical plot of VaR95% (horizontal axis: daily loss X in mil. EUR; vertical axis:
probability density of X; marked values: E(X) = −0.88, VaR95% = 2.75, ES = 4.26)
Formally, one can define VaR in the following way:
$$\mathrm{VaR} = \mathrm{VaR}_\alpha(X) = \inf\{x \in (-\infty, \infty) : P(X \le x) \ge \alpha\} = \inf\{x \in (-\infty, \infty) : F_X(x) \ge \alpha\}, \tag{11.1}$$
where the random variable $X$ denotes loss (of course a negative loss means profit)
accumulated during the given holding period (e.g., during one trading day), $\alpha$ is the
corresponding confidence level (e.g., 95% for $\alpha = 0.95$), and $F_X(x) = P(X \le x)$ is the
probability distribution function of $X$. Expressed in statistical terms,
$\mathrm{VaR}_\alpha(X)$ is the $\alpha$-quantile $q_\alpha$ of the random variable $X$. Moreover, if the probability
distribution function $F_X(\cdot)$ is increasing and continuous, then it holds simply
$$\mathrm{VaR}_\alpha(X) = F_X^{-1}(\alpha) = q_\alpha, \quad \text{i.e.,} \quad P(X \le \mathrm{VaR}_\alpha(X)) = \alpha. \tag{11.2}$$
Figure 11.1 plots VaR95% for a daily loss $X$ with a given probability density. This
random variable $X$ has mean value −0.88 million euros (i.e., one can expect a profit
on average) and is skewed to the right (i.e., potential losses are not negligible). With
confidence 95%, the daily value at risk reaches the relatively high level of 2.75 million
euros (i.e., ceteris paribus one may expect a loss of at least 2.75 million euros on
one trading day in twenty). The drawback of this risk measure is the fact that it gives no
information about possible losses higher than VaR95% (in contrast to the expected shortfall
ES = 4.26 million euros; see Fig. 11.1 and Sect. 11.1.2).
In addition to the “absolute” VaR, one sometimes also applies the relative value at
risk, which is related to the mean value $E(X)$, namely
$$\mathrm{VaR}^{\mathrm{rel}} = \mathrm{VaR}^{\mathrm{abs}} - E(X). \tag{11.3}$$
For example, the relative $\mathrm{VaR}^{\mathrm{rel}}$ in Fig. 11.1 is $2.75 - (-0.88) = 3.63$ million euros
as the “distance of the absolute VaR from the mean loss.”
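For an observed sample of losses, definition (11.1) can be applied to the empirical distribution function. The sketch below is one possible reading, not the only estimator used in practice: the function names, the small tolerance guarding against floating-point rounding of α·n, and the sample data are all assumptions.

```python
import math


def empirical_var(losses, alpha):
    """Smallest sample value x whose empirical distribution function is >= alpha."""
    ordered = sorted(losses)
    n = len(ordered)
    # Smallest k with k/n >= alpha; the tolerance absorbs rounding in alpha*n.
    k = max(1, math.ceil(alpha * n - 1e-12))
    return ordered[k - 1]


def relative_var(losses, alpha):
    """Relative VaR per (11.3): absolute VaR minus the mean loss."""
    return empirical_var(losses, alpha) - sum(losses) / len(losses)


# Hypothetical daily losses in mil. EUR (negative values are profits).
losses = [-3, -2, -1, 0, 0, 1, 1, 2, 4, 8]
var_90 = empirical_var(losses, 0.90)
var_90_rel = relative_var(losses, 0.90)
```

In this toy sample nine of ten observations do not exceed 4, so 4 is the smallest value reaching the 90% level; the mean loss is 1, which gives a relative VaR of 3.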
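Remark 11.1 In the class of basic parametric distributions, it is possible to write
analytic formulas for VaR directly as the corresponding quantiles $q_\alpha$, e.g.:
1. For normal distribution $X \sim N(\mu, \sigma^2)$:
$$\mathrm{VaR} = \mu + \sigma\Phi^{-1}(\alpha), \quad \mathrm{VaR}^{\mathrm{rel}} = \sigma\Phi^{-1}(\alpha), \tag{11.4}$$
where $\Phi(\cdot)$ is the distribution function of the standard normal distribution $N(0, 1)$
(e.g., for confidence levels 95% and 99%, it holds $\mathrm{VaR}^{\mathrm{rel}}_{95\%} = 1.645\sigma$ and
$\mathrm{VaR}^{\mathrm{rel}}_{99\%} = 2.326\sigma$).
2. For log-normal distribution $X \sim LN(\mu, \sigma^2)$, i.e., $\ln(X) \sim N(\mu, \sigma^2)$:
$$\mathrm{VaR} = \exp\left( \mu + \sigma\Phi^{-1}(\alpha) \right), \quad \mathrm{VaR}^{\mathrm{rel}} = \exp(\mu)\left( \exp\left( \sigma\Phi^{-1}(\alpha) \right) - \exp\left( \sigma^2/2 \right) \right) \tag{11.5}$$
(in this case, the probability density of $X$ with a suitable configuration of
parameters looks similar to that in Fig. 11.1).
3. For exponential distribution $X \sim \mathrm{Exp}(\lambda)$, i.e., $F_X(x) = 1 - \exp(-\lambda x)$ for $x \ge 0$:
$$\mathrm{VaR} = -\ln(1 - \alpha)/\lambda, \quad \mathrm{VaR}^{\mathrm{rel}} = -(1 + \ln(1 - \alpha))/\lambda \tag{11.6}$$
(this distribution is applicable, e.g., in the case when no profit with negative $X$ is
possible).
⋄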
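The quantile formulas of Remark 11.1 translate directly into code. The sketch below uses `statistics.NormalDist` from the Python standard library for the standard normal quantile; the function names and the parameter values in the usage lines are illustrative assumptions.

```python
import math
from statistics import NormalDist


def var_normal(mu, sigma, alpha):
    """(absolute, relative) VaR for X ~ N(mu, sigma^2), see (11.4)."""
    z = NormalDist().inv_cdf(alpha)  # standard normal alpha-quantile
    return mu + sigma * z, sigma * z


def var_lognormal(mu, sigma, alpha):
    """(absolute, relative) VaR for ln(X) ~ N(mu, sigma^2), see (11.5)."""
    z = NormalDist().inv_cdf(alpha)
    var_abs = math.exp(mu + sigma * z)
    mean = math.exp(mu + 0.5 * sigma ** 2)  # E(X) of the log-normal distribution
    return var_abs, var_abs - mean


def var_exponential(lam, alpha):
    """(absolute, relative) VaR for X ~ Exp(lam), see (11.6)."""
    var_abs = -math.log(1.0 - alpha) / lam
    return var_abs, var_abs - 1.0 / lam  # E(X) = 1/lam


# Relative VaR multipliers of sigma for the normal case.
z95 = var_normal(0.0, 1.0, 0.95)[1]
z99 = var_normal(0.0, 1.0, 0.99)[1]
```

In each function the relative value is obtained as the absolute VaR minus the mean of the distribution, in line with (11.3).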
11.1.2 Other Risk Measures
Here we shall give a brief survey of other types of risk measures that are applied in
financial practice:
1. Deviation Risk Measures
Deviation risk measures regard the risk as fluctuations around a given value (usually
around the mean value E(X), which is interpreted as the average loss). The main
representatives (used, e.g., in risk management) are:
• Standard deviation:
$$ \sigma(X) = \sqrt{E\left(X - E(X)\right)^2} = \sqrt{E\left(X^2\right) - \left(E(X)\right)^2} \tag{11.7} $$
(σ(X), used by Markowitz in his portfolio theory, is a very popular risk measure due
to its simplicity; on the other hand, it has some drawbacks, namely (i) it is applicable
only when the second moment of loss X exists and (ii) it does not distinguish positive
and negative deviations around E(X), so that it cannot be recommended for asymmetric
and skewed loss distributions).
• Variance:
$$ \mathrm{var}(X) = \sigma^2(X) = E\left(X^2\right) - \left(E(X)\right)^2 \tag{11.8} $$
(var(X) is used as the measure of volatility in models of the type ARCH for financial
time series in Sect. 8.3).
• One-sided standard deviations:
$$ \sigma_+(X) = \sqrt{E\left[\left(\max\{X - E(X), 0\}\right)^2\right]}, \qquad \sigma_-(X) = \sqrt{E\left[\left(\min\{X - E(X), 0\}\right)^2\right]} \tag{11.9} $$
(in contrast to the two-sided standard deviation σ(X), the one-sided deviations σ₊(X) and σ₋(X)
measure only positive or only negative deviations from the mean value, respectively).
• Variance coefficient:
$$ v(X) = \frac{\sigma(X)}{|E(X)|} \cdot 100\%. \tag{11.10} $$
• Mean absolute deviation:
$$ \mathrm{MAD}(X) = E\,|X - E(X)| \tag{11.11} $$
or mean absolute semi-deviations:
$$ \mathrm{MAD}_+(X) = E\left(\max\{X - E(X), 0\}\right), \qquad \mathrm{MAD}_-(X) = E\left(\min\{X - E(X), 0\}\right). \tag{11.12} $$
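The deviation measures (11.7) to (11.12) can be estimated from a sample of losses by replacing expectations with sample averages. The sketch below makes two assumptions that are not in the text: moments are computed with divisor n (not n − 1), and the four loss values are purely hypothetical illustration data.

```python
from math import sqrt

def deviation_measures(losses):
    """Sample analogues of (11.7)-(11.12); expectations are plain averages (divisor n)."""
    n = len(losses)
    mean = sum(losses) / n
    dev = [x - mean for x in losses]
    up = [max(d, 0.0) for d in dev]    # positive deviations from the mean
    down = [min(d, 0.0) for d in dev]  # negative deviations from the mean
    variance = sum(d * d for d in dev) / n
    return {
        "mean": mean,
        "variance": variance,
        "std": sqrt(variance),
        "std_plus": sqrt(sum(u * u for u in up) / n),
        "std_minus": sqrt(sum(d * d for d in down) / n),
        "var_coeff_pct": 100.0 * sqrt(variance) / abs(mean),  # requires mean != 0
        "mad": sum(abs(d) for d in dev) / n,
        "mad_plus": sum(up) / n,
        "mad_minus": sum(down) / n,
    }

# Hypothetical sample of losses (not data from the text)
measures = deviation_measures([1.0, 2.0, 3.0, 6.0])
```

In this sample the squared one-sided deviations add up to the variance, and the two semi-deviations MAD₊ and MAD₋ cancel each other, as the definitions imply.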
2. Expected Shortfall and Conditional Value at Risk
The drawback of value at risk VaR (see Sect. 11.1.1) is that this risk measure
does not inform on possible losses higher than VaR (moreover, VaR is not subadditive,
so that generally it cannot be coherent; see Sect. 11.1). Therefore, various
modifications of VaR are now preferred by risk managers in practice, mainly the
so-called expected shortfall defined as
$$ \mathrm{ES}_\alpha = \frac{1}{1-\alpha} \int_\alpha^1 \mathrm{VaR}_u \, du. \tag{11.13} $$
Obviously, instead of fixing a particular confidence level α, we average VaR_u over all
levels u ≥ α and thus look further into the tail of the loss distribution (it always holds
ES_α ≥ VaR_α; see Fig. 11.1). In any case, the expected shortfall is a coherent risk
measure.
Remark 11.2 For a continuous loss distribution, an even more intuitive expression
for ES_α in (11.13) is possible, namely
$$ \mathrm{ES}_\alpha = E\left(X \mid X \ge \mathrm{VaR}_\alpha\right), \tag{11.14} $$
which shows that ES_α can also be interpreted as the expected loss that is incurred in
the case that VaR_α is exceeded (see, e.g., McNeil et al. (2005)). In general (i.e.,
including discrete loss distributions), one defines
$$ \mathrm{CVaR}_\alpha = E\left(X \mid X \ge \mathrm{VaR}_\alpha\right) = \frac{1}{1-\alpha}\, E\left(X \cdot I_{[X \ge \mathrm{VaR}_\alpha]}\right) \tag{11.15} $$
as conditional value at risk CVaR (or sometimes also tail conditional expectation
TVaR).
One can see that the difference between (11.13) and (11.14) consists in the lower
bound for averaging the worst losses:
• In (11.13) one averages over the worst scenarios that occur with probability 1 − α.
• In (11.14) one averages over the worst losses which are not lower than VaR_α.
⋄
Table 11.1 Probability distribution of losses expected in investment portfolios A and B during
next year
Portfolio A                                 Portfolio B
Loss (million euros)   Probability (%)      Loss (million euros)   Probability (%)
  20                    4                    30                     2
  10                    3                    10                    98
-100                   93                    –                      –
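Remark 11.3 Similarly as in Remark 11.1, one can derive analytic formulas for
ES = ES_α under some parametric loss distributions, e.g.:
1. For normal distribution $X \sim N(\mu, \sigma^2)$:
$$ \mathrm{ES} = \mu + \sigma\,\frac{\varphi\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}, \qquad \mathrm{ES}^{rel} = \sigma\,\frac{\varphi\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}, \tag{11.16} $$
where φ and Φ are the probability density and the distribution function of the
standard normal distribution N(0, 1), respectively.
2. For log-normal distribution $X \sim LN(\mu, \sigma^2)$:
$$ \mathrm{ES} = \exp\left(\mu + \sigma^2/2\right) \frac{\Phi\left(\sigma - \Phi^{-1}(\alpha)\right)}{1-\alpha}, \qquad \mathrm{ES}^{rel} = \exp\left(\mu + \sigma^2/2\right) \frac{\alpha - \Phi\left(\Phi^{-1}(\alpha) - \sigma\right)}{1-\alpha}. \tag{11.17} $$
⋄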
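The formulas (11.16) and (11.17) translate directly into code. The following sketch uses only the Python standard library; the function names are illustrative, and each function returns the pair (ES, relative ES), where the relative value is obtained by subtracting the mean loss.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)

def es_normal(mu, sigma, alpha):
    """(ES, ES_rel) for a normal loss, formula (11.16)."""
    z = STD_NORMAL.inv_cdf(alpha)
    es_rel = sigma * STD_NORMAL.pdf(z) / (1 - alpha)
    return mu + es_rel, es_rel

def es_lognormal(mu, sigma, alpha):
    """(ES, ES_rel) for a log-normal loss, formula (11.17)."""
    z = STD_NORMAL.inv_cdf(alpha)
    mean = exp(mu + sigma ** 2 / 2)  # E(X) of the log-normal distribution
    es_abs = mean * STD_NORMAL.cdf(sigma - z) / (1 - alpha)
    return es_abs, es_abs - mean

es_95 = es_normal(0.0, 1.0, 0.95)[0]        # about 2.06 standard deviations
var_95 = STD_NORMAL.inv_cdf(0.95)           # about 1.645 standard deviations
```

For a standard normal loss the expected shortfall at the 95% level exceeds the corresponding VaR, in line with ES_α ≥ VaR_α.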
3. Distorted Risk Measures
Distorted risk measures originate by artificially “distorting” the distribution function
of loss: the expected value of loss after this adjustment is the resulting risk measure. The
motivation for the distortion is that in specific situations the risk
measures of the type VaR and ES do not distinguish the risk in an acceptable way.
For example, suppose we can choose between two investment portfolios whose stochastic
behavior is described in Table 11.1.
Then it holds
• in the portfolio A:
$$ \mathrm{VaR}_{0.95} = 10 \text{ million euros}, \qquad \mathrm{ES}_{0.95} = \tfrac{4}{5} \cdot 20 + \tfrac{1}{5} \cdot 10 = 18 \text{ million euros}; $$
• in the portfolio B:
$$ \mathrm{VaR}_{0.95} = 10 \text{ million euros}, \qquad \mathrm{ES}_{0.95} = \tfrac{2}{5} \cdot 30 + \tfrac{3}{5} \cdot 10 = 18 \text{ million euros}. $$
Fig. 11.2 Distorted function g(z) for VaR_α (horizontal axis: z from 0 to 1 with the point α marked; vertical axis: g(z) from 0 to 1)
Even though both portfolios have the same risk values VaR0.95 and ES0.95, each
investor in practice would prefer to invest in portfolio A (while portfolio B is
always in loss with a maximum possible loss of 30 million euros, portfolio A is in
loss of 10 or 20 million euros only with relatively small probabilities and otherwise it
is highly profitable). This reasonable decision can be confirmed by means of risk
measures only if we distort them in a suitable way.
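The values of VaR and ES quoted above for portfolios A and B can be checked with a short sketch for discrete loss distributions. It relies on an assumption about Table 11.1: the third outcome of portfolio A is taken as a loss of −100 million euros (i.e., a profit), which matches the description of this portfolio as otherwise highly profitable. Exact fractions avoid rounding issues; the function names are illustrative.

```python
from fractions import Fraction

def var_discrete(dist, alpha):
    """Smallest loss x with P(X <= x) >= alpha; dist is a list of (loss, probability)."""
    cum = Fraction(0)
    for loss, prob in sorted(dist):
        cum += prob
        if cum >= alpha:
            return loss
    raise ValueError("probabilities must sum to 1")

def es_discrete(dist, alpha):
    """ES per (11.13): average over the worst scenarios with total probability 1 - alpha."""
    tail = 1 - alpha
    remaining = tail
    total = Fraction(0)
    for loss, prob in sorted(dist, reverse=True):  # start from the largest loss
        take = min(prob, remaining)
        total += take * loss
        remaining -= take
        if remaining == 0:
            break
    return total / tail

alpha = Fraction(95, 100)
# Assumed reading of Table 11.1 (losses in million euros, probabilities as fractions)
portfolio_a = [(20, Fraction(4, 100)), (10, Fraction(3, 100)), (-100, Fraction(93, 100))]
portfolio_b = [(30, Fraction(2, 100)), (10, Fraction(98, 100))]

results = {name: (var_discrete(d, alpha), es_discrete(d, alpha))
           for name, d in (("A", portfolio_a), ("B", portfolio_b))}
```

Both portfolios come out with VaR of 10 and ES of 18 million euros, so these two measures alone cannot separate them.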
In general, let X be a loss with the distribution function F(x) = P(X ≤ x) and the
finite mean value E(X). Then the distorted risk measure E_g(X) of loss X is defined as
$$ E_g(X) = -\int_{-\infty}^{0} F_g(x)\, dx + \int_{0}^{\infty} \left(1 - F_g(x)\right) dx, \tag{11.18} $$
where
$$ F_g(x) = g\left(F(x)\right) \tag{11.19} $$
for a distortion function g(z) with the following properties:
• g(z) is nondecreasing for 0 ≤ z ≤ 1.
• g(0) = 0 and g(1) = 1 (in particular, 0 ≤ g(z) ≤ 1 for 0 ≤ z ≤ 1).
Remark 11.4 In particular, it holds:
• For g(z) = z: the relation (11.18) is the well-known general formula for E(X).
• For g(z) in Fig. 11.2: E_g(X) = VaR_α.
• Similarly E_g(X) = ES_α, etc. for other suitable choices of g(z).
⋄
Table 11.2 Values of Wang distorted risk measure (million euros) for portfolios A and B from
Table 11.1
 λ    Portfolio A   Portfolio B
-3        11.93        24.84
-2       -17.02        14.36
-1       -62.85         4.38
 0       -91.90         0.60
 1       -99.24         0.03
 2       -99.97         0.00
 3      -100.00         0.00
A usual example of g(z) in practice is the so-called Wang distortion function
$$ g_\lambda(z) = \Phi\left(\Phi^{-1}(z) + \lambda\right), \qquad 0 \le z \le 1, \tag{11.20} $$
where Φ is the distribution function of the standard normal distribution N(0, 1) and λ ∈ R
is a real constant to be chosen. If this Wang risk measure (or Wang transformation;
see Wang (2000)) is used in the numerical example for portfolios A and B from
Table 11.1, then one obtains the results in Table 11.2 for chosen values of constant λ.
Apparently, the higher values of this risk measure for portfolio B over a broad range
of λ indicate the higher riskiness of this portfolio in comparison with portfolio A,
which corresponds to the previous practical conclusions (see above).
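For a discrete loss, the distorted distribution function (11.19) is again a step function, so that (11.18) reduces to an expectation with point masses g(F(x_k)) − g(F(x_{k−1})). The sketch below applies the Wang function (11.20) to portfolio A only. It assumes that the third outcome of portfolio A in Table 11.1 is a loss of −100 million euros and that the tabulated values of λ run from −3 to 3; the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)

def wang_g(z, lam):
    """Wang distortion function (11.20); the end points are handled explicitly."""
    if z <= 0.0:
        return 0.0
    if z >= 1.0 - 1e-12:  # guards against rounding in cumulated probabilities
        return 1.0
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(z) + lam)

def distorted_expectation(dist, g):
    """E_g(X) of (11.18) for a discrete loss given as a list of (loss, probability)."""
    total, cum, g_prev = 0.0, 0.0, 0.0
    for loss, prob in sorted(dist):
        cum += prob
        g_cum = g(cum)
        total += loss * (g_cum - g_prev)  # point mass of the distorted distribution
        g_prev = g_cum
    return total

# Assumed reading of portfolio A in Table 11.1 (losses in million euros)
portfolio_a = [(20.0, 0.04), (10.0, 0.03), (-100.0, 0.93)]

wang_a = {lam: distorted_expectation(portfolio_a, lambda z, l=lam: wang_g(z, l))
          for lam in (-3, -2, -1, 0, 1, 2, 3)}
```

Under these assumptions the values agree with the portfolio A column of Table 11.2 (e.g., about 11.93 for λ = −3 and the plain mean −91.90 for λ = 0).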
4. Spectral Risk Measures
Spectral risk measures (see, e.g., Acerbi (2002)) generalize the concept of expected
shortfall ES_α (see (11.13)):
$$ M_\psi = \int_0^1 \psi(u)\, \mathrm{VaR}_u \, du, \tag{11.21} $$
where a weight function ψ(u) is nonnegative and nondecreasing for 0 ≤ u ≤ 1 with
$$ \int_0^1 \psi(u)\, du = 1. \tag{11.22} $$
The interpretation of (11.21) is obvious: in contrast to ES_α in (11.13), one assigns
higher weights to higher values VaR_u in (11.21). The steeper the function ψ(u) is for
u going to 1, the higher the weights assigned to extreme (catastrophic) scenarios.
Remark 11.5 In particular, it holds M_ψ = ES_α for the function ψ in Fig. 11.3.
⋄
Fig. 11.3 Function ψ(u) for coincidence of M_ψ with ES_α (horizontal axis: u from 0 to 1 with the point α marked; vertical axis: ψ(u) with the level 1/(1 − α) marked)
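Remark 11.5 can be illustrated numerically. The sketch below approximates the integral (11.21) by a midpoint rule for a standard normal loss and for the step weight function that equals 1/(1 − α) on [α, 1] and zero elsewhere (this explicit form of ψ is read off Fig. 11.3 and is an assumption here), and compares the result with the closed form (11.16). The function name and the number of integration steps are illustrative choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal loss, so VaR_u is the quantile inv_cdf(u)

def spectral_measure(psi, quantile, steps=100_000):
    """Midpoint-rule approximation of M_psi = integral over (0, 1) of psi(u) * VaR_u."""
    h = 1.0 / steps
    return h * sum(psi((k + 0.5) * h) * quantile((k + 0.5) * h) for k in range(steps))

alpha = 0.95

def psi_es(u):
    """Step weight function assumed from Fig. 11.3: 1/(1 - alpha) on [alpha, 1]."""
    return 1.0 / (1.0 - alpha) if u >= alpha else 0.0

m_psi = spectral_measure(psi_es, STD_NORMAL.inv_cdf)
es_closed = STD_NORMAL.pdf(STD_NORMAL.inv_cdf(alpha)) / (1.0 - alpha)  # (11.16), mu=0, sigma=1
```

The two numbers agree up to the discretization error of the integration rule; other nondecreasing weight functions ψ can be passed to the same routine.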
11.2 Calculation of VaR
One of the main drawbacks of VaR is the fact that different methods of its numerical
calculation (or estimation or prediction) may deliver in practice (substantially)
different results. In this section, we shall describe several methods for the calculation
of VaR used in practice which are based on data in the form of time series. Some of
them can be classified as parametric methods and others as nonparametric or
combined ones.
1. Variance-Covariance Method
This method is frequent in practice among various parametric approaches to calculating
VaR. It offers a direct analytical solution of the given problem, but it is based on
simplifying assumptions which need not be fulfilled in practice (even in routine
situations) and must be taken as drawbacks of this method, namely
(i) The loss X_t at time t originates as an aggregate of component losses from m risk
sources.
(ii) The probability distribution of this aggregate loss X_t can be approximated as
$$ X_t \sim N\left(\mu_1 + \dots + \mu_m,\ \sigma_1^2 + \dots + \sigma_m^2 + 2\sigma_{12} + \dots + 2\sigma_{m-1,m}\right), \tag{11.23} $$
where $\mu_1, \dots, \mu_m$ are mean values, $\sigma_1^2, \dots, \sigma_m^2$ are variances, and $\sigma_{12}, \sigma_{13}, \dots, \sigma_{m-1,m}$
are covariances of component losses (more generally, these moments can vary in time,
but must be estimable from data).
The most usual practical situation for application of this method is the prediction
of VaR in a portfolio composed of various investment or credit instruments, for
which the risk of possible losses must be evaluated or even controlled by management
(see Example 11.1). If one has the data information $x_t, x_{t-1}, x_{t-2}, \dots$ till time
t (i.e., component losses in the form of an m-variate time series observed till time t),
then, e.g., one can predict VaR for the next time t + k (e.g., for the next trading day),
which may be prescribed by regulators of various financial institutions (for banks by
Basel III or for insurance companies by Solvency II).
Remark 11.6 In practice, the financial time series x_t can often be modeled using
methods of multivariate volatility modeling and predicting (see Chap. 13, or (8.62)
for the univariate case). Moreover, one can model (log) returns r_t instead of absolute
losses X_t (negative values of r_t can be interpreted as relative losses), namely
$r_t = \sum_{i=1}^{m} c_{ti}\, r_{ti}$ with moments of the form
$$ E(r_t) = \sum_{i=1}^{m} c_{ti}\, E(r_{ti}), \qquad \mathrm{var}(r_t) = \sum_{i=1}^{m} \sum_{j=1}^{m} c_{ti}\, c_{tj}\, \mathrm{cov}\left(r_{ti}, r_{tj}\right), \tag{11.24} $$
where r_ti and c_ti denote (log) returns and portfolio weights of the ith risk component
at time t, respectively. Then the formulas of the type $\mathrm{VaR}^{rel}_{95\%} = 1.645\, \hat{\sigma}_{t+1}(t)$ can be
generalized to the multivariate case. The variances and covariances in (11.23) and
(11.24) explain the name of this method (sometimes they are not estimated from
analyzed data but taken from various published databases).
⋄
Example 11.1 (Calculation of VaR by variance-covariance method). The calculation
of VaR by various methods is demonstrated using a real investment portfolio
composed of three investment instruments (the Czech Republic in 2013):
• 1000 pieces of the Czech government bonds 3.40/15 (i.e., the face value 10,000
CZK, the annual coupons 340 CZK paid on September 1, 2013, on September
1, 2014, and finally on the maturity date of September 1, 2015; see Fig. 11.4).
• 1 million pieces of the stocks of electricity operator ČEZ (the dividend 40 CZK
for each stock paid out in 2013 on June 25; see Fig. 11.5).
• 10 million euros (the deposit priced in CZK using the actual exchange rates EUR/
CZK; see Fig. 11.6).
Table 11.3 and Fig. 11.7 present the development of daily portfolio loss in the
year 2013 (negative values mean profits). For example, the loss for the first trading
day January 2, 2013, is calculated as
$$ -\left(1000 \cdot 108.16 + 1{,}000{,}000 \cdot 680.20 + 10{,}000{,}000 \cdot 25.225\right) + \left(1000 \cdot 108.16 + 1{,}000{,}000 \cdot 680.00 + 10{,}000{,}000 \cdot 25.140\right) = -1.050 \text{ million CZK} $$
(the stock dividend and coupon payment were included in such a way that on June
25, 2013, the price of portfolio was increased by the dividend income of
1,000,000 · 40 = 40 million CZK and on September 2, 2013, by the coupon income
of 1000 · 340 = 0.340 million CZK, but on the next trading day these incomes are
transferred to another account and are no longer included in the price of portfolio).
Fig. 11.4 Price of the Czech government bond 3.40/15 (% of face value) in the trading year 2013 from Example 11.1.
Source: kurzy.cz (https://akcie-cz.kurzy.cz/emise/dluhopisy/statni-dluhopisy/2010/)
Fig. 11.5 Price of the stock of electricity operator ČEZ (CZK) in the trading year 2013 from Example 11.1.
Source: kurzy.cz (https://prague-stock.kurzy.cz/akcie/cez-183/graf_2013)
Fig. 11.6 Exchange rates EUR/CZK (CZK) in the trading year 2013 from Example 11.1. Source: EUROSTAT
(https://ec.europa.eu/eurostat/data/database)
The histogram of portfolio loss in Fig. 11.8 (negative values denote profits) indicates,
at least graphically, that the assumption of normality is realistic.
Table 11.4 contains the sample means and the sample covariance matrix of
portfolio components (bonds, stocks, euro deposit), which are necessary for the
variance-covariance method. Hence one easily calculates by means of (11.23) that
$$ X_t \sim N\left(0.670,\ 8.595^2\right) $$
so that according to (11.4) one finally obtains
$$ \mathrm{VaR}_{95\%} = 0.670 + 8.595 \cdot 1.645 = 14.809 \text{ million CZK}, $$
$$ \mathrm{VaR}^{rel}_{95\%} = 8.595 \cdot 1.645 = 14.139 \text{ million CZK}. $$
These values can be interpreted either as risk characteristics of the given portfolio
during year 2013 or as VaR predictions for the first trading day of year 2014.
⋄
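The calculation of Example 11.1 can be retraced from the figures of Table 11.4. The sketch below assumes that the sample mean of the euro component and all three covariances are negative; with these signs the aggregate mean and standard deviation reproduce the values 0.670 and 8.595 used above. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample moments from Table 11.4 in million CZK (order: bonds, stocks, euros);
# the negative signs are an assumption consistent with N(0.670, 8.595^2)
means = [0.113, 0.647, -0.091]
cov = [
    [0.634, -1.358, -0.054],
    [-1.358, 75.831, -0.342],
    [-0.054, -0.342, 0.919],
]

mu = sum(means)                             # mean of the aggregate loss, see (11.23)
sigma = sqrt(sum(sum(row) for row in cov))  # variances plus twice the covariances
z = NormalDist().inv_cdf(0.95)              # about 1.645

var_abs = mu + sigma * z                    # absolute VaR according to (11.4)
var_rel = sigma * z                         # relative VaR according to (11.4)
```

Summing all entries of the covariance matrix counts each covariance twice, which is exactly the variance term in (11.23); small differences from 14.809 and 14.139 are due to rounding of the inputs.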
2. Method of Historical Simulation
This method is evidently the most popular among nonparametric
approaches to the calculation of VaR in practice, since it is very simple. It ignores
entirely the problem of probability distribution or the correlation structure among
component losses of portfolio and assumes simply that the character of losses in
previous periods (e.g., in the trading days of previous years) will persist also during
a future period. In other words, this method is based on losses which are simulated
by the “history.” In this context, one usually applies the following estimate used
typically for construction of empirical distribution functions:
Table 11.3 Daily portfolio loss in the year 2013 from Example 11.1 (see also Fig. 11.7)
Trading  Date         Price of bond  Price of stock  Exchange rate  Price of portfolio  Loss
day                   3.40/15 (%)    ČEZ (CZK)       EUR/CZK (CZK)  (CZK)               (million CZK)
0        28.12.2012   108.16         680.00          25.140         2,013,000,000         –
1        2.1.2013     108.16         680.20          25.225         2,014,050,000        -1.050
2        3.1.2013     108.16         675.00          25.260         2,009,200,000         4.850
3        4.1.2013     108.16         680.00          25.355         2,015,150,000        -5.950
4        7.1.2013     108.16         658.50          25.535         1,995,450,000        19.700
5        8.1.2013     108.16         663.50          25.580         2,000,900,000        -5.450
6        9.1.2013     108.16         673.50          25.530         2,010,400,000        -9.500
7        10.1.2013    108.16         661.50          25.630         1,999,400,000        11.000
8        11.1.2013    108.16         655.10          25.615         1,992,850,000         6.550
9        14.1.2013    108.16         644.90          25.615         1,982,650,000        10.200
10       15.1.2013    108.16         644.00          25.610         1,981,700,000         0.950
11       16.1.2013    108.16         648.50          25.580         1,985,900,000        -4.200
12       17.1.2013    107.75         651.70          25.540         1,984,600,000         1.300
13       18.1.2013    107.75         652.90          25.630         1,986,700,000        -2.100
14       21.1.2013    107.75         647.10          25.625         1,980,850,000         5.850
15       22.1.2013    107.75         648.00          25.610         1,981,600,000        -0.750
16       23.1.2013    107.75         643.00          25.600         1,976,500,000         5.100
17       24.1.2013    107.75         622.00          25.595         1,955,450,000        21.050
18       25.1.2013    107.75         617.00          25.605         1,950,550,000         4.900
19       28.1.2013    107.75         613.00          25.700         1,947,500,000         3.050
20       29.1.2013    107.75         615.00          25.660         1,949,100,000        -1.600
21       30.1.2013    107.75         615.00          25.660         1,949,100,000         0.000
22       31.1.2013    107.42         612.10          25.620         1,942,500,000         6.600
⋮        ⋮            ⋮              ⋮               ⋮              ⋮                     ⋮
230      26.11.2013   105.30         540.10          27.330         1,866,400,000         9.600
231      27.11.2013   105.30         553.00          27.340         1,879,400,000       -13.000
232      28.11.2013   105.30         555.00          27.350         1,881,500,000        -2.100
233      29.11.2013   105.30         559.00          27.390         1,885,900,000        -4.400
234      2.12.2013    105.30         560.00          27.405         1,887,050,000        -1.150
235      3.12.2013    105.30         545.00          27.460         1,872,600,000        14.450
236      4.12.2013    105.30         540.80          27.455         1,868,350,000         4.250
237      5.12.2013    105.30         530.00          27.450         1,857,500,000        10.850
238      6.12.2013    105.30         525.50          27.490         1,853,400,000         4.100
239      9.12.2013    105.30         533.00          27.500         1,861,000,000        -7.600
240      10.12.2013   105.30         533.70          27.450         1,861,200,000        -0.200
241      11.12.2013   105.30         532.90          27.435         1,860,250,000         0.950
242      12.12.2013   105.30         520.00          27.480         1,847,800,000        12.450
243      13.12.2013   105.30         514.40          27.535         1,842,750,000         5.050
244      16.12.2013   105.30         513.90          27.595         1,842,850,000        -0.100
245      17.12.2013   105.30         514.80          27.655         1,844,350,000        -1.500
246      18.12.2013   105.30         517.80          27.720         1,848,000,000        -3.650
247      19.12.2013   105.30         518.50          27.650         1,848,000,000         0.000
248      20.12.2013   105.30         514.00          27.655         1,843,550,000         4.450
249      23.12.2013   105.30         515.00          27.575         1,843,750,000        -0.200
250      27.12.2013   105.30         515.60          27.440         1,843,000,000         0.750
251      30.12.2013   105.30         517.00          27.445         1,844,450,000        -1.450
252      31.12.2013   105.30         517.00          27.425         1,844,250,000         0.200
Fig. 11.7 Daily portfolio loss (mil. CZK) in the trading year 2013 from Example 11.1 (negative values mean profits)
$$ \hat{P}(X > x) = \frac{1}{T} \sum_{t=1}^{T} I_{[x_t > x]}, \tag{11.25} $$
where $x_1, x_2, \dots, x_T$ are observed losses during a period of length T (e.g.,
T trading days).
Fig. 11.8 Histogram of portfolio loss in the year 2013 from Example 11.1 (horizontal axis: portfolio loss in mil. CZK; vertical axis: loss frequency)
Table 11.4 Variance-covariance method from Example 11.1: sample means and sample covariance
matrix of portfolio components (bonds, stocks, and euro deposit)
                                    Bonds    Stocks    Euros
Sample mean                         0.113     0.647   -0.091
Sample variance                     0.634    75.831    0.919
Sample standard deviation           0.796     8.708    0.959
Sample covariance matrix: Bonds     0.634    -1.358   -0.054
                          Stocks   -1.358    75.831   -0.342
                          Euros    -0.054    -0.342    0.919
Example 11.2 (Calculation of VaR by method of historical simulation). Let us
consider the portfolio from Example 11.1 composed of government bonds, stocks of
ČEZ, and euro deposit. Table 11.5 presents the 15 highest daily losses. The value at risk
of this portfolio with confidence level 95% can be found according to (11.25): as
12/252 = 4.76% (the twelfth highest daily loss is 13.450 million CZK) and
13/252 = 5.16% (the thirteenth highest daily loss is 13.250 million CZK), it
follows approximately by means of interpolation (according to Table 11.5 with the
daily losses during 2013 ordered from the highest one, 40.700 million CZK, to the lowest
one) that
$$ \mathrm{VaR}_{95\%} \approx 13.4 \text{ million CZK}. $$
This value at risk is significantly lower than the value 14.8 million CZK from
Example 11.1.
⋄
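The interpolation step of Example 11.2 can be sketched as follows. The code uses the 15 highest losses from Table 11.5 and T = 252; linear interpolation between neighboring order numbers is one possible convention (an assumption of this sketch), and the function name is illustrative.

```python
# 15 highest daily losses in million CZK, ordered downward (Table 11.5)
top_losses = [40.700, 26.400, 21.050, 20.000, 19.700, 18.000, 16.950, 14.850,
              14.450, 14.000, 13.600, 13.450, 13.250, 12.650, 12.550]
T = 252  # number of trading days in 2013

def historical_var(losses_desc, n_obs, alpha):
    """Interpolate between ordered losses so that the share i/T equals 1 - alpha."""
    position = (1 - alpha) * n_obs  # fractional order number (12.6 for alpha = 0.95)
    lower = int(position)           # order number with share i/T just below 1 - alpha
    if lower < 1 or lower >= len(losses_desc):
        raise ValueError("not enough ordered losses for this confidence level")
    weight = position - lower
    higher_loss = losses_desc[lower - 1]  # e.g., the twelfth highest loss
    lower_loss = losses_desc[lower]       # e.g., the thirteenth highest loss
    return higher_loss - weight * (higher_loss - lower_loss)

var_hist = historical_var(top_losses, T, 0.95)
```

With this convention the result is about 13.3 million CZK, which lies between the twelfth and thirteenth highest losses and is close to the rounded value quoted in the example.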
3. Various Modifications of the Method of Historical Simulation
The method of historical simulation described above in its basic form can be
modified in various ways:
(a) Method simulating previous development:
This method simulates additional data respecting the development of previous
ratios among neighboring observations (see Example 11.3):
Table 11.5 Method of historical simulation from Example 11.2: daily portfolio losses ordered
downward
Trading days ordered    Date         Loss
according to losses                  (million CZK)
1                       21.6.2013    40.700
2                       15.3.2013    26.400
3                       24.1.2013    21.050
4                       25.6.2013    20.000
5                       7.1.2013     19.700
6                       3.4.2013     18.000
7                       16.4.2013    16.950
8                       17.6.2013    14.850
9                       3.12.2013    14.450
10                      9.8.2013     14.000
11                      20.9.2013    13.600
12                      7.2.2013     13.450
13                      9.7.2013     13.250
14                      12.3.2013    12.650
15                      8.7.2013     12.550
⋮                       ⋮            ⋮
Example 11.3 (Calculation of VaR by method simulating previous development).
We again make use of the portfolio from Example 11.1 (for simplicity, we ignore the
coupon and dividend income now). The task is to predict VaR for the first trading day
of the year 2014 using data of year 2013 (see Table 11.6). We shall proceed by the
method simulating previous development:
The reference row is chosen as the values observed on December 31, 2013 (see
Table 11.6), and we gradually generate data respecting the development of previous
ratios in such a way that in the first simulation (see the first simulation row of Table 11.6)
the prices of bond, stock, and euro are
$$ 105.30 \cdot \frac{108.16}{108.16} = 105.30\ \%, \qquad 517.00 \cdot \frac{680.20}{680.00} = 517.15 \text{ CZK}, \qquad 27.425 \cdot \frac{25.225}{25.140} = 27.518 \text{ CZK}, $$
i.e., the prices from the reference row of December 31, 2013, are multiplied by growth rates
between the neighboring trading days December 28, 2012, and January 2, 2013; in the
second simulation, prices from the reference row of December 31, 2013, are
multiplied by growth rates between the neighboring trading days January 2, 2013, and
January 3, 2013, which presents a further possible change from the reference row to the
neighboring date January 2, 2014 (it is the first trading day of year 2014, for which
VaR is predicted in this example), and so on.
In this way, one obtains 252 simulations in Table 11.6 including corresponding
losses. For example, the loss (i.e., the profit with negative sign) generated in the first
simulation is
Table 11.6 Method from Example 11.3 simulating previous development: 252 simulated losses
(the first row contains values for reference date)
Order number    Price of bond  Price of stock  Exchange rate  Price of portfolio  Loss
of simulation   3.40/15 (%)    ČEZ (CZK)       EUR/CZK (CZK)  (CZK)               (million CZK)
31.12.2013      105.30         517.00          27.425         1,844,250,000
1               105.30         517.15          27.518         1,845,329,316        -1.079
2               105.30         513.05          27.463         1,840,678,158         3.572
3               105.30         520.83          27.528         1,849,111,053        -4.861
4               105.30         500.65          27.620         1,829,850,630        14.399
5               105.30         520.93          27.473         1,848,658,896        -4.409
6               105.30         524.79          27.371         1,851,505,949        -7.256
7               105.30         507.79          27.532         1,836,112,645         8.137
8               105.30         512.00          27.409         1,839,087,530         5.162
9               105.30         508.95          27.425         1,836,200,237         8.050
10              105.30         516.28          27.420         1,843,474,960         0.775
11              105.30         520.61          27.393         1,847,541,316        -3.291
12              104.90         519.55          27.382         1,842,380,681         1.869
13              105.30         517.95          27.522         1,846,168,397        -1.918
14              105.30         512.41          27.420         1,839,603,758         4.646
15              105.30         517.72          27.409         1,844,808,518        -0.559
16              105.30         513.01          27.414         1,840,153,715         4.096
17              105.30         500.12          27.420         1,827,311,521        16.938
18              105.30         512.84          27.436         1,840,201,201         4.049
19              105.30         513.65          27.527         1,841,915,824         2.334
20              105.30         518.69          27.382         1,845,509,938        -1.260
21              105.30         517.00          27.425         1,844,250,000         0.000
22              104.98         514.56          27.382         1,838,159,635         6.090
⋮               ⋮              ⋮               ⋮              ⋮                     ⋮
230             105.30         507.42          27.485         1,835,270,637         8.979
231             105.30         529.35          27.435         1,856,698,616       -12.449
232             105.30         518.87          27.435         1,846,220,112        -1.970
233             105.30         520.73          27.465         1,848,377,223        -4.127
234             105.30         517.92          27.440         1,845,325,058        -1.075
235             105.30         503.15          27.480         1,830,952,187        13.298
236             105.30         513.02          27.420         1,840,215,844         4.034
237             105.30         506.68          27.420         1,833,875,350        10.375
238             105.30         512.61          27.465         1,840,260,013         3.990
239             105.30         524.38          27.435         1,851,728,451        -7.478
240             105.30         517.68          27.375         1,844,430,351        -0.180
241             105.30         516.23          27.410         1,843,325,169         0.925
242             105.30         504.48          27.470         1,832,184,730        12.065
243             105.30         511.43          27.480         1,839,231,207         5.019
244             105.30         516.50          27.485         1,844,345,076        -0.095
245             105.30         517.91          27.485         1,845,751,733        -1.502
246             105.30         520.01          27.489         1,847,907,415        -3.657
247             105.30         517.70          27.356         1,844,256,368        -0.006
248             105.30         512.51          27.430         1,839,812,611         4.437
249             105.30         518.01          27.346         1,844,462,490        -0.212
250             105.30         517.60          27.291         1,843,509,674         0.740
251             105.30         518.40          27.430         1,845,703,774        -1.454
252             105.30         517.00          27.405         1,844,050,146         0.200
$$ -\left(1{,}845{,}329{,}316 - 1{,}844{,}250{,}000\right) = -1{,}079{,}316 \text{ CZK} = -1.079 \text{ million CZK}, $$
and similarly for further simulations. Table 11.7 contains such daily losses for each
of the 252 simulations ordered downward. Hence in the same way as in Table 11.5, one
can find the value at risk approximately as
$$ \mathrm{VaR}_{95\%} \approx 13.4 \text{ million CZK}. $$
This value at risk is significantly lower than the value 14.8 million CZK from
Example 11.1, and it is nearly the same as in Example 11.2.
⋄
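The construction of simulated losses in Example 11.3 can be sketched for the first two simulations. The position multipliers below (10,000,000 for the bond price in %, 1,000,000 stocks, 10,000,000 euros) are an assumption inferred from the portfolio prices printed in Tables 11.3 and 11.6; the function names are illustrative.

```python
# Observed prices (bond in % of face value, stock in CZK, EUR/CZK) from Table 11.3
history = [
    (108.16, 680.00, 25.140),  # 28.12.2012
    (108.16, 680.20, 25.225),  # 2.1.2013
    (108.16, 675.00, 25.260),  # 3.1.2013
]
reference = (105.30, 517.00, 27.425)  # reference row of 31.12.2013

# Assumed multipliers that reproduce the tabulated portfolio prices
holdings = (10_000_000, 1_000_000, 10_000_000)

def portfolio_value(prices):
    """Portfolio price in CZK for given component prices."""
    return sum(h * p for h, p in zip(holdings, prices))

def simulated_losses(history, reference):
    """Apply each historical day-to-day growth rate to the reference prices."""
    ref_value = portfolio_value(reference)
    losses = []
    for previous, current in zip(history, history[1:]):
        scenario = tuple(r * c / p for r, c, p in zip(reference, current, previous))
        losses.append((ref_value - portfolio_value(scenario)) / 1e6)  # million CZK
    return losses

losses = simulated_losses(history, reference)  # first two simulations only
```

Under these assumptions the two values match the first two simulation rows of Table 11.6 (a profit of about 1.079 and a loss of about 3.572 million CZK); feeding the full price history would yield all 252 simulated losses.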
(b) Method of historical simulation based on principle EWMA:
It is the classical method of historical simulation from Example 11.2
supplemented by the principle EWMA (exponentially weighted moving average).
This principle, which weighs time data by means of weights decreasing exponentially
to the past, is frequently applied for financial time series (see Sects.
3.3.1 or 8.3.1). When we constructed the corresponding quantile q0.95 = VaR95% of
losses in Example 11.2, we looked for such a loss among T losses ordered downward
that its order number i is the first one fulfilling the inequality i/T ≥ 0.05. If using the
principle EWMA, we assign to the ith loss in their descending arrangement the
weight
$$ \lambda^{i-1}\, \frac{1-\lambda}{1-\lambda^T}, \tag{11.26} $$
so that now we look for such a loss among T losses ordered downward that its order
number i is the first one fulfilling the inequality
Table 11.7 Method from Example 11.3 simulating previous development: simulated daily
portfolio losses ordered downward
Trading days ordered    Loss
according to losses     (million CZK)
1                       40.854
2                       22.757
3                       20.885
4                       16.938
5                       15.839
6                       15.672
7                       14.997
8                       14.905
9                       14.399
10                      14.280
11                      13.959
12                      13.437
13                      13.298
14                      13.092
15                      12.215
⋮                       ⋮
$$ \frac{1-\lambda}{1-\lambda^T} + \lambda\, \frac{1-\lambda}{1-\lambda^T} + \dots + \lambda^{i-1}\, \frac{1-\lambda}{1-\lambda^T} = \frac{1-\lambda^i}{1-\lambda^T} \ge 0.05, \tag{11.27} $$
where T is the number of losses (it can be compared with the original inequality
i/T ≥ 0.05; see above). The coefficient λ must be chosen a priori; it controls the
impact of the time arrangement of losses: the closer to 1 this coefficient λ is, the less
important is the time allocation of losses, so that losses more remote in the past may
have an impact on VaR (the weights (11.26) converge to 1/T for λ going to 1, so that the
method converts to the classical calculation of VaR by means of historical simulation,
where the time arrangement does not play any role). For values of λ usual in
practice, Table 11.8 indicates the order number of such a loss among losses ordered
downward which determines the corresponding VaR95% (e.g., for λ = 0.99, the
position of the asterisk indicates the loss order i = 5).
Example 11.4 (Method of historical simulation based on principle EWMA). We
shall again demonstrate this method by means of the portfolio from Example 11.1. Let
us choose, e.g., λ = 0.99, so that Table 11.8 indicates the value at risk VaR0.95 as the
fifth loss in the descending arrangement of losses in Table 11.5, i.e.,
$$ \mathrm{VaR}_{95\%} \approx 19.7 \text{ million CZK}. $$
This value at risk is by far the highest one in comparison with all previous results, so
that the time allocation of losses has a significant impact on the construction of VaR.
Table 11.8 Method of historical simulation based on principle EWMA from Example 11.4: order
number of loss in descending arrangement for construction of VaR95%
Order number
of loss i      λ = 0.9    λ = 0.95   λ = 0.99   λ = 0.995  λ = 0.999
1              0.1000*    0.0500*    0.0109     0.0070     0.0045
2              0.1900     0.0975     0.0216     0.0139     0.0090
3              0.2710     0.1426     0.0323     0.0208     0.0134
4              0.3439     0.1855     0.0428     0.0277     0.0179
5              0.4095     0.2262     0.0532*    0.0345     0.0224
6              0.4686     0.2649     0.0636     0.0413     0.0269
7              0.5217     0.3017     0.0738     0.0481*    0.0313
8              0.5695     0.3366     0.0839     0.0548     0.0358
9              0.6126     0.3698     0.0939     0.0615     0.0402
10             0.6513     0.4013     0.1039     0.0682     0.0447
11             0.6862     0.4312     0.1137     0.0748     0.0491*
12             0.7176     0.4596     0.1234     0.0814     0.0536
13             0.7458     0.4867     0.1330     0.0880     0.0580
14             0.7712     0.5123     0.1426     0.0945     0.0624
15             0.7941     0.5367     0.1520     0.1010     0.0668
⋮              ⋮          ⋮          ⋮          ⋮          ⋮
⋄
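The entries of Table 11.8 are the cumulative weights on the left-hand side of (11.27). The sketch below recomputes them for λ = 0.99 and T = 252 and picks the first order number that fulfills the inequality, following the weighting by order number as stated in the text; the losses are those of Table 11.5 and the function names are illustrative.

```python
def cumulative_weight(lam, i, n_obs):
    """Left-hand side of (11.27): sum of the weights (11.26) up to order number i."""
    return (1 - lam ** i) / (1 - lam ** n_obs)

def ewma_order_number(lam, n_obs, level=0.05):
    """First order number i whose cumulative weight reaches the given level."""
    for i in range(1, n_obs + 1):
        if cumulative_weight(lam, i, n_obs) >= level:
            return i
    raise ValueError("level is not reached")

# 15 highest daily losses in million CZK, ordered downward (Table 11.5)
top_losses = [40.700, 26.400, 21.050, 20.000, 19.700, 18.000, 16.950, 14.850,
              14.450, 14.000, 13.600, 13.450, 13.250, 12.650, 12.550]
T = 252

column_099 = [round(cumulative_weight(0.99, i, T), 4) for i in range(1, 16)]
i_star = ewma_order_number(0.99, T)
var_ewma = top_losses[i_star - 1]
```

For λ = 0.99 the first order number reaching 0.05 is i = 5 (cumulative weight 0.0532), which leads to the loss of 19.7 million CZK used in Example 11.4.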
4. Monte Carlo Simulation Method
This method usually combines parametric and nonparametric approaches:
At first one estimates parametrically the probability distribution of losses. There
are various alternatives for doing so: (1) to estimate separately the marginal distributions
of particular loss components and their correlation (or copula structure),
(2) to estimate directly the multivariate distribution of the loss vector, and (3) to estimate
the loss dynamically as a multivariate time series (e.g., by means of a multivariate
GARCH model; see Sect. 13.3).
In the second step, one performs Monte Carlo simulations based on the calibrated
(estimated) model. This results in a set of mutually independent loss values referring to
the time moment of the constructed VaR. These loss realizations enable us to calculate
the corresponding value at risk in the same (nonparametric) way as in the previous
methods described in this section. The simulation technique denoted as bootstrap is
preferred in this context (then the resulting VaR is sometimes called resampled value
at risk).
The advantage of this Monte Carlo simulation method consists mainly in the fact
that the volume of simulated losses can be much larger than the volume of observed
losses (e.g., in the case of a credit portfolio, the volume of observed losses is relatively
limited). On the other hand, there are also drawbacks of this method, namely the
computational complexity (particularly for portfolios with financial derivatives, which
must be newly priced for each simulation) and high demands on the quality of
simulation models.
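The two steps can be sketched in a deliberately simple setting. As the calibrated model, the sketch takes the normal distribution N(0.670, 8.595²) of the aggregate loss from Example 11.1 (a simplifying assumption; the copula, multivariate, GARCH, and bootstrap variants mentioned above are not implemented here), generates independent losses, and reads VaR off the ordered simulated values. The function name, the number of simulations, and the seed are illustrative.

```python
import random

def monte_carlo_var(mu, sigma, alpha, n_sims, seed=0):
    """Simulate normal losses and return their empirical alpha-quantile."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    losses = sorted(rng.gauss(mu, sigma) for _ in range(n_sims))
    index = min(int(alpha * n_sims), n_sims - 1)
    return losses[index]

# Step 1 (assumed already done): calibrated model N(0.670, 8.595^2) in million CZK
# Step 2: simulation and nonparametric evaluation of VaR
var_mc = monte_carlo_var(0.670, 8.595, 0.95, 100_000)
```

With 100,000 simulated losses the estimate should lie close to the analytical value of about 14.8 million CZK from Example 11.1; the remaining difference is simulation noise.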
11.3 Extreme Value Theory
This section presents basic facts on the quantitative approach to extreme values. Even
though the corresponding theory denoted explicitly as EVT (Extreme Value Theory;
see Embrechts et al. (1997), McNeil et al. (2005), and others) comprises very
complex and nontrivial results, its applications are very broad, including time series
data not only in economics (particularly in finance and insurance, e.g., financial losses
or insured claims) but also in technical and environmental disciplines (e.g., river
flows in hydrology, wind forces in climatology, exhaust concentrations in environmental
control) and others. As the risk measures based on the value at risk principle
have some extreme properties, the theme of EVT is included in this chapter.
The EVT makes use mainly of parametric methods because extreme values are
rare (i.e., with small probabilities that can be quantified only in a parametric way). As
far as the extreme value methodology is concerned, the following two approaches are
preferred in practical data analysis:
• Block maxima (or minima): this approach segments particular data into blocks and
then uses maximum (or minimum) values of particular blocks.
• Threshold excesses: this approach uses only data exceeding a given threshold.
11.3.1 Block Maxima
The model of block maxima is a traditional model in extreme problems: it is the
model of maxima in particular blocks, which originate by segmentation of an original
(large) sample of independent identically distributed observations (the assumption of
independence can be weakened). The most frequent application in finance is daily
maxima of log returns of a given investment asset (e.g., stocks) over a specific period
(see Remark 6.20).
Let us consider a block generated by independent identically distributed (iid) random
variables with distribution function F(x) = P(Y ≤ x), which can be denoted for
simplicity as Y_1, . . ., Y_n. We denote the corresponding maximum (i.e., maximum
random variable) as
$$ M_n = \max\left(Y_1, \dots, Y_n\right). \tag{11.28} $$
The distribution function of this maximum obviously fulfills
$$ P\left(M_n \le x\right) = \left(F(x)\right)^n \tag{11.29} $$
(we shall write simply $F^n(x)$).
1. Generalized Extreme Value Distribution
The generalized extreme value distribution GEV has for EVT a similar meaning as
the normal distribution for CLT (Central Limit Theorem). Its distribution function
has the form
$$ H_\xi(x) = \begin{cases} \exp\left(-(1 + \xi x)^{-1/\xi}\right), & \xi \ne 0, \\ \exp\left(-e^{-x}\right), & \xi = 0, \end{cases} \tag{11.30} $$
where ξ is a real parameter (the so-called shape parameter) and 1 + ξx > 0. Generally,
one can add a parameter of location μ and a positive parameter of variability σ so that
one has the three-parametric GEV with distribution function $H_{\xi,\mu,\sigma}(x) = H_\xi\left((x-\mu)/\sigma\right)$.
Here the so-called Fisher–Tippett Theorem plays the role of the Central Limit Theorem: If
there exist sequences of real constants c_n > 0 and d_n such that
$$ \lim_{n \to \infty} P\left(\frac{M_n - d_n}{c_n} \le x\right) = \lim_{n \to \infty} F^n\left(c_n x + d_n\right) = H(x) \tag{11.31} $$
for a non-degenerate (i.e., not concentrated in a single point) distribution function
H, then this limit function is the distribution function of the generalized extreme value
distribution (11.30).
Remark 11.7 If (11.31) holds, then one says that F is in the so-called maximum domain
of attraction of H and writes F ∈ MDA(H). The parameter ξ of the generalized
extreme value distribution $H_{\xi,\mu,\sigma}$ originating as the limit distribution in (11.31) is
determined unambiguously, while the parameters of location and variability μ and σ
depend on the chosen sequences c_n and d_n. Since for a suitable choice of these constants
one can achieve that the limit distribution is directly H_ξ according to (11.30) (i.e.,
$H_{\xi,0,1}$ with the zero parameter of location and the unit parameter of variability), one
can confine oneself to this one-parametric GEV only.
⋄
Example 11.5 Let independent random variables Yi have the exponential distribution
with distribution function
$$F(x) = 1 - \exp(-\lambda x), \quad x \ge 0 \tag{11.32}$$
(λ is a positive parameter). Then for the choice cn = 1/λ and dn = (ln n)/λ it holds
$$P\left(\frac{M_n - d_n}{c_n} \le x\right) = F^n(c_n x + d_n) = \left(1 - \frac{1}{n}\exp(-x)\right)^n, \quad x \ge -\ln n,$$
and hence
$$\lim_{n\to\infty} P\left(\frac{M_n - d_n}{c_n} \le x\right) = \lim_{n\to\infty} F^n(c_n x + d_n) = \exp\left(-e^{-x}\right), \quad x \in \mathbb{R},$$
i.e., F ∈ MDA(H0) (the distribution corresponding to the distribution function H0 is
called the Gumbel distribution).
For instance, if T1, T2, ... are the periods between mortgage defaults in a bank
credit portfolio with the numbers of defaults modeled as a Poisson process (see
Fig. 2.9), then the random variables T1, T2, ... are iid with exponential distribution
(see Sect. 2.4.1). Hence the maximum length of the periods between successive defaults
has asymptotically the Gumbel distribution (this result can be applied analogously to the
asymptotic normality following from the CLT).
⋄
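The convergence in Example 11.5 can be checked numerically. The following minimal sketch compares the exact distribution function of the normalized exponential maximum with the Gumbel limit on a small grid; the function names and the grid are illustrative assumptions (the parameter λ cancels after normalization and therefore does not appear).

```python
import math

def exp_max_cdf(x, n):
    """Exact P((M_n - d_n)/c_n <= x) for exponential maxima with c_n = 1/lambda, d_n = ln(n)/lambda."""
    if x < -math.log(n):
        return 0.0                      # below the lower bound x >= -ln n
    return (1.0 - math.exp(-x) / n) ** n

def gumbel_cdf(x):
    """Gumbel distribution function H_0(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

grid = [-1.0, 0.0, 1.0, 2.0, 4.0]       # illustrative evaluation points
for n in (10, 100, 10_000):
    gap = max(abs(exp_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)
    print(f"n = {n:6d}   max |F^n - H_0| on the grid = {gap:.6f}")
```

The printed gap shrinks as the block size n grows, which is what the limit relation states.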
The generalized extreme value distribution Hξ (see (11.30)) includes three possible
cases depending on the sign of the parameter ξ:
• For ξ > 0: Fréchet distribution.
• For ξ = 0: Gumbel distribution.
• For ξ < 0: Weibull distribution.
We shall briefly discuss these possibilities (all with zero location parameter
and unit variability parameter) from the point of view of EVT (see also Fig. 11.9):
Fréchet distribution (see (11.30) for ξ > 0 with support (−1/ξ, ∞)) has in its
maximum domain of attraction the distributions with a so-called (right) heavy tail
fulfilling
$$P(Y > x) = 1 - F(x) \sim x^{-1/\xi}, \quad x \to \infty \tag{11.33}$$
(a more exact description would require introducing the concept of a so-called
slowly varying function). Such distributions are referred to as distributions with tail
index 1/ξ and are frequent in financial applications since they possess not only heavy
tails but also infinite higher moments: one can even show that each nonnegative
random variable Y with distribution lying in the maximum domain of attraction of
the Fréchet distribution with tail index 1/ξ (ξ > 0) has E(Y^k) = ∞ for k > 1/ξ. The
maximum domain of attraction of the Fréchet distribution contains some common
distributions, e.g., F-, (generalized) Pareto, t-, and other types of distributions.
Gumbel distribution (see (11.30) for ξ = 0 with support (−∞, ∞)) has in its
maximum domain of attraction distributions with both bounded and unbounded
support. In the case of unbounded support the tails can decrease quickly (see,
e.g., the exponential distribution) or slowly (see, e.g., the log-normal distribution), so that
the maximum domain of attraction of the Gumbel distribution can also be useful for
financial applications. Each nonnegative random variable Y with distribution lying in
the maximum domain of attraction of the Gumbel distribution has all moments finite,
i.e., E(Y^k) < ∞ for k > 0. The maximum domain of attraction of the Gumbel distribution
contains, e.g., exponential, gamma, chi-square, log-normal, normal, and other distributions.
Fig. 11.9 Probability density (upper figure) and distribution function (bottom figure) of Fréchet
distribution (ξ = 0.5), Gumbel distribution (ξ = 0), and Weibull distribution (ξ = −0.5) (see
(11.30))
Weibull distribution (see (11.30) for ξ < 0 with support (−∞, −1/ξ)) contains in
its maximum domain of attraction the distributions that are mostly uninteresting in
financial applications since their support is bounded from the right-hand side:
$$x_F = \sup\{x : F(x) < 1\} < \infty \tag{11.34}$$
(see Fig. 11.9, where the support of the Weibull distribution with ξ = −0.5 is bounded
from the right-hand side by the point xF = −1/ξ = 2). The maximum domain of
attraction of the Weibull distribution contains, e.g., the beta and uniform distributions.
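The three cases can be evaluated with one short function. This is a minimal sketch of the standardized distribution function (11.30) with zero location and unit variability; the function name and the convention of returning 0 or 1 outside the support are assumptions made for illustration.

```python
import math

def gev_cdf(x, xi):
    """Standardized GEV distribution function H_xi(x) according to (11.30)."""
    if xi == 0.0:
        return math.exp(-math.exp(-x))            # Gumbel case
    t = 1.0 + xi * x
    if t <= 0.0:
        # outside the support: left of -1/xi (Frechet) or right of -1/xi (Weibull)
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

for name, xi in (("Frechet", 0.5), ("Gumbel", 0.0), ("Weibull", -0.5)):
    values = ", ".join(f"{gev_cdf(x, xi):.4f}" for x in (-2.0, 0.0, 2.0, 6.0))
    print(f"{name:8s} xi = {xi:+.1f}: H(-2), H(0), H(2), H(6) = {values}")
```

For ξ = −0.5 the function reaches the value 1 at x = 2, in line with the right endpoint xF = −1/ξ = 2, while for ξ = 0.5 it is zero up to the left endpoint −1/ξ = −2.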
2. Block Minima
The previous results can be easily extended to the case of block minima, where
instead of (11.28) one investigates the behavior of min(Y1, ..., Yn) = −max(−Y1,
..., −Yn). For instance, the limit relation (11.31) implies
$$\lim_{n\to\infty} P\left(\frac{\min(Y_1, \ldots, Y_n) - b_n}{a_n} \le x\right) = 1 - \lim_{n\to\infty} P\left(\frac{\max(-Y_1, \ldots, -Y_n) + b_n}{a_n} \le -x\right) = 1 - H(-x) \tag{11.35}$$
for a suitable choice of the norming constants an and bn according to the Fisher–Tippett
Theorem.
3. Block Maxima in Time Series
The assumption of independence of the random variables Y1, ..., Yn can be rather
restrictive in practice, particularly if one deals with extremes in financial time series
with mutually correlated observations. It turns out that for models of financial time
series of the GARCH type (but also for so-called strictly stationary processes, whose
probability distribution remains stable in time; see Sect. 6.1), one can apply
methods similar to those for the analysis of extremes in sequences of independent random
variables (see Embrechts et al. (1997)).
Let us assume that a random process Y1, ..., Yn observed as a time series y1, ..., yn
can be modeled by GARCH(r, s) according to (8.55) written in the form with zero
mean value
$$y_t = \sigma_t \varepsilon_t, \quad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i y_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2, \tag{11.36}$$
where {εt} are iid random variables with zero mean value and unit variance, and the
parameters of the model fulfill
$$\alpha_0 > 0, \quad \alpha_i \ge 0, \quad \beta_j \ge 0, \quad \alpha_1 + \cdots + \alpha_r + \beta_1 + \cdots + \beta_s < 1. \tag{11.37}$$
Then the Fisher–Tippett Theorem (see above) can be reformulated to a form in which
the so-called extreme index θ (0 < θ < 1) appears; see, e.g., Table 11.9 for the model ARCH
(1) from (8.33). Its existence is guaranteed for each GARCH process. Then instead
of (11.31) it holds
$$\lim_{n\to\infty} P\left(\frac{M_n - d_n}{c_n} \le x\right) = \lim_{n\to\infty} F^{n\theta}(c_n x + d_n) = H^{\theta}(x) \tag{11.38}$$
(the extreme index should not be confused with the tail index 1/ξ, see above).
Therefore, instead of n dependent observations, it is possible to investigate the
maximum of nθ independent observations with the same distribution function (if the
number n of observations is sufficiently high): one may imagine that nθ is the number of
mutually independent clusters in the sequence of n dependent observations. From the
practical point of view, it means that for higher n one can approximate the distribution
of the maximum of a GARCH process by
$$P(M_n \le z) \approx H^{\theta}\left(\frac{z - d_n}{c_n}\right). \tag{11.39}$$
Table 11.9 Extreme index θ for selected values of parameter α1 in the model ARCH(1)
α1    0.1     0.3     0.5     0.7     0.9
θ     0.999   0.939   0.835   0.721   0.612
The processes with extreme index θ = 1 (e.g., ARMA processes with normally
distributed white noise) do not show a tendency to cluster high values, and their
extremes behave as in the case of independent random variables.
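The effect of the extreme index can be illustrated by a few lines of arithmetic. The sketch below uses the values of θ from Table 11.9 and the pre-limit form F^{nθ} of (11.38); the block size n = 260 and the value F(z) = 0.999 of the marginal distribution function at a high level z are hypothetical numbers chosen only for illustration.

```python
# Extreme index theta of ARCH(1) for selected alpha_1, copied from Table 11.9
THETA_ARCH1 = {0.1: 0.999, 0.3: 0.939, 0.5: 0.835, 0.7: 0.721, 0.9: 0.612}

def max_cdf_with_clustering(f_z, n, theta):
    """Approximate P(M_n <= z) by F(z)**(n*theta), i.e., by n*theta independent observations."""
    return f_z ** (n * theta)

n = 260        # assumed block size (about one year of trading days)
f_z = 0.999    # hypothetical value F(z) of the marginal distribution function

for alpha1, theta in THETA_ARCH1.items():
    p = max_cdf_with_clustering(f_z, n, theta)
    print(f"alpha1 = {alpha1}: n*theta = {n * theta:6.1f}, P(M_n <= z) ~ {p:.4f}")
print(f"theta = 1 (no clustering): P(M_n <= z) ~ {max_cdf_with_clustering(f_z, n, 1.0):.4f}")
```

The stronger the clustering (smaller θ), the fewer effectively independent observations remain and the higher the probability that the block maximum stays below the level z.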
4. Statistical Analysis of Block Maxima
The statistical analysis of block maxima requires a data sample of observed maxima.
Therefore, we usually apply the design where the data y1, y2, ... are divided into m blocks
of size n and the maxima in particular blocks are denoted as mn1, ..., mnm. These are, e.g.,
yearly maxima of daily log returns of a stock index (the blocks being calendar years). If the
data are generated from the same distribution with a common distribution function F and are
mutually independent (or possibly of the GARCH type), then according to the theory
described above it suffices (for higher n) to approximate the distribution of block
maxima by the three-parametric distribution Hξ,μ,σ(x) = Hξ((x − μ)/σ) (see its
standardized form Hξ in (11.30)). If hξ,μ,σ denotes the corresponding probability density,
then the unknown parameters ξ, μ, and σ identifying the distribution of block maxima
can be estimated using the maximum likelihood method by maximizing over these
parameters the log likelihood function of the form
$$l(\xi, \mu, \sigma; m_{n1}, \ldots, m_{nm}) = \sum_{i=1}^{m} \ln h_{\xi,\mu,\sigma}(m_{ni}) = -m \ln \sigma - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{m} \ln\left(1 + \xi\, \frac{m_{ni} - \mu}{\sigma}\right) - \sum_{i=1}^{m} \left(1 + \xi\, \frac{m_{ni} - \mu}{\sigma}\right)^{-1/\xi} \tag{11.40}$$
under the conditions σ > 0 and 1 + ξ(mni − μ)/σ > 0 for all i. These estimates have the
convenient properties of maximum likelihood estimates even though the range of
their feasible values may depend on the observed data (in the case of ξ > −0.5).
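The log likelihood (11.40) is straightforward to evaluate. The following is a minimal sketch for ξ ≠ 0, not a full estimation routine: the function name, the handling of infeasible parameter values by returning −∞, and the sample of block maxima are illustrative assumptions. In practice the function would be passed to a numerical optimizer.

```python
import math

def gev_loglik(xi, mu, sigma, maxima):
    """Log likelihood (11.40) of the three-parametric GEV for xi != 0."""
    if sigma <= 0.0 or xi == 0.0:
        raise ValueError("this sketch covers sigma > 0 and xi != 0 only")
    total = -len(maxima) * math.log(sigma)
    for m in maxima:
        t = 1.0 + xi * (m - mu) / sigma
        if t <= 0.0:
            return -math.inf          # condition 1 + xi*(m - mu)/sigma > 0 is violated
        total -= (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return total

maxima = [2.1, 1.6, 3.4, 2.8, 1.9, 5.2, 2.4]   # hypothetical block maxima
print(gev_loglik(0.27, 2.04, 0.72, maxima))
```

Returning −∞ for infeasible parameter values reflects the fact that the feasible range of the parameters depends on the observed maxima.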
Remark 11.8 There is a trade-off between the number and the size of
blocks. From the point of view of estimation, it is convenient if the number of blocks
m (i.e., the number of observations used to estimate the parameters) is higher. On the
other hand, it means that for a fixed total number of observations one must
reduce the size of blocks n (or nθ in the case of dependent observations), which
worsens the approximation of the distribution of maxima by the limit distribution GEV.
⋄
In practice, the previous results are often used to find the return level and the return
period (see, e.g., McNeil et al. (2005)):
• Return level relates to the problem of how to find the size of an extreme event that
will occur on average with a frequency given a priori. A more exact formulation is
as follows: if H is the distribution function of block maxima in blocks of size n,
then for a given k the corresponding return level is defined as
$$r_{n,k} = q_{1-1/k}(H), \tag{11.41}$$
where qu(H) = inf{x: H(x) ≥ u} = H^{−1}(u) is the u-quantile of the distribution H. The
value rn,k may be interpreted as the level that is exceeded by the
block maximum on average just once in each k-tuple of blocks of size n: e.g., r260,10
is the level that is exceeded by the yearly maximum of daily log returns of a
given stock index on average just once in 10 years (260 is an average number of
trading days in one calendar year). After substituting the estimated distribution
function H, one obtains the estimated return level in the form
$$\hat{r}_{n,k} = \hat{\mu} + \frac{\hat{\sigma}}{\hat{\xi}} \left( \left(-\ln(1 - 1/k)\right)^{-\hat{\xi}} - 1 \right). \tag{11.42}$$
• Return period relates to the problem of how to find the average frequency of
occurrence of an extreme event over a given level. A more exact formulation is as
follows: if H is the distribution function of block maxima in blocks of size n, then
for a given u the corresponding return period is defined as
$$k_{n,u} = \frac{1}{1 - H(u)}. \tag{11.43}$$
The value kn,u may be interpreted in such a way that in each kn,u-tuple
of blocks of size n we can on average expect just one block in
which the level u is exceeded; e.g., k260,u is the number of years in which
the yearly maximum of daily log returns exceeds the level u on average just once. After
substituting the estimated distribution function H, one obtains the estimated
return period in the form
$$\hat{k}_{n,u} = \frac{1}{1 - H_{\hat{\xi},\hat{\mu},\hat{\sigma}}(u)}. \tag{11.44}$$
Example 11.6 McNeil et al. (2005) apply the theory of block extremes to the time
series of daily drops (in percent) of the stock index S&P 500 (this index is used globally
as a barometer of stock markets) for the period from 1960 to Friday, October 16, 1987,
when the index dropped by 5.25% during one day (as a forerunner of Black
Monday, October 19, 1987, with the catastrophic fall of this index by 20.5%).
The yearly and semiannual block maxima of daily drops
(recorded in absolute values) were analyzed, i.e., 28 and 56 observed values of block maxima,
respectively. First, the maximum likelihood estimates according to
(11.40) were constructed:
• For yearly block maxima: very unstable estimates ξ̂ = 0.27, μ̂ = 2.04, and
σ̂ = 0.72 with high standard deviations 0.21, 0.16, and 0.14 (the limit
Fréchet distribution shows a very heavy right tail and an infinite fourth moment,
since 4 > 1/0.27).
• For semiannual block maxima: more stable estimates ξ̂ = 0.36, μ̂ = 1.65, and
σ̂ = 0.54 with more reasonable standard deviations 0.15, 0.09, and 0.08.
Further, the return levels were estimated according to (11.42), namely
• Ten-year return level: r̂260,10 = 4.3% with estimated 95% confidence interval
(3.4%; 7.1%).
• Twenty-semester return level (blocks of size 130, i.e., again a horizon of ten years):
r̂130,20 = 4.5% with estimated 95% confidence interval (3.5%; 7.4%).
The drop by 20.5% during Black Monday, October 19, 1987 (i.e., just at the
beginning of the future period from the point of view of the performed analysis) lay
far outside the previous confidence intervals for both return levels.
⋄
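The point estimates of the return levels in Example 11.6 can be reproduced directly from (11.42) and the quoted parameter estimates. The following sketch is illustrative only: the function names are assumptions, it covers the case ξ ≠ 0, and it does not attempt to reproduce the confidence intervals.

```python
import math

def gev_cdf(x, xi, mu, sigma):
    """Three-parametric GEV distribution function H_{xi,mu,sigma}(x) for xi != 0."""
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def return_level(k, xi, mu, sigma):
    """Estimated return level (11.42): the (1 - 1/k)-quantile of the fitted GEV."""
    return mu + sigma / xi * ((-math.log(1.0 - 1.0 / k)) ** (-xi) - 1.0)

def return_period(u, xi, mu, sigma):
    """Estimated return period (11.44), measured in numbers of blocks."""
    return 1.0 / (1.0 - gev_cdf(u, xi, mu, sigma))

yearly = dict(xi=0.27, mu=2.04, sigma=0.72)        # estimates for yearly block maxima
semiannual = dict(xi=0.36, mu=1.65, sigma=0.54)    # estimates for semiannual block maxima

print(round(return_level(10, **yearly), 1))        # 4.3
print(round(return_level(20, **semiannual), 1))    # 4.5
print(return_period(return_level(10, **yearly), **yearly))   # approximately 10 blocks
```

The last line illustrates that (11.42) and (11.44) are inverse to each other: the return period of the k-block return level is again k.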
11.3.2 Threshold Excesses
This approach explores the observations exceeding a given level (or threshold). Its
main advantage is that it does not "waste" data as does the method of
block maxima from Sect. 11.3.1, which exploits only the maxima of (large) blocks and
throws away the remaining information. The data handled in this method are
extreme in the sense that they exceed a given (usually high) level, so that they may be
referred to as excesses (e.g., in the reinsurance of commercial insurance
companies one can confine oneself to those parts of losses that lie in the layer
starting at a designated level).
1. Generalized Pareto Distribution

The generalized Pareto distribution (GPD) plays for excesses a similar role as the
generalized extreme value distribution (GEV) for block maxima (recall that the GPD
itself belongs to the maximum domain of attraction of GEV; see Sect. 11.3.1). The
distribution function of the GPD can be parameterized as
$$G_{\xi,\beta}(x) = \begin{cases} 1 - (1 + \xi x/\beta)^{-1/\xi}, & \xi \neq 0, \\ 1 - \exp(-x/\beta), & \xi = 0, \end{cases} \tag{11.45}$$
where β and ξ are real scale and shape parameters (β > 0). One has x ≥ 0 for ξ ≥ 0,
while 0 ≤ x ≤ −β/ξ for ξ < 0. The mean value of the GPD is equal to β/(1 − ξ) for ξ < 1,
and E(Y^k) = ∞ for k ≥ 1/ξ (ξ > 0), so that, e.g., the variance of the GPD is infinite for
ξ = 0.5 as a consequence of heavy tails. Similarly to GEV, the generalized Pareto distribution
includes three possible cases depending on the sign of the parameter ξ (see
Fig. 11.10):
• For ξ > 0: Pareto distribution.
• For ξ = 0: exponential distribution.
• For ξ < 0: Pareto type II distribution (it has the bounded support (0, −β/ξ)).
2. Distribution of Excesses
Let Y be a random variable with distribution function F. Then the excess Y − u over a
given level (or threshold) u has the so-called excess distribution function Fu(x) of the
form
$$F_u(x) = P(Y - u \le x \mid Y > u) = \frac{F(x + u) - F(u)}{1 - F(u)} \tag{11.46}$$
for 0 ≤ x < xF − u, where xF is the right endpoint of the support of F (mostly xF = ∞).
Moreover, one defines the mean excess function e(u) as
$$e(u) = E(Y - u \mid Y > u). \tag{11.47}$$
Example 11.7 Let us assume in addition that Y has the generalized Pareto distribution
(11.45). Then according to (11.46) it holds
$$F_u(x) = 1 - \left(1 + \xi x/(\beta + \xi u)\right)^{-1/\xi} = G_{\xi,\beta+\xi u}(x), \tag{11.48}$$
where x ≥ 0 for ξ ≥ 0, while 0 ≤ x ≤ −β/ξ − u for ξ < 0. It means that the excess
distribution remains of the GPD type. Particularly for the exponential distribution
F(x) = G0,β(x), one has Fu(x) = G0,β(x) = F(x), which confirms the characteristic
"loss of memory" of the exponential distribution (i.e., the excess distribution of an
exponential distribution remains the identical exponential distribution regardless
of the size of the level u).
Fig. 11.10 Probability density (upper figure) and distribution function (bottom figure) of Pareto
distribution (ξ = 0.5), exponential distribution (ξ = 0), and Pareto type II distribution (ξ = −0.5)
(see (11.45) with β = 1)
Furthermore, the mean excess function of the generalized Pareto distribution has
the form
$$e(u) = \frac{\beta + \xi u}{1 - \xi}, \tag{11.49}$$
where u ≥ 0 for 0 ≤ ξ < 1, while 0 ≤ u ≤ −β/ξ for ξ < 0. Hence the mean excess of the
GPD is a linear function of u, which is a useful property in some related statistical
procedures (e.g., for the identification of the GPD).
⋄
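The stability property (11.48) and the mean excess function (11.49) can be verified numerically. The sketch below is a minimal illustration under assumed parameter values (ξ = 0.5, β = 1, u = 2); the function names are hypothetical.

```python
import math

def gpd_cdf(x, xi, beta):
    """GPD distribution function G_{xi,beta}(x) according to (11.45), for x >= 0."""
    if xi == 0.0:
        return 1.0 - math.exp(-x / beta)
    t = 1.0 + xi * x / beta
    if t <= 0.0:                       # beyond the right endpoint -beta/xi when xi < 0
        return 1.0
    return 1.0 - t ** (-1.0 / xi)

def excess_cdf(x, u, cdf):
    """Excess distribution function F_u(x) according to (11.46) for a given cdf F."""
    return (cdf(x + u) - cdf(u)) / (1.0 - cdf(u))

def gpd_mean_excess(u, xi, beta):
    """Mean excess function (11.49) of the GPD, defined for xi < 1."""
    return (beta + xi * u) / (1.0 - xi)

xi, beta, u = 0.5, 1.0, 2.0            # hypothetical parameter values
for x in (0.5, 1.0, 3.0):
    lhs = excess_cdf(x, u, lambda y: gpd_cdf(y, xi, beta))
    rhs = gpd_cdf(x, xi, beta + xi * u)
    print(f"x = {x}: F_u(x) = {lhs:.6f}, G_(xi, beta + xi*u)(x) = {rhs:.6f}")
print(gpd_mean_excess(u, xi, beta))    # (1 + 0.5 * 2) / (1 - 0.5) = 4.0
```

Both columns agree, i.e., the excesses of a GPD over the level u are again GPD with the same shape ξ and the scale β + ξu.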
The generalized Pareto distribution plays an important role for modeling the
excesses not only in the sense of Example 11.7 but also as their limit distribution.
Namely, the following Balkema–de Haan Theorem holds as an analogy to the Fisher–
Tippett Theorem of limiting GEV distribution for block maxima (see Sect. 11.3.1): if
there exist real constants au and bu such that Fu(au x + bu) has a continuous limiting
distribution function for u → xF (xF is the right endpoint of the support of F including the
possibility of xF = ∞; see (11.34)), then
$$\lim_{u \to x_F} \left| F_u(x) - G_{\xi,\beta(u)}(x) \right| = 0 \tag{11.50}$$
for a suitable parameter ξ and a function β(u) (obviously in the situation of Example
11.7 it will be β(u) = β + ξu for another parameter β). In other words, the excess
distribution can be approximated for higher levels u by the GPD.
3. Statistical Analysis of Excesses
The statistical analysis is usually based on a sample of excesses x1, x2, ..., xNu
of those observations in the original sample y1, y2, ..., yn of iid observations with a
distribution function F that exceeded a given level u. If we accept the approximation by the
generalized Pareto distribution (11.45) with probability density gξ,β, then the unknown
parameters ξ and β can be estimated by maximizing over these parameters the log
likelihood function of the form
$$l(\xi, \beta; x_1, \ldots, x_{N_u}) = \sum_{j=1}^{N_u} \ln g_{\xi,\beta}(x_j) = -N_u \ln \beta - \left(1 + \frac{1}{\xi}\right) \sum_{j=1}^{N_u} \ln\left(1 + \xi\, \frac{x_j}{\beta}\right) \tag{11.51}$$
under the conditions β > 0 and 1 + ξxj/β > 0 for all j. Similarly to the case of block
maxima, it is again possible to generalize this procedure by means of the extreme index
to time series of correlated observations (e.g., for the GARCH models; see
Embrechts et al. (1997)).
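The maximization of (11.51) can be sketched as follows. The function evaluates the log likelihood (with the exponential limit for ξ = 0) and a crude grid search stands in for a numerical optimizer; the function names, the grid, and the sample of excesses are hypothetical and serve only as an illustration.

```python
import math

def gpd_loglik(xi, beta, excesses):
    """Log likelihood (11.51) of the GPD; xi = 0 is treated as the exponential limit."""
    if beta <= 0.0:
        return -math.inf
    n_u = len(excesses)
    if xi == 0.0:
        return -n_u * math.log(beta) - sum(excesses) / beta
    total = -n_u * math.log(beta)
    for x in excesses:
        t = 1.0 + xi * x / beta
        if t <= 0.0:
            return -math.inf           # condition 1 + xi*x/beta > 0 is violated
        total -= (1.0 + 1.0 / xi) * math.log(t)
    return total

def grid_mle(excesses, xis, betas):
    """Crude maximizer of (11.51) over a finite grid of (xi, beta) pairs."""
    return max(((xi, beta) for xi in xis for beta in betas),
               key=lambda p: gpd_loglik(p[0], p[1], excesses))

excesses = [0.3, 1.2, 0.7, 4.5, 2.2, 0.1, 9.8, 1.6, 0.9, 3.1]   # hypothetical excesses over u
xis = [i / 20 for i in range(-10, 31)]       # -0.50, -0.45, ..., 1.50
betas = [j / 10 for j in range(1, 101)]      # 0.1, 0.2, ..., 10.0
xi_hat, beta_hat = grid_mle(excesses, xis, betas)
print(xi_hat, beta_hat, gpd_loglik(xi_hat, beta_hat, excesses))
```

A grid search is used here only because it needs no external library; any general-purpose optimizer applied to the negative log likelihood would serve the same purpose.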
Moreover, the model constructed for a given level u can be transformed to models
with excesses over any higher level v ≥ u since one can easily show that it holds
$$F_v(x) = G_{\xi,\beta+\xi(v-u)}(x) \tag{11.52}$$
and similarly
$$e(v) = \frac{\beta + \xi(v - u)}{1 - \xi} = \frac{\xi v}{1 - \xi} + \frac{\beta - \xi u}{1 - \xi}, \tag{11.53}$$
where u ≤ v < ∞ for 0 ≤ ξ < 1, while u ≤ v ≤ u − β/ξ for ξ < 0. The linearity of the
mean excess function (11.53) in the argument v (for fixed u) is helpful when looking for
such a level u that the excesses over u can be modeled by means of the GPD (see
Example 11.8).
For loss observations y1, y2, ..., yn, the following sample mean excess can
be used as a statistical estimate of the mean excess over the level u:
$$e_n(u) = \frac{\sum_{i=1}^{n} (y_i - u)\, I[y_i > u]}{\sum_{i=1}^{n} I[y_i > u]}. \tag{11.54}$$
The sample mean excesses are often used to identify graphically the GPD of excesses in
real data. The data y1, y2, ..., yn are ordered by their size in ascending order to the form
y(1) ≤ y(2) ≤ ... ≤ y(n) (so-called order statistics) and then plotted in a plane graph
as points with coordinates (y(i), en(y(i))) for i = 1, ..., n − 1 (the sample mean excess
over the largest observation y(n) is not defined), where en(∙) is the sample
mean excess according to (11.54). If these points lie approximately on a line starting
with some order statistic, then according to (11.53) the approximation of the excess
distribution by the generalized Pareto distribution is appropriate starting with the
level corresponding to this order statistic. Moreover, in such a case the slope of the
identified line, which equals ξ/(1 − ξ), determines the parameter ξ of the given GPD (see (11.53)).
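The points of the sample mean excess plot can be computed in a few lines. This is a minimal sketch of (11.54) with hypothetical loss data; the function names are assumptions, and the quadratic running time is acceptable only for small samples.

```python
def sample_mean_excess(data, u):
    """Sample mean excess e_n(u) according to (11.54); None if no observation exceeds u."""
    excesses = [y - u for y in data if y > u]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)

def mean_excess_points(data):
    """Points (y_(i), e_n(y_(i))) of the sample mean excess plot for the ordered data."""
    points = []
    for y in sorted(data):
        e = sample_mean_excess(data, y)
        if e is not None:              # the largest observation has no excess over itself
            points.append((y, e))
    return points

losses = [1.7, 2.1, 4.6, 8.7, 7.9, 2.2, 1.5, 11.4, 26.2, 14.1]   # hypothetical losses
for y, e in mean_excess_points(losses):
    print(f"u = {y:5.1f}   e_n(u) = {e:6.3f}")
```

Plotting the second coordinate against the first and looking for an approximately linear stretch is then the graphical identification described above.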
Example 11.8 In the context of excess modeling, the example of Danish fire
insurance data is well known (see, e.g., McNeil et al. (2005)): Table 11.10 and
Fig. 11.11 present the time series of losses over 1 million DKK (Danish crowns) caused
by fires in Denmark in the period 1980–1990.
Figure 11.12 shows the sample mean excess (11.54) as a function of the level u (more
exactly, the graph plots points with coordinates (y(i), en(y(i))); see the discussion
above). If one ignores the points with high levels u, where the sample estimates of
mean excesses (11.54) are unreliable due to the small number of data, then in the graph
starting approximately with the level of 10 million DKK one can identify an
increasing line. Therefore, starting with this level it is possible to approximate the
excess distribution by the generalized Pareto distribution with a positive parameter ξ,
i.e., by the classical ("non-generalized") Pareto distribution. The maximum likelihood
estimates of the parameters maximizing the log likelihood (11.51) are then β̂ = 7.0
with standard deviation 1.1 and ξ̂ = 0.50 with standard deviation 0.14. The GPD for
various levels u can be obtained by means of the simple transformation (11.48).
⋄
Table 11.10 Losses over 1 million DKK caused by fires in Denmark in Example 11.8 (in millions of DKK)
Date  Losses over 1 million DKK  ⋮  Date  Losses over 1 million DKK  ⋮  Date  Losses over 1 million DKK
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
01/03/1980 1.683748 ⋮ 12/02/1984 1.256545 ⋮ 11/27/1990 1.134488
01/04/1980 2.093704 ⋮ 12/03/1984 1.103048 ⋮ 11/29/1990 3.407591
01/05/1980 1.732581 ⋮ 12/03/1984 1.204188 ⋮ 11/29/1990 1.072607
01/07/1980 1.779754 ⋮ 12/08/1984 1.151832 ⋮ 11/30/1990 1.167492
01/07/1980 4.612006 ⋮ 12/08/1984 1.884817 ⋮ 11/30/1990 1.072607
01/10/1980 8.725274 ⋮ 12/10/1984 7.539267 ⋮ 12/04/1990 1.270627
01/10/1980 7.898975 ⋮ 12/11/1984 1.099476 ⋮ 12/05/1990 1.472772
01/16/1980 2.208045 ⋮ 12/12/1984 1.570681 ⋮ 12/06/1990 1.036304
01/16/1980 1.486091 ⋮ 12/13/1984 2.670157 ⋮ 12/07/1990 1.650165
01/19/1980 2.796171 ⋮ 12/17/1984 1.151832 ⋮ 12/08/1990 1.678218
01/21/1980 7.320644 ⋮ 12/19/1984 3.874346 ⋮ 12/09/1990 2.640264
01/21/1980 3.367496 ⋮ 12/22/1984 5.026178 ⋮ 12/09/1990 1.601485
01/24/1980 1.464129 ⋮ 12/28/1984 1.780105 ⋮ 12/10/1990 17.739274
01/25/1980 1.722223 ⋮ 12/29/1984 4.764398 ⋮ 12/14/1990 4.372937
01/26/1980 11.374817 ⋮ 12/31/1984 1.151832 ⋮ 12/15/1990 1.361386
01/26/1980 2.482739 ⋮ 01/01/1985 1.500000 ⋮ 12/16/1990 1.183993
01/28/1980 26.214641 ⋮ 01/03/1985 1.251000 ⋮ 12/17/1990 2.970297
02/03/1980 2.002430 ⋮ 01/04/1985 1.030000 ⋮ 12/19/1990 1.023102
02/05/1980 4.530015 ⋮ 01/05/1985 1.050000 ⋮ 12/20/1990 1.130363
02/07/1980 1.841753 ⋮ 01/05/1985 1.900000 ⋮ 12/21/1990 3.011551
02/10/1980 3.806735 ⋮ 01/05/1985 1.100000 ⋮ 12/21/1990 1.402640
02/13/1980 14.122076 ⋮ 01/06/1985 1.881750 ⋮ 12/22/1990 2.322607
02/16/1980 5.424253 ⋮ 01/07/1985 1.007000 ⋮ 12/23/1990 1.115512
02/19/1980 11.713031 ⋮ 01/07/1985 1.630000 ⋮ 12/23/1990 1.691419
02/20/1980 1.515373 ⋮ 01/07/1985 1.025000 ⋮ 12/24/1990 1.237624
02/21/1980 2.538589 ⋮ 01/08/1985 1.007274 ⋮ 12/27/1990 1.114686
02/22/1980 2.049780 ⋮ 01/08/1985 3.500000 ⋮ 12/30/1990 1.402640
02/23/1980 12.465593 ⋮ 01/10/1985 2.900000 ⋮ 12/30/1990 4.867987
02/25/1980 1.735445 ⋮ 01/11/1985 2.463137 ⋮ 12/30/1990 1.072607
02/27/1980 1.683748 ⋮ 01/11/1985 4.625000 ⋮ 12/31/1990 4.125413
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
Source: Copenhagen Reinsurance
Fig. 11.11 Losses over 1 million DKK caused by fires in Denmark in Example 11.8 (in millions
of DKK, years 1980–1990). Source: Copenhagen Reinsurance
Fig. 11.12 Sample mean excesses e(u) as a function of level u for losses over 1 million DKK caused by
fires in Denmark in Example 11.8. Source: calculated by EViews
11.4 Exercises
Exercise 11.1 Repeat the calculation of VaR from Examples 11.1–11.4 (calculation
of VaR for daily losses in the given investment portfolio in the year 2013), but only for
the last month of that year (December 2013).
Part V
Multivariate Time Series
Chapter 12
Methods for Multivariate Time Series
12.1 Generalization of Methods for Univariate Time Series
Most procedures for univariate time series from previous chapters can be generalized
to multivariate time series, where instead of scalar values yt we observe m-variate
vector values yt = (y1t, ..., ymt)′ in time as realizations of a vector random process
(see Sect. 2.1). The transfer from the univariate to the multivariate dimension mostly means
only higher formal and numerical complexity of the methods described in previous parts
of this text (decomposition methods, methods for linear and nonlinear processes, and
the like), which will be demonstrated briefly in this section by means of examples of
stationary multivariate time series. Later we shall see that such a parallel description
of several scalar processes brings to the analysis further elements that have an
exclusively multivariate character (examples are the routine VAR methodology for
multivariate time series, the cointegration among particular univariate components,
and others).
(Weak) stationarity of a multivariate time series {yt} means again that the first and
second moments of the corresponding process are invariant to time shifts, i.e.,
$$E(y_t) = \mu = \text{const}, \tag{12.1}$$
$$\operatorname{cov}(y_s, y_t) = E(y_s - \mu)(y_t - \mu)' = \operatorname{cov}(y_{s+h}, y_{t+h}) \quad \text{for arbitrary } h, \tag{12.2}$$
i.e., particularly also
$$\operatorname{var}(y_t) = \Sigma_{yy} = \text{const}. \tag{12.3}$$
For stationary multivariate time series, one can define analogously to the
univariate case the (matrix) autocovariance function

https://doi.org/10.1007/978-3-030-46347-2_12
$$\Gamma_k = \operatorname{cov}(y_t, y_{t-k}) = E(y_t - \mu)(y_{t-k} - \mu)', \quad k = \ldots, -1, 0, 1, \ldots \tag{12.4}$$
and the (matrix) autocorrelation function
$$\rho_k = D^{-1/2}\, \Gamma_k\, D^{-1/2}, \quad k = \ldots, -1, 0, 1, \ldots, \tag{12.5}$$
where D = diag{Σyy} = diag{var(y1t), ..., var(ymt)}. The element γij(k) in position
(i, j) of the matrix Γk = (γij(k)) is called the mutual covariance function of the time series
{yit} and {yjt}. Quite analogously, the term mutual correlation function ρij(k) is used.
Apparently it holds
$$\Gamma_k = \Gamma_{-k}', \quad \rho_k = \rho_{-k}', \tag{12.6}$$
i.e., γij(k) = γji(−k) and ρij(k) = ρji(−k). Then the estimated (matrix) autocovariance
function (estimated by means of y1, ..., yn) is simply
$$C_k = \frac{1}{n} \sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})', \quad k = 0, 1, \ldots, n-1 \tag{12.7}$$
and the estimated (matrix) autocorrelation function is
$$R_k = \hat{D}^{-1/2}\, C_k\, \hat{D}^{-1/2}, \quad k = 0, 1, \ldots, n-1, \tag{12.8}$$
where the diagonal matrix D̂ has the main diagonal identical with the main diagonal
of C0.
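The estimates (12.7) and (12.8) can be computed directly from their definitions. The following minimal sketch uses plain lists instead of a matrix library; the function names and the bivariate series (whose second component repeats the first one with a delay of one step) are hypothetical. Element (i, j) of the returned matrix refers to the pair yit and yj,t−k.

```python
import math

def autocov_matrix(series, k):
    """Estimated autocovariance matrix C_k according to (12.7); series is a list of m-variate observations."""
    n, m = len(series), len(series[0])
    mean = [sum(obs[i] for obs in series) / n for i in range(m)]
    c = [[0.0] * m for _ in range(m)]
    for t in range(k, n):              # corresponds to t = k+1, ..., n in the 1-based notation
        for i in range(m):
            for j in range(m):
                c[i][j] += (series[t][i] - mean[i]) * (series[t - k][j] - mean[j])
    return [[c[i][j] / n for j in range(m)] for i in range(m)]

def autocorr_matrix(series, k):
    """Estimated autocorrelation matrix R_k according to (12.8)."""
    c0 = autocov_matrix(series, 0)
    ck = autocov_matrix(series, k)
    m = len(c0)
    return [[ck[i][j] / math.sqrt(c0[i][i] * c0[j][j]) for j in range(m)] for i in range(m)]

# hypothetical bivariate series: the second component lags the first one by one step
x = [0.3, -0.1, 0.4, 0.2, -0.5, 0.1, 0.6, -0.2, 0.0, 0.3]
series = [(x[t], x[t - 1]) for t in range(1, len(x))]
for k in range(3):
    print(k, [[round(r, 3) for r in row] for row in autocorr_matrix(series, k)])
```

In this constructed series the element in position (2, 1) of R1 is close to one, since the second component at time t equals the first component at time t − 1.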
Remark 12.1 Let us consider components {yit} and {yjt} of a multivariate time series
{yt}. Then the mutual correlation function ρij(k) describes the (linear) dependence
between the series {yit} and {yjt} in time, e.g.,
• If ρij(k) = ρji(k) = 0 for all k ≥ 0, then {yit} and {yjt} are mutually uncorrelated
(i.e., there exists no stochastic linear dependence between them).
• If ρij(0) = 0, then {yit} and {yjt} are simultaneously uncorrelated (in the opposite
case they are simultaneously correlated).
• If ρij(k) = ρji(k) = 0 for all k > 0, then {yit} and {yjt} are uncoupled.
• If ρij(k) = 0 for all k > 0, but ρji(l) ≠ 0 for some l > 0, then there exists a
unidirectional dependency relationship of {yjt} on {yit} (i.e., yit depends on no
past value of yjt, but yjt depends on some past value of yit).
• If ρij(k) ≠ 0 for some k > 0 and ρji(l) ≠ 0 for some l > 0, then there exists a
feedback between {yit} and {yjt}.
⋄
Example 12.1 Table 12.1 and Fig. 12.1 present the first differences of monthly
yields to maturity YTM for 3-month T-bills (so-called short-term interest rates
Table 12.1 Monthly data in Example 12.1 (the first differences of monthly yields to maturity for
3-month T-bills and corporate bonds AAA in USA in % p.a.); see also Table 12.16
Month DTB3 DAAA Month DTB3 DAAA Month DTB3 DAAA
1985M01 0.40 0.05 1988M05 0.35 0.23 1991M09 0.14 0.14
1985M02 0.46 0.05 1988M06 0.23 0.04 1991M10 0.22 0.06
1985M03 0.35 0.43 1988M07 0.23 0.10 1991M11 0.43 0.07
1985M04 0.57 0.33 1988M08 0.29 0.15 1991M12 0.48 0.17
1985M05 0.44 0.51 1988M09 0.21 0.29 1992M01 0.28 0.11
1985M06 0.55 0.78 1988M10 0.11 0.31 1992M02 0.00 0.09
1985M07 0.04 0.03 1988M11 0.34 0.06 1992M03 0.21 0.06
1985M08 0.13 0.08 1988M12 0.41 0.12 1992M04 0.24 0.02
1985M09 0.10 0.02 1989M01 0.20 0.05 1992M05 0.15 0.05
1985M10 0.09 0.05 1989M02 0.19 0.01 1992M06 0.04 0.06
1985M11 0.03 0.47 1989M03 0.35 0.17 1992M07 0.42 0.15
1985M12 0.13 0.39 1989M04 0.13 0.01 1992M08 0.14 0.12
1986M01 0.03 0.11 1989M05 0.30 0.22 1992M09 0.17 0.03
1986M02 0.01 0.38 1989M06 0.18 0.47 1992M10 0.13 0.07
1986M03 0.44 0.67 1989M07 0.30 0.17 1992M11 0.30 0.11
1986M04 0.53 0.21 1989M08 0.01 0.03 1992M12 0.11 0.12
1986M05 0.06 0.30 1989M09 0.19 0.05 1993M01 0.19 0.07
1986M06 0.09 0.04 1989M10 0.13 0.09 1993M02 0.11 0.20
1986M07 0.37 0.25 1989M11 0.08 0.03 1993M03 0.02 0.13
1986M08 0.27 0.16 1989M12 0.03 0.03 1993M04 0.08 0.12
1986M09 0.38 0.17 1990M01 0.00 0.13 1993M05 0.07 0.03
1986M10 0.01 0.03 1990M02 0.12 0.23 1993M06 0.14 0.10
1986M11 0.17 0.18 1990M03 0.11 0.15 1993M07 0.05 0.16
1986M12 0.14 0.19 1990M04 0.09 0.09 1993M08 0.00 0.32
1987M01 0.04 0.13 1990M05 0.00 0.01 1993M09 0.09 0.19
1987M02 0.14 0.02 1990M06 0.04 0.21 1993M10 0.08 0.01
1987M03 0.03 0.02 1990M07 0.08 0.02 1993M11 0.08 0.26
1987M04 0.20 0.49 1990M08 0.22 0.17 1993M12 0.04 0.00
1987M05 0.01 0.48 1990M09 0.06 0.15 1994M01 0.06 0.01
1987M06 0.06 0.01 1990M10 0.19 0.03 1994M02 0.19 0.16
1987M07 0.09 0.10 1990M11 0.12 0.23 1994M03 0.31 0.40
1987M08 0.22 0.25 1990M12 0.26 0.25 1994M04 0.22 0.40
1987M09 0.32 0.51 1991M01 0.51 0.01 1994M05 0.45 0.11
1987M10 0.08 0.34 1991M02 0.35 0.21 1994M06 0.01 0.02
1987M11 0.59 0.51 1991M03 0.04 0.10 1994M07 0.21 0.14
1987M12 0.01 0.10 1991M04 0.24 0.07 1994M08 0.11 0.04
1988M01 0.10 0.23 1991M05 0.16 0.00 1994M09 0.14 0.27
1988M02 0.21 0.48 1991M06 0.09 0.15 1994M10 0.32 0.23
1988M03 0.00 0.01 1991M07 0.02 0.01 1994M11 0.29 0.11
1988M04 0.23 0.28 1991M08 0.19 0.25 1994M12 0.39 0.22
Source: https://fred.stlouisfed.org/graph/?id=TB3MA, https://fred.stlouisfed.org/graph/?id=AAA
Fig. 12.1 Monthly data in Example 12.1 (the first differences of monthly yields to maturity for
3-month T-bills and corporate bonds AAA in the USA in % p.a.; plotted series DTB3 (%) and
DAAA (%), years 1985–1994)
denoted as DTB3 in % p.a.) and for corporate bonds of the highest rating AAA by
S&P (denoted as DAAA in % p.a.) during the 10-year period 1985–1994 in the USA.
Both time series of length 120 plotted in Fig. 12.1 can be regarded as stationary
(this is precisely the reason why the first differences are analyzed; see the non-differenced
time series in Table 12.16 and Fig. 12.10).
Evidently, there is a relatively strong positive correlation between these time
series, which is confirmed by the estimated correlation coefficient of 0.563 and the
scatterplot in Fig. 12.2. Based on the estimated (matrix) autocorrelation function in
Table 12.2 and the partial correlograms (not shown here), one could identify for the
individual time series DTB3 and DAAA the models AR(1) (or AR(3)) and AR(2),
Fig. 12.2 Scatterplot for data from Example 12.1 (horizontal axis: DTB3 (%), vertical axis: DAAA (%))

Table 12.2 Estimated (matrix) autocorrelation function for time series DTB3 and DAAA from
Example 12.1
k    ρ11(k)   ρ22(k)   ρ12(k)   ρ21(k)
0    1.000    1.000    0.563    0.563
1    0.443    0.409    0.301    0.234
2    0.151    0.033    0.007    0.001
3    0.214    0.058    0.020    0.025
4    0.150    0.085    0.096    0.032
5    0.181    0.071    0.047    0.036
6    0.129    0.018    0.046    0.034
7    0.096    0.080    0.011    0.078
8    0.148    0.122    0.063    0.072
9    0.140    0.022    0.051    0.021
10   0.100    0.024    0.106    0.077
respectively. The relationship between these time series has the form of
feedback (in both directions up to lag one; see Remark 12.1).
⋄
Example 12.2 Table 12.3 and Fig. 12.3 present the first differences of logarithms of
annual gross domestic products (i.e., log returns; see (8.1)) during the period
1951–1992 in seven countries (France, Germany, Italy, the UK, Japan, the USA,
and Canada) denoted as RGDP_FRA, RGDP_GER, RGDP_ITA, RGDP_UK,
RGDP_JPN, RGDP_US, and RGDP_CAN, respectively. The seven
time series of length 42 plotted in Fig. 12.3 can again be regarded as stationary.
There are again strong positive correlations among these time series, which is
confirmed by the estimated correlation matrix in Table 12.4 and the scatterplots in
Fig. 12.4. Based on the estimated correlograms and partial correlograms (not shown
here), one could identify for each individual time series the model AR(1) (or white
noise). There exist unidirectional dependency relationships in the sense of Remark 12.1
between some pairs of these time series, e.g., between France and Germany (see
Table 12.5).
⋄
Besides the mutual correlation function $\rho_{ij}(k)$, one also applies the partial mutual
correlation function, denoted as $\rho_{ij}(k,k)$ and defined as the partial correlation
coefficient between $y_{it}$ and $y_{j,t-k}$ under fixed values $y_{t-k+1}, \dots, y_{t-1}$. Its estimate
$r_{ij}(k,k)$ can be obtained as the $(i,j)$ element of the estimated parameter matrix
$\hat{\Phi}_{kk}$ in the model
$$y_t = \Phi_{11} y_{t-1} + \dots + \Phi_{kk} y_{t-k} + \varepsilon_t, \qquad (12.9)$$
where the multivariate white noise $\{\varepsilon_t\}$ is analogous to the univariate white
noise, i.e., the components of the vectors $\varepsilon_t$ have zero means and are mutually
uncorrelated at different times, but are simultaneously correlated with a constant
positive definite variance matrix $\Sigma$:
Table 12.3 Annual data in Example 12.2 (log returns of annual gross domestic products for France, Germany, Italy, the UK, Japan, the USA, and Canada)
Year RGDP_FRA RGDP_GER RGDP_ITA RGDP_UK RGDP_JPN RGDP_US RGDP_CAN

1951 0.041 0.073 0.069 0.030 0.112 0.051 0.028
1952 0.041 0.089 0.035 0.006 0.080 0.009 0.056
1953 0.019 0.064 0.068 0.048 0.053 0.025 0.013
1954 0.026 0.068 0.033 0.031 0.046 0.029 0.051
1955 0.041 0.115 0.062 0.031 0.067 0.061 0.062
1956 0.066 0.048 0.034 0.013 0.065 0.000 0.062
1957 0.037 0.040 0.037 0.013 0.062 0.003 0.019
1958 0.014 0.031 0.039 0.003 0.045 0.026 0.021
1959 0.027 0.066 0.055 0.035 0.080 0.045 0.017
1960 0.059 0.071 0.065 0.036 0.117 0.006 0.007
1961 0.044 0.041 0.076 0.023 0.123 0.007 0.001
1962 0.048 0.034 0.058 0.003 0.060 0.041 0.046
1963 0.040 0.014 0.056 0.032 0.089 0.026 0.033
1964 0.062 0.059 0.015 0.047 0.113 0.038 0.046
1965 0.036 0.045 0.019 0.015 0.039 0.051 0.052

1966 0.045 0.012 0.053 0.016 0.093 0.045 0.049
1967 0.040 0.010 0.067 0.025 0.100 0.014 0.014
1968 0.038 0.059 0.055 0.031 0.117 0.033 0.034
1969 0.067 0.070 0.060 0.007 0.095 0.020 0.038
1970 0.048 0.056 0.056 0.028 0.093 0.000 0.016
1971 0.038 0.022 0.005 0.015 0.036 0.019 0.042
1972 0.035 0.034 0.018 0.038 0.069 0.037 0.050
1973 0.052 0.039 0.054 0.053 0.060 0.046 0.074
1974 0.007 0.022 0.030 0.044 0.047 0.032 0.046
1975 0.018 0.008 0.041 0.001 0.003 0.023 0.001
1976 0.041 0.056 0.059 0.022 0.033 0.042 0.054
1977 0.023 0.030 0.025 0.026 0.037 0.035 0.011
1978 0.031 0.038 0.031 0.045 0.049 0.038 0.025

1979 0.027 0.037 0.053 0.033 0.026 0.010 0.048
1980 0.007 0.009 0.042 0.019 0.006 0.030 0.015
1981 0.009 0.021 0.009 0.013 0.031 0.014 0.032
1982 0.019 0.010 0.004 0.020 0.020 0.033 0.063
1983 0.000 0.027 0.009 0.039 0.020 0.029 0.028
1984 0.007 0.025 0.025 0.023 0.039 0.060 0.056
1985 0.018 0.019 0.024 0.033 0.046 0.021 0.038
1986 0.042 0.050 0.046 0.034 0.046 0.016 0.022
1987 0.020 0.023 0.032 0.047 0.036 0.017 0.043
1988 0.040 0.034 0.039 0.057 0.058 0.031 0.045
1989 0.026 0.024 0.025 0.024 0.038 0.021 0.021
1990 0.020 0.039 0.025 0.000 0.037 0.003 0.026
1991 0.000 0.028 0.022 0.027 0.044 0.020 0.051
1992 0.007 0.006 0.004 0.004 0.015 0.020 0.002
[Fig. 12.3 plot area: seven line plots, one per series (RGDP_FRA, RGDP_GER, RGDP_ITA, RGDP_UK, RGDP_JPN, RGDP_US, RGDP_CAN), with years from 1950 to 1990 on the horizontal axis]
Fig. 12.3 Annual data in Example 12.2 (log returns of annual gross domestic products for France,
Germany, Italy, the UK, Japan, the USA, and Canada). Source: OECD (https://data.oecd.org/gdp/
gross-domestic-product-gdp.htm)

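$$E(\varepsilon_t) = 0, \qquad E(\varepsilon_s \varepsilon_t') = \delta_{st}\,\Sigma. \qquad (12.10)$$
The model (12.9) is an example of a so-called multivariate linear process, which is
analogous to the univariate case (6.17). Its general form is
$$y_t = \varepsilon_t + \Psi_1 \varepsilon_{t-1} + \Psi_2 \varepsilon_{t-2} + \dots = (I + \Psi_1 B + \Psi_2 B^2 + \dots)\,\varepsilon_t = \Psi(B)\,\varepsilon_t \qquad (12.11)$$
under certain conditions on the matrix parameters $\Psi_i$ that ensure stationarity and
invertibility. In particular, as in the univariate case, one defines the vector
mixed process VARMA(p, q)
$$y_t = \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t + \Theta_1 \varepsilon_{t-1} + \dots + \Theta_q \varepsilon_{t-q}, \quad \text{i.e.,} \quad \Phi(B)\,y_t = \Theta(B)\,\varepsilon_t. \qquad (12.12)$$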
Table 12.4 Estimated correlation matrix for seven time series from Example 12.2
RGDP_FRA RGDP_GER RGDP_ITA RGDP_UK RGDP_JPN RGDP_US RGDP_CAN
RGDP_FRA 1.000 0.610 0.591 0.489 0.748 0.409 0.345
RGDP_GER 0.610 1.000 0.510 0.445 0.553 0.400 0.177
RGDP_ITA 0.591 0.510 1.000 0.303 0.591 0.284 0.189
RGDP_UK 0.489 0.445 0.303 1.000 0.468 0.543 0.250
RGDP_JPN 0.748 0.553 0.591 0.468 1.000 0.388 0.104
RGDP_US 0.409 0.400 0.284 0.543 0.388 1.000 0.667
RGDP_CAN 0.345 0.177 0.189 0.250 0.104 0.667 1.000
[Fig. 12.4 plot area: matrix of pairwise scatterplots for RGDP_FRA, RGDP_GER, RGDP_ITA, RGDP_UK, RGDP_JPN, RGDP_US, and RGDP_CAN]
Fig. 12.4 Scatterplots for data from Example 12.2
Table 12.5 Estimated mutual correlation function ρ12(·) (France versus Germany) and ρ15(·) (France versus Japan) from Example 12.2
k    ρ12(k)   ρ12(−k)  ρ15(k)   ρ15(−k)
0    0.6097   0.6097   0.7475   0.7475
1    0.4218   0.1649   0.6257   0.3150
2    0.1619   0.0103   0.2335   0.2033
3    0.1829   0.0855   0.3920   0.2320
4    0.1749   0.1371   0.3416   0.2791
5    0.1879   0.0208   0.3356   0.1555
6    0.0713   0.0252   0.2571   0.1132
7    0.2083   0.0617   0.1292   0.0691
8    0.3618   0.0936   0.1251   0.1216
9    0.3873   0.0249   0.3186   0.0157
10   0.1296   0.1749   0.0846   0.1678
In Sect. 12.2, we will present in more detail a special case of VARMA, namely
the vector autoregressive process VAR(p)
$$y_t = \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t, \quad \text{i.e.,} \quad \Phi(B)\,y_t = \varepsilon_t, \qquad (12.13)$$
since nowadays this model is widely applied, especially to dynamic economic data.
12.2 Vector Autoregression VAR
The vector autoregression (12.13) (see, e.g., Lütkepohl (2005)) is a natural exten-
sion of the univariate autoregressive process. In econometrics, it represents a useful
instrument in the context of simultaneous equation models SEM (see, e.g., Greene
(2012) or Heij et al. (2004)).
VAR models have several pros (+) and cons (−) in the practical analysis of
economic and financial time series:
+ It is not necessary to distinguish between exogenous variables (which originate
outside the model) and endogenous variables (which originate as outputs of the
given model).
+ VAR models have a richer structure than univariate AR processes since each
variable can depend on further variables (and not only on its own lagged values with
added white noise).
+ The classical OLS estimate usually has acceptable properties in VAR models.
+ Empirical experience shows that predictions by means of VAR are sufficient for
routine situations in practice.
− The application of VAR is sometimes “too technical”, without deeper arguments
justifying the given model (in practice, this approach is popular in the context of
data mining).
− The number of parameters that must be estimated can be large (particularly for
higher dimensions m and orders p of VAR). Moreover, one must solve the
problem of an adequate choice of p in practice.
− The modeled data must be made stationary before the VAR is constructed. However,
the necessary adjustments and transformations to achieve stationarity (mainly
differencing) may imply a substantial loss of information originally contained in
the data.
At first, let us consider the following model VAR(1) (the description is simpler
than for the general VAR(p), and the results derived for VAR(1) can be extended
easily to the general order; see Remark 12.3):
$$y_t = \varphi_0 + \Phi y_{t-1} + \varepsilon_t, \qquad (12.14)$$
where $\varepsilon_t$ is m-variate white noise (see (12.10)). In comparison with (12.13), the
relation (12.14) contains in addition an m-variate intercept $\varphi_0$. For example, if $m = 2$,
then VAR(1) is formed by two equations, which can be written explicitly as
$$\begin{aligned} y_{1t} &= \varphi_{10} + \varphi_{11} y_{1,t-1} + \varphi_{12} y_{2,t-1} + \varepsilon_{1t}, \\ y_{2t} &= \varphi_{20} + \varphi_{21} y_{1,t-1} + \varphi_{22} y_{2,t-1} + \varepsilon_{2t}. \end{aligned} \qquad (12.15)$$
Remark 12.2 The explicit form (12.15) demonstrates how the model parameters
influence the relations between the series $\{y_{1t}\}$ and $\{y_{2t}\}$ in time (see Remark 12.1). If, in
addition, the covariance matrix $\Sigma$ of the white noise $\{\varepsilon_t\}$ is diagonal (i.e., its components
are mutually uncorrelated), then the following holds:
• If $\varphi_{12} = \varphi_{21} = 0$, then $\{y_{1t}\}$ and $\{y_{2t}\}$ are uncoupled.
• If $\varphi_{12} = 0$ and $\varphi_{21} \neq 0$, then there exists a unidirectional dependency relationship
of $\{y_{2t}\}$ on $\{y_{1t}\}$.
• If $\varphi_{12} \neq 0$ and $\varphi_{21} \neq 0$, then there exists a feedback between $\{y_{1t}\}$ and $\{y_{2t}\}$.
⋄
The formula (12.14) is called the reduced form of the model VAR. If we consider the
ith equation in (12.14) (or more generally in (12.13)), then only the variable $y_i$ is
present in the current form (i.e., without lag). Moreover, in the reduced form, the
simultaneous correlation between $\{y_{it}\}$ and $\{y_{jt}\}$ is represented only by means of the
element $\sigma_{ij}$ of the covariance matrix $\Sigma$ of the white noise. However, sometimes one
needs to express the simultaneous relation between $\{y_{it}\}$ and $\{y_{jt}\}$ more explicitly. In
such a case, one can use the so-called structural form of the model VAR, which is obtained
by means of the Cholesky decomposition from matrix theory: since the matrix $\Sigma$ is positive
definite, there is a lower triangular matrix $L$ with units on the main diagonal and
a diagonal matrix $D$ such that
$$\Sigma = L D L', \quad \text{i.e.,} \quad L^{-1} \Sigma (L')^{-1} = D. \qquad (12.16)$$
The original model (12.14) is then transformed to the structural form
$$L^{-1} y_t = \varphi_0^{*} + \Phi^{*} y_{t-1} + u_t, \qquad (12.17)$$
where
$$\varphi_0^{*} = L^{-1}\varphi_0, \quad \Phi^{*} = L^{-1}\Phi, \quad u_t = L^{-1}\varepsilon_t, \quad E(u_t) = 0, \quad \mathrm{var}(u_t) = L^{-1}\Sigma (L')^{-1} = D. \qquad (12.18)$$
In particular, $\{u_t\}$ is an m-variate white noise with a diagonal covariance matrix (i.e.,
the components of $\{u_t\}$ are simultaneously uncorrelated). If we denote the last row of
the inverse matrix $L^{-1}$ as $(\lambda_{m1}, \dots, \lambda_{m,m-1}, 1)$, then the mth equation in (12.17) is
$$y_{mt} + \sum_{i=1}^{m-1} \lambda_{mi}\, y_{it} = \varphi^{*}_{m0} + \sum_{i=1}^{m} \varphi^{*}_{mi}\, y_{i,t-1} + u_{mt}. \qquad (12.19)$$
This structural form presents explicitly the simultaneous (i.e., at time t) linear
dependence of $y_{mt}$ on $y_{it}$ for $i = 1, \dots, m-1$ (here $u_{mt}$ is uncorrelated with $y_{it}$,
which follows from the facts that $u_{mt}$ is uncorrelated with $u_{it}$ and that $L^{-1}$ in (12.17) is a
lower triangular matrix with units on the main diagonal, similarly to $L$). As the
components of the vector $y_t$ can be rearranged in an arbitrary way, one obtains the same
conclusion as for $y_{mt}$ also for the other components $y_{jt}$ ($j = 1, \dots, m-1$) of the vector $y_t$.
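The passage from the reduced form to the structural form can be sketched numerically. The following is a minimal illustration, assuming NumPy; the parameter values and the helper name `ldl_from_covariance` are hypothetical and serve only to show that the unit lower triangular factor of (12.16) can be obtained by rescaling the ordinary Cholesky factor.

```python
import numpy as np

def ldl_from_covariance(sigma):
    """Return (L, D) with sigma = L @ D @ L.T, where L is unit lower
    triangular and D is diagonal, as in (12.16)."""
    c = np.linalg.cholesky(sigma)   # sigma = c @ c.T, c lower triangular
    d = np.diag(c)                  # positive diagonal entries of c
    L = c / d                       # divide column j by c[j, j] -> unit diagonal
    D = np.diag(d ** 2)
    return L, D

# Illustrative (made-up) reduced-form VAR(1) parameters for m = 2.
sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
phi0 = np.array([0.1, -0.2])
Phi = np.array([[0.5, 0.1],
                [0.2, 0.3]])

L, D = ldl_from_covariance(sigma)
L_inv = np.linalg.inv(L)

# Transformed intercept, coefficient matrix, and noise covariance of (12.17)-(12.18).
phi0_structural = L_inv @ phi0
Phi_structural = L_inv @ Phi
var_u = L_inv @ sigma @ L_inv.T     # should equal the diagonal matrix D

print("last row of L^{-1}:", L_inv[-1])
print("var(u_t):\n", var_u)
```

The last row of the inverse factor gives the coefficients of the simultaneous terms in the last structural equation; reordering the components of the series changes L and therefore the structural form.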
In practice, one prefers the reduced form of the model VAR since
• The estimation of the reduced form is relatively easy.
• The predictions in the structural form are complicated due to the links of predicted
variable with the simultaneous values of further variables (see above).
The conditions of (weak) stationarity (see Sect. 12.1) and the first and second
moments of VAR(1) can be found analogously to the scalar (univariate)
autoregressive process (see Sect. 6.2.3):
A sufficient condition of stationarity of the model VAR(1) written in the form
(12.14) (this condition also makes it possible to express VAR in the form (12.22)) usually
demands that all m eigenvalues of the matrix $\Phi$ lie inside the unit circle in the complex
plane (i.e., their absolute values are lower than one). The eigenvalues of the matrix $\Phi$
are the roots of the polynomial equation $\det(\lambda I - \Phi) = 0$ (or equivalently the inverted
roots of the polynomial equation $\det(I - \Phi z) = 0$). Therefore, the condition of
stationarity can also be formulated in such a way that all m roots of the autoregressive
(matrix) polynomial $\Phi(z) = I - \Phi z$ lie outside the unit circle in the complex plane
(or equivalently all m inverted roots of this polynomial lie inside the unit circle in the
complex plane).
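The eigenvalue condition can be checked directly. The sketch below assumes NumPy and uses made-up coefficient matrices; the function name `is_stationary_var1` is hypothetical. For m = 2 it also confirms that the roots of det(I − Φz) = 1 − tr(Φ)z + det(Φ)z² are the reciprocals of the eigenvalues.

```python
import numpy as np

def is_stationary_var1(Phi):
    """True if all eigenvalues of Phi lie strictly inside the unit circle."""
    return bool(np.all(np.abs(np.linalg.eigvals(Phi)) < 1.0))

# Illustrative coefficient matrices (not taken from the examples in the text).
Phi_stable = np.array([[0.5, 0.1],
                       [0.2, 0.3]])
Phi_unit_root = np.array([[1.0, 0.0],
                          [0.0, 0.5]])

eigenvalues = np.linalg.eigvals(Phi_stable)

# For a 2x2 matrix, det(I - Phi z) = 1 - tr(Phi) z + det(Phi) z^2.
# np.roots expects coefficients from the highest power down.
roots = np.roots([np.linalg.det(Phi_stable), -np.trace(Phi_stable), 1.0])

print("eigenvalues of Phi:", eigenvalues)
print("roots of det(I - Phi z):", roots)
print("stable example stationary?", is_stationary_var1(Phi_stable))
print("unit-root example stationary?", is_stationary_var1(Phi_unit_root))
```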
Under the condition of stationarity (see above), the matrix $I - \Phi$ is regular so that
(12.14) can be rewritten in the form
$$y_t - \mu = \Phi (y_{t-1} - \mu) + \varepsilon_t, \qquad (12.20)$$
where
$$\mu = (I - \Phi)^{-1} \varphi_0 = (\Phi(1))^{-1} \varphi_0 \qquad (12.21)$$
is the mean vector of the stationary process $\{y_t\}$ (obviously, $\Phi(1) = I - \Phi$).
Moreover, this process can then also be written in the form of an m-variate linear
process
$$y_t = \mu + \varepsilon_t + \Phi \varepsilon_{t-1} + \Phi^2 \varepsilon_{t-2} + \Phi^3 \varepsilon_{t-3} + \dots \qquad (12.22)$$
(compare with (6.17)) so that its covariance matrix is
$$\Gamma_0 = \mathrm{var}(y_t) = \Sigma + \Phi \Sigma \Phi' + \Phi^2 \Sigma (\Phi^2)' + \dots \qquad (12.23)$$
and its autocovariance function (12.4) fulfills
$$\Gamma_k = \Phi^k \Gamma_0. \qquad (12.24)$$
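The moments of a stationary VAR(1) can be computed numerically as follows. This is a sketch under stated assumptions: NumPy is used, the parameter values are invented, and the function name `var1_moments` is hypothetical. Instead of summing the infinite series (12.23), the code solves the equivalent fixed-point equation Γ0 = ΦΓ0Φ' + Σ through the standard Kronecker-product identity vec(Γ0) = (I − Φ⊗Φ)⁻¹ vec(Σ), which is a swapped-in technique and not part of the text above; a truncated version of the series is computed for comparison.

```python
import numpy as np

def var1_moments(phi0, Phi, Sigma, max_lag=2):
    """Mean vector and autocovariances Gamma_0..Gamma_max_lag of a stationary VAR(1)."""
    m = len(phi0)
    mu = np.linalg.solve(np.eye(m) - Phi, phi0)              # (12.21)
    # Gamma_0 = Phi Gamma_0 Phi' + Sigma, solved in vectorized form.
    vec_gamma0 = np.linalg.solve(np.eye(m * m) - np.kron(Phi, Phi),
                                 Sigma.reshape(-1))
    gamma0 = vec_gamma0.reshape(m, m)
    gammas = [np.linalg.matrix_power(Phi, k) @ gamma0        # (12.24)
              for k in range(max_lag + 1)]
    return mu, gammas

# Illustrative (made-up) parameters.
phi0 = np.array([0.1, 0.2])
Phi = np.array([[0.5, 0.1],
                [0.2, 0.3]])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

mu, gammas = var1_moments(phi0, Phi, Sigma)

# Truncated version of the series (12.23) for comparison.
gamma0_series = sum(
    np.linalg.matrix_power(Phi, j) @ Sigma @ np.linalg.matrix_power(Phi, j).T
    for j in range(200)
)

print("mu:", mu)
print("Gamma_0:\n", gammas[0])
print("max difference from truncated series:", np.abs(gammas[0] - gamma0_series).max())
```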
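Remark 12.3 All previous formulas can be extended to the model VAR(p)
$$y_t = \varphi_0 + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t. \qquad (12.25)$$
The sufficient condition of stationarity and linearity of this process is that
$$\text{all roots of } \Phi(z) = I - \Phi_1 z - \dots - \Phi_p z^p \text{ lie outside the unit circle} \qquad (12.26)$$
(in the complex plane). Then one can rewrite (12.25) in the form
$$y_t - \mu = \Phi_1 (y_{t-1} - \mu) + \dots + \Phi_p (y_{t-p} - \mu) + \varepsilon_t, \qquad (12.27)$$
where
$$\mu = (I - \Phi_1 - \dots - \Phi_p)^{-1} \varphi_0 = (\Phi(1))^{-1} \varphi_0 \qquad (12.28)$$
is the mean vector of the stationary process $\{y_t\}$. Its autocovariance function (12.4)
fulfills the multivariate version of the system of Yule–Walker equations (6.35)
$$\Gamma_k = \Phi_1 \Gamma_{k-1} + \dots + \Phi_p \Gamma_{k-p} \quad \text{for } k > 0 \qquad (12.29)$$
and
$$\Gamma_0 = \Phi_1 \Gamma_{-1} + \dots + \Phi_p \Gamma_{-p} + \Sigma. \qquad (12.30)$$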
⋄
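One common way to check the root condition for a VAR(p) numerically is to stack the model into its companion (first-order) form and inspect the eigenvalues of the stacked matrix, which coincide with the inverted roots of the autoregressive polynomial. The companion form is a standard device that is not introduced in the text above; the sketch below assumes NumPy, and the coefficient values and function names are hypothetical.

```python
import numpy as np

def companion_matrix(Phis):
    """Stack the VAR(p) coefficient matrices Phi_1..Phi_p into the
    (m*p) x (m*p) companion matrix of the equivalent VAR(1)."""
    p = len(Phis)
    m = Phis[0].shape[0]
    top = np.hstack(Phis)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(m * (p - 1)), np.zeros((m * (p - 1), m))])
    return np.vstack([top, bottom])

def is_stationary_var(Phis):
    """True if all inverted roots of the AR polynomial lie inside the unit circle."""
    eigenvalues = np.linalg.eigvals(companion_matrix(Phis))
    return bool(np.all(np.abs(eigenvalues) < 1.0))

def var_mean(phi0, Phis):
    """Mean vector of a stationary VAR(p): solves (I - Phi_1 - ... - Phi_p) mu = phi0."""
    m = len(phi0)
    return np.linalg.solve(np.eye(m) - sum(Phis), phi0)

# Illustrative (made-up) VAR(2) coefficients.
Phi1 = np.array([[0.5, 0.1],
                 [0.2, 0.3]])
Phi2 = 0.1 * np.eye(2)
phi0 = np.array([0.1, 0.2])

print("stationary:", is_stationary_var([Phi1, Phi2]))
print("mean vector:", var_mean(phi0, [Phi1, Phi2]))
```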
The construction of VAR(p) based on observations $y_1, \dots, y_n$ is entirely analogous
to the univariate AR(p):
1. Identification of VAR Order
The order p could be identified by generalizing partial correlograms to the multivariate
case (see Sect. 6.3.1.1), but such an approach is rather elaborate. Therefore, in
practice one makes use of identification procedures based either on statistical tests
or on information criteria.
The likelihood ratio test (LR test) modified in a sequential way is typically used
for the VAR order determination (see Lütkepohl (2005) or Tables 12.7 and 12.11 by
EViews). This test makes use of critical regions with significance level α of the form
Table 12.6 Wald test for the identification of model VAR in Example 12.3 (DTB3t and DAAAt) calculated by means of EViews
VAR lag exclusion Wald tests
Chi-squared test statistics for lag exclusion
Numbers in [ ] are p-values
        DTB3         DAAA         Joint
Lag 1   31.18664     31.30155     54.13323
        [1.69e-07]   [1.60e-07]   [4.94e-11]
Lag 2   4.323035     7.686472     8.446665
        [0.115150]   [0.021424]   [0.076520]
Lag 3   6.233328     0.484196     2.614466
        [0.044305]   [0.784979]   [0.624263]
df      2            2            4
$$LR = n \left( \ln|\hat{\Sigma}_R| - \ln|\hat{\Sigma}_U| \right) > \chi^2_{1-\alpha}(q m^2). \qquad (12.31)$$
In more detail, one tests the null hypothesis that the last q lags of an original VAR
model with a high number of lags (regarded as an upper bound for the VAR order)
have zero parameters (i.e., zero matrices $\Phi_i$ for the last q lags). The symbols $\hat{\Sigma}_R$ and
$\hat{\Sigma}_U$ in (12.31) denote the estimated covariance matrices of the estimated residuals in the
restricted model VAR (i.e., under the restrictions of the null hypothesis) and the
unrestricted model VAR (i.e., without such restrictions), respectively.
Another test recommended in this context is the Wald test (see Lütkepohl (2005) or
Table 12.6 by EViews), which is similar to the classical F-test in linear regression
models but is based on the $\chi^2$-distribution.
As far as the information criteria are concerned, one applies them in the same way as for
the order determination of univariate time series models (see Sect. 6.3.1.2). For
example, the m-variate version of the AIC criterion is
$$\mathrm{AIC}(k) = \ln|\hat{\Sigma}_k| + \frac{2 k^{*}}{n}, \qquad (12.32)$$
where $\hat{\Sigma}_k$ is the estimated covariance matrix of the estimated residuals in the model
VAR(k) and $k^{*} = m(km + 1)$ is the number of parameters that must be estimated in
the m-variate model VAR(k) with nonzero mean vector (see Tables 12.7 and 12.11
by EViews).
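The AIC-based order selection can be sketched as follows. This is only an illustration under stated assumptions: NumPy is used, the data are simulated from a made-up bivariate VAR(1), every candidate order is fitted by OLS on the same effective sample, and the residual covariance is computed with divisor equal to the effective sample size. The function names are hypothetical, and EViews may use slightly different conventions (e.g., for scaling), so the values are not comparable with Table 12.7.

```python
import numpy as np

def residual_covariance(y, k, max_lag):
    """OLS fit of VAR(k) with intercept on observations max_lag..n-1;
    returns the residual covariance matrix (divisor = effective sample size)."""
    n, m = y.shape
    target = y[max_lag:]
    columns = [np.ones((n - max_lag, 1))]
    for lag in range(1, k + 1):
        columns.append(y[max_lag - lag:n - lag])   # y_{t-lag} aligned with y_t
    X = np.hstack(columns)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return resid.T @ resid / len(target)

def aic(y, k, max_lag):
    """m-variate AIC: ln|Sigma_hat_k| + 2 * m * (k * m + 1) / n."""
    n_eff, m = y.shape[0] - max_lag, y.shape[1]
    sigma_hat = residual_covariance(y, k, max_lag)
    n_params = m * (k * m + 1)
    return np.log(np.linalg.det(sigma_hat)) + 2.0 * n_params / n_eff

# Simulated data from an illustrative VAR(1) (values are made up).
rng = np.random.default_rng(0)
Phi = np.array([[0.5, 0.1],
                [0.2, 0.3]])
n = 500
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = Phi @ y[t - 1] + rng.standard_normal(2)

aic_values = {k: aic(y, k, max_lag=3) for k in range(4)}
best_k = min(aic_values, key=aic_values.get)
print(aic_values)
print("order selected by AIC:", best_k)
```

Fitting all candidate orders on the same effective sample keeps the criteria comparable across k.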
2. Estimation of Model VAR
The model VAR is usually estimated by means of the ML method (i.e., by maximizing
the (log) likelihood function under the assumption of normally distributed
white noise), even though the reduced form of the model VAR (see (12.25)) may also be
estimated by means of the classical OLS method. Under routine conditions, both
approaches are asymptotically equivalent, and the estimates are asymptotically
normally distributed.
Table 12.7 LR test and information criteria AIC, BIC, and HQ for the identification of model VAR
in Example 12.3 (DTB3t and DAAAt) calculated by means of EViews
VAR lag order selection criteria
Lag LogL LR AIC BIC HQ
0 42.94176 NA 0.731103 0.682558 0.711407
1 72.27577 57.09657 1.183496 1.037862a 1.124408a
2 76.80968 8.663012 1.193030 0.950307 1.094550
3 82.06239 9.848829a 1.215400a 0.875588 1.077527
4 82.58925 0.969044 1.153380 0.716478 0.976115
5 83.27286 1.232937 1.094158 0.560167 0.877501
a Lag order selected by the criterion
LR sequential modified LR test statistic (each test at 5% level), AIC Akaike information criterion,
BIC Schwarz information criterion, HQ Hannan–Quinn information criterion
3. Diagnostic of Model VAR

The diagnostics of VAR check various properties of the constructed model. Primarily,
the condition of stationarity should be confirmed, i.e., the inverted roots of the
estimated autoregressive polynomial should lie inside the unit circle in the complex
plane.
Further diagnostic procedures check the serial uncorrelatedness of the estimated
white noise, namely:
• By means of Bartlett’s approximation (see also (6.65)), where the critical bound
$2\sqrt{1/n}$ (at the significance level of 5%) is used for the particular
estimated autocorrelations and mutual correlations of the estimated residuals with
nonzero delays (e.g., EViews software offers graphical outputs for all elements of
the estimated matrix autocorrelation function with plotted critical bounds; see
Fig. 12.6).
• By means of the m-variate version of the Q-test (or portmanteau test) with the critical
region (see, e.g., Lütkepohl (2005) or (6.67) or Tables 12.9 and 12.13 by EViews)
$$Q_m = n^2 \sum_{k=1}^{K} \frac{1}{n-k}\, \mathrm{tr}\!\left( \hat{\Gamma}_k' \hat{\Gamma}_0^{-1} \hat{\Gamma}_k \hat{\Gamma}_0^{-1} \right) \ge \chi^2_{1-\alpha}\!\left( m^2 (K - p) \right) \qquad (12.33)$$
for various choices of K (it is recommended to try primarily $K \approx \sqrt{n}$).
• By means of the Lagrange multiplier test (LM test; see Lütkepohl (2005) or
Table 12.10 by EViews), which is applied in such a way that we run an auxiliary
regression of the estimated residual at time t on its lagged value at time $t-h$ and
on the original right-hand-side lagged regressors $y_{t-1}, \dots, y_{t-p}$: under the null
hypothesis of no serial correlation of order h, the corresponding LM statistic is
asymptotically distributed as $\chi^2(m^2)$.
• By means of tests of normality of the estimated residuals (see, e.g., the Jarque–Bera
test in Example 8.1 or Table 12.14 by EViews).
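Among the diagnostics listed above, the portmanteau statistic is simple to compute by hand. The sketch below assumes NumPy and applies the statistic to simulated white noise that merely stands in for VAR residuals; the function names are hypothetical, and the residual autocovariances are estimated with divisor n after centering, which is one common convention and may differ in detail from EViews.

```python
import numpy as np

def portmanteau_statistic(resid, K):
    """Statistic n^2 * sum_{k=1}^{K} tr(G_k' G_0^{-1} G_k G_0^{-1}) / (n - k),
    where G_k is the lag-k autocovariance matrix of the residuals."""
    n, m = resid.shape
    e = resid - resid.mean(axis=0)

    def gamma(k):
        # (1/n) * sum over t of e_t e_{t-k}'
        return e[k:].T @ e[:n - k] / n

    g0_inv = np.linalg.inv(gamma(0))
    total = 0.0
    for k in range(1, K + 1):
        gk = gamma(k)
        total += np.trace(gk.T @ g0_inv @ gk @ g0_inv) / (n - k)
    return n ** 2 * total

def portmanteau_df(m, K, p):
    """Degrees of freedom m^2 (K - p) of the approximating chi-square distribution."""
    return m * m * (K - p)

# Simulated white noise standing in for the residuals of a bivariate VAR(2).
rng = np.random.default_rng(1)
resid = rng.standard_normal((118, 2))

q_stat = portmanteau_statistic(resid, K=10)
df = portmanteau_df(m=2, K=10, p=2)
print("Q statistic:", q_stat, "with", df, "degrees of freedom")
```

For m = 2, K = 10, and p = 2 the degrees of freedom are 32, and for m = 7, K = 7, and p = 1 they are 294, in line with the df columns reported in Tables 12.9 and 12.13.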
4. Predictions in Model VAR
The construction of predictions in the model VAR is analogous to the univariate
model AR. Thus, one again makes recursive use of the relation
$$\hat{y}_{t+k}(t) = \varphi_0 + \Phi_1 \hat{y}_{t+k-1}(t) + \dots + \Phi_p \hat{y}_{t+k-p}(t), \qquad (12.34)$$
where
$$\hat{y}_{t+j}(t) = y_{t+j} \quad \text{for } j \le 0. \qquad (12.35)$$
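The prediction recursion can be sketched in a few lines. The following illustration assumes NumPy; the parameter values are made up, and the function name `var_forecast` is hypothetical. Observed values are used wherever the required time index does not exceed the forecast origin, and earlier forecasts are used otherwise.

```python
import numpy as np

def var_forecast(history, phi0, Phis, steps):
    """Recursive forecasts of a VAR(p) from the forecast origin given by the
    last row of `history` (array of shape (n, m), n >= p)."""
    p = len(Phis)
    values = [np.asarray(v, dtype=float) for v in history[-p:]]
    forecasts = []
    for _ in range(steps):
        pred = np.asarray(phi0, dtype=float).copy()
        for i, Phi in enumerate(Phis, start=1):
            pred = pred + Phi @ values[-i]     # values[-i] plays the role of y_{t+k-i}
        values.append(pred)                    # later steps reuse this forecast
        forecasts.append(pred)
    return np.array(forecasts)

# Illustrative VAR(1) (made-up values).
phi0 = np.array([0.1, 0.2])
Phi = np.array([[0.5, 0.1],
                [0.2, 0.3]])
history = np.array([[0.3, 0.4],
                    [1.0, -1.0]])              # last row is the forecast origin

forecasts = var_forecast(history, phi0, [Phi], steps=200)
print("1-step forecast:", forecasts[0])        # approximately [0.5, 0.1]
print("2-step forecast:", forecasts[1])        # approximately [0.36, 0.33]
print("long-horizon forecast:", forecasts[-1])
```

For a stationary model, the long-horizon forecasts approach the mean vector of the process.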
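Remark 12.4 As far as the models VMA and VARMA are concerned, the application of
the OLS method is not so straightforward, and one prefers the ML method in such
models. For example, in the model VMA(1)
$$y_t = \vartheta_0 + \varepsilon_t + \Theta_1 \varepsilon_{t-1}, \qquad \varepsilon_t \sim N(0, \Sigma), \qquad (12.36)$$
the (conditional) likelihood function has the form
$$L(\vartheta_0, \Theta_1, \Sigma) = \prod_{t=1}^{n} \frac{1}{(2\pi)^{m/2} |\Sigma|^{1/2}} \exp\!\left( -\frac{1}{2}\, \varepsilon_t' \Sigma^{-1} \varepsilon_t \right), \qquad (12.37)$$
where the vectors of white noise are calculated recursively in time:
$$\varepsilon_1 = y_1 - \vartheta_0, \quad \varepsilon_2 = y_2 - \vartheta_0 - \Theta_1 \varepsilon_1, \quad \varepsilon_3 = y_3 - \vartheta_0 - \Theta_1 \varepsilon_2, \quad \dots \qquad (12.38)$$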
⋄
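The conditional likelihood of the VMA(1) model can be evaluated by running the residual recursion and accumulating the Gaussian log-density terms. The sketch below assumes NumPy and works with the logarithm of the likelihood, which is the usual choice for numerical stability; the function name and the small data set are hypothetical. In practice such a function would be passed to a numerical optimizer, which is not shown here.

```python
import numpy as np

def vma1_conditional_loglik(y, theta0, Theta1, Sigma):
    """Conditional Gaussian log-likelihood of a VMA(1) and the residuals obtained
    from the recursion eps_t = y_t - theta0 - Theta1 @ eps_{t-1}, with eps_0 = 0."""
    n, m = y.shape
    sigma_inv = np.linalg.inv(Sigma)
    _, logdet = np.linalg.slogdet(Sigma)
    eps_prev = np.zeros(m)            # conditioning on eps_0 = 0 gives eps_1 = y_1 - theta0
    residuals = np.empty((n, m))
    loglik = 0.0
    for t in range(n):
        eps = y[t] - theta0 - Theta1 @ eps_prev
        loglik += -0.5 * (m * np.log(2.0 * np.pi) + logdet + eps @ sigma_inv @ eps)
        residuals[t] = eps
        eps_prev = eps
    return loglik, residuals

# Tiny made-up data set for illustration.
y_demo = np.array([[1.0, 0.0],
                   [0.0, 2.0],
                   [1.0, 1.0]])

loglik, residuals = vma1_conditional_loglik(
    y_demo, theta0=np.zeros(2), Theta1=0.5 * np.eye(2), Sigma=np.eye(2)
)
print("conditional log-likelihood:", loglik)
print("recursive residuals:\n", residuals)
```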
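Remark 12.5 The particular components of VARMA models have the form of univariate
ARMA models: in the case of the m-variate model VARMA(p, q), these
marginal models are of the type ARMA(mp, (m−1)p + q). For example, the bivariate
model VAR(1) in (12.15) with zero mean vector can be written as
$$\begin{pmatrix} 1 - \varphi_{11} B & -\varphi_{12} B \\ -\varphi_{21} B & 1 - \varphi_{22} B \end{pmatrix} \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}. \qquad (12.39)$$
If we multiply the relation (12.39) from the left by the matrix
$$\begin{pmatrix} 1 - \varphi_{22} B & \varphi_{12} B \\ \varphi_{21} B & 1 - \varphi_{11} B \end{pmatrix},$$
then, since the resulting matrix product on the left-hand side of (12.39) is diagonal,
we can write
$$\left[ (1 - \varphi_{11} B)(1 - \varphi_{22} B) - \varphi_{12} \varphi_{21} B^2 \right] \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} 1 - \varphi_{22} B & \varphi_{12} B \\ \varphi_{21} B & 1 - \varphi_{11} B \end{pmatrix} \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}, \qquad (12.40)$$
so that the particular components are modeled as ARMA(2,1) (the order q = 1 of
moving averages follows from the fact that the autocorrelation functions of both
univariate processes of moving averages on the right-hand side of (12.40) have
truncation points $k_0 = 1$).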
⋄
Example 12.3 We shall construct the model VAR for the data from Example 12.1
(the bivariate time series of length 120 with the first differences of monthly yields to
maturity (YTM) for three-month T-bills denoted as DTB3t in the first component and
for corporate bonds denoted as DAAAt in the second component):
1. Identification:
• Wald test in Table 12.6 identifies the given bivariate time series overall as
VAR(1) (with significance level of 5%); however, the second component
DAAA individually demands the order of 2 on this significance level so that
a reasonable model seems to be VAR(2).
• The application of LR test and information criteria AIC, BIC, and HQ in
Table 12.7 results in the model VAR(1) or VAR(3).
2. Estimation:
The model VAR(2) was finally chosen for the given data (also taking into account
the diagnostic results; see step 3). The estimates of this model are presented in Table 12.8.
Obviously, it would be possible to omit some lagged regressors in the estimated
model on the basis of the estimated standard deviations or t-ratios presented in this table.
3. Diagnostics:
• Figure 12.5 verifies that the estimated model VAR(2) is stationary (all four
inverted roots of autoregressive polynomial lie inside the unit circle in com-
plex plane).
• The estimated matrix autocorrelation function of the estimated residuals with
plotted critical bounds in Fig. 12.6 confirms the uncorrelatedness of estimated
residuals.
Table 12.8 Estimation of model VAR(2) in Example 12.3 (DTB3t and DAAAt) by means of EViews
Vector autoregression estimates
Standard errors in ( ) and t-statistics in [ ]
                DTB3        DAAA
DTB3(−1)        0.454168    0.002001
                (0.11239)   (0.10863)
                [4.04089]   [0.01842]
DTB3(−2)        0.017927    0.015799
                (0.10254)   (0.09911)
                [0.17483]   [0.15940]
DAAA(−1)        0.121705    0.514193
                (0.11407)   (0.11026)
                [1.06690]   [4.66359]
DAAA(−2)        0.149127    0.258736
                (0.11277)   (0.10900)
                [1.32238]   [2.37376]
C               0.009735    0.023150
                (0.01951)   (0.01885)
                [0.49906]   [1.22789]
S.E. equation   0.211679    0.204597

[Fig. 12.5 plot area: inverted roots of the autoregressive polynomial in Example 12.3 (DTB3t and DAAAt) plotted relative to the unit circle; EViews output]
• Q-test applied to the estimated residuals in Table 12.9 confirms the
uncorrelatedness of estimated residuals.
• LM test applied to the estimated residuals in Table 12.10 confirms the
uncorrelatedness of estimated residuals as well.
⋄
[Fig. 12.6 plot area: autocorrelations for lags i = 1, ..., 12 in four panels, Corr(DTB3,DTB3(-i)), Corr(DTB3,DAAA(-i)), Corr(DAAA,DTB3(-i)), and Corr(DAAA,DAAA(-i)), with bounds constructed as double standard deviations]
Fig. 12.6 Estimated matrix autocorrelation function of estimated residuals with plotted critical
bounds for the diagnostics of model VAR(2) in Example 12.3 (DTB3t and DAAAt) calculated by
means of EViews
Example 12.4 Analogously, we shall construct the model VAR for the data
from Example 12.2 (the seven-variate time series of length 42 with the log returns
(i.e., the first differences of logarithms) of the annual gross domestic products (GDP)
in seven countries, denoted as RGDP_FRAt, RGDP_GERt, RGDP_ITAt,
RGDP_UKt, RGDP_JPNt, RGDP_USt, RGDP_CANt):
1. Identification:
• The LR test and the information criteria in Table 12.11 identify the
given time series as VAR(1).
2. Estimation:
The estimation of the model VAR(1) is realized by means of EViews in
Table 12.12. Analogously to Example 12.3, one could omit some lagged regressors
in the estimated model on the basis of the estimated standard deviations or t-ratios
Table 12.9 Q-test applied to estimated residuals for the diagnostics of model VAR(2) in Example 12.3 (DTB3t and DAAAt) calculated by means of EViews
VAR residual Portmanteau tests for autocorrelations
H0: no residual autocorrelations up to lag h
Included observations: 118
Lags   Q-Stat     Prob.    Adj Q-Stat   Prob.    df
1      0.759100   NA       0.765588     NA       NA
2      2.509240   NA       2.545903     NA       NA
3      7.221494   0.1246   7.381086     0.1171   4
4      9.651343   0.2904   9.896192     0.2724   8
5      12.22388   0.4279   12.58255     0.4001   12
6      15.53721   0.4857   16.07339     0.4479   16
7      16.56259   0.6812   17.16344     0.6423   20
8      20.01529   0.6959   20.86724     0.6465   24
9      24.81018   0.6381   26.05804     0.5699   28
10     27.85802   0.6764   29.38809     0.5994   32
The test is valid only for lags larger than the VAR lag order
df is degrees of freedom for (approximate) chi-square distribution
Table 12.10 LM test applied to estimated residuals for the diagnostics of model VAR(2) in Example 12.3 (DTB3t and DAAAt) calculated by means of EViews
VAR residual serial correlation LM tests
H0: no serial correlation at lag order h
Included observations: 118
Lags   LM-Stat    Prob
1      7.522854   0.1107
2      7.499448   0.1117
3      5.172111   0.2701
4      2.696262   0.6099
5      2.692773   0.6105
6      3.632837   0.4580
7      1.420712   0.8406
8      3.703578   0.4476
9      5.453784   0.2438
10     3.642071   0.4566
Probs from chi-square with 4 df
(even most of the lagged regressors, since the given time series has the large dimension
of 7 but the small length of 42, which causes relatively broad confidence intervals).
3. Diagnostics:
• The estimated model VAR(1) is stationary according to Fig. 12.7.
• Q-test applied to the estimated residuals in Table 12.13 confirms the
uncorrelatedness of estimated residuals.
• The test Jarque–Bera applied to the estimated residuals in Table 12.14 con-
firms the normality of these residuals.
⋄
Table 12.11 LR test and information criteria AIC, BIC, and HQ for the identification of model
VAR in Example 12.4 (RGDP_FRAt, . . .) calculated by means of EViews
VAR lag order selection criteria
Lag LogL LR AIC BIC HQ
0 677.8285 NA 34.40146 34.10287 34.29433
1 744.7419 106.3751 35.32010 32.93139 34.46305
2 770.1225 31.23772 34.10885 29.63003 32.50188
3 817.3099 41.13769 34.01589 27.44695 31.65901
Lag order selected by the criterion
LR sequential modified LR test statistic (each test at 5% level), FPE Final prediction error, AIC
Akaike information criterion, BIC Schwarz information criterion, HQ Hannan–Quinn information
criterion
12.3 Tests of Causality
An important aspect of multivariate time series analysis is the investigation of
causality among various blocks of modeled variables. A general approach to
causality relates to prediction: if a time series causally influences another time series,
then it should improve the prediction results for the affected time series (see, e.g.,
Granger (1969), Sims (1972)).
In econometric practice, conclusions on causality can be drawn easily in
VAR models, where they can be obtained in an objective and computationally
simple way. The so-called Granger causality means that there exists a correlation
between the current value of a variable and past values of other variables (see also
Remark 12.1). In this context, the structure of VAR models makes it possible to reduce such
causality to testing whether blocks of certain parameters in the estimated model
VAR are zero (in particular, this can be done simply by applying F- or Wald tests under
the assumption of stationarity).
More specifically, the following terminology is used in this context (similarly to
the one in Remark 12.1):
• If lagged values of a variable yi in the VAR equation explaining a variable yj are
statistically significant (globally, e.g., in the sense of F-test), then the variable yi
causes (or G-causes) the variable yj.
• If yi causes yj, but yj does not cause yi, then there exists a unidirectional
relationship from yi to yj. In such a case, one denotes the variable yi in the
VAR equation explaining the variable yj as strongly exogenous.
• If yi causes yj and also yj causes yi, then there exists a feedback between yi and yj.
• If yi does not cause yj and also yj does not cause yi, then yi and yj are
G-independent.
Let us consider, e.g., the following bivariate model VAR(1):
Table 12.12 Estimation of model VAR(1) in Example 12.4 (RGDP_FRAt, . . .) by means of EViews
Vector autoregression estimates

Standard errors in ( ) and t-statistics in [ ]
RGDP_FRA(1) 0.153093 0.432343 0.042512 0.028843 0.041562 0.001648 0.055642
(0.22493) (0.32673) (0.28877) (0.31717) (0.42006) (0.35775) (0.46020)
[0.68062] [1.32326] [0.14721] [0.09094] [0.09894] [0.00461] [0.12091]

RGDP_GER(1) 0.120615 0.516444 0.186485 0.034213 0.085963 0.019305 0.024199
(0.11702) (0.16998) (0.15024) (0.16501) (0.21854) (0.18612) (0.23942)
[1.03070] [3.03824] [1.24127] [0.20734] [0.39335] [0.10372] [0.10107]
RGDP_ITA(1) 0.207291 0.206014 0.167585 0.239290 0.113849 0.443415 0.232438
(0.13888) (0.20172) (0.17829) (0.19582) (0.25935) (0.22088) (0.28413)
[1.49264] [1.02127] [0.93995] [1.22197] [0.43898] [2.00749] [0.81807]
RGDP_UK(1) 0.001346 0.344634 0.189435 0.218979 0.429568 0.210135 0.527973
(0.15675) (0.22769) (0.20124) (0.22103) (0.29274) (0.24931) (0.32071)
[0.00859] [1.51360] [0.94132] [0.99072] [1.46742] [0.84285] [1.64629]
RGDP_JPN(1) 0.429235 0.363092 0.316353 0.070350 0.558444 0.228788 0.131389
(0.12418) (0.18038) (0.15943) (0.17510) (0.23191) (0.19751) (0.25407)
[3.45650] [2.01292] [1.98429] [0.40176] [2.40802] [1.15835] [0.51714]
RGDP_US(1) 0.138609 0.247262 0.017365 0.080385 0.400317 0.010285 0.130052
(0.17285) (0.25108) (0.22192) (0.24374) (0.32281) (0.27492) (0.35365)
[0.80188] [0.98479] [0.07825] [0.32980] [1.24011] [0.03741] [0.36774]
RGDP_CAN(1) 0.031245 0.149500 0.053745 0.041571 0.240112 0.118187 0.203071
(0.12512) (0.18175) (0.16064) (0.17643) (0.23367) (0.19901) (0.25599)
[0.24971] [0.82257] [0.33458] [0.23562] [1.02759] [0.59388] [0.79327]
C 0.009702 0.016474 0.015393 0.020895 0.028488 0.018920 0.011073
(0.00574) (0.00834) (0.00738) (0.00810) (0.01073) (0.00914) (0.01175)
[1.68886] [1.97409] [2.08701] [2.57942] [2.65534] [2.07066] [0.94212]
S.E. equation 0.016176 0.023496 0.020767 0.022809 0.030208 0.025727 0.033095

[Fig. 12.7 plot area: inverted roots of the autoregressive polynomial in Example 12.4 (RGDP_FRAt, . . .) plotted relative to the unit circle; EViews output]
Table 12.13 Q-test applied to estimated residuals for the diagnostics of model VAR(1) in Example 12.4 (RGDP_FRAt, . . .) calculated by means of EViews
VAR residual Portmanteau tests for autocorrelations
H0: no residual autocorrelations up to lag h
Included observations: 41
Lags   Q-Stat     Prob.    Adj Q-Stat   Prob.    df
1      11.39627   NA       11.68118     NA       NA
2      37.89006   0.8753   39.53363     0.8308   49
3      83.33112   0.8547   88.56214     0.7418   98
4      121.3687   0.9397   130.7119     0.8285   147
5      164.3181   0.9517   179.6265     0.7931   196
6      197.2777   0.9888   218.2363     0.8898   245
7      231.0652   0.9973   258.9800     0.9303   294
The test is valid only for lags larger than the VAR lag order
df is degrees of freedom for (approximate) chi-square distribution
Table 12.14 Test Jarque–Bera applied to estimated residuals for the diagnostics of model VAR(1) in Example 12.4 (RGDP_FRAt, . . .) calculated by means of EViews
VAR residual normality tests
Orthogonalization: Cholesky (Lutkepohl)
H0: residuals are multivariate normal
Included observations: 41
Component   Jarque-Bera   df   Prob.
1           3.418095      2    0.1810
2           2.690175      2    0.2605
3           2.919029      2    0.2323
4           1.465934      2    0.4805
5           2.644147      2    0.2666
6           1.757411      2    0.4153
7           2.742968      2    0.2537
Joint       17.63776      14   0.2238
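$$\begin{aligned} y_{1t} &= \varphi_{11} y_{1,t-1} + \varphi_{12} y_{2,t-1} + \varepsilon_{1t}, \\ y_{2t} &= \varphi_{21} y_{1,t-1} + \varphi_{22} y_{2,t-1} + \varepsilon_{2t}. \end{aligned} \qquad (12.41)$$
Then the following holds:
• If $\varphi_{12} \neq 0$, then $y_2$ causes $y_1$.
• If $\varphi_{21} \neq 0$, then $y_1$ causes $y_2$.
• If $\varphi_{12} \neq 0$ and $\varphi_{21} = 0$, then there exists a unidirectional relationship from $y_2$
to $y_1$.
• If $\varphi_{12} = 0$ and $\varphi_{21} \neq 0$, then there exists a unidirectional relationship from $y_1$
to $y_2$.
• If $\varphi_{12} \neq 0$ and $\varphi_{21} \neq 0$, then there exists a feedback between $y_1$ and $y_2$.
• If $\varphi_{12} = 0$ and $\varphi_{21} = 0$, then $y_1$ and $y_2$ are G-independent.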
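A Granger causality check in a bivariate VAR(1) can be sketched as a Wald test on a single lagged coefficient. The illustration below assumes NumPy, simulates data from a made-up model in which the first series drives the second but not vice versa, estimates each equation by OLS, and compares the statistic with 3.841, the 95% quantile of the chi-square distribution with one degree of freedom. The function name `granger_wald` is hypothetical, and the statistics are not comparable with the EViews output in the tables.

```python
import numpy as np

def granger_wald(y, cause, effect):
    """Wald statistic (1 df) for H0: the lagged series `cause` has a zero
    coefficient in the VAR(1) equation explaining the series `effect`."""
    n, m = y.shape
    X = np.hstack([np.ones((n - 1, 1)), y[:-1]])   # intercept and y_{t-1}
    target = y[1:, effect]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    s2 = resid @ resid / (len(target) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)              # OLS covariance of the estimates
    j = 1 + cause                                  # column index of the lagged cause
    return coef[j] ** 2 / cov[j, j]

# Simulated data: phi_12 = 0 and phi_21 = 0.5 (illustrative values).
rng = np.random.default_rng(2)
Phi = np.array([[0.5, 0.0],
                [0.5, 0.3]])
n = 500
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = Phi @ y[t - 1] + rng.standard_normal(2)

CRITICAL_5_PERCENT = 3.841                         # chi-square(1), 95% quantile
wald_1_to_2 = granger_wald(y, cause=0, effect=1)
wald_2_to_1 = granger_wald(y, cause=1, effect=0)
print("y1 -> y2:", wald_1_to_2, "reject H0:", wald_1_to_2 > CRITICAL_5_PERCENT)
print("y2 -> y1:", wald_2_to_1, "reject H0:", wald_2_to_1 > CRITICAL_5_PERCENT)
```

With these settings the first test should reject clearly, while the second rejects only by chance (with probability of about 5% under the null hypothesis).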
Remark 12.6 The presented approach to the problem of causality concerns not only
causality relations between two scalar variables but also those between blocks of several
variables of the given model VAR. However, it makes sense only if the given
model VAR is unambiguously identified. If this is not the case, then various
transformations of such a model may exist that deliver different causality results.
⋄
Example 12.5 Let us consider the model VAR(1) estimated in Example 12.4 (the seven-variate time series of length 42 with the log returns of annual gross domestic products in seven countries). Table 12.15 presents the causality analysis based on the corresponding p-values delivered by EViews. Applying the significance level of 5%, one can see that:
• There exists a unidirectional relationship from RGDP_JPN to RGDP_FRA.
• There exists a unidirectional relationship from RGDP_JPN to RGDP_GER.
• There exists a unidirectional relationship from RGDP_JPN to RGDP_ITA.
• There exists a unidirectional relationship from RGDP_ITA to RGDP_US.
• RGDP_FRA is influenced causally by the block of all remaining six variables (joint test "All").
• RGDP_GER is influenced causally by the block of all remaining six variables (joint test "All").
• RGDP_ITA is influenced causally by the block of all remaining six variables (joint test "All").
12.4 Impulse Response and Variance Decomposition
The causality analysis in Sect. 12.3 based on F-tests or analogous tests does not answer such questions as what the sign of a causality relation is or how long the effect of various one-shot changes will persist. Such information can be obtained by means of procedures called impulse response and variance decomposition.
Table 12.15 Causality analysis of model VAR(1) in Example 12.5 (RGDP_FRAt, . . .) calculated
by means of EViews
VAR Granger Causality/Block Exogeneity Wald tests
Excluded Chi-sq df Prob.
Dependent variable: RGDP_FRA
RGDP_GER 1.062342 1 0.3027
RGDP_ITA 2.227988 1 0.1355
RGDP_UK 7.37E-05 1 0.9932
RGDP_JPN 11.94738 1 0.0005
RGDP_US 0.643009 1 0.4226
RGDP_CAN 0.062357 1 0.8028
All 20.79029 6 0.0020
Dependent variable: RGDP_GER
RGDP_FRA 1.751010 1 0.1857
RGDP_ITA 1.042991 1 0.3071
RGDP_UK 2.290986 1 0.1301
RGDP_JPN 4.051830 1 0.0441
RGDP_US 0.969813 1 0.3247
RGDP_CAN 0.676629 1 0.4107
All 14.54094 6 0.0241
Dependent variable: RGDP_ITA
RGDP_FRA 0.021672 1 0.8830
RGDP_GER 1.540756 1 0.2145
RGDP_UK 0.886088 1 0.3465
RGDP_JPN 3.937413 1 0.0472
RGDP_US 0.006123 1 0.9376
RGDP_CAN 0.111941 1 0.7379
All 17.32903 6 0.0081
Dependent variable: RGDP_UK
RGDP_FRA 0.008270 1 0.9275
RGDP_GER 0.042991 1 0.8357
RGDP_ITA 1.493219 1 0.2217
RGDP_JPN 0.161412 1 0.6879
RGDP_US 0.108771 1 0.7415
RGDP_CAN 0.055517 1 0.8137
All 1.936258 6 0.9255
Dependent variable: RGDP_JPN
RGDP_FRA 0.009790 1 0.9212
RGDP_GER 0.154727 1 0.6941
RGDP_ITA 0.192703 1 0.6607
RGDP_UK 2.153336 1 0.1423
RGDP_US 1.537884 1 0.2149
RGDP_CAN 1.055934 1 0.3041
All 3.346580 6 0.7643
Dependent variable: RGDP_US
RGDP_FRA 2.12E-05 1 0.9963
RGDP_GER 0.010758 1 0.9174
RGDP_ITA 4.030007 1 0.0447
RGDP_UK 0.710396 1 0.3993
RGDP_JPN 1.341782 1 0.2467
RGDP_CAN 0.352699 1 0.5526
All 7.351121 6 0.2896
Dependent variable: RGDP_CAN
RGDP_FRA 0.014619 1 0.9038
RGDP_GER 0.010215 1 0.9195
RGDP_ITA 0.669240 1 0.4133
RGDP_UK 2.710257 1 0.0997
RGDP_JPN 0.267433 1 0.6051
RGDP_US 0.135235 1 0.7131
All 4.685880 6 0.5847
1. Impulse Response
Impulse response (see also Sect. 6.3.3.1) investigates the reaction of a chosen dependent variable of the given VAR model to an impulse (innovation shock) generated in a chosen row of this model. A shock generated in the ith row of the model affects not only the variable $y_i$ but is also transmitted to the other variables through the dynamic lag structure of the VAR. Thus, in the estimated m-variate VAR model one can investigate over time (starting at the moment of the impulse) altogether $m^2$ responses (namely, m responses for each of the m dependent variables $y_{1t}, \dots, y_{mt}$ on the left-hand sides of the particular rows at time t). Under the assumption of stationarity of this model, the impacts of impulses in all (i.e., $m^2$) response situations gradually dampen (even though with different intensities, which are often useful to investigate).
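For example, in the bivariate model VAR(1) with zero mean vector of the form
$$
\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
=\begin{pmatrix} 0.6 & 0.2\\ 0 & 0.3\end{pmatrix}
\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
$$
one obtains gradually in time the following responses to a (deterministic) unit impulse generated at time t = 0 in the first row (i.e., the innovation shock is $\varepsilon_{1,0} = 1$, while the second one remains at the zero level):
$$
\begin{pmatrix} y_{1,0}\\ y_{2,0}\end{pmatrix}
=\begin{pmatrix} \varepsilon_{1,0}\\ \varepsilon_{2,0}\end{pmatrix}
=\begin{pmatrix} 1\\ 0\end{pmatrix},\quad
\begin{pmatrix} y_{1,1}\\ y_{2,1}\end{pmatrix}
=\begin{pmatrix} 0.6 & 0.2\\ 0 & 0.3\end{pmatrix}\begin{pmatrix} 1\\ 0\end{pmatrix}
=\begin{pmatrix} 0.6\\ 0\end{pmatrix},\quad
\begin{pmatrix} y_{1,2}\\ y_{2,2}\end{pmatrix}
=\begin{pmatrix} 0.6 & 0.2\\ 0 & 0.3\end{pmatrix}\begin{pmatrix} 0.6\\ 0\end{pmatrix}
=\begin{pmatrix} 0.36\\ 0\end{pmatrix},\ \dots
$$
and the following responses to a unit impulse generated at time t = 0 in the second row (i.e., the innovation shock is $\varepsilon_{2,0} = 1$, while the first one remains at the zero level):
$$
\begin{pmatrix} y_{1,0}\\ y_{2,0}\end{pmatrix}
=\begin{pmatrix} \varepsilon_{1,0}\\ \varepsilon_{2,0}\end{pmatrix}
=\begin{pmatrix} 0\\ 1\end{pmatrix},\quad
\begin{pmatrix} y_{1,1}\\ y_{2,1}\end{pmatrix}
=\begin{pmatrix} 0.6 & 0.2\\ 0 & 0.3\end{pmatrix}\begin{pmatrix} 0\\ 1\end{pmatrix}
=\begin{pmatrix} 0.2\\ 0.3\end{pmatrix},\quad
\begin{pmatrix} y_{1,2}\\ y_{2,2}\end{pmatrix}
=\begin{pmatrix} 0.6 & 0.2\\ 0 & 0.3\end{pmatrix}\begin{pmatrix} 0.2\\ 0.3\end{pmatrix}
=\begin{pmatrix} 0.18\\ 0.09\end{pmatrix},\ \dots
$$
(one can see that both responses dampen in time and that the response of y2 to the impulse generated in the first row is zero at all times since $\varphi_{21} = 0$).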
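The recursion used in this example is easy to reproduce. The following minimal sketch (the function name `impulse_responses` is a hypothetical helper, not part of any package) propagates a one-shot unit impulse through the VAR(1) matrix of the example, with all later innovations set to zero.

```python
import numpy as np

def impulse_responses(phi, shock, horizon):
    """Responses y_0, ..., y_horizon of a zero-mean VAR(1) to a one-shot
    innovation `shock` at time 0 (all later innovations are zero)."""
    responses = [np.asarray(shock, dtype=float)]
    for _ in range(horizon):
        responses.append(phi @ responses[-1])   # y_k = phi @ y_{k-1}
    return np.array(responses)

phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])

resp_row1 = impulse_responses(phi, [1.0, 0.0], horizon=2)  # impulse in the first row
resp_row2 = impulse_responses(phi, [0.0, 1.0], horizon=2)  # impulse in the second row
print(resp_row1)
print(resp_row2)
```

Row k of each printed array is the response vector (y1, y2) at time k, so the rows can be compared directly with the vectors displayed above.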
Remark 12.7 If the VAR model can be written as the linear process of the form
$$
y_t = \varepsilon_t + \Psi_1 \varepsilon_{t-1} + \Psi_2 \varepsilon_{t-2} + \cdots \tag{12.42}
$$
(see also (12.22)), then the elements of the ith column of the matrix $\Psi_k$ represent the responses of the particular dependent variables to the unit innovation shock generated in the ith row at time $t-k$ (while the other values of the multivariate white noise remain at the zero level).
⋄
Remark 12.8 Several technical problems must be solved when the impulse response analysis is applied (see, e.g., EViews):
• Sometimes it is reasonable to investigate the response to an impulse that does not occur in one shot but repeatedly, starting at a given time; in this case, the response in a stationary VAR model does not dampen but stabilizes after some time at a (nonzero) level.
• The (random) impulses are usually set to one standard deviation of the estimated residuals (or to multiples of this standard deviation). Such a standardization is particularly reasonable when different variables are measured on different scales.
• Standard deviations for particular responses can be constructed (analytically or by means of simulations; see Fig. 12.8).
• The impulses can also be orthogonalized in advance (e.g., using Cholesky decomposition similarly as in the transformation (12.17)) to guarantee the mutual uncorrelatedness of (random) impulses generated in different rows of the VAR model.
• The ordering of the variables included in the given VAR model is very important and should be consistent with economic theory: by applying Cholesky decomposition (12.16), one can rewrite the linear process (12.42) in the form
$$
y_t = L L^{-1}\varepsilon_t + \Psi_1 L L^{-1}\varepsilon_{t-1} + \cdots = L u_t + \Psi_1 L u_{t-1} + \cdots, \tag{12.43}
$$
where $\{u_t\} = \{L^{-1}\varepsilon_t\}$ is the orthogonalized white noise with simultaneously uncorrelated components, and the matrix L applied to the current value $u_t$ of this white noise in (12.43) is lower triangular. Hence one should choose (i) the first variable y1 such that the shock $u_{1t}$ in the first row may immediately (at time t) affect each of the variables $y_{1t}, \dots, y_{mt}$, (ii) similarly the second variable y2 such that the shock $u_{2t}$ in the second row may immediately (at time t) affect each of the variables $y_{2t}, \dots, y_{mt}$ but not the variable $y_{1t}$, etc.
⋄
Fig. 12.8 Impulse response analysis of model VAR(2) in Example 12.6 (DTB3t and DAAAt) calculated by means of EViews [panels: Response of DTB3 to DTB3, Response of DTB3 to DAAA, Response of DAAA to DTB3, Response of DAAA to DAAA; horizons 1 to 10]
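The role of the ordering in (12.43) can be made concrete with a small numerical sketch. It reuses the VAR(1) matrix from the example in the text; the innovation covariance matrix `sigma` and the helper name `orthogonalized_responses` are assumptions made only for this illustration.

```python
import numpy as np

phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])           # VAR(1) matrix from the example in the text
sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])         # assumed innovation covariance (illustrative)

L = np.linalg.cholesky(sigma)          # lower triangular factor, sigma = L @ L.T

def orthogonalized_responses(phi, L, horizon):
    """Theta_k = Psi_k @ L with Psi_k = phi**k for a VAR(1); column j of
    Theta_k is the response to a unit shock in u_j after k periods."""
    psi = np.eye(phi.shape[0])
    out = []
    for _ in range(horizon + 1):
        out.append(psi @ L)
        psi = phi @ psi
    return np.array(out)

theta = orthogonalized_responses(phi, L, horizon=5)
print(theta[0])   # impact responses: equal to L itself
```

Because L is lower triangular, the impact response of the first variable to the second orthogonalized shock is exactly zero, which is the formal counterpart of points (i) and (ii) of the remark.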
Fig. 12.9 Variance decomposition analysis of model VAR(2) in Example 12.6 (DTB3t and DAAAt) calculated by means of EViews [panels: Percentage of variance of DTB3 by DTB3, Percentage of variance of DTB3 by DAAA, Percentage of variance of DAAA by DTB3, Percentage of variance of DAAA by DAAA; horizons 1 to 10]
2. Variance Decomposition
While the impulse response captures the impact of an impulse generated in a chosen row of the VAR on a chosen dependent variable, the variance decomposition shows which portion of the variance of the prediction error (when predicting a chosen dependent variable) is explained by the innovations from the particular rows of the VAR. In practice, the main portion of this variance for $y_{it}$ is usually explained by the innovation from the ith row of the model.
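A possible way to compute such a decomposition for a VAR(1) is sketched below. The sketch assumes Cholesky-orthogonalized innovations; the matrices `phi` and `sigma` and the function name `fevd` are illustrative assumptions and do not come from the estimated models of this chapter.

```python
import numpy as np

def fevd(phi, sigma, horizon):
    """Shares of the h-step prediction error variance of each variable
    (rows) attributed to each orthogonalized innovation (columns),
    for h = 1, ..., horizon, in a VAR(1) with matrix phi."""
    L = np.linalg.cholesky(sigma)
    m = phi.shape[0]
    psi = np.eye(m)
    contrib = np.zeros((m, m))
    shares = []
    for _ in range(horizon):
        theta = psi @ L                 # orthogonalized responses at this lag
        contrib += theta ** 2           # accumulate squared responses
        shares.append(contrib / contrib.sum(axis=1, keepdims=True))
        psi = phi @ psi
    return np.array(shares)

phi = np.array([[0.6, 0.2],
                [0.0, 0.3]])            # illustrative VAR(1) matrix
sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])          # assumed innovation covariance
shares = fevd(phi, sigma, horizon=10)
print(np.round(100 * shares[-1], 1))    # percentages at horizon 10
```

Each row of the printed matrix sums to 100%, and the diagonal entries give the portion explained by the variable's own innovation.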
Example 12.6 Figures 12.8 and 12.9 present the results of the impulse response and variance decomposition analysis for the model VAR(2) estimated in Example 12.3 (the bivariate time series of length 120 with the first differences of monthly yields to maturity for 3-month T-bills DTB3t in the first component and for corporate bonds DAAAt in the second component):
• The responses in Fig. 12.8 to (random) impulses set to one standard deviation of the estimated residuals are computed after orthogonalization by Cholesky decomposition (the standard deviations of the estimated responses are also plotted; see Remark 12.8). Each of the four responses gradually dampens, which is consistent with the stability (and hence stationarity) of the given VAR model. Note the nonnegligible response of DAAA to the impulse generated in the first equation for DTB3 (see the lower left graph).
• The variance decomposition in Fig. 12.9 confirms the well-known fact (see above) that most of the prediction variance of a given variable is explained by the innovation from the equation explaining this variable (approximately 99% for DTB3 and 72% for DAAA). However, the portion of 28% of the variance of DAAA explained by DTB3 is not negligible and is consistent with the result obtained by means of the impulse response analysis.
12.5 Cointegration and EC Models
In most cases, a linear combination of (univariate) nonstationary time series is again nonstationary. More specifically, if we have
$$
y_{it} \sim I(d_i), \quad i = 1, \dots, m \tag{12.44}
$$
(i.e., m univariate time series that can be made stationary by appropriate differencing; see (6.89)), then, as a rule, it holds
$$
\sum_{i=1}^{m} \alpha_i y_{it} \sim I\Bigl(\max_{i=1,\dots,m} d_i\Bigr) \tag{12.45}
$$
for an arbitrary nontrivial linear combination of the considered time series. Hence, in particular, any linear combination of time series with a (stochastic) linear trend $y_{it} \sim I(1)$ usually includes a (stochastic) linear trend as well.
On the other hand, economic and financial time series can sometimes be combined in such a way that the resulting linear combination of nonstationary time series becomes stationary. Such a phenomenon is called cointegration and can be interpreted as a relationship of long-run equilibrium among economic variables: the particular time series are nonstationary, but their ("cointegrated") movement in time tends (as a consequence of various market forces) to a balanced state of equilibrium (even though deviations from such a long-run balance persist over short-run segments). In finance in particular, there exist various examples of cointegration, e.g.:
• Among spot and futures prices of various assets (commodities, securities, and
the like).
• Among price ratios in different countries (i.e., the ratios of prices of the same
goods) and corresponding currency rates.
• Among market prices of stocks and volumes of dividends.
In these (and other) examples, the absence of a long-run equilibrium would give rise to arbitrage opportunities. Therefore, the principle of cointegration including so-called EC models (see below) has become one of the main econometric topics (see, e.g., the seminal work by Engle and Granger (1987)).
Both the theoretical and the practical analyses of cointegration become simpler in the framework of VAR models, which is demonstrated by the following Example 12.7.
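Example 12.7 Let us consider two time series {y1t} and {y2t} that can be modeled simultaneously as the bivariate model VAR(1) of the form
$$
\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
=\Phi\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
=\begin{pmatrix} 0.5 & -0.25\\ -1 & 0.5\end{pmatrix}
\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix},
\tag{12.46}
$$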
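where the bivariate white noise {εt} has a general covariance matrix Σ > 0. However, this model is not stationary, since the eigenvalues of the matrix Φ in (12.46), i.e., the roots of the polynomial equation $\det(\lambda I - \Phi) = 0$, are 0 and 1, so that one of them does not lie inside the unit circle in the complex plane. Each of the marginal time series {y1t} and {y2t} is nonstationary as well: if the model (12.46) is multiplied from the left-hand side as
$$
\begin{pmatrix} 1-0.5B & -0.25B\\ -B & 1-0.5B\end{pmatrix}
\begin{pmatrix} 1-0.5B & 0.25B\\ B & 1-0.5B\end{pmatrix}
\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
=\begin{pmatrix} 1-0.5B & -0.25B\\ -B & 1-0.5B\end{pmatrix}
\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix},
\tag{12.47}
$$
then, since the product of the two matrices on the left-hand side equals $(1-B)I$, one obtains the equivalent model
$$
\begin{pmatrix} 1-B & 0\\ 0 & 1-B\end{pmatrix}
\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
=\begin{pmatrix} 1-0.5B & -0.25B\\ -B & 1-0.5B\end{pmatrix}
\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}.
\tag{12.48}
$$
It explicitly demonstrates that both marginal time series {y1t} and {y2t} can be modeled as ARIMA(0,1,1); therefore, these univariate time series include stochastic linear trends and are nonstationary.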
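Now let us transform the time series {y1t} and {y2t} to time series {z1t} and {z2t} and the white noise {εt} to another white noise {ut} by means of the transformation
$$
\begin{aligned}
\begin{pmatrix} z_{1t}\\ z_{2t}\end{pmatrix}
&= P\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
=\begin{pmatrix} 1 & 0.5\\ 2 & -1\end{pmatrix}\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix},\\
\begin{pmatrix} u_{1t}\\ u_{2t}\end{pmatrix}
&= P\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
=\begin{pmatrix} 1 & 0.5\\ 2 & -1\end{pmatrix}\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}.
\end{aligned}
\tag{12.49}
$$
Since (12.46) can be rewritten as
$$
P\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
= P\Phi P^{-1}\, P\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+ P\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix},
\tag{12.50}
$$
one obtains finally
$$
\begin{pmatrix} z_{1t}\\ z_{2t}\end{pmatrix}
=\begin{pmatrix} 0 & 0\\ 0 & 1\end{pmatrix}
\begin{pmatrix} z_{1,t-1}\\ z_{2,t-1}\end{pmatrix}
+\begin{pmatrix} u_{1t}\\ u_{2t}\end{pmatrix},
\tag{12.51}
$$
i.e., for the individual equations
$$
\begin{aligned}
z_{1t} &= u_{1t},\\
z_{2t} &= z_{2,t-1} + u_{2t}
\end{aligned}
\tag{12.52}
$$
(the time series {z1t} is directly a white noise and the time series {z2t} is a random walk of the type I(1)).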
We have arrived at an apparent paradox: both original time series {y1t} and {y2t} are nonstationary, and according to (12.48) the nonstationarity of each of them is caused by one unit root (i.e., as if two unit roots figured in the system), while after the transformation only one time series {z2t} is nonstationary in the system (this confirms the previous conclusion that the system considered as a whole possesses only one unit root). This paradox is explained by the existence of the cointegration relation
$$
y_{1t} + 0.5\,y_{2t} = z_{1t} = u_{1t} \tag{12.53}
$$
(see (12.49) and (12.52)): both univariate time series {y1t} and {y2t} are nonstationary, while their linear combination (12.53) is stationary.
⋄
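The statements of Example 12.7 can be checked numerically. The following minimal sketch writes the matrices Φ and P with their minus signs made explicit (the sign of the second row of P is an assumption here; flipping it only changes the sign of z2) and simulates the system with standard normal innovations, which is an illustrative choice of Σ.

```python
import numpy as np

phi = np.array([[0.5, -0.25],
                [-1.0, 0.5]])
P = np.array([[1.0, 0.5],
              [2.0, -1.0]])             # second-row sign is an assumption

eigenvalues = np.sort(np.linalg.eigvals(phi).real)
D = P @ phi @ np.linalg.inv(P)          # should equal diag(0, 1) as in (12.51)

rng = np.random.default_rng(1)
n = 2000
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = phi @ y[t - 1] + rng.standard_normal(2)

z = y @ P.T                             # z_t = P y_t, row by row
print(eigenvalues)
print(np.round(D, 12))
print(z[:, 0].var(), z[:, 1].var())     # stationary combination vs. random walk
```

The first transformed series is the cointegration combination y1 + 0.5 y2 and keeps a bounded sample variance, while the second one behaves as a random walk.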
Cointegration can be defined exactly in two equivalent ways (in both cases, we confine ourselves to the special case that appears most frequently in practice). Let {y1t}, . . ., {ymt} be nonstationary time series whose nonstationarity is always caused by just one unit root of the corresponding autoregressive polynomial (in particular, it may be $y_{1t} \sim I(1), \dots, y_{mt} \sim I(1)$). Then the time series {y1t}, . . ., {ymt} are cointegrated if
(1) there exists a nontrivial (i.e., nonzero) linear combination of them that is stationary;
(2) equivalently: the corresponding VAR model of the multivariate time series $(y_{1t}, \dots, y_{mt})'$ has $m - r$ unit roots, where $r$ ($0 < r < m$) denotes the number of cointegration relations of the type (12.53).
For instance, in Example 12.7: (1) both time series {y1t} and {y2t} are of the type I(1) and their stationary linear combination exists (see (12.53)), or equivalently (2) there exists their bivariate VAR model with one unit root (see (12.46)), i.e., $m = 2$, $r = 1$. Therefore, {y1t} and {y2t} are cointegrated with the single cointegration relation (12.53) (if we ignore its scalar multiples).
Remark 12.9 More generally, one can define cointegration of order (d, b) (b > 0, d > 0), where $y_{1t} \sim I(d), \dots, y_{mt} \sim I(d)$, and there exists a nontrivial linear combination of the given time series that is of the type $I(d-b)$. Then one writes $\{y_t\} \sim CI(d, b)$ (i.e., CoIntegrated).
⋄
1. EC Model (ECM)
When analyzing a univariate nonstationary time series, the usual recommendation is to difference it first (e.g., if $\{y_t\} \sim I(1)$, then one transfers it to the first differences $\{\Delta y_t\}$; see Sect. 6.4.2). However, when we deal with several nonstationary variables observed in time and are interested in their mutual link in time, then transferring to differences may be statistically correct, but the model constructed for the differenced variables may not recover relations of long-run equilibrium among the original (non-differenced) variables, which is an important feature precisely in the case of cointegration.
For example, let us consider two time series {xt} and {yt}, which are both nonstationary of the type I(1). There is a conjecture that the time series {xt} influences {yt}. Since both time series are nonstationary, this conjecture could possibly be investigated by means of the model
$$
\Delta y_t = \gamma\,\Delta x_t + \varepsilon_t. \tag{12.54}
$$
However, we are interested in the relation between the variables x and y after its balancing to a long-run equilibrium, when the increments of the variables within time units are (nearly) zero. Therefore, the relation (12.54) has no informative value from this point of view. The situation is different if the time series {xt} and {yt} seem to be cointegrated over a long-term horizon: then the model (12.54) can be corrected to the form
$$
\Delta y_t = \gamma\,\Delta x_t + \alpha\,(y_{t-1} - \beta x_{t-1}) + \varepsilon_t \tag{12.55}
$$
by including a correction term that is based on the level (and not on the differenced) values of the given variables at the previous time $t-1$. The model (12.55) describes not only the short-run relation between the increments $\Delta x_t$ and $\Delta y_t$, but it simultaneously guarantees corrections in the case when short-run changes of both variables move the levels of these variables away from their long-run equilibrium state. Let us stress that the correction of the changes of the variables x and y from time $t-1$ to time t is based on the correction term constructed at time $t-1$, since its value at time t is not known yet when the corrections for this time are constructed. If the time series {xt} and {yt} are really cointegrated and the correction term in (12.55) is chosen as the cointegration relation providing a stationary time series, then all terms in the model (12.55) are stationary. Thus, the situation when one uses both stationary and nonstationary terms simultaneously in one model is avoided (it could cause problems when constructing such a model).
As far as the terminology is concerned, the model of the type (12.55) is mostly called EC model (error correction or equilibrium correction). Sometimes it is also called VEC model (vector error correction) to stress the VAR context. The terms of the type $y_{t-1} - \beta x_{t-1}$ are called error correction terms. The parameters of the type β describe long-run cointegration relations among variables, and they are usually arranged into so-called cointegration vectors of the type $(1, -\beta)'$. The parameters of the type γ describe short-run relations among variables. Finally, the parameters of the type α control the rate of adjustment to the equilibrium state. Moreover, there may be intercepts or linear trends in the model (including the error correction terms), e.g.,
$$
\Delta y_t = \gamma_1 + \gamma_2\,\Delta x_t + \alpha\,(y_{t-1} - \beta_1 - \beta_2 x_{t-1}) + \varepsilon_t \tag{12.56}
$$
(the parameter $\gamma_1$ is an intercept from the point of view of the differenced variables, but it represents a deterministic linear trend from the point of view of the level variables).
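The mechanics of the EC model (12.55) can be illustrated by a short simulation. The sketch below is only an illustration under stated assumptions: the parameter values γ = 0.5, α = -0.3, β = 2, the sample size, and the treatment of β as known in the regression are all choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
gamma, alpha, beta = 0.5, -0.3, 2.0     # illustrative values, not from the text

x = np.cumsum(rng.standard_normal(n))   # x_t ~ I(1) (random walk)
y = np.zeros(n)
y[0] = beta * x[0]
for t in range(1, n):
    ect = y[t - 1] - beta * x[t - 1]    # error correction term at time t-1
    y[t] = (y[t - 1] + gamma * (x[t] - x[t - 1])
            + alpha * ect + rng.standard_normal())

# OLS of dy_t on dx_t and the lagged error correction term (beta taken as known)
dy, dx = np.diff(y), np.diff(x)
ect_lag = y[:-1] - beta * x[:-1]
X = np.column_stack([dx, ect_lag])
(gamma_hat, alpha_hat), *_ = np.linalg.lstsq(X, dy, rcond=None)
print(round(gamma_hat, 3), round(alpha_hat, 3))
```

A negative α pulls y back toward βx after each deviation, so all three series entering the regression are stationary even though x and y themselves are I(1).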
2. EC Model Formulated as VAR
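The theory of EC models (and also their practical testing and construction) is most elaborate in the context of vector autoregressive (VAR) models. For instance, let us consider the bivariate model VAR(1)
$$
\begin{pmatrix} y_{1t}\\ y_{2t}\end{pmatrix}
=\Phi\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix},
\tag{12.57}
$$
which can be rewritten in the form
$$
\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix}
=\Pi\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix},
\quad\text{where } \Pi = \Phi - I.
\tag{12.58}
$$
The key role in the classification of (12.58) as an EC model is played by the rank $r = r(\Pi)$ of the matrix Π (in our case of the type 2 × 2), which is closely related to the eigenvalues of the matrix Φ or, equivalently, to the roots of the autoregressive polynomial $\Phi(z) = I - \Phi z$: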
1. $r(\Pi) = 0$: In this case $\Pi = 0$, so that according to (12.58) both time series {y1t} and {y2t} are nonstationary of the type I(1), and no cointegration relation exists between them.
2. $r(\Pi) = 2$: In this case, Π has full rank so that both eigenvalues of this matrix are nonzero, and hence no root of the polynomial Φ(z) equals one. Moreover, if we assume that both roots of Φ(z) lie outside the unit circle in the complex plane (i.e., both inverted roots of Φ(z), which are simultaneously the eigenvalues of the matrix Φ, lie inside the unit circle), then the VAR model (12.57) is stationary, and it makes no sense to transfer it by differencing of the type (12.58) to the EC model.
3. $r(\Pi) = 1$: This case is the most interesting one from the point of view of cointegration and EC methodology. Just one of the two eigenvalues of Π is nonzero, i.e., just one of the two roots of the polynomial Φ(z) equals one. If we again assume that the remaining root of Φ(z) lies outside the unit circle, then one can show that both univariate time series $\{\Delta y_{1t}\}$ and $\{\Delta y_{2t}\}$ are stationary (more specifically, each of the non-differenced time series {y1t} and {y2t} is of the type ARIMA(1,1,1)). Moreover, (12.58) can be rewritten as
$$
\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix}
=\alpha\beta'\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
=\begin{pmatrix} \alpha_1\\ \alpha_2\end{pmatrix}
\bigl(\beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr)
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
\tag{12.59}
$$
(the existence of the bivariate column vectors α and β follows from the unit rank $r(\Pi) = 1$). The series $\{\Delta y_{1t}\}$ and $\{\Delta y_{2t}\}$ are stationary; hence the time series $\{\beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\}$ must also be stationary (otherwise, (12.59) would equate stationary and nonstationary terms), and it represents the cointegration relation between {y1t} and {y2t}. The construction of the EC model (12.59) serves as an example of the application of Granger's representation theorem (see, e.g., Engle and Granger (1987)). Generally, it holds that the rank r of the matrix Π equals the number of cointegration relations (if we ignore scalar multiples of these relations) in the corresponding EC model, while $m - r$ is the number of unit roots in the considered m-variate VAR model.
Remark 12.10 There is a direct analogy to the DF test (Dickey and Fuller (1979); see (6.81)), where the validity of the null hypothesis
$$
H_0:\ \Delta y_t = \psi\, y_{t-1} + \varepsilon_t \quad\text{for } \psi = 0 \tag{12.60}
$$
means the existence of a unit root in the tested time series {yt} (obviously, $\psi = \varphi_1 - 1 = \pi$).
⋄
Example 12.8 Let us consider two time series {y1t} and {y2t} from Example 12.7 modeled simultaneously as the bivariate model VAR(1) of the form (12.46). This model can be easily transferred to the form (12.59), namely
$$
\begin{aligned}
\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix}
&=\begin{pmatrix} -0.5 & -0.25\\ -1 & -0.5\end{pmatrix}
\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}\\
&=\begin{pmatrix} -0.5\\ -1\end{pmatrix}
\begin{pmatrix} 1 & 0.5\end{pmatrix}
\begin{pmatrix} y_{1,t-1}\\ y_{2,t-1}\end{pmatrix}
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
\end{aligned}
\tag{12.61}
$$
(here the matrix Π has the rank $r = 1$). The corresponding cointegration vector (just one if we ignore its scalar multiples) is
$$
\beta = (1,\ 0.5)' \tag{12.62}
$$
(see also (12.53)). The bivariate model really has just one unit root (see Example 12.7). The first and second components of the vector $\alpha = (-0.5, -1)'$ control the rate of adjustment to the equilibrium state in the first and second equations, respectively.
The motivation for the model (12.61) can also be presented in another way: the time series {y1t} and {y2t} considered individually, ignoring their cointegration relation, possess altogether two unit roots (one for each of them; see (12.48)), i.e., more than the single unit root of the bivariate VAR model (12.46). It implies that by differencing each component to achieve stationarity we would overdifference the given system, which has some negative consequences: the invertibility may be damaged due to lagged terms of the type $\varepsilon_{1,t-1}$, estimation and prediction problems may appear, and the like. On the other hand, if we correct the model by subtracting the vector $y_{t-1}$ from both sides of (12.46) (as is the case in (12.61)), then the MA structure of this model does not change (of course, there must be a compensation that maintains invertibility, namely the presence of the level variable $y_{t-1}$ in the model).
⋄
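The factorization in Example 12.8 can be verified in a few lines. The sketch below takes Φ from (12.46) with its minus signs written out and normalizes the first component of β to one, which is the normalization used in (12.62).

```python
import numpy as np

phi = np.array([[0.5, -0.25],
                [-1.0, 0.5]])
Pi = phi - np.eye(2)                    # Pi = Phi - I as in (12.58)
rank = np.linalg.matrix_rank(Pi)

# Rank-one factorization Pi = alpha @ beta', normalized so that beta[0] = 1.
beta = Pi[0] / Pi[0, 0]                 # any nonzero row of Pi is proportional to beta'
alpha = Pi[:, 0] / beta[0]              # first column of Pi equals alpha * beta[0]

print(rank)
print(alpha, beta)
print(np.outer(alpha, beta))            # reproduces Pi
```

The recovered vectors agree with α = (-0.5, -1)' and β = (1, 0.5)' of the example, and the matrix rank confirms the single cointegration relation.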
In general, the m-variate model VAR(p) of the form
$$
y_t = \Phi_1 y_{t-1} + \cdots + \Phi_p y_{t-p} + \varepsilon_t \tag{12.63}
$$
constructed for components $y_{1t} \sim I(1), \dots, y_{mt} \sim I(1)$ has the following EC representation:
$$
\Delta y_t = \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + \cdots + \Gamma_{p-1} \Delta y_{t-p+1} + \varepsilon_t, \tag{12.64}
$$
where
$$
\begin{aligned}
\Pi &= \Phi_1 + \cdots + \Phi_p - I = -\Phi(1),\\
\Gamma_1 &= -\Phi_2 - \cdots - \Phi_p,\ \dots,\ \Gamma_{p-2} = -\Phi_{p-1} - \Phi_p,\ \Gamma_{p-1} = -\Phi_p.
\end{aligned}
\tag{12.65}
$$
If the rank r of the matrix Π fulfills $0 < r < m$ (the boundary cases $r = 0$ and $r = m$ are discussed in the commentary below (12.58)), then (again according to Granger's theorem) there exist matrices α and β (both rectangular of the type $m \times r$ with full column rank r) such that $\Pi = \alpha\beta'$ and, moreover, each component of the vector $\beta' y_t$ can be modeled as I(0). In other words, there exists an EC representation of the form
$$
\Delta y_t = \alpha\beta' y_{t-1} + \Gamma_1 \Delta y_{t-1} + \cdots + \Gamma_{p-1} \Delta y_{t-p+1} + \varepsilon_t \tag{12.66}
$$
with r cointegration relations (each column of the matrix β corresponds to one of these cointegration relations).
⋄
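The mapping (12.65) from the VAR(p) matrices to Π and Γ1, . . ., Γp-1 can be sketched as follows. The function name `var_to_vecm` and the randomly generated coefficient matrices are hypothetical; the final check only confirms that (12.63) and (12.64) produce the same value of y_t when the innovation is set to zero.

```python
import numpy as np

def var_to_vecm(phis):
    """Map VAR(p) matrices [Phi_1, ..., Phi_p] to (Pi, [Gamma_1, ..., Gamma_{p-1}])."""
    m = phis[0].shape[0]
    Pi = sum(phis) - np.eye(m)
    # Gamma_i = -(Phi_{i+1} + ... + Phi_p), i = 1, ..., p-1
    gammas = [-sum(phis[i + 1:]) for i in range(len(phis) - 1)]
    return Pi, gammas

rng = np.random.default_rng(5)
phis = [0.3 * rng.standard_normal((2, 2)) for _ in range(3)]   # illustrative VAR(3)
Pi, gammas = var_to_vecm(phis)

# Arbitrary past values y_{t-1}, y_{t-2}, y_{t-3}; innovation set to zero
y1, y2, y3 = rng.standard_normal((3, 2))
y_var = phis[0] @ y1 + phis[1] @ y2 + phis[2] @ y3
y_vecm = y1 + Pi @ y1 + gammas[0] @ (y1 - y2) + gammas[1] @ (y2 - y3)
print(np.allclose(y_var, y_vecm))
```

The equality holds for any choice of the coefficient matrices, since (12.64) is only an algebraic rearrangement of (12.63).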
Remark 12.11 The matrices α and β in (12.66) are not determined unambiguously: if γ is an arbitrary regular (i.e., nonsingular) matrix of the type $r \times r$, then the matrices $\alpha\gamma^{-1}$ and $\beta\gamma'$ present another possible decomposition of the matrix Π. Therefore, various adjustments are recommended in practice, e.g., normalizations of the parametric matrices α and β or the application of prescribed a priori constraints.
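The non-uniqueness described here is easy to check numerically. In the following sketch the dimensions, the random matrices, and the particular normalization (making the top r × r block of β the identity matrix) are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
m, r = 3, 2
alpha = rng.standard_normal((m, r))
beta = rng.standard_normal((m, r))
g = np.array([[2.0, 1.0],
              [0.0, 1.0]])              # any nonsingular r x r matrix

alpha_new = alpha @ np.linalg.inv(g)    # alpha * gamma^{-1}
beta_new = beta @ g.T                   # beta * gamma'
same_pi = np.allclose(alpha @ beta.T, alpha_new @ beta_new.T)

# One possible normalization (an assumption): top r x r block of beta equals I.
beta_norm = beta @ np.linalg.inv(beta[:r])
alpha_norm = alpha @ beta[:r].T
print(same_pi)
print(np.round(beta_norm, 3))
```

Both the transformed pair and the normalized pair reproduce the same matrix Π, so only the space spanned by the columns of β is identified without further constraints.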
It is also usual in practice that various exogenous variables appear on the right-hand side of the EC model (12.66): intercepts, polynomial trends, dummy variables (e.g., in seasonal models, the dummy variables should be centered in a suitable way; see Johansen (1995)) or exogenous variables from other systems. As this has significant consequences for tests and construction of cointegrated models, the corresponding software systems usually make it possible to distinguish various types of EC models:
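• The cointegration relations contain intercepts, e.g., instead of (12.59) it holds
$$
\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix}
=\begin{pmatrix} \alpha_1\\ \alpha_2\end{pmatrix}
\bigl(\delta_0 + \beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr)
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}.
\tag{12.67}
$$
• The EC model contains intercepts outside of the cointegration relations, e.g., instead of (12.59) it holds
$$
\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix}
=\begin{pmatrix} \varphi_{10}\\ \varphi_{20}\end{pmatrix}
+\begin{pmatrix} \alpha_1\\ \alpha_2\end{pmatrix}
\bigl(\beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr)
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}
\tag{12.68}
$$
(as a matter of fact, the parameters $\varphi_{10}$ and $\varphi_{20}$ are intercepts from the point of view of the differenced variables, but they represent deterministic linear trends from the point of view of the level variables y1 and y2).
• The intercepts are included both in the cointegration relations and outside of them: in such a case, one must distinguish strictly between "inner" and "outer" intercepts (otherwise, there may be problems with the identification of these models).
• The cointegration relations contain linear trends, e.g., instead of (12.59) it holds
$$
\begin{pmatrix} \Delta y_{1t}\\ \Delta y_{2t}\end{pmatrix}
=\begin{pmatrix} \alpha_1\\ \alpha_2\end{pmatrix}
\bigl(\delta_0 + \delta_1 t + \beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}\bigr)
+\begin{pmatrix} \varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}.
\tag{12.69}
$$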
⋄
3. Testing of Cointegration
Testing of cointegration should confirm that the tested VAR model contains just the
given number r of cointegration relations, which is important information when
constructing a suitable EC model. The cointegration is declared in the case when
r > 0 (in particular, the case of stationary model VAR with r ¼ m can be also
regarded as a special cointegrated state, in which each equation represents directly
one of m cointegration relations).
Engle and Granger (1987) suggested a simple test of cointegration among variables y, x2, . . ., xk. Their EG test is based on a simple idea: if the given variables are cointegrated, then the OLS residuals $\hat\varepsilon_t$ calculated by the least squares method in the model
$$
y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \varepsilon_t \tag{12.70}
$$
should be of the type I(0). Therefore, it is sufficient to modify the DF test (Dickey and Fuller (1979); see (6.81)) and to test the null hypothesis
$$
H_0:\ \Delta\hat\varepsilon_t = \psi\,\hat\varepsilon_{t-1} + u_t \quad\text{for } \psi = 0. \tag{12.71}
$$
The only difference from the classical DF test is that in (12.71) one applies residuals estimated from a specific model, so that the critical values of the classical DF test cannot be used. The relevant critical values, which are more negative than those for the DF test, are tabulated by means of simulations in Engle and Granger (1987) and Engle and Yoo (1987). On the other hand, this approach has some drawbacks, namely:
• In the case of nonstationary variables, the OLS estimate of model (12.70) may not be reliable.
• In the case of several cointegration relations, one cannot decide which of them has in fact been estimated by means of (12.70) (would one obtain a "more intensive" cointegration relation after reordering the given variables?).
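The two steps of the EG procedure, (12.70) and (12.71), can be sketched on simulated data as follows. The function name `engle_granger_stat`, the data-generating processes, and the sample size are assumptions for this illustration; the sketch returns only the DF-type t-ratio, which would still have to be compared with the simulated critical values mentioned above rather than with standard t or DF tables.

```python
import numpy as np

def engle_granger_stat(y, x):
    """Step 1: OLS of y on a constant and x; step 2: DF regression
    d(e_t) = psi * e_{t-1} + u_t on the residuals. Returns (psi_hat, t_ratio)."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ coef
    de, e_lag = np.diff(e), e[:-1]
    psi = (e_lag @ de) / (e_lag @ e_lag)
    u = de - psi * e_lag
    se = np.sqrt((u @ u) / (len(de) - 1) / (e_lag @ e_lag))
    return psi, psi / se

rng = np.random.default_rng(2)
n = 500
x = np.cumsum(rng.standard_normal(n))           # I(1) regressor (random walk)
y = 1.0 + 2.0 * x + rng.standard_normal(n)      # cointegrated with x by construction
w = np.cumsum(rng.standard_normal(n))           # independent random walk

psi_c, t_c = engle_granger_stat(y, x)           # cointegrated pair
psi_s, t_s = engle_granger_stat(w, x)           # non-cointegrated pair
print(round(psi_c, 3), round(t_c, 2))
print(round(psi_s, 3), round(t_s, 2))
```

For the cointegrated pair the residuals behave like white noise, so the estimate of ψ is close to -1 and the t-ratio is strongly negative, while for the two independent random walks the statistic is much closer to zero.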
Nowadays, cointegration is mostly tested in practice by means of Johansen tests (see Johansen (1991)). The method is based on the ML estimate of so-called canonical correlations, which measure the partial correlations between the m-variate vectors $\Delta y_t$ and $y_{t-1}$ under fixed values of the vectors $\Delta y_{t-1}, \dots, \Delta y_{t-p+1}$ in the EC model (12.64). These canonical correlations are the square roots of the eigenvalues $\lambda_1, \dots, \lambda_m$ of a positive definite matrix that is closely related to the matrix Π. The ML estimates $\hat\lambda_1, \dots, \hat\lambda_m$ of these eigenvalues based on $y_1, \dots, y_n$ fulfill
$$
1 > \hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_m \ge 0. \tag{12.72}
$$
In particular, the number of nonzero (positive) values $\lambda_1, \dots, \lambda_r$ ($\lambda_{r+1} = \lambda_{r+2} = \dots = \lambda_m = 0$) is equal to the rank r of the matrix Π, i.e., to the number of cointegration relations in the EC model (12.64).
Johansen tests are constructed to test that the eigenvalues λ are zero. As a matter of fact, these are LR tests (i.e., tests based on the concept of likelihood ratio), whose critical values are not obtained from the $\chi^2$-distribution but by means of simulations (see MacKinnon et al. (1999)). Two types of these tests are used in practice:
• Johansen test with the statistic
$$
\lambda_{\mathrm{trace}}(r) = -n \sum_{i=r+1}^{m} \ln\bigl(1 - \hat\lambda_i\bigr) \tag{12.73}
$$
is a compound test of the null hypothesis that the number of cointegration relations is at most r against the alternative that this number is higher than r. This test rejects the null hypothesis if $\lambda_{\mathrm{trace}}(r)$ is higher than the corresponding critical value (intuitively, the closer the value $\hat\lambda_i$ is to the upper unit bound, the higher is the value $-\ln(1 - \hat\lambda_i)$, while for the lower bound $\hat\lambda_i = 0$ one has $-\ln(1 - \hat\lambda_i) = 0$). It is tested successively for $r = 0, 1, \dots, m-1$.
• Johansen test with the statistic
$$
\lambda_{\max}(r) = -n \ln\bigl(1 - \hat\lambda_{r+1}\bigr) \tag{12.74}
$$
is a test of the null hypothesis that the number of cointegration relations is r against the alternative that it is $r + 1$. This test rejects the null hypothesis if $\lambda_{\max}(r)$ is higher than the corresponding critical value. It is tested successively for $r = 0, 1, \dots, m-1$.
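The statistics (12.73) and (12.74) can be computed by hand in the simplest setting. The sketch below is restricted to a VAR(1) without deterministic terms, where the eigenvalues are the squared sample canonical correlations between Δy_t and y_{t-1}; the function name `johansen_var1` is hypothetical, the data are simulated from the system of Example 12.7 with standard normal innovations, and no critical values are included (they would have to be taken from simulated tables).

```python
import numpy as np

def johansen_var1(y):
    """Eigenvalues and test statistics for a VAR(1) without deterministic terms.
    (For p > 1 one would first regress dy_t and y_{t-1} on the lagged differences.)"""
    dy, ylag = np.diff(y, axis=0), y[:-1]
    n = len(dy)
    s00 = dy.T @ dy / n
    s01 = dy.T @ ylag / n
    s11 = ylag.T @ ylag / n
    # Eigenvalues of S11^{-1} S10 S00^{-1} S01: squared canonical correlations
    mat = np.linalg.solve(s11, s01.T) @ np.linalg.solve(s00, s01)
    lam = np.sort(np.linalg.eigvals(mat).real)[::-1]
    trace = np.array([-n * np.log(1 - lam[r:]).sum() for r in range(len(lam))])
    max_eig = -n * np.log(1 - lam)
    return lam, trace, max_eig

rng = np.random.default_rng(7)
phi = np.array([[0.5, -0.25],
                [-1.0, 0.5]])           # system of Example 12.7 (one unit root)
n_obs = 2000
y = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    y[t] = phi @ y[t - 1] + rng.standard_normal(2)

lam, trace, max_eig = johansen_var1(y)
print(np.round(lam, 4))
print(np.round(trace, 2), np.round(max_eig, 2))
```

Entry r of `trace` and `max_eig` corresponds to the null hypothesis with r cointegration relations; for the last value of r both statistics coincide because only one eigenvalue remains in the sum.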
Example 12.9 Table 12.16 and Fig. 12.10 present the monthly yields to maturity for 3-month T-bills (TB3 in % p.a.) and for corporate bonds of the highest rating AAA by S&P (AAA in % p.a.) during the 10-year period 1985–1994 in the USA. The first differences of these time series have been analyzed as the stationary model VAR(2) in Example 12.3. Here we shall test a possible cointegration of these time series in their non-differenced (i.e., nonstationary) form, since cointegration is a typical relationship among important (log) returns on financial markets (Fig. 12.10 shows similar courses of the time series TB3 and AAA as well).
Table 12.17 presents the results of Johansen tests (12.73) and (12.74) by means of EViews if one models the time series TB3 and AAA as VAR(3) (i.e., the model (12.64) for the first differences DTB3 and DAAA has the order $p - 1 = 2$). Both the test with the statistic $\lambda_{\mathrm{trace}}(r)$ and the test with the statistic $\lambda_{\max}(r)$ indicate at the significance level of 5% just one cointegration relation ($r = 1$).
⋄
Table 12.16 Monthly data in Example 12.9 (the monthly yields to maturity for 3-month T-bills
and corporate bonds AAA in the USA in % p.a.)
Obs TB3 AAA Obs TB3 AAA Obs TB3 AAA
1985M01 7.76 12.08 1988M05 6.27 9.90 1991M09 5.25 8.61
1985M02 8.22 12.13 1988M06 6.50 9.86 1991M10 5.03 8.55
1985M03 8.57 12.56 1988M07 6.73 9.96 1991M11 4.60 8.48
1985M04 8.00 12.23 1988M08 7.02 10.11 1991M12 4.12 8.31
1985M05 7.56 11.72 1988M09 7.23 9.82 1992M01 3.84 8.20
1985M06 7.01 10.94 1988M10 7.34 9.51 1992M02 3.84 8.29
1985M07 7.05 10.97 1988M11 7.68 9.45 1992M03 4.05 8.35
1985M08 7.18 11.05 1988M12 8.09 9.57 1992M04 3.81 8.33
1985M09 7.08 11.07 1989M01 8.29 9.62 1992M05 3.66 8.28
1985M10 7.17 11.02 1989M02 8.48 9.63 1992M06 3.70 8.22
1985M11 7.20 10.55 1989M03 8.83 9.80 1992M07 3.28 8.07
1985M12 7.07 10.16 1989M04 8.70 9.79 1992M08 3.14 7.95
1986M01 7.04 10.05 1989M05 8.40 9.57 1992M09 2.97 7.92
1986M02 7.03 9.67 1989M06 8.22 9.10 1992M10 2.84 7.99
1986M03 6.59 9.00 1989M07 7.92 8.93 1992M11 3.14 8.10
1986M04 6.06 8.79 1989M08 7.91 8.96 1992M12 3.25 7.98
1986M05 6.12 9.09 1989M09 7.72 9.01 1993M01 3.06 7.91
1986M06 6.21 9.13 1989M10 7.59 8.92 1993M02 2.95 7.71
1986M07 5.84 8.88 1989M11 7.67 8.89 1993M03 2.97 7.58
1986M08 5.57 8.72 1989M12 7.64 8.86 1993M04 2.89 7.46
1986M09 5.19 8.89 1990M01 7.64 8.99 1993M05 2.96 7.43
1986M10 5.18 8.86 1990M02 7.76 9.22 1993M06 3.10 7.33
1986M11 5.35 8.68 1990M03 7.87 9.37 1993M07 3.05 7.17
1986M12 5.49 8.49 1990M04 7.78 9.46 1993M08 3.05 6.85
1987M01 5.45 8.36 1990M05 7.78 9.47 1993M09 2.96 6.66
1987M02 5.59 8.38 1990M06 7.74 9.26 1993M10 3.04 6.67
1987M03 5.56 8.36 1990M07 7.66 9.24 1993M11 3.12 6.93
1987M04 5.76 8.85 1990M08 7.44 9.41 1993M12 3.08 6.93
1987M05 5.75 9.33 1990M09 7.38 9.56 1994M01 3.02 6.92
1987M06 5.69 9.32 1990M10 7.19 9.53 1994M02 3.21 7.08
1987M07 5.78 9.42 1990M11 7.07 9.30 1994M03 3.52 7.48
1987M08 6.00 9.67 1990M12 6.81 9.05 1994M04 3.74 7.88
1987M09 6.32 10.18 1991M01 6.30 9.04 1994M05 4.19 7.99
1987M10 6.40 10.52 1991M02 5.95 8.83 1994M06 4.18 7.97
1987M11 5.81 10.01 1991M03 5.91 8.93 1994M07 4.39 8.11
1987M12 5.80 10.11 1991M04 5.67 8.86 1994M08 4.50 8.07
1988M01 5.90 9.88 1991M05 5.51 8.86 1994M09 4.64 8.34
1988M02 5.69 9.40 1991M06 5.60 9.01 1994M10 4.96 8.57
1988M03 5.69 9.39 1991M07 5.58 9.00 1994M11 5.25 8.68
1988M04 5.92 9.67 1991M08 5.39 8.75 1994M12 5.64 8.46
Source: FRED (Federal Reserve Bank of St. Louis)
Fig. 12.10 Monthly data in Example 12.9 (the monthly yields to maturity for 3-month T-bills and corporate bonds AAA in the USA in % p.a.; plotted series TB3 (%) and AAA (%), horizontal axis 1985–1994)
Table 12.17 Johansen tests of cointegration from Example 12.9 by means of EViews (TB3 and
AAA)
Sample (adjusted): 1985M04 1994M12
Trend assumption: Linear deterministic trend
Series: TB3 AAA
Lags interval (in first differences): 1 to 2
Unrestricted cointegration rank test (Trace)
Hypothesized Trace 0.05
No. of CE(s) Eigenvalue Statistic Critical value Prob.**
None* 0.132054 18.61642 15.49471 0.0164
At most 1 0.017337 2.046241 3.841466 0.1526
Trace test indicates 1 cointegrating eqn(s) at the 0.05 level
Unrestricted cointegration rank test (Maximum eigenvalue)

Hypothesized Max-Eigen 0.05
No. of CE(s) Eigenvalue Statistic Critical value Prob.**
None* 0.132054 16.57018 14.26460 0.0212
At most 1 0.017337 2.046241 3.841466 0.1526
Max-eigenvalue test indicates 1 cointegrating eqn(s) at the 0.05 level
*Rejection of the hypothesis at the 0.05 level
**MacKinnon–Haug–Michelis (1999) p-values
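As a quick check of Table 12.17, the following minimal Python sketch recomputes both statistics from the two reported eigenvalues. It assumes that the trace statistic (12.73) sums the terms $-n \ln(1 - \hat{\lambda}_i)$ over $i = r+1, \ldots, m$ and that $n = 117$ (the number of monthly observations in the adjusted sample 1985M04–1994M12); the function name `johansen_statistics` is introduced here only for illustration.

```python
import math

def johansen_statistics(eigenvalues, n):
    """Return (trace, max_eig) lists for r = 0, ..., m-1.

    eigenvalues: estimated eigenvalues sorted in decreasing order.
    n: number of observations used in the estimation (assumed known).
    """
    terms = [-n * math.log(1.0 - lam) for lam in eigenvalues]
    max_eig = list(terms)                                # lambda_max(r) uses eigenvalue r+1 only
    trace = [sum(terms[r:]) for r in range(len(terms))]  # lambda_trace(r) sums terms r+1, ..., m
    return trace, max_eig

# Eigenvalues as reported in Table 12.17.
trace, max_eig = johansen_statistics([0.132054, 0.017337], n=117)
for r, (tr, mx) in enumerate(zip(trace, max_eig)):
    print(f"r = {r}: trace = {tr:.4f}, max-eigenvalue = {mx:.4f}")
```

The printed values agree with the statistics in Table 12.17 up to the rounding of the eigenvalues; the critical values and p-values are not reproduced by this sketch.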
4. Construction of EC Model
The construction of EC models can be described in the following steps (for simplicity, we assume that the m-variate time series y1, . . ., yn is either stationary or nonstationary of the type I(1), i.e., it can be made stationary by transforming it to the time series of first differences):
1. One applies unit root tests (e.g., the DF and ADF tests from Sect. 6.4.1) to the particular univariate time series {y1t}, . . ., {ymt}. If the null hypotheses of unit roots are rejected, then these time series are stationary (except for possible deterministic trends), and one constructs a VAR model for y1, . . ., yn (possibly with deterministic trends as exogenous variables; see Sect. 12.2). Otherwise, due to unit roots, the given time series contain stochastic trends, and one proceeds to step 2.
2. One applies Johansen (or other) tests of cointegration (possibly including intercepts, linear trends, and the like). If cointegration is rejected ($r = 0$), one proceeds to step 3. Otherwise, if the existence of $r$ cointegration relations is confirmed ($0 < r < m$), one proceeds to step 4 (the case $r = m$ is excluded due to step 1).
3. Since cointegration was rejected in the previous step of the algorithm, one constructs the corresponding VAR model for the stationary time series Δy1, . . ., Δyn.
4. Since there exist $r$ cointegration relations ($0 < r < m$), step 3 is skipped, and one constructs the corresponding EC model (12.66) for the original time series y1, . . ., yn. The estimation procedure can combine the ML method and the OLS method (see Johansen (1995)). In particular, the maximal value of the logarithmic likelihood is

$$c - \frac{n-p}{2} \sum_{i=1}^{r} \ln\left(1 - \hat{\lambda}_i\right), \qquad (12.75)$$

where $c$ is a constant (independent of $r$) and $\hat{\lambda}_1, \ldots, \hat{\lambda}_r$ are positive numbers as in (12.72). This estimation procedure can be supplemented by a priori constraints for the parameters in the matrices α and β in (12.66).
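The four steps above amount to a small decision procedure, which can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the outcomes of the unit root tests and of the cointegration rank test are taken as given inputs (they would come from a statistical package), and the function name `choose_model` and its return labels are hypothetical.

```python
def choose_model(unit_roots_rejected: bool, coint_rank: int, m: int) -> str:
    """Pick the model class following steps 1-4 for an m-variate time series."""
    if unit_roots_rejected:
        # Step 1: all components are stationary, so a VAR in levels is used.
        return "VAR in levels"
    if not 0 <= coint_rank < m:
        # The case r = m would contradict the unit roots found in step 1.
        raise ValueError("cointegration rank must satisfy 0 <= r < m")
    if coint_rank == 0:
        # Step 3: no cointegration, so a VAR for the first differences is used.
        return "VAR in first differences"
    # Step 4: 0 < r < m, so an EC model with r cointegration relations is used.
    return f"EC model with r = {coint_rank}"

# Situation of Example 12.9: unit roots present, one cointegration relation, m = 2.
print(choose_model(unit_roots_rejected=False, coint_rank=1, m=2))
```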
Example 12.10 In Table 12.18, the EC model for the time series {TB3t} and {AAAt} from Examples 12.1 and 12.3 is estimated by means of EViews (the first differences of these time series are denoted as {DTB3t} and {DAAAt}). Here the order of the VAR model (12.66) is $p - 1 = 2$ (the order of the original model before differencing is $p = 3$), and the intercepts are explicitly estimated both in the model (12.66) and in its cointegration relation. The estimated EC model from Table 12.18 has the explicit form
! ! !
DTB3t 0:01 0:01
¼ þ ð23:06 þ TB3t1 3:21AAAt1 Þþ
DAAAt 0:03 0:03
! ! ! ! !
0:44 0:11 DTB3t1 0:04 0:16 DTB3t2 ε1t
þ þ þ :
0:10 0:53 DAAAt1 0:10 0:27 DAAAt2 ε2t
The cointegration vector $(1, -3.21)'$ is normalized by an a priori constraint so that its first component equals one.
⋄
Table 12.18 Construction of EC model from Example 12.10 by means of EViews (TB3 and AAA)
Vector error correction estimates
Sample (adjusted): 1985M04 1994M12
Included observations: 117 after adjustments
Cointegrating Eq    CointEq1
TB3(-1)             1.000000
AAA(-1)             3.212049
                    (0.53460)
C                   23.06448
Error correction:   D(TB3)      D(AAA)
CointEq1            0.014189    0.029656
                    (0.00774)   (0.00725)
D(TB3(-1))          0.438843    0.097404
                    (0.11419)   (0.10687)
D(TB3(-2))          0.043494    0.103530
                    (0.11166)   (0.10451)
D(AAA(-1))          0.110342    0.534480
                    (0.11089)   (0.10378)
D(AAA(-2))          0.161987    0.267855
                    (0.11037)   (0.10330)
C                   0.014468    0.027313
                    (0.01882)   (0.01761)
S.E. equation       0.200969    0.188087
12.6 Exercises
Exercise 12.1
Repeat the Johansen tests of cointegration for the time series {TB3t} and {AAAt} from Example 12.9, but only for the 5-year period 1990–1994 (hint: at the 5% significance level, both the test with $\lambda_{\mathrm{trace}}(r)$ and the test with $\lambda_{\max}(r)$ indicate no cointegration relation ($r = 0$)).
Chapter 13
Multivariate Volatility Modeling
The models of volatility in Chap. 8 are univariate, i.e., they model the volatility of a time series independently of other time series. This may be a drawback (particularly in finance) since
• Volatility spillover among various financial markets, or among various assets within the same financial market, is a typical phenomenon in finance.
• Correlations among particular components play a key role when constructing and managing (diversified) investment portfolios.
For instance, let us consider so-called dynamic hedging, which is frequently applied to reduce investment risk. Dynamic hedging is mostly realized in such a way that the investor simultaneously enters opposite positions on markets with mutually inverse behavior, e.g., on spot and futures markets (in this context, a position characterized as a purchase operation is denoted as long, and similarly, a position characterized as a sale operation is denoted as short). Pragmatic investors suppose that potential losses in one market may be balanced by profits from another market that behaves just inversely. Therefore, they follow the so-called hedging ratio h, which is the number of units of futures contracts per one unit of (spot) assets and should be optimal in terms of risk reduction. For example, in the case of a so-called short hedge, which means a long position in assets and simultaneously a short position in futures, the value of the investor's total position (during hedging till the maturity of the futures) changes by ΔS − hΔF, where ΔS and ΔF are the corresponding changes in the spot and futures price, respectively. One can easily show that the optimal hedging ratio minimizing var(ΔS − hΔF) is

$$h_t = \rho_t \frac{\sigma_{s,t}}{\sigma_{f,t}},$$

where $\sigma_{s,t}$ is the risk of the spot market (i.e., the standard deviation of the spot price changes), $\sigma_{f,t}$ is the risk of the futures market (i.e., the standard deviation of the futures price changes), and finally $\rho_t$ is the correlation coefficient between both time series (the time index t emphasizes the fact that the described hedging is dynamic, with corrections of the hedging ratio realized at particular times). Obviously, in order to calculate {ht} one makes use of the time series {ρt} that records the dynamics of the correlation between both time series.
https://doi.org/10.1007/978-3-030-46347-2_13
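The formula for the optimal hedging ratio follows from expanding var(ΔS − hΔF) = σ_s² − 2hρσ_sσ_f + h²σ_f² and setting the derivative with respect to h to zero. The following minimal Python sketch evaluates a static sample version of this ratio; it is an illustration only, since the dynamic ratio h_t in the text requires time-varying estimates of ρ_t, σ_s,t, and σ_f,t. The function name `optimal_hedge_ratio` and the input data are hypothetical.

```python
import math

def optimal_hedge_ratio(d_spot, d_fut):
    """Return h = rho * sigma_s / sigma_f estimated from paired price changes."""
    n = len(d_spot)
    mean_s = sum(d_spot) / n
    mean_f = sum(d_fut) / n
    var_s = sum((x - mean_s) ** 2 for x in d_spot) / (n - 1)
    var_f = sum((x - mean_f) ** 2 for x in d_fut) / (n - 1)
    cov_sf = sum((x - mean_s) * (y - mean_f)
                 for x, y in zip(d_spot, d_fut)) / (n - 1)
    rho = cov_sf / math.sqrt(var_s * var_f)
    return rho * math.sqrt(var_s) / math.sqrt(var_f)

# Hypothetical price changes: the spot price moves by half of the futures move.
d_fut = [2.0, -4.0, 6.0, -2.0]
d_spot = [1.0, -2.0, 3.0, -1.0]
h = optimal_hedge_ratio(d_spot, d_fut)
print(round(h, 6))
```

In this artificial case the two series are perfectly correlated, so the ratio reduces to σ_s/σ_f, i.e., one half.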
In general, multivariate volatility modeling plays an important role in risk control (e.g., for portfolio investment, but also for internal models in the framework of the regulatory methodologies Basel III for capital adequacy in banks or Solvency II in insurance companies, including commercial products of the type RiskMetrics, and the like).
13.1 Multivariate Models EWMA
The modeling of multivariate volatility may be approached in a way analogous to the univariate case. However, in addition to the univariate modeling used in Chap. 8, one must now also model the correlatedness among time series, denoted as mutual volatility (or also covolatility).
Commercially (e.g., in the commercial system RiskMetrics; see Sect. 8.3), one frequently exploits the multivariate EWMA models (i.e., the exponentially weighted moving averages from Sect. 8.3.1), where the covariances or “covolatilities” among particular time series {yit} and {yjt} are calculated recursively as

$$\hat{\sigma}_{ij,t} = (1-\lambda) \sum_{k=0}^{\infty} \lambda^{k} \left(y_{i,t-1-k} - \bar{y}_i\right)\left(y_{j,t-1-k} - \bar{y}_j\right) = (1-\lambda)\left(y_{i,t-1} - \bar{y}_i\right)\left(y_{j,t-1} - \bar{y}_j\right) + \lambda\, \hat{\sigma}_{ij,t-1}. \qquad (13.1)$$

Here the estimated covariance $\hat{\sigma}_{ij,t}$ presents the mutual (co)volatility prediction from time $t-1$, $\bar{y}_i$ and $\bar{y}_j$ are the mean levels of the given time series {yit} and {yjt}, and $\lambda$ ($0 < \lambda < 1$) is a discount constant chosen in advance. In the case of time series with levels close to zero (which is mostly the case for log returns), one usually applies zero mean returns in (13.1) so that the problem of their suitable choice is removed. Some authors (e.g., Fleming et al. (2003)) summarize the multivariate EWMA relations in the matrix form

$$\hat{\Sigma}_t = \alpha \exp(-\alpha)\left(y_{t-1} - \bar{y}\right)\left(y_{t-1} - \bar{y}\right)' + \exp(-\alpha)\,\hat{\Sigma}_{t-1} \qquad (13.2)$$

using a decay rate $\alpha$.
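The recursive form of (13.1) is straightforward to implement. The following minimal Python sketch computes the EWMA covolatility path for two series; the function name `ewma_covariance`, the starting value `sigma0`, the discount constant 0.94, and the sample returns are assumptions made only for this illustration.

```python
def ewma_covariance(y_i, y_j, lam=0.94, sigma0=0.0, mean_i=0.0, mean_j=0.0):
    """Return [sigma_0, sigma_1, ..., sigma_n] from the recursion (13.1).

    sigma_t is the covariance prediction for time t made at time t-1;
    sigma0 is an assumed starting value of the recursion.
    """
    sigmas = [sigma0]
    for a, b in zip(y_i, y_j):
        shock = (a - mean_i) * (b - mean_j)
        sigmas.append((1.0 - lam) * shock + lam * sigmas[-1])
    return sigmas

# Hypothetical log returns of two assets (zero means assumed).
y1 = [0.01, -0.02, 0.015]
y2 = [0.012, -0.01, 0.02]
cov_path = ewma_covariance(y1, y2, lam=0.94)
print(cov_path)
```

Setting `y_i` equal to `y_j` yields the univariate EWMA volatility, so the same function covers the diagonal elements of the covolatility matrix.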

Remark 13.1 For the sake of completeness, the historical volatility approach (8.17) can also be generalized to the following multivariate form:

$$\hat{\Sigma}_t = \frac{1}{k-1} \sum_{\tau=t-k}^{t-1} \left(y_\tau - \bar{y}\right)\left(y_\tau - \bar{y}\right)' \qquad (13.3)$$

for a suitable length of sample period $k$. The values (13.3) are also denoted as multivariate SMA of length $k$ (simple moving average; see, e.g., Chiriac and Voev (2011)).
⋄
13.2 Implied Mutual Volatility
It is a direct analogy of the univariate implied volatility from Sect. 8.3.2. The mutual (i.e., multivariate) volatility may be implied, e.g., by means of currency options. For instance, if we deal with the implied mutual volatility between the currency rates USD/EUR and USD/CNY, then it can be calculated (in time) by means of the relation

$$\hat{\sigma}_{\mathrm{USD/EUR},\,\mathrm{USD/CNY}} = \frac{\hat{\sigma}^2_{\mathrm{USD/EUR}} + \hat{\sigma}^2_{\mathrm{USD/CNY}} - \hat{\sigma}^2_{\mathrm{EUR/CNY}}}{2}, \qquad (13.4)$$

where $\hat{\sigma}^2_{\mathrm{USD/EUR}}$ is the implied volatility of the currency rate return USD/EUR, and similarly for $\hat{\sigma}^2_{\mathrm{USD/CNY}}$ and $\hat{\sigma}^2_{\mathrm{EUR/CNY}}$ (these implied volatilities are constructed by means of quoted option premiums for returns of particular currency rates; see Sect. 8.3.2).
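Relation (13.4) can be read as the identity var(a − b) = var(a) + var(b) − 2cov(a, b) solved for the covariance, assuming that the log return of the cross rate EUR/CNY is the difference of the two USD log returns. A minimal Python sketch follows; the function name `implied_covariance` and the input values are hypothetical and do not represent market data.

```python
def implied_covariance(var_usd_eur, var_usd_cny, var_eur_cny):
    """Implied covariance of USD/EUR and USD/CNY returns via relation (13.4)."""
    return (var_usd_eur + var_usd_cny - var_eur_cny) / 2.0

# Hypothetical annualized implied standard deviations (not market data).
sd_usd_eur, sd_usd_cny, sd_eur_cny = 0.10, 0.08, 0.06
cov = implied_covariance(sd_usd_eur ** 2, sd_usd_cny ** 2, sd_eur_cny ** 2)
corr = cov / (sd_usd_eur * sd_usd_cny)   # implied correlation
print(round(cov, 6), round(corr, 4))
```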
13.3 Multivariate GARCH Models
It is not surprising that one tries to construct multivariate generalizations of the univariate GARCH models, extending the principle of univariate conditional heteroscedasticity from Sect. 8.3 to mutual volatility. For instance, in a very simple bivariate GARCH(1,1) model, one can exploit the following (vector) volatility equation:

$$\begin{pmatrix} \sigma_{11,t} \\ \sigma_{22,t} \end{pmatrix} = \begin{pmatrix} \alpha_{10} \\ \alpha_{20} \end{pmatrix} + \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} \begin{pmatrix} e^2_{1,t-1} \\ e^2_{2,t-1} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix} \begin{pmatrix} \sigma_{11,t-1} \\ \sigma_{22,t-1} \end{pmatrix}, \qquad (13.5)$$

where, e.g., $\sigma_{11,t}$ denotes the volatility in the first component {y1t}. Apparently, it is a direct generalization of the volatility equation from the univariate model (8.59)

$$\sigma^2_t = \alpha_0 + \alpha_1 e^2_{t-1} + \beta_1 \sigma^2_{t-1}. \qquad (13.6)$$

Moreover, in practice one often combines the volatility equation with the mean equation (e.g., the mean equation in Example 13.1 is based on the VAR methodology).
Example 13.1 Tsay (2002) constructed for monthly log returns of the IBM stock (time series {r1t}) and the S&P 500 index (time series {r2t}) during the period 1926–1999 the bivariate model of the form

$$r_{1t} = 1.351 + 0.072\, r_{1,t-1} + 0.055\, r_{1,t-2} - 0.119\, r_{2,t-2} + e_{1t},$$
$$r_{2t} = 0.703 + e_{2t},$$
$$\begin{pmatrix} \sigma_{11,t} \\ \sigma_{22,t} \end{pmatrix} = \begin{pmatrix} 2.98 \\ 2.09 \end{pmatrix} + \begin{pmatrix} 0.079 & \cdot \\ 0.042 & 0.045 \end{pmatrix} \begin{pmatrix} e^2_{1,t-1} \\ e^2_{2,t-1} \end{pmatrix} + \begin{pmatrix} 0.873 & -0.031 \\ -0.066 & 0.913 \end{pmatrix} \begin{pmatrix} \sigma_{11,t-1} \\ \sigma_{22,t-1} \end{pmatrix}.$$

This model obviously combines the bivariate VAR(2) model from Sect. 12.2 representing the mean equation and the bivariate GARCH(1,1) model (13.5) representing the volatility equation (insignificant parameters are omitted; an omitted entry is marked by a dot). One assumes a constant correlation between {e1t} and {e2t} estimated as 0.614.
⋄
In general, one can extend the univariate principle of conditional heteroscedasticity in (8.16) to the m-variate case as

$$y_t = \Sigma_t^{1/2} \varepsilon_t, \qquad (13.7)$$

where $\varepsilon_t$ are iid m-variate random vectors with zero mean vector and unit covariance matrix, i.e., {εt} is an iid multivariate white noise with

$$\mathrm{E}(\varepsilon_t) = 0, \quad \mathrm{var}(\varepsilon_t) = \mathrm{E}\left(\varepsilon_t \varepsilon_t'\right) = I \qquad (13.8)$$

and $\Sigma_t^{1/2}$ is the square root matrix of the conditional covariance matrix $\Sigma_t$ expressed at time $t$ as a suitable function of the information known till time $t-1$. The matrix $\Sigma_t$ (or also $H_t$ by some authors) may be looked upon as the (co)volatility matrix since its diagonal elements are the univariate volatilities of particular univariate components of {yt} and the elements outside the main diagonal are the mutual volatilities (or covolatilities) of these components (here the logical notation is $\sigma_{11,t} = \sigma^2_{1t}$, and the like). Moreover, it is necessary to respect some practical aspects which are important for the construction of such models using real data:
• One must guarantee that the matrix $\Sigma_t$ produced by the model at time $t$ is positive definite (and therefore also symmetric).
• The model parametrization must be parsimonious (otherwise for higher dimensions $m \geq 3$, serious problems can appear when identifying and estimating such models).
The models which are based on the principle (13.7) (and are in addition acceptable from the practical point of view) can be roughly classified into (1) models of conditional covariance matrix, (2) models of conditional variances and correlations, and (3) factor models. We shall now show important representatives of these classes, even though the development of new approaches in this context is dramatic (see also Bauwens et al. (2006), Clements et al. (2012), Kroner and Ng (1998), McNeil et al. (2005), Silvennoinen and Teräsvirta (2009), Tsay (2002), and others). Moreover, the models can be completed by leverage effect arrangements, etc.
13.3.1 Models of Conditional Covariance Matrix
These models attempt to model the conditional covariance matrix (i.e., volatility matrix) directly using similar model instruments as the univariate GARCH models:
1. VEC Model
The vector GARCH model, denoted simply as VEC (suggested by Bollerslev et al. (1988)), models the volatility matrix $\Sigma_t$ in (13.7) as

$$\mathrm{vech}(\Sigma_t) = a_0 + \sum_{i=1}^{r} A_i\, \mathrm{vech}\left(y_{t-i} y_{t-i}'\right) + \sum_{j=1}^{s} B_j\, \mathrm{vech}\left(\Sigma_{t-j}\right), \qquad (13.9)$$

where $a_0$ is a vector and $A_i$ and $B_j$ are square matrices of parameters. The symbol vech(·) is the so-called vector half operator, which stacks the lower triangular part of an $m \times m$ matrix as an $m(m+1)/2 \times 1$ vector. For instance, the bivariate GARCH(1,1) model of this type has

$$\mathrm{vech}(\Sigma_t) = \begin{pmatrix} \sigma_{11,t} \\ \sigma_{21,t} \\ \sigma_{22,t} \end{pmatrix} = \begin{pmatrix} \sigma^2_{1t} \\ \sigma_{21,t} \\ \sigma^2_{2t} \end{pmatrix}, \quad \mathrm{vech}\left(y_{t-1} y_{t-1}'\right) = \begin{pmatrix} y^2_{1,t-1} \\ y_{1,t-1}\, y_{2,t-1} \\ y^2_{2,t-1} \end{pmatrix} \qquad (13.10)$$

(the matrices $A_1$ and $B_1$ must be of the type $3 \times 3$ so that the corresponding volatility equation contains 21 unknown parameters). In the general VEC model, each element of $\Sigma_t$ is a linear function of the lagged squared and cross-product observations and lagged values of the elements of $\Sigma_t$. The total number of unknown parameters $[1 + (r+s)\,m(m+1)/2]\,m(m+1)/2$ increases with growing dimension to an intolerable level, so that for higher $m$ the VEC model cannot be recommended in routine practice.
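To make the vech operator and the growth of the parameter count concrete, the following minimal Python sketch implements both; the function names `vech` and `vec_parameter_count` and the sample matrix are introduced here only for illustration.

```python
def vech(matrix):
    """Stack the lower triangular part of a square matrix column by column."""
    m = len(matrix)
    return [matrix[i][j] for j in range(m) for i in range(j, m)]

def vec_parameter_count(m, r=1, s=1):
    """Number of parameters in the VEC model: a0 plus r matrices A_i and s matrices B_j."""
    d = m * (m + 1) // 2          # length of vech(Sigma_t)
    return (1 + (r + s) * d) * d

sigma = [[4.0, 1.0],
         [1.0, 9.0]]
print(vech(sigma))                                    # [4.0, 1.0, 9.0]
print([vec_parameter_count(m) for m in (2, 3, 5, 10)])
```

For m = 2 and r = s = 1 the count equals the 21 parameters mentioned in the text, while already for m = 10 it exceeds six thousand.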
2. DVEC Model
The diagonal vector model DVEC (see Bollerslev et al. (1988)) is a special case of the VEC model (13.9) with diagonal parametric matrices $A_i$ and $B_j$. It not only reduces the number of parameters to be estimated but also enables one to rewrite the model (13.9) in a more transparent form

$$\Sigma_t = A_0 + \sum_{i=1}^{r} A_i \circ \left(y_{t-i} y_{t-i}'\right) + \sum_{j=1}^{s} B_j \circ \Sigma_{t-j}, \qquad (13.11)$$

where the parametric matrix $A_0$ is $m \times m$ symmetric with positive elements on the main diagonal, the parametric matrices $A_i$ and $B_j$ are $m \times m$ symmetric with nonnegative elements on the main diagonal, and the symbol “∘” denotes the Hadamard (i.e., elementwise) product of two matching matrices. Then, e.g., the volatility matrix (13.11) for the bivariate GARCH(1,1) model of type DVEC can be written by means of three scalar (co)volatility equations:

$$\begin{aligned}
\sigma_{11,t} &= \alpha_{11,0} + \alpha_{11,1}\, y^2_{1,t-1} + \beta_{11,1}\, \sigma_{11,t-1}, \\
\sigma_{12,t} &= \alpha_{12,0} + \alpha_{12,1}\, y_{1,t-1}\, y_{2,t-1} + \beta_{12,1}\, \sigma_{12,t-1}, \\
\sigma_{22,t} &= \alpha_{22,0} + \alpha_{22,1}\, y^2_{2,t-1} + \beta_{22,1}\, \sigma_{22,t-1}.
\end{aligned} \qquad (13.12)$$

The volatilities of both components of {yt} are modeled in the same way as in the univariate GARCH(1,1) models of Sect. 8.3.5. The (scalar) equation of mutual volatility has a similar structure, but now with the product of lagged values $y_{1,t-1} \cdot y_{2,t-1}$. Unfortunately, the natural property that the volatility of a component is impacted by higher absolute values of another component in past time is not guaranteed here.
A sufficient condition for positive definiteness of the matrix $\Sigma_t$ requires that the matrix $A_0$ is positive definite and the matrices $A_i$ and $B_j$ are positive semidefinite. It can be achieved by several alternative parametrizations:
(i) $A_0 = A_0^{1/2}(A_0^{1/2})'$, $A_i = A_i^{1/2}(A_i^{1/2})'$, $B_j = B_j^{1/2}(B_j^{1/2})'$, where the matrices $A_0^{1/2}$, $A_i^{1/2}$, and $B_j^{1/2}$ are lower triangular matrices of the Cholesky decomposition (see (12.16)).
(ii) $A_0 = A_0^{1/2}(A_0^{1/2})'$, $A_i = a_i a_i'$, $B_j = b_j b_j'$, where $a_i$ and $b_j$ are m-variate parametric vectors.
(iii) $A_0 = A_0^{1/2}(A_0^{1/2})'$, $A_i = \alpha_i I_m$, $B_j = \beta_j I_m$, where $\alpha_i$ and $\beta_j$ are positive scalar parameters and $I_m$ denotes the unit matrix of dimension $m$ (the multivariate model EWMA (13.1) can be looked upon as a special case of it). In particular, this case can reduce the number of parameters substantially when the dimension $m$ is higher.
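One step of the DVEC(1,1) recursion (13.11) can be sketched in Python as follows. The helper names (`hadamard`, `outer`, `dvec_step`) and all numerical values are hypothetical; the parameter matrices were chosen symmetric with $A_0$ positive definite and $A_1$, $B_1$ positive semidefinite, in line with the sufficient condition stated above.

```python
def hadamard(a, b):
    """Elementwise product of two matrices of the same shape."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def outer(y):
    """Outer product y y' of a vector with itself."""
    return [[yi * yj for yj in y] for yi in y]

def dvec_step(a0, a1, b1, y_prev, sigma_prev):
    """One step of DVEC(1,1): Sigma_t = A0 + A1 o (y y') + B1 o Sigma_{t-1}."""
    shock = hadamard(a1, outer(y_prev))
    persist = hadamard(b1, sigma_prev)
    m = len(a0)
    return [[a0[i][j] + shock[i][j] + persist[i][j] for j in range(m)]
            for i in range(m)]

# Hypothetical parameter matrices (symmetric, for illustration only).
A0 = [[0.10, 0.02], [0.02, 0.20]]
A1 = [[0.05, 0.03], [0.03, 0.04]]
B1 = [[0.90, 0.85], [0.85, 0.92]]
sigma_prev = [[1.0, 0.3], [0.3, 2.0]]
y_prev = [1.5, -0.5]

sigma_t = dvec_step(A0, A1, B1, y_prev, sigma_prev)
print(sigma_t)
```

Each element of the result reproduces the corresponding scalar equation in (13.12); e.g., the (1,1) element uses only the squared lagged value of the first component.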
3. BEKK Model
The BEKK model, denoted by the initials of its authors (Baba, Engle, Kraft, Kroner; see Engle and Kroner (1995)), guarantees (automatically, in contrast to the previous models) the positive definiteness of the volatility matrix $\Sigma_t$. Namely, this matrix is modeled as

$$\Sigma_t = A_0 + \sum_{i=1}^{r} A_i'\, y_{t-i} y_{t-i}'\, A_i + \sum_{j=1}^{s} B_j'\, \Sigma_{t-j}\, B_j, \qquad (13.13)$$

where all parametric matrices are quite general of dimension $m \times m$ and only $A_0$ is required to be symmetric and positive definite (it can be achieved by a suitable parametrization of $A_0$ as in the case of the DVEC model; see above). For instance, the volatility matrix (13.13) for the bivariate GARCH(1,1) model of type BEKK can be written by means of three scalar (co)volatility equations:

$$\sigma_{11,t} = \alpha_{11,0} + \alpha^2_{11,1}\, y^2_{1,t-1} + 2\alpha_{11,1}\alpha_{21,1}\, y_{1,t-1} y_{2,t-1} + \alpha^2_{21,1}\, y^2_{2,t-1} + \beta^2_{11,1}\, \sigma_{11,t-1} + 2\beta_{11,1}\beta_{21,1}\, \sigma_{12,t-1} + \beta^2_{21,1}\, \sigma_{22,t-1}, \qquad (13.14)$$

$$\sigma_{12,t} = \alpha_{12,0} + \alpha_{11,1}\alpha_{12,1}\, y^2_{1,t-1} + \left(\alpha_{11,1}\alpha_{22,1} + \alpha_{12,1}\alpha_{21,1}\right) y_{1,t-1} y_{2,t-1} + \alpha_{21,1}\alpha_{22,1}\, y^2_{2,t-1} + \beta_{11,1}\beta_{12,1}\, \sigma_{11,t-1} + \left(\beta_{11,1}\beta_{22,1} + \beta_{12,1}\beta_{21,1}\right) \sigma_{12,t-1} + \beta_{21,1}\beta_{22,1}\, \sigma_{22,t-1}, \qquad (13.15)$$

$$\sigma_{22,t} = \alpha_{22,0} + \alpha^2_{12,1}\, y^2_{1,t-1} + 2\alpha_{12,1}\alpha_{22,1}\, y_{1,t-1} y_{2,t-1} + \alpha^2_{22,1}\, y^2_{2,t-1} + \beta^2_{12,1}\, \sigma_{11,t-1} + 2\beta_{12,1}\beta_{22,1}\, \sigma_{12,t-1} + \beta^2_{22,1}\, \sigma_{22,t-1}. \qquad (13.16)$$

The BEKK model finally has the desirable property that the volatility of any component may be impacted by higher absolute values of another component in past time (e.g., a higher absolute value of $y_{2,t-1}$ in (13.14) raises the volatility $\sigma_{11,t}$). If no interactions among volatilities occur, then it should be $\alpha_{21,1} = \beta_{21,1} = 0$ in Eq. (13.14) and $\alpha_{12,1} = \beta_{12,1} = 0$ in Eq. (13.16) so that only the parameters with indices of the type $ii,k$ impact the mutual volatilities, namely the parameters $\alpha_{11,1}, \alpha_{22,1}, \beta_{11,1}, \beta_{22,1}$ in Eq. (13.15) (however, it does not mean that the BEKK model is transferred to the DVEC model in such a case).
Particular parametrizations reducing the total number of parameters are, e.g.:
(i) Diagonal BEKK models with diagonal matrices $A_i$ and $B_j$.
(ii) Scalar BEKK models with matrices $A_i = \alpha_i I_m$ and $B_j = \beta_j I_m$, where $\alpha_i$ and $\beta_j$ are positive scalar parameters.
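The matrix form (13.13) and the scalar form (13.14) of the bivariate BEKK(1,1) model can be cross-checked numerically. The following Python sketch is a minimal illustration; the helper names (`matmul`, `transpose`, `bekk_step`) and all parameter values are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*a)]

def bekk_step(a0, a1, b1, y_prev, sigma_prev):
    """One step of BEKK(1,1): Sigma_t = A0 + A1' y y' A1 + B1' Sigma_{t-1} B1."""
    yy = [[yi * yj for yj in y_prev] for yi in y_prev]
    shock = matmul(matmul(transpose(a1), yy), a1)
    persist = matmul(matmul(transpose(b1), sigma_prev), b1)
    m = len(a0)
    return [[a0[i][j] + shock[i][j] + persist[i][j] for j in range(m)]
            for i in range(m)]

# Hypothetical parameters; A1 and B1 need not be symmetric.
A0 = [[0.10, 0.02], [0.02, 0.20]]
A1 = [[0.30, 0.05], [0.10, 0.25]]
B1 = [[0.90, 0.02], [0.03, 0.92]]
sigma_prev = [[1.0, 0.3], [0.3, 2.0]]
y_prev = [1.5, -0.5]

sigma_t = bekk_step(A0, A1, B1, y_prev, sigma_prev)

# Cross-check the (1,1) element against the scalar equation (13.14).
y1, y2 = y_prev
s11 = (A0[0][0] + A1[0][0] ** 2 * y1 ** 2 + 2 * A1[0][0] * A1[1][0] * y1 * y2
       + A1[1][0] ** 2 * y2 ** 2 + B1[0][0] ** 2 * sigma_prev[0][0]
       + 2 * B1[0][0] * B1[1][0] * sigma_prev[0][1]
       + B1[1][0] ** 2 * sigma_prev[1][1])
print(sigma_t[0][0], s11)
```

Because the shock and persistence terms are quadratic forms, the resulting matrix stays symmetric and positive definite whenever $A_0$ is, which is the property emphasized in the text.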
13.3.2 Models of Conditional Variances and Correlations
The models of this type primarily model the conditional correlation matrix, while the volatilities of particular scalar components are modeled by means of the univariate GARCH instruments as in Sects. 8.3.5 and 8.3.6.
1. CCC Model
The CCC model (constant conditional correlations; see Bollerslev (1990)) applies for (13.7) the conditional covariance matrix $\Sigma_t$ of the form

$$\Sigma_t = \Delta_t R\, \Delta_t, \qquad (13.17)$$

where $R$ is the constant correlation matrix (in particular, it must be positive definite) and $\Delta_t = \mathrm{diag}\{\sqrt{\sigma_{11,t}}, \ldots, \sqrt{\sigma_{mm,t}}\}$ is the diagonal matrix whose diagonal elements are the square roots of the univariate volatilities and therefore can be modeled by means of univariate GARCH models as in (8.55) including the parameter constraints (8.56), i.e.:

$$\sigma_{kk,t} = \alpha_{0k} + \sum_{i=1}^{r_k} \alpha_{ki}\, y^2_{k,t-i} + \sum_{j=1}^{s_k} \beta_{kj}\, \sigma_{kk,t-j}, \quad k = 1, \ldots, m \qquad (13.18)$$

(the modifications from Sect. 8.3.6 are also possible, e.g., EGARCH and others). According to (13.7) and (13.17) the transformed process

$$z_t = \Delta_t^{-1} y_t \qquad (13.19)$$

is obviously a multivariate white noise with the constant correlation matrix $R$. Therefore, the transformation (13.19) is sometimes denoted as devolatilization (or standardization) and $R$ can be looked upon as the (unconditional) correlation matrix of the devolatilized values $z_{kt} = y_{kt}/\sqrt{\sigma_{kk,t}}$. To perform the devolatilization in practice, one can utilize the volatilities estimated by means of the univariate GARCH models (13.18) (see Sect. 8.3.5).
A simple special case of the CCC model is a model with the unit correlation matrix $R = I$. Then the process {yt} is the multivariate white noise with mutually uncorrelated components modeled simply by means of univariate GARCH models.
On the other hand, the CCC model is not usually suitable for financial applications that require not only the dynamic modeling of conditional variances but also the dynamic modeling of conditional correlations. Moreover, the CCC model does not make it possible to model the situation common in the context of multivariate risk when high absolute returns in one component give rise to higher future volatility of other components. Therefore, the DCC models are preferred (see below).
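The two building blocks of the CCC model, i.e., the devolatilization (13.19) and the assembly of the covariance matrix (13.17), can be sketched in Python as follows. The univariate volatilities are taken as given inputs (in practice they would come from fitted univariate GARCH models), and the function names and numerical values are hypothetical.

```python
import math

def devolatilize(y, variances):
    """z_t = Delta_t^{-1} y_t: divide each component by the square root of its volatility."""
    return [yk / math.sqrt(vk) for yk, vk in zip(y, variances)]

def ccc_covariance(variances, corr):
    """Sigma_t = Delta_t R Delta_t with Delta_t = diag(sqrt(variances))."""
    sd = [math.sqrt(v) for v in variances]
    m = len(sd)
    return [[sd[i] * corr[i][j] * sd[j] for j in range(m)] for i in range(m)]

# Hypothetical inputs: univariate volatilities at time t and a constant matrix R.
variances_t = [4.0, 9.0]
R = [[1.0, 0.5], [0.5, 1.0]]
y_t = [1.0, -3.0]

print(ccc_covariance(variances_t, R))   # [[4.0, 3.0], [3.0, 9.0]]
print(devolatilize(y_t, variances_t))   # [0.5, -1.0]
```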
2. DCC Model
The DCC model (dynamic conditional correlations; see, e.g., Engle (2002), Tse and Tsui (2002)) has the conditional covariance matrix $\Sigma_t$ figuring in (13.7) of the form

$$\Sigma_t = \Delta_t R_t\, \Delta_t, \qquad (13.20)$$

where $\Delta_t = \mathrm{diag}\{\sqrt{\sigma_{11,t}}, \ldots, \sqrt{\sigma_{mm,t}}\}$ is, similarly to (13.17), the diagonal matrix whose diagonal elements are the square roots of the univariate volatilities, so that one can again utilize univariate GARCH models to estimate them. In DCC models, the matrix $R_t$ can vary in time and one obtains it as

$$R_t = \mathrm{diag}\{Q_t\}^{-1/2}\, Q_t\, \mathrm{diag}\{Q_t\}^{-1/2}, \qquad (13.21)$$

namely by rescaling a dynamic matrix process {Qt}. Engle (2002) suggested a possible alternative for this process as

$$Q_t = \left(1 - \sum_{i=1}^{r} \alpha_i - \sum_{j=1}^{s} \beta_j\right) R + \sum_{i=1}^{r} \alpha_i\, z_{t-i} z_{t-i}' + \sum_{j=1}^{s} \beta_j\, Q_{t-j}, \qquad (13.22)$$

where $R$ is the unconditional (positive definite) correlation matrix of the devolatilized values $z_t = \Delta_t^{-1} y_t$ and the parameters $\alpha_i$ and $\beta_j$ in (13.22) fulfill the following constraints:

$$\alpha_i \geq 0, \quad \beta_j \geq 0, \quad \alpha_1 + \ldots + \alpha_r + \beta_1 + \ldots + \beta_s < 1. \qquad (13.23)$$

The DCC model is inspired by the univariate GARCH model (8.55) since such a model under the assumption of stationarity with a constant (unconditional) variance $\sigma^2$ can be rewritten in the form

$$\sigma^2_t = \left(1 - \sum_{i=1}^{r} \alpha_i - \sum_{j=1}^{s} \beta_j\right) \sigma^2 + \sum_{i=1}^{r} \alpha_i\, y^2_{t-i} + \sum_{j=1}^{s} \beta_j\, \sigma^2_{t-j}. \qquad (13.24)$$

According to this analogy, the correlation matrix $R$ in (13.22) can be looked upon as that part of the volatility equation which models the systematic correlatedness. Apparently, if all parameters $\alpha_i$ and $\beta_j$ are zero, then we return to the CCC model (13.17). To estimate DCC models, one can use special methods which are again based on the devolatilized values estimated by means of univariate GARCH models. In particular, the matrix $R$ for (13.22) can be set to its empirical counterpart (e.g., applying the rolling window sample estimation to devolatilized values) or it can be estimated as a parametric (positive definite) matrix in addition to the remaining parameters of the type α and β. An alternative form of the dynamic matrix process {Qt} in (13.22) was suggested by Tse and Tsui (2002).
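The recursion (13.22) with r = s = 1 and the rescaling (13.21) can be sketched in Python as follows. This is a minimal illustration and not an estimation procedure: the parameters, the matrix `R_bar`, the starting value of the recursion, and the devolatilized values are all hypothetical.

```python
import math

def dcc_step(q_prev, z_prev, r_bar, alpha, beta):
    """Q_t = (1 - alpha - beta) R + alpha z z' + beta Q_{t-1}, i.e., (13.22) with r = s = 1."""
    m = len(r_bar)
    return [[(1.0 - alpha - beta) * r_bar[i][j]
             + alpha * z_prev[i] * z_prev[j]
             + beta * q_prev[i][j] for j in range(m)] for i in range(m)]

def rescale_to_correlation(q):
    """R_t = diag{Q_t}^{-1/2} Q_t diag{Q_t}^{-1/2}, as in (13.21)."""
    d = [math.sqrt(q[i][i]) for i in range(len(q))]
    return [[q[i][j] / (d[i] * d[j]) for j in range(len(q))]
            for i in range(len(q))]

# Hypothetical unconditional correlation matrix and parameters (alpha + beta < 1).
R_bar = [[1.0, 0.4], [0.4, 1.0]]
alpha, beta = 0.05, 0.90
Q = [row[:] for row in R_bar]                      # assumed start of the recursion
z_path = [[1.2, 0.8], [-0.5, -1.1], [2.0, 1.5]]    # hypothetical devolatilized values

for z in z_path:
    Q = dcc_step(Q, z, R_bar, alpha, beta)
    R_t = rescale_to_correlation(Q)
    print(round(R_t[0][1], 4))
```

The rescaling step guarantees unit diagonal elements of $R_t$, and with alpha = beta = 0 the recursion returns the constant matrix of the CCC model.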
13.3.3 Factor Models
Engle et al. (1990) suggested a parametrization of the conditional covariance matrix $\Sigma_t$ using the idea that the comovements of asset returns (e.g., stock returns) are driven by a small number of common underlying variables called factors (see also Lin (1992), Vrontos et al. (2003), and others). Moreover, these factors are frequently conditionally heteroscedastic and possess the GARCH structure.
The factor approach has the advantage that it reduces the dimensionality when the number of factors $K$ is small relative to the dimension $m$ of the given multivariate time series {yt}. Engle et al. (1990) defined their factor models as follows. They assumed that the conditional covariance matrix $\Sigma_t$ is generated by $K$ ($K < m$) factor volatilities $f_{kk,t}$ corresponding to $K$ underlying (not necessarily uncorrelated) factors, i.e.:

$$\Sigma_t = \Omega + W F_t W' = \Omega + \sum_{k=1}^{K} w_k w_k'\, f_{kk,t}, \qquad (13.25)$$

where $\Omega$ is an $m \times m$ positive semidefinite matrix, $W$ is an $m \times K$ weight (or transformation) matrix with linearly independent $m \times 1$ columns $w_k$, and $F_t = \mathrm{diag}\{f_{11,t}, \ldots, f_{KK,t}\}$ is a $K \times K$ diagonal matrix. It is assumed that particular factors have the GARCH(1,1) structure with volatility equations of the form

$$f_{kk,t} = \omega_k + \alpha_k \left(v_k' y_{t-1}\right)^2 + \beta_k f_{kk,t-1}, \qquad (13.26)$$

where $\omega_k$, $\alpha_k$, and $\beta_k$ are scalar parameters, $v_k$ are $m \times 1$ vectors of weights, and $f_{kk,t}$ are factor volatilities ($k = 1, \ldots, K$). In any case, the number of factors $K$ is intended to be much smaller than the number of assets $m$, which makes the model feasible even for a large number of assets.
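Relations (13.25) and (13.26) can be illustrated by the following Python sketch for a single factor (K = 1) and three assets (m = 3). All function names and numerical values are hypothetical and serve only to show how the factor volatility propagates into the covariance matrix.

```python
def factor_volatility_step(omega, alpha, beta, v, y_prev, f_prev):
    """f_{kk,t} = omega + alpha (v' y_{t-1})^2 + beta f_{kk,t-1}, as in (13.26)."""
    proj = sum(vi * yi for vi, yi in zip(v, y_prev))
    return omega + alpha * proj ** 2 + beta * f_prev

def factor_covariance(omega_mat, weights, factor_vols):
    """Sigma_t = Omega + sum_k w_k w_k' f_{kk,t}, as in (13.25)."""
    m = len(omega_mat)
    sigma = [row[:] for row in omega_mat]
    for w, f in zip(weights, factor_vols):
        for i in range(m):
            for j in range(m):
                sigma[i][j] += w[i] * w[j] * f
    return sigma

# Hypothetical one-factor setup for m = 3 assets (K = 1).
Omega = [[0.5, 0.0, 0.0], [0.0, 0.4, 0.0], [0.0, 0.0, 0.3]]
w1 = [1.0, 0.8, 0.5]        # column w_1 of the weight matrix W
v1 = [0.4, 0.3, 0.3]        # weights v_1 defining the factor
f_t = factor_volatility_step(omega=0.1, alpha=0.1, beta=0.8, v=v1,
                             y_prev=[1.0, -2.0, 0.5], f_prev=2.0)
Sigma_t = factor_covariance(Omega, [w1], [f_t])
print(f_t, Sigma_t[0][1])
```

Only the scalar recursion for the factor volatility is dynamic here, which is the source of the dimensionality reduction described in the text.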
In the previous model the factors are generally correlated. This may be undesirable when it turns out that several of the factors capture very similar characteristics of the data. On the other hand, if the factors were uncorrelated, they would represent genuinely different components that drive the data. The uncorrelatedness of factors can be achieved by means of various orthogonal transformations in O-GARCH (orthogonal GARCH) models (see Alexander and Chibumba (1997)) and GO-GARCH (generalized orthogonal GARCH) models (see van der Weide (2002), Lanne and Saikkonen (2007) and others).
Another possibility to reduce the number of factors in a parsimonious orthogonal way consists in the application of principal component analysis (PCA), which is based (similarly to orthogonal GARCH models) on the eigenvalues and eigenvectors of the conditional covariance matrix $\Sigma_t$ of the given multivariate time series (in more detail, the first few principal components that explain a high percentage of the variability of the process are identified as common factors; see Example 13.2).
Example 13.2 For the same bivariate time series as in Example 13.1 (the component {r1t} of monthly log returns of the IBM stock and {r2t} of monthly log returns of the S&P 500 index during the period 1926–1999), Tsay (2002) constructed the bivariate GARCH model applying the factor approach. At first the single common factor {xt} was constructed as the first principal component, explaining 82.5% of the variability of {r1t} and {r2t}:

$$x_t = 0.796\, r_{1t} + 0.605\, r_{2t}.$$

Then this factor {xt} was modeled by means of the following univariate GARCH model:

$$x_t = 1.317 + 0.096\, x_{t-1} + e_t, \quad e_t = \sigma_t \varepsilon_t,$$
$$\sigma^2_t = 3.834 + 0.110\, e^2_{t-1} + 0.825\, \sigma^2_{t-1}.$$

Finally, the bivariate model exploiting {σt²} as the common volatility factor was constructed:

$$r_{1t} = 1.140 + 0.079\, r_{1,t-1} + 0.067\, r_{1,t-2} - 0.122\, r_{2,t-2} + e_{1t},$$
$$r_{2t} = 0.537 + e_{2t},$$
$$\begin{pmatrix} \sigma_{11,t} \\ \sigma_{22,t} \end{pmatrix} = \begin{pmatrix} 19.08 \\ -5.62 \end{pmatrix} + \begin{pmatrix} 0.098 & \cdot \\ \cdot & \cdot \end{pmatrix} \begin{pmatrix} e^2_{1,t-1} \\ e^2_{2,t-1} \end{pmatrix} + \begin{pmatrix} 0.333 \\ 0.596 \end{pmatrix} \sigma^2_t$$

(insignificant parameters are omitted; an omitted entry is marked by a dot). Obviously, the constructed volatility equation makes use of the lagged squared residuals only in a very limited scope. It is not surprising as the single common factor explains 82.5% of the total variability of data.
⋄
13.3.4 Estimation of Multivariate GARCH Models
In the literature and various software systems, one can find various approaches to estimating particular types of multivariate GARCH models. The most recommended methods are based on the maximum likelihood approach, similarly to the univariate case (see, e.g., (8.48)). This approach usually maximizes the log likelihood function of the form (up to a constant)

$$-\frac{1}{2} \sum_{t=1}^{n} \ln\left|\Sigma_t\right| + \sum_{t=1}^{n} \ln g\left(\Sigma_t^{-1/2} y_t\right), \qquad (13.27)$$

where g(·) is an unspecified probability density function of the standardized residuals $\varepsilon_t = \Sigma_t^{-1/2} y_t$ (see (13.7)). In addition, the conditional mean value can be included in (13.27).
For normally distributed innovations εt one obtains the log likelihood of the form

$$-\frac{1}{2} \sum_{t=1}^{n} \ln\left|\Sigma_t\right| - \frac{1}{2} \sum_{t=1}^{n} y_t' \Sigma_t^{-1} y_t. \qquad (13.28)$$

One should again recall here that the normality of innovations is often rejected in financial applications (mainly with daily or weekly data): the kurtosis of most financial asset returns is larger than three and the tails are often fatter than what is implied by a conditional normal distribution. Fortunately, Bollerslev and Wooldridge (1992) have shown that a consistent estimator of the unknown parameters (and under some assumptions even a strongly consistent one; see Gourieroux (1997) or Jeantheau (1998)) can be obtained by maximizing (13.28) even if the distribution of the generating process is not normal. It is then denoted as the (Gaussian) quasi-maximum likelihood (QML) or pseudo-maximum likelihood (PML) estimator. Example 13.3 demonstrates the application of this estimation in financial practice.
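The Gaussian criterion (13.28) is simple to evaluate once the sequence of conditional covariance matrices is available. The following Python sketch does so for a bivariate series, using the explicit inverse of a 2 × 2 matrix; the function name `gaussian_quasi_loglik` and the data are hypothetical, and in practice the matrices $\Sigma_t$ would be generated by one of the models above as a function of the parameters being optimized.

```python
import math

def gaussian_quasi_loglik(y_path, sigma_path):
    """Evaluate (13.28) for bivariate data:
    -1/2 sum ln|Sigma_t| - 1/2 sum y_t' Sigma_t^{-1} y_t."""
    total = 0.0
    for (y1, y2), ((s11, s12), (s21, s22)) in zip(y_path, sigma_path):
        det = s11 * s22 - s12 * s21
        if det <= 0.0 or s11 <= 0.0:
            raise ValueError("Sigma_t must be positive definite")
        # Quadratic form y' Sigma^{-1} y via the explicit 2x2 inverse.
        quad = (s22 * y1 * y1 - (s12 + s21) * y1 * y2 + s11 * y2 * y2) / det
        total += -0.5 * math.log(det) - 0.5 * quad
    return total

# Hypothetical observations and conditional covariance matrices.
y_path = [(1.0, 0.0), (0.0, 2.0)]
sigma_path = [((1.0, 0.0), (0.0, 1.0)), ((4.0, 0.0), (0.0, 4.0))]
ll = gaussian_quasi_loglik(y_path, sigma_path)
print(ll)
```

A QML estimator would maximize this function over the model parameters with a numerical optimizer, which is not shown here.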
Example 13.3 Hendrych and Cipra (2016) analyzed the mutual currency risk. By means of the EViews software system for multivariate GARCH models, they estimated the mutual volatilities (covolatilities) of six European currencies in the period from January 5, 2007, to April 27, 2012 (i.e., 1362 observations for each currency), namely for the Czech crown (CZK), the British pound sterling (GBP), the Hungarian forint (HUF), the Polish zloty (PLN), the Romanian leu (RON), and the Swedish krona (SEK). In the EU27, 17 member countries used the euro; three other states (Denmark, Latvia, and Lithuania) were members of the ERM II regime (the European Exchange Rate Mechanism II), i.e., their national currencies were allowed to fluctuate around their assigned value within limiting bounds; the Bulgarian lev was pegged to the euro. Therefore, only the six remaining currencies (see above) were not linked to a currency mechanism and were used in the case study.
More precisely, one modeled the daily log returns on the bilateral exchange rates of the six currencies with the euro as the denominator. Table 13.1 presents the sample characteristics of the data collected from the European Central Bank in 2013, e.g., the maximum log return of the Czech crown versus the euro in the given period was 3.17%.
Table 13.1 Sample characteristics of daily log returns on exchange rates for selected currencies versus euro in the period from January 5, 2007, to April 27, 2012 (1362 observations for each currency) from Example 13.3
                 CZK       GBP       HUF       PLN       RON       SEK
Mean             0.00007   0.00014   0.00010   0.00006   0.00019   0.00001
Median           0.00007   0.00012   0.00024   0.00014   0.00000   0.00004
Maximum          0.03165   0.03461   0.05069   0.04164   0.02740   0.02784
Minimum         -0.03274  -0.02657  -0.03389  -0.03680  -0.01992  -0.02260
Std. deviation   0.00478   0.00601   0.00763   0.00721   0.00462   0.00497
Skewness         0.20218   0.30655   0.42056   0.30802   0.54616   0.31526
Kurtosis         8.49754   6.49258   7.80556   8.05110   7.37830   6.05079
Source: Hendrych and Cipra (2016)
The analysis of the corresponding six-variate process must start by modeling its conditional mean value. For this purpose vector autoregression (VAR) appears suitable similarly to that in Examples 13.1 and 13.2, namely VAR(3) (see (12.25)). The analysis of the correlation structure is then performed using only the deviations from the conditional mean (prediction errors) eit that originate in particular components of the process, applying alternatively the six-variate GARCH(1,1) models of the type CCC, DCC, or scalar BEKK (denoted as sBEKK):
1. In the case of the CCC model, one must first construct the particular models for
the univariate volatilities. Here the EGARCH(1,1) models (see (8.74)) seem to be
acceptable; e.g., for the deviations from the conditional mean $\{e_{1t}\}$ in the case of
log returns of the Czech crown versus the euro one obtains
\[
\ln \sigma_{11,t} = -0.280 + 0.154\,\frac{|e_{1,t-1}|}{\sqrt{\sigma_{11,t-1}}} + 0.985 \ln \sigma_{11,t-1} + 0.004\,\frac{e_{1,t-1}}{\sqrt{\sigma_{11,t-1}}}.
\]
Then it suffices to estimate the constant correlation matrix R (see Table 13.2) by
means of the devolatilization (13.19). Figure 13.1 plots the constant conditional
correlation 0.664 for the pair HUF/EUR and PLN/EUR only.
Table 13.2 CCC and DCC estimation of (constant) correlation matrix from Example 13.3 (daily
log returns on exchange rates for six selected currencies versus euro in period from January 5, 2007,
to April 27, 2012)
CZK GBP HUF PLN RON SEK
CZK 1.00000 0.05632 0.39278 0.41581 0.18822 0.16915
GBP 0.05632 1.00000 0.04448 0.02724 0.01421 0.07426
HUF 0.39278 0.04448 1.00000 0.66421 0.44590 0.32100
PLN 0.41581 0.02724 0.66421 1.00000 0.41214 0.36136
RON 0.18822 0.01421 0.44590 0.41214 1.00000 0.21656
SEK 0.16915 0.07426 0.32100 0.36136 0.21656 1.00000
Source: Hendrych and Cipra (2016)
[Figure 13.1: conditional correlation HUF/EUR vs. PLN/EUR, 2007 to 2012; plotted series: CCC, DCC, sBEKK]
Fig. 13.1 Conditional correlations among daily log returns of exchange rates HUF/EUR and
PLN/EUR estimated by means of models GARCH(1,1) of type CCC, DCC, and scalar BEKK
from Example 13.3 (daily log returns on exchange rates for six selected currencies versus euro in the
period from January 5, 2007, to April 27, 2012). Source: Hendrych and Cipra (2016)
2. In the case of the DCC model, one starts, as in the previous case of the CCC
model, with the univariate EGARCH(1,1) volatilities for the deviations from the
conditional mean $\{e_{it}\}$. The estimate of the dynamic volatility matrix $\Sigma_t$ is
obtained according to the formulas (13.20)–(13.22), where
\[
Q_t = (1 - 0.011 - 0.978)\,R + 0.011\, z_{t-1} z_{t-1}' + 0.978\, Q_{t-1}
\]
with the same estimated correlation matrix R as in the previous case of the CCC
model (see Table 13.2) and the devolatilized process $\{z_t\}$ according to (13.19).
Figure 13.1 plots the dynamic conditional correlation again for the pair
HUF/EUR and PLN/EUR only.
3. Finally, in the case of the scalar BEKK model (denoted as sBEKK), the estimated
model (13.13) with $A_1 = \alpha_1 I_6$, $B_1 = \beta_1 I_6$ for the dynamic volatility matrix $\Sigma_t$ has
the form
\[
\Sigma_t = 0.0432\, e_{t-1} e_{t-1}' + 0.9512\, \Sigma_{t-1},
\]
where $\{e_t\}$ is the corresponding multivariate process of the deviations from the
conditional mean. Figure 13.1 again plots the dynamic conditional correlation for
the pair HUF/EUR and PLN/EUR only.
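The DCC recursion in item 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the EViews procedure used in the study: the function name `dcc_correlations`, the start value $Q_1 = R$, the rescaling of $Q_t$ to unit diagonal (the usual DCC step), and the simulated residuals are all assumptions introduced here; only the coefficients 0.011 and 0.978 and the correlation 0.664 come from the text.

```python
import numpy as np

def dcc_correlations(z, R_bar, a=0.011, b=0.978):
    """Dynamic conditional correlations from devolatilized residuals z (T x m).

    Q_t = (1 - a - b) * R_bar + a * z_{t-1} z_{t-1}' + b * Q_{t-1},
    followed by rescaling Q_t to a matrix with unit diagonal.
    """
    T, m = z.shape
    Q = R_bar.copy()                      # assumed initialization: Q_1 = R_bar
    R = np.empty((T, m, m))
    for t in range(T):
        if t > 0:
            Q = (1 - a - b) * R_bar + a * np.outer(z[t - 1], z[t - 1]) + b * Q
        d = 1.0 / np.sqrt(np.diag(Q))     # rescale Q_t to a correlation matrix
        R[t] = Q * np.outer(d, d)
    return R

# Hypothetical two-series illustration with simulated (not real) residuals.
rng = np.random.default_rng(0)
R_bar = np.array([[1.0, 0.664], [0.664, 1.0]])
z = rng.multivariate_normal(np.zeros(2), R_bar, size=500)
R_t = dcc_correlations(z, R_bar)
rho_t = R_t[:, 0, 1]                      # time-varying correlation path
```

Because $Q_t$ is a positively weighted combination of a positive definite matrix and positive semidefinite matrices, every rescaled $R_t$ is a valid correlation matrix.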
From the pragmatic point of view, the estimated models help to draw conclusions,
e.g., on the average level of particular conditional correlations (see the estimated
correlation matrix R in Table 13.2): as far as the currency risk is concerned, the British
pound influences the behavior of the remaining five currencies only negligibly;
the Hungarian forint and the Polish zloty are rather strongly correlated both
mutually (see also Fig. 13.1) and with the Czech crown, etc. The models DCC
and sBEKK add an important dynamic aspect to the analysis.
⋄
13.4 Conditional Value at Risk
In the multivariate case, new aspects of risk measures from Sect. 11.1 appear which
are related to the multivariate volatility modeling (and even to the multivariate
GARCH models if the data have the form of financial asset returns). The risk
measures of the type value at risk (VaR) are typical in this context.
1. CoVaR
Adrian and Brunnermeier (2008) showed empirically that the stress state of some
financial institutions (mainly big banks, but also insurance companies, mortgage
agencies, and others) can significantly raise the value at risk of the global financial
system (even by 50%). Therefore, specific risk measures were introduced:
$\mathrm{CoVaR}^{j|i}$ (conditional VaR or contagion VaR) is the value at risk (11.1) of the jth
subject (e.g., a bank) with a possible loss $X_j$ under the condition that the ith subject
(e.g., another bank) with a possible loss $X_i$ finds itself in a crisis situation or
emergency (i, j = 1, ..., N), i.e.,
\[
P\left(X_j \le \mathrm{CoVaR}_\alpha^{j|i} \,\middle|\, X_i = \mathrm{VaR}_\alpha^{i}\right) = \alpha, \tag{13.29}
\]
where $\mathrm{VaR}_\alpha^{i}$ is the value at risk of the ith subject at the confidence level $\alpha$ (e.g.,
$\alpha = 0.99$).
$\mathrm{CoVaR}^{|i}$ is another conditional value at risk, which measures the "contagion"
spread from the subject with loss $X_i$ to the global system with loss X (e.g., the
impact of a defaulting bank on the whole banking system), i.e.,
\[
P\left(X \le \mathrm{CoVaR}_\alpha^{|i} \,\middle|\, X_i = \mathrm{VaR}_\alpha^{i}\right) = \alpha. \tag{13.30}
\]
$\Delta\mathrm{CoVaR}^{|i}$ (delta conditional VaR) is defined as the following difference:
\[
\Delta\mathrm{CoVaR}_\alpha^{|i} = \mathrm{CoVaR}_\alpha^{|i} - \mathrm{VaR}_\alpha, \tag{13.31}
\]
where $\mathrm{CoVaR}^{|i}$ is given by (13.30) and $\mathrm{VaR}_\alpha$ is the value at risk corresponding to the
loss X of the global system.
This methodology can be modified for the log returns $\{r_{it}\}$ of financial assets from
a global system with the log return $\{r_{mt}\}$ (a global financial market or a security
index; see Brownlees and Engle (2012)). As a special case, one could even consider
a simple portfolio situation
\[
r_{mt} = \sum_{i=1}^{N} w_{it}\, r_{it}, \tag{13.32}
\]
where $w_{it}$ is the relative market capitalization (i.e., the weight) of the ith asset at time
t (the scheme (13.32) corresponds, e.g., to the construction of stock indices). In any
case, one can deal with suitable values at risk also in this modified situation:
$\mathrm{CoVaR}_t^{m|r_{it}=\mathrm{VaR}_{it}(\alpha)}$ is the conditional value at risk corresponding to the value at
risk of the market return under the condition that the ith asset finds itself in a crisis
situation:
\[
P\left(r_{mt} \ge \mathrm{CoVaR}_t^{m|r_{it}=\mathrm{VaR}_{it}(\alpha)} \,\middle|\, r_{it} = \mathrm{VaR}_{it}(\alpha)\right) = \alpha, \tag{13.33}
\]
so that (13.33) is the modification of (13.30) in this (portfolio) context. Note the
inequality sign in (13.33): the loss consists in drops of returns, so that typical
values at risk are negative returns (losses have the negative sign in this context).
$\Delta\mathrm{CoVaR}_{it}(\alpha)$ is defined as the difference between the value at risk of the global
market conditionally on the ith asset being in financial distress and the value at risk
of the global market conditionally on the asset i being in its median state:
\[
\Delta\mathrm{CoVaR}_{it}(\alpha) = \mathrm{CoVaR}_t^{m|r_{it}=\mathrm{VaR}_{it}(\alpha)} - \mathrm{CoVaR}_t^{m|r_{it}=\mathrm{median}(r_{it})}. \tag{13.34}
\]
2. MES
Marginal expected shortfall (MES) is based on the concept of the expected shortfall
ES (the ES at level α is the expected return in the worst α% of the cases; see (11.14)).
The expected shortfall is usually preferred among risk measures in today's financial
practice (due to its coherence and other properties that favor it, e.g., in
comparison with the classical value at risk approach; see Artzner et al. (1999) or
Yamai and Yoshiba (2005)):
$\mathrm{MES}_{it}(C)$ is the conditional version of ES, in which the global returns exceed
a given market drop C chosen as a suitable threshold value (C < 0, i.e.,
measures of the type MES are, similarly to CoVaR, typically negative in the context
of (log) returns):
\[
\mathrm{MES}_{it}(C) = E_{t-1}\left(r_{it} \mid r_{mt} < C\right). \tag{13.35}
\]
The symbol $E_{t-1}$ means that $\mathrm{MES}_{it}(C)$ is understood conditionally
as $\mathrm{MES}_{i,t|t-1}(C)$, i.e., computed for time t given the information available at time
t − 1; see also the commentary to (8.12) concerning the symbols of the type $\sigma^2_{t|t-1}$.
Such a concept seems to be productive in various applications, e.g., when dealing
with so-called systemic risk and systemically important financial institutions (SIFI)
whose distress or disorderly failure, because of their size, complexity, and systemic
interconnectedness, would cause significant disruption to the wider financial system
and economic activity. If the conditional ES of the system is formally defined as
\[
\mathrm{ES}_{mt}(C) = E_{t-1}\left(r_{mt} \mid r_{mt} < C\right) = \sum_{i=1}^{N} w_{it}\, E_{t-1}\left(r_{it} \mid r_{mt} < C\right), \tag{13.36}
\]
then it holds that
\[
\mathrm{MES}_{it}(C) = \frac{\partial\, \mathrm{ES}_{mt}(C)}{\partial w_{it}}. \tag{13.37}
\]
Hence MES measures the increase in the risk of the system (measured by the ES)
induced by a marginal increase in the weight of the ith subject of the system (the higher
the subject's MES, the higher the individual contribution of this subject to the risk of
the financial system).
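The decomposition (13.36) can be illustrated with an unconditional, purely empirical analogue of MES: the sample average of an asset's returns over the days on which the market return falls below C. The sketch below is only an illustration on simulated data; the function name `empirical_mes`, the weights, and the simulated returns are assumptions made here and replace the model-based conditional expectation $E_{t-1}$ of the text by a plain sample mean.

```python
import numpy as np

def empirical_mes(r_i, r_m, C):
    """Sample analogue of MES: mean of r_i over observations with r_m < C."""
    tail = r_m < C
    if not tail.any():
        return float("nan")               # no systemic event in the sample
    return float(r_i[tail].mean())

# Hypothetical market of N = 3 assets with fixed weights (simulated data).
rng = np.random.default_rng(1)
T, N = 5000, 3
w = np.array([0.5, 0.3, 0.2])
common = rng.normal(0.0, 0.01, size=(T, 1))           # shared market shock
r = rng.normal(0.0, 0.02, size=(T, N)) + common       # asset log returns
r_m = r @ w                                           # market return as in (13.32)

C = np.quantile(r_m, 0.05)                            # threshold: empirical 5% quantile
mes = np.array([empirical_mes(r[:, i], r_m, C) for i in range(N)])
es_m = empirical_mes(r_m, r_m, C)                     # ES of the market itself

# The weighted MES values add up to the market ES, mirroring (13.36).
print(mes, es_m, w @ mes)
```

With fixed weights the identity between the market ES and the weighted sum of the MES values holds exactly in the sample, since both sides are averages over the same tail days.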
Table 13.3 Constituents of Prague Stock Exchange index (PX index) from Example 13.4
Stock name (abbrev.) Stock name Obs. from Obs. to
PX Prague Stock Exchange Index Jan 6, 2000 May 9, 2016
AAA AAA Auto Sep 25, 2007 Jul 3, 2013
VIG Vienna Insurance Group Feb 6, 2008 May 9, 2016
CEZ CEZ Jan 6, 2000 May 9, 2016
CETV Central European Media Enterprises Jun 28, 2005 May 9, 2016
ECM ECM Real Estate Investments Dec 8, 2006 Jul 20, 2011
ERSTE Erste Group Bank Oct 2, 2002 May 9, 2016
KB Komercni banka Jan 6, 2000 May 9, 2016
NWR New World Resources PLC May 7, 2008 May 9, 2016
O2 O2 CR Jan 6, 2000 May 9, 2016
ORCO Orco Property Group SA Feb 2, 2005 Sep 19, 2014
PEGAS Pegas Nonwovens SA Dec 19, 2006 May 9, 2016
PHILMOR Philip Morris CR Oct 9, 2000 May 9, 2016
UNIPETROL Unipetrol Jan 6, 2000 May 9, 2016
ZENTIVA Zentiva Jun 29, 2004 Apr 27, 2009
Source: Cipra and Hendrych (2017)
Example 13.4 The case study by Cipra and Hendrych (2017) examines the systemic
risk for the constituents of the Prague Stock Exchange index (PX index; see
Table 13.3). In order to calculate MES for each involved firm i (i = 1, ..., N), one
implemented GARCH modeling schemes for each bivariate process of the daily firm
and market log returns $r_{it}$ and $r_{mt}$ (see also Brownlees and Engle (2012)):
\[
\begin{aligned}
r_{it} &= \sigma_{it}\,\varepsilon_{it} = \sigma_{it}\,\rho_{im,t}\,\varepsilon_{mt} + \sigma_{it}\sqrt{1-\rho_{im,t}^{2}}\;\zeta_{it},\\
r_{mt} &= \sigma_{mt}\,\varepsilon_{mt},
\end{aligned} \tag{13.38}
\]
where the shocks $(\zeta_{it}, \varepsilon_{mt})$ are independent and identically distributed in time with
zero mean, unit variance, and zero covariance. A mutual independence of these
shocks is not assumed: on the contrary, there are reasons to believe that extreme
values of $\varepsilon_{mt}$ and $\zeta_{it}$ interact (when the market is in its tail, the firm disturbances may
be even further in the tail if there is a serious risk of default). Obviously, the modeling
scheme (13.38) guarantees that the conditional variances and correlation of $r_{it}$ and
$r_{mt}$ are $\sigma^2_{it}\,(= \sigma_{ii,t})$, $\sigma^2_{mt}\,(= \sigma_{mm,t})$, and $\rho_{im,t}$, respectively.
The specification is completed by the description of conditional (co)moments. The
volatilities $\sigma^2_{it}$ and $\sigma^2_{mt}$ were modeled as the univariate GJR GARCH(1,1) models (see
(8.72))
\[
\begin{aligned}
\sigma^2_{it} &= \omega_i + \alpha_i r^2_{i,t-1} + \beta_i \sigma^2_{i,t-1} + \gamma_i r^2_{i,t-1} I^{-}_{i,t-1},\\
\sigma^2_{mt} &= \omega_m + \alpha_m r^2_{m,t-1} + \beta_m \sigma^2_{m,t-1} + \gamma_m r^2_{m,t-1} I^{-}_{m,t-1}
\end{aligned} \tag{13.39}
\]
with $I^{-}_{i,t} = 1$ for $r_{it} < 0$ and 0 otherwise, $I^{-}_{m,t} = 1$ for $r_{mt} < 0$ and 0 otherwise (this
threshold GARCH modification covers the leverage effect, i.e., the tendency of
volatility to increase more with bad news (negative log returns) than with
good news (positive log returns)). The time-varying correlations are captured by using
GARCH(1,1) of the type DCC (also with an asymmetric modification; see Engle
(2009)). For example, the conditional covariance matrix (13.20) has the form
\[
\Sigma_t = \Delta_t R_t \Delta_t =
\begin{pmatrix} \sigma_{it} & 0 \\ 0 & \sigma_{mt} \end{pmatrix}
\begin{pmatrix} 1 & \rho_{im,t} \\ \rho_{im,t} & 1 \end{pmatrix}
\begin{pmatrix} \sigma_{it} & 0 \\ 0 & \sigma_{mt} \end{pmatrix}. \tag{13.40}
\]
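The GJR GARCH(1,1) recursion (13.39) is easy to evaluate once the parameters are given. The following sketch computes the conditional variance path for one return series; the function name `gjr_garch_variance`, the choice of start value, and the parameter values in the illustration are assumptions made here, not estimates from the case study.

```python
import numpy as np

def gjr_garch_variance(r, omega, alpha, beta, gamma, sigma2_0=None):
    """Conditional variances of a GJR GARCH(1,1) model for returns r.

    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}
               + gamma * r_{t-1}^2 * 1{r_{t-1} < 0}
    """
    r = np.asarray(r, dtype=float)
    sigma2 = np.empty_like(r)
    # Assumed start value: the sample variance unless the caller supplies one.
    sigma2[0] = r.var() if sigma2_0 is None else sigma2_0
    for t in range(1, len(r)):
        neg = 1.0 if r[t - 1] < 0 else 0.0          # indicator of bad news
        sigma2[t] = omega + (alpha + gamma * neg) * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical parameters and a three-observation series for a hand check.
r_demo = np.array([0.01, -0.01, 0.01])
s2 = gjr_garch_variance(r_demo, omega=1e-6, alpha=0.05, beta=0.9, gamma=0.1,
                        sigma2_0=1e-4)
# s2[1] follows a positive return (no leverage term): 9.6e-05
# s2[2] follows a negative return (leverage term active): 1.024e-04
```

The second and third values show the leverage effect: the same squared return raises the variance more when the return was negative.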
The previous modeling scheme was applied to 14 firms, which were
included in the PX index base according to their market capitalization as of the
end of June 2008. One extracted the daily log returns from January 6, 2000, to May
9, 2016 (these data are unbalanced in the sense that not all companies were
continuously traded during the sample period; see Table 13.3). Selected sample
characteristics of the studied log returns are presented in Table 13.4. One made use
of the estimation methodology for multivariate GARCH models from Sect. 13.3.4.
Figure 13.2 displays the conditional volatilities of all investigated firms jointly with
the conditional volatility of the PX index (market). Apparently, all graphs are significantly
influenced by the explosion in variability during the financial crisis of 2008. Furthermore,
one identifies a similar trend in many charts that is in line with the market
volatility trend. On the other hand, several log return time series are dominated by
other effects, which are not common to the whole market or to other returns. For
instance, one can mention the volatility of O2 log returns, which increased due
to the split of the company in 2015.
Figure 13.3 shows the estimated time-varying correlations $\rho_{im,t}$ between the returns
of the ith company and the PX index (market). It is evident that the returns
of the involved firms are significantly positively correlated with the market
returns. However, one can identify different behavior of the correlations displayed in
particular plots. Some correlations are relatively stable in comparison with others;
see, e.g., CEZ, KB, PHILMOR, or VIG; others demonstrate trends varying in time,
e.g., O2, PEGAS, or UNIPETROL.
Table 13.5 reports the examined stocks listed in ascending order of the
one-step-ahead MES predicted for October 20, 2008, which was a very critical date
from the point of view of the financial crisis of 2008. The threshold C (see (13.35)) was
set as the unconditional VaR of the PX index log returns with the confidence level
99%. Under the distress condition of the global market, the short-run prediction
produced by the model, e.g., for ERSTE indicates a deep drop of over 25%.
Finally, Table 13.6 contains the estimated multi-period-ahead MES predictions
(h = 125, i.e., half a year ahead) starting from May 9, 2016 (the end of the
examined data set). Here the threshold C was set as minus 5% and minus 20%,
respectively. To be more precise, an investor can identify and anticipate a potential
capital shortfall under the condition that a systemic event occurs half a year after the
investment (i.e., when the global market return is less than the threshold C at that
moment). Consequently, the NWR stock was identified as the riskiest
asset assuming C = −0.05 (the multi-period-ahead MES of 14.7%) and the
ERSTE stock as the riskiest asset assuming C = −0.20 (the multi-period-ahead MES
of 25.5%).
⋄
Table 13.4 Sample characteristics of the log returns from Example 13.4 (systemic risk analysis for constituents of PX index)
Stock      # obs  Mean     Std. dev  Median   Min      Max      Skew      Kurt
PX         4101   0.00014  0.01429   0.00051  0.16185  0.12364  0.45054   12.21528
AAA        1450   0.00057  0.02885   0.00000  0.23107  0.34179  1.14937   23.07040
VIG        2071   0.00049  0.02215   0.00000  0.17920  0.13539  0.51478   8.42960
CEZ        4099   0.00039  0.01932   0.00090  0.19834  0.19517  0.38745   9.95323
CETV       2705   0.00114  0.03921   0.00000  0.70628  0.47994  1.98131   56.46028
ECM        1155   0.00356  0.03748   0.00253  0.41313  0.24481  1.02567   20.24787
ERSTE      3127   0.00005  0.02668   0.00000  0.26834  0.19382  0.37664   11.05523
KB         4099   0.00048  0.02138   0.00000  0.19392  0.10410  0.43406   6.76452
NWR        2008   0.00436  0.05186   0.00000  0.49590  0.51083  0.86226   16.88669
O2         4101   0.00022  0.02509   0.00000  0.94253  0.13056  13.02806  487.11037
ORCO       2385   0.00181  0.03614   0.00168  0.26397  0.29480  0.14712   11.66799
PEGAS      2353   0.00000  0.01784   0.00000  0.23370  0.13509  0.93750   21.07422
PHILMOR    3882   0.00020  0.01756   0.00000  0.13709  0.14842  0.63086   8.43172
UNIPETROL  4097   0.00028  0.02233   0.00000  0.21770  0.26472  0.12378   14.04297
ZENTIVA    1214   0.00061  0.01929   0.00000  0.17839  0.09733  1.31104   13.24937
[Figure 13.2: panels AAA, CETV, CEZ, ECM, ERSTE, KB, NWR, O2, ORCO, PEGAS, PHILMOR, UNIPETROL, VIG, ZENTIVA, PX; horizontal axis: years 2000 to 2016; vertical axis: .00 to .25]
Fig. 13.2 Conditional volatilities of PX index constituents and PX index itself from Example 13.4
(systemic risk analysis for constituents of PX index). Source: Cipra and Hendrych (2017)
[Figure 13.3: panels AAA, CETV, CEZ, ECM, ERSTE, KB, NWR, O2, ORCO, PEGAS, PHILMOR, UNIPETROL, VIG, ZENTIVA; horizontal axis: years 2000 to 2016; vertical axis: -1.0 to 1.0]
Fig. 13.3 Conditional correlations among PX index constituents and PX index from Example 13.4
(systemic risk analysis for constituents of PX index). Source: Cipra and Hendrych (2017)
Table 13.5 One-step-ahead MES predicted for October 20, 2008, from Example 13.4;
the threshold C for the global market in (13.35) was set as the unconditional VaR of PX
index log returns with confidence level 99% (systemic risk analysis for constituents
of PX index)
Stock      MES
ERSTE      0.25211
ORCO       0.23861
NWR        0.21770
CETV       0.20791
KB         0.20197
ECM        0.18318
UNIPETROL  0.17690
PEGAS      0.15227
CEZ        0.14778
AAA        0.12997
VIG        0.11844
O2         0.11019
PHILMOR    0.05774
Table 13.6 Multi-period-ahead MES (h = 125) starting from May 9, 2016, from Example 13.4;
the threshold C was set as minus 5% and minus 20% of PX index log returns (systemic risk analysis
for constituents of PX index)
Stock      MES (C = −5%)  MES (C = −20%)
AAA        NA             NA
VIG        0.12630        0.21913
CEZ        0.09328        0.19380
CETV       0.09321        0.16910
ECM        NA             NA
ERSTE      0.13347        0.25510
KB         0.09235        0.19925
NWR        0.14654        0.19410
O2         0.09833        0.16540
ORCO       NA             NA
PEGAS      0.02161        0.08087
PHILMOR    0.03092        0.07247
UNIPETROL  0.02253        0.09114
ZENTIVA    NA             NA
13.5 Exercises
Exercise 13.1
Apply the multivariate EWMA methodology to the time series $\{DTB3_t\}$ and $\{DAAA_t\}$
from Table 12.1 (the first differences of monthly yields to maturity for three-month
T-bills and corporate bonds AAA in the USA in % p.a.).
Chapter 14
State Space Models of Time Series
14.1 Kalman Filter
The Kalman filter provides a theoretical background for various recursive methods in
(linear) systems, particularly in (multivariate) time series models. In general, one
speaks of the so-called Kalman (or Kalman–Bucy) recursions for filtering, predicting,
and smoothing in the framework of the so-called state space model; see, e.g., Brockwell
and Davis (1993, 1996), Durbin and Koopman (2012), Hamilton (1994), Harvey
(1989), and others.
Originally, state space modeling was suggested for technical disciplines (e.g., for
the fire control of missiles and in telecommunications); later it proved to be
useful also for (Bayesian) statistics and econometrics. This methodological approach
is based on the principle that the state of a given dynamic system is determined in time
by state vectors (state variables), which are unobservable, but one can draw
conclusions on their behavior by means of observation vectors (in practice,
usually a multivariate time series).
The formal base of state space modeling is the dynamic linear model (DLM), which
can be formulated at various levels of complexity (this model can even be
nonlinear). We confine ourselves to the simplest form
\[
x_{t+1} = F_t x_t + v_t, \quad t = 1, 2, \ldots, \tag{14.1}
\]
\[
y_t = G_t x_t + w_t, \quad t = 1, 2, \ldots, \tag{14.2}
\]
where (14.1) is the (vector) state equation describing the development of state vector
in time and (14.2) is the (vector) observation equation describing the relationship
between observation vectors and state vectors. The meaning of particular symbols is
the following:
$x_t$: state vector of dimension d × 1 (at time t);
$y_t$: observation vector of dimension m × 1 (at time t);
$F_t$: parameter matrix of state equation of dimension d × d (at time t);
$G_t$: parameter matrix of observation equation of dimension m × d (at time t);
$v_t$: (vector) random residual of state equation of dimension d × 1 (at time t);
$w_t$: (vector) random residual of observation equation of dimension m × 1 (at time t);
$V_t$: covariance matrix of random residual $v_t$ of dimension d × d (at time t);
$W_t$: covariance matrix of random residual $w_t$ of dimension m × m (at time t).
https://doi.org/10.1007/978-3-030-46347-2_14
Moreover, one usually assumes that
\[
v_t \sim \text{iid } N(0, V_t), \quad w_t \sim \text{iid } N(0, W_t), \quad \mathrm{cov}(v_s, w_t) = E\left(v_s w_t'\right) = 0, \quad s, t = 1, 2, \ldots \tag{14.3}
\]
(sometimes one admits more generally the contemporaneous correlatedness between
the random residuals $v_t$ and $w_t$, their non-normality, and other modifications).
Example 14.1 Let us consider the univariate process AR(p) (see also (6.31))
\[
y_t = \varphi_1 y_{t-1} + \cdots + \varphi_p y_{t-p} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2). \tag{14.4}
\]
1. On the one hand, by means of state space modeling one can solve the problem of
recursive estimation of the parameters of this process, even in an adaptive way in real
time (i.e., the parameters can change in time). In such a case, the state vector is the
parameter vector
\[
x_t = \left(\varphi_{1t}, \ldots, \varphi_{pt}\right)' \tag{14.5}
\]
and the dynamic linear model (14.1) and (14.2) will be formulated as
\[
x_{t+1} = x_t, \tag{14.6}
\]
\[
y_t = \left(y_{t-1}, \ldots, y_{t-p}\right) x_t + \varepsilon_t \tag{14.7}
\]
(i.e., particularly, $y_t = y_t$, $F_t = I$, $G_t = (y_{t-1}, \ldots, y_{t-p})$, $v_t = 0$, $w_t = \varepsilon_t$,
$V_t = 0$, $W_t = \sigma^2$).
2. On the other hand, by means of state space modeling one can also predict or
filter the given process AR(p). In such a case, the state vector is the following
vector of values:
\[
x_t = \left(y_{t-p+1}, y_{t-p+2}, \ldots, y_t\right)' \tag{14.8}
\]
and the dynamic linear model (14.1) and (14.2) will have the form
\[
x_{t+1} =
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
\varphi_p & \varphi_{p-1} & \varphi_{p-2} & \cdots & \varphi_1
\end{pmatrix} x_t +
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} \varepsilon_{t+1}, \tag{14.9}
\]
\[
y_t = (0, 0, \ldots, 1)\, x_t \tag{14.10}
\]
(i.e., particularly, $y_t = y_t$, $G_t = (0, 0, \ldots, 1)$, $v_t = (0, 0, \ldots, \varepsilon_{t+1})'$, $w_t = 0$, $W_t = 0$).
⋄
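The second representation of Example 14.1 can be checked numerically: the state recursion (14.9) with the observation equation (14.10) should reproduce the AR(p) path generated directly from (14.4). The sketch below does this for a hypothetical AR(2) process; the helper name `ar_state_space`, the coefficients, and the zero start values are assumptions made for illustration.

```python
import numpy as np

def ar_state_space(phi):
    """Matrices F and G of the representation (14.9)-(14.10) for AR(p)."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)        # shift the stored values by one position
    F[-1, :] = phi[::-1]              # last row: (phi_p, ..., phi_1)
    G = np.zeros((1, p))
    G[0, -1] = 1.0                    # the observation picks the last state component
    return F, G

phi = [0.5, -0.3]                     # hypothetical AR(2) coefficients
F, G = ar_state_space(phi)
rng = np.random.default_rng(3)
eps = rng.normal(size=50)

# Direct simulation of the AR(2) recursion with zero start values.
y = np.zeros(50)
for t in range(2, 50):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + eps[t]

# The same path obtained through the state equation.
x = np.array([y[0], y[1]])            # state (y_{t-1}, y_t)' at the start
y_ss = np.zeros(50)
for t in range(2, 50):
    x = F @ x + np.array([0.0, eps[t]])
    y_ss[t] = (G @ x)[0]
```

Both paths coincide, which confirms that the companion matrix in (14.9) only shifts the stored values and appends the new AR(p) value.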
According to the previous commentaries, the state space representation enables us
to solve recursively (namely in an effective way by means of Kalman recursive
formulas) the problems of filtering, smoothing, and predicting in a given DLM. The
key role in this context is played by the conditional distribution of the state vector $x_t$
conditioned on the information contained in the observations $y_s, y_{s-1}, y_{s-2}, \ldots$ till time
s. For practical purposes, we confine ourselves only to the first two moments of
this distribution and denote
\[
\hat{x}_{t|s} = E_s(x_t), \quad P_{t|s} = E_s\left[\left(x_t - \hat{x}_{t|s}\right)\left(x_t - \hat{x}_{t|s}\right)'\right], \tag{14.11}
\]
where the index s in the symbol $E_s(\cdot)$ means that the mean value is conditioned on the
information till time s. Important values in this context are the following ones:
• The one-step-ahead prediction of the state vector $x_t$ from time t − 1 and the
corresponding error matrix:
\[
\hat{x}_{t|t-1} = E_{t-1}(x_t), \quad P_{t|t-1} = E_{t-1}\left[\left(x_t - \hat{x}_{t|t-1}\right)\left(x_t - \hat{x}_{t|t-1}\right)'\right]. \tag{14.12}
\]
• The estimated (filtered) value of the state vector $x_t$ at time t and the corresponding
error matrix:
\[
\hat{x}_{t|t} = E_t(x_t), \quad P_{t|t} = E_t\left[\left(x_t - \hat{x}_{t|t}\right)\left(x_t - \hat{x}_{t|t}\right)'\right]. \tag{14.13}
\]
These predictions and estimates are the best ones according to the MSE criterion
(i.e., in the sense of mean squared error; see (2.11)). Moreover, under the given
assumptions (i.e., in the described DLM under the assumption of normality), they
even have the form of linear functions whose argument is always the
corresponding conditioning information (i.e., the corresponding observation vectors).
Simultaneously one can also obtain the one-step-ahead prediction of the vector $y_t$
from time t − 1 and the corresponding error matrix as
\[
\hat{y}_{t|t-1} = E_{t-1}(y_t) = G_t \hat{x}_{t|t-1}, \quad E_{t-1}\left[\left(y_t - \hat{y}_{t|t-1}\right)\left(y_t - \hat{y}_{t|t-1}\right)'\right] = G_t P_{t|t-1} G_t' + W_t. \tag{14.14}
\]
Remark 14.1 Sometimes the matrices $F_t$, $G_t$, $V_t$, $W_t$ (and possibly others) in the DLM
contain unknown parameters which must be estimated. In practice, one usually
applies the so-called EM algorithm (expectation-maximization; see, e.g., Brockwell
and Davis (1996), Dempster et al. (1977), Wu (1983)), which combines the maximum
likelihood method with optimization algorithms and can be used in
situations with incomplete information where some data are missing.
⋄
1. Filtering in State Space Model
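Filtering in a given state space model consists in the (recursive) estimation of the state
vector $x_t$ exploiting the information contained in $y_t, y_{t-1}, y_{t-2}, \ldots$. The corresponding
Kalman recursive formulas, which are called the Kalman (or Kalman-Bucy) filter in
such a case, have the form
\[
\begin{aligned}
\hat{x}_{t|t} &= \hat{x}_{t|t-1} + P_{t|t-1} G_t' \left(G_t P_{t|t-1} G_t' + W_t\right)^{-1} \left(y_t - G_t \hat{x}_{t|t-1}\right),\\
P_{t|t} &= P_{t|t-1} - P_{t|t-1} G_t' \left(G_t P_{t|t-1} G_t' + W_t\right)^{-1} G_t P_{t|t-1},
\end{aligned} \tag{14.15}
\]
where
\[
\begin{aligned}
\hat{x}_{t|t-1} &= F_{t-1} \hat{x}_{t-1|t-1},\\
P_{t|t-1} &= F_{t-1} P_{t-1|t-1} F_{t-1}' + V_{t-1}.
\end{aligned} \tag{14.16}
\]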
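The recursions (14.15) and (14.16) translate almost line by line into code. The following sketch assumes, for simplicity, time-invariant matrices F, G, V, W and user-supplied start values $\hat{x}_{1|0}$ and $P_{1|0}$; the function name `kalman_filter` and the scalar illustration are assumptions introduced here.

```python
import numpy as np

def kalman_filter(y, F, G, V, W, x_pred, P_pred):
    """Kalman filter for a time-invariant DLM.

    y: observations (T x m); x_pred, P_pred: assumed start values x_{1|0}, P_{1|0}.
    Returns filtered states x_{t|t} (T x d) and error matrices P_{t|t} (T x d x d).
    """
    T = y.shape[0]
    d = F.shape[0]
    x_filt = np.empty((T, d))
    P_filt = np.empty((T, d, d))
    for t in range(T):
        S = G @ P_pred @ G.T + W                      # error matrix of y_{t|t-1}, cf. (14.14)
        K = P_pred @ G.T @ np.linalg.inv(S)           # gain matrix
        x_filt[t] = x_pred + K @ (y[t] - G @ x_pred)  # first line of (14.15)
        P_filt[t] = P_pred - K @ G @ P_pred           # second line of (14.15)
        x_pred = F @ x_filt[t]                        # (14.16)
        P_pred = F @ P_filt[t] @ F.T + V
    return x_filt, P_filt

# Scalar hand check: x_{1|0} = 0, P_{1|0} = 1, W = 1 and one observation y_1 = 2.
F = G = np.array([[1.0]])
V = np.array([[0.0]])
W = np.array([[1.0]])
x_f, P_f = kalman_filter(np.array([[2.0]]), F, G, V, W,
                         np.array([0.0]), np.array([[1.0]]))
# The gain is 0.5, so x_{1|1} = 1.0 and P_{1|1} = 0.5.
```

The explicit inverse keeps the code close to the formulas; for larger observation dimensions a linear solve would usually be preferred.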
Example 14.2 The Kalman filter can be used to construct the recursive OLS estimate in
the classical model of linear regression, which is rewritten in the form of the dynamic
linear model (14.1) and (14.2) with the state vector $\beta_t$, i.e.,
\[
\beta_{t+1} = \beta_t, \tag{14.17}
\]
\[
y_t = (1, x_{t2}, \ldots, x_{tk})\,\beta_t + \varepsilon_t = x_t \beta_t + \varepsilon_t, \tag{14.18}
\]
where $x_t$ is the tth row of the regression matrix X and $\varepsilon_t \sim \text{iid } N(0, \sigma_t^2)$. After substituting into
(14.15) and (14.16), one obtains the following recursive formulas for the OLS estimate
(using a simpler notation, namely $b_t$ instead of $b_{t|t}$ and $P_t$ instead of $P_{t|t}/\sigma_t^2$ since,
e.g., $b_{t|t-1} = b_{t-1|t-1}$):
\[
\begin{aligned}
b_t &= b_{t-1} + \frac{P_{t-1} x_t'}{x_t P_{t-1} x_t' + 1}\left(y_t - x_t b_{t-1}\right),\\
P_t &= P_{t-1} - \frac{P_{t-1} x_t' x_t P_{t-1}}{x_t P_{t-1} x_t' + 1}.
\end{aligned} \tag{14.19}
\]
In contrast to the classical non-recursive OLS estimate, the algorithm (14.19) is
applicable online and does not require the inversion of any matrix during calculations.
However, the formulas (14.19) must be completed by a simple recursive estimate of
the white noise variance $\sigma_t^2$.
⋄
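A short numerical experiment shows how closely the recursion (14.19) reproduces the batch OLS estimate. In the sketch below, the start values $b_0 = 0$ and $P_0 = cI$ with a large constant c follow the initialization suggested later for (14.22) and are an assumption here; the function name `recursive_ols` and the simulated regression are likewise only illustrative.

```python
import numpy as np

def recursive_ols(X, y, c=1e6):
    """Recursive OLS according to (14.19), started from b_0 = 0 and P_0 = c * I."""
    T, k = X.shape
    b = np.zeros(k)
    P = c * np.eye(k)
    for t in range(T):
        x = X[t]
        Px = P @ x                              # P_{t-1} x_t'
        denom = x @ Px + 1.0                    # x_t P_{t-1} x_t' + 1 (a scalar)
        b = b + Px * (y[t] - x @ b) / denom     # update of the estimate
        P = P - np.outer(Px, Px) / denom        # update of P (P is symmetric)
    return b, P

# Hypothetical regression with an intercept and two regressors (simulated data).
rng = np.random.default_rng(7)
T = 200
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.5, size=T)

b_rec, P_rec = recursive_ols(X, y)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]    # batch OLS for comparison
```

Only scalar divisions occur inside the loop, which is the practical content of the remark that no matrix has to be inverted.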
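Example 14.3 In this example, we will show the application of the Kalman filter to
recursive estimation of linear time series models, namely autoregressive models. For
this purpose, the model AR(p) can be rewritten in the form of the dynamic linear model
(14.1) and (14.2) with the state vector $\varphi_t$ as
\[
\varphi_{t+1} = \varphi_t, \tag{14.20}
\]
\[
y_t = \left(y_{t-1}, y_{t-2}, \ldots, y_{t-p}\right)\varphi_t + \varepsilon_t = \mathbf{y}_t \varphi_t + \varepsilon_t, \tag{14.21}
\]
where $\mathbf{y}_t = (y_{t-1}, y_{t-2}, \ldots, y_{t-p})$ and $\varepsilon_t \sim \text{iid } N(0, \sigma_t^2)$. Again after substituting into (14.15)
and (14.16), we obtain the following recursive formulas for estimating the parameters
of this heteroscedastic model AR(p) (using a simpler notation again, namely $\hat{\varphi}_t$
instead of $\hat{\varphi}_{t|t}$ and $P_t$ instead of $P_{t|t}/\sigma_t^2$):
\[
\begin{aligned}
\hat{\varphi}_t &= \hat{\varphi}_{t-1} + \frac{P_{t-1} \mathbf{y}_t'}{\mathbf{y}_t P_{t-1} \mathbf{y}_t' + 1}\left(y_t - \mathbf{y}_t \hat{\varphi}_{t-1}\right),\\
P_t &= P_{t-1} - \frac{P_{t-1} \mathbf{y}_t' \mathbf{y}_t P_{t-1}}{\mathbf{y}_t P_{t-1} \mathbf{y}_t' + 1},\\
\hat{\sigma}_t^2 &= \frac{1}{t-p}\left((t-p-1)\,\hat{\sigma}_{t-1}^2 + \frac{\left(y_t - \mathbf{y}_t \hat{\varphi}_{t-1}\right)^2}{\mathbf{y}_t P_{t-1} \mathbf{y}_t' + 1}\right).
\end{aligned} \tag{14.22}
\]
Particularly for the process AR(1), i.e., $y_t = \varphi y_{t-1} + \varepsilon_t$, these recursive formulas
simplify to the form
\[
\begin{aligned}
\hat{\varphi}_t &= \hat{\varphi}_{t-1} + P_t\, y_{t-1}\left(y_t - \hat{\varphi}_{t-1} y_{t-1}\right),\\
P_t &= \frac{P_{t-1}}{P_{t-1} y_{t-1}^2 + 1},\\
\hat{\sigma}_t^2 &= \frac{1}{t-1}\left((t-2)\,\hat{\sigma}_{t-1}^2 + \frac{\left(y_t - \hat{\varphi}_{t-1} y_{t-1}\right)^2}{P_{t-1} y_{t-1}^2 + 1}\right).
\end{aligned} \tag{14.23}
\]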
Table 14.1 Recursive estimation based on Kalman filter in simulated process
$y_t = 0.6\, y_{t-1} + \varepsilon_t$, $\varepsilon_t \sim \text{iid } N(0, 100)$, t = 1, ..., 100 in Example 14.4
t    $\hat{\varphi}_t$  $P_t$   $\hat{\sigma}_t^2$
10   0.552  0.0024  26.54
20   0.518  0.0034  110.69
30   0.508  0.0003  94.24
40   0.448  0.0002  81.74
50   0.613  0.0001  104.39
60   0.640  0.0001  114.00
70   0.608  0.0001  113.42
80   0.630  0.0001  123.06
90   0.641  0.0001  114.14
100  0.616  0.0001  115.25
The initial values in the recursive formulas (14.22) can be chosen as $\hat{\varphi}_0 = 0$,
$P_0 = cI$ for a high positive constant c; the initial value of the variance of white noise can
be taken as the sample variance of $\{y_t\}$ from an initial segment of this time series
(or $\hat{\sigma}_0^2 = 0$).
⋄
Example 14.4 Table 14.1 presents selected values obtained by means of the recursive
estimation (14.23) based on a simulated trajectory of the process AR(1) modeled as
\[
y_t = 0.6\, y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, 100), \quad t = 1, 2, \ldots, 100
\]
(see the Kalman filter in Example 14.3). The initial values were chosen as $\hat{\varphi}_0 = 0$,
$P_0 = 1$, and $\hat{\sigma}_0^2 = 0$.
⋄
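An experiment of the same kind as in Example 14.4 can be repeated with a few lines of code implementing (14.23). The sketch below is not expected to reproduce the numbers in Table 14.1, since it uses its own random seed; the function name `recursive_ar1`, the start value $y_0 = 0$, and the convention that the variance update starts at t = 2 (the factor 1/(t − 1) is undefined at t = 1) are assumptions made here.

```python
import numpy as np

def recursive_ar1(y, phi0=0.0, P0=1.0, s2_0=0.0):
    """Recursive AR(1) estimation according to (14.23).

    y[0] plays the role of the start value y_0; the recursion runs for t = 1, ..., T.
    Returns a list of tuples (t, phi_hat_t, P_t, sigma2_hat_t).
    """
    phi, P, s2 = phi0, P0, s2_0
    path = []
    for t in range(1, len(y)):
        e = y[t] - phi * y[t - 1]               # one-step prediction error
        denom = P * y[t - 1] ** 2 + 1.0         # P_{t-1} y_{t-1}^2 + 1
        P = P / denom                           # P_t
        phi = phi + P * y[t - 1] * e            # phi_hat_t
        if t >= 2:                              # variance update needs t - 1 > 0
            s2 = ((t - 2) * s2 + e ** 2 / denom) / (t - 1)
        path.append((t, phi, P, s2))
    return path

# Simulated trajectory y_t = 0.6 y_{t-1} + eps_t, eps_t ~ N(0, 100), t = 1, ..., 100.
rng = np.random.default_rng(42)
T = 100
y = np.zeros(T + 1)                             # assumed start value y_0 = 0
for t in range(1, T + 1):
    y[t] = 0.6 * y[t - 1] + rng.normal(0.0, 10.0)

path = recursive_ar1(y)
t_last, phi_hat, P_last, s2_hat = path[-1]
print(t_last, round(phi_hat, 3), round(s2_hat, 2))
```

With $\hat{\varphi}_0 = 0$ the final estimate equals $\sum y_{t-1} y_t / (1/P_0 + \sum y_{t-1}^2)$, so the choice $P_0 = 1$ acts as a very weak shrinkage toward zero when the observations are of the order of ten.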
2. Predicting in State Space Model
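Predicting in the state space model (also predictor) consists in the (recursive) estimation of
the state vector $x_{t+h}$ for particular t using the information contained in $y_t, y_{t-1}, y_{t-2}, \ldots$
(h is fixed). One constructs recursively the predictions of the type
\[
\hat{x}_{t+h|t} = E_t(x_{t+h}), \quad P_{t+h|t} = E_t\left[\left(x_{t+h} - \hat{x}_{t+h|t}\right)\left(x_{t+h} - \hat{x}_{t+h|t}\right)'\right]. \tag{14.24}
\]
The corresponding recursive formulas for one-step-ahead prediction are
\[
\begin{aligned}
\hat{x}_{t+1|t} &= F_t \hat{x}_{t|t-1} + F_t P_{t|t-1} G_t' \left(G_t P_{t|t-1} G_t' + W_t\right)^{-1}\left(y_t - G_t \hat{x}_{t|t-1}\right),\\
P_{t+1|t} &= F_t P_{t|t-1} F_t' + V_t - F_t P_{t|t-1} G_t' \left(G_t P_{t|t-1} G_t' + W_t\right)^{-1} G_t P_{t|t-1} F_t'
\end{aligned} \tag{14.25}
\]
and for more steps h ahead (h ≥ 2)
\[
\begin{aligned}
\hat{x}_{t+h|t} &= F_{t+h-1} F_{t+h-2} \cdots F_{t+1}\, \hat{x}_{t+1|t},\\
P_{t+h|t} &= F_{t+h-1} P_{t+h-1|t} F_{t+h-1}' + V_{t+h-1}.
\end{aligned} \tag{14.26}
\]
Hence one can also calculate the prediction of $y_{t+h}$ and the corresponding error
matrix as
\[
\hat{y}_{t+h|t} = E_t\left(y_{t+h}\right) = G_{t+h} \hat{x}_{t+h|t}, \quad E_t\left[\left(y_{t+h} - \hat{y}_{t+h|t}\right)\left(y_{t+h} - \hat{y}_{t+h|t}\right)'\right] = G_{t+h} P_{t+h|t} G_{t+h}' + W_{t+h}. \tag{14.27}
\]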
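The multi-step formulas (14.26) and (14.27) can be sketched as follows for the time-invariant case. The function name `predict_ahead` and the AR(1) illustration (state $x_t = y_t$, F = 0.6, V = 100, W = 0) are assumptions made for this example; the inputs $\hat{x}_{t+1|t}$ and $P_{t+1|t}$ are taken as given, e.g., from (14.25).

```python
import numpy as np

def predict_ahead(x_next, P_next, F, G, V, W, h):
    """h-step-ahead prediction (h >= 1) in a time-invariant DLM.

    x_next, P_next: one-step-ahead values x_{t+1|t}, P_{t+1|t}.
    Returns x_{t+h|t}, P_{t+h|t}, y_{t+h|t} and the error matrix of y_{t+h|t}.
    """
    x, P = x_next, P_next
    for _ in range(h - 1):
        x = F @ x                     # first line of (14.26), applied step by step
        P = F @ P @ F.T + V           # second line of (14.26)
    y_hat = G @ x                     # (14.27)
    S = G @ P @ G.T + W
    return x, P, y_hat, S

# AR(1) written as a state space model with state x_t = y_t (illustrative setup).
F = np.array([[0.6]])
G = np.array([[1.0]])
V = np.array([[100.0]])
W = np.array([[0.0]])
# Last observation y_t = 10 gives x_{t+1|t} = 6 with P_{t+1|t} = 100.
x3, P3, y3, S3 = predict_ahead(np.array([6.0]), np.array([[100.0]]), F, G, V, W, h=3)
# x_{t+3|t} = 0.6**2 * 6 = 2.16 and P_{t+3|t} = 100 * (1 + 0.36 + 0.1296) = 148.96
```

The error variance grows with the horizon toward the unconditional variance 100/(1 − 0.36) of the process, as one would expect for a stationary AR(1).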
3. Smoothing in State Space Model

Smoothing in the state space model (also smoother or Kalman fixed point smoothing)
consists in the estimation of the state vector $x_t$ for a fixed time t. The procedure is
recursive for increasing n, gradually using the information contained in the samples $y_n, y_{n-1},
y_{n-2}, \ldots$. In other words, one constructs recursively the smoothed values of the state vector
$x_t$ and the corresponding error matrix:
\[
\hat{x}_{t|n} = E_n(x_t), \quad P_{t|n} = E_n\left[\left(x_t - \hat{x}_{t|n}\right)\left(x_t - \hat{x}_{t|n}\right)'\right]. \tag{14.28}
\]
The corresponding recursive formulas are
\[
\begin{aligned}
\hat{x}_{t|n} &= \hat{x}_{t|n-1} + \Omega_{t,n} G_n' \left(G_n P_{n|n-1} G_n' + W_n\right)^{-1}\left(y_n - G_n \hat{x}_{n|n-1}\right),\\
P_{t|n} &= P_{t|n-1} - \Omega_{t,n} G_n' \left(G_n P_{n|n-1} G_n' + W_n\right)^{-1} G_n \Omega_{t,n}',\\
\Omega_{t,n+1} &= \Omega_{t,n}\left[F_n - F_n P_{n|n-1} G_n' \left(G_n P_{n|n-1} G_n' + W_n\right)^{-1} G_n\right]'.
\end{aligned} \tag{14.29}
\]
Hence one can also smooth the observed time series $\{y_t\}$ as
\[
\hat{y}_{t|n} = G_t \hat{x}_{t|n}, \quad E_n\left[\left(y_t - \hat{y}_{t|n}\right)\left(y_t - \hat{y}_{t|n}\right)'\right] = G_t P_{t|n} G_t' + W_t. \tag{14.30}
\]
It is necessary to stress once more that the state space methodology is the
theoretical concept for the construction of various recursive procedures in time series
analysis. Section 14.1.1 shows a possible application to recursive estimation of
(multivariate) GARCH models of financial time series.
14.1.1 Recursive Estimation of Multivariate GARCH Models
The usual estimation of GARCH models is based on the maximum likelihood

principle (see, e.g., Fan and Yao (2005)). On the other hand, GARCH models for
high-frequency data (HFD) in finance necessitate an application of recursive (i.e.,

online) approaches which mostly consist in the state space modeling.
Hendrych and Cipra (2018) modified recursive algorithms suggested originally
for system identification in engineering (see Ljung (1999), Ljung and Söderström
(1983), Söderström and Stoica (1989)) to be applicable also for online estimation of
(multivariate) GARCH models in finance. The method combines so-called recursive
pseudo-linear regression with the ML estimation:
One uses the usual model framework for multivariate GARCH processes {rt},
which are formed mostly by m-variate vectors of log returns of financial assets from
a given portfolio (see Sect. 13.3):
$$
r_t = H_t^{1/2} \varepsilon_t, \tag{14.31}
$$
where {εt} is an iid multivariate white noise with normal distribution
$$
\varepsilon_t \sim N(0, I) \tag{14.32}
$$
and $H_t^{1/2}$ is the square root matrix of the conditional covariance matrix $H_t$, expressed at
time t as a suitable function of the information $\Im_{t-1}$ known up to time t − 1.
In particular, $H_t$ is a positive definite $\Im_{t-1}$-measurable matrix and
$$
H_t = H_t^{1/2} \left( H_t^{1/2} \right)'.
$$
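As the corresponding conditional moments are
$$
E(r_t \mid \Im_{t-1}) = 0, \qquad \operatorname{var}(r_t \mid \Im_{t-1}) = H_t, \tag{14.33}
$$
the conditional probability density is obviously
$$
f(r_t \mid \Im_{t-1}) = |2\pi H_t|^{-1/2} \exp\left\{ -\frac{1}{2}\, r_t' H_t^{-1} r_t \right\}. \tag{14.34}
$$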
Hence the conditional ML estimator of the true parameter vector θ for modeling
Ηt(θ) can be found by minimizing
$$
\min_{\theta} \sum_{t=1}^{T} \left[ \ln |H_t(\theta)| + r_t' H_t(\theta)^{-1} r_t \right]. \tag{14.35}
$$
In this phase, we apply the general scheme of recursive pseudo-linear regression
(see, e.g., Hendrych and Cipra (2018), Ljung (1999)):
$$
\hat{\theta}_t = \hat{\theta}_{t-1} - \eta_t R_t^{-1} F_t'(\hat{\theta}_{t-1}), \tag{14.36}
$$
$$
R_t = R_{t-1} + \eta_t \left[ \tilde{F}_t''(\hat{\theta}_{t-1}) - R_{t-1} \right], \tag{14.37}
$$
$$
\eta_t = \frac{1}{1 + \xi_t / \eta_{t-1}} \quad \text{for a forgetting factor } \xi_t, \tag{14.38}
$$
where
$$
F_t(\theta) = \ln |H_t(\theta)| + r_t' H_t(\theta)^{-1} r_t. \tag{14.39}
$$
Here $F_t'(\theta)$ denotes the gradient of $F_t(\theta)$ and $\tilde{F}_t''(\theta)$ is an approximation of the
Hessian matrix $F_t''(\theta)$ such that
$$
E\left[ \tilde{F}_t''(\hat{\theta}_{t-1}) - F_t''(\hat{\theta}_{t-1}) \mid \Im_{t-1} \right] = 0 \tag{14.40}
$$
(the approximation based on the conditional mean value in (14.40) simplifies
the calculation of the Hessian matrix). Finally, the forgetting factor {ξt} in (14.38)
substantially improves the convergence and statistical properties of the given recursive
estimation. The usual choice in practice is either a constant forgetting factor ξ (e.g.,
ξ = 0.95) or an increasing forgetting factor, e.g.,
$$
\xi_t = \tilde{\xi}\, \xi_{t-1} + (1 - \tilde{\xi}), \qquad \xi_0, \tilde{\xi} \in (0, 1). \tag{14.41}
$$
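The interplay of the forgetting factor (14.41) and the step size (14.38) can be sketched in a few lines. The function names and the start value η0 = 1 are assumptions made only for this illustration; the values ξ0 = 0.95 and ξ̃ = 0.99 are simply sample defaults.

```python
def forgetting_factors(n, xi0=0.95, xi_tilde=0.99):
    """Increasing forgetting factors xi_1, ..., xi_n following (14.41)."""
    xi, out = xi0, []
    for _ in range(n):
        xi = xi_tilde * xi + (1.0 - xi_tilde)
        out.append(xi)
    return out


def step_sizes(xis, eta0=1.0):
    """Step sizes eta_t = 1 / (1 + xi_t / eta_{t-1}) following (14.38).

    eta0 is an assumed start value, not prescribed by the text.
    """
    eta, out = eta0, []
    for xi in xis:
        eta = 1.0 / (1.0 + xi / eta)
        out.append(eta)
    return out
```

Two limiting cases may help the intuition. With ξt = 1 for all t and η0 = 1, the recursion gives ηt = 1/(t + 1), i.e., all past observations are weighted equally. With a constant ξ < 1, the step size converges to the fixed point 1 − ξ of (14.38), so that old observations are discounted geometrically. Under (14.41) the factor ξt increases toward 1, since 1 − ξt = ξ̃^t (1 − ξ0).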
Besides the choice of forgetting factor, further technicalities must be solved before
applying the estimation in practice, e.g., the initialization of the estimation algorithm.
The special case of recursive estimation of univariate GARCH models is
shown in Hendrych and Cipra (2018).
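Example 14.5 Let us consider the recursive estimation of the (single) parameter λ in
the multivariate EWMA (or MEWMA or scalar VEC-IGARCH(1,1)) model (13.1)
which can be rewritten as
$$
H_t(\lambda) = (1 - \lambda)\, r_t r_t' + \lambda H_{t-1}(\lambda), \qquad \lambda \in (0, 1), \tag{14.42}
$$
where the discount constant λ (0 < λ < 1) is the only parameter in this very simple
multivariate GARCH model. Then after tedious (matrix) manipulations one can
rewrite the recursive pseudo-linear regression (14.36)–(14.38) in the form
$$
\hat{\lambda}_t = \hat{\lambda}_{t-1} - \eta_t R_t^{-1} \left[ \operatorname{tr}\left( H_t^{-1}(\hat{\lambda}_{t-1}) \frac{\partial H_t(\hat{\lambda}_{t-1})}{\partial \lambda} \right) - r_t' H_t^{-1}(\hat{\lambda}_{t-1}) \frac{\partial H_t(\hat{\lambda}_{t-1})}{\partial \lambda} H_t^{-1}(\hat{\lambda}_{t-1})\, r_t \right], \tag{14.43}
$$
$$
R_t = R_{t-1} + \eta_t \left[ \operatorname{tr}\left( H_t^{-1}(\hat{\lambda}_{t-1}) \frac{\partial H_t(\hat{\lambda}_{t-1})}{\partial \lambda} H_t^{-1}(\hat{\lambda}_{t-1}) \frac{\partial H_t(\hat{\lambda}_{t-1})}{\partial \lambda} \right) - R_{t-1} \right], \tag{14.44}
$$
$$
H_{t+1}(\hat{\lambda}_t) = (1 - \hat{\lambda}_t)\, r_t r_t' + \hat{\lambda}_t H_t(\hat{\lambda}_{t-1}), \tag{14.45}
$$
$$
\frac{\partial H_{t+1}(\hat{\lambda}_t)}{\partial \lambda} = -r_t r_t' + H_t(\hat{\lambda}_{t-1}) + \hat{\lambda}_t \frac{\partial H_t(\hat{\lambda}_{t-1})}{\partial \lambda}, \tag{14.46}
$$
$$
\eta_t = \frac{1}{1 + \xi_t / \eta_{t-1}} \quad \text{for a forgetting factor } \{\xi_t\} \tag{14.47}
$$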
(the symbol tr(A) denotes the trace of matrix A). Note that all calculations (including
the calculations of matrix derivatives in (14.46)) are recursive.
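The recursions (14.43)–(14.47) can be sketched for the bivariate case as follows. This is only an illustration under stated assumptions, not the authors' implementation: the start values (λ0 = 0.94, η0 = 1, R0 = 1, H1 equal to the identity matrix, zero initial derivative), the constant forgetting factor, and the clipping of the estimate to the interval [lam_min, lam_max] are all hypothetical choices, and the helper names are invented for this sketch.

```python
import random


def inv2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def mul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def recursive_mewma(returns, lam0=0.94, xi=0.995, eta0=1.0, R0=1.0,
                    lam_min=0.5, lam_max=0.999):
    """Recursive estimate of lambda; returns the path of estimates and the last H."""
    lam, eta, R = lam0, eta0, R0
    H = [[1.0, 0.0], [0.0, 1.0]]      # assumed start value H_1
    D = [[0.0, 0.0], [0.0, 0.0]]      # assumed start value of dH_1/dlambda
    path = []
    for r in returns:
        Hinv = inv2(H)
        HiD = mul2(Hinv, D)           # H^{-1} dH/dlambda
        HiDHi = mul2(HiD, Hinv)       # H^{-1} dH/dlambda H^{-1}
        HiDHiD = mul2(HiD, HiD)
        eta = 1.0 / (1.0 + xi / eta)                              # (14.47)
        R = R + eta * (HiDHiD[0][0] + HiDHiD[1][1] - R)           # (14.44)
        quad = sum(r[i] * HiDHi[i][j] * r[j]
                   for i in range(2) for j in range(2))
        grad = HiD[0][0] + HiD[1][1] - quad                       # bracket in (14.43)
        # (14.43), followed by an assumed safeguard keeping lambda inside (0, 1)
        new_lam = min(max(lam - eta * grad / R, lam_min), lam_max)
        rr = [[r[i] * r[j] for j in range(2)] for i in range(2)]
        D = [[-rr[i][j] + H[i][j] + new_lam * D[i][j] for j in range(2)]
             for i in range(2)]                                   # (14.46)
        H = [[(1.0 - new_lam) * rr[i][j] + new_lam * H[i][j] for j in range(2)]
             for i in range(2)]                                   # (14.45)
        lam = new_lam
        path.append(lam)
    return path, H


def simulate_returns(n, rho=0.8, seed=0):
    """iid bivariate normal pairs with unit variances and correlation rho."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append((z1, rho * z1 + (1.0 - rho * rho) ** 0.5 * z2))
    return out
```

A call such as `recursive_mewma(simulate_returns(500))` returns the whole path of estimates together with the last conditional covariance matrix. Only 2×2 matrices and scalars are carried from step to step, which is what makes the scheme suitable for online use.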
Figure 14.1 shows the simulation results in the bivariate case, where the process
{rt} in (14.42) was generated as an iid normal white noise with zero mean values, unit
variances, and correlation coefficient 0.8. Four alternatives with true values of the
parameter λ (namely λ = 0.91, 0.94, 0.97, 0.99) were considered (1000
simulations were run for each of them).
This recursive estimate was applied for 647 couples of daily log returns of
40 currency rates versus EUR (i.e., (EUR, CURR1) and (EUR, CURR2)) from
January 1999 to December 2017 according to the European Central Bank. For
estimating the parameter λ, three approaches were used (see Cipra and Hendrych
(2019)):
• Fixed $\hat{\lambda}_t = 0.94$.
• Recursive MEWMA method with fixed forgetting factor $\xi_t = 0.995$.
• Recursive MEWMA method with increasing forgetting factor $\xi_t = \tilde{\xi}\, \xi_{t-1} + (1 - \tilde{\xi})$, where $\xi_0 = 0.95$, $\tilde{\xi} = 0.99$.
For example, Fig. 14.2 presents the parameter estimators for the couple EUR/USD
and EUR/JPY. Moreover, the corresponding estimated conditional correlation and
volatilities are shown using the results of the recursive MEWMA method with
increasing forgetting factor.
⋄
[Figure 14.1: boxplots of recursive estimates of the bivariate EWMA model parameter; four panels for LAMBDA = 0.91, 0.94, 0.97, 0.99; horizontal axis T_250, T_500, T_750, T_1000; vertical axis from 0.80 to 1.00]
Fig. 14.1 Recursive estimation of parameter λ in bivariate EWMA model (14.42) (boxplots are
based on 1000 simulations for four alternatives with true values λ = 0.91, 0.94, 0.97, 0.99). Source:
Cipra and Hendrych (2019)
14.2 State Space Model Approach to Exponential Smoothing
Exponential smoothing from Sects. 3.3 and 4.1.3, including Holt’s and Holt–Winters’
methods, can be formulated as filtering and prediction based on state space
modeling (see the monograph by Hyndman et al. (2008)). One can even systematically
classify particular models according to the type of trend, seasonal, and residual
(or error) components (see Sect. 2.2.2) and the type of decomposition of the time series
(additive or multiplicative).
[Figure 14.2: four panels for EUR/USD and EUR/JPY over 2000–2015: various MEWMA parameter estimators; various MEWMA conditional correlations; various MEWMA conditional variances of EUR/USD; various MEWMA conditional variances of EUR/JPY]
Fig. 14.2 MEWMA method for log returns of currency rates for the couple EUR/USD and
EUR/JPY from January 1999 to December 2017 in Example 14.5: the parameter estimators (the
smooth non-constant line plots the recursive MEWMA estimate with increasing forgetting factor)
and the corresponding model estimates of conditional correlation coefficient and volatilities (for
recursively estimated λ with increasing forgetting factor). Source: Cipra and Hendrych (2019)

For instance, let us consider the following DLM (14.1)–(14.2) for a (univariate)
time series {yt}:
$$
x_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} x_{t-1} + \begin{pmatrix} \alpha \\ \bar{\gamma} \end{pmatrix} \varepsilon_t, \tag{14.48}
$$
$$
y_t = (1 \;\; 1)\, x_{t-1} + \varepsilon_t \tag{14.49}
$$
with the state vector $x_t = (L_t, T_t)'$, where the symbols $L_t$ and $T_t$ denote the level and
slope of the given time series (see Sect. 3.1.1), respectively, and {εt} is a white noise
(note that the residuals in the state and observation equations at time t are mutually
correlated). Then one obtains step by step
$$
\begin{aligned}
L_t &= L_{t-1} + T_{t-1} + \alpha \varepsilon_t = L_{t-1} + T_{t-1} + \alpha (y_t - L_{t-1} - T_{t-1}) \\
    &= \alpha y_t + (1 - \alpha)(L_{t-1} + T_{t-1}), \\
T_t &= T_{t-1} + \bar{\gamma} \varepsilon_t = T_{t-1} + \alpha \gamma \varepsilon_t = T_{t-1} + \gamma (L_t - L_{t-1} - T_{t-1}) \\
    &= \gamma (L_t - L_{t-1}) + (1 - \gamma) T_{t-1}, \\
\hat{y}_{t+\tau}(t) &= E(y_{t+\tau} \mid x_t) = L_t + T_t \tau \quad (\tau \ge 0),
\end{aligned}
$$
which is equivalent to Holt’s method (3.106)–(3.109) (we put $\bar{\gamma} = \alpha\gamma$).
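The equivalence derived above can be checked numerically with a short sketch. The function names, the start values L0 and T0, and the sample data in the usage note are hypothetical; the first function iterates the state space form (14.48)–(14.49) and the second one the classical recursions of Holt's method.

```python
def holt_state_space(y, alpha, gamma_bar, L0, T0):
    """Iterate the DLM (14.48)-(14.49); returns the list of states (L_t, T_t)."""
    L, T, out = L0, T0, []
    for yt in y:
        eps = yt - (L + T)                                 # residual from (14.49)
        L, T = L + T + alpha * eps, T + gamma_bar * eps    # state update (14.48)
        out.append((L, T))
    return out


def holt_recursions(y, alpha, gamma, L0, T0):
    """Classical Holt recursions for level and slope."""
    L, T, out = L0, T0, []
    for yt in y:
        L_new = alpha * yt + (1.0 - alpha) * (L + T)
        T = gamma * (L_new - L) + (1.0 - gamma) * T
        L = L_new
        out.append((L, T))
    return out
```

For example, with `y = [12.0, 13.5, 13.1, 14.8, 15.2]`, `alpha = 0.3`, and `gamma = 0.2`, calling `holt_state_space(y, 0.3, 0.3 * 0.2, 11.0, 1.0)` and `holt_recursions(y, 0.3, 0.2, 11.0, 1.0)` should give the same states up to rounding error.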

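One can proceed analogously in the case of the additive Holt–Winters’ method. The
corresponding DLM can be chosen as
$$
x_t = \begin{pmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix} x_{t-1} + \begin{pmatrix} \alpha \\ \bar{\gamma} \\ \bar{\delta} \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \varepsilon_t, \tag{14.50}
$$
$$
y_t = (1 \;\; 1 \;\; 0 \;\; 0 \;\; \dots \;\; 0 \;\; 1)\, x_{t-1} + \varepsilon_t \tag{14.51}
$$
with the state vector $x_t = (L_t, T_t, I_t, I_{t-1}, \dots, I_{t-s+1})'$, where the symbols $L_t$, $T_t$, and $I_t$
denote the level, slope, and seasonal index of the given time series at time t,
respectively, and {εt} is again a white noise. Hence it follows step by step
$$
L_t = \alpha (y_t - I_{t-s}) + (1 - \alpha)(L_{t-1} + T_{t-1}), \tag{14.52}
$$
$$
T_t = \gamma (L_t - L_{t-1}) + (1 - \gamma) T_{t-1}, \tag{14.53}
$$
$$
I_t = \delta (y_t - L_t) + (1 - \delta) I_{t-s}, \tag{14.54}
$$
$$
\hat{y}_t = L_t + I_t, \tag{14.55}
$$
$$
\begin{aligned}
\hat{y}_{t+\tau}(t) &= L_t + T_t \tau + I_{t+\tau-s} && \text{for } \tau = 1, \dots, s, \\
                    &= L_t + T_t \tau + I_{t+\tau-2s} && \text{for } \tau = s+1, \dots, 2s, \\
                    &\;\;\vdots
\end{aligned} \tag{14.56}
$$
which is equivalent to the additive Holt–Winters’ method (4.1.16)–(4.1.20) with
seasonality s (we put $\bar{\gamma} = \alpha\gamma$ and $\bar{\delta} = (1 - \alpha)\delta$). Let us derive, e.g., the recursive
formula (14.54). It holds
$$
\begin{aligned}
I_t &= I_{t-s} + \bar{\delta} \varepsilon_t = I_{t-s} + (1 - \alpha)\delta \varepsilon_t = I_{t-s} + \delta (y_t - L_t - I_{t-s}) \\
    &= \delta (y_t - L_t) + (1 - \delta) I_{t-s}.
\end{aligned}
$$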
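As a numerical check of this equivalence, the following sketch iterates the state space form (14.50)–(14.51) alongside the level, slope, and seasonal recursions of the additive Holt–Winters' method. All names and start values are hypothetical; the list `seas` is assumed to hold the last s seasonal indices ordered from the oldest, I(t−s), to the newest, I(t−1).

```python
def hw_state_space(y, alpha, gamma_bar, delta_bar, L0, T0, seas0):
    """Iterate the DLM (14.50)-(14.51); returns the states (L_t, T_t, I_t)."""
    L, T, seas, out = L0, T0, list(seas0), []
    for yt in y:
        eps = yt - (L + T + seas[0])          # residual from (14.51)
        I_new = seas[0] + delta_bar * eps     # third row of (14.50)
        L, T = L + T + alpha * eps, T + gamma_bar * eps
        seas = seas[1:] + [I_new]             # shift the seasonal indices
        out.append((L, T, I_new))
    return out


def hw_recursions(y, alpha, gamma, delta, L0, T0, seas0):
    """Additive Holt-Winters recursions for level, slope, and seasonal index."""
    L, T, seas, out = L0, T0, list(seas0), []
    for yt in y:
        L_new = alpha * (yt - seas[0]) + (1.0 - alpha) * (L + T)   # (14.52)
        T = gamma * (L_new - L) + (1.0 - gamma) * T                # slope update
        I_new = delta * (yt - L_new) + (1.0 - delta) * seas[0]     # (14.54)
        L = L_new
        seas = seas[1:] + [I_new]
        out.append((L, T, I_new))
    return out
```

With `gamma_bar = alpha * gamma` and `delta_bar = (1 - alpha) * delta`, both functions should return the same states up to rounding error.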
In this context, a broad class of state space models can be considered providing
various types of exponential smoothing alternatives.
Table 14.2 Trend types used in state space approach to exponential smoothing (φ denotes a damping parameter (0 < φ < 1))
Trend type                     Trend
None (N)                       $L$
Additive (A)                   $L + T\tau$
Additive damped (Ad)           $L + (\varphi + \varphi^2 + \cdots + \varphi^\tau)\, T$
Multiplicative (M)             $L\, T^{\tau}$
Multiplicative damped (Md)     $L\, T^{(\varphi + \varphi^2 + \cdots + \varphi^\tau)}$
1. Classification of Exponential Smoothing Models

The state space approach to exponential smoothing by Hyndman et al. (2008) starts with
the classification of trend components. The five trend types (or growth patterns) are
presented in Table 14.2.
The classification of state space models in the context of exponential smoothing is
based on the additive character (A) or multiplicative character (M) of particular
decomposition components. It is a triplet ETS(·, ·, ·) for Error component (A or M),
Trend component (N, A, Ad, M, or Md; see Table 14.2), and Seasonal component
(N, A, or M). According to this classification, e.g., Holt’s method is ETS(A, A,
N) (see (14.48)–(14.49)), the additive Holt–Winters’ method is ETS(A, A, A)
(see (14.50)–(14.51)), and similarly the multiplicative Holt–Winters’ method is
ETS(A, A, M).
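Tables 14.3, 14.4, and 14.5 present the recursive relations of the type (14.52)–
(14.56) for models ETS(A, ·, N), ETS(A, ·, A), and ETS(A, ·, M), respectively, i.e., for
additive errors only. In particular, Table 14.3 contains Holt’s method, Table 14.4
contains the additive Holt–Winters’ method, and Table 14.5 contains the
multiplicative Holt–Winters’ method. A simplified notation is used in these
tables, namely $\varphi_\tau = \varphi + \varphi^2 + \cdots + \varphi^\tau$ for a damping parameter φ (0 < φ < 1) and
$\tau_s^+ = [(\tau - 1) \bmod s] + 1$ to simplify prediction relations of the type (14.56) (the
modulo operation finds the remainder after division by s). Moreover, the smoothing
relations of the type (14.55) are omitted since they can be obtained from the
corresponding prediction relations if we put τ = 0.

Table 14.3 Recursive relations of exponential smoothing for state space models ETS(A, ·, N)
($\varphi_\tau = \varphi + \varphi^2 + \cdots + \varphi^\tau$)
Trend                          Recursive relations for ETS(A, ·, N)
None (N)                       $L_t = \alpha y_t + (1-\alpha) L_{t-1}$
                               $\hat{y}_{t+\tau}(t) = L_t$
Additive (A)                   $L_t = \alpha y_t + (1-\alpha)(L_{t-1} + T_{t-1})$
                               $T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) T_{t-1}$
                               $\hat{y}_{t+\tau}(t) = L_t + T_t \tau$
Additive damped (Ad)           $L_t = \alpha y_t + (1-\alpha)(L_{t-1} + \varphi T_{t-1})$
                               $T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) \varphi T_{t-1}$
                               $\hat{y}_{t+\tau}(t) = L_t + T_t \varphi_\tau$
Multiplicative (M)             $L_t = \alpha y_t + (1-\alpha) L_{t-1} T_{t-1}$
                               $T_t = \gamma (L_t / L_{t-1}) + (1-\gamma) T_{t-1}$
                               $\hat{y}_{t+\tau}(t) = L_t T_t^{\tau}$
Multiplicative damped (Md)     $L_t = \alpha y_t + (1-\alpha) L_{t-1} T_{t-1}^{\varphi}$
                               $T_t = \gamma (L_t / L_{t-1}) + (1-\gamma) T_{t-1}^{\varphi}$
                               $\hat{y}_{t+\tau}(t) = L_t T_t^{\varphi_\tau}$

Table 14.4 Recursive relations of exponential smoothing for state space models ETS(A, ·, A)
($\varphi_\tau = \varphi + \varphi^2 + \cdots + \varphi^\tau$, $\tau_s^+ = [(\tau - 1) \bmod s] + 1$)
Trend                          Recursive relations for ETS(A, ·, A)
None (N)                       $L_t = \alpha (y_t - I_{t-s}) + (1-\alpha) L_{t-1}$
                               $I_t = \delta (y_t - L_{t-1}) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t + I_{t-s+\tau_s^+}$
Additive (A)                   $L_t = \alpha (y_t - I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1})$
                               $I_t = \delta (y_t - L_{t-1} - T_{t-1}) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t + T_t \tau + I_{t-s+\tau_s^+}$
Additive damped (Ad)           $L_t = \alpha (y_t - I_{t-s}) + (1-\alpha)(L_{t-1} + \varphi T_{t-1})$
                               $I_t = \delta (y_t - L_{t-1} - \varphi T_{t-1}) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t + T_t \varphi_\tau + I_{t-s+\tau_s^+}$
Multiplicative (M)             $L_t = \alpha (y_t - I_{t-s}) + (1-\alpha) L_{t-1} T_{t-1}$
                               $I_t = \delta (y_t - L_{t-1} T_{t-1}) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t T_t^{\tau} + I_{t-s+\tau_s^+}$
Multiplicative damped (Md)     $L_t = \alpha (y_t - I_{t-s}) + (1-\alpha) L_{t-1} T_{t-1}^{\varphi}$
                               $T_t = \gamma (L_t / L_{t-1}) + (1-\gamma) T_{t-1}^{\varphi}$
                               $I_t = \delta (y_t - L_{t-1} T_{t-1}^{\varphi}) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t T_t^{\varphi_\tau} + I_{t-s+\tau_s^+}$

Table 14.5 Recursive relations of exponential smoothing for state space models ETS(A, ·, M)
($\varphi_\tau = \varphi + \varphi^2 + \cdots + \varphi^\tau$, $\tau_s^+ = [(\tau - 1) \bmod s] + 1$)
Trend                          Recursive relations for ETS(A, ·, M)
None (N)                       $L_t = \alpha (y_t / I_{t-s}) + (1-\alpha) L_{t-1}$
                               $I_t = \delta (y_t / L_{t-1}) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t I_{t-s+\tau_s^+}$
Additive (A)                   $L_t = \alpha (y_t / I_{t-s}) + (1-\alpha)(L_{t-1} + T_{t-1})$
                               $I_t = \delta (y_t / (L_{t-1} + T_{t-1})) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = (L_t + T_t \tau)\, I_{t-s+\tau_s^+}$
Additive damped (Ad)           $L_t = \alpha (y_t / I_{t-s}) + (1-\alpha)(L_{t-1} + \varphi T_{t-1})$
                               $I_t = \delta (y_t / (L_{t-1} + \varphi T_{t-1})) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = (L_t + T_t \varphi_\tau)\, I_{t-s+\tau_s^+}$
Multiplicative (M)             $L_t = \alpha (y_t / I_{t-s}) + (1-\alpha) L_{t-1} T_{t-1}$
                               $I_t = \delta (y_t / (L_{t-1} T_{t-1})) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t T_t^{\tau} I_{t-s+\tau_s^+}$
Multiplicative damped (Md)     $L_t = \alpha (y_t / I_{t-s}) + (1-\alpha) L_{t-1} T_{t-1}^{\varphi}$
                               $T_t = \gamma (L_t / L_{t-1}) + (1-\gamma) T_{t-1}^{\varphi}$
                               $I_t = \delta (y_t / (L_{t-1} T_{t-1}^{\varphi})) + (1-\delta) I_{t-s}$
                               $\hat{y}_{t+\tau}(t) = L_t T_t^{\varphi_\tau} I_{t-s+\tau_s^+}$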
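The two pieces of notation used in Tables 14.3–14.5 are easy to misread, so a small sketch may help. The helper names are hypothetical, and the forecast function illustrates only one cell of the tables, namely the prediction relation of the additive damped trend with additive seasonality; the list `seas` is assumed to hold the s most recent seasonal indices ordered from I(t−s+1) to I(t).

```python
def phi_sum(phi, tau):
    """phi_tau = phi + phi^2 + ... + phi^tau (equal to 0 for tau = 0)."""
    return sum(phi ** k for k in range(1, tau + 1))


def tau_s_plus(tau, s):
    """tau_s^+ = [(tau - 1) mod s] + 1, i.e., tau wrapped into 1, ..., s."""
    return (tau - 1) % s + 1


def forecast_damped_additive_seasonal(L, T, seas, phi, tau):
    """Prediction L_t + T_t * phi_tau + I_{t-s+tau_s^+} for tau >= 1.

    seas = [I_{t-s+1}, ..., I_t], so I_{t-s+j} is stored at position j - 1.
    """
    s = len(seas)
    return L + T * phi_sum(phi, tau) + seas[tau_s_plus(tau, s) - 1]
```

For s = 4 the index τ_s^+ runs through 1, 2, 3, 4 and then starts again at 1 for τ = 5, so the forecast always reuses the most recent estimate of the seasonal index for the matching season.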
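Remark 14.2 An example of state space models with multiplicative error is the
model ETS(M, A, N) with DLM of the form
$$
x_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} x_{t-1} + \begin{pmatrix} \alpha \\ \gamma \end{pmatrix} (1 \;\; 1)\, x_{t-1}\, \varepsilon_t, \tag{14.57}
$$
$$
y_t = (1 \;\; 1)\, x_{t-1} (1 + \varepsilon_t) \tag{14.58}
$$
which can be rewritten as
$$
L_t = (L_{t-1} + T_{t-1})(1 + \alpha \varepsilon_t), \tag{14.59}
$$
$$
T_t = T_{t-1} + \gamma (L_{t-1} + T_{t-1})\, \varepsilon_t, \tag{14.60}
$$
$$
\hat{y}_t = (L_{t-1} + T_{t-1})(1 + \varepsilon_t). \tag{14.61}
$$
Here the recursive relations are not presented due to their complexity. To derive
them one should first express the relative error as
$$
\varepsilon_t = \frac{y_t - E(y_t \mid x_{t-1})}{E(y_t \mid x_{t-1})} \tag{14.62}
$$
(from the observation relation $y_t = E(y_t \mid x_{t-1})(1 + \varepsilon_t)$ of the corresponding DLM).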
⋄
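A minimal filtering sketch for the model of Remark 14.2 follows. It computes the relative errors (14.62) and updates the level and slope by (14.59)–(14.60). The function name and start values are hypothetical, and the sketch assumes that the one-step prediction L(t−1) + T(t−1) never equals zero (e.g., a positive series).

```python
def ets_man_filter(y, alpha, gamma, L0, T0):
    """Relative errors and states of ETS(M, A, N) for observations y."""
    L, T = L0, T0
    errors, states = [], []
    for yt in y:
        mu = L + T                     # E(y_t | x_{t-1}), assumed nonzero
        eps = (yt - mu) / mu           # relative error (14.62)
        L = mu * (1.0 + alpha * eps)   # (14.59)
        T = T + gamma * mu * eps       # (14.60)
        errors.append(eps)
        states.append((L, T))
    return errors, states
```

Since μ·ε equals the ordinary one-step prediction error y − μ, the updates in this sketch can also be read as L = μ + α(y − μ) and T = T + γ(y − μ).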
2. Construction of Exponential Smoothing Models
There are many technicalities to be solved when constructing the exponential
smoothing models described in the previous text (selection, estimation, initialization,
assessing forecast accuracy; see Hyndman et al. (2008) and also Sects. 3.3 and
4.1.3). Here we deal briefly with the problem of model estimation only.
For this purpose, we apply the following general form of the
corresponding DLM:
$$
x_t = f(x_{t-1}) + g(x_{t-1})\, \varepsilon_t, \tag{14.63}
$$
$$
y_t = w(x_{t-1}) + r(x_{t-1})\, \varepsilon_t \tag{14.64}
$$
with the state vector $x_t = (L_t, T_t, I_t, I_{t-1}, \dots, I_{t-s+1})'$ serving as an argument of scalar
and vector (linear) functions. Further one assumes that {εt} is a normal white
noise with variance $\sigma^2$. The models with additive errors have $r(x_{t-1}) = 1$
(so that $y_t = E(y_t \mid x_{t-1}) + \varepsilon_t$), while the models with multiplicative errors have
$r(x_{t-1}) = E(y_t \mid x_{t-1})$ (so that $y_t = E(y_t \mid x_{t-1})(1 + \varepsilon_t)$).
If $\theta = (\alpha, \gamma, \delta, \varphi)'$ is the vector of unknown model parameters and $x_0$ contains
given initial state values, then the corresponding (normal) log likelihood function
can be written as
$$
L(\theta, \sigma^2 \mid y, x_0) = -\frac{n}{2} \ln(2\pi\sigma^2) - \sum_{t=1}^{n} \ln |r(x_{t-1})| - \frac{1}{2} \sum_{t=1}^{n} \varepsilon_t^2 / \sigma^2 \tag{14.65}
$$
for observations $y_t$ from the vector $y = (y_1, \dots, y_n)'$. Taking the partial derivative
with respect to $\sigma^2$ and setting it to zero, one obtains the maximum likelihood estimate
of the innovation variance $\sigma^2$ as
$$
\hat{\sigma}^2 = \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t^2. \tag{14.66}
$$
After substituting (14.66) into (14.65) one obtains the concentrated log likelihood.
Hence, maximum likelihood estimates of parameters $\theta = (\alpha, \gamma, \delta, \varphi)'$ can be obtained
by minimizing (twice) the negative log likelihood function, i.e.,
$$
\min_{\theta} \left\{ n \ln\left( \sum_{t=1}^{n} \varepsilon_t^2 \right) + 2 \sum_{t=1}^{n} \ln |r(x_{t-1})| \right\}, \tag{14.67}
$$
where $\{x_{t-1}\}$ and {εt} are calculated recursively using initial state values $x_0$ and
observations $y_t$ from the vector $y = (y_1, \dots, y_n)'$. This estimation method can be
complemented by information criteria (e.g., AIC; see Sect. 6.3.1) to identify (select)
correct state space models.
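The estimation principle can be sketched for the simplest model ETS(A, N, N), where r(x) = 1 and the only parameter is α. The function names, the fixed start value L0, and the grid search are assumptions made for this illustration; in practice a numerical optimizer and an estimated initial state would typically be used.

```python
import math


def ses_errors(y, alpha, L0):
    """One-step errors of ETS(A, N, N): eps_t = y_t - L_{t-1}, L_t = L_{t-1} + alpha*eps_t."""
    L, errs = L0, []
    for yt in y:
        e = yt - L
        errs.append(e)
        L = L + alpha * e
    return errs


def concentrated_criterion(errs, r_values=None):
    """Criterion (14.67); r_values holds r(x_{t-1}), omitted when r = 1."""
    n = len(errs)
    crit = n * math.log(sum(e * e for e in errs))
    if r_values is not None:
        crit += 2.0 * sum(math.log(abs(r)) for r in r_values)
    return crit


def log_likelihood(errs, sigma2, r_values=None):
    """Log likelihood (14.65) for given errors and innovation variance."""
    n = len(errs)
    ll = -0.5 * n * math.log(2.0 * math.pi * sigma2)
    if r_values is not None:
        ll -= sum(math.log(abs(r)) for r in r_values)
    return ll - 0.5 * sum(e * e for e in errs) / sigma2


def fit_alpha(y, L0, grid):
    """Grid value of alpha minimizing the concentrated criterion (14.67)."""
    return min(grid, key=lambda a: concentrated_criterion(ses_errors(y, a, L0)))
```

Substituting (14.66) into (14.65) shows that minus twice the log likelihood differs from the criterion (14.67) only by the constant n(ln(2π/n) + 1), which does not depend on θ. This is why minimizing (14.67) yields the maximum likelihood estimates.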
Remark 14.3 One can generalize the given approach also for nonlinear state space
models, e.g., for time series with conditional heteroscedasticity using the model
$$
L_t = \mu + L_{t-1} + \alpha \varepsilon_t, \tag{14.68}
$$
$$
\ln h_{t+1} = \nu_0 + \nu_1 \ln h_t + \nu_2 \ln |\varepsilon_t|, \tag{14.69}
$$
$$
\ln y_t = L_{t-1} + \varepsilon_t, \tag{14.70}
$$
where $\varepsilon_t \sim N(0, h_t)$ (sometimes the last term in (14.69) is supplemented by a further
positive parameter $\nu_3$ to the form $\nu_2 \ln(|\varepsilon_t| + \nu_3)$ to reduce the problem of small
residuals as arguments of the logarithmic function).
⋄
Example 14.6 Hyndman et al. (2008) estimated the model (14.68)–(14.70) for
monthly closing prices of the Dow Jones Index (DJI) over the period January
1990–March 2007 as
$$
\begin{aligned}
L_t &= 0.0074 + L_{t-1} + 0.960\, \varepsilon_t, \\
\ln h_{t+1} &= 0.043 + 0.932 \ln h_t + 0.125 \ln |\varepsilon_t|, \\
\ln y_t &= L_{t-1} + \varepsilon_t.
\end{aligned}
$$
⋄
14.3 Exercises
Exercise 14.1 Derive the recursive relations of exponential smoothing for particular
state space models in Tables 14.3, 14.4, and 14.5 (hint: e.g., for ETS(A,N,N) in the
first row of Table 14.3, using the model $L_t = L_{t-1} + \alpha \varepsilon_t$ and $y_t = L_{t-1} + \varepsilon_t$, one gets
$L_t = L_{t-1} + \alpha (y_t - L_{t-1}) = \alpha y_t + (1 - \alpha) L_{t-1}$).
Exercise 14.2 Derive in detail the minimized expression in (14.67) when constructing
the maximum likelihood parameter estimates of state space models of exponential
smoothing.
References
Abraham, B., Ledolter, J.: Statistical Methods for Forecasting. Wiley, New York (1983)
Acerbi, C.: Spectral measures of risk: a coherent representation of subjective risk aversion. J. Bank.
Financ. 26, 1505–1518 (2002)
Adrian, T., Brunnermeier, M.K.: CoVaR. Federal Reserve Bank of New York, Staff Report
no. 348, September 2008
Ait-Sahalia, Y.: Testing continuous-time models for the spot interest rate. Rev. Financ. Stud. 9,
385–426 (1996)
Ait-Sahalia, Y., Jacod, J.: High-Frequency Financial Econometrics. Princeton University Press,
Princeton (2014)
Alexander, C.O., Chibumba, A.M.: Multivariate Orthogonal Factor GARCH. University of Sussex
Discussion Papers in Mathematics (1997)
Almon, S.: The distributed lag between capital appropriations and expenditures. Econometrica. 33,
178–196 (1965)
Al-Osh, M.A., Alzaid, A.A.: First-order integer-valued autoregressive (INAR(1)) process. J. Time
Ser. Anal. 8, 261–275 (1987)
Artzner, P., Delbaen, F., Eber, J.-M., Heath, D.: Coherent measures of risk. Math. Financ. 9,
203–228 (1999)
Bauwens, L., Laurent, S., Rombouts, J.: Multivariate GARCH models: a survey. J. Appl. Econ. 21,
79–109 (2006)
Baxter, M., Rennie, A.: Financial Calculus. An Introduction to Derivative Pricing. Cambridge
University Press, Cambridge (1996)
Bollerslev, T.: Generalized autoregressive conditional heteroscedasticity. J. Econ. 31, 307–327
(1986)
Bollerslev, T.: Modeling the coherence in short-run nominal exchange rates: a multivariate
generalized ARCH model. Rev. Econ. Stat. 72, 498–505 (1990)
Bollerslev, T., Wooldridge, J.M.: Quasi-maximum likelihood estimation and inference in dynamic
models with time varying covariances. Econ. Rev. 11, 143–172 (1992)
Bollerslev, T., Engle, R.F., Wooldridge, J.M.: A capital-asset pricing model with time-varying
covariances. J. Polit. Econ. 96, 116–131 (1988)
Bølviken, E.: New tests of significance in periodogram analysis. Scand. J. Statist. 10, 1–10 (1983)
Bowerman, B.L., O’Connell, R.T.: Time Series Forecasting. Duxbury Press, Boston (1987)
Box, G.E.P., Jenkins, G.M.: Time Series Analysis, Forecasting and Control. Holden-Day, San
Francisco (1970)
Box, G.E.P., Tiao, G.C.: Intervention analysis with applications to economic and environmental
problems. J. Am. Stat. Assoc. 70, 70–79 (1975)

https://doi.org/10.1007/978-3-030-46347-2
Brock, W.A., Dechert, D., Scheinkman, H., LeBaron, B.: A test for independence based on the
correlation dimension. Econ. Rev. 15, 197–235 (1996)
Brockwell, P.J.: Lévy-driven continuous-time ARMA processes. In: Andersen, T.G., et al. (eds.)
Handbook of Financial Time Series, Part III: Topics in Continuous Time Processes. Springer,
Berlin (2009)
Brockwell, P.J., Davis, R.A.: Time Series: Theory and Methods. Springer, New York (1993)
Brockwell, P.J., Davis, R.A.: Introduction to Time Series and Forecasting. Springer, New York
(1996)
Brownlees, C.T., Engle, R.: Volatility, correlation and tails for systemic risk measurement. Stern
Center for Research Computing, New York University, New York (2012)
Campbell, J.Y., Lo, A.W., MacKinlay, A.C.: The Econometrics of Financial Markets. Princeton
University Press, Princeton (1997)
Chan, K.S., Tong, H.: On estimating thresholds in autoregressive models. J. Time Ser. Anal. 7,
179–190 (1986)
Chappel, D., Padmore, J., Mistry, P., Ellis, C.: A threshold model for French franc/Deutsch mark
exchange rate. J. Forecast. 15, 155–164 (1996)
Chen, R., Tsay, R.S.: Functional-coefficients autoregressive models. J. Am. Stat. Assoc. 88,
298–308 (1993)
Chiriac, R., Voev, V.: Modeling and forecasting multivariate realized volatility. J. Appl. Econ. 26,
922–947 (2011)
Cipra, T.: Financial and Insurance Formulas. Springer, New York (2010)
Cipra, T., Hendrych, R.: Systemic risk in financial risk regulation. Czech J. Econ. Financ. 67, 15–38
(2017)
Cipra, T., Hendrych, R.: Modeling of currency covolatilities. Statistika. 99(3), 259–271 (2019)
Clements, M., Harvey, D.: Forecast combination and encompassing. In: Mills, T., Patterson,
K. (eds.) The Palgrave Handbook of Econometrics, Applied Econometrics, vol. 2. Palgrave,
Oxford (2011)
Clements, A., Scott, A., Silvennoinen, A.: Forecasting Multivariate Volatility in Larger
Dimensions: Some Practical Issues. Working Paper #80, NCER Working Paper Series (2012)
Conley, T.G., Hansen, L.P., Luttmer, E.G.J., Scheinkman, J.A.: Short-term interest rates as
subordinate diffusions. Rev. Financ. Stud. 10, 525–577 (1997)
Dagum, E.B., Bianconcini, S.: Seasonal Adjustment Methods and Real Time Trend-Cycle
Estimation. Springer, New York (2016)
Davidson, J.: Econometric Theory. Blackwell, Oxford (2000)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM
algorithm. J. R. Stat. Soc. 39, 1–38 (1977)
Dickey, D.A., Fuller, W.A.: Distribution of estimators for time series regressions with a unit
root. J. Am. Stat. Assoc. 74, 427–431 (1979)
Dickey, D.A., Fuller, W.A.: Likelihood ratio statistics for autoregressive time series with a unit root.
Econometrica. 49, 1057–1072 (1981)
Ding, Z., Granger, C.W.J., Engle, R.F.: A long memory property of stock market returns and a new
model. J. Empir. Financ. 1, 83–106 (1993)
Duffie, D.: Security Markets: Stochastic Models. Academic, New York (1988)
Dupačová, J., Hurt, J., Štěpán, J.: Stochastic Modeling in Economics and Finance. Kluwer, Boston
(2002)
Durbin, J., Koopman, S.J.: Time Series Analysis by State Space Methods. Oxford University Press,
Oxford (2012)
Eilers, P.H.C., Marx, B.D.: Splines, knots and penalties. Wiley Interdiscip. Rev.: Comput. Stat. 2
(6), 637–653 (2010)
Elerian, O., Chib, S., Shephard, N.: Likelihood inference for discretely observed non-linear
diffusions. Econometrica. 69, 959–993 (2001)
Elliot, R.J., Kopp, P.E.: Mathematics of Financial Markets. Springer, New York (2004)
Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events. Springer, Berlin (1997)
Enders, W.: Applied Econometric Time Series. Wiley, New York (1995)
Engle, R.F.: Autoregressive conditional heteroscedasticity with the estimates of the variance of
United Kingdom inflations. Econometrica. 50, 987–1007 (1982)
Engle, R.F.: Dynamic conditional correlation—a simple class of multivariate GARCH
models. J. Bus. Econ. Stat. 20, 339–350 (2002)
Engle, R.F.: Anticipating Correlation. A New Paradigm for Risk Management. Theory and Practice.
Princeton University Press, Princeton (2009)
Engle, R.F., Granger, C.W.J.: Co-integration, and error correction: representation, estimation and
testing. Econometrica. 55, 251–276 (1987)
Engle, R.F., Kroner, K.F.: Multivariate simultaneous generalized GARCH. Econ. Theory. 11,
122–150 (1995)
Engle, R.F., Ng, V.K.: Measuring and testing the impact of news on volatility. J. Financ. 48,
1749–1778 (1993)
Engle, R.F., Russell, J.R.: Autoregressive conditional duration: a new model for irregularly spaced
transaction data. Econometrica. 66, 1127–1162 (1998)
Engle, R.F., Yoo, B.S.: Forecasting and testing in cointegrated systems. J. Econ. 35, 143–159
(1987)
Engle, R.F., Lilien, D.M., Robins, R.P.: Estimating time varying risk premia in the term structure:
the ARCH-M model. Econometrica. 55, 391–407 (1987)
Engle, R.F., Ng, V.K., Rothschild, M.: Asset pricing with a factor ARCH covariance structure:
empirical estimates for treasury bills. J. Econ. 45, 213–238 (1990)
Eraker, B.: MCMC analysis of diffusion models with applications to finance. J. Bus. Econ. Stat. 19,
177–191 (2001)
EViews 10. IHS Global Inc., Englewood (2018)
Fan, J., Yao, Q.: Nonlinear Time Series: Nonparametric and Parametric Methods. Springer,
New York (2005)
Fleming, J., Kirby, C., Ostdiek, B.: The economic value of volatility timing using “realized”
volatility. J. Financ. Econ. 67, 473–509 (2003)
Francq, C., Zakoian, J.-M.: GARCH Models. Wiley, Chichester (2010)
Franke, J., Härdle, W., Hafner, C.M.: Statistics of Financial Markets. Springer, New York (2004)
Franses, P.H., van Dijk, D.: Non-Linear Time Series Models in Empirical Finance. Cambridge
Fuller, W.A.: Introduction to Statistical Time Series. Wiley, New York (1976)
Gallant, A.R., Long, J.R.: Estimating stochastic diffusion equations efficiently by minimum
chi-squared. Biometrika. 84, 125–141 (1997)
Glosten, L.R., Jagannathan, R., Runkle, D.E.: On the relation between the expected value and the
volatility of the nominal excess return on stocks. J. Financ. 48, 1779–1801 (1993)
Gómez, V.: Multivariate Time Series with Linear State Space Structure. Springer, New York (2016)
Gourieroux, C.: ARCH Models and Financial Applications. Springer, New York (1997)
Gourieroux, C., Jasiak, J.: Financial Econometrics: Problems, Models, and Methods. Princeton
University Press, Princeton (2001)
Granger, C.W.J.: Investigating causal relations by econometric models and cross-spectral methods.
Econometrica. 37, 424–438 (1969)
Granger, C.W.J.: Long memory relationships and the aggregation of dynamic models. J. Econ. 14,
227–238 (1980)
Granger, C.W.J., Andersen, A.P.: An Introduction to Bilinear Time Series Models. Vandenhoek
and Ruprecht, Gottingen (1978)
Granger, C.W.J., Newbold, P.: Forecasting Economic Time Series. Academic, San Diego (1986)
Greene, W.H.: Econometric Analysis. Prentice Hall, New York (2012)
Haggan, V., Ozaki, T.: Modelling nonlinear vibrations using an amplitude-dependent
autoregressive time series models. Biometrika. 68, 189–196 (1981)
Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and business
cycle. Econometrica. 57, 357–384 (1989)
Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton (1994)
Härdle, W.: Applied Nonparametric Regression. Cambridge University Press, New York (1990)
Harvey, A.C.: Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge
Hatanaka, M.: Time-Series-Based Econometrics. Oxford University Press, Oxford (1996)
Hautsch, N.: Econometrics of Financial High-Frequency Data. Springer, Heidelberg (2012)
Heij, C., de Boer, P., Franses, P.H., Kloek, T., van Dijk, H.K.: Econometric Methods with
Applications in Business and Economics. Oxford University Press, Oxford (2004)
Hendry, D.F.: Dynamic Econometrics. Oxford University Press, Oxford (1995)
Hendrych, R., Cipra, T.: On conditional covariance modelling: an approach using state space
models. Comput. Stat. Data Anal. 100, 304–317 (2016)
Hendrych, R., Cipra, T.: Systemic risk in financial risk regulation. Czech J. Econ. Financ. 67, 15–38
(2017)
Hendrych, R., Cipra, T.: Self-weighted recursive estimation of GARCH models. Commun. Stat.
Simul. Comput. 47, 315–328 (2018)
Hull, J.: Options, Futures, and Other Derivative Securities. Prentice Hall, Englewood Cliffs (1993)
Hurst, H.: Long term storage capacity of reservoirs. Trans. Am. Soc. Civil Eng. 116, 770–799
(1951)
Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D.: Forecasting with Exponential Smoothing.
Springer, Berlin (2008)
Jacobs, P., Lewis, P.: Stationary discrete autoregressive-moving average time series generated by
mixtures. J. Time Ser. Anal. 4, 19–36 (1983)
Jeantheau, T.: Strong consistency of estimators for multivariate ARCH models. Econ. Theory. 14,
70–86 (1998)
Johansen, S.: Estimation and hypothesis testing of cointegration vectors in Gaussian vector
autoregressive models. Econometrica. 59, 1551–1580 (1991)
Johansen, S.: Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford
University Press, Oxford (1995)
Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Springer, New York (1988)
Kedem, B.: Binary Time Series. Marcel Dekker, New York (1980)
Kedem, B., Fokianos, K.: Regression Models for Time Series Analysis. Wiley, Hoboken (2002)
Kendall, M.: Time-Series. Griffin, London (1976)
Kessler, M.: Estimation of an ergodic diffusion from discrete observations. Scand. J. Stat. 24, 1–19
(1997)
Koopmans, L.H.: The Spectral Analysis of Time Series. Academic, San Diego (1995)
Kroner, K.F., Ng, V.K.: Modelling asymmetric co-movements of asset returns. Rev. Financ. Stud.
11, 817–844 (1998)
Kwiatkowski, D., Phillips, P.C.B., Schmidt, P., Shin, Y.: Testing the null hypothesis of stationarity
against the alternative of a unit root. J. Econ. 54, 159–178 (1992)
Kwok, Y.-K.: Mathematical Models of Financial Derivatives. Springer, New York (1998)
Lanne, M., Saikkonen, P.: A multivariate generalized orthogonal factor GARCH model. J. Bus.
Econ. Stat. 25, 61–75 (2007)
Lim, K.G.: Financial Valuation and Econometrics. World Scientific, Singapore (2011)
Lin, W.L.: Alternative estimators for factor GARCH models – a Monte Carlo comparison. J. Appl.
Econ. 7, 259–279 (1992)
Ljung, L.: System Identification: Theory for the User. Prentice Hall PTR, Upper Saddle River
(1999)
Ljung, L., Söderström, T.: Theory and Practice of Recursive Identification. MIT Press, Cambridge
(1983)
Lo, A.W.: Maximum likelihood estimation of generalized Ito’s processes with discretely sampled
data. Econ. Theory. 4, 231–247 (1988)
Lütkepohl, H.: New Introduction to Multiple Time Series Analysis. Springer, Berlin (2005)
MacDonald, I., Zucchini, W.: Hidden Markov and Other Models for Discrete-Valued Time Series.
Chapman and Hall, London (1997)
MacKinnon, J.G.: Numerical distribution functions for unit root and cointegration tests. J. Appl.
Econ. 11, 601–618 (1996)
MacKinnon, J.G., Haug, A.A., Michelis, L.: Numerical distribution functions of likelihood ratio
tests for cointegration. J. Appl. Econ. 14, 563–577 (1999)
Makridakis, S.: Accuracy measures: theoretical and practical concerns. Int. J. Forecast. 9, 527–529
(1993)
Malliaris, A.G., Brock, W.A.: Stochastic Methods in Economics and Finance. North-Holland,
Amsterdam (1982)
McKenzie, E.: Some ARMA models for dependent sequences of Poisson counts. Adv. Appl.
Probab. 20, 822–835 (1988)
McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management. Princeton University Press,
Princeton (2005)
Merigó, J.M., Yager, R.R.: Generalized moving averages, distance measures and OWA operators.
Int. J. Uncertainty Fuzziness Knowledge-Based Syst. 21, 533–559 (2013)
Mills, T.C.: The Econometric Modelling of Financial Time Series. Cambridge University Press,
Cambridge (1993)
Montgomery, D.C., Johnson, L.A.: Forecasting and Time Series Analysis. McGraw-Hill,
New York (1976)
Musiela, M., Rutkowski, M.: Martingale Methods in Financial Modelling. Springer, New York
(2004)
Neftci, S.N.: Mathematics of Financial Derivatives. Academic, New York (2000)
Nelson, D.B.: Conditional heteroskedasticity in asset returns: a new approach. Econometrica. 59,
347–370 (1991)
Newey, W.K., West, K.D.: A simple, positive semi-definite, heteroskedasticity and auto-correlation
consistent covariance matrix. Econometrica. 55, 703–708 (1987)
Nicholls, D.F., Quinn, B.G.: Random Coefficient Autoregressive Models: An Introduction, Lecture
Notes in Statistics 11. Springer, New York (1982)
Parkinson, M.: The extreme value method for estimating the variance of the rate of return. J. Bus.
53, 61–65 (1980)
Phillips, P.C.B., Perron, P.: Testing for a unit root in time series regression. Biometrika. 75,
335–346 (1988)
Poon, S.-H.: A Practical Guide to Forecasting Financial Market Volatility. Wiley, Chichester (2005)
Priestley, M.B.: Non-Linear and Non-Stationary Time Series Analysis. Academic, London (1988)
Priestley, M.B.: Spectral Analysis and Time Series (Volume 1 and 2). Academic, London (2001)
Rachev, S.T., Mittnik, S., Fabozzi, F.J., Focardi, S.M., Jašić, T.: Financial Econometrics: from
Basics to Advanced Modeling Techniques. Wiley, Chichester (2007)
Ramsey, J.B.: Tests for specification errors in classical linear least-squares regression analysis. J. R.
Stat. Soc. B. 31, 350–371 (1969)
Ripley, B.D.: Statistical aspects of neural network. In: Barndorff-Nielsen, O.B., et al. (eds.)
Networks and Chaos-Statistical and Probability Aspects. Chapman and Hall, London (1993)
Risk Metrics-Technical Document. J. P. Morgan/Reuters, New York. www.riskmetrics.com
(1996)
Ruppert, D.: Statistics and Finance. Springer, New York (2004)
Schlicht, E.: A seasonal adjustment principle and a seasonal adjustment method derived from this
principle. J. Am. Statist. Assoc. 76, 374–378 (1982)
Sentana, E.: Quadratic ARCH models. Rev. Econ. Stud. 62, 639–661 (1995)
Siegel, A.F.: Testing for periodicity in a time series. J. Am. Stat. Assoc. 75, 345–348 (1980)
Silvennoinen, A., Teräsvirta, T.: Multivariate GARCH models. In: Andersen, T.G., Davis, R.A.,
Kreiss, J.-P., Mikosch, T. (eds.) Handbook of Financial Time Series. Springer, New York
(2009)
Sims, C.A.: Money, income, and causality. Am. Econ. Rev. 62, 540–552 (1972)
396 References
Söderström, T., Stoica, P.: System Identification. Prentice Hall, New York (1989)
Steele, J.M.: Stochastic Calculus and Financial Applications. Springer, New York (2001)
Taylor, S.: Modelling Financial Time Series. Wiley, New York (1986)
Taylor, S.J.: Modelling stochastic volatility. Math. Financ. 4, 183–204 (1994)
Theil, H.: Applied Economic Forecasting. North-Holland, Amsterdam (1966)
Timmermann, A.: Forecast combinations. In: Handbook of Economic Forecasting, vol. 1, pp.
135–196. Elsevier (2006)
Tjøstheim, D.: Some doubly stochastic time series models. J. Time Ser. Anal. 7, 51–72 (1986)
Tong, H.: Threshold Models in Non-Linear Time Series Analysis. Springer, New York (1983)
Tong, H.: Non-Linear Time Series: A Dynamical Systems Approach. Oxford University Press,
Oxford (1990)
Tsay, R.S.: Analysis of Financial Time Series. Wiley, New York (2002)
Tse, Y.K., Tsui, A.K.C.: A multivariate GARCH model with time-varying correlations. J. Bus.
Econ. Stat. 20, 351–362 (2002)
van der Weide, R.: GO-GARCH: a multivariate generalized orthogonal GARCH model. J. Appl.
Econ. 17, 549–564 (2002)
Vasicek, O.: An equilibrium characterization of the term structure. J. Financ. Econ. 5, 177–188
(1977)
Vrontos, I.D., Dellaportas, P., Politis, D.N.: A full-factor multivariate GARCH model. Econ. J. 6,
311–333 (2003)
Wang, S.S.: A class of distortion operators for pricing financial and insurance risks. J. Risk Insur.
67, 15–36 (2000)
Wang, P.: Financial Econometrics: Methods and Models. Routledge, London (2003)
Wecker, W.E.: Asymmetric time series. J. Am. Stat. Assoc. 76, 16–21 (1981)
Wei, W.W.S.: Time Series Analysis: Univariate and Multivariate Methods. Addison-Wesley,
Boston (1994)
Weiss, C.H.: An Introduction to Discrete-Valued Time Series. Wiley, Chichester (2018)
Wilmott, P.: Quantitative Finance. Wiley, Chichester (2000)
Wu, C.F.J.: On the convergence of the EM algorithm. Ann. Stat. 11, 95–103 (1983)
Wu, W.B.: Nonlinear system theory: another look at dependence. Proc. Natl. Acad. Sci. 102(40),
14150–14156 (2005)
Yamai, Y., Yoshiba, T.: Value-at-risk versus expected shortfall: a practical perspective. J. Bank.
Financ. 29, 997–1015 (2005)
Zakoian, J.M.: Threshold heteroskedastic models. J. Econ. Dyn. Control. 18, 931–934 (1994)
https://doi.org/10.1007/978-3-030-46347-2
economies
Review
The ARDL Method in the Energy-Growth Nexus
Field; Best Implementation Strategies
Angeliki N. Menegaki
Department of Economics & Management of Tourist and Culture Units, Agricultural University of Athens,
33100 Amfissa, Greece; amenegaki@aua.gr

Received: 7 August 2019; Accepted: 14 October 2019; Published: 18 October 2019
Abstract: A vast number of energy-growth nexus studies, as well as other “X-variable-growth
nexus” studies, such as those on the tourism-growth nexus, the environment-growth nexus, or
the food-growth nexus, have used the autoregressive distributed lag (ARDL) bounds test
approach for cointegration testing. These papers rarely describe all the steps of the ARDL
procedure in detail, and thus leave other researchers confused about the sequence of steps that
must be followed and about the best implementation paradigms that leave no aspect obscure.
This paper is a comprehensive review that suggests the steps that need to be taken before the ARDL
procedure takes place, as well as the steps that should be taken afterward with respect to causality
investigation and robustness analysis.
Keywords: ARDL bounds test; energy-growth nexus; “X-variable-growth nexus” review
1. Introduction
Since the seminal work by Kraft and Kraft (1978) on the energy-growth nexus, various cointegration
and causality methods have been used in this field and in the “X-variable-growth nexus” framework
in general. The most common of them have been the residual-based method of Engle and Granger (1987),
the modified ordinary least squares procedure of Phillips and Hansen (1990), and the maximum
likelihood method of Johansen (1988) and Johansen and Juselius (1990).
However, some years later, it was realized that these methods may not be appropriate for small
samples (Narayan and Smyth 2005). Moreover, studies that preceded the establishment of the ARDL
method, and this was very much the case in the energy-growth nexus, relied on cross-sectional
analysis through their panel data configuration. This meant that the countries included in those
samples were not homogeneous enough with respect to their level of economic development
(Odhiambo 2009). Unless results were country-specific, these studies were of little use for
policy-making. This generated the need for more sophisticated cointegration and causality methods.
The econometric methods employed in the older energy-growth nexus literature have shed light on
other fields, such as the tourism-growth nexus, which this paper, for simplicity, terms the
“X-variable-growth nexus.”
The autoregressive distributed lag (ARDL) method, or bounds test, was introduced by
Pesaran and Shin (1999) and further developed by Pesaran et al. (2001). It is acknowledged
as one of the most flexible methods in the econometric analysis of the energy-growth nexus, particularly
when the research framework is shaped by regime shifts and shocks, which change the pattern of
energy consumption or the evolution of covariates in energy-growth models. Moreover, the ARDL
method tolerates different lag lengths for different variables, which makes it very attractive
and versatile.
Economies 2019, 7, 105; doi:10.3390/economies7040105 www.mdpi.com/journal/economies

The ability to accommodate sufficient lags allows the data-generating process to be captured as
well as possible. In practice, the method can be applied irrespective of whether the time series
are I(0), namely stationary in levels, I(1), namely stationary in first differences, or fractionally
integrated (Pesaran et al. 2001). Nevertheless, within the ARDL framework, the series should not be
I(2), because this order of integration invalidates the F-statistics and all the critical values
established by Pesaran et al. (2001), which have been calculated for series that are I(0) and/or I(1).
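The requirement that no series is I(2) can be checked before the bounds test is run. The following is a minimal sketch, not the procedure prescribed by the paper: it uses a plain (unaugmented) Dickey–Fuller regression with a constant and compares the t-statistic with the asymptotic 5% critical value of about -2.86. The function names `df_tstat` and `integration_order` and the simulated series are assumptions made for illustration; applied work would normally rely on an augmented test from an econometrics package.

```python
import math
import random

# Asymptotic 5% Dickey-Fuller critical value (constant, no trend); assumed here.
DF_CRIT_5PCT = -2.86


def df_tstat(y):
    """t-statistic of rho in dy_t = alpha + rho * y_{t-1} + e_t (no lag augmentation)."""
    x = y[:-1]                                      # lagged levels
    z = [b - a for a, b in zip(y[:-1], y[1:])]      # first differences
    n = len(z)
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    rho = sxz / sxx
    alpha = zbar - rho * xbar
    ssr = sum((zi - alpha - rho * xi) ** 2 for xi, zi in zip(x, z))
    se = math.sqrt(ssr / (n - 2) / sxx)             # standard error of rho
    return rho / se


def integration_order(y, max_d=2):
    """Difference the series until the unit root is rejected; None if never rejected."""
    series = list(y)
    for d in range(max_d + 1):
        if df_tstat(series) < DF_CRIT_5PCT:
            return d
        series = [b - a for a, b in zip(series[:-1], series[1:])]
    return None


# Hypothetical data: a stationary AR(1) series and a random walk.
random.seed(0)
shocks = [random.gauss(0, 1) for _ in range(500)]
ar1, walk = [0.0], [0.0]
for s in shocks:
    ar1.append(0.5 * ar1[-1] + s)
    walk.append(walk[-1] + s)

for name, series in (("AR(1)", ar1), ("random walk", walk)):
    d = integration_order(series)
    usable = d is not None and d <= 1               # the bounds test needs I(0) or I(1)
    print(name, "estimated order:", d, "usable for ARDL:", usable)
```

A series for which the estimated order is 2, or for which the unit root is never rejected, would be excluded or transformed before the ARDL bounds test is applied.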
Furthermore, the ARDL method provides unbiased estimates and valid t-statistics, irrespective
of the endogeneity of some regressors (Harris and Sollis 2003; Jalil and Ma 2008). In fact,
appropriate lag selection eliminates residual correlation and thus also mitigates the endogeneity
problem (Ali et al. 2016). Short-run adjustments can be integrated with the long-run equilibrium
through the error correction mechanism (ECM), which is obtained through a linear transformation
without sacrificing information about the long-run horizon (Ali et al. 2017). Another advantage is
that the method allows outliers to be corrected with impulse dummies (Marques et al. 2017, 2019),
and that the approach distinguishes between dependent and explanatory variables.
Last but not the least, the interpretation of the ARDL approach and its implementation is
quite straightforward (Rahman and Kashem 2017) and the ARDL framework requires a single form
equation (Bayer and Hanck 2013), while other procedures require a system of equations. The ARDL
approach is more reliable for small samples as compared to Johansen and Juselius’s cointegration
methodology (Haug 2002). Halicioglu (2007) also mentions two more advantages of the method, which
are: The simultaneous estimation of short- and long-run effects and the ability to test hypotheses on
the estimated coefficients in the long-run. This is not done in the Engle–Granger method.
This paper is organized as follows: Section 2 presents the methodology together with best
practice guidelines. Section 3 covers other versions of the ARDL approach and the
ARDL implementation strategies to follow in an energy-growth nexus paper, and Section 4 concludes
the paper.
2. The Methodology
For demonstration purposes, this paper assumes two series, Yt and Xt, but the reader can easily
generalize to more variables; the production function equation in the energy-growth nexus, for
instance, has more variables. In a bivariate energy-growth nexus model, Yt
stands for economic growth and Xt stands for energy consumption. It is also typical in the
energy-growth nexus to use logarithms of the variables so that the coefficients can be interpreted as elasticities.
The ARDL procedure investigates, in sequence: (i) stationarity, (ii) cointegration, and
(iii) causality. There are other ways to proceed to causality analysis without the first
two steps, but they belong to other methodological frameworks.
2.1. Stationarity
After a presentation of the descriptive statistics of the series (mean, median, minimum and maximum
values, skewness, kurtosis, standard deviation, the Jarque–Bera normality test, and pairwise
correlations), the first step in the ARDL analysis is the unit root analysis, which establishes the order of
integration of each variable. To satisfy the assumptions of the ARDL bounds test, each variable
must be I(0) or I(1); under no circumstances should it be I(2). De Vita et al. (2006) also noted that the
dependent variable should be I(1), although this is not widely claimed in the current literature. Unit root
analysis is performed with a long array of tests, for example the augmented Dickey–Fuller (ADF),
the Kwiatkowski–Phillips–Schmidt–Shin (KPSS), the Phillips–Perron (PP), the Ng–Perron, the
cross-sectionally augmented IPS (CIPS; Pesaran 2007), and the LS (Lee and Strazicich 2003) tests, among many others.
Each is suited to different data characteristics, but this paper does not discuss them for
brevity. However, it should be stressed that researchers should apply both traditional and
structural-break unit root tests to make sure that the variables are not I(2).
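As a rough illustration of what a unit root test computes, the following Python sketch runs the simplest Dickey–Fuller regression (constant, no lagged differences) on simulated series. The function name `dickey_fuller_t` and the simulated data are assumptions made for this example and do not come from the paper; applied work would rely on the full ADF, KPSS, or break-robust tests of an econometrics package, together with the appropriate Dickey–Fuller critical values.

```python
import random


def dickey_fuller_t(y):
    """t-statistic on rho in  dy_t = a + rho * y_{t-1} + e_t  (no lagged differences)."""
    x = y[:-1]                                        # lagged level y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # first difference
    n = len(dy)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - md) for xi, di in zip(x, dy))
    rho = sxy / sxx
    a = md - rho * mx
    ssr = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = (ssr / (n - 2) / sxx) ** 0.5                 # standard error of rho
    return rho / se


random.seed(0)
noise = [random.gauss(0, 1) for _ in range(200)]      # I(0) by construction
walk, level = [], 0.0
for e in noise:
    level += e
    walk.append(level)                                # I(1) by construction

t_noise = dickey_fuller_t(noise)
t_walk = dickey_fuller_t(walk)
t_walk_diff = dickey_fuller_t([walk[t] - walk[t - 1] for t in range(1, len(walk))])
```

A strongly negative statistic for the level series points to I(0). A statistic close to zero for the levels combined with a strongly negative one for the first differences points to I(1), which is the pattern the bounds test can accommodate.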
2.2. Cointegration
The core models in the ARDL bounds test framework are the following unrestricted error
correction models:
$$\Delta LY_t = a_0 + a_1 t + \sum_{i=1}^{m} a_{2i}\,\Delta LY_{t-i} + \sum_{i=0}^{n} a_{3i}\,\Delta LX_{t-i} + a_4 LY_{t-1} + a_5 LX_{t-1} + \mu_{1t} \tag{1}$$
$$\Delta LX_t = \beta_0 + \beta_1 t + \sum_{i=1}^{m} \beta_{2i}\,\Delta LX_{t-i} + \sum_{i=0}^{n} \beta_{3i}\,\Delta LY_{t-i} + \beta_4 LX_{t-1} + \beta_5 LY_{t-1} + \mu_{2t} \tag{2}$$
Here $\Delta$ is the first difference operator and $\mu_{1t}$, $\mu_{2t}$ are the error terms, which must be white
noise; in other words, the residuals are supposed to be well behaved (serially independent,
homoskedastic, and normally distributed). All $a$ and $\beta$ coefficients are non-zero, with $a_4$ and $\beta_4$ also being
negative (they represent the speed of adjustment). The parameters $a_{2i}$ and $a_{3i}$ are the short-run
dynamic coefficients, while $a_4$ and $a_5$ are the long-run coefficients in the energy-growth nexus relationship.
The terms $a_0$ and $\beta_0$ are drift components. The type of explanatory variables that
must be incorporated in the energy-growth relationship is discussed in detail by Inglesi-Lotz (2018) in a
chapter written specifically on this topic, which the interested reader is advised to consult. Generally, one
decides first on the framework, namely whether it is a production function
approach, a demand function approach, or another such as the Kuznets curve hypothesis, and then
on the variables and other components. Other deterministic components are included on a trial
and error basis and to corroborate further the stability of an estimated relationship.
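The role of a4 and a5 as long-run coefficients can be made concrete with a small sketch. In a steady state all differenced terms in Equation (1) are zero, so (ignoring the trend) a0 + a4 LY + a5 LX = 0, which gives a long-run elasticity of LY with respect to LX of −a5/a4. The Python function below, whose name and example numbers are hypothetical, simply evaluates this ratio.

```python
def long_run_elasticity(a4, a5):
    """Long-run elasticity implied by Equation (1): -a5 / a4.

    a4 is the coefficient on LY_{t-1} (speed of adjustment, expected negative),
    a5 is the coefficient on LX_{t-1}.
    """
    if a4 >= 0:
        raise ValueError("a4 must be negative for the model to converge")
    return -a5 / a4


# Hypothetical estimates, for illustration only.
elasticity = long_run_elasticity(a4=-0.5, a5=0.3)
```

With these made-up values, a 1% rise in energy consumption would be associated with a 0.6% rise in output in the long run.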
Overall, we observe in Equations (1) and (2) that each variable is represented as dependent on the
past values of itself, the past values of the other variable(s), the past differenced values
of itself, and the differenced values of the other variable(s). Models (1) and (2) can be
formulated with an intercept, a trend, or both; Equations (1) and (2) contain both.
Halicioglu (2007) claims that it is possible to end up with two models, one with a trend and one
without. Bahmani-Oskooee and Goswami (2003) describe a method according to which
one ends up with a single long-run relationship through consecutive elimination of the remaining
relationships. The first stage of the ARDL estimation produces (p + 1)^k regressions in order
to obtain the optimal lag length for each variable, where p is the maximum number of lags and k
is the number of variables in the equation. In our simple example, there is only one Xt variable.
The ARDL bounds cointegration test is carried out within the framework of Equations (1) and (2),
which are estimated with ordinary least squares (OLS).
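The count of first-stage regressions can be checked by enumerating the candidate lag orders directly. The sketch below is only an illustration: `lag_grid` is a hypothetical helper, and the choice of p = 4 and k = 2 is arbitrary.

```python
from itertools import product


def lag_grid(max_lag, n_vars):
    """All combinations of lag orders 0..max_lag for n_vars variables."""
    return list(product(range(max_lag + 1), repeat=n_vars))


# With p = 4 and k = 2 variables there are (4 + 1)^2 = 25 candidate models.
grid = lag_grid(max_lag=4, n_vars=2)
n_models = len(grid)
```

Each tuple in `grid` is one candidate specification; in practice each would be estimated by OLS and ranked with an information criterion such as the AIC or BIC.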
2.3. More on the ARDL Analysis

The ARDL analysis occurs as follows: If the existence of cointegration is confirmed in
Equations (1) and (2), then the long-run and the short-run models are estimated and both long
and short-run elasticities are derived, namely the ARDL equivalent of the UECM (Unrestricted error
correction model). Cointegration, in the ARDL bounds test approach, is examined under the following
hypothesis set up:
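$$H_0: a_4 = a_5 = 0$$
$$H_1: a_4 \neq 0 \ \text{or} \ a_5 \neq 0$$
where $a_4$ and $a_5$ are the coefficients on the lagged level terms in Equation (1); the analogous restriction for Equation (2) is $\beta_4 = \beta_5 = 0$.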
The setup of the hypotheses reads as follows: there is cointegration if the null hypothesis is
rejected. The F-statistics for testing are compared with the critical values developed by Pesaran et
al. (2001). Narayan critical values are more appropriate for small samples. Pesaran et al. (2001)
provide a table enumerated as CI and entitled: “Asymptotic critical value bounds for the F-statistic.
Testing for the existence of a levels relationship” in five versions. These are (i) no intercept and no
trend, (ii) restricted intercept and no trend, (iii) unrestricted intercept and no trend, (iv) unrestricted
intercept and restricted trend, (v) unrestricted intercept and unrestricted trend. They also provide a
table CII entitled “Asymptotic critical value bounds for the t-statistic. Testing for the existence of a levels
relationship” in three versions: (i) No intercept and no trend, (ii) unrestricted intercept and no trend, (iii)
unrestricted intercept and unrestricted trend. Next we reproduce a part of these tables (CI-iii and CI-v)
in order to explain how the decision for cointegration was made in Bölük and Mert (2015) based on
Pesaran tables. Note that Pesaran tables are not valid for I(2) variables (Ali et al. 2016). The interested
reader can find these tables in Pesaran et al. (2001).
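The way the two bounds are used can be summarized in a short decision rule: an F-statistic above the upper I(1) bound rejects the null of no levels relationship, one below the lower I(0) bound does not, and a value in between is inconclusive. The Python sketch below encodes this rule; the function name is hypothetical and the bounds passed to it are made-up placeholders, not values from the Pesaran et al. (2001) or Narayan tables, which must be looked up for the relevant case, k, and significance level.

```python
def bounds_decision(f_stat, lower_i0, upper_i1):
    """Classify a bounds-test F-statistic against the I(0) and I(1) critical bounds."""
    if lower_i0 > upper_i1:
        raise ValueError("the I(0) bound must not exceed the I(1) bound")
    if f_stat > upper_i1:
        return "cointegration"
    if f_stat < lower_i0:
        return "no cointegration"
    return "inconclusive"


# Placeholder bounds, for illustration only.
LOWER, UPPER = 4.0, 5.0
decisions = [bounds_decision(f, LOWER, UPPER) for f in (6.2, 4.5, 3.1)]
```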
Narayan and Smyth (2005), on the other hand, have estimated critical values for the bounds test for
four cases at three significance levels, for up to seven independent variables and up to eighty observations.
The four cases are: (i) Case II: restricted intercept and no trend,
(ii) Case III: unrestricted intercept and no trend, (iii) Case IV: unrestricted intercept and restricted
trend, (iv) Case V: unrestricted intercept and unrestricted trend. In the Narayan tables, k stands for the
number of regressors, n is the sample size, I(0) denotes stationarity in levels, and I(1) denotes stationarity in first differences.
The interested reader can find these tables in Narayan and Smyth (2005).
When cointegration is not confirmed, we can proceed with simple Granger causality (unrestricted
VAR), where the VAR equation should be specified on stationary data. There are various reasons why
cointegration may not be confirmed (e.g., no relationship between the examined variables, or omitted
variables). The Toda and Yamamoto (1995) test is a solution for Granger causality testing in this
case. After all, the absence of a long-run relationship in the data does not mean that
no short-run relationship exists either. Moreover, it needs to be remembered that the cointegration
equation provides the long-run elasticities, while short-run elasticities are given by the coefficients
of the first-differenced variables. When more than one short-run coefficient has been estimated
for a particular variable, these are added and their joint significance is tested with
a Wald test (Fuinhas and Marques 2012). However, if cointegration is confirmed (which is very
common when there is a known and established theoretical connection between the variables),
then we can proceed with the error correction mechanism (ECM). Evidence of
cointegration implies that there is a long-run relationship between the variables and that their connection is
not short-lived but more permanent, one that is restored after every
disturbance. As an alternative to the F-test described above, a Wald test can be used to
test the null hypothesis of no cointegration when there is more than one short-run coefficient of the
same variable (Tursoy and Faisal 2018).
2.4. Diagnostic Tests after Cointegration

To be trusted, a model must be robust. To support the robustness of an estimated model, one needs
to run various diagnostic tests. Typical diagnostic χ2 tests investigate the goodness of fit,
stability, parsimony, functional form, and good behavior of the model in general. The Breusch–Godfrey
serial correlation LM test, the Breusch–Pagan–Godfrey heteroskedasticity test or the White test, and
the Jarque–Bera test are some of the tests encountered in these applications. In addition, the
Ramsey RESET test is used for the functional form, and the variance inflation factor (VIF)
may be useful in cases where multicollinearity is suspected.
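To show what one of these diagnostics actually computes, the sketch below evaluates the Jarque–Bera statistic, JB = n/6 · (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis of the residuals. The function name and the toy residuals are assumptions for illustration; under the null of normality the statistic is asymptotically χ² with two degrees of freedom.

```python
def jarque_bera(residuals):
    """Jarque-Bera statistic from sample skewness and kurtosis of the residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n   # second central moment
    m3 = sum((e - mean) ** 3 for e in residuals) / n   # third central moment
    m4 = sum((e - mean) ** 4 for e in residuals) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Toy residuals: symmetric (skewness 0) with kurtosis 1.
jb = jarque_bera([-1.0, 1.0, -1.0, 1.0])
```

For these four toy values the statistic is 4/6 · (0 + 4/4) = 2/3, far too small a sample to be informative; the point is only the mechanics of the calculation.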
The Impulse Response Function (IRF), Shifts, and Dummies

Impulse response functions (IRFs) can be of use because they reveal the effect of a one standard deviation
shock on the dependent variable. IRFs are formed through the moving average (MA) representation of the vector
autoregressive (VAR) equation.1 Some energy-growth researchers use them as an indispensable tool
for valuable information in their ARDL models. The impulse response function mainly shows what
happens when the model is moved to one side of a dummy variable. For example, if the value
of 1 represents war time and the value of 0 represents peace time, then taking the ones and the zeros
separately yields one impulse response function for war time and one for peace time. Thus,
IRFs are also a useful tool to test the stability of a model across structural breaks. Various
hypotheses underlie the models after cointegration is confirmed. After the identification of the
long-run relationship in Equations (1) and (2), we can continue with the examination of short-run
and long-run Granger causality. Granger causality refers to a situation where the past can
be used to predict the future: if past values of Xt significantly contribute to forecasting future
values of Yt, then Xt is said to Granger-cause Yt. However, evidence of correlation is not necessarily
evidence of causality.
1 A VAR model is a generalization of univariate AR models to multiple time series. Within a VAR framework, each variable
is represented by an equation that explains its evolution based on its own lags and the lags of the other variables in the
multivariate framework. The k variables are measured over time t and modelled as linear functions of their
past values.
2.5. Combined Cointegration Methods for the Robustness of the ARDL Model
In the particular case of a unique order of integration, Bayer and Hanck (2013) have developed a
combined test that borrows elements from the previously developed cointegration tests of
Engle and Granger (1987), Johansen (1988), Boswijk (1994), and Banerjee et al.
(1998). The combined cointegration test applies Fisher’s formula to the p-values of these
individual tests:
$$\text{Engle and Granger} - \text{Johansen} = -2\left[\ln\left(P_{\text{Engle \& Granger}}\right) + \ln\left(P_{\text{Johansen}}\right)\right]$$
$$\text{Engle and Granger} - \text{Johansen} - \text{Boswijk} - \text{Banerjee et al.} = -2\left[\ln\left(P_{\text{Engle \& Granger}}\right) + \ln\left(P_{\text{Johansen}}\right) + \ln\left(P_{\text{Boswijk}}\right) + \ln\left(P_{\text{Banerjee}}\right)\right]$$
The null hypothesis of no cointegration is rejected if the Fisher statistic exceeds the
critical value produced by Bayer and Hanck (2013). The combined test balances the decisions
of the individual tests, each of which suffers from its own weaknesses.
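The Fisher-type combination is simple to compute once the individual p-values are available. The following Python sketch evaluates −2 Σ ln(p) for a list of p-values; the function name and the p-values are hypothetical, and the resulting statistic would still have to be compared with the critical values tabulated by Bayer and Hanck (2013), which are not reproduced here.

```python
from math import exp, log


def fisher_combined(p_values):
    """Fisher-type statistic -2 * sum(ln p) over the individual tests' p-values."""
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return -2.0 * sum(log(p) for p in p_values)


# Hypothetical p-values chosen so that the result is easy to verify by hand:
# -2 * (ln e^-1 + ln e^-2) = -2 * (-3) = 6.
stat_two_tests = fisher_combined([exp(-1), exp(-2)])

# Hypothetical p-values for the four-test version.
stat_four_tests = fisher_combined([0.04, 0.03, 0.10, 0.06])
```

Small individual p-values produce a large combined statistic, so evidence from several tests that each fall slightly short of significance can still reject the null jointly.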
2.6. Causality after the ARDL Bounds Test and the Importance of the Error Correction Term (ECT)
The investigation of causality is the third step in the energy-growth nexus analysis. The lagged
error correction term is derived from the cointegration equation. In this way, the long-run information that is
lost when the variables are differenced to achieve stationarity is re-introduced into the system
of causality equations. This is a necessary step when variables are cointegrated. Cointegration implies
that there must be causality in at least one direction; however, it does not reveal in which direction that
causality runs. Therefore, additional causality analysis is required. Before estimating
Equations (3) and (4) below, one needs to run another set of regressions in order to obtain the residuals
that will be inserted into Equations (3) and (4) as the ECT.
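The preliminary regression that produces the ECT can be sketched as follows, assuming a bivariate long-run equation LY = c + b·LX + e estimated by OLS. The helper name and the data are invented for illustration; the residuals are lagged by one period so that they line up with the first differences used in the causality equations.

```python
def ols_residuals(y, x):
    """Residuals from a simple OLS regression of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


# Made-up log series, for illustration only.
lx = [1.0, 1.2, 1.1, 1.5, 1.7, 1.6, 2.0]
ly = [2.1, 2.3, 2.4, 2.9, 3.2, 3.1, 3.9]

ect = ols_residuals(ly, lx)                          # deviations from the long-run relation
d_ly = [ly[t] - ly[t - 1] for t in range(1, len(ly))]
ect_lagged = ect[:-1]                                # ECT_{t-1}, aligned with d_ly[t]
```

Each element of `ect_lagged` is the disequilibrium of the previous period, which is the regressor that carries the long-run information into the differenced equations.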
There are many strategies to follow in examining causality and its direction. One such strategy
is the vector error correction model (VECM) approach, which is a restricted form of the unrestricted VAR
and is suitable when the variables are I(1). According to this model setup, each dependent
variable is explained by its own lagged values, the lagged values of the independent variables,
the error correction term, and the residual term. This is shown in the following set of equations:
$$\Delta LY_t = a_1 + \sum_{i=1}^{l} a_{11,i}\,\Delta LY_{t-i} + \sum_{j=0}^{m} a_{12,j}\,\Delta LX_{t-j} + n_1 ECT_{t-1} + \mu_{1t} \tag{3}$$
$$\Delta LX_t = a_2 + \sum_{i=1}^{l} a_{21,i}\,\Delta LX_{t-i} + \sum_{j=0}^{m} a_{22,j}\,\Delta LY_{t-j} + n_2 ECT_{t-1} + \mu_{2t} \tag{4}$$
The residual terms in the above equations are assumed to be normally distributed. The coefficient of the
ECT must be negative to ensure that the system converges from the short run toward the long run. An ECT
coefficient of −x% means that x% of the previous period’s deviation from the long-run
equilibrium path is corrected in each period. The significant variables on the right hand
side of each equation show short-run causality for the dependent variable.
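The interpretation of the ECT coefficient can be illustrated numerically. Under the simplifying assumption that only the error correction term drives the adjustment (short-run dynamics and new shocks are ignored), a coefficient n between −1 and 0 leaves a share (1 + n)^h of an initial disequilibrium after h periods. The function name and the coefficient of −0.25 below are hypothetical.

```python
def remaining_gap(ect_coef, periods, initial_gap=1.0):
    """Share of an initial disequilibrium left after `periods`, ECT adjustment only."""
    if not -1.0 < ect_coef < 0.0:
        raise ValueError("this sketch assumes a coefficient strictly between -1 and 0")
    return initial_gap * (1.0 + ect_coef) ** periods


# Hypothetical ECT coefficient: 25% of the gap is corrected each period.
gap_after_1 = remaining_gap(-0.25, 1)
gap_after_2 = remaining_gap(-0.25, 2)
```

With this made-up coefficient, three quarters of the deviation remains after one period and just over half after two.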
FMOLS and DOLS Estimators for Robustness

The FMOLS (fully modified OLS) and the DOLS (dynamic OLS) estimators were developed by Phillips and
Hansen (1990) and Stock and Watson (1993), respectively. They generate asymptotically efficient
coefficients because they take serial correlation and endogeneity into account. They
apply only when all variables are I(1), which makes them less flexible and less attractive
methods. OLS is biased when variables are cointegrated but nonstationary, while FMOLS is not. DOLS
performs better than FMOLS (Kao and Chiang 2000) for several reasons: it is
computationally simpler and it reduces bias better than FMOLS. The t-statistic produced by DOLS
approximates the standard normal density better than the statistic generated by OLS or FMOLS.
DOLS estimators are fully parametric and do not require pre-estimation or nonparametric correction.
Ali et al. (2017) report that the most significant benefit of DOLS is that it accommodates a mixed
order of integration of the variables in the cointegration framework.
2.7. Additional Ways to Study Causality

The literature reports additional types of causality: (a) weak (short-run) causality, (b)
long-run causality, (c) strong (joint) causality, and (d) pairwise causality. Each one serves a
particular purpose.
(a) Weak causality/short-run causality
Each variable is caused by its own past only.
(b) Long-run causality
The error correction term is zero. This is a VAR (vector autoregression) causality leading to
Toda and Yamamoto (1995) method. Granger causality can be checked for existence through a VAR
model (note that data are not in differences, namely they are in level form):
$$Y_t = g_0 + a_1 Y_{t-1} + \dots + a_p Y_{t-p} + b_1 X_{t-1} + \dots + b_p X_{t-p} + u_t$$
$$X_t = h_0 + c_1 Y_{t-1} + \dots + c_p Y_{t-p} + d_1 X_{t-1} + \dots + d_p X_{t-p} + v_t$$
$$H_0: b_1 = b_2 = \dots = b_p = 0 \quad \text{(X does not Granger-cause Y)}$$
$$H_1: \text{X Granger-causes Y (at least one } b_i \neq 0)$$
A similar hypothesis setup can be constructed for the second equation, in which the coefficients $c_i$ on
the lags of Y play the role of the $b_i$; it is omitted here for space considerations. Please note the following rationale:
If $b_i \neq 0$ and $c_i = 0$, then Xt will lead Yt in the long run.
If $b_i = 0$ and $c_i \neq 0$, then Yt will lead Xt in the long run.
If $b_i \neq 0$ and $c_i \neq 0$, then a feedback relationship is present.
If $b_i = 0$ and $c_i = 0$, then no Granger causality exists in either direction.
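The test of the zero restrictions on the lags of the other variable can be sketched as a standard F-test that compares the restricted and unrestricted sums of squared residuals. The Python code below is a self-contained illustration on simulated stationary data in which X leads Y by construction; the helper names and the data generating process are assumptions, and the sketch does not implement the lag augmentation of Toda and Yamamoto (1995).

```python
import random


def ols_ssr(y, X):
    """Sum of squared residuals from OLS of y on the columns of X (rows = observations)."""
    k = len(X[0])
    # Normal equations (X'X) b = X'y as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):                      # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
    b = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2 for r, yi in zip(X, y))


def granger_f(effect, cause, p):
    """F-statistic for H0: the p lags of `cause` are jointly zero in the `effect` equation."""
    T = len(effect)
    y = effect[p:]
    restricted = [[1.0] + [effect[t - i] for i in range(1, p + 1)] for t in range(p, T)]
    unrestricted = [row + [cause[t - i] for i in range(1, p + 1)]
                    for row, t in zip(restricted, range(p, T))]
    ssr_r, ssr_u = ols_ssr(y, restricted), ols_ssr(y, unrestricted)
    dof = len(y) - (2 * p + 1)
    return ((ssr_r - ssr_u) / p) / (ssr_u / dof)


random.seed(1)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, 200)]

f_x_to_y = granger_f(y, x, p=1)   # tests b_1 = 0 in the Y equation
f_y_to_x = granger_f(x, y, p=1)   # tests c_1 = 0 in the X equation
```

In this simulated example the statistic for X causing Y is very large while the one for Y causing X is small, which corresponds to the first case in the rationale above.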
After we have calculated the diagnostics of the model and we have verified that the model is well
behaved, then the next step is the bounds test. The existence of a long-run relationship can be further
corroborated with the investigation of significance of the individual terms.
(c) Strong causality: The joint causality investigation process

This combines the cases described in (a) and (b). The joint causality test, also known as the strong
causality test (Lee and Chang 2008), identifies two sources of causation, the short run and
the long run, to which the variables re-adjust after a short-run perturbation. It is tested jointly on the
short-run coefficients of the lagged variables and the lagged error correction term.
Granger causality can be investigated in two other known ways: with an F-test
on the significance of the first-differenced stationary variables (Asafu-Adjaye 2000; Masih and
Masih 1996), or by including the ECT as a source of variation, which is most commonly checked with a
t-test.
(d) Pairwise Granger causality test
This is another solution toward the investigation of causality when cointegration is not confirmed.
An additional usage is for the corroboration of VECM results. Menegaki and Tugcu (2016) have
employed this method for the investigation of the energy-sustainable growth nexus in Sub-Saharan
African countries for the years 1985–2013. In addition to that, Menegaki and Tugcu (2018) have employed
the same method for the investigation of the energy-sustainable growth nexus in Asian countries.
3. Other Versions of the ARDL Approach
3.1. The Asymmetric Nonlinear or the Nonlinear Autoregressive Distributed Lag (NARDL) Approach
This version of the ARDL approach was introduced by Shin et al. (2011, 2014) and is an extension
of the method introduced by Pesaran et al. (2001). The nonlinear ARDL is used for testing whether
the positive shocks of the independent variables have the same effect as their negative shocks on the
dependent variables. In the typical ARDL, there is a symmetric relationship between the dependent
and the explanatory variables. This is not the case with the NARDL in which the ARDL relationship is
formulated as follows:
$$y_t = \alpha^{+} x_t^{+} + \alpha^{-} x_t^{-} + \varepsilon_t$$
The alphas are the long-run parameters, while $x_t$ is the regressor vector, decomposed as:
$$x_t = x_0 + x_t^{+} + x_t^{-}$$
with $x_t^{+}$ being the positive partial sum and $x_t^{-}$ being the negative partial sum, as follows:
$$x_t^{+} = \sum_{i=1}^{t} \Delta x_i^{+} = \sum_{i=1}^{t} \max(\Delta x_i, 0)$$
$$x_t^{-} = \sum_{i=1}^{t} \Delta x_i^{-} = \sum_{i=1}^{t} \min(\Delta x_i, 0)$$
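The decomposition into partial sums is easy to verify numerically: the positive partial sum accumulates max(Δx, 0), the negative partial sum accumulates min(Δx, 0), and together with the initial value they reproduce the original series. The function name and the data in this Python sketch are hypothetical.

```python
def partial_sums(x):
    """Positive and negative partial sums of the first differences of x."""
    pos, neg = [0.0], [0.0]
    for prev, cur in zip(x, x[1:]):
        dx = cur - prev
        pos.append(pos[-1] + max(dx, 0.0))   # accumulates increases only
        neg.append(neg[-1] + min(dx, 0.0))   # accumulates decreases only
    return pos, neg


# Made-up series, for illustration only.
x = [10.0, 12.0, 11.0, 11.5, 9.0]
x_pos, x_neg = partial_sums(x)
rebuilt = [x[0] + p + n for p, n in zip(x_pos, x_neg)]
```

Here `x_pos` is [0, 2, 2, 2.5, 2.5] and `x_neg` is [0, 0, -1, -1, -3.5]; these two series enter the NARDL regression as separate regressors, which is what allows positive and negative shocks to have different effects.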
This means that the corresponding error correction model can be written as:
$$\Delta y_t = \rho y_{t-1} + \theta^{+} x_{t-1}^{+} + \theta^{-} x_{t-1}^{-} + \sum_{i=1}^{j-1} \varphi_i \Delta y_{t-i} + \sum_{i=0}^{p} \left(\pi_i^{+} \Delta x_{t-i}^{+} + \pi_i^{-} \Delta x_{t-i}^{-}\right) + \varepsilon_t$$
where $\theta^{+} = -\rho\alpha^{+}$ and $\theta^{-} = -\rho\alpha^{-}$.
Using the F-statistic developed by Pesaran et al. (2001), one can test the null hypothesis that $\rho = \theta^{+} = \theta^{-} = 0$.
The rejection of the null hypothesis indicates the presence of cointegration. The hypothesis of $\rho = 0$
versus the alternative that $\rho < 0$ is examined through a t-test (Banerjee et al. 1998).
Overall, the steps of the procedure are exactly those of the conventional ARDL approach already
presented in this paper. In addition, the method provides the cumulative dynamic multiplier
effects of $x^{+}$ and $x^{-}$ on $y_t$ as follows:
$$m_k^{+} = \sum_{i=0}^{k} \frac{\partial y_{t+i}}{\partial x_t^{+}} \quad \text{and} \quad m_k^{-} = \sum_{i=0}^{k} \frac{\partial y_{t+i}}{\partial x_t^{-}}$$
As k increases to infinity, the multipliers converge to the alphas. This method has been
applied by Shahbaz (2018) in a case study of the energy-growth nexus and by Al-hajj et al. (2018) in the
investigation of the oil price and stock returns nexus in Malaysia. The NARDL method is applicable if
all variables are I(1) or have a flexible order of integration. The approach addresses
multicollinearity through the choice of an appropriate lag length for the variables (Shin et al. 2014). Thus,
the bounds test proposed by Shin et al. (2014) examines the presence of cointegration while at the
same time accommodating asymmetries. As far as causality is concerned, a complete account of asymmetric
causality is presented in Apergis (2018), who also provides a detailed account of linear versus
nonlinear causality.
3.2. The Pooled Mean Group (PMG) Estimator for Panel Data
The PMG allows for heterogeneity only in the short run, in contrast to the mean group (MG) estimator, which
allows for heterogeneity in both the short and the long run. The pooled mean group estimates are
superior to fixed effects estimates because they are robust to endogeneity and to the presence of unit
roots. Overall, the PMG is an estimator that combines pooling and averaging. Besides the short-run and
long-run effects that are captured among the variables of a model, the PMG additionally investigates
the dynamic effects of the independent variables on the dependent variable.
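The averaging idea behind the mean group benchmark can be shown in a few lines: each panel unit is estimated separately and the unit-specific coefficients are then averaged without weights, with a standard error based on their cross-sectional dispersion. The Python sketch below assumes that the unit-level long-run coefficients are already available; the numbers and the function name are hypothetical, and the PMG itself, which pools the long-run coefficients by maximum likelihood, is not implemented here.

```python
from math import sqrt
from statistics import mean, stdev


def mean_group(unit_coefficients):
    """Mean group estimate and its standard error from unit-specific coefficients."""
    n = len(unit_coefficients)
    estimate = mean(unit_coefficients)
    std_error = stdev(unit_coefficients) / sqrt(n)   # dispersion across units
    return estimate, std_error


# Hypothetical long-run elasticities estimated country by country.
mg_estimate, mg_se = mean_group([0.4, 0.6, 0.5])
```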
The general form of the PMG can be seen in the following equation:
$$Y_{it} = \sum_{j=1}^{p} \lambda_{ij}\, Y_{i,t-j} + \sum_{j=0}^{q} \delta_{ij}'\, X_{i,t-j} + \mu_i + \varepsilon_{it}$$
The following notation applies for the equation:
i = 1, . . . , N indexes the panels (groups)
t = 1, . . . , T indexes time
Xit = a K × 1 vector of regressors
δij = a K × 1 vector of coefficients
λij = a scalar
µi = a group-specific effect
The error correction equation can be derived from the previous equation as:
$$\Delta Y_{it} = \phi_i \left(Y_{i,t-1} - \theta_i' X_{i,t-1}\right) + \sum_{j=1}^{p-1} \lambda_{ij}\,\Delta Y_{i,t-j} + \sum_{j=0}^{q-1} \delta_{ij}'\,\Delta X_{i,t-j} + \mu_i + \varepsilon_{it}$$
Here ϕi indicates the speed of adjustment, which needs to be negative and significant in order
to have convergence in the long-run horizon. If the speed of adjustment is zero, no long-run
relationship is present. This equation provides the short-run dynamics that correspond to the
long-run ones described in the cointegration equation. Besides the sign of the adjustment coefficient,
the researcher must pay attention to the remaining signs in both the cointegration and the error correction
equation and decide whether they are consistent with economic theory and established research in the
energy-growth nexus field. After the short-run and long-run causal findings have been corroborated, it is
useful to relate them to policy, namely to find out why a causal direction occurs,
whether it is due to some energy or environmental policy or whether it is due to the lack of some relevant
policy. Comparison with the findings of other studies is also essential at this point. The
PMG estimates are consistent and asymptotically normal for both stationary and non-stationary
regressors. As with the conventional ARDL, the appropriate lag length in the PMG can be determined with
the AIC and SBC criteria. Above all, the more homogeneous the panels are, the more efficient the PMG
estimator is.
Researchers working with panel data are additionally advised to perform
both the Pesaran (2004) CD test and the Pesaran and Yamagata slope homogeneity test.
The Pesaran (2004) CD test was formulated as an answer to the shortcomings of the
LM test of Breusch and Pagan (1980) and its scaled version (Pesaran 2004). Large panel data sets could not be handled with
the Breusch and Pagan test, so Pesaran (2004) suggested a standardized version of that LM
test. This solution, however, had its own restrictions in panels where the cross-section dimension
was large but the time span was not long enough. The CD test was proposed as a solution that
could accommodate both smaller cross sections and shorter data spans. Many studies make the
comfortable assumption that the slope coefficients are homogeneous. When the time span is
long and the cross-section dimension small, this assumption can be tested with seemingly unrelated regressions
(SURE), but panels do not always have these dimensions. Pesaran and Yamagata (2005) have proposed a
modified version of Swamy’s test of slope homogeneity; Swamy (1970) bases his test on
the dispersion of individual slope estimates from a suitable pooled estimator. For more on these tests,
the interested reader should consult the suggested bibliography.
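For a balanced panel, the CD statistic is built from the pairwise correlations of the unit-level residuals: CD = sqrt(2T / (N(N − 1))) · Σ ρij over all pairs i < j, and it is asymptotically standard normal under the null of cross-sectional independence. The Python sketch below follows this formula, assuming that regression residuals (which have zero mean when an intercept is included) are already available; the function name and the toy residuals are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pesaran_cd(residuals):
    """CD statistic for a balanced panel; `residuals` holds N series of length T."""
    n, t = len(residuals), len(residuals[0])

    def corr(a, b):
        # Pairwise correlation of residuals (no demeaning: residuals are assumed zero-mean).
        return (sum(p * q for p, q in zip(a, b))
                / sqrt(sum(p * p for p in a) * sum(q * q for q in b)))

    total = sum(corr(a, b) for a, b in combinations(residuals, 2))
    return sqrt(2.0 * t / (n * (n - 1))) * total


# Toy residuals: three units moving in perfect lockstep over T = 12 periods.
e = [1.0, -1.0] * 6
cd_dependent = pesaran_cd([e, e, e])
cd_opposed = pesaran_cd([e, [-v for v in e]])
```

In the lockstep example every pairwise correlation equals one, so CD = sqrt(24/6) · 3 = 6, a value that would clearly reject cross-sectional independence.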
3.3. What Are the ARDL Best Implementation Strategies to Follow in One’s Energy-Growth Nexus Paper?
This paper deals with the general outline of research in the ARDL analysis and not with the specific
direction that individual studies may take because of particular choices dictated by data, theory,
and research demands. For example, Liu (2009) ends the ARDL analysis with a factor decomposition
model (FDM) analysis, which shows the yearly causal contribution of each variable to the dependent
variable. This is not how most ARDL energy-growth nexus studies end. The typical outline
of most of these studies is an investigation of the integration properties of the variables, followed
by an ARDL cointegration analysis that ends with a causality analysis. In the following two tables
(Tables 1 and 2), the ARDL implementation strategies are provided with guidelines for every step and
variant. Table 1 contains guidelines for time series data, while Table 2 contains guidelines for the
panel data version of the ARDL implementation. For more detailed discussions of time series and
panel data causality tests that depend on cointegration and integration results, the reader is advised to
consult the studies by Tugcu (2018) and Apergis (2018) in the book by Menegaki (2018) entitled
“The Economics and the Econometrics of the energy-growth nexus” and by Marques et al. (2019) in the
book by Fuinhas and Marques (2019) entitled “The extended energy-growth nexus.”
Table 1. Autoregressive distributed lag model (ARDL) implementation for time series data in the
energy-growth nexus.
Stages in Time-Series ARDL Implementation

First stage: Stationarity, unit roots, and order of integration
  ADF: Augmented Dickey–Fuller; PP: Phillips–Perron (Note: They have low power properties, but since the literature is still using them, it is good to use them as a reference)
  KPSS: Kwiatkowski–Phillips–Schmidt–Shin
  ADF-WS: Augmented Dickey–Fuller Weighted Symmetric (Note: Good size and power properties)
  LS: Lee and Strazicich, for breaks
  and various other tests depending on the assumptions made about the data or the knowledge of them . . .
  When contradictory results are reached, observing the correlogram is a good idea.
  Are the series I(0) or I(1)? If yes, proceed with ARDL cointegration.
  Outcomes: Yes: Stationarity | No: Stationarity

Second stage: Cointegration
  The maximum lag value is decided on an AIC, BIC, and HQC basis. The F value for the cointegration test should be applied for all criteria (BIC, AIC, HQC).
  Yes: Cointegration
    We need to augment the Granger-type causality test model with the one-period lagged ECT.
    Even if the ECT is incorporated in all equations of the Granger causality model, only the equations where the null hypothesis of no cointegration is rejected will be estimated with an ECT (Narayan and Smyth 2006).
  Inconclusive
    If cointegration evidence is inconclusive, then the decision about the long-run relationship is based on the ECT.
    Are long-run coefficients significant? Do they have the correct sign?
  No: Cointegration
    If we find no evidence of cointegration, then the model specification will be a vector autoregression (VAR) in first difference form (Liu 2009).
  Is the cointegration equation robust? Answer: Use the FMOLS and DOLS to check.

Third stage: Causality
  Granger causality is ideal both for small and large samples (Geweke et al. 1983).
  The ECT model allows the inclusion of the lagged ECT derived from the cointegration equation. Thus the long-run information lost through differencing is reintroduced.
  Does the ECM have a negative sign?
  Are the estimated coefficients stable?
  Work with diagnostics to prove the robustness of your model.
Source: Author’s compilation. Note: BIC: Bayesian (Schwarz) information criterion, AIC: Akaike information
criterion, HQC: Hannan–Quinn criterion, ECT: error correction term, FMOLS: fully modified OLS, DOLS:
dynamic OLS.
Table 2. ARDL implementation for panel data in the energy-growth nexus.
Stages in Panel Data ARDL Implementation

First stage: Cross-sectional dependence
  This is examined with various tests (some examples are shown below):
    Breusch–Pagan LM test (Breusch and Pagan 1980)
    Pesaran CD test (Pesaran 2004)
    Bias-corrected scaled LM test (Baltagi et al. 2012)
  Outcomes: No: Cross-sectional dependence | Yes: Cross-sectional dependence

Second stage: Stationarity and order of integration
  No cross-sectional dependence: Apply tests assuming cross-sectional independence (first generation).
    EXAMPLES: Im et al. (2003); Levin et al. (2002); Choi (2001); Breitung (2000); Maddala et al. (1999); Hadri (2000)
  Yes cross-sectional dependence: Apply tests assuming cross-sectional dependence (second generation).
    EXAMPLES: Pesaran (2007); Moon and Perron (2004); Bai and Ng (2004); Chang (2002); Harris and Sollis (2003); CIPS test (Pesaran 2007)
  LS for 2 structural breaks and large size of data
  Outcomes: Yes: Stationarity | No: Stationarity

Third stage: Panel cointegration
  There are residual based tests, likelihood based tests, and error correction based tests.
  No: Cross-sectional dependence
    EXAMPLES OF TESTS:
    Gutierrez (2003)
    Larsson et al. (2001)
    Pedroni (higher explanatory power, mostly preferred with 7 statistics) (Pedroni 2004, 2007)
    McCoskey and Kao (1998) (ideal for small samples)
    Kao (1999) (ideal for small samples)
  Yes: Cross-sectional dependence
    EXAMPLES OF TESTS:
    Groen and Kleibergen (2003): It allows for multiple cointegration equations.
    Westerlund (2007): 4 statistics (good for structural breaks)
    Use a resilient estimator such as Driscoll and Kraay (1998)
  Is cointegration confirmed?
  Yes: Cointegration
    FMOLS
    DOLS
    MG
    PMG (does not consider cross-sectional dependence; constrains long-run coefficients to be the same across units)
    CCEP (allows cross-sectional dependence, endogeneity, serial correlation)
    CCEMG (as above but better for small cross sections)
  No: Cointegration
    Pooling is a good idea: Opt between random effects models or fixed effects models depending on the Hausman test.

Fourth stage: Panel causality
  Granger causality: It is a traditional method that assumes panels are homogeneous with no interconnections among cross-section units.
  Dumitrescu and Hurlin (2012): good sample properties and resilient to cross-sectional dependence. Able to report individual specific causal linkages.
  Bai and Kao CUP-FM estimator
Source: Author’s compilation. Note: FMOLS: fully modified OLS, DOLS: dynamic OLS, MG: mean group (estimator),
PMG: pooled mean group (estimator), CIPS: cross-sectionally augmented Im-Pesaran-Shin (test), CCEP: common
correlated effects pooled (estimator), CCEMG: common correlated effects mean group (estimator), CUP-FM:
continuously updated fully modified (estimator).
Experienced researchers will have realized by now that panel data are, in essence, many shorter time series
pooled together. The data generation process may or may not be the same across panels
(sub-groups of data). Therefore, several tests and procedures have been adapted from time
series to panel data through a kind of averaging across panels (groups of data). In energy economics, panel
data are a convenient way to overcome problems such as collinearity. Furthermore, they
provide more degrees of freedom and a more informed estimate of the speed of adjustment. In addition, with
this approach one can control for heterogeneity and gain efficiency in the identification and measurement of
economic issues (Tugcu 2018).
Panel data suffer from limitations such as cross-sectional dependence, which is attributed
to globalization and the unification of policies across panel units (e.g., countries). This makes energy
consumption patterns follow similar movements among the various countries in a panel, particularly
if the countries are signatories to the same environmental and emissions-cutting agreement. The other
limitation comes from the fact that panel data are in essence two-dimensional data, and thus the error
term in modeling contains both unit-specific (e.g., country) information and time-specific information.
This may contribute to an endogeneity problem if the aforementioned error components are correlated
with the explanatory variables. However, these drawbacks do not discourage researchers from using panel
data, which are the main type of data to expect in the energy-growth nexus research field.
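To make the notion of cross-sectional dependence more concrete, the following Python sketch computes the Pesaran (2004) CD statistic for a balanced panel of residuals. It is only an illustration under stated assumptions: the function name `pesaran_cd`, the simulated data, and the common-shock design are hypothetical, and the residuals are assumed to come from unit-by-unit regressions that include an intercept.

```python
import numpy as np

def pesaran_cd(resid):
    """CD statistic for a balanced panel of residuals.

    resid: array of shape (N, T), one row of residuals per panel unit.
    Under the null of cross-sectional independence the statistic is
    approximately standard normal.
    """
    resid = np.asarray(resid, dtype=float)
    n_units, n_periods = resid.shape
    corr = np.corrcoef(resid)                # N x N pairwise correlations
    upper = np.triu_indices(n_units, k=1)    # index pairs with i < j
    scale = np.sqrt(2.0 * n_periods / (n_units * (n_units - 1)))
    return scale * corr[upper].sum()

# Hypothetical data: 8 units, 200 periods.
rng = np.random.default_rng(0)
n_units, n_periods = 8, 200
independent = rng.standard_normal((n_units, n_periods))
common = rng.standard_normal(n_periods)      # shock shared by all units
dependent = independent + 2.0 * common       # every unit loads on it

cd_independent = pesaran_cd(independent)
cd_dependent = pesaran_cd(dependent)
print(f"CD, independent units: {cd_independent:.2f}")
print(f"CD, common shock:      {cd_dependent:.2f}")
```

A large absolute value of the statistic for the second panel would point towards the second-generation tests listed in Table 2.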
Before closing this paper, it is useful to recommend the following sites for the implementation of ARDL and
NARDL coding in the EVIEWS and STATA software packages:
ARDL and NARDL coding and implementation in EVIEWS, available from:
http://www.eviews.com/help/helpintro.html#page/content/ardl-Estimating_ARDL_Models_in_EViews.html.
ARDL and NARDL coding and implementation in STATA, available from:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1434232-ardl-updated-stata-command-for-the-estimation-of-autoregressive-distributed-lag-and-error-correction-models.
Note: As far as NARDL coding and implementation in EVIEWS and STATA are concerned, since
it is an ARDL model, it is just an estimation with lags of variables. One can specify that as a non-linear
estimation with the least squares estimator.
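The remark that an ARDL model is simply a least squares estimation with lags of the variables can be illustrated with a short Python sketch. This is not the EVIEWS or STATA implementation; it is a minimal example assuming an ARDL(1,1) specification, and the function name `fit_ardl11`, the simulated series, and all parameter values are hypothetical.

```python
import numpy as np

def fit_ardl11(y, x):
    """OLS fit of y_t = c + phi*y_{t-1} + b0*x_t + b1*x_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Columns: intercept, lagged y, current x, lagged x.
    design = np.column_stack([np.ones(len(y) - 1), y[:-1], x[1:], x[:-1]])
    coef, *_ = np.linalg.lstsq(design, y[1:], rcond=None)
    c, phi, b0, b1 = coef
    long_run = (b0 + b1) / (1.0 - phi)   # implied long-run effect of x on y
    return {"c": c, "phi": phi, "b0": b0, "b1": b1, "long_run": long_run}

# Simulated example with a true long-run coefficient of (0.3 + 0.1) / (1 - 0.6) = 1.
rng = np.random.default_rng(1)
n = 2000
x = np.cumsum(rng.standard_normal(n))    # I(1) regressor
y = np.zeros(n)
for t in range(1, n):
    y[t] = (0.5 + 0.6 * y[t - 1] + 0.3 * x[t] + 0.1 * x[t - 1]
            + 0.1 * rng.standard_normal())

fit = fit_ardl11(y, x)
print({name: round(float(value), 3) for name, value in fit.items()})
```

In applied work the lag orders would be chosen with an information criterion rather than fixed in advance, as discussed in the time-series stages of the implementation.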
4. Conclusions
The economics of the energy-growth nexus is a field that attracts major research attention because of
the significant information it provides to policy-makers who are considering energy conservation measures.
The ARDL method has been widely favored and used in the past decade owing to its merits (flexibility,
interpretability, eloquence, and the statistical properties explained in the introduction of this
paper). The paper meets the needs of two groups of researchers. The first group consists of new researchers who
have only recently started using the ARDL method. Some points of its implementation
are not yet fully clear to them, because the relevant guidance is fragmented across various research papers and
lecture notes on the internet. This fragmentation delays research and paper writing and
leaves room for journal reviewers to reject a paper or request major revisions. The second group consists of more
experienced researchers who have used the method many times but may still find aspects of it
that benefit from additional clarification. Besides, the method is continuously
enriched in its applied dimension, and reading this paper gives experienced researchers
the opportunity to stay up to date with the method’s evolution.
The paper references applied work and knowledge throughout. It sometimes happens
that even experienced researchers use a test of a statistical concept whose exact meaning
needs brushing up, since they first learned it during their undergraduate years at university.
Furthermore, the paper guides the ARDL energy-growth researcher through the steps that need to be
taken and the exact way that results should be presented and written up, so as to give
readers a sense of transparency when they read a research paper. Moreover, this will support
comparability among papers and enable apt meta-analysis, which is so valuable for the progress of
science and the evolution of society.
The paper can also serve as a review and reference paper for post-graduate students who are writing their
MA/MSc (not least PhD) dissertation and need to employ this method. The quintessence of the paper
lies in the last two tables of the fifth section, which separate the ARDL steps between the time-series
and panel-data frameworks. The degree of integration, cointegration, and causality steps are explained
and presented in a structured and well-connected manner, which relieves students from the stress of selecting the
correct test at every step of the implementation.
Last but not least, the content of this paper is useful not only for researchers of the
energy-growth nexus, but also for researchers in other fields such as the tourism-growth nexus or
the broader environment-growth nexus and Kuznets curve studies.
Author Contributions: The author is the sole contributor in this paper.

Funding: The author has received no funding for writing this paper.
Conflicts of Interest: The author declares no conflict of interest.
References
Kraft, John, and Arthur Kraft. 1978. On the Relationship between Energy and GNP. Journal of Energy Development
3: 401–3.
Engle, Robert F., and Clive W. J. Granger. 1987. Co-Integration and Error Correction: Representation, Estimation,
and Testing. Econometrica 55: 251–76. [CrossRef]
Ali, Hamisu Sadi, Siong Hook Law, and Talha Ibrahim Zannah. 2016. Dynamic impact of urbanization, economic
growth, energy consumption, and trade openness on CO2 emissions in Nigeria. Environmental Science and
Pollution Research 23: 12435–43. [CrossRef] [PubMed]
Rahman, Mohammad Mafizur, and Mohammad Abul Kashem. 2017. Carbon emissions, energy consumption
and industrial growth in Bangladesh: Empirical evidence from ARDL cointegration and Granger causality
analysis. Energy Policy 110: 600–8. [CrossRef]
Bölük, Gülden, and Mehmet Mert. 2015. The renewable energy, growth and environmental Kuznets curve in
Turkey: An ARDL approach. Renewable and Sustainable Energy Reviews 52: 587–95. [CrossRef]
Boswijk, H. Peter. 1994. Testing for an unstable root in conditional and structural error correction models.
Journal of Econometrics 63: 37–60. [CrossRef]
Stock, James H., and Mark W. Watson. 1993. A Simple Estimator of Cointegrating Vectors in Higher Order
Integrated Systems. Econometrica 61: 783–820. [CrossRef]
Swamy, Paravastu A. V. B. 1970. Efficient inference in a random coefficient regression model. Econometrica
38: 311–23. [CrossRef]
Fuinhas, Jose Alberto, and António Cardoso Marques, eds. 2019. The Extended Energy–Growth Nexus. Cambridge:
Academic Press.
Dumitrescu, Elena-Ivona, and Christophe Hurlin. 2012. Testing for Granger non-causality in heterogeneous
panels. Economic Modelling 29: 1450–60. [CrossRef]
Al-hajj, Ekhlas, Usama Al-Mulali, and Sakiru Adebola Solarin. 2018. Oil price shocks and stock returns nexus for
Malaysia: Fresh evidence from nonlinear ARDL test. Energy Reports 4: 624–37. [CrossRef]
Ali, Wajahat, Azrai Abdullah, and Muhammad Azam. 2017. Re-visiting the environmental Kuznets curve
hypothesis for Malaysia: Fresh evidence from ARDL bounds testing approach. Renewable and Sustainable
Energy Reviews 77: 990–1000. [CrossRef]
Pedroni, Peter. 2007. Social Capital, Barriers to Production and Capital Shares: Implications for the Importance of
Parameter Heterogeneity from a Nonstationary Panel Approach. Journal of Applied Econometrics 22: 429–51.
[CrossRef]
Apergis, Nicholas. 2018. Testing for Causality: A Survey of the Current Literature. In The Economics and
Econometrics of the Energy-Growth Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press,
pp. 273–305.
Asafu-Adjaye, John. 2000. The relationship between energy consumption, energy prices and economic growth:
Time series evidence from Asian developing countries. Energy Economics 22: 615–25. [CrossRef]
Bahmani-Oskooee, M. Mohsen, and Gour G. Goswami. 2003. A disaggregated approach to test the J-Curve
phenomenon: Japan versus her major trading partners. Journal of Economics and Finance 27: 102–13. [CrossRef]
Bai, Jushan, and Serena Ng. 2004. A panic attack on unit roots and cointegration. Econometrica 72: 1127–77.
[CrossRef]
Baltagi, Badi H., Qu Feng, and Chihwa Kao. 2012. A Lagrange Multiplier test for cross-sectional dependence in a
fixed effects panel data model. Journal of Econometrics 170: 164–77. [CrossRef]
Banerjee, Anindya, Juan Dolado, and Ricardo Mestre. 1998. Error-correction mechanism tests for cointegration in
a single-equation framework. Journal of Time Series Analysis 19: 615–25. [CrossRef]
Bayer, Christian, and Christoph Hanck. 2013. Combining non-cointegration tests. Journal of Time Series Analysis
34: 83–95. [CrossRef]
Breitung, Jörg. 2000. The local power of some unit root tests for panel data. In Nonstationary Panels,
Panel Cointegration and Dynamic Panels. Edited by Badi H. Baltagi, Thomas B. Fomby and R. Carter Hill.
Bingley: Emerald Group Publishing Limited, vol. 15, pp. 161–78.
Breusch, Trevor S., and Adrian R. Pagan. 1980. The Lagrange multiplier test and its applications to model
specification in econometrics. The Review of Economic Studies 47: 239–53. [CrossRef]
Chang, Yoosoon. 2002. Nonlinear IV unit root tests in panels with cross-sectional dependency. Journal of Econometrics
110: 261–92. [CrossRef]
Choi, In. 2001. Unit root tests for panel data. Journal of International Money and Finance 20: 249–72. [CrossRef]
De Vita, Glauco, Klaus Endresen, and Lester C. Hunt. 2006. An empirical analysis of energy demand in Namibia.
Energy Policy 34: 3447–63. [CrossRef]
Driscoll, John C., and Aart C. Kraay. 1998. Consistent covariance matrix estimation with spatially dependent
panel data. Review of Economics and Statistics 80: 549–59. [CrossRef]
Fuinhas, José Alberto, and António Cardoso Marques. 2012. Energy consumption and economic growth nexus in
Portugal, Italy, Greece, Spain and Turkey: An ARDL bounds test approach (1965–2009). Energy Economics
34: 511–17. [CrossRef]
Geweke, John, Richard Meese, and Warren Dent. 1983. Comparing alternative tests of causality in temporal
systems. Analytic results and experimental evidence. Journal of Econometrics 21: 161–94. [CrossRef]
Groen, Jan J. J., and Frank Kleibergen. 2003. Likelihood-based cointegration analysis in panels of vector
error-correction models. Journal of Business and Economic Statistics 21: 295–318. [CrossRef]
Gutierrez, Luciano. 2003. On the power of panel cointegration tests: A Monte Carlo comparison. Economics Letters
80: 105–11. [CrossRef]
Hadri, Kaddour. 2000. Testing for stationarity in heterogeneous panel data. Econometrics Journal 3: 148–61.
[CrossRef]
Halicioglu, Ferda. 2007. Residential electricity demand dynamics in Turkey. Energy Economics 29: 199–210.
[CrossRef]
Harris, Richard, and Robert Sollis. 2003. Applied Time Series Modelling and Forecasting. West Sussex: Wiley.
Haug, Alfred A. 2002. Temporal aggregation and the power of cointegration tests: A Monte Carlo study.
Oxford Bulletin of Economics and Statistics 64: 399–412. [CrossRef]
Im, Kyung So, M. Hashem Pesaran, and Yongcheol Shin. 2003. Testing for unit roots in heterogeneous panels.
Inglesi-Lotz, Roula. 2018. The role of potential factors/actors and regime switching modelling. In The Economics and
the Econometrics of the Energy-Growth Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press,
p. 387.
Jalil, Abdul, and Ying Ma. 2008. Financial development and economic growth: Time series evidence from Pakistan
and China. Journal of Economic Cooperation among Islamic Countries 29: 29–68.
Johansen, Søren. 1988. Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control
12: 231–54. [CrossRef]
Johansen, Søren, and Katarina Juselius. 1990. Maximum likelihood estimation and inference on cointegration—with
applications to the demand for money. Oxford Bulletin of Economics and Statistics 52: 169–210. [CrossRef]
Kao, Chihwa. 1999. Spurious regression and residual–based tests for cointegration in panel data.
Kao, Chihwa, and Min-Hsien Chiang. 2000. On the estimation and inference of cointegrated regression in panel
data. In Nonstationary Panels, Panel Cointegration, and Dynamic Panels (Advances in Econometrics). Edited by
Badi H. Baltagi, Thomas B. Fomby and R. Carter Hill. Bingley: Emerald Group Publishing Limited, vol. 15,
pp. 179–222.
Larsson, Rolf, Johan Lyhagen, and Mickael Löthgren. 2001. Likelihood-based cointegration tests in heterogeneous
panels. Econometrics Journal 4: 109–42. [CrossRef]
Lee, Chien-Chiang, and Chun-Ping Chang. 2008. Energy consumption and economic growth in Asian economies:
A more comprehensive analysis using panel data. Resource and Energy Economics 30: 50–65. [CrossRef]
Lee, Junsoo, and Mark C. Strazicich. 2003. Minimum Lagrange Multiplier Unit Root Test with Two Structural
Breaks. Review of Economics and Statistics 85: 1082–89. [CrossRef]
Levin, Andrew, Chien-Fu Lin, and Chia-Shang James Chu. 2002. Unit root tests in panel data: Asymptotic and
finite-sample properties. Journal of Econometrics 108: 1–24. [CrossRef]
Liu, Yaobin. 2009. Exploring the relationship between urbanization and energy consumption in China using
ARDL autoregressive distributed lag and FDM factor decomposition model. Energy 34: 1846–54. [CrossRef]
Maddala, G. S., Shaowen Wu, and Peter C. Liu. 1999. Do panel data rescue purchasing power parity (PPP)
theory? In Panel Data Econometrics: Future Directions. Edited by Jaya Krishnakumar and Elvezio Ronchetti.
New York: Elsevier.
Marques, Luís Miguel, José Alberto Fuinhas, and António Cardoso Marques. 2017. Augmented energy-growth
nexus: Economic, political and social globalization impacts. Energy Procedia 136: 97–101. [CrossRef]
Marques, Luís Miguel, José Alberto Fuinhas, and António Cardoso Marques. 2019. Chapter Four—The impacts
of China’s effect and globalization on the augmented energy–nexus: Evidence in four aggregated regions.
In The Extended Energy-Growth Nexus. Edited by Jose Alberto Fuinhas and António Cardoso Marques.
Cambridge: Academic Press, pp. 97–139.
Masih, Abul MM, and Rumi Masih. 1996. Energy consumption, real income and temporal causality: Results from
a multi-country study based on cointegration and error-correction modelling techniques. Energy Economics
18: 165–83. [CrossRef]
McCoskey, Suzanne, and Chihwa Kao. 1998. A residual-based test of the null of cointegration in panel data.
Econometric Reviews 17: 57–84. [CrossRef]
Menegaki, Angeliki N., ed. 2018. The Economics and Econometrics of the Energy-Growth Nexus. Cambridge: Academic Press.
Menegaki, Angeliki N., and Can Tansel Tugcu. 2016. Rethinking the energy-growth nexus: Proposing an index of
sustainable economic welfare for Sub-Saharan Africa. Energy Research and Social Science 17: 147–59. [CrossRef]
Menegaki, Angeliki N., and Can Tansel Tugcu. 2018. Two versions of the Index of Sustainable Economic Welfare
(ISEW) in the energy-growth nexus for selected Asian countries. Sustainable Production and Consumption
14: 21–35. [CrossRef]
Moon, Hyungsik Roger, and Benoit Perron. 2004. Testing for a unit root in panels with dynamic factors.
Narayan, Paresh Kumar, and Russell Smyth. 2005. Electricity consumption, employment and real income in
Australia evidence from multivariate Granger causality tests. Energy Policy 33: 1109–16. [CrossRef]
Narayan, Paresh Kumar, and Russell Smyth. 2006. Higher education, real income and real investment in China:
Evidence from granger causality tests. Education Economics 14: 107–25. [CrossRef]
Odhiambo, Nicholas M. 2009. Energy consumption and economic growth nexus in Tanzania: An ARDL bounds
testing approach. Energy Policy 37: 617–22. [CrossRef]
Pedroni, Peter. 2004. Panel cointegration: Asymptotic and finite sample properties of pooled time series tests with
an application to the PPP hypothesis. Econometric Theory 20: 597–625. [CrossRef]
Pesaran, M. Hashem, Yongcheol Shin, and Richard J. Smith. 2001. Bounds testing approaches to the analysis of
level relationships. Journal of Applied Econometrics 16: 289–326. [CrossRef]
Pesaran, M. Hashem, and Yongcheol Shin. 1999. An Autoregressive Distributed Lag Modelling Approach to
Cointegration Analysis. In Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial
Symposium. Edited by Steinar Strøm. Cambridge: Cambridge University Press.
Pesaran, M. Hashem. 2004. General Diagnostic Tests for Cross Sectional Dependence in Panels. Cambridge Working
Papers in Econometrics, No: 0435. Cambridge: Faculty of Economics, University of Cambridge.
Pesaran, M. Hashem. 2007. A simple panel unit root test in the presence of cross-section dependence. Journal of
Applied Econometrics 22: 265–312. [CrossRef]
Pesaran, M. Hashem, and Takashi Yamagata. 2005. Testing Slope Homogeneity in Large Panels (March 2005). IEPR
Working Paper No. 05.14; CESifo Working Paper No. 1438. Available online: https://ssrn.com/abstract=671050
(accessed on 6 June 2019).
Phillips, Peter CB, and Bruce E. Hansen. 1990. Statistical inference in instrumental variables regression with I(1)
processes. Review of Economic Studies 57: 99–125. [CrossRef]
Shahbaz, Muhammad. 2018. Current Issues in Time-Series Analysis for the Energy-Growth Nexus (EGN);
Asymmetries and Nonlinearities, Case Study: Pakistan. In The Economics and Econometrics of the Energy-Growth
Nexus. Edited by Angeliki N. Menegaki. Cambridge: Academic Press, p. 387.
Shin, Yongcheol, Byungchul Yu, and Matthew Greenwood-Nimmo. 2011. Modelling Asymmetric Cointegration
and Dynamic Multiplier in a Nonlinear ARDL Framework. In Festschrift in Honor of Peter Schmidt.
Rochester: SSRN.
Shin, Yongcheol, Byungchul Yu, and Matthew Greenwood-Nimmo. 2014. Modelling Asymmetric Cointegration
and Dynamic Multipliers in a Nonlinear ARDL Framework. In Festschrift in Honor of Peter Schmidt. New York:
Springer, pp. 281–314.
Toda, Hiro Y., and Taku Yamamoto. 1995. Statistical inference in vector autoregressions with possibly integrated
processes. Journal of Econometrics 66: 225–50. [CrossRef]
Tugcu, Can Tansel. 2018. Panel Data Analysis in the Energy-Growth Nexus (EGN). In The Economics and
Econometrics of the Energy-Growth Nexus. Cambridge: Academic Press, pp. 255–71.
Tursoy, Turgut, and Faisal Faisal. 2018. The impact of gold and crude oil prices on stock market in Turkey:
Empirical evidences from ARDL bounds test and combined cointegration. Resources Policy 55: 49–54.
[CrossRef]
Westerlund, Joakim. 2007. Testing for error correction in panel data. Oxford Bulletin of Economics and Statistics
69: 709–48. [CrossRef]
© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Modelling Asymmetric Cointegration and Dynamic
Multipliers in a Nonlinear ARDL Framework
Yongcheol Shin
University of York
Byungchul Yu
Department of International Trade, Dong-A University
Matthew Greenwood-Nimmo
Leeds University Business School
November 9, 2011
Abstract
This paper develops a cointegrating nonlinear ARDL (NARDL) model in which
short- and long-run nonlinearities are introduced via positive and negative partial
sum decompositions of the explanatory variables. We demonstrate that the model
is estimable by OLS and that reliable long-run inference can be achieved by bounds-
testing regardless of the integration orders of the variables. Furthermore, we derive
asymmetric dynamic multipliers that graphically depict the traverse between the
short- and the long-run. The salient features of the model are illustrated using the
examples of nonlinearities in both the unemployment-output relationship and the
adjustment of retail gasoline prices.
JEL Classification: C12, C13, J64.
Key Words: Asymmetric Cointegrating Relationships, Asymmetric Dynamic Multipli-
ers, Nonlinear ARDL (NARDL) ECM-based Estimation and Tests, Nonlinear Unemployment-
Output Relationship, Asymmetric Gasoline Price Adjustment.
Electronic copy available at: http://ssrn.com/abstract=1807745

1 Introduction
The nonlinearity of many macroeconomic variables and processes has long been recog-
nised. In a famous remark, Keynes (1936, p. 314) noted that “the substitution of a down-
ward for an upward tendency often takes place suddenly and violently, whereas there
is, as a rule, no such sharp turning point when an upward is substituted for a downward
tendency”. More recently, the joint fields of behavioural finance and economics associated
most notably with Daniel Kahneman, Amos Tversky and Robert Shiller (e.g. Kahneman
and Tversky, 1979; Shiller, 1993, 2005) have provided a considerable impetus to the mod-
elling of asymmetry, stressing that nonlinearity is endemic within the social sciences and
that asymmetry is fundamental to the human condition.
Since the mid-nineties, a substantial literature has considered the joint issues of nonsta-
tionarity and nonlinearity. This field has been dominated by three regime-switching mod-
els: the threshold ECM associated with Balke and Fomby (1997), the Markov-switching
ECM of Psaradakis et al. (2004), and the smooth transition regression ECM developed
by Kapetanios et al. (2006). The development of this literature reflects the belief that
the information revealed by linear models may be insufficiently rich to permit strong in-
ference or to yield reliable forecasts. More generally, it suggests a general concern that
the assumption of linear adjustment may be excessively restrictive in a wide range of eco-
nomically interesting situations, particularly where transaction costs are non-negligible
and where policy interventions are observed in-sample.
The majority of these studies, however, maintain the assumption that the long-run re-
lationship may be represented as a symmetric linear combination of nonstationary stochas-
tic regressors. With the notable exceptions of Park and Phillips (2001), Saikkonen and
Choi (2004), Escribano et al. (2006) and Bae and de Jong (2007), little research effort
has been devoted to the analysis of nonlinear cointegration. Schorderet (2001, 2003) has
proposed the bivariate asymmetric cointegrating regression of unemployment on output,
where output is decomposed into partial sum processes of positive and negative changes.
On the basis of this piecewise linear specification, he finds that the impact of recessions
on unemployment is larger in absolute terms than that of cyclical upturns, indicating
an hysteretic relationship. Granger and Yoon (2002) further develop the notion that the
cointegrating relationship may be defined between the positive and negative components
of the underlying variables, an effect that they term ‘hidden cointegration’.
Partial sum decompositions have been applied with some success to the analysis of dy-
namic asymmetry. Examples include Webber’s (2000) analysis of the relationship between
the exchange rate and import prices, Lee (2000) and Virén’s (2001) work on asymmetries
in Okun’s Law and the research of Borenstein et al. (1997) and Bachmeier and Grif-
fin (2003) focusing on the asymmetric response of gasoline prices to fluctuations in the
oil price. However, most papers modelling short-run asymmetry employ the two step
Engle-Granger technique which is inherently less efficient than single-step ECM estima-
tion. Moreover, papers coherently modelling long- and short-run asymmetries jointly are
scarce.
Our purpose in this paper is to develop a simple and flexible nonlinear dynamic frame-
work capable of simultaneously and coherently modelling asymmetries both in the under-
lying long-run relationship and in the patterns of dynamic adjustment. We make four
principal contributions. Firstly, we derive the dynamic error correction representation
associated with the asymmetric long-run cointegrating regression, resulting in the nonlinear
ARDL (NARDL) model. Secondly, following Pesaran and Shin (1998) and Pesaran et
al. (2001), we employ a pragmatic bounds-testing procedure for the existence of a stable
long-run relationship which is valid irrespective of whether the underlying regressors are
I(0), I(1) or mutually cointegrated. Thirdly, we derive asymmetric cumulative dynamic
multipliers that allow us to trace out the asymmetric adjustment patterns following pos-
itive and negative shocks to the explanatory variables. This has substantial theoretical
appeal as it allows us to depict in an intuitive manner the traverse to a new equilib-
rium following a perturbation to the system. Such is the flexibility of our framework
that it can readily accommodate the four general combinations of long- and short-run
asymmetry. Finally, we conduct a range of Monte Carlo experiments which largely val-
idate our estimation and inferential framework, revealing little bias in estimation and
considerable power of the key test statistics. Moreover, we compute empirical p-values
for the cointegration tests and confidence intervals for our dynamic multipliers by means
of a non-parametric bootstrap. These exercises highlight a further enviable attribute of
our proposed methodology: it is easily estimable by OLS and simple inferential methods
provide a straightforward and reliable means of discriminating between the various forms
and combinations of asymmetries.
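As an illustration of the OLS-based estimation described above, the following Python sketch fits a simple asymmetric error correction regression to simulated data and recovers the long-run coefficients on the positive and negative partial sums. The error correction form used here (given in the docstring), the recovery of the long-run coefficients as minus the level coefficient divided by the error correction coefficient, the function names, and all parameter values are assumptions of this sketch rather than a reproduction of the derivations in Section 2.

```python
import numpy as np

def partial_sums(x):
    """Cumulated positive and negative changes of x (both start at 0)."""
    dx = np.diff(x, prepend=x[0])
    return np.cumsum(np.maximum(dx, 0.0)), np.cumsum(np.minimum(dx, 0.0))

def fit_nardl_ecm(y, x):
    """OLS on the assumed error correction form
    dy_t = c + rho*y_{t-1} + th_pos*xpos_{t-1} + th_neg*xneg_{t-1}
             + pi_pos*dxpos_t + pi_neg*dxneg_t + e_t.
    """
    y = np.asarray(y, dtype=float)
    x_pos, x_neg = partial_sums(np.asarray(x, dtype=float))
    design = np.column_stack([
        np.ones(len(y) - 1), y[:-1], x_pos[:-1], x_neg[:-1],
        np.diff(x_pos), np.diff(x_neg),
    ])
    coef, *_ = np.linalg.lstsq(design, np.diff(y), rcond=None)
    _, rho, th_pos, th_neg, pi_pos, pi_neg = coef
    # Long-run coefficients implied by the level terms.
    return {"rho": rho, "beta_pos": -th_pos / rho, "beta_neg": -th_neg / rho,
            "pi_pos": pi_pos, "pi_neg": pi_neg}

# Simulated data in which negative changes have the larger long-run effect.
rng = np.random.default_rng(42)
n = 3000
x = np.cumsum(rng.standard_normal(n))
x_pos, x_neg = partial_sums(x)
y = np.zeros(n)
for t in range(1, n):
    gap = y[t - 1] - 1.0 * x_pos[t - 1] - 2.0 * x_neg[t - 1]
    y[t] = (y[t - 1] - 0.3 * gap
            + 0.4 * (x_pos[t] - x_pos[t - 1])
            + 0.8 * (x_neg[t] - x_neg[t - 1])
            + 0.1 * rng.standard_normal())

fit = fit_nardl_ecm(y, x)
print({name: round(float(value), 3) for name, value in fit.items()})
```

Because the whole regression is linear in its parameters, restrictions such as equal long-run coefficients on the two partial sums can be examined with standard Wald-type tests on the same OLS output.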
We demonstrate the usefulness of the NARDL framework through two empirical ap-
plications. First, we investigate the unemployment-output relationship in the US, Canada
and Japan over the period 1982m2–2003m11. We find strong evidence of long-run asym-
metry consistent with the growing consensus that unemployment is more sensitive to busts
than booms. Moreover, particularly in Canada, we find dynamic asymmetries indicating
that firms are quick to fire and slow to hire. Finally, the dynamic multipliers reveal a
pattern that is often obscured in discussions of persistence – although the half-life of an
expansionary shock in the US is smaller than that of an equivalent recessionary shock,
the real impact in terms of jobs created/lost is larger in the recessionary case. It fol-
lows, therefore, that focusing on the half-life of a shock is insufficient when the long-run
relationship is asymmetric as this fails to convey relevant information about the relative
magnitude of the economic response to the shock in each regime.
Our second application investigates the asymmetric responses of Korean retail gasoline
prices to fluctuations in the crude oil spot price and the Korean Won/US Dollar exchange
rate over the period 1991q1–2007q2. Our results indicate that the long-run relationship is
linear in both variables, indicating that retailers pass cost changes through to consumers
symmetrically in the long-run. However, the speed of upward adjustment exceeds that of
downward adjustment, indicating a degree of downward price-stickiness consistent with
the ‘rockets and feathers’ hypothesis associated with Bacon (1991). Moreover, our results
support the findings of Asplund et al. (2000) that the short-run response of gasoline prices
to the exchange rate is more pronounced than that associated with fluctuations in the
price of crude oil.
Finally, the flexibility and utility of the NARDL technique is reflected in the growing
literature that has adopted our technique for the analysis of a range of economic issues.1
Van Treeck (2008) has employed the NARDL model in his analysis of asymmetric wealth
effects on US consumption, and has found that liquidity constraints and loss-aversion can
be reconciled inter-temporally, with the former dominating in the short-run and the latter
in the long-run. More recently, Delatte and López-Villavicencio (2010, 2011) have applied
the NARDL technique in their analysis of long-run asymmetries in the pass-through from
exchange rates to consumer prices in developed economies. Nguyen and Shin (2010) have
estimated NARDL models on high frequency exchange rate data, revealing interesting
patterns of asymmetry in the pricing impacts of order flow. Lastly, Greenwood-Nimmo,
Shin and Van Treeck (2011) have estimated NARDL models of the interest rate pass-
through relationship in the USA and Germany, finding strong evidence of time-varying
asymmetry. An important and relatively common finding in this literature is that the
direction of asymmetry may switch between the short-run and the long-run. For example,
a positive shock may have a larger absolute effect in the short-run while a negative shock
has a larger absolute effect in the long-run (or vice-versa). The simplicity and flexibility
of NARDL renders it an ideal framework with which to model such complex phenomena.
The paper proceeds as follows. Section 2 introduces the asymmetric cointegrating
regression model and derives the associated asymptotic theory. On this basis, the NARDL
model is derived including expressions for the asymmetric cumulative dynamic multipliers,
and the associated testing procedures are developed. Section 3 employs a range of Monte
Carlo simulations to investigate the finite sample properties of the proposed estimators
and the test statistics. Section 4 presents the results of our two empirical illustrations.
Lastly, Section 5 offers some concluding remarks, while mathematical proofs are collected
in the Appendix.
2 Modelling Asymmetries in a Nonlinear ARDL Framework
The increasing popularity of nonlinear modelling in the context of cointegrating long-
run relationships has led to the proliferation of regime-switching models. Among existing
studies, nonlinearity is typically confined to the error correction mechanism and estimation
proceeds on the basis of either the threshold ECM associated with Balke and Fomby
(1997), the Markov-Switching ECM of Psaradakis et al. (2004) or the smooth transition
regression ECM developed by Kapetanios et al. (2006). However, the common assumption
that the underlying cointegrating relationship may be represented as a linear combination
of the underlying nonstationary variables may be excessively restrictive. In general, the
long-run (cointegrating) relationship may also be subject to asymmetry or nonlinearity.2
The three regime-switching type functional forms mentioned above are equally applicable
to the case of long-run asymmetry (Saikkonen and Choi, 2004; Escribano et al., 2006).
In principle, it is possible to obtain a unified model capable of combining nonlinearities
in the long-run relationship and the error correction mechanism coherently. In practice,
however, selection of the regime-switching variables and the transition functional forms
may be non-trivial.3 Hence, the development of an operational model of this form is
likely to be highly challenging (c.f. Saikkonen, 2008). We contribute to this literature by
developing a nonlinear modelling framework based on the ARDL approach which provides
a simple and flexible vehicle for the analysis of joint long- and short-run asymmetries.
2.1 Nonlinear Asymmetric Cointegration
Before developing the full representation of the NARDL model, we introduce the following
asymmetric long-run regression:
\[ y_t = \beta^{+} x_t^{+} + \beta^{-} x_t^{-} + u_t, \tag{2.1} \]
\[ \Delta x_t = v_t, \tag{2.2} \]
where $y_t$ and $x_t$ are scalar I(1) variables, and $x_t$ is decomposed as $x_t = x_0 + x_t^{+} + x_t^{-}$, where $x_t^{+}$ and $x_t^{-}$ are partial sum processes of positive and negative changes in $x_t$:
\[ x_t^{+} = \sum_{j=1}^{t} \Delta x_j^{+} = \sum_{j=1}^{t} \max(\Delta x_j, 0), \qquad x_t^{-} = \sum_{j=1}^{t} \Delta x_j^{-} = \sum_{j=1}^{t} \min(\Delta x_j, 0). \tag{2.3} \]
This simple approach to modelling asymmetric cointegration based on partial sum decom-
positions has been applied by Schorderet (2001) in the context of the nonlinear relationship
between unemployment and output.4
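The decomposition in (2.3) can be illustrated with a minimal sketch. The function name `partial_sums` and the sample series are our own illustrative choices and do not come from the studies cited above; the first observation plays the role of $x_0$.

```python
from itertools import accumulate


def partial_sums(x):
    """Return the cumulative positive and negative changes of x, as in (2.3)."""
    dx = [b - a for a, b in zip(x, x[1:])]              # first differences
    x_pos = list(accumulate(max(d, 0.0) for d in dx))   # running sum of increases
    x_neg = list(accumulate(min(d, 0.0) for d in dx))   # running sum of decreases
    return x_pos, x_neg


# Illustrative data only; x[0] plays the role of x_0.
x = [1.0, 3.0, 2.0, 2.0, 5.0, 4.0]
x_pos, x_neg = partial_sums(x)
```

By construction the two partial sums add back to the original series, so that `x[t] == x[0] + x_pos[t-1] + x_neg[t-1]` for every `t >= 1`.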
Granger and Yoon (2002) advance the concept of ‘hidden cointegration’, where coin-
tegrating relationships may be defined between the positive and negative components of
the underlying variables. They demonstrate the relevance of this conceptual framework in
the context of the linkage between US short- and long-term interest rates and the output-
unemployment relationship, both of which are notable for the lack of robust evidence of
linear cointegration. Schorderet (2003) generalises this concept and defines the following
stationary linear combination of the partial sum components:
\[ z_t = \beta_0^{+} y_t^{+} + \beta_0^{-} y_t^{-} + \beta_1^{+} x_t^{+} + \beta_1^{-} x_t^{-}. \tag{2.4} \]
If zt is stationary, then yt and xt are said to be ‘asymmetrically cointegrated’. It follows
that standard linear (symmetric) cointegration is a special case of (2.4), obtained only
if β0+ = β0− and β1+ = β1− . Schorderet modifies (2.4) to analyse hidden cointegration,
where only one component of each series appears in (2.4), developing a model of the
asymmetric cointegrating relationship between bilateral exchange rates as an illustration.
Lardic and Mignon (2008) analyse hidden cointegration between the price of oil and GDP,
although they fail to provide any economically meaningful interpretation of the estimated
asymmetric coefficients.
Given the difficulty in interpreting the results of hidden cointegration analysis, we will focus on (2.1), imposing the restriction $\beta_0^{+} = \beta_0^{-} = \beta_0$ in (2.4) such that $\beta^{+} = -\beta_1^{+}/\beta_0$ and $\beta^{-} = -\beta_1^{-}/\beta_0$. To achieve the greatest possible clarity of exposition, we begin with the case of a single regressor decomposed into the relevant partial sum processes.
Assumption 1 The disturbances ut and vt in (2.1) and (2.2) follow iid processes with
zero means and finite variances, and they are independently distributed.
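Theorem 1 Consider the asymmetric cointegrating regression, (2.1) and (2.2). Under Assumption 1, the OLS estimators of $\beta^{+}$ and $\beta^{-}$ have the following asymptotic distributions:
\[ T(\hat{\beta}^{+} - \beta^{+}) \Rightarrow -\frac{\mu^{-}\sigma_u}{\sigma_s}\, \frac{\frac{1}{3}\int W_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int rW_{\tilde{s}}(r)\,dr\left(W_{\tilde{u}}(1) - \int W_{\tilde{u}}(r)\,dr\right)}{\frac{1}{3}\int W_{\tilde{s}}(r)^{2}\,dr - \left(\int rW_{\tilde{s}}(r)\,dr\right)^{2}}, \]
\[ T(\hat{\beta}^{-} - \beta^{-}) \Rightarrow \frac{\mu^{+}\sigma_u}{\sigma_s}\, \frac{\frac{1}{3}\int W_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int rW_{\tilde{s}}(r)\,dr\left(W_{\tilde{u}}(1) - \int W_{\tilde{u}}(r)\,dr\right)}{\frac{1}{3}\int W_{\tilde{s}}(r)^{2}\,dr - \left(\int rW_{\tilde{s}}(r)\,dr\right)^{2}}, \]
where $\mu^{+} := E[\max[0, v_t]]$, $\mu^{-} := E[\min[0, v_t]]$, $s_t := \mu^{+}\min[0, v_t] - \mu^{-}\max[0, v_t]$, $\sigma_u^{2} := Var(u_t)$, $\sigma_s^{2} := Var(s_t)$, and $W_{\tilde{s}}(\cdot)$ and $W_{\tilde{u}}(\cdot)$ are two independent standard Brownian motions defined on $r \in [0, 1]$, and obtained as the weak limit of partial sum processes, $T^{-1/2}\sum_{j=1}^{T(\cdot)} \tilde{s}_j$ and $T^{-1/2}\sum_{j=1}^{T(\cdot)} \tilde{u}_j$, with $\tilde{u}_t := u_t/\sigma_u$ and $\tilde{s}_t := s_t/\sigma_s$. Furthermore, $T\{\mu^{+}(\hat{\beta}^{+} - \beta^{+}) + \mu^{-}(\hat{\beta}^{-} - \beta^{-})\} = o_p(1)$.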
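Remark 1 In the special case when $v_t$ follows a symmetric distribution with $(\mu^{+})^{2} = (\mu^{-})^{2}$ and $Var(\max[0, v_t]) = Var(\min[0, v_t])$,5 then we have
\[ T(\hat{\beta}^{-} - \beta^{-}),\; T(\hat{\beta}^{+} - \beta^{+}) \Rightarrow \frac{\frac{1}{3}\int W_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int rW_{\tilde{s}}(r)\,dr\left(W_{\tilde{u}}(1) - \int W_{\tilde{u}}(r)\,dr\right)}{\frac{1}{3}\int W_{\tilde{s}}(r)^{2}\,dr - \left(\int rW_{\tilde{s}}(r)\,dr\right)^{2}}, \]
such that $T\{(\hat{\beta}^{-} - \beta^{-}) + (\hat{\beta}^{+} - \beta^{+})\} = o_p(1)$.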
Remark 2 Let $\beta = (\beta^{+}, \beta^{-})'$, then
\[ T(\hat{\beta} - \beta) \overset{a}{\sim} MN(0, V), \tag{2.5} \]
where $V = \operatorname{plim}_{T\to\infty} T^{2}(X'X)^{-1}\sigma_u^{2}$. Even though $x_t^{+}$ and $x_t^{-}$ are dominated by the deterministic trends by construction, these leading terms cancel in the derivation of $(X'X)^{-1}$ such that $\operatorname{plim}_{T\to\infty} T^{2}(X'X)^{-1}$ is well-defined and standard inference on $\beta$ remains asymptotically valid.
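Remark 3 In a similar manner, when an intercept term is included, we can obtain the asymptotic distributions of the OLS estimator as follows:
\[ T(\hat{\beta}^{+} - \beta^{+}) \Rightarrow -\frac{\mu^{-}\sigma_u}{\sigma_s}\, \frac{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int (r - \tfrac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr \int (r - \tfrac{1}{2})\,dW_{\tilde{u}}(r)}{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)^{2}\,dr - \left(\int (r - \tfrac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr\right)^{2}}; \]
\[ T(\hat{\beta}^{-} - \beta^{-}) \Rightarrow \frac{\mu^{+}\sigma_u}{\sigma_s}\, \frac{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)\,dW_{\tilde{u}}(r) - \int (r - \tfrac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr \int (r - \tfrac{1}{2})\,dW_{\tilde{u}}(r)}{\frac{1}{12}\int \tilde{W}_{\tilde{s}}(r)^{2}\,dr - \left(\int (r - \tfrac{1}{2})\tilde{W}_{\tilde{s}}(r)\,dr\right)^{2}}; \]
and $T\{\mu^{+}(\hat{\beta}^{+} - \beta^{+}) + \mu^{-}(\hat{\beta}^{-} - \beta^{-})\} = o_P(1)$, where $\tilde{W}_{\tilde{s}}(r) := W_{\tilde{s}}(r) - \int W_{\tilde{s}}(r)\,dr$ for $r \in [0, 1]$.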
2.2 The Nonlinear ARDL Model
The simple case presented above is useful for exposition and will certainly cover some
empirical applications. However, it is too restrictive since it does not allow for weak en-
dogeneity of the regressors and/or serially correlated errors, factors that will significantly
affect both the asymptotic and the small sample properties of the estimators. In their
presence, the OLS estimator in (2.1) may remain super-consistent but the asymptotic dis-
tribution is non-Gaussian. Hence, hypothesis testing cannot be carried out in the usual
manner without removing both the serial correlation and the endogeneity of the regres-
sors. In particular, the resulting OLS estimator of the cointegrating parameter will be
poorly determined in finite samples.
In the linear cointegration literature, several solutions to these twin problems have
been proposed in the context of the static regression model (Phillips and Hansen, 1990;
Saikkonen, 1991) and the dynamic regression model (Pesaran and Shin, 1998). Given
that our interest is in developing a fully dynamic model, we naturally choose to extend
the ARDL approach popularised by Pesaran and Shin (1998) and Pesaran et al. (2001),
thereby developing a flexible dynamic parametric framework with which to model rela-
tionships that exhibit combined long- and short-run asymmetries.6
To this end we consider the following nonlinear ARDL(p, q) model:
\[ y_t = \sum_{j=1}^{p}\phi_j y_{t-j} + \sum_{j=0}^{q}\left(\theta_j^{+\prime}x_{t-j}^{+} + \theta_j^{-\prime}x_{t-j}^{-}\right) + \varepsilon_t, \tag{2.6} \]
where $x_t$ is a $k \times 1$ vector of multiple regressors defined such that $x_t = x_0 + x_t^{+} + x_t^{-}$, $\phi_j$ is the autoregressive parameter, $\theta_j^{+}$ and $\theta_j^{-}$ are the asymmetric distributed-lag parameters, and $\varepsilon_t$ is an iid process with zero mean and constant variance, $\sigma_\varepsilon^{2}$. Throughout this paper we will focus on the case in which $x_t$ is decomposed into $x_t^{+}$ and $x_t^{-}$ around
a threshold of zero, thereby distinguishing between positive and negative changes in the
rate of growth of xt . The resulting partial sum processes maintain an intuitively appealing
and economically meaningful interpretation in a wide range of applications.7
Following Pesaran et al. (2001), it is straightforward to rewrite (2.6) in the error
correction form as
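\[ \begin{aligned} \Delta y_t &= \rho y_{t-1} + \theta^{+\prime}x_{t-1}^{+} + \theta^{-\prime}x_{t-1}^{-} + \sum_{j=1}^{p-1}\gamma_j \Delta y_{t-j} + \sum_{j=0}^{q-1}\left(\varphi_j^{+\prime}\Delta x_{t-j}^{+} + \varphi_j^{-\prime}\Delta x_{t-j}^{-}\right) + \varepsilon_t \\ &= \rho\xi_{t-1} + \sum_{j=1}^{p-1}\gamma_j \Delta y_{t-j} + \sum_{j=0}^{q-1}\left(\varphi_j^{+\prime}\Delta x_{t-j}^{+} + \varphi_j^{-\prime}\Delta x_{t-j}^{-}\right) + \varepsilon_t \end{aligned} \tag{2.7} \]
where $\rho = \sum_{j=1}^{p}\phi_j - 1$, $\gamma_j = -\sum_{i=j+1}^{p}\phi_i$ for $j = 1, \ldots, p-1$, $\theta^{+} = \sum_{j=0}^{q}\theta_j^{+}$, $\theta^{-} = \sum_{j=0}^{q}\theta_j^{-}$, $\varphi_0^{+} = \theta_0^{+}$, $\varphi_j^{+} = -\sum_{i=j+1}^{q}\theta_i^{+}$ for $j = 1, \ldots, q-1$, $\varphi_0^{-} = \theta_0^{-}$, $\varphi_j^{-} = -\sum_{i=j+1}^{q}\theta_i^{-}$ for $j = 1, \ldots, q-1$, and $\xi_t = y_t - \beta^{+\prime}x_t^{+} - \beta^{-\prime}x_t^{-}$ is the nonlinear error correction term where $\beta^{+} = -\theta^{+}/\rho$ and $\beta^{-} = -\theta^{-}/\rho$ are the associated asymmetric long-run parameters.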
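The mapping from the levels coefficients in (2.6) to the error correction parameters in (2.7) can be sketched as follows for the scalar-regressor case with $q \geq 1$. The function name `ardl_to_ecm`, the dictionary keys and the numerical values are hypothetical and serve only to illustrate the algebra.

```python
def ardl_to_ecm(phi, theta_pos, theta_neg):
    """Map ARDL-in-levels coefficients to ECM parameters (scalar regressor, q >= 1).

    phi       : [phi_1, ..., phi_p]
    theta_pos : [theta_0^+, ..., theta_q^+]
    theta_neg : [theta_0^-, ..., theta_q^-]
    """
    p, q = len(phi), len(theta_pos) - 1
    rho = sum(phi) - 1.0
    # gamma_j = -sum_{i=j+1}^{p} phi_i for j = 1, ..., p-1
    gamma = [-sum(phi[j:]) for j in range(1, p)]

    def short_run(theta):
        # varphi_0 = theta_0; varphi_j = -sum_{i=j+1}^{q} theta_i for j = 1, ..., q-1
        return [theta[0]] + [-sum(theta[j + 1:]) for j in range(1, q)]

    th_pos, th_neg = sum(theta_pos), sum(theta_neg)
    return {
        "rho": rho,
        "gamma": gamma,
        "theta_pos": th_pos,
        "theta_neg": th_neg,
        "varphi_pos": short_run(theta_pos),
        "varphi_neg": short_run(theta_neg),
        "beta_pos": -th_pos / rho,   # long-run effect of positive changes
        "beta_neg": -th_neg / rho,   # long-run effect of negative changes
    }


# Hypothetical ARDL(2, 2) coefficients, chosen only for illustration.
ecm = ardl_to_ecm([0.5, 0.3], [0.4, 0.1, 0.1], [0.2, 0.1, 0.0])
```

With these illustrative values the error correction coefficient is approximately -0.2 and the implied long-run parameters are approximately 3.0 and 1.5, so the same autoregressive dynamics carry two different long-run responses.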
To further deal with the possibility of non-zero contemporaneous correlation between
the regressors and the residuals in (2.7) we now consider the following reduced form data
generating process for ∆xt :8

\[ \Delta x_t = \sum_{j=1}^{q-1}\Lambda_j \Delta x_{t-j} + v_t, \tag{2.8} \]
where v t ∼ iid (0, Σv ), with Σv being a k × k positive definite covariance matrix. Given
our focus on conditional modelling, we may express εt conditionally in terms of v t as:
\[ \varepsilon_t = \omega' v_t + e_t = \omega'\left(\Delta x_t - \sum_{j=1}^{q-1}\Lambda_j \Delta x_{t-j}\right) + e_t \tag{2.9} \]
where et is uncorrelated with v t by construction. Substituting (2.9) into (2.7) and rear-
ranging it, we finally obtain the following conditional nonlinear ECM:

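\[ \Delta y_t = \rho\xi_{t-1} + \sum_{j=1}^{p-1}\gamma_j \Delta y_{t-j} + \sum_{j=0}^{q-1}\left(\pi_j^{+\prime}\Delta x_{t-j}^{+} + \pi_j^{-\prime}\Delta x_{t-j}^{-}\right) + e_t \tag{2.10} \]
where $\pi_0^{+} = \theta_0^{+} + \omega$, $\pi_0^{-} = \theta_0^{-} + \omega$, $\pi_j^{+} = \varphi_j^{+} - \omega'\Lambda_j$ and $\pi_j^{-} = \varphi_j^{-} - \omega'\Lambda_j$ for $j = 1, \ldots, q-1$.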
It is clear that (2.10) corrects perfectly for the weak endogeneity of any nonstationary
explanatory variables and that the choice of an appropriate lag structure will render the
model free from residual serial correlation. Our model combines many of the desirable
attributes of the fully-modified and the ARDL-based dynamic corrections associated respectively with Phillips and Hansen (1990) and Pesaran and Shin (1998) in a dynamic parametric framework capable of modelling both long- and short-run asymmetries. Moreover, since our model is linear in all the parameters including $\theta^{+}$, $\theta^{-}$, $\pi_i^{+}$ and $\pi_i^{-}$, reliable estimation of (2.10) can be achieved by standard OLS.
Following the conditions used in the derivations above, we now summarise the following
assumption in the context of the NARDL-based ECM, (2.10):
Assumption 2 (i) et ∼ iid(0, σe2 ); (ii) xt is a k × 1 vector of I(1) regressors given by
(2.8); (iii) et is uncorrelated with v t through the conditional modelling, (2.9); (iv) the
condition, ρ < 0 guarantees that the model is dynamically stable.
Following Theorems 3.1 and 3.2 in Pesaran and Shin (1998), it is straightforward to
show under Assumption 2 that: (i) the OLS estimators of all the short-run dynamic parameters in (2.10) are $\sqrt{T}$-consistent and have the asymptotic normal distribution, and (ii) the OLS estimators of the long-run parameters computed as $\hat{\beta}^{+} = -\hat{\theta}^{+}/\hat{\rho}$ and $\hat{\beta}^{-} = -\hat{\theta}^{-}/\hat{\rho}$, are $T$-consistent and follow the mixture normal distribution as defined in Theorem 1. Hence, the null hypotheses of a symmetric long-run relationship $\beta^{+} = \beta^{-}$

or symmetric short-run coefficients can be tested using the Wald statistic following an
asymptotic χ2 distribution. In order to assess the extent to which these theoretical pre-
dictions are validated in both large and small samples, we will conduct a series of Monte
Carlo experiments in Section 3.
2.3 Bounds-Testing the Asymmetric Long-Run Relationship
We develop two operational testing procedures for the existence of an asymmetric (cointegrating) long-run relationship based on the NARDL ECM, (2.10). If $\rho = 0$, (2.10) reduces to the regression involving only first differences, implying that there is no long-run relationship between the levels of $y_t$, $x_t^{+}$ and $x_t^{-}$. We first follow Banerjee et al. (1998) and propose the t-statistic testing $\rho = 0$ against $\rho < 0$ in (2.10). Next, we follow Pesaran, Shin and Smith (2001) and propose an F-test of the joint null, $\rho = \theta^{+} = \theta^{-} = 0$ in (2.10). We denote these tests $t_{BDM}$ and $F_{PSS}$, respectively.
The asymptotic distributions of these test statistics are non-standard under their respective null hypotheses and their exact asymptotic distributions are generally complicated to derive due to the complex dependence structure between $x_t^{+}$ and $x_t^{-}$, especially when the means of $\Delta y_t$ and $\Delta x_t$ are non-zero.9 In light of these difficulties, we propose the use of the pragmatic ‘bounds-testing’ approach advanced by Pesaran et al. (2001). Two extreme cases can be identified, one in which the level regressors $x_t^{+}$ and $x_t^{-}$ in (2.10) are all I(1), and the other in which they are all I(0). It follows that critical values tabulated for these two scenarios provide critical value bounds for all classifications, irrespective of whether the regressors are I(0), I(1) or mutually cointegrated. This approach is particularly useful in the current context due to the various dependence structures (including cointegration) that may exist between $x_t^{+}$ and $x_t^{-}$. Following Pesaran et al. (2001), we differentiate between five cases of (2.10) for the $F_{PSS}$ statistic: (i) without intercept or linear trend; (ii) with restricted intercept only; (iii) with unrestricted intercept only; (iv) with intercept and restricted linear trend; and (v) with intercept and unrestricted linear trend. Similarly, for the $t_{BDM}$ statistic we differentiate between cases (i), (iii) and (v). Pesaran et al. (2001) tabulate the critical value bounds for both the $F_{PSS}$ and $t_{BDM}$ statistics under each of these cases for a range of values of $k$, the number of regressors entering the long-run relationship.
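The decision rule implied by the bounds-testing approach for the F-statistic can be sketched as below. The function name `bounds_test_decision` is our own, and the critical value bounds are passed in as arguments because they must be taken from the tabulations in Pesaran et al. (2001) for the relevant case, significance level and value of $k$; no tabulated values are reproduced here.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Classify an F-statistic against the I(0) (lower) and I(1) (upper) critical value bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "reject"          # evidence of a long-run relationship
    if f_stat < lower_bound:
        return "do not reject"   # no evidence of a long-run relationship
    return "inconclusive"        # statistic falls between the two bounds


# The bounds 4.0 and 5.0 are hypothetical placeholders, not tabulated critical values.
decision = bounds_test_decision(6.1, 4.0, 5.0)
```

A statistic falling between the two bounds leaves the outcome dependent on the integration order of the regressors, which is why the rule returns a separate inconclusive outcome.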
In the context of the NARDL model, due to the dependence structure that exists between the partial sum decompositions $x_t^{+}$ and $x_t^{-}$, the exact value of $k$ is not clear. In the simplest case where the long-run relationship is defined between $y_t$, $x_t^{+}$ and $x_t^{-}$, it
follows that the true value of k lies between 1 and 2.10 In general, we expect that the test
will be modestly undersized using k = 1 and similarly oversized with k = 2. Employing
the k = 1 critical values results in a more conservative test (a higher critical value) so, at
a pragmatic level, rejecting the null of no long-run relationship using these critical values
provides strong evidence of the existence of a long-run relationship. The mis-sizing of
the test can be readily resolved by bootstrapping, although in practice we find that the
pragmatic approach typically leads to the same conclusion. This observation is reinforced
below by a series of Monte Carlo simulation experiments designed to evaluate the finite
sample properties of the PSS test and the associated bootstrapping routine.
2.4 Asymmetric Dynamic Multipliers
It is straightforward to derive asymmetric dynamic multipliers associated with unit changes in $x_t^{+}$ and $x_t^{-}$, respectively, on $y_t$. Consider the ARDL-in-levels representation of (2.10):
\[ \phi(L) y_t = \theta^{+}(L) x_t^{+} + \theta^{-}(L) x_t^{-} + e_t, \tag{2.11} \]
where $\phi(L) = 1 - \sum_{i=1}^{p-1}\phi_i L^{i}$, $\theta^{+}(L) = \sum_{i=0}^{q}\theta_i^{+} L^{i}$, and $\theta^{-}(L) = \sum_{i=0}^{q}\theta_i^{-} L^{i}$.11 Premultiplying (2.11) by the inverse of $\phi(L)$, we obtain:
\[ y_t = \lambda^{+}(L) x_t^{+} + \lambda^{-}(L) x_t^{-} + [\phi(L)]^{-1} e_t, \tag{2.12} \]
where $\lambda^{+}(L) = \sum_{j=0}^{\infty}\lambda_j^{+}L^{j} = \phi(L)^{-1}\theta^{+}(L)$ and $\lambda^{-}(L) = \sum_{j=0}^{\infty}\lambda_j^{-}L^{j} = \phi(L)^{-1}\theta^{-}(L)$.12 The cumulative dynamic multiplier effects of $x_t^{+}$ and $x_t^{-}$ on $y_t$ can be evaluated as follows:
\[ m_h^{+} = \sum_{j=0}^{h}\frac{\partial y_{t+j}}{\partial x_t^{+}} = \sum_{j=0}^{h}\lambda_j^{+}, \qquad m_h^{-} = \sum_{j=0}^{h}\frac{\partial y_{t+j}}{\partial x_t^{-}} = \sum_{j=0}^{h}\lambda_j^{-}, \qquad h = 0, 1, 2, \ldots \tag{2.13} \]
Notice that, by construction, as $h \to \infty$, $m_h^{+} \to \beta^{+}$ and $m_h^{-} \to \beta^{-}$, where $\beta^{+} = -\theta^{+}/\rho$ and $\beta^{-} = -\theta^{-}/\rho$ are the asymmetric long-run coefficients. There is little reason to believe that the dynamic adjustment patterns summarised by $m_h^{+}$ and $m_h^{-}$ should generally
be symmetric. Therefore, even though we do not directly model asymmetric error cor-
rection (i.e. we do not allow for regime-dependency of ρ in (2.10)) we may still observe
asymmetric adjustment paths and/or duration of the disequilibrium. This highlights an
important feature of the NARDL model. In the interest of clarity, when discussing asym-
metry we tend to distinguish only between long- and short-run asymmetries. However,
the NARDL model in fact admits three general forms of asymmetry: (i) long-run or reaction asymmetry, associated with $\beta^{+} \neq \beta^{-}$; (ii) impact asymmetry, associated with the inequality of the coefficients on the contemporaneous first differences $\Delta x_t^{+}$ and $\Delta x_t^{-}$; (iii)
adjustment asymmetry, captured by the patterns of adjustment from initial equilibrium
to the new equilibrium following an economic perturbation (i.e. the dynamic multipliers).
Adjustment asymmetry derives from the interaction of impact and reaction asymmetries
in conjunction with the error correction coefficient, ρ.
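The cumulative dynamic multipliers in (2.13) can be computed recursively from the lag polynomials in (2.11), since $\phi(L)\lambda(L) = \theta(L)$ implies $\lambda_j = \theta_j + \sum_{i \geq 1}\phi_i\lambda_{j-i}$. The sketch below assumes a scalar regressor; the function name `cumulative_multipliers` and the coefficient values are hypothetical.

```python
def cumulative_multipliers(phi, theta, horizon):
    """Cumulative multipliers m_0, ..., m_horizon for phi(L) y_t = theta(L) x_t + e_t.

    phi   : [phi_1, ..., phi_p], autoregressive coefficients
    theta : [theta_0, ..., theta_q], distributed-lag coefficients of one partial sum
    """
    lam, m, running = [], [], 0.0
    for j in range(horizon + 1):
        value = theta[j] if j < len(theta) else 0.0
        # lambda_j = theta_j + sum_{i=1}^{min(j, p)} phi_i * lambda_{j-i}
        value += sum(phi[i - 1] * lam[j - i] for i in range(1, min(j, len(phi)) + 1))
        lam.append(value)
        running += value
        m.append(running)
    return m


# Hypothetical coefficients: common autoregressive part, different distributed lags.
phi = [0.5, 0.3]
m_pos = cumulative_multipliers(phi, [0.4, 0.1, 0.1], 200)
m_neg = cumulative_multipliers(phi, [0.2, 0.1, 0.0], 200)
```

In this illustration the two paths start from different impact effects (0.4 and 0.2) and settle at different long-run values (approximately 3.0 and 1.5), so all three forms of asymmetry described above are present even though the autoregressive coefficients are shared.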
In practice, the patterns of dynamic adjustment will depend on the model specification.
Four distinct cases can be identified: the unrestricted specification, (2.10), accommodating
asymmetries in both the short- and long-run and three restricted specifications obtained by
imposing short- and long-run symmetry restrictions in (2.10), either separately or jointly.
An early study by Borenstein et al. (1997) investigates short-run dynamic asymmetries in
the response of retail gasoline prices to fluctuations in the price of crude oil by implicitly
imposing the long-run symmetry restrictions θ + = θ − = θ such that (2.10) simplifies to13
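\[ \Delta y_t = \rho y_{t-1} + \theta x_{t-1} + \sum_{i=1}^{p-1}\gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1}\left(\pi_i^{+}\Delta x_{t-i}^{+} + \pi_i^{-}\Delta x_{t-i}^{-}\right) + e_t. \tag{2.14} \]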
Models of this form have also been employed by Shirvani and Wilbratte (2000) and Apergis
and Miller (2006) in their analysis of short-run asymmetric wealth effects on consumption
due to liquidity constraints.
Short-run symmetry restrictions can take either of two forms: (i) $\pi_i^{+} = \pi_i^{-}$ for all $i = 0, \ldots, q-1$ or (ii) $\sum_{i=0}^{q-1}\pi_i^{+} = \sum_{i=0}^{q-1}\pi_i^{-}$. When imposing such restrictions in the presence of an asymmetric long-run relationship, we obtain:14
\[ \Delta y_t = \rho y_{t-1} + \theta^{+}x_{t-1}^{+} + \theta^{-}x_{t-1}^{-} + \sum_{i=1}^{p-1}\gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1}\pi_i \Delta x_{t-i} + e_t. \tag{2.15} \]
Finally, the most restrictive specification is obtained when assuming linearity of the long-run relationship in conjunction with symmetric short-run adjustment:
\[ \Delta y_t = \rho y_{t-1} + \theta x_{t-1} + \sum_{i=1}^{p-1}\gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1}\pi_i \Delta x_{t-i} + e_t. \tag{2.16} \]
It is clear that (2.14), (2.15) and (2.16) are special cases of the unrestricted specifica-
tion described by (2.10) and that the long- and short-run symmetry restrictions can be
easily tested in the usual manner following our proposed methodology. Our early experi-
mentation with the model, as well as the results adduced in Van Treeck (2008), Nguyen
and Shin (2010) and Greenwood-Nimmo, Shin and Van Treeck (2011), suggest that the
dynamic multipliers obtained from the various cases are generally significantly different from one another. Moreover, it is generally the case that the results of linear estimation
are profoundly misleading when the underlying relationship is, in fact, asymmetric. This
will become apparent during the discussion of our empirical illustrations in Section 4.
A simple and useful addition to the general typology developed above is the extension
to the case where a subset of regressors enters the long-run relationship symmetrically:15
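\[ y_t = \beta^{+\prime}x_t^{+} + \beta^{-\prime}x_t^{-} + \gamma' w_t + u_t, \tag{2.17} \]
where $x_t = x_0 + x_t^{+} + x_t^{-}$ is a $k \times 1$ vector of regressors entering the model asymmetrically and $w_t$ is a $g \times 1$ vector of regressors entering symmetrically. Extending the concept of partial asymmetry to both the long- and short-run within our NARDL model, we obtain:
\[ \begin{aligned} \Delta y_t &= \rho y_{t-1} + \theta^{+}x_{t-1}^{+} + \theta^{-}x_{t-1}^{-} + \theta_w w_{t-1} \\ &\quad + \sum_{i=1}^{p-1}\gamma_i \Delta y_{t-i} + \sum_{i=0}^{q-1}\left(\pi_i^{+}\Delta x_{t-i}^{+} + \pi_i^{-}\Delta x_{t-i}^{-} + \pi_{w,i}\Delta w_{t-i}\right) + e_t. \end{aligned} \tag{2.18} \]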
In light of the bounds-testing approach employed above, it follows that estimation and
inference proceed exactly as before, irrespective of whether xt and wt are I(0), I(1) or
mutually cointegrated. Furthermore, it is once again clear that this partially asymmetric
form represents a special case of (2.10).
3 Finite Sample Properties
In order to investigate the finite sample properties of the estimators we conduct a range of
Monte Carlo experiments based on the following simple data generating process (DGP):
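\[ \Delta y_t = a + \rho\left(y_{t-1} - \beta^{+}x_{t-1}^{+} - \beta^{-}x_{t-1}^{-}\right) + \varphi^{+}\Delta x_t^{+} + \varphi^{-}\Delta x_t^{-} + u_t, \tag{3.19} \]
where $\Delta x_t = \varepsilon_t$, and $(u_t, \varepsilon_t)$ are serially uncorrelated and are generated according to the following bivariate normal distribution:
\[ \begin{pmatrix} u_t \\ \varepsilon_t \end{pmatrix} \sim N\left(0,\ \Omega = \begin{pmatrix} 1 & \omega \\ \omega & 1 \end{pmatrix}\right). \tag{3.20} \]
Notice that when $\omega \neq 0$, (3.19) can be estimated by:
\[ \Delta y_t = a + \rho y_{t-1} + \theta^{+}x_{t-1}^{+} + \theta^{-}x_{t-1}^{-} + \pi^{+}\Delta x_t^{+} + \pi^{-}\Delta x_t^{-} + e_t, \tag{3.21} \]
where $\pi^{+} = \varphi^{+} + \omega$ and $\pi^{-} = \varphi^{-} + \omega$ and the long run parameters are defined as $\hat{\beta}^{+} = -\hat{\theta}^{+}/\hat{\rho}$ and $\hat{\beta}^{-} = -\hat{\theta}^{-}/\hat{\rho}$.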
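A single replication of this design can be sketched in plain Python as follows. The sketch makes several simplifying assumptions of our own: it sets $\omega = 0$ (so $u_t$ and $\varepsilon_t$ are drawn independently), starts all series at zero, and solves the OLS normal equations directly. The function names `simulate_nardl`, `ols` and `estimate_nardl` are hypothetical.

```python
import random


def simulate_nardl(T, rho, beta_pos, beta_neg, phi_pos, phi_neg, a=0.0, sigma_u=1.0, seed=0):
    """Generate y and the partial sums of x from a DGP of the form (3.19), with omega = 0."""
    rng = random.Random(seed)
    y, xp, xn = [0.0], [0.0], [0.0]
    for _ in range(T):
        dx = rng.gauss(0.0, 1.0)
        dxp, dxn = max(dx, 0.0), min(dx, 0.0)
        ect = y[-1] - beta_pos * xp[-1] - beta_neg * xn[-1]   # lagged error correction term
        dy = a + rho * ect + phi_pos * dxp + phi_neg * dxn + rng.gauss(0.0, sigma_u)
        y.append(y[-1] + dy)
        xp.append(xp[-1] + dxp)
        xn.append(xn[-1] + dxn)
    return y, xp, xn


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination with pivoting."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [v - f * w for v, w in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def estimate_nardl(y, xp, xn):
    """Estimate a regression of the form (3.21) by OLS and recover the long-run parameters."""
    X, dy = [], []
    for t in range(1, len(y)):
        X.append([1.0, y[t - 1], xp[t - 1], xn[t - 1], xp[t] - xp[t - 1], xn[t] - xn[t - 1]])
        dy.append(y[t] - y[t - 1])
    const, rho, th_p, th_n, pi_p, pi_n = ols(X, dy)
    return {"a": const, "rho": rho, "beta_pos": -th_p / rho, "beta_neg": -th_n / rho,
            "pi_pos": pi_p, "pi_neg": pi_n}


# One draw from the baseline parameterisation (rho = -0.2, beta+ = 0.5, beta- = 1.0).
sample = simulate_nardl(400, -0.2, 0.5, 1.0, 0.5, 1.0, seed=42)
baseline_fit = estimate_nardl(*sample)
```

Repeating the last two lines over many seeds and summarising the estimates would give a small-scale analogue of the bias and dispersion statistics discussed below; allowing $\omega \neq 0$ would require drawing $u_t$ and $\varepsilon_t$ jointly from (3.20).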
We experiment with a wide variety of parameterisations of (3.19) and (3.20). Specif-
ically, under the assumptions that a = 0, β + = 0.5 and ϕ+ = 0.5, and denoting
β − = β + +δβ and ϕ− = ϕ+ +δϕ , we experiment with an array of combinations of the follow-
ing parameters: ρ ∈ (−0.05, −0.1, −0.2), δβ ∈ (0.1, 0.2, 0.25, 0.5), δϕ ∈ (0.1, 0.2, 0.25, 0.5),
ω ∈ (−0.5, 0, 0.5), and T ∈ (100, 200, 400). Due to space constraints, we are unable to
report the results of all of these simulations herein16 . Rather, we summarise the key
findings that arise across these parameterisations and report in detail the results from a
baseline case in which we use ρ = −0.2, δβ = 0.5 and δϕ = 0.5, and where ω and T vary
over the ranges defined above.
In Table 1 we report a range of summary statistics for the parameter estimates based
on our simulations using 3,000 replications of our baseline case. We note that the bias
and error in the estimation of each of the parameters is largely negligible (this also holds
under the other parameterisations of the DGP that we consider). The only exception to
this generalisation is the error correction parameter, which shows a modest downward
bias especially when T ≤ 100. However, this observation is not unexpected given the
well-documented downward bias associated with the estimation of AR(1) coefficients in
time series models.
– TABLE 1 ABOUT HERE –
We also investigate the finite sample size and power of the Wald statistics for the null hypothesis of long-run symmetry $H_{LR}^{S}: \beta^{+} = \beta^{-}$ and the null of symmetric short-run dynamics $H_{SR}^{S}: \pi^{+} = \pi^{-}$. To this end, we consider the model for $H_{LR}^{S}$:
\[ \Delta y_t = a + \rho\left(y_{t-1} - \beta x_{t-1}\right) + \varphi^{+}\Delta x_t^{+} + \varphi^{-}\Delta x_t^{-} + u_t, \tag{3.22} \]
where we set $\beta = \beta^{+}$, and the model for $H_{SR}^{S}$:
\[ \Delta y_t = a + \rho\left(y_{t-1} - \beta^{+}x_{t-1}^{+} - \beta^{-}x_{t-1}^{-}\right) + \varphi\Delta x_t + u_t, \tag{3.23} \]
where $\varphi = \varphi^{+}$. In both cases the alternative model is given by (3.19). Finally, we examine the finite sample size and power of the PSS bounds test of the null hypothesis of no asymmetric cointegration ($H_{PSS}: \rho = \beta^{+} = \beta^{-} = 0$). In this case, the restricted model is given by:
\[ \Delta y_t = a + \varphi^{+}\Delta x_t^{+} + \varphi^{-}\Delta x_t^{-} + u_t, \tag{3.24} \]
and, as before, the alternative model is given by (3.19). As noted in Section 2.3, the
relevant critical value bounds for the PSS test depend on the number of regressors entering
the long-run relationship, $k$. However, given the dependence between $x_t^{+}$ and $x_t^{-}$, the
appropriate value of k is unclear. Thus, we propose a pragmatic solution using two sets
of critical values, one for which k is defined by counting the partial sums as separate I(1)
regressors (here, k = 2) and another by counting each set of partial sums collectively
as a single I(1) regressor (here, k = 1). It follows that the latter approach is the more
conservative.
Table 2 summarises the simulation results from our baseline case at a nominal size of
5%. For T = 100, the long-run Wald test has very high power and the short-run Wald and
PSS tests have moderate power, although this rapidly improves as T increases. Indeed,
when T = 400 all of the tests achieve close to 100% power. The short-run Wald test is
well-sized regardless of the value of T while WLR is slightly oversized in small samples,
although this improves rapidly as T increases. Finally, as expected, we observe some
mis-sizing of the PSS F-test dependent on the selection of k. Importantly, however, we
find that the power of the test is satisfactory even under the conservative case (k = 1).
Table 2 also reports the power of the bootstrapped PSS test. For each replication of
the simulation routine, using data generated under the alternative hypothesis, we generate
500 bootstrap samples non-parametrically using the resampled residuals from estimation
of (3.21) in conjunction with the estimated coefficients from (3.24) under the assumption
that the initial values and the x’s are known. It is then a simple matter to compute
the empirical p-value of the PSS test by estimating (3.21) on the bootstrap samples
and calculating the probability that the bootstrapped test statistic exceeds its original
value. On this basis, we note that the bootstrapping procedure achieves the desired size
correction while retaining admirable power which increases with T .
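The final step of this routine, the empirical p-value, can be sketched as the share of bootstrap statistics that exceed the statistic computed on the original data. The function name `bootstrap_p_value` and the numbers in the example are purely illustrative; generating the bootstrap samples themselves is not shown.

```python
def bootstrap_p_value(stat, bootstrap_stats):
    """Share of bootstrap statistics that exceed the statistic from the original sample."""
    if not bootstrap_stats:
        raise ValueError("bootstrap_stats must not be empty")
    return sum(1 for b in bootstrap_stats if b > stat) / len(bootstrap_stats)


# Illustrative numbers only: two of the four bootstrap statistics exceed 5.0.
p_value = bootstrap_p_value(5.0, [1.0, 2.0, 6.0, 7.0])
```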
One important finding that arises from the other parameterisations of the DGP is
that the power of the long- and short-run Wald tests is positively associated with the
distance between their respective null and alternative hypotheses. Moreover, we find that
the long-run Wald test becomes somewhat over-sized especially when the distance of the
alternative from the null is small, the error correction parameter is close to zero, and
T ≤ 100. These findings reflect the well known limitations of asymptotic inference under
adverse conditions. To overcome these issues, one could adopt the common practice within
the literature and compute empirical p-values for the short- and long-run Wald statistics
by use of a bootstrap. However, we choose to pursue an alternative and more flexible
approach. By computing 95% bootstrap confidence intervals for the difference between
the asymmetric cumulative dynamic multipliers defined for positive and negative shocks,
respectively, we are able to convey relevant information about the statistical significance
of any observed asymmetries at any horizon, h, and over any timeframe h1 ≤ h ≤ h2 .
Furthermore, in light of our simulations, and given the absence of precise asymptotic
critical values for the $F_{PSS}$ and $t_{BDM}$ test statistics, we choose to provide bootstrapped
p-values for these tests in our empirical applications.17
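One way to form such an interval at a given horizon is the percentile method applied to bootstrap draws of the difference between the positive and negative cumulative multipliers. The sketch below assumes the percentile method and linear interpolation between order statistics, neither of which is specified in the text; the function names are hypothetical.

```python
def percentile_interval(draws, level=0.95):
    """Percentile bootstrap interval using linear interpolation between order statistics."""
    s = sorted(draws)

    def quantile(q):
        pos = q * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])

    alpha = 1.0 - level
    return quantile(alpha / 2), quantile(1.0 - alpha / 2)


def asymmetry_is_significant(diff_draws, level=0.95):
    """True when zero lies outside the interval for the multiplier difference at one horizon."""
    lower, upper = percentile_interval(diff_draws, level)
    return not (lower <= 0.0 <= upper)
```

Applying `asymmetry_is_significant` horizon by horizon indicates over which range of horizons the difference between the two adjustment paths is statistically distinguishable from zero.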
4 Empirical Applications
To demonstrate both the simplicity and the flexibility of the NARDL approach, we will
present two empirical applications. Firstly, we will examine nonlinearities in the bivariate
relationship between output and unemployment in the US, Canada and Japan. Secondly,
we will apply our technique to the trivariate case of gasoline pricing in Korea.
4.1 Asymmetric Unemployment-Output Relationship
The negative relationship between changes in the rate of unemployment and the rate
of output growth (Okun’s Law) remains one of the most commonly cited stylized facts
in modern macroeconomics. It is of fundamental importance in monetary policy trans-
mission, representing the link between unemployment and output which underpins the
mechanism by which inflation targeting monetary policy is thought to operate.
However, despite its importance, empirical assessments of Okun’s law over the last
three decades have been rather disappointing. The majority of this voluminous litera-
ture adheres to a linear paradigm, reflecting the assumption that cyclical upturns and
downturns have symmetrical effects on unemployment. In general, there is little reason
to believe that the labour market should behave in this simplistic fashion. If employers
dismiss a given quantity of labour after a negative growth shock, then they may not hire
exactly the same amount after a positive shock of equal magnitude (Lang and de Peretti,
2009). This may be discussed in terms of labour market hysteresis, the idea that cyclical
shocks may permanently affect structural unemployment. In this vein, Blanchard and
Summers (1987) explain the persistently high European unemployment of the 1980s us-
ing an insider-outsider wage setting model. They argue that adverse shocks that reduce
the proportion of insiders (union members) will increase outsider unemployment perma-
nently. There is, therefore, no tendency for the labour market to return to its initial state
even after economic growth has recovered (see also Hammermesh and Pfann, 1998, on the
asymmetric adjustment costs of labour).
In response to these issues, empirical attention is increasingly turning to nonlinear
modelling. There is a natural complementarity between the asymmetric analyses of
Okun’s Law, the Phillips curve and the preferences of the central bank which has helped
to drive research in the field. Neftci (1984) laid the foundations for this literature with
his early study of business cycle effects on the patterns of correlation between major US
time series, which revealed that the output-unemployment relationship displays marked
asymmetry. Altissimo and Violante (2001) find evidence of nonlinearity between output
and unemployment using a nonlinear multivariate VAR model. Their results, which they
note are consistent with the majority of existing univariate threshold models, indicate
that shocks in the recessionary regime are considerably less persistent than those in the
expansionary regime. Similarly, Crespo Cuaresma (2003) develops a regime-dependent
specification of Okun’s law and finds that the contemporaneous effect of output growth
on unemployment is asymmetric and significantly larger in recessions than in expansions,
and that shocks to unemployment tend to be more persistent in the expansionary regime.
Attfield and Silverstone (1998) argue that if output and unemployment are coin-
tegrated and potential output and unemployment are defined by the stochastic trend
components of the variables constructed from the Beveridge-Nelson decomposition, then
Okun’s coefficient can be interpreted as the cointegrating coefficient. However, the cointe-
gration test results are ambiguous: the single equation residual based ADF test is unable
to reject the null of no cointegration while it is rejected by the Johansen test. Using a
static asymmetric regression of the form of (2.1), Schorderet (2001) finds that nonlin-
earity hinders efforts to detect the stationary relationship between unemployment and
output.18 The contention that the appropriate modelling of nonlinearity strongly affects
the cointegration test is one to which we will return shortly.
In this section, we apply the NARDL technique to the simultaneous analysis of both
long- and short-run nonlinearities in the relationship between output and unemployment
in the US, Canada and Japan.19 This application demonstrates one of the key strengths
of our model: its flexibility and the ease with which it can be applied to each of the four
cases of nonlinearity defined above.
Firstly, to establish a reference point, we estimate the static linear regression of un-
employment on a constant, a time trend and output (Table 3(a)) and a static asymmetric
model of the form of (2.1), the results of which are reported in Table 3(b).
In keeping with the findings of Attfield and Silverstone (1998), Schorderet (2001) and
Granger and Yoon (2002), the EG test finds no evidence of linear cointegration. Moreover,
the EG test is unable to reject the null of no cointegration in the static asymmetric case,
highlighting the importance of an appropriate dynamic specification. In all cases, we find
a pronounced negative association between output and unemployment, with the results of
asymmetric analysis indicating strong non-linearity (the Wald tests reject the null in all
cases). However, the validity of these results is questionable given the evidence of severe
model mis-specifications.
Table 4 reports estimation results for the restricted symmetric ARDL regression of
the form of (2.16). Table 5 presents the results of the unrestricted NARDL case allowing
for both long- and short-run asymmetry. Notice that the cointegration tests are unable to
reject the null hypothesis in the restricted case but that both the $t_{BDM}$ and $F_{PSS}$ statistics
resoundingly reject the null when long-run asymmetry is modelled appropriately. This
result underscores the importance of correctly specifying the long-run relationship under
scrutiny. Moreover, the finding that the ECM-based tests are able to detect the asymmet-
ric long-run relationship while the EG residual-based approach cannot is generally consis-
tent with the works of Kremers, Ericsson and Dolado (1992), Hansen (1995), Banerjee et
al. (1998) and Pesaran et al. (2001). This reflects the well-established power-dominance
of the ECM-based tests resulting from their inclusion of potentially valuable information
relating to the correlation between the regressors and the underlying disturbances.
– TABLES 4 & 5 ABOUT HERE –
In the restricted symmetric models (Table 4), the estimated long-run coefficients for the
US, Canada and Japan are -1.66, -5.68 and 5.57, respectively, although none is statistically
significant due to the failure to accurately model the long-run relationship. Indeed, the
counterintuitive finding of a positive long-run coefficient in the case of Japan reflects the
fact that the model misspecification is so severe in this case that the estimated error
correction coefficient is positive, indicating explosive instability. By contrast, using the
more general unrestricted model of the form (2.10), the $F_{PSS}$ and $t_{BDM}$ tests both reject
their respective null hypotheses in all cases, even using the conservative critical values for
the PSS test (see Table 5). Furthermore, the Wald tests are also able to firmly reject the
null hypothesis of long-run symmetry in all cases. In this case, the estimated long-run
coefficients on $y^+$ and $y^-$ are -9.76 and -28.88 for the US, -17.26 and -28.48 for Canada and
-7.28 and -11.26 for Japan, respectively. Therefore, we may conclude that an economic
upturn of 10.3% is necessary to reduce unemployment by 1% in the US while an economic
downturn of just 3.5% achieves the opposite. The associated values for Canada are 5.8%
and 3.5% while in the case of Japan the figures translate to an economic upturn of 13.7%
and a downturn of 8.9%. The relatively muted response of the labour market to output
fluctuations in Japan reflects its restrictive employment policies and unusually long job
tenure (Tanaka, 2001), and is comparable to the linear estimation results achieved by
Hamada and Kurosaka (1984).
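As a rough check on the arithmetic in this paragraph, the output change implied by each long-run coefficient can be recovered as 100/|β|. The sketch below assumes that output enters the model in logs, so that a coefficient of β means a 1% change in output moves unemployment by β/100 units. The dictionary and function names are illustrative only, and small discrepancies with the figures quoted in the text presumably reflect rounding of the reported coefficients.

```python
# Long-run coefficients on y+ (upturn) and y- (downturn) as quoted in the text.
LONG_RUN = {
    "US": (-9.76, -28.88),
    "Canada": (-17.26, -28.48),
    "Japan": (-7.28, -11.26),
}


def required_output_change(beta):
    """Percentage change in output needed to move unemployment by one unit,
    assuming output is measured in logs (an assumption of this sketch)."""
    return 100.0 / abs(beta)


implied = {
    country: (required_output_change(b_pos), required_output_change(b_neg))
    for country, (b_pos, b_neg) in LONG_RUN.items()
}

for country, (upturn, downturn) in implied.items():
    print(f"{country}: upturn {upturn:.2f}%, downturn {downturn:.2f}%")
```

In every country the required upturn exceeds the required downturn, which is the long-run asymmetry described above.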
Turning to the analysis of short-run dynamic asymmetry, we find that the Wald test
cannot reject the null of (weak-form) summative symmetric adjustment in the USA or
Japan but that it is rejected at the 10% level in Canada. Consulting the bootstrap con-
fidence intervals for the difference between the asymmetric dynamic multipliers reported
in Figures 1–3 supports this finding. However, as noted earlier, the pattern of dynamic
adjustment depends on a combination of the long-run parameters, the error correction
coefficient and the model dynamics. Therefore, although we find little evidence of additive
short-run asymmetries, we nevertheless observe apparent asymmetries in the adjustment
patterns traced by the dynamic multipliers.
– FIGURES 1 – 3 ABOUT HERE –
For the benefit of the reader, Figure 1 presents the dynamic multipliers for the US
under each of the four combinations of long- and short-run asymmetry. Notice that
the imposition of long-run symmetry restrictions fundamentally changes the shape of
the dynamic multipliers, resulting in marked overshooting where none was previously
observed. In conjunction with the results of a battery of diagnostic tests, we conclude that
the imposition of invalid long-run restrictions represents a severe mis-specification of the
model. This underscores the importance of correctly accounting for inherent nonlinearities
in the long-run relationship and cautions that failure to do so jeopardises the identification
of the long-run relationship and compromises the estimation of the model dynamics. In
light of the overwhelming rejection of the long-run symmetric models, the associated
dynamic multipliers are omitted from Figures 2 and 3 to save space.
For the US, the results of both long-run asymmetric models (Figures 1(a) and (c)) are
remarkably similar, indicating that the labour market responds rapidly and strongly to
cyclical downturns in the very short-run (correcting one quarter of disequilibrium within
one period) but that full adjustment to the new equilibrium is a relatively prolonged
process. By contrast, the labour market responds only mildly to the boom phase but full
adjustment is achieved within six months. This reflects the flexibility of the US labour
market, whereby firms are quick to fire in the short-run in order to cut costs but are also
quick to hire in the knowledge that they can easily and quickly release the additional
labour should the need arise.
Figure 2 reveals that the pattern of dynamic adjustment is considerably richer in the
fully asymmetric case in Canada. We again find very rapid labour market adjustment
in the immediate wake of a recessionary shock, with more than 50% of the traverse to
equilibrium achieved within six months. Again, we find that the remaining disequilibrium
error is corrected relatively slowly. By contrast, the labour market response to the cyclical
upswing is more gradual, taking one year to achieve 50% of the adjustment toward equi-
librium. Furthermore, in panel (b), with the imposition of short-run symmetry, after the
initial rapid adjustment to the recessionary shock the gradient of the cumulative dynamic
multiplier is noticeably steeper than in the case of an economic expansion, as reflected in
the upward slope of the difference curve. In sum, our results suggest that Canadian firms
are quick to fire and slow to hire, reflecting conservatism on the part of their management.
Finally, we find little evidence of short-run asymmetry in Japan. Figure 3 reveals
that the Japanese labour market exhibits very muted responses to both booms and busts
when compared to the US and Canada, a finding that reflects the prevalence of restrictive
labour market institutions. Focusing on Figure 3(b), we note that 50% of the equilibrium
correction occurs within 10-12 months of either a positive or a negative shock, and that
after this initial phase, convergence upon long-run equilibrium occurs very slowly.
Despite their superficial differences, a common pattern emerges between Figures 1, 2
and 3. In general, the labour markets in all countries exhibit relatively rapid adjustment
in the first year with the absolute effect of an economic contraction being significantly
larger than that of an expansion. Following this initial period, the speed of adjustment
slows markedly, and subject to the imposition of short-run symmetry restrictions, we find
that the labour market response to output shocks remains somewhat more rapid in the
recessionary case than in the expansionary environment in both Canada and Japan. The
US can be viewed as a special case due to the widely discussed flexibility of its labour
market which permits very rapid adjustment to the expansionary shock as firms are eager
to hire in the knowledge that subsequent dismissals are neither difficult nor unduly costly.
The subtle patterns revealed by the dynamic multipliers suggest that the focus of the
literature on the persistence of shocks (Altissimo and Violante, 2001; Crespo Cuaresma,
2003) fails to convey important information regarding the magnitude of the implied ad-
justments to the labour market. Simply put, the impact of a recession in terms of jobs
lost is greater in both the short- and the long-run than the job creation associated with an
economic expansion of equal magnitude even though the discussion of the half-life of the
shocks in the US may indicate the opposite (i.e. 50% of the long-run effect of a recession-
ary shock is greater than 100% of the long-run impact of an expansionary shock of equal
magnitude). Focusing on persistence gives an incomplete picture of the phenomenon un-
der study when the long-run relationship is asymmetric. This serves to highlight one of
the primary attributes of the asymmetric cumulative dynamic multipliers; they help to
shed light on the traverse between the short-run and the long-run, a property whose use-
fulness and theoretical appeal is difficult to overstate. In a traditional ECM, the speed of
adjustment is computed simply as a percentage of the equilibrium error that is corrected
in each period. By contrast, NARDL illuminates the dynamic pattern of adjustment in a
simple and intuitive manner.
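The parenthetical comparison for the US can be verified directly from the long-run coefficients quoted earlier in this section. The following sketch (the names are illustrative) compares half of the long-run effect of a contraction with the full long-run effect of an expansion of equal size.

```python
# Long-run coefficients on y+ and y- as quoted in the discussion of Table 5.
coefficients = {
    "US": (-9.76, -28.88),
    "Canada": (-17.26, -28.48),
    "Japan": (-7.28, -11.26),
}


def half_contraction_exceeds_expansion(beta_pos, beta_neg):
    """True if 50% of the long-run effect of a contraction is larger in
    absolute value than 100% of the long-run effect of an expansion."""
    return 0.5 * abs(beta_neg) > abs(beta_pos)


result = {
    country: half_contraction_exceeds_expansion(b_pos, b_neg)
    for country, (b_pos, b_neg) in coefficients.items()
}
print(result)
```

On these figures the condition holds for the US (0.5 × 28.88 = 14.44 exceeds 9.76) but not for Canada or Japan, which is consistent with the remark being specific to the US.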
4.2 Asymmetric Gasoline Price Adjustment in Korea
A large literature has developed around the observation that retail gasoline prices tend to
react asymmetrically to changes in the price of crude oil (an exhaustive survey is provided
by Grasso and Manera, 2007). This phenomenon has come to be referred to as the ‘rockets
and feathers’ hypothesis following the early contribution of Bacon (1991). Employing an
asymmetric partial adjustment model in which the adjustment process is assumed to be
quadratic, Bacon’s results support the hypothesised asymmetry. Similarly, Borenstein
et al. (1997, BCG) derive strong support for asymmetry from a hybrid error correction
model where changes in gasoline and oil prices are decomposed into positive and negative
changes.20
Various theoretical explanations for asymmetric price adjustment have been adduced
in the literature, the dominant three being oligopolistic pricing behaviour (Radchenko,
2005), inventory capacity and costs (Borenstein and Shepard, 2002) and nonlinear con-
sumer search-effort (Johnson, 2002). While the literature on short-run dynamic asymme-
try is expansive, relatively little work has been done on potential long-run asymmetry.21
Reilly and Witt (1998) were among the first authors to investigate asymmetric pass-
through of the exchange rate to the retail price of gasoline, reflecting the convention of
quoting oil prices in US$ per barrel. Their results, derived from a simple ECM specifica-
tion encompassing short-run asymmetries, support the hypothesis of a non-linear relation-
ship between the exchange rate and retail gasoline prices for the UK. The authors report
that a Sterling depreciation is rapidly passed through to higher prices at the pump but
that a strengthening of the Pound is not met by a commensurate reduction in retail prices.
Similarly, Asplund, Eriksson and Friberg (2000) find that the impact of a depreciation is
more marked than that of an appreciation in Sweden, with retail gasoline prices reacting
more swiftly to the exchange rate than to crude oil price movements. More recently, Ga-
leotti, Lanza and Manera (2003) find compelling evidence that the speed of adjustment
to long-run equilibrium is asymmetric both with respect to oil price shocks and exchange
rate shocks. However, these papers consider only short-run dynamic asymmetries and
abstract from long-run nonlinearity.
The majority of papers surveyed by Grasso and Manera (2007) rely on the two-step
Engle-Granger estimation technique in which linear homogeneity of the long-run relation-
ship is imposed in the first step. This methodology is only appropriate in the analysis of
short-run asymmetry where the long-run relationship is believed to be linear.22 Should the
underlying long-run relationship prove nonlinear, the imposition of linearity in the first
step is likely to provide misleading and spurious results as noted in the case of Okun’s
Law above. We contribute to this literature by applying our modelling strategy to the
case of asymmetric pass-through of crude oil price changes and exchange rate fluctuations
to the retail price of gasoline in Korea over the period 1991q1-2007q2 (see note 23). The choice of
Korean data is motivated by the need to find an industrial country which is entirely reliant
on imported oil, thereby circumventing any issues of endogeneity of regressors that may
arise in countries with significant oil extraction and refining activity. Given the extensive
literature surveyed above, we do not report static estimation results and merely note that
such simple models tend to exhibit profound evidence of misspecification.
Table 6 presents the results of the benchmark symmetric ARDL model, the fully asym-
metric NARDL model and our preferred specification which combines long-run symmetry
with short-run asymmetry. Taken together, the $F_{PSS}$ and $t_{BDM}$ tests indicate cointegration
in both of the asymmetric models.24
– TABLE 6 & FIGURE 4 ABOUT HERE –
The Wald tests fail to reject long-run symmetry with respect to either the oil price or
the exchange rate, indicating that the pass-through from input prices to the retail price
of gasoline is linear in the long-run. This may suggest that the retail gasoline industry
in Korea has been relatively competitive. Alternatively, it may be attributed to state
intervention in the energy industry in the early years of the sample. Turning to the
short-run, the Wald tests decisively reject the null of additive short-run symmetry with
respect to both the oil price and the exchange rate. This pattern of asymmetry determines
the shape of the dynamic multipliers presented in Figure 4. Focusing first on the retail
price response to the crude oil spot price, we observe a strong and rapid reaction to
positive changes but a more gradual response to falling crude prices in both panels (a)
and (b). The principal difference between these two figures derives from the considerable
uncertainty surrounding the long-run coefficient estimates in the fully asymmetric model.
This inflates the bootstrap confidence intervals in panel (a) but, interestingly, also seems to
exaggerate the observed short-run asymmetry. In conjunction with the weight of evidence
supporting the rockets and feathers hypothesis, we therefore regard the combination of
long-run symmetry and short-run asymmetry as the most plausible case.
Turning to the case of exchange rate fluctuations, we again note that the long-run
symmetry restrictions cannot be rejected but that the additive short-run restrictions are
firmly rejected. Figures 4(c) and (d) reveal that gasoline prices increase rapidly and
strongly following a weakening of the Korean Won, displaying mild overshooting. By
contrast, the response to an appreciation is rather muted. Moreover, our results suggest
that exchange rate fluctuations have a more pronounced impact on retail gasoline prices
than movements in the price of crude oil quoted in US$. This effect is apparent in both
the long- and the short-run and is consistent with the findings of Asplund et al. (2000).
Overall, our results are largely consistent with the existing literature on dynamic
asymmetries, confirming that Korean gasoline prices respond more rapidly to the price
increases of crude oil than to decreases and that they are more sensitive to exchange
rate depreciations than to appreciations. By contrast, little research concerning long-run
asymmetries exists against which to judge our results. However, at a pragmatic level,
one can argue that the presence of long-run asymmetries in the gasoline-pricing equation
may give rise to a logical inconsistency, and so our finding of long-run symmetry may
be considered theory-consistent. If there were a persistent long-run tendency for gasoline
prices to increase more following a depreciation than they would decrease following an
appreciation of equal magnitude, for example, then there would be a ratchet mechanism
at work whereby prices would gradually increase through time under the assumption that
positive and negative shocks are of approximately equal magnitude and probability. This
outcome seems rather implausible and suggests that long-run linearity is the more natural
case.
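The ratchet argument can be illustrated with a stylised calculation. In the sketch below, the long-run price level is built up from positive and negative exchange rate changes with separate pass-through coefficients. The function name and all numerical values are purely illustrative and are not estimates from Table 6.

```python
def long_run_price_path(changes, beta_pos, beta_neg):
    """Cumulate the long-run price response to a sequence of shocks, applying
    beta_pos to positive changes and beta_neg to negative changes."""
    level, path = 0.0, []
    for dx in changes:
        level += (beta_pos if dx > 0 else beta_neg) * dx
        path.append(level)
    return path


# Alternating shocks of equal size: 50 depreciations and 50 appreciations.
shocks = [1.0, -1.0] * 50

symmetric = long_run_price_path(shocks, beta_pos=0.5, beta_neg=0.5)
asymmetric = long_run_price_path(shocks, beta_pos=0.6, beta_neg=0.4)

print(symmetric[-1], asymmetric[-1])
```

With symmetric coefficients the price level returns to its starting point after each pair of offsetting shocks, whereas with the hypothetical asymmetric coefficients it drifts upward by roughly 0.2 per pair even though the shocks themselves cancel out.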
5 Concluding Remarks
The investigation of nonstationarity in conjunction with nonlinearity has recently assumed
a prominent role in econometric research. This reflects the realisation that asymmetry
is pervasive within the social sciences and may be inherent in modern economies. In-
deed, the behavioural finance literature can be viewed as an attempt at formalising this
observation. In this paper we have proposed a simple method of combining asymmetric
cointegration with a dynamically flexible ARDL model and have derived the associated
error correction framework. The desirable features of the NARDL model are threefold.
Firstly, the estimation of the ECM in one step is likely to improve the performance of
the model in small samples, particularly in terms of the power of the cointegration tests.
Secondly, the ability to simultaneously estimate both long- and short-run asymmetries in
a computationally simple and tractable manner reflects the flexibility of our modelling
approach. Moreover, our technique provides a straightforward means of testing both long-
and short-run symmetry restrictions. Finally, the use of asymmetric dynamic multipliers
provides an intuitive and computationally straightforward means of assessing the traverse
between the short- and long-run, a result with significant theoretical appeal. While the
dynamic adjustment in most ECMs is discussed in terms of the percentage of the disequi-
librium error that is corrected in each period, our approach sheds light on the nature of
this dynamic adjustment, mapping the gradual movement of the process under scrutiny
from initial equilibrium through the shock and toward the new equilibrium.
These key strengths of the NARDL framework have been demonstrated in the case
of the long- and short-run asymmetry of the unemployment-output relationship and the
short-run asymmetry characterising retail gasoline price adjustments. The results suggest
that the imposition of long-run symmetry where the underlying relationship is nonlinear
will confound efforts to test for the existence of a stable long-run relationship and will
result in spurious dynamic responses. Similarly, our results stress the importance of
correctly capturing short-run asymmetries in order to illuminate potentially important
differences in the response of economic agents to positive and negative shocks.
In summary, NARDL represents the simplest method of modelling combined short-
and long-run asymmetries yet developed. At this point, it seems appropriate to mention
three obvious extensions which present themselves. Firstly, the model can be related to
the threshold literature by generalising to the case of one or more unknown non-zero
thresholds for use in the construction of the partial sum processes. This is the subject
of ongoing research by Greenwood-Nimmo, Shin and Van Treeck (2011), in which we
employ Hansen’s (2000) approach to estimation and inference in models with unknown
threshold parameters. One could further extend research in this vein by allowing for the
state-contingency of the error correction term, $\rho$ (i.e. distinguishing between $\rho^+$ and $\rho^-$).
Secondly, although highly challenging, the development of a system equivalent of our
model capable of dealing with multiple long-run relationships would permit the analysis
of a more diverse range of macroeconomic phenomena. Finally, the extension of the
model to the dynamic heterogeneous panel context may broaden its appeal further still.
The obvious starting point for such developments is the pooled mean group framework
advanced by Pesaran, Shin and Smith (1999), which is readily estimable by FIML under
the assumption of long-run homogeneity.
Acknowledgments
This is a substantially revised version of the working paper by Shin and Yu (2004). Earlier
versions circulated under the titles “An ARDL Approach to an Analysis of Asymmetric
Long-Run Cointegrating Relationships” and “Modelling Asymmetric Cointegration and
Dynamic Multipliers in an ARDL Framework”. We are grateful to Badi Baltagi, Jinseo
Cho, Ana-Maria Fuertes, Liang Hu, John Hunter, Minjoo Kim, Soyoung Kim, Gary
Koop, Kevin Lee, Camilla Mastromarco, Amy Mise, Viet Nguyen, Neville Norman, Kevin
Reilly, Hashem Pesaran, Laura Serlenga, Ron Smith, Till van Treeck and participants at
the ESEM conference (Vienna, 2006), the ICAETE conference (Hyderabad, 2009), and
research seminars at the IMK, the Bank of Korea, and the Universities of Bari, Lecce,
Leeds, Leicester, Korea and Yonsei for their helpful comments. This paper has been
widely circulated and the methodology adopted by a number of authors – we are pleased
to acknowledge their valuable feedback, comments and discussion. Shin acknowledges
partial financial support from the ESRC (Grant No. RES-000-22-3161). Yu is grateful for
the hospitality of Leeds University Business School during his visit. The usual disclaimer
applies.
Notes
1
The present version of the paper is a substantially revised version of Shin and Yu
(2004), which has benefited greatly from a sequence of incremental improvements and
additions arising from the constructive comments of conference and seminar participants
and from editorial feedback. Earlier versions of the paper circulated under the titles “An
ARDL Approach to an Analysis of Asymmetric Long-run Cointegrating Relationships”
and “Modelling Asymmetric Cointegration and Dynamic Multipliers in an ARDL Frame-
work”. By virtue of its wide circulation and prolonged availability as a working paper, our
research has informed the development of a subsequent literature that we now discuss. In
all cases, however, the development of the NARDL model is properly credited.
2
The presence of long-run asymmetry will induce a ratchet mechanism if the respec-
tive positive and negative regime probabilities are approximately equal and the shocks
under each regime are of comparable magnitude. In the more general case in which these
conditions are not satisfied, no such simple conclusion may be drawn.
3
Consider the threshold ECM as an example, in which case the choice of the transition
variable is of importance both theoretically and empirically. In general, the asymptotic
distribution of the test statistic for the null of linearity or symmetry is not only non-
standard but also depends on these transition variables.
4
The concept of asymmetric cointegration is easily conceptualised by use of a simple
example. Consider the output-unemployment relationship. In a standard cointegrating
regression, one models $y_t$ and $x_t$ subject to a common stochastic trend. As this relationship
is assumed to hold in the long-run, it represents the equilibrium to which the system
returns after a perturbation (i.e. it acts as a global attractor). However, in our framework,
the long-run relationship between $y_t$ and $x_t$ is modelled as piecewise linear subject to the
decomposition of $x_t$. Suppose that $|\beta^+| < |\beta^-|$ in (2.1). This suggests that the long-run
effect of a unit negative change in output will increase unemployment by a greater amount
than a unit positive change would reduce it. Thus, our model includes a regime-switching
cointegrating relationship in which regime transitions are governed by the sign of $\Delta x_t$.
The economic implication of this line of reasoning is that equilibrium need not be unique
in a globally linear sense. The link to the path dependency literature is apparent.
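The decomposition underlying this piecewise linear relationship can be sketched as follows, taking the threshold to be zero so that the regime is determined by the sign of the change in the series. The function name is hypothetical, and normalising the partial sums to start at zero is an assumption of this sketch.

```python
def partial_sums(x):
    """Split a series into partial sums of its positive and negative changes.

    Returns (x_pos, x_neg), each of the same length as x and starting at zero,
    such that x[t] == x[0] + x_pos[t] + x_neg[t].
    """
    x_pos, x_neg = [0.0], [0.0]
    for prev, curr in zip(x, x[1:]):
        dx = curr - prev
        x_pos.append(x_pos[-1] + max(dx, 0.0))
        x_neg.append(x_neg[-1] + min(dx, 0.0))
    return x_pos, x_neg


output = [100.0, 102.0, 101.0, 104.0, 103.5]
y_pos, y_neg = partial_sums(output)
print(y_pos)
print(y_neg)
```

The positive partial sum only ever rises and the negative partial sum only ever falls, so each can carry its own long-run coefficient.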
5
In the special case where $v_t$ is normally distributed with zero mean and constant
variance $\sigma_v^2$, it is well-established that the censored normal variates, $v_t^+ = \max[0, v_t]$ and
$v_t^- = \min[0, v_t]$, will have $E[v_t^+] = \sigma_v/\sqrt{2\pi}$, $E[v_t^-] = -\sigma_v/\sqrt{2\pi}$, and
$Var(v_t^+) = Var(v_t^-) = \sigma_v^2 (\pi - 1)/(2\pi)$. We are grateful to Jinseo Cho for pointing this issue out and encouraging us to
provide a more general result in Theorem 1.
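These moments are easy to check numerically. The sketch below evaluates the closed-form expressions and compares them with a simple midpoint-rule integration against the normal density. The function names, the grid size and the integration range are choices made for this illustration only.

```python
import math


def censored_normal_moments(sigma):
    """Closed-form mean of max(0, v) and min(0, v) and their common variance,
    for v ~ N(0, sigma^2)."""
    mean_pos = sigma / math.sqrt(2.0 * math.pi)
    mean_neg = -mean_pos
    variance = sigma ** 2 * (math.pi - 1.0) / (2.0 * math.pi)
    return mean_pos, mean_neg, variance


def numerical_moments(sigma, n=200_000, width=10.0):
    """Midpoint-rule approximation to the mean and variance of max(0, v).
    Only v > 0 contributes, since max(0, v) is zero otherwise."""
    step = width * sigma / n
    norm = sigma * math.sqrt(2.0 * math.pi)
    first = second = 0.0
    for i in range(n):
        v = (i + 0.5) * step
        density = math.exp(-0.5 * (v / sigma) ** 2) / norm
        first += v * density * step
        second += v * v * density * step
    return first, second - first ** 2


sigma_v = 1.5
mean_pos, mean_neg, variance = censored_normal_moments(sigma_v)
approx_mean, approx_var = numerical_moments(sigma_v)
print(mean_pos, approx_mean)
print(variance, approx_var)
```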
6
Notice that the analysis of short-run dynamic asymmetries is not straightforward in
the context of the static regression model employing the semiparametric approach.
7
In some cases, most notably where the growth rates of the series in $x_t$ are predominantly
positive (negative), the use of a zero threshold may result in one regime containing
an undesirably low number of effective observations. In such situations, an obvious can-
didate for an alternative threshold is the mean growth rate.
8
For convenience we employ the same lag order, q. One may also allow for feedback
effects from the lagged $\Delta y$'s on $\Delta x_t$ in (2.8).
9
While the associated critical values can be tabulated easily using stochastic simula-
tion, it is impractical to provide a meaningful set of critical values covering all possible
combinations. It is generally straightforward, however, to compute the appropriate p-
values by means of standard bootstrap techniques.
10
It is straightforward to extend similar reasoning to the more general case with multiple
regressors decomposed into partial sum processes.
11
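The level parameters are obtained as follows:
$\phi_1 = \rho + 1 + \varphi_1$; $\phi_i = \varphi_i - \varphi_{i-1}$, $i = 2, \ldots, p-1$; $\phi_p = -\varphi_{p-1}$;
$\theta^{\ell}_0 = \pi^{\ell}_0$; $\theta^{\ell}_1 = \theta^{\ell} - \pi^{\ell}_0 + \pi^{\ell}_1$; $\theta^{\ell}_i = \pi^{\ell}_i - \pi^{\ell}_{i-1}$, $i = 2, \ldots, q-1$; $\theta^{\ell}_q = -\pi^{\ell}_{q-1}$, $\ell = +, -$.
12
The dynamic multipliers, $\lambda^+_j$ and $\lambda^-_j$ for $j = 0, 1, \ldots$, can be evaluated using the
following recursive relationships in which $\lambda^{\ell}_0 = \theta^{\ell}_0$, $\phi_j = 0$ for $j < 1$ and $\lambda^{\ell}_j = 0$ for $j < 0$:
$\lambda^{\ell}_j = \phi_1 \lambda^{\ell}_{j-1} + \phi_2 \lambda^{\ell}_{j-2} + \cdots + \phi_{j-1} \lambda^{\ell}_1 + \phi_j \lambda^{\ell}_0 + \theta^{\ell}_j$, $\ell = +, -$, $j = 1, 2, \ldots$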
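The mapping in note 11 and the recursion in note 12 can be sketched together for a single regime. The function names and parameter values below are hypothetical. The sketch additionally assumes that the level coefficients are zero beyond lags p and q, and it treats the recursion as delivering period-by-period responses that are then cumulated. Under stability, the cumulated response should approach the ratio of minus the level coefficient to the error correction coefficient, which is checked numerically.

```python
def ecm_to_levels(rho, theta, varphi, pi):
    """Map ECM coefficients to ARDL-in-levels coefficients for one regime.

    rho    : error correction coefficient
    theta  : coefficient on the lagged level of the partial sum
    varphi : [varphi_1, ..., varphi_{p-1}], coefficients on lagged changes in y
    pi     : [pi_0, ..., pi_{q-1}], coefficients on current and lagged changes
             in the partial sum (must be non-empty)
    """
    vp = [0.0] + list(varphi) + [0.0]   # varphi_0 = varphi_p = 0 by convention
    phi = [vp[i] - vp[i - 1] for i in range(1, len(vp))]
    phi[0] += rho + 1.0
    pe = list(pi) + [0.0]               # pi_q = 0 by convention
    theta_levels = [pe[0]] + [pe[i] - pe[i - 1] for i in range(1, len(pe))]
    theta_levels[1] += theta
    return phi, theta_levels


def dynamic_multipliers(phi, theta_levels, horizon):
    """Evaluate lambda_0, ..., lambda_horizon recursively, treating phi_i and
    theta_j as zero beyond their last available lag (an assumption here)."""
    lam = []
    for j in range(horizon + 1):
        value = theta_levels[j] if j < len(theta_levels) else 0.0
        for i, phi_i in enumerate(phi, start=1):
            if i <= j:
                value += phi_i * lam[j - i]
        lam.append(value)
    return lam


def cumulate(values):
    """Running sum, giving the cumulative dynamic multipliers."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


rho, theta = -0.5, 1.0                  # illustrative values only
phi, theta_levels = ecm_to_levels(rho, theta, varphi=[0.2], pi=[0.3, 0.1])
lam = dynamic_multipliers(phi, theta_levels, horizon=200)
cumulative = cumulate(lam)
print(cumulative[-1], -theta / rho)
```

Running the same functions with the negative-regime coefficients gives the second multiplier path, and the difference between the two cumulated paths is the asymmetry traced out over the adjustment horizon.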
13
The final specification in Borenstein et al. (1997) differs slightly from (2.14) as the
lagged $\Delta y_t$'s on the right hand side are also decomposed into positive and negative changes.
However, their derivation is rather ad hoc.
14
Short-run symmetry restrictions (especially the pair-wise restrictions) may be exces-
sively restrictive in many applications although they may be useful in providing more
precise estimation results, particularly when estimating a long-run asymmetric relation-
ship in small samples. The additive symmetry restrictions are somewhat weaker and have
been discussed in the literature in terms of assessing the validity of the liquidity constraint
where $\sum_{i=0}^{q-1} \pi^+_i < \sum_{i=0}^{q-1} \pi^-_i$ (e.g. Van Treeck, 2008).
15
Webber (2000) utilises a similar approach in his analysis of the asymmetric pass-
through from exchange rates, decomposed as the partial sum processes of appreciations
and depreciations, to import prices.
16
Full results are available on request.
17
We employ a non-parametric bootstrapping routine and use 50,000 replications after
rejecting those for which $\rho > -1 \times 10^{-4}$. Full details are available on request.
18
Further examples of the use of positive/negative decompositions in the modelling of
asymmetry in the unemployment-output relationship include Lee (2000) and Virén (2001).
19
Seasonally-adjusted monthly data for unemployment and industrial production cov-
ering the range 1982m2-2003m11 were collected from the OECD’s Main Economic Indi-
cators. Although not presented here, ADF testing lends overwhelming support to the
hypothesis that all variates are I(1).
20
Bachmeier and Griffin (2003) criticise BCG for their use of ‘nonstandard estimation
methodology’ and low-frequency data, arguing that the two-step EG method finds no
evidence of asymmetry and, moreover, that the BCG method finds no evidence of asym-
metry when applied to their daily dataset. While there is some debate over the optimal
data frequency for the study of price shocks, the criticism of the one-step BCG estimation
process is unwarranted (cf. Pesaran and Shin, 1998). Indeed, as noted above, estimating
the ECM in a single step yields superior performance in small samples, particularly in
relation to the power of the cointegration tests.
21
An early and notable paper combining both short- and long-run asymmetries in the
analysis of the nonlinearities characterising the relationship between upstream and down-
stream prices in the oil industry is Balke, Brown, and Yücel (1998). The authors extend
BCG’s modelling to incorporate long-run nonlinearities and find evidence of “pervasive
and large” asymmetries in all cases apart from their levels specification (p. 10).
22
As discussed above, in the presence of weakly endogenous regressors and/or serially
correlated errors, the OLS estimator in the first step remains consistent but is inefficient.
Furthermore, if the AR coefficients are significantly different from zero, the OLS estimator
becomes inconsistent and is thus poorly determined in finite samples (see Pesaran and
Shin, 1998, and Pesaran et al., 2001).
23
The Dubai spot price (US$/bbl), $p^o_t$, was retrieved from the Korean Energy Economics
Institute (http://www.keei.re.kr) prior to 1998 and from PETRONET
(http://www.petronet.co.kr) thereafter. The gasoline price index (2000Y=100), $p_t$, and the
KRW/USD exchange rate, $x_t$, were retrieved from the Economic Statistics System of the Bank
of Korea. All data are in logarithmic form.
24
The Engle-Granger residual-based test associated with the static linear regression
of price on a constant, time trend, oil price and exchange rate (all in logs) returns a
maximum value of -3.91 compared to a 5% critical value of -4.32. Similarly, for the static
asymmetric regression of price on $p^{o+}$, $p^{o-}$, $x^+$ and $x^-$, as well as a constant and a trend,
$EG_{MAX} = -4.57$ compared to a 5% critical value of -5.01.
A Appendix: Proof of Theorem 1
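The OLS estimator, $\hat{\beta} := (\hat{\beta}^+, \hat{\beta}^-)'$, in (2.1) is obtained by
$$
\hat{\beta} =
\begin{pmatrix}
\sum_{t=1}^T (x_t^+)^2 & \sum_{t=1}^T x_t^+ x_t^- \\
\sum_{t=1}^T x_t^+ x_t^- & \sum_{t=1}^T (x_t^-)^2
\end{pmatrix}^{-1}
\begin{pmatrix}
\sum_{t=1}^T x_t^+ y_t \\
\sum_{t=1}^T x_t^- y_t
\end{pmatrix},
$$
so that
$$
\hat{\beta} - \beta = \frac{1}{D_T}
\begin{pmatrix}
\sum_{t=1}^T (x_t^-)^2 & -\sum_{t=1}^T x_t^+ x_t^- \\
-\sum_{t=1}^T x_t^+ x_t^- & \sum_{t=1}^T (x_t^+)^2
\end{pmatrix}
\begin{pmatrix}
\sum_{t=1}^T x_t^+ u_t \\
\sum_{t=1}^T x_t^- u_t
\end{pmatrix}
= \frac{1}{D_T}
\begin{pmatrix} A_T \\ B_T \end{pmatrix},
$$
where $D_T := \sum_{t=1}^T (x_t^+)^2 \sum_{t=1}^T (x_t^-)^2 - \left( \sum_{t=1}^T x_t^+ x_t^- \right)^2$, $A_T := \sum_{t=1}^T (x_t^-)^2 \sum_{t=1}^T x_t^+ u_t - \sum_{t=1}^T x_t^+ x_t^- \sum_{t=1}^T x_t^- u_t$, and $B_T := -\sum_{t=1}^T x_t^+ x_t^- \sum_{t=1}^T x_t^+ u_t + \sum_{t=1}^T (x_t^+)^2 \sum_{t=1}^T x_t^- u_t$. We
now let
$$
w_t^+ := \max[0, v_t] - \mu^+, \qquad w_t^- := \min[0, v_t] - \mu^-,
$$
where $\mu^+ := E[\max[0, v_t]]$ and $\mu^- := E[\min[0, v_t]]$, so that
$$
x_t^+ \equiv t\mu^+ + \sum_{j=1}^t w_j^+, \qquad x_t^- \equiv t\mu^- + \sum_{j=1}^t w_j^-.
$$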
Hence, we obtain:
( T ) T  t
!2 t
!2 t
! t
!
X X X X X X 
2 +2 − −2 + + − − + 
DT = t  µ wj +µ wj − 2µ µ wj wj
t=1
 t=1 
j=1 j=1 j=1 j=1
 !2 !2 ! T !
 T
X X t T
X X t T
X X t X X t 
+2 − −2 + + − − +
− µ t wj +µ t wj − 2µ µ t wj t wj
 
t=1 j=1 t=1 j=1 t=1 j=1 t=1 j=1
+ oP (T 5 ).
42
Here, oP (T 6 ) terms are canceled off, and the remaining next-order terms are stated as
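above. We now note that
$$
\frac{1}{T^3} \sum_{t=1}^T t^2 = \frac{1}{3} + o(1),
$$
$$
(\mu^+)^2 \Big( \sum_{j=1}^t w_j^- \Big)^2 + (\mu^-)^2 \Big( \sum_{j=1}^t w_j^+ \Big)^2 - 2\mu^+\mu^- \Big( \sum_{j=1}^t w_j^- \Big) \Big( \sum_{j=1}^t w_j^+ \Big) = \Big( \sum_{j=1}^t s_j \Big)^2,
$$
where $s_j \equiv \mu^+ w_j^- - \mu^- w_j^+$ by the definitions of $w_j^-$ and $w_j^+$. Hence, by Donsker's FCLT,
$$
T^{-1/2} \sum_{j=1}^{[T(\cdot)]} s_j / \sigma_s \Rightarrow W_{\tilde{s}}(\cdot),
$$
where $\sigma_s^2 := Var(s_t)$, $\Rightarrow$ indicates weak convergence, and $W_{\tilde{s}}(r)$ is a standard Brownian
motion defined on $r \in [0, 1]$. Therefore,
$$
T^{-2} \sum_{t=1}^T \Big( \sum_{j=1}^t s_j \Big)^2 \Rightarrow \sigma_s^2 \int_0^1 W_{\tilde{s}}(r)^2 \, dr
$$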
by the CMT (e.g. Eq. (17.3.22) of Hamilton (1994), p. 486). Also notice that
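$$
\begin{aligned}
&(\mu^+)^2 \Big( \sum_{t=1}^T t \sum_{j=1}^t w_j^- \Big)^2 + (\mu^-)^2 \Big( \sum_{t=1}^T t \sum_{j=1}^t w_j^+ \Big)^2 - 2\mu^+\mu^- \Big( \sum_{t=1}^T t \sum_{j=1}^t w_j^- \Big) \Big( \sum_{t=1}^T t \sum_{j=1}^t w_j^+ \Big) \\
&\qquad = \Big( \sum_{t=1}^T t \sum_{j=1}^t \big( \mu^+ w_j^- - \mu^- w_j^+ \big) \Big)^2 = \Big( \sum_{t=1}^T t \sum_{j=1}^t s_j \Big)^2,
\end{aligned}
$$
then it follows that
$$
T^{-5/2} \sum_{t=1}^T t \sum_{j=1}^t s_j \Rightarrow \sigma_s \int_0^1 r W_{\tilde{s}}(r) \, dr
$$
by the CMT. Collecting all these results we obtain:
$$
T^{-5} D_T \Rightarrow \sigma_s^2 \left[ \frac{1}{3} \int_0^1 W_{\tilde{s}}(r)^2 \, dr - \left( \int_0^1 r W_{\tilde{s}}(r) \, dr \right)^2 \right]. \tag{A.1}
$$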
Next, we consider the asymptotic weak limit of the numerator of $\hat{\beta}^+ - \beta^+$. For this, we
note that the $O_P(T^{9/2})$ terms cancel, so that the remaining next-order terms are $O_P(T^4)$
and
$$
\begin{aligned}
A_T :={}& \sum_{t=1}^T (x_t^-)^2 \sum_{t=1}^T x_t^+ u_t - \sum_{t=1}^T x_t^+ x_t^- \sum_{t=1}^T x_t^- u_t \\
={}& \left\{ (\mu^-)^2 \sum_{t=1}^T t^2 \sum_{t=1}^T u_t \sum_{j=1}^t w_j^+ + 2\mu^-\mu^+ \sum_{t=1}^T t u_t \sum_{t=1}^T t \sum_{j=1}^t w_j^- \right\} \\
&- \left\{ \mu^+\mu^- \sum_{t=1}^T t^2 \sum_{t=1}^T u_t \sum_{j=1}^t w_j^- + \sum_{t=1}^T t \sum_{j=1}^t \big( \mu^+ w_j^- + \mu^- w_j^+ \big) \, \mu^- \sum_{t=1}^T t u_t \right\} + o_P(T^4) \\
={}& \mu^- \left\{ - \Big( \sum_{t=1}^T t^2 \Big) \Big( \sum_{t=1}^T u_t \sum_{j=1}^t s_j \Big) + \Big( \sum_{t=1}^T t \sum_{j=1}^t s_j \Big) \Big( \sum_{t=1}^T t u_t \Big) \right\} + o_P(T^4),
\end{aligned}
\tag{A.2}
$$
where we also employ the definition of $s_j := \mu^+ w_j^- - \mu^- w_j^+$. Then, by the CMT (e.g. Eqs.
(f) on p. 548 and (17.3.19) on p. 486 of Hamilton (1994), respectively), we have:
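$$
T^{-1} \sum_{t=1}^T u_t \sum_{j=1}^t s_j \Rightarrow \sigma_s \sigma_u \int_0^1 W_{\tilde{s}}(r) \, dW_{\tilde{u}}(r), \tag{A.3}
$$
$$
T^{-3/2} \sum_{t=1}^T t u_t \Rightarrow \sigma_u \left[ W_{\tilde{u}}(1) - \int_0^1 W_{\tilde{u}}(r) \, dr \right], \tag{A.4}
$$
where $W_{\tilde{u}}(\cdot)$ is a standard Brownian motion independent of $W_{\tilde{s}}(\cdot)$. Collecting all these
results, including (A.3) and (A.4), and plugging them into $A_T$, we obtain by the CMT:
$$
T^{-4} A_T \Rightarrow \mu^- \sigma_s \sigma_u \left[ -\frac{1}{3} \int_0^1 W_{\tilde{s}}(r) \, dW_{\tilde{u}}(r) + \int_0^1 r W_{\tilde{s}}(r) \, dr \left( W_{\tilde{u}}(1) - \int_0^1 W_{\tilde{u}}(r) \, dr \right) \right]. \tag{A.5}
$$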
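We now examine the numerator of $(\hat{\beta}^{-} - \beta^{-})$ in a similar manner. That is,
\[
B_T := \mu^{+} \Big\{ \Big( \sum_{t=1}^{T} t^{2} \Big) \Big( \sum_{t=1}^{T} u_t \sum_{j=1}^{t} s_j \Big)
 - \Big( \sum_{t=1}^{T} t \sum_{j=1}^{t} s_j \Big) \Big( \sum_{t=1}^{T} t u_t \Big) \Big\} + o_P(T^{4}), \tag{A.6}
\]
and
\[
T^{-4} B_T \Rightarrow \mu^{+} \sigma_s \sigma_u \left\{ \frac{1}{3} \int_0^1 W_{\tilde{s}}(r) \, dW_{\tilde{u}}(r)
 - \int_0^1 r W_{\tilde{s}}(r) \, dr \left[ W_{\tilde{u}}(1) - \int_0^1 W_{\tilde{u}}(r) \, dr \right] \right\}
\tag{A.7}
\]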
Combining (A.5) and (A.7) respectively with (A.1) we obtain the main results.
Next, from (A.2) and (A.6), it is easily seen that
\[
\mu^{+} A_T + \mu^{-} B_T = o_P(T^{4}),
\]
which proves the final result in Theorem 1.
References
Apergis, N. and Miller, S. (2006). “Consumption Asymmetry and the Stock Market:
Empirical Evidence.” Economics Letters, 93 (3), 337-342.
Altissimo, F. and Violante, G. (2001). “The Nonlinear Dynamics of Output and Unem-
ployment in The U.S.” Journal of Applied Econometrics, 16 (4), 461-486.
Asplund, M., Eriksson, R. and Friberg, R. (2000). “Price Adjustment by a Gasoline
Retail Chain.” Scandinavian Journal of Economics, 102 (1), 101-121.
Bachmeier, L.J. and Griffin, J.M. (2003). “New Evidence on Asymmetric Gasoline Price
Responses.” The Review of Economics and Statistics, 85 (3), 772-776.
Bacon, R.W. (1991). “Rockets and Feathers: The Asymmetric Speed of Adjustment of
UK Retail Gasoline Prices to Cost Changes.” Energy Economics, 13 (3), 211-218.
Bae, Y. and de Jong, R.M. (2007). “Money Demand Function Estimation by Nonlinear
Cointegration.” Journal of Applied Econometrics, 22 (4), 767-793.
Balke, N.S. and Fomby, T.B. (1997). “Threshold Cointegration.” International Economic
Review, 38 (3), 627-645.
Balke, N.S., Brown, S.P. and Yücel, M.K. (1998). “Crude Oil and Gasoline Prices: An
Asymmetric Relationship?” Federal Reserve Bank of Dallas Economic and Financial
Policy Review, Q1, 2-11.
Banerjee, A., Dolado, J. and Mestre, R. (1998). “Error-correction Mechanism Tests for
Cointegration in a Single-Equation Framework.” Journal of Time Series Analysis, 19
(3), 267-283.
Blanchard, O.J. and Summers, L.H. (1987). “Hysteresis and the European Unemployment
Problem,” Working Paper 1950, National Bureau of Economic Research.
Borenstein, S., Cameron, C. and Gilbert, R. (1997). “Do Gasoline Prices Respond Asym-
metrically to Crude Oil Price Changes?” The Quarterly Journal of Economics, 112
(1), 305-339.
Borenstein, S. and Shepard, A. (2002). “Sticky Prices, Inventories, and Market Power in
Wholesale Gasoline Markets.” Rand Journal of Economics, 33 (1), 116-139.
Crespo Cuaresma, J. (2003). “Okun’s Law Revisited.” Oxford Bulletin of Economics and
Statistics, 65 (4), 439-451.
Delatte, A.-L. and López-Villavicencio, A. (2010). “Asymmetric Responses of Prices to
Exchange Rate Variations. Evidence from the G7 Countries,” unpublished manuscript,
Rouen University Business School.
Delatte, A.-L. and López-Villavicencio, A. (2011). “Asymmetric Exchange Rate Pass-
Through. Evidence from Major Economies,” unpublished manuscript, Rouen Univer-
sity Business School.
Engle, R.F. and Granger, C.W.J. (1987). “Co-integration and Error Correction: Repre-
sentation, Estimation and Testing.” Econometrica, 55 (2), 251-276.
Escribano, A., Sipols, A.E. and Aparicio, F.M. (2006). “Nonlinear Cointegration and
Nonlinear Error Correction: Record Counting Cointegration Tests.” Communications
in Statistics–Simulation and Computation 35 (4), 939-956.
Galeotti, M., Lanza, A. and Manera, M. (2003). “Rockets and Feathers Revisited: An
International Comparison on European Gasoline Markets.” Energy Economics, 25 (2),
175-190.
Granger, C.W.J. and Yoon, G. (2002). “Hidden Cointegration,” unpublished manuscript,
University of California San Diego.
Grasso, M. and Manera, M. (2007). “Asymmetric Error Correction Models for the Oil-
Gasoline Price Relationship.” Energy Policy, 35 (1), 156-177.
Greenwood-Nimmo, M.J., Shin, Y. and Van Treeck, T. (2011). “The Great Moderation
and the Decoupling of Monetary Policy from Long-Term Rates in the U.S. and
Germany.” Mimeo: Leeds University Business School.
Greenwood-Nimmo, M.J., Shin, Y. and Van Treeck, T. (2011). “The Asymmetric ARDL
Model with Multiple Unknown Threshold Decompositions: An Application to the
Phillips Curve in Canada.” Mimeo: Leeds University Business School.
Hamada, K. and Kurosaka, Y. (1984). “The Relationship between Production and
Unemployment in Japan: Okun’s Law in a Comparative Perspective.” European Economic
Review, 25 (1), 71-94.
Hamilton, J.D. (1994). Time Series Analysis. Princeton (NJ): Princeton University Press.
Hansen, B.E. (1995). “Rethinking the Univariate Approach to Unit Root Tests: How to
use Covariates to Increase Power.” Econometric Theory, 11 (5), 1148-1171.
Hansen, B.E. (2000). “Sample Splitting and Threshold Estimation.” Econometrica, 68
(3), 575-603.
Johnson, R.N. (2002). “Search Costs, Lags and Prices at the Pump.” Review of Industrial
Organization, 20 (1), 33-50.
Kahneman, D. and Tversky, A. (1979). “Prospect Theory: An Analysis of Decisions
under Risk.” Econometrica, 47 (2), 263-291.
Kapetanios, G., Shin, Y. and Snell, A. (2006). “Testing for Cointegration in Nonlinear
Smooth Transition Error Correction Models.” Econometric Theory, 22 (2), 279-303.
Keynes, J.M. (1936). The General Theory of Employment, Interest and Money. London:
Macmillan.
Kremers, J.J.M., Ericsson, N.R. and Dolado, J.J. (1992). “The Power of Cointegration
Tests.” Oxford Bulletin of Economics and Statistics, 54 (3), 325-348.
Lang, D. and de Peretti, C. (2009). “A Strong Hysteretic Model for Okun’s Law: Theory
and Preliminary Investigation”. International Review of Applied Economics, 23 (4),
445-462.
Lardic, S. and Mignon, V. (2008). “Oil Prices and Economic Activity: An Asymmetric
Cointegration Approach.” Energy Economics, 30 (3), 847-855.
Lee, J. (2000). “The Robustness of Okun’s Law: Evidence from OECD countries.” Jour-
nal of Macroeconomics, 22 (2), 331-56.
Neftci, S.N. (1984). “Are Economic Time Series Asymmetric over the Business Cycle?”
Journal of Political Economy, 92 (2), 307-328.
Nguyen, V.H. and Shin, Y. (2010). “Asymmetric Price Impacts of Order Flow on Exchange
Rate Dynamics.” Mimeo: Leeds University Business School.
Park, J.Y. and Phillips, P.C.B. (2001). “Nonlinear Regressions with Integrated Time
Series.” Econometrica, 69 (1), 117-161.
Pesaran, M.H. and Shin, Y. (1998). “An Autoregressive Distributed Lag Modelling Approach
to Cointegration Analysis.” In Econometrics and Economic Theory: The Ragnar
Frisch Centennial Symposium, ed. S. Strom. Cambridge: Cambridge University
Press, pp. 371-413.
Pesaran, M.H., Shin, Y. and Smith, R.J. (1999). “Pooled Mean Group Estimation of
Dynamic Heterogeneous Panels.” Journal of the American Statistical Association, 94
(446), 621-634.
Pesaran, M.H., Shin, Y. and Smith, R.J. (2001). “Bounds Testing Approaches to the
Analysis of Level Relationships.” Journal of Applied Econometrics, 16 (3), 289-326.
Phillips, P.C.B. and Hansen, B. (1990). “Statistical Inference in Instrumental Variables
Regression with I(1) Processes.” Review of Economic Studies, 57 (1), 99-125.
Psaradakis, Z., Sola, M. and Spagnolo, F. (2004). “On Markov Error-Correction Models
with an Application to Stock Prices and Dividends.” Journal of Applied Econometrics,
19 (1), 69-88.
Radchenko, S. (2005). “Oil Price Volatility and the Asymmetric Response of Gasoline
Prices to Oil Price Increases and Decreases.” Energy Economics, 27 (5), 708-730.
Reilly, B. and Witt, R. (1998). “Petrol Price Asymmetries Revisited.” Energy Economics,
20 (3), 297-308.
Saikkonen, P. (1991). “Asymptotically Efficient Estimation of Cointegrating Regressions.”
Econometric Theory, 7 (1), 1-21.
Saikkonen, P. (2008). “Stability of Regime Switching Error Correction Models under
Linear Cointegration.” Econometric Theory, 24 (1), 294-318.
Saikkonen, P. and Choi, I. (2004). “Cointegrating Smooth Transition Regressions.”
Econometric Theory, 20 (2), 301-340.
Schorderet, Y. (2001). “Revisiting Okun’s Law: An Hysteretic Perspective,” unpublished
manuscript, University of California San Diego.
Schorderet, Y. (2003). “Asymmetric Cointegration,” unpublished manuscript, University
of Geneva.
Shin, Y. and Yu, B. (2004). “An ARDL Approach to an Analysis of Asymmetric Long-run
Cointegrating Relationships.” Mimeo: Leeds University Business School.
Shiller, R.J. (1993). Macro Markets: Creating Institutions for Managing Society’s Largest
Economic Risks. Oxford: Clarendon Press.
Shiller, R.J. (2005). Irrational Exuberance (2nd ed.). Princeton (NJ): Princeton Univer-
sity Press.
Shirvani, H. and Wilbratte, B. (2000). “Does Consumption Respond More Strongly to
Stock Market Declines than to Increase?” International Economic Journal, 14 (3),
41-49.
Stock, J.H. and Watson, M.W. (1993). “A Simple Estimator of Cointegrating Vectors in
Higher Order Integrated Systems.” Econometrica, 61 (4), 783-820.
Tanaka, Y. (2001). “Employment Tenure, Job Expectancy and Earnings Profile in Japan.”
Applied Economics, 33 (3), 365-374.
Van Treeck, T. (2008). “Asymmetric Income and Wealth Effects in a Non-linear Error
Correction Model of US Consumer Spending,” IMK Working Paper 6/2008, Hans-
Böckler Foundation, Düsseldorf.
Virén, M. (2001). “The Okun Curve is Non-linear.” Economics Letters, 70 (2), 253-57.
Webber, A.G. (2000). “Newton’s Gravity Law and Import Prices in the Asia Pacific.”
Japan and the World Economy, 12 (1), 71-87.
Table 1: Monte Carlo Simulation Results: Bias, Standard Error and RMSE of the OLS Estimator

                         T = 100                   T = 200                   T = 400
            Coef    Bias    STDE    RMSE      Bias    STDE    RMSE      Bias    STDE    RMSE
ω = −0.5    α       0.001   0.308   6.932     0.001   0.194   3.082     0.001   0.130   1.451
            ρ      -0.063   0.070   2.121    -0.029   0.043   0.825    -0.014   0.028   0.354
            θ+      0.019   0.054   1.283     0.011   0.028   0.475     0.006   0.016   0.195
            θ−      0.051   0.073   2.001     0.026   0.044   0.806     0.013   0.029   0.351
            ϕ+     -0.001   0.179   4.019     0.000   0.122   1.930    -0.001   0.085   0.954
            ϕ−     -0.002   0.178   4.011    -0.001   0.122   1.937     0.001   0.085   0.954
            β+     -0.031   0.205   4.664    -0.010   0.102   1.628    -0.003   0.051   0.567
            β−     -0.031   0.205   4.661    -0.010   0.102   1.631    -0.003   0.051   0.567
ω = 0       α       0.002   0.366   8.230     0.001   0.229   3.633     0.000   0.150   1.681
            ρ      -0.075   0.077   2.427    -0.037   0.049   0.970    -0.018   0.033   0.417
            θ+      0.037   0.072   1.811     0.018   0.036   0.647     0.009   0.020   0.251
            θ−      0.075   0.098   2.773     0.037   0.056   1.062     0.018   0.035   0.441
            ϕ+      0.002   0.206   4.631     0.000   0.141   2.228     0.001   0.098   1.102
            ϕ−     -0.001   0.205   4.613     0.000   0.140   2.221     0.000   0.099   1.104
            β+     -0.002   0.227   5.104     0.000   0.114   1.812     0.000   0.057   0.637
            β−     -0.001   0.227   5.097     0.000   0.114   1.813     0.000   0.057   0.638
ω = 0.5     α       0.005   0.311   7.001     0.001   0.195   3.096     0.001   0.129   1.441
            ρ      -0.063   0.070   2.121    -0.029   0.043   0.826    -0.014   0.028   0.354
            θ+      0.044   0.074   1.929     0.018   0.036   0.634     0.008   0.019   0.234
            θ−      0.075   0.102   2.854     0.032   0.054   1.001     0.015   0.032   0.396
            ϕ+      0.002   0.178   4.001     0.001   0.123   1.948     0.000   0.085   0.952
            ϕ−      0.002   0.178   4.002     0.000   0.122   1.933     0.000   0.085   0.957
            β+      0.031   0.207   4.714     0.010   0.102   1.625     0.003   0.050   0.565
            β−      0.032   0.207   4.714     0.010   0.102   1.621     0.003   0.050   0.566

Note: Bias $= \hat{\theta}_R - \theta_0$, where $\theta_0$ is the true value of the coefficient $\theta$ and $\hat{\theta}_R$ is the mean of the
estimates of $\theta$ across replications, i.e., $\hat{\theta}_R = \sum_{i=1}^{R} \hat{\theta}_i / R$, where $R$ is the number of replications
(we set $R = 3{,}000$ in all cases). STDE denotes the standard error of the estimator, $\hat{\theta}_i$, across
replications. RMSE denotes the root mean squared error of $\hat{\theta}_i$, defined as
$\sqrt{R^{-1} \sum_{i=1}^{R} (\hat{\theta}_i - \theta_0)^2}$.
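The three summary measures defined in the note to Table 1 can be computed as in the following minimal sketch. The function name and the four illustrative estimates are hypothetical, and the standard error is computed with divisor R (an assumption, since the note does not say whether R or R − 1 is used).

```python
import math


def summarize(estimates, true_value):
    """Return (bias, stde, rmse) of replicated estimates of one coefficient."""
    R = len(estimates)
    mean = sum(estimates) / R
    bias = mean - true_value
    # Dispersion of the estimates around their own mean (divisor R is assumed).
    stde = math.sqrt(sum((e - mean) ** 2 for e in estimates) / R)
    # Dispersion of the estimates around the true value.
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / R)
    return bias, stde, rmse


# Four made-up replications of an estimator whose true value is 1.0.
bias, stde, rmse = summarize([0.9, 1.1, 1.2, 1.0], 1.0)
print(bias, stde, rmse)
```

With divisor R the three measures satisfy RMSE² = Bias² + STDE², which gives a quick consistency check on any such summary.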
Table 2: Monte Carlo Simulation Results: Size and Power of Wald and PSS Tests

                              T = 100           T = 200           T = 400
            Test              Power   Size      Power   Size      Power   Size
ω = −0.5    W_LR              0.981   0.089     1.000   0.067     1.000   0.059
            W_SR              0.425   0.075     0.675   0.050     0.935   0.055
            F_PSS^{k=1}       0.610   0.040     0.995   0.045     1.000   0.030
            F_PSS^{k=2}       0.765   0.090     1.000   0.100     1.000   0.070
            F_PSS^{(b)}       0.720   0.050     1.000   0.050     1.000   0.050
ω = 0       W_LR              0.974   0.100     1.000   0.075     1.000   0.062
            W_SR              0.308   0.051     0.548   0.053     0.870   0.035
            F_PSS^{k=1}       0.329   0.036     0.947   0.030     1.000   0.025
            F_PSS^{k=2}       0.527   0.080     0.988   0.072     1.000   0.070
            F_PSS^{(b)}       0.422   0.050     0.976   0.050     1.000   0.050
ω = 0.5     W_LR              0.982   0.098     1.000   0.075     1.000   0.061
            W_SR              0.385   0.075     0.675   0.025     0.925   0.055
            F_PSS^{k=1}       0.540   0.035     0.985   0.025     1.000   0.025
            F_PSS^{k=2}       0.735   0.080     1.000   0.080     1.000   0.075
            F_PSS^{(b)}       0.655   0.050     0.995   0.050     1.000   0.050

Note: W_LR denotes the Wald test of the null hypothesis of long-run symmetry defined as
θ+ = θ−. W_SR is the Wald test of the short-run symmetry restrictions ϕ+ = ϕ−. F_PSS^{k=n}
denotes the PSS F-test of the null hypothesis ρ = θ+ = θ− = 0 using the k = n critical
values, where n = (1, 2), for the case where all regressors follow nonstationary I(1) processes.
F_PSS^{(b)} refers to the bootstrapped PSS test.
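Size and power figures of the kind reported in Table 2 are rejection frequencies across replications. A minimal sketch, assuming a one-sided test that rejects for large values of the statistic; the function name and the five statistics are made up for illustration, and 3.84 is used as the approximate 5% critical value of a chi-squared distribution with one degree of freedom.

```python
def rejection_rate(statistics, critical_value):
    """Share of replications in which the statistic exceeds the critical value."""
    rejections = sum(1 for s in statistics if s > critical_value)
    return rejections / len(statistics)


# Hypothetical Wald statistics from five replications.
rate = rejection_rate([1.2, 4.5, 3.9, 0.7, 5.1], 3.84)
print(rate)
```

When the statistics are simulated under the null hypothesis the rate estimates the size of the test; when they are simulated under an alternative it estimates the power.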
Table 3: Static Estimation of the Unemployment-Output Relationship

(a) Static Linear Regression

                    US                   Canada                 Japan
Var.          Coeff.     S.E.      Coeff.     S.E.      Coeff.     S.E.
Constant       73.16     3.92       74.96     2.94       29.94     1.25
Trend           0.03     0.00        0.03     0.00        0.02     0.00
y_t           -15.66     0.94      -15.19     0.70       -6.38     0.28
R^2             0.77                 0.78                 0.89
Adj. R^2        0.77                 0.78                 0.89
χ²_SC         250.84[.000]         233.28[.000]         235.08[.000]
χ²_H           69.29[.000]           1.95[.163]           0.29[.593]
χ²_FF         109.11[.000]           0.21[.901]          60.72[.000]
χ²_N            3.40[.183]           6.52[.011]          21.62[.000]
EG_MAX         -2.90                -2.42                -2.86

(b) Static Asymmetric Regression

                    US                   Canada                 Japan
Var.          Coeff.     S.E.      Coeff.     S.E.      Coeff.     S.E.
Const.          7.82     0.10       10.56     0.10        2.55     0.62
y_t^+         -10.73     0.51      -13.05     0.48       -4.61     0.28
y_t^-         -25.83     1.81      -20.38     0.92       -7.70     0.33
R^2             0.78                 0.81                 0.87
Adj. R^2        0.77                 0.81                 0.87
χ²_SC         248.82[.000]         231.04[.000]         240.02[.000]
χ²_H           66.99[.000]           0.31[.580]           0.16[.690]
χ²_FF         110.39[.000]           0.23[.892]          57.18[.000]
χ²_N           11.23[.004]           7.97[.005]          22.69[.000]
W_{y+=y-}     129.20[.000]         258.10[.000]        1607.50[.000]
EG_MAX         -2.79                -2.60                -2.55

Note: y_t denotes the natural logarithm of industrial production and y_t^+ and y_t^- the associated
positive and negative partial sum processes. Note also that in order to accommodate
the strong trending behavior of y_t, we include a deterministic time trend in the symmetric
case. χ²_SC, χ²_H, χ²_FF and χ²_N denote LM tests for serial correlation, heteroscedasticity,
functional form (Ramsey’s RESET test) and normality, respectively. Figures in square
brackets are the associated p-values. W_{y+=y-} denotes the Wald test of the equality
of the coefficients associated with y_t^+ and y_t^-. EG_MAX denotes the largest value of the
Engle-Granger residual-based ADF test. The 95% critical values of the EG test are -3.42
(panel (a)) and -3.77 (panel (b)).
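The positive and negative partial sum processes mentioned in the note can be built as in the sketch below. It assumes the usual construction in which $y_t^{+}$ cumulates the positive first differences of $y_t$ and $y_t^{-}$ cumulates the negative ones, so that $y_t = y_0 + y_t^{+} + y_t^{-}$; the function name and the toy series are hypothetical.

```python
def partial_sums(series):
    """Split a series into cumulated positive and cumulated negative changes.

    Returns two lists (pos, neg), each of length len(series) - 1, such that
    series[t] == series[0] + pos[t - 1] + neg[t - 1] up to rounding error.
    """
    pos, neg = [], []
    cum_pos, cum_neg = 0.0, 0.0
    for previous, current in zip(series, series[1:]):
        change = current - previous
        cum_pos += max(change, 0.0)  # only increases accumulate here
        cum_neg += min(change, 0.0)  # only decreases accumulate here
        pos.append(cum_pos)
        neg.append(cum_neg)
    return pos, neg


y = [1.0, 1.5, 1.2, 1.2, 2.0]
y_pos, y_neg = partial_sums(y)
print(y_pos, y_neg)
```

Regressing on the two partial sums separately, as in panel (b), lets the coefficient on output increases differ from the coefficient on output decreases.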
Table 4: Dynamic Linear Estimation of the Unemployment-Output Relationship

              US                            Canada                          Japan
Var.          Coeff.   S.E.     Var.          Coeff.   S.E.     Var.          Coeff.   S.E.
u_{t-1}       -0.03    0.01     u_{t-1}       -0.02    0.01     u_{t-1}        0.00    0.01
y_{t-1}       -0.04    0.07     y_{t-1}       -0.09    0.10     y_{t-1}       -0.02    0.06
Δu_{t-1}      -0.17    0.06     Δu_{t-2}      -0.12    0.06     Δu_{t-1}      -0.26    0.06
Δu_{t-11}      0.13    0.05     Δy_t          -4.40    1.19     Δu_{t-2}      -0.22    0.06
Δy_t          -8.17    1.61     Δy_{t-2}      -2.83    1.21     Δu_{t-10}      0.16    0.06
Δy_{t-2}      -4.73    1.58     Δy_{t-6}      -3.01    1.16     Δu_{t-12}     -0.18    0.06
Δy_{t-4}      -4.04    1.50     Const.         0.57    0.55     Δy_{t-1}      -1.37    0.42
Const.         0.38    0.35                                     Δy_{t-2}      -1.27    0.45
                                                                Δy_{t-3}      -1.30    0.43
                                                                Δy_{t-9}      -1.16    0.39
                                                                Const.         0.09    0.27
L_y           -1.66    2.03     L_y           -5.68    3.89     L_y            5.57   20.88
R^2            0.29             R^2            0.13             R^2            0.23
Adj. R^2       0.27             Adj. R^2       0.11             Adj. R^2       0.20
χ²_SC         10.75[.550]       χ²_SC          9.35[.673]       χ²_SC         11.95[.450]
χ²_FF          1.94[.163]       χ²_FF          0.26[.609]       χ²_FF          0.03[.867]
χ²_NOR         3.72[.156]       χ²_NOR        12.35[.002]       χ²_NOR         0.92[.632]
χ²_HET        15.19[.000]       χ²_HET         0.09[.770]       χ²_HET         0.41[.521]
t_BDM         -2.34[.136]       t_BDM         -1.27[.820]       t_BDM          0.57[1.000]
F_PSS          4.69[.081]       F_PSS          0.81[.927]       F_PSS          0.18[.890]

Note: u_t denotes the rate of unemployment, measured in percentage points. Here we follow
the general-to-specific approach to select the final ARDL specification. The preferred
specification is chosen by starting with max p = max q = 12 and dropping all insignificant
stationary regressors. t_BDM is the BDM t-statistic while F_PSS denotes the PSS F-statistic
testing the null hypothesis ρ = θ = 0. The long-run coefficient L_y is defined by $\hat{\beta} = -\hat{\theta}/\hat{\rho}$.
Pesaran, Shin and Smith (2001) tabulate the 5% critical values for k = 1 as follows:
t_crit = −3.22; F_crit = 5.73. Empirical p-values are quoted for the BDM t-statistic and the
PSS F-statistic.
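The long-run coefficient formula in the note, $\hat{\beta} = -\hat{\theta}/\hat{\rho}$, can be illustrated with the rounded US entries of Table 4 (the coefficient on u_{t-1} as ρ̂ and the coefficient on y_{t-1} as θ̂). The function name is hypothetical, and this is only a sketch of the arithmetic, not of the estimation.

```python
def long_run_coefficient(theta_hat, rho_hat):
    """Long-run coefficient implied by the level terms of the ECM: -theta / rho."""
    if rho_hat == 0.0:
        raise ZeroDivisionError("rho_hat is zero: no error correction, long-run coefficient undefined")
    return -theta_hat / rho_hat


# Rounded US estimates from Table 4: theta_hat = -0.04, rho_hat = -0.03.
beta_us = long_run_coefficient(-0.04, -0.03)
print(round(beta_us, 2))  # -1.33
```

The result differs from the reported L_y of -1.66, presumably because the table entries are rounded to two decimals; when ρ̂ is close to zero, small rounding differences move the ratio considerably.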
Table 5: Dynamic Asymmetric Estimation of the Unemployment-Output Relationship

              US                            Canada                          Japan
Var.          Coeff.   S.E.     Var.          Coeff.   S.E.     Var.          Coeff.   S.E.
u_{t-1}       -0.06    0.01     u_{t-1}       -0.07    0.02     u_{t-1}       -0.05    0.01
y^+_{t-1}     -0.55    0.17     y^+_{t-1}     -1.27    0.28     y^+_{t-1}     -0.34    0.10
y^-_{t-1}     -1.62    0.50     y^-_{t-1}     -2.09    0.46     y^-_{t-1}     -0.53    0.14
Δu_{t-1}      -0.19    0.06     Δu_{t-2}      -0.13    0.06     Δu_{t-1}      -0.23    0.06
Δu_{t-11}      0.11    0.05     Δu_{t-12}     -0.12    0.06     Δu_{t-2}      -0.19    0.06
Δy^+_t        -8.42    2.23     Δy^+_t        -5.24    1.86     Δu_{t-10}      0.13    0.06
Δy^+_{t-2}    -4.82    1.99     Δy^+_{t-3}     3.69    1.86     Δu_{t-12}     -0.22    0.06
Δy^-_t        -8.24    4.28     Δy^-_t        -5.15    2.60     Δy^+_{t-1}    -1.61    0.65
Δy^-_{t-4}    -9.74    3.77     Δy^-_{t-3}    -5.89    2.64     Δy^+_{t-9}    -1.71    0.66
Const.         0.38    0.11     Const.         0.72    0.19     Δy^-_t        -1.80    0.71
                                                                Const.         0.16    0.04
L_{y+}        -9.76    1.74     L_{y+}       -17.26    2.15     L_{y+}        -7.28    1.64
L_{y-}       -28.88    6.33     L_{y-}       -28.48    4.04     L_{y-}       -11.26    1.97
R^2            0.32             R^2            0.20             R^2            0.24
Adj. R^2       0.30             Adj. R^2       0.17             Adj. R^2       0.21
χ²_SC          9.23[.683]       χ²_SC          8.11[.777]       χ²_SC         11.85[.458]
χ²_FF          0.53[.466]       χ²_FF          9.74[.002]       χ²_FF          0.11[.744]
χ²_NOR         1.79[.409]       χ²_NOR        12.62[.002]       χ²_NOR         0.30[.861]
χ²_HET        12.81[.000]       χ²_HET         0.38[.537]       χ²_HET         2.77[.096]
t_BDM         -3.97[.007]       t_BDM         -4.12[.006]       t_BDM         -3.34[.033]
F_PSS          6.98[.010]       F_PSS          7.13[.005]       F_PSS          5.38[.038]
W_LR          16.33[.000]       W_LR          32.49[.000]       W_LR          76.69[.000]
W_SR           0.46[.498]       W_SR           3.65[.056]       W_SR           2.35[.125]

Note: L_{y+} and L_{y-} denote the long-run coefficients associated with positive and negative
changes of output, respectively. W_LR refers to the Wald test of long-run symmetry (i.e.
L_{y+} = L_{y-}) while W_SR denotes the Wald test of the additive short-run symmetry condition.
Pesaran, Shin and Smith (2001) tabulate the 5% critical values of t_BDM as -3.53
and -3.22 for k = 2 and k = 1, respectively, while the equivalent values for F_PSS are 4.85
and 5.73. Empirical p-values are reported for both tests.
Table 6: Dynamic Asymmetric Estimation of Gasoline Price Adjustments

      LR & SR Symmetry               LR Sym & SR Asym               LR & SR Asymmetry
Var.           Coeff.   S.E.     Var.           Coeff.   S.E.     Var.           Coeff.   S.E.
p_{t-1}        -0.20    0.06     p_{t-1}        -0.18    0.05     p_{t-1}        -0.18    0.07
p^o_{t-1}       0.10    0.03     p^o_{t-1}       0.10    0.02     p^{o+}_{t-1}    0.07    0.04
x_{t-1}         0.26    0.08     x_{t-1}         0.18    0.07     p^{o-}_{t-1}    0.07    0.04
Δp^o_t          0.11    0.05     Δp^{o+}_t       0.30    0.08     x^+_{t-1}       0.13    0.07
Δp^o_{t-1}      0.10    0.04     Δp^{o-}_{t-1}   0.11    0.05     x^-_{t-1}       0.09    0.14
Δx_t            0.56    0.09     Δx^+_t          0.61    0.09     Δp^{o+}_t       0.26    0.09
Δx_{t-3}        0.22    0.09     Δx^+_{t-3}      0.33    0.10     Δp^{o+}_{t-1}   0.16    0.08
Const.         -1.22    0.44     Const.         -0.79    0.37     Δx^+_t          0.68    0.11
                                                                  Δx^+_{t-3}      0.39    0.10
                                                                  Const.          0.70    0.21
L_{po}          0.49    0.06     L_{po}          0.54    0.06     L_{po+}         0.40    0.25
                                                                  L_{po-}         0.37    0.30
L_x             1.31    0.13     L_x             1.00    0.18     L_{x+}          0.73    0.29
                                                                  L_{x-}          0.48    0.73
R^2             0.56             R^2             0.60             R^2             0.60
Adj. R^2        0.50             Adj. R^2        0.56             Adj. R^2        0.54
χ²_SC           3.08[.544]       χ²_SC           2.56[.638]       χ²_SC           2.00[.736]
χ²_FF           8.45[.004]       χ²_FF           1.38[.240]       χ²_FF           1.50[.221]
χ²_N            4.47[.107]       χ²_N           15.43[.000]       χ²_N           11.25[.004]
χ²_H            1.70[.193]       χ²_H            0.00[.995]       χ²_H            0.00[.979]
t_BDM          -3.57[.076]       t_BDM          -4.02[.107]       t_BDM          -2.72[.210]
F_PSS           4.86[.100]       F_PSS           9.69[.036]       F_PSS           5.51[.239]
                                                                  W_{LR,po}       0.03[.866]
                                                                  W_{LR,x}        0.17[.680]
                                 W_{SR,po}       3.49[.062]       W_{SR,po}      17.17[.000]
                                 W_{SR,x}       44.14[.000]       W_{SR,x}       56.30[.000]

Note: p_t denotes the natural logarithm of the gasoline price index (2000Y=ln(100)), p^o_t
denotes the natural logarithm of the price of crude oil (US$/bbl) while x_t denotes the
natural logarithm of the KRW/USD exchange rate. The superscripts ‘+’ and ‘-’ denote
positive and negative partial sums, respectively. L_{po+}, L_{po-}, L_{x+} and L_{x-} denote the long-run
coefficients associated with positive and negative changes in the price of crude oil and
positive and negative changes in the KRW/USD exchange rate, respectively. W_{LR,po}
refers to the Wald test of the restriction L_{po+} = L_{po-} while W_{LR,x} refers to the Wald
test of L_{x+} = L_{x-}. W_{SR,po} and W_{SR,x} refer to the Wald tests of the short-run additive
symmetry restrictions. The relevant 5% critical values of the t_BDM test are -3.99 for k = 4
and -3.53 for k = 2. Similarly, the critical values of the F_PSS test are 4.01 with k = 4 and
4.85 with k = 2. Empirical p-values are reported for both tests.
[Figure 1: US Unemployment-Output Dynamic Multipliers. Panels: (a) LR & SR asymmetry; (b) LR symmetry & SR asymmetry; (c) LR asymmetry & SR symmetry; (d) LR & SR symmetry. Plotted series: y-, y+ and Diff; horizontal axis from 10 to 80.]
[Figure 2: Canadian Unemployment-Output Dynamic Multipliers. Panels: (a) LR & SR asymmetry; (b) LR asymmetry & SR symmetry. Horizontal axis from 10 to 80.]
[Figure 3: Japanese Unemployment-Output Dynamic Multipliers. Panels: (a) LR & SR asymmetry; (b) LR asymmetry & SR symmetry. Horizontal axis from 10 to 80.]
[Figure 4: Dynamic Multipliers w.r.t. Oil Price and Exchange Rate Shocks. Panels: (a) LR & SR asymmetry (p^o); (b) LR symmetry & SR asymmetry (p^o); (c) LR & SR asymmetry (x); (d) LR symmetry & SR asymmetry (x). Plotted series: p-, p+ and Diff in panels (a)-(b); xr-, xr+ and Diff in panels (c)-(d); horizontal axis from 10 to 80.]
Bounds Testing Approaches to the Analysis of Level Relationships
Author(s): M. Hashem Pesaran, Yongcheol Shin and Richard J. Smith
Source: Journal of Applied Econometrics, Vol. 16, No. 3, Special Issue in Memory of John
Denis Sargan, 1924-1996: Studies in Empirical Macroeconometrics (May - Jun., 2001), pp.
289-326
Published by: Wiley
Stable URL: http://www.jstor.org/stable/2678547
Accessed: 08-08-2016 17:39 UTC
JOURNAL OF APPLIED ECONOMETRICS
J. Appl. Econ. 16: 289-326 (2001)
DOI: 10.1002/jae.616
BOUNDS TESTING APPROACHES TO THE ANALYSIS OF LEVEL RELATIONSHIPS
M. HASHEM PESARAN,a* YONGCHEOL SHINb AND RICHARD J. SMITHc

a Trinity College, Cambridge CB2 1TQ, UK
b Department of Economics, University of Edinburgh, 50 George Square, Edinburgh EH8 9JY, UK
c Department of Economics, University of Bristol, 8 Woodland Road, Bristol BS8 1TN, UK
SUMMARY
This paper develops a new approach to the problem of testing the existence of a level relationship between
a dependent variable and a set of regressors, when it is not known with certainty whether the underlying
regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics
used to test the significance of the lagged levels of the variables in a univariate equilibrium correction
mechanism. The asymptotic distributions of these statistics are non-standard under the null hypothesis that
there exists no level relationship, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic
critical values are provided: one when all regressors are purely I(1) and the other if they are all purely
I(0). These two sets of critical values provide a band covering all possible classifications of the regressors
into purely I(0), purely I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are
proposed. It is shown that the proposed tests are consistent, and their asymptotic distribution under the null
and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is
demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric
model. Copyright © 2001 John Wiley & Sons, Ltd.
1. INTRODUCTION
Over the past decade considerable attention has been paid in empirical economics to testing for
the existence of relationships in levels between variables. In the main, this analysis has been
based on the use of cointegration techniques. Two principal approaches have been adopted: the
two-step residual-based procedure for testing the null of no-cointegration (see Engle and Granger,
1987; Phillips and Ouliaris, 1990) and the system-based reduced rank regression approach due to
Johansen (1991, 1995). In addition, other procedures such as the variable addition approach of […]
(1990), the residual-based procedure for testing the null of cointegration by Sh[…] and the
stochastic common trends (system) approach of Stock and Watson (1988) have been considered.
All of these methods concentrate on cases in which the underlying variables are integrated of order
one. This inevitably involves a certain degree of pre-testing, thus introducing a further degree of
uncertainty into the analysis of levels relationships. (See, for example, Cavanagh […]
1995.)
This paper proposes a new approach to testing for the existence of a relationship between
variables in levels which is applicable irrespective of whether the underlying regressors are purely
* Correspondence to: M. H. Pesaran, Faculty of Economics and Politics, University of Cambridge, Sidgwick Avenue,
Cambridge CB3 9DD. E-mail: hashem.pesaran@econ.cam.ac.uk
Contract/grant sponsor: ESRC; Contract/grant numbers: R000233608; R000237334.
Contract/grant sponsor: Isaac Newton Trust of Trinity College, Cambridge.
Copyright © 2001 John Wiley & Sons, Ltd. Received 16 February 1999
Revised 13 February 2001
I(0), purely I(1) or mutually cointegrated. The statistic underlying our procedure is the
Wald or F-statistic in a generalized Dickey-Fuller type regression used to test the significance
of lagged levels of the variables under consideration in a conditional unrestricted equilibrium
correction model (ECM). It is shown that the asymptotic distributions of both statistics are
non-standard under the null hypothesis that there exists no relationship in levels between the
included variables, irrespective of whether the regressors are purely I(0), purely I(1) or mutually
cointegrated. We establish that the proposed test is consistent and derive its asymptotic distribution
under the null and suitably defined local alternatives, again for a set of regressors which are a
mixture of I(0)/I(1) variables.
Two sets of asymptotic critical values are provided for the two polar cases which assume that all
the regressors are, on the one hand, purely I(1) and, on the other, purely I(0). Since these two sets
of critical values provide critical value bounds for all classifications of the regressors into purely
I(1), purely I(0) or mutually cointegrated, we propose a bounds testing procedure. If the
Wald or F-statistic falls outside the critical value bounds, a conclusive inference can be drawn
without needing to know the integration/cointegration status of the underlying regressors. However,
if the Wald or F-statistic falls inside these bounds, inference is inconclusive and knowledge of the
order of the integration of the underlying variables is required before conclusive inferences can be
made. A bounds procedure is also provided for the related cointegration test proposed by Banerjee
et al. (1998) which is based on earlier contributions by Banerjee et al. (1986) and Kremers et al.
(1992). Their test is based on the t-statistic associated with the coefficient of the lagged dependent
variable in an unrestricted conditional ECM. The asymptotic distribution of this statistic is obtained
for cases in which all regressors are purely I(1), which is the primary context considered by these
authors, as well as when the regressors are purely I(0) or mutually cointegrated. The relevant
critical value bounds for this t-statistic are also detailed.
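The three-way decision rule for the F-statistic described above can be sketched as follows. The function name and the numerical bounds are hypothetical placeholders, not tabulated critical values; `lower` stands for the critical value when all regressors are purely I(0) and `upper` for the critical value when all are purely I(1).

```python
def bounds_decision(statistic, lower, upper):
    """Classify a bounds F-test outcome for the null of no level relationship."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if statistic > upper:
        return "reject"          # conclusive whatever the order of integration
    if statistic < lower:
        return "do not reject"   # conclusive whatever the order of integration
    return "inconclusive"        # the integration order of the regressors is needed


# Hypothetical bounds, chosen only to exercise the three branches.
for value in (5.2, 2.1, 3.5):
    print(value, bounds_decision(value, lower=3.0, upper=4.0))
```

The sketch covers only the F-statistic, which rejects for large values; the t-statistic discussed in the same paragraph rejects for large negative values, so its comparison would run in the opposite direction.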
The empirical relevance of the proposed bounds procedure is demonstrated in a re-examination
of the earnings equation included in the UK Treasury macroeconometric model. This is a
particularly relevant application because there is considerable doubt concerning the order of
integration of variables such as the degree of unionization of the workforce, the replacement
ratio (unemployment benefit-wage ratio) and the wedge between the 'real product wage' and the
'real consumption wage' that typically enter the earnings equation. There is another consideration
in the choice of this application. Under the influence of the seminal contributions of Phillips (1958)
and Sargan (1964), econometric analysis of wages and earnings has played an important role in
the development of time series econometrics in the UK. Sargan's work is particularly noteworthy
as it is some of the first to articulate and apply an ECM to wage rate determination. Sargan,
however, did not consider the problem of testing for the existence of a levels relationship between
real wages and its determinants.
The relationship in levels underlying the UK Treasury's earning equation relates real average
earnings of the private sector to labour productivity, the unemployment rate, an index of union
density, a wage variable (comprising a tax wedge and an import price wedge) and the replacement
ratio (defined as the ratio of the unemployment benefit to the wage rate). These are the variables
predicted by the bargaining theory of wage determination reviewed, for example, in Layard
et al. (1991). In order to identify our model as corresponding to the bargaining theory of wage
determination, we require that the level of the unemployment rate enters the wage equation, but not
vice versa; see Manning (1993). This assumption, of course, does not preclude the rate of change
of earnings from entering the unemployment equation, or there being other level relationships
between the remaining four variables. Our approach accommodates both of these possibilities.
A number of conditional ECMs in these five variables were estimated and we found that, if a
sufficiently high order is selected for the lag lengths of the included variables, the hypothesis that
there exists no relationship in levels between these variables is rejected, irrespective of whether
they are purely I(0), purely I(1) or mutually cointegrated. Given a level relationship between these
variables, the autoregressive distributed lag (ARDL) modelling approach (Pesaran and Shin, […])
is used to estimate our preferred ECM of average earnings.
The plan of the paper is as follows. The vector autoregressive (VAR) model […]
the analysis of this and later sections is set out in Section 2. This section also […]
issues involved in testing for the existence of relationships in levels between variables. Section 3
considers the Wald statistic (or the F-statistic) for testing the hypothesis that there exists no
level relationship between the variables under consideration and derives the associated asymptotic
theory together with that for the t-statistic of Banerjee et al. (1998). Section 4 di[…]
properties of these tests. Section 5 describes the empirical application. Section 6 […]
concluding remarks. The Appendices detail proofs of results given in Sections 3 and […].
The following notation is used. The symbol $\Rightarrow$ signifies 'weak convergence in probability
measure', $I_m$ 'an identity matrix of order $m$', $I(d)$ 'integrated of order $d$', $O_P(K)$ 'of the same
order as $K$ in probability' and $o_P(K)$ 'of smaller order than $K$ in probability'.
2. THE UNDERLYING VAR MODEL AND ASSUMPTIONS
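Let $\{z_t\}_{t=1}^{\infty}$ denote a $(k+1)$-vector random process. The data-generating process is the
VAR model of order $p$ (VAR($p$)):
\[
\Phi(L)(z_t - \mu - \gamma t) = \varepsilon_t, \quad t = 1, 2, \ldots \tag{1}
\]
where $L$ is the lag operator, $\mu$ and $\gamma$ are unknown $(k+1)$-vectors of intercept and trend coefficients,
the $(k+1, k+1)$ matrix lag polynomial $\Phi(L) = I_{k+1} - \sum_{i=1}^{p} \Phi_i L^i$ with $\{\Phi_i\}_{i=1}^{p}$ $(k+1, k+1)$
matrices of unknown coefficients; see Harbo et al. (1[…] […],
henceforth HJNR and PSS respectively. The properties of the error process $\{\varepsilon_t\}_{t=1}^{\infty}$
are given in Assumption 2 below. All the analysis of this paper is conducted given the initial
observations $Z_0 \equiv (z_{1-p}, \ldots, z_0)$. We assume:
Assumption 1. The roots of $|I_{k+1} - \sum_{i=1}^{p} \Phi_i z^i| = 0$ are either outside the unit circle $|z| = 1$ or
satisfy $z = 1$.
Assumption 2. The vector error process $\{\varepsilon_t\}_{t=1}^{\infty}$ is $IN(0, \Omega)$ […].
Assumption 1 permits the elements of $z_t$ to be purely I([…]
the possibility of seasonal unit roots and explosive roots.[…]
to permit $\{\varepsilon_t\}_{t=1}^{\infty}$ to be a conditionally mean zero and […]
PSS, Assumption 4.1.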
We may re-express the lag polynomial 4(L) in vect
form; i.e.. ¢(L) -HL + 1(L)(I - L) in which the long-ru
1 Assumptions 5a and 5b below further restrict the maximal order of
Copyright © 2001 John Wiley & Sons, Ltd. J. Appl. Econ. 16: 289-
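$-(I_{k+1} - \sum_{i=1}^{p} \Phi_i)$, and the short-run response matrix lag polynomial $\Gamma(L) \equiv I_{k+1} - \sum_{i=1}^{p-1} \Gamma_i L^i$, $\Gamma_i = -\sum_{j=i+1}^{p} \Phi_j$, $i = 1, \ldots, p-1$. Hence, the VAR($p$) model (1) may be rewritten in vector ECM form as
$$\Delta z_t = a_0 + a_1 t + \Pi z_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta z_{t-i} + \varepsilon_t, \quad t = 1, 2, \ldots \qquad (2)$$
where $\Delta \equiv 1 - L$ is the difference operator,
$$a_0 \equiv -\Pi\mu + (\Gamma + \Pi)\gamma, \qquad a_1 \equiv -\Pi\gamma \qquad (3)$$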
and the sum of the short-run coefficient matri

detailed in PSS, Section 2, if y 7& 0, the resultant con
in (2) ensure that the deterministic trending behaviour o
the (cointegrating) rank of II; a similar result holds for t
Consequently, critical regions defined in terms of the W
asymptotically similar.2
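As an illustration of the reparameterization from (1) to (2), the following minimal Python sketch maps a list of VAR coefficient matrices $\Phi_1, \ldots, \Phi_p$ into the long-run multiplier matrix $\Pi = -(I_{k+1} - \sum_i \Phi_i)$ and the short-run matrices $\Gamma_i = -\sum_{j=i+1}^{p} \Phi_j$. The function name `var_to_vecm` and the numerical values in the example are assumptions made for this illustration only; they do not come from the paper.

```python
def var_to_vecm(phis):
    """Map VAR(p) coefficient matrices [Phi_1, ..., Phi_p] (lists of lists)
    to (Pi, [Gamma_1, ..., Gamma_{p-1}]) of the vector ECM form."""
    p = len(phis)
    n = len(phis[0])
    # Pi = -(I - sum_i Phi_i) = sum_i Phi_i - I
    pi = [
        [sum(phi[r][c] for phi in phis) - (1.0 if r == c else 0.0) for c in range(n)]
        for r in range(n)
    ]
    # Gamma_i = -sum_{j=i+1}^{p} Phi_j; Phi_j is stored at index j - 1
    gammas = [
        [[-sum(phis[j][r][c] for j in range(i, p)) for c in range(n)] for r in range(n)]
        for i in range(1, p)
    ]
    return pi, gammas


# Hypothetical bivariate VAR(2); the zero in the lower-left of Pi
# mimics a case with no level feedback from y to x.
phi_1 = [[0.5, 0.1], [0.0, 0.6]]
phi_2 = [[0.3, 0.0], [0.0, 0.4]]
pi_matrix, gamma_list = var_to_vecm([phi_1, phi_2])
```

With these illustrative values the second row of `pi_matrix` is zero, so the second variable has no level term in its own ECM equation.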
The focus of this paper is on the conditional modellin
vector xt and the past values {zt-}i=l and Zo, where we ha
the error term et conformably with t = (Y' x)' as t =
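$$\Omega = \begin{pmatrix} \omega_{yy} & \omega_{yx} \\ \omega_{xy} & \Omega_{xx} \end{pmatrix}$$
we may express $\varepsilon_{yt}$ conditionally in terms of $\varepsilon_{xt}$ as
$$\varepsilon_{yt} = \omega_{yx}\Omega_{xx}^{-1}\varepsilon_{xt} + u_t \qquad (4)$$
where $u_t \sim IN(0, \omega_{uu})$, $\omega_{uu} \equiv \omega_{yy} - \omega_{yx}\Omega_{xx}^{-1}\omega_{xy}$ and $u_t$ is independent of $\varepsilon_{xt}$. Substitution of (4)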

into (2) together with a similar partitioning of
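$\Gamma = (\gamma_y', \Gamma_x')'$, $\Gamma_i = (\gamma_{yi}', \Gamma_{xi}')'$, $i = 1, \ldots, p-1$, provides a conditional model for $\Delta y_t$ in terms of $z_{t-1}$, $\Delta x_t$, $\Delta z_{t-1}, \ldots$; i.e. the conditional ECM
$$\Delta y_t = c_0 + c_1 t + \pi_{y.x} z_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t, \quad t = 1, 2, \ldots \qquad (5)$$
where $\omega \equiv \Omega_{xx}^{-1}\omega_{xy}$, $c_0 \equiv a_{y0} - \omega' a_{x0}$, $c_1 \equiv a_{y1} - \omega' a_{x1}$, $\psi_i' \equiv \gamma_{yi} - \omega'\Gamma_{xi}$, $i = 1, \ldots, p-1$, and $\pi_{y.x} \equiv \pi_y - \omega'\Pi_x$. The deterministic relations (3) are modified to
$$c_0 = -\pi_{y.x}\mu + (\gamma_{y.x} + \pi_{y.x})\gamma, \qquad c_1 = -\pi_{y.x}\gamma \qquad (6)$$
where $\gamma_{y.x} \equiv \gamma_y - \omega'\Gamma_x$.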
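The step from (2) and (4) to the conditional ECM (5) can be spelled out as follows. This is a sketch of the algebra only, writing $a_0 = (a_{y0}, a_{x0}')'$, $a_1 = (a_{y1}, a_{x1}')'$ and $\Pi = (\pi_y', \Pi_x')'$ for the partitions conformable with $z_t = (y_t, x_t')'$; these partition labels are inferred from the definitions of $c_0$, $c_1$ and $\pi_{y.x}$ given with (5).

```latex
\begin{align*}
% Rows of the vector ECM (2) for y_t and x_t
\Delta y_t &= a_{y0} + a_{y1} t + \pi_y z_{t-1}
              + \sum_{i=1}^{p-1} \gamma_{yi} \Delta z_{t-i} + \varepsilon_{yt}, \\
\Delta x_t &= a_{x0} + a_{x1} t + \Pi_x z_{t-1}
              + \sum_{i=1}^{p-1} \Gamma_{xi} \Delta z_{t-i} + \varepsilon_{xt}. \\
% Use (4): \varepsilon_{yt} = \omega' \varepsilon_{xt} + u_t with \omega' = \omega_{yx}\Omega_{xx}^{-1},
% and replace \varepsilon_{xt} by the second equation solved for it
\Delta y_t &= (a_{y0} - \omega' a_{x0}) + (a_{y1} - \omega' a_{x1}) t
              + (\pi_y - \omega' \Pi_x) z_{t-1} \\
           &\quad + \sum_{i=1}^{p-1} (\gamma_{yi} - \omega' \Gamma_{xi}) \Delta z_{t-i}
              + \omega' \Delta x_t + u_t .
\end{align*}
```

Matching coefficients term by term gives $c_0$, $c_1$, $\pi_{y.x}$ and $\psi_i'$ exactly as defined after (5).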

We now partition the long-run multiplier matrix $\Pi$ conformably with $z_t = (y_t, x_t')'$ as
$$\Pi = \begin{pmatrix} \pi_{yy} & \pi_{yx} \\ \pi_{xy} & \Pi_{xx} \end{pmatrix}$$
2 See also Nielsen and Rahbek (1998) for an analysis of similarity issues in cointegrated systems.
The next assumption is critical for the analysis of this paper.
Assumption 3. The $k$-vector $\pi_{xy} = 0$.
In the application of Section 6, Assumption 3 is an identifying assumption for t

theory of wage determination. Under Assumption 3,
$$\Delta x_t = a_{x0} + a_{x1} t + \Pi_{xx} x_{t-1} + \sum_{i=1}^{p-1} \Gamma_{xi} \Delta z_{t-i} + \varepsilon_{xt}, \quad t = 1, 2, \ldots \qquad (7)$$
Thus, we may regard the process $\{x_t\}_{t=1}^{\infty}$ as long-run forcing for $\{y_t\}_{t=1}^{\infty}$ as there is no feedback from the level of $y_t$ in (7); see Granger and Lin (1995).3 Assumption 3 restricts consideration to cases in which there exists at most one conditional level relationship between $y_t$ and $x_t$, irrespective of the level of integration of the process $\{x_t\}_{t=1}^{\infty}$; see (10) below.4
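Under Assumption 3, the conditional ECM (5) now becomes
$$\Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (8)$$
$t = 1, 2, \ldots$, where
$$c_0 = -(\pi_{yy}, \pi_{yx.x})\mu + [\gamma_{y.x} + (\pi_{yy}, \pi_{yx.x})]\gamma, \qquad c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma \qquad (9)$$
and $\pi_{yx.x} \equiv \pi_{yx} - \omega'\Pi_{xx}$.5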
The next assumption together with Assumptions 5a and 5b below which constrain the maximal
order of integration of the system (8) and (7) to be unity defines the cointegration properties of
the system.
Assumption 4. The matrix $\Pi_{xx}$ has rank $r$, $0 \le r \le k$.
Under Assumption 4, from (7), we may express $\Pi_{xx}$ as $\Pi_{xx} = \alpha_{xx}\beta_{xx}'$, where $\alpha_{xx}$ and $\beta_{xx}$ are both $(k, r)$ matrices of full column rank; see, for example, Engle and Granger (1987) and Johansen (1991). If the maximal order of integration of the system (8) and (7) is unity, under Assumptions 1, 3 and 4, the process $\{x_t\}_{t=1}^{\infty}$ is mutually cointegrated of order $r$, $0 \le r \le k$. However, in contradistinction to, for example, Banerjee, Dolado and Mestre (1998), BDM henceforth, who concentrate on the case $r = 0$, we do not wish to impose an a priori specification of $r$.6 When $\pi_{xy} = 0$ and $\Pi_{xx} = 0$, then $x_t$ is weakly exogenous for $\pi_{yy}$ and $\pi_{yx.x} = \pi_{yx}$ in (8); see, for example,
3 Note that this restriction does not preclude $\{y_t\}_{t=1}^{\infty}$ being Granger-causal for $\{x_t\}_{t=1}^{\infty}$ in the short run.
4 Assumption 3 may be straightforward
asymptotic properties of such a test are the
5 PSS and HJNR consider a similar model
If current and lagged values of a weakly e
in (8), the lagged level vector xt- should b
asymptotic similarity of the statistics dis
6 BDM, pp. 277-278, also briefly discuss
below, the validity of the limiting distribut
untested assumptions.
Johansen (1995, Theorem 8.1, p. 122). In the more general case where $\Pi_{xx}$ is non-zero, as $\pi_{yy}$ and $\pi_{yx.x} = \pi_{yx} - \omega'\Pi_{xx}$ are variation-free from the parameters in (7), $x_t$ is also weakly exogenous for the parameters of (8).
Note that under Assumption 4 the maximal cointegrating rank of the long-run multiplier matrix $\Pi$ for the system (8) and (7) is $r + 1$ and the minimal cointegrating rank of $\Pi$ is $r$. The next assumptions provide the conditions for the maximal order of integration of the system (8) and (7) to be unity. First, we consider the requisite conditions for the case in which rank($\Pi$) $= r$. In this case, under Assumptions 1, 3 and 4, $\pi_{yy} = 0$ and $\pi_{yx} - \phi'\Pi_{xx} = 0'$ for some $k$-vector $\phi$. Note that $\pi_{yx.x} = 0'$ implies the latter condition. Thus, under Assumptions 1, 3 and 4, $\Pi$ has rank $r$ and is given by
$$\Pi = \begin{pmatrix} 0 & \pi_{yx} \\ 0 & \Pi_{xx} \end{pmatrix}$$
Hence, we may express $\Pi = \alpha\beta'$ where $\alpha = (\alpha_{yx}', \alpha_{xx}')'$ and $\beta = (0, \beta_{xx}')'$ are $(k+1, r)$ matrices of full column rank; cf. HJNR, p. 390. Let the columns of the $(k+1, k-r+1)$ matrices $(\alpha_y^{\perp}, \alpha^{\perp})$ and $(\beta_y^{\perp}, \beta^{\perp})$, where $\alpha_y^{\perp}$, $\beta_y^{\perp}$ and $\alpha^{\perp}$, $\beta^{\perp}$ are respectively $(k+1)$-vectors and $(k+1, k-r)$ matrices, denote bases for the orthogonal complements of respectively $\alpha$ and $\beta$; in particular, $(\alpha_y^{\perp}, \alpha^{\perp})'\alpha = 0$ and $(\beta_y^{\perp}, \beta^{\perp})'\beta = 0$.
Assumption 5a. If rank($\Pi$) $= r$, the matrix $(\alpha_y^{\perp}, \alpha^{\perp})'\Gamma(\beta_y^{\perp}, \beta^{\perp})$ is full rank $k - r + 1$, $0 \le r \le k$.
Cf. Johansen (1991, Theorem 4.1, p. 1559).

Second, if the long-run multiplier matrix $\Pi$ has rank $r + 1$, then under Assumptions 1, 3 and 4, $\pi_{yy} \neq 0$ and $\Pi$ may be expressed as $\Pi = \alpha_y\beta_y' + \alpha\beta'$, where $\alpha_y = (\alpha_{yy}, 0')'$ and $\beta_y = (\beta_{yy}, \beta_{yx}')'$ are $(k+1)$-vectors, the former of which preserves Assumption 3. For this case, the columns of $\alpha^{\perp}$ and $\beta^{\perp}$ form respective bases for the orthogonal complements of $(\alpha_y, \alpha)$ and $(\beta_y, \beta)$; in particular, $\alpha^{\perp\prime}(\alpha_y, \alpha) = 0$ and $\beta^{\perp\prime}(\beta_y, \beta) = 0$.
Assumption 5b. If rank($\Pi$) $= r + 1$, the matrix $\alpha^{\perp\prime}\Gamma\beta^{\perp}$ is full rank $k - r$, $0 \le r \le k$.
Assumptions 1, 3, 4 and 5a and 5b permit the two polar cases for $\{x_t\}_{t=1}^{\infty}$. First, if $\{x_t\}_{t=1}^{\infty}$ is a purely $I(0)$ vector process, then $\Pi_{xx}$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$, are nonsingular. Second, if $\{x_t\}_{t=1}^{\infty}$ is purely $I(1)$, then $\Pi_{xx} = 0$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$ are also null matrices.
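Using (A.1) in Appendix A, it is easily seen that $\pi_{y.x}(z_t - \mu - \gamma t) = \pi_{y.x}C^*(L)\varepsilon_t$, where $\{C^*(L)\varepsilon_t\}$ is a mean zero stationary process. Therefore, under Assumptions 1, 3, 4 and 5b, that is, $\pi_{yy} \neq 0$, it immediately follows that there exists a conditional level relationship between $y_t$ and $x_t$ defined by
$$y_t = \theta_0 + \theta_1 t + \theta x_t + v_t, \quad t = 1, 2, \ldots \qquad (10)$$
where $\theta_0 \equiv \pi_{y.x}\mu/\pi_{yy}$, $\theta_1 \equiv \pi_{y.x}\gamma/\pi_{yy}$, $\theta \equiv -\pi_{yx.x}/\pi_{yy}$ and $v_t = \pi_{y.x}C^*(L)\varepsilon_t/\pi_{yy}$, also a mean zero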
stationary process. If 7tyx.. = ayyPyx + (ay - w'atx)xx ) ~ 0f', t
and xt is non-degenerate. Hence, from (10), Yt - I(0) if rank(
rank(IOyx, fP,) = r + 1. In the former case, 0 is the vector of cond
in this sense, (10) may be interpreted as a conditional long-run lev
xt, whereas, in the latter, because the processes {yt}} i and {xt} I ar
the conditional long-run level relationship between Yt and xt. T
if yryy 0 0 and ryx.x = 0', clearly, from (10), Yt is (trend) stationary or yt -

value of r. Consequently, the differenced variable Ayt depends only on its own la
in the conditional ECM (8) and not on the lagged levels xt_l of the forcing va
ryy = 0, that is, Assumption 5a holds, and 7tyx.x = (ayx - w'axvx)B'x 7 O', as rank
(0 - w)'ax,Bx which, from the above, yields yx. (xt - ,ux - yxt) = ,ry.xC
where /u = (/y, ' ) and y= (y), yx)' are partitioned conformably with zt = (
(8), Ayt depends only on the lagged level xt_l through the linear combination
lagged mutually cointegrating relations IBxt-I for the process {xt}=1. Conseq
whatever the value of r. Finally, if both rryy = 0 and ryx.x = 0', there are no lev
conditional ECM (8) with no possibility of any level relationship between Yt a
or otherwise, and, again, Yt ' 1(1) whatever the value of r.
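The mapping from the level coefficients of the conditional ECM (8) to the slope coefficients of the level relationship (10), $\theta = -\pi_{yx.x}/\pi_{yy}$, can be sketched in a few lines of Python. The function name `long_run_coefficients` and the numbers in the example are hypothetical; the sketch also assumes $\pi_{yy} \neq 0$, since otherwise (10) is not defined.

```python
def long_run_coefficients(pi_yy, pi_yx_x):
    """Return theta = -pi_yx.x / pi_yy for the level relationship (10).

    pi_yy   : scalar coefficient on y_{t-1} in the conditional ECM
    pi_yx_x : sequence of coefficients on x_{t-1} in the conditional ECM
    """
    if pi_yy == 0:
        # With pi_yy = 0 no level relationship of the form (10) exists.
        raise ValueError("pi_yy must be non-zero for (10) to be defined")
    return [-coef / pi_yy for coef in pi_yx_x]


# Hypothetical estimates: pi_yy = -0.5 and pi_yx.x = (0.25, 0.1)
theta = long_run_coefficients(-0.5, [0.25, 0.1])
```

A negative `pi_yy` together with positive level coefficients on `x` yields positive long-run slopes, which is the usual error-correcting configuration.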
Therefore, in order to test for the absence of level effects in the conditional ECM (8) and, more crucially, the absence of a level relationship between $y_t$ and $x_t$, the emphasis in this paper is on a test of the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ in (8).7,8 In contradistinction, the approach of BDM may be described in terms of (8) using Assumption 5b:
$$\Delta y_t = c_0 + c_1 t + \alpha_{yy}(\beta_{yy} y_{t-1} + \beta_{yx}' x_{t-1}) + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (11)$$
BDM test for the exclusion of $y_{t-1}$ in (11) when $r = 0$, that is, $\beta_{xx} = 0$ in (11) or $\Pi_{xx} = 0$ in (7) and, thus, $\{x_t\}$ is purely $I(1)$; cf. HJNR and PSS.9 Therefore, BDM consider the hypothesis $\alpha_{yy} = 0$ (or $\pi_{yy} = 0$).10 More generally, when $0 < r \le k$, BDM require the imposition of the untested subsidiary hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distribution of the BDM test is obtained under the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ in (8).
In the following sections of the paper, we focus on (8) and differentiate between five
interest delineated according to how the deterministic components are specified:
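* Case I (no intercepts; no trends) $c_0 = 0$ and $c_1 = 0$. That is, $\mu = 0$ and $\gamma = 0$. Hence, the ECM (8) becomes
$$\Delta y_t = \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (12)$$
* Case II (restricted intercepts; no trends) $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ and $c_1 = 0$. Here, $\gamma = 0$. The ECM is
$$\Delta y_t = \pi_{yy}(y_{t-1} - \mu_y) + \pi_{yx.x}(x_{t-1} - \mu_x) + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (13)$$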
7 This joint hypothesis may be justified by the application of Roy's union-intersection princ
in (8) given ,ryx.. Let W,,,, (yx)) be the Wald statistic for testing 7r,y = 0 for a given
maxr',,, W7y (lryx.x) is identical to the Wald test of 7r,y = 0 and ryx.x. = 0 in (8).
8 A related approach to that of this paper is Hansen's (1995) test for a unit root in a univariate t
context, would require the imposition of the subsidiary hypothesis 7ry.x = 0'.
9 The BDM test is based on earlier contributions of Kremers et al. (1992), Banerjee et al. (199
10Partitioning rxi = (Yxy,i, rxx,i), i = 1, - 1 , , conformably with zt = (yt x, x BDM
1,..., p - 1, which implies y,y = 0, where rx = (Yxy, rxx); that is, AYt does not Granger cau
* Case III (unrestricted intercepts; no trends) $c_0 \neq 0$ and $c_1 = 0$. Again, $\gamma = 0$. Now, the intercept restriction $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ is ignored and the ECM is
$$\Delta y_t = c_0 + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (14)$$
* Case IV (unrestricted intercepts; restricted trends) $c_0 \neq 0$ and $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$. The ECM is
$$\Delta y_t = c_0 + \pi_{yy}(y_{t-1} - \gamma_y t) + \pi_{yx.x}(x_{t-1} - \gamma_x t) + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (15)$$
* Case V (unrestricted intercepts; unrestricted trends) $c_0 \neq 0$ and $c_1 \neq 0$. Here, the trend restriction $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$ is ignored and the ECM is
$$\Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \omega' \Delta x_t + u_t \qquad (16)$$
It should be emphasized that the DGPs for Cases II and III are treated as identical as are those for Cases IV and V. However, as in the test for a unit root proposed by Dickey and Fuller (1979) compared with that of Dickey and Fuller (1981) for univariate models, estimation and hypothesis testing in Cases III and V proceed ignoring the constraints linking respectively the intercept and trend coefficient, $c_0$ and $c_1$, to the parameter vector $(\pi_{yy}, \pi_{yx.x})$ whereas Cases II and IV fully incorporate the restrictions in (9).
In the following exposition, we concentrate on Case IV, that is, (15), which may be specialized
to yield the remainder.
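The five cases differ only in where the intercept and trend enter the regression: inside the lagged-level term (restricted) or as free deterministic regressors (unrestricted). The following Python sketch makes that bookkeeping explicit for a single observation. The function `case_regressors` and its argument names are hypothetical and introduced only for this illustration.

```python
def case_regressors(case, y_lag, x_lag, t):
    """Split the regressors for one observation into
    (level terms subject to the exclusion test, free deterministic terms).

    case  : integer 1..5 for Cases I-V
    y_lag : y_{t-1}
    x_lag : sequence holding x_{t-1}
    t     : time index
    """
    levels = [y_lag] + list(x_lag)
    if case == 1:   # no intercept, no trend
        return levels, []
    if case == 2:   # restricted intercept: the constant joins the level term
        return levels + [1.0], []
    if case == 3:   # unrestricted intercept
        return levels, [1.0]
    if case == 4:   # unrestricted intercept, restricted trend
        return levels + [float(t)], [1.0]
    if case == 5:   # unrestricted intercept and trend
        return levels, [1.0, float(t)]
    raise ValueError("case must be an integer between 1 and 5")


levels_iv, deterministics_iv = case_regressors(4, y_lag=1.5, x_lag=[2.0], t=3)
```

In Cases II and IV the level block gains one extra column, which is why the number of tested restrictions differs between the restricted and unrestricted cases.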
3. BOUNDS TESTS FOR A LEVEL RELATIONSHIP
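In this section we develop bounds procedures for testing for the existence of a level relationship between $y_t$ and $x_t$ using (12)-(16); see (10). The main approach taken here, cf. Engle and Granger (1987) and BDM, is to test for the absence of any level relationship between $y_t$ and $x_t$ via the exclusion of the lagged level variables $y_{t-1}$ and $x_{t-1}$ in (12)-(16). Consequently, we define the constituent null hypotheses $H_0^{\pi_{yy}}: \pi_{yy} = 0$, $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$, and alternative hypotheses $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$. Hence, the joint null hypothesis of interest in (12)-(16) is given by:
$$H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}} \qquad (17)$$
and the alternative hypothesis is correspondingly stated as:
$$H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}} \qquad (18)$$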
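However, as indicated in Section 2, not only does the alternative hypothesis $H_1$ of (18) cover the case of interest in which $\pi_{yy} \neq 0$ and $\pi_{yx.x} \neq 0'$ but also permits $\pi_{yy} \neq 0$, $\pi_{yx.x} = 0'$ and $\pi_{yy} = 0$ and $\pi_{yx.x} \neq 0'$; cf. (8). That is, the possibility of degenerate level relationships between $y_t$ and $x_t$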
is admitted under H1 of (18). We comment further on these
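For ease of exposition, we consider Case IV and rewrite (15) in matrix notation as
$$\Delta y = \iota_T c_0 + Z_{-1}^{*}\pi_{y.x}^{*} + \Delta Z_{-}\psi + u \qquad (19)$$
where $\iota_T$ is a $T$-vector of ones, $\Delta y \equiv (\Delta y_1, \ldots, \Delta y_T)'$, $\Delta X \equiv (\Delta x_1, \ldots, \Delta x_T)'$, $\Delta Z_{-i} \equiv (\Delta z_{1-i}, \ldots, \Delta z_{T-i})'$, $i = 1, \ldots, p-1$, $\psi \equiv (\omega', \psi_1', \ldots, \psi_{p-1}')'$, $\Delta Z_{-} \equiv (\Delta X, \Delta Z_{-1}, \ldots, \Delta Z_{1-p})$, $Z_{-1}^{*} \equiv (\tau_T, Z_{-1})$, $\tau_T \equiv (1, \ldots, T)'$, $Z_{-1} \equiv (z_0, \ldots, z_{T-1})'$, $u \equiv (u_1, \ldots, u_T)'$ and
$$\pi_{y.x}^{*} = \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix} \begin{pmatrix} \pi_{yy} \\ \pi_{yx.x}' \end{pmatrix}$$
The least squares (LS) estimator of $\pi_{y.x}^{*}$ is given by:
$$\hat{\pi}_{y.x}^{*} \equiv (\tilde{Z}_{-1}^{*\prime}\bar{P}_{\Delta\tilde{Z}_{-}}\tilde{Z}_{-1}^{*})^{-1}\tilde{Z}_{-1}^{*\prime}\bar{P}_{\Delta\tilde{Z}_{-}}\Delta\tilde{y} \qquad (20)$$
where $\tilde{Z}_{-1}^{*} \equiv P_{\iota}Z_{-1}^{*}$, $\Delta\tilde{Z}_{-} \equiv P_{\iota}\Delta Z_{-}$, $\Delta\tilde{y} \equiv P_{\iota}\Delta y$, $P_{\iota} \equiv I_T - \iota_T(\iota_T'\iota_T)^{-1}\iota_T'$ and $\bar{P}_{\Delta\tilde{Z}_{-}} \equiv I_T - \Delta\tilde{Z}_{-}(\Delta\tilde{Z}_{-}'\Delta\tilde{Z}_{-})^{-1}\Delta\tilde{Z}_{-}'$. The Wald and the F-statistics for testing the null hypothesis $H_0$ of (17) against the alternative hypothesis $H_1$ of (18) are respectively:
$$W \equiv \hat{\pi}_{y.x}^{*\prime}\tilde{Z}_{-1}^{*\prime}\bar{P}_{\Delta\tilde{Z}_{-}}\tilde{Z}_{-1}^{*}\hat{\pi}_{y.x}^{*}/\hat{\omega}_{uu}, \qquad F \equiv \frac{W}{k+2} \qquad (21)$$
where $\hat{\omega}_{uu} \equiv (T - m)^{-1}\sum_{t=1}^{T}\tilde{u}_t^2$, $m \equiv (k+1)(p+1) + 1$ is the number of estimated coefficients and $\tilde{u}_t$, $t = 1, 2, \ldots, T$, are the least squares (LS) residuals from (19).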
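The next theorem presents the asymptotic null distribution of the Wald statistic; the limit behaviour of the F-statistic is a simple corollary and is not presented here or subsequently. Let $W_{k-r+1}(a) \equiv (W_u(a), W_{k-r}(a)')'$ denote a $(k-r+1)$-dimensional standard Brownian motion partitioned into the scalar and $(k-r)$-dimensional sub-vector independent standard Brownian motions $W_u(a)$ and $W_{k-r}(a)$, $a \in [0, 1]$. We will also require the corresponding de-meaned $(k-r+1)$-vector standard Brownian motion $\tilde{W}_{k-r+1}(a) \equiv W_{k-r+1}(a) - \int_0^1 W_{k-r+1}(a)\,da$, and de-meaned and de-trended $(k-r+1)$-vector standard Brownian motion $\hat{W}_{k-r+1}(a) \equiv \tilde{W}_{k-r+1}(a) - 12(a - \tfrac{1}{2})\int_0^1 (a - \tfrac{1}{2})\tilde{W}_{k-r+1}(a)\,da$, and their respective partitioned counterparts $\tilde{W}_{k-r+1}(a) = (\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'$, and $\hat{W}_{k-r+1}(a) = (\hat{W}_u(a), \hat{W}_{k-r}(a)')'$, $a \in [0, 1]$.
Theorem 3.1 (Limiting distribution of W) If Assumptions 1-4 and 5a hold, then under $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of (21) has the representation
$$W \Rightarrow z_r'z_r + \int_0^1 dW_u(a)F_{k-r+1}(a)' \left( \int_0^1 F_{k-r+1}(a)F_{k-r+1}(a)'\,da \right)^{-1} \int_0^1 F_{k-r+1}(a)\,dW_u(a) \qquad (22)$$
where $z_r \sim N(0, I_r)$ is distributed independently of the second term in (22) and
$$F_{k-r+1}(a) = \begin{cases} W_{k-r+1}(a) & \text{Case I} \\ (W_{k-r+1}(a)', 1)' & \text{Case II} \\ \tilde{W}_{k-r+1}(a) & \text{Case III} \\ (\tilde{W}_{k-r+1}(a)', a - \tfrac{1}{2})' & \text{Case IV} \\ \hat{W}_{k-r+1}(a) & \text{Case V} \end{cases}$$
$r = 0, \ldots, k$, and Cases I-V are defined in (12)-(16), $a \in [0, 1]$.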
The asymptotic distribution of the Wald statistic $W$ of (21) depends on the dimension and cointegration rank of the forcing variables $\{x_t\}$, $k$ and $r$ respectively. In Case IV, referring to (11), the first component in (22), $z_r'z_r \sim \chi^2(r)$, corresponds to testing for the exclusion of the $r$-dimensional stationary vector $\beta_{xx}'x_{t-1}$, that is, the hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$, whereas the second term in (22), which is a non-standard Dickey-Fuller unit-root distribution, corresponds to testing for the exclusion of the $(k-r+1)$-dimensional $I(1)$ vector $(\beta_y^{\perp}, \beta^{\perp})'z_{t-1}$ and, in Cases II and IV, the intercept and time-trend respectively or, equivalently, $\alpha_{yy} = 0$.
We specialize Theorem 3.1 to the two polar cases in which, first, the process for the forcing variables $\{x_t\}$ is purely integrated of order zero, that is, $r = k$ and $\Pi_{xx}$ is of full rank, and, second, the $\{x_t\}$ process is not mutually cointegrated, $r = 0$, and, hence, the $\{x_t\}$ process is purely integrated of order one.
Corollary 3.1 (Limiting distribution of W if $\{x_t\} \sim I(0)$). If Assumptions 1-4 and 5a hold and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of (21) has the representation
$$W \Rightarrow z_k'z_k + \frac{\left( \int_0^1 F(a)\,dW_u(a) \right)^2}{\int_0^1 F(a)^2\,da} \qquad (23)$$
where $z_k \sim N(0, I_k)$ is distributed independently of the second term in (23) and
$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ (W_u(a), 1)' & \text{Case II} \\ \tilde{W}_u(a) & \text{Case III} \\ (\tilde{W}_u(a), a - \tfrac{1}{2})' & \text{Case IV} \\ \hat{W}_u(a) & \text{Case V} \end{cases}$$
$r = 0, \ldots, k$, where Cases I-V are defined in (12)-(16), $a \in [0, 1]$.
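Corollary 3.2 (Limiting distribution of W if $\{x_t\} \sim I(1)$). If Assumptions 1-4 and 5a hold and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of (21) has the representation
$$W \Rightarrow \int_0^1 dW_u(a)F_{k+1}(a)' \left( \int_0^1 F_{k+1}(a)F_{k+1}(a)'\,da \right)^{-1} \int_0^1 F_{k+1}(a)\,dW_u(a)$$
where $F_{k+1}(a)$ is defined in Theorem 3.1 for Cases I-V, $a \in [0, 1]$.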
In practice, however, it is unlikely that one would possess a priori knowledge of the rank $r$ of $\Pi_{xx}$; that is, the cointegration rank of the forcing variables $\{x_t\}$ or, more particularly, whether $\{x_t\} \sim I(0)$ or $\{x_t\} \sim I(1)$. Long-run analysis of (12)-(16) predicated on a prior determination of the cointegration rank $r$ in (7) is prone to the possibility of a pre-test specification error; see, for example, Cavanagh et al. (1995). However, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.1 ($r = k$ and $\{x_t\} \sim I(0)$) and 3.2 ($r = 0$ and $\{x_t\} \sim I(1)$) provide lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.1 when the cointegration rank of the forcing variables $\{x_t\}$ process is $0 \le r \le k$.11 Hence, these two sets of critical values provide critical value bounds covering all possible classifications of $\{x_t\}$ into $I(0)$, $I(1)$ and mutually cointegrated processes. Asymptotic critical value bounds for the F-statistics covering Cases I-V are set out in Tables CI(i)-CI(v) for sizes 0.100, 0.050, 0.025 and 0.010; the lower bound values assume that the forcing variables $\{x_t\}$ are purely $I(0)$, and the upper bound values assume that $\{x_t\}$ are purely $I(1)$.12
Hence, we suggest a bounds procedure to test $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17) within the conditional ECMs (12)-(16). If the computed Wald or F-statistic falls outside the critical value bounds, a conclusive decision results without needing to know the cointegration rank $r$ of the $\{x_t\}$ process. If, however, the Wald or F-statistic falls within these bounds, inference would be inconclusive. In such circumstances, knowledge of the cointegration rank $r$ of the forcing variables $\{x_t\}$ is required to proceed further.
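The three-way decision rule just described can be written down directly. The Python sketch below uses a few 5% critical value bounds for Case III copied from Table CI(iii); the function name `f_bounds_decision`, the dictionary layout and the returned strings are assumptions made for this illustration.

```python
# 5% asymptotic critical value bounds for the F-statistic, Case III,
# copied from Table CI(iii): k -> (I(0) lower bound, I(1) upper bound).
F_BOUNDS_CASE3_5PCT = {
    1: (4.94, 5.73),
    2: (3.79, 4.85),
    3: (3.23, 4.35),
}


def f_bounds_decision(f_stat, k, bounds=F_BOUNDS_CASE3_5PCT):
    """Apply the bounds procedure to a computed F-statistic."""
    lower, upper = bounds[k]
    if f_stat > upper:
        # Above the I(1) bound: reject H0 whatever the cointegration rank r.
        return "reject H0"
    if f_stat < lower:
        # Below the I(0) bound: H0 is not rejected whatever r is.
        return "do not reject H0"
    # Between the bounds: the outcome depends on the unknown rank r.
    return "inconclusive"


decision = f_bounds_decision(6.0, k=1)
```

A statistic of 6.0 with one forcing variable lies above the upper bound 5.73, so the null of no level relationship is rejected without any pre-testing of the order of integration of $x_t$.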
The conditional ECMs (12)-(16), derived from the underlying VAR($p$) model (2), may also be interpreted as an autoregressive distributed lag model of orders $(p, p, \ldots, p)$ (ARDL($p, \ldots, p$)). However, one could also allow for differential lag lengths on the lagged variables $y_{t-i}$ and $x_{t-i}$ in (2) to arrive at, for example, an ARDL($p, p_1, \ldots, p_k$) model without affecting the asymptotic results derived in this section. Hence, our approach is quite general in the sense that one can use a flexible choice for the dynamic lag structure in (12)-(16) as well as allowing for short-run feedbacks from the lagged dependent variables, $\Delta y_{t-i}$, $i = 1, \ldots, p$, to $\Delta x_t$ in (7). Moreover, within the single-equation context, the above analysis is more general than the cointegration analysis of partial systems carried out by Boswijk (1992, 1995), HJNR, Johansen (1992, 1995), PSS, and Urbain (1992), where it is assumed in addition that $\Pi_{xx} = 0$ or $x_t$ is purely $I(1)$ in (7).
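To conclude this section, we reconsider the approach of BDM. There are three scenarios for the deterministics given by (12), (14) and (16). Note that the restrictions on the deterministics' coefficients (9) are ignored in Cases II of (13) and IV of (15) and, thus, Cases II and IV are now subsumed by Cases III of (14) and V of (16) respectively. As noted below (11), BDM impose but do not test the implicit hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distributional results given below are also obtained under the joint hypothesis $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17). BDM test $\alpha_{yy} = 0$ (or $H_0^{\pi_{yy}}: \pi_{yy} = 0$) via the exclusion of $y_{t-1}$ in Cases I, III and V. For example, in Case V, they consider the t-statistic
$$t_{\pi_{yy}} \equiv \frac{\hat{y}_{-1}'\bar{P}_{\Delta\hat{Z}_{-},\hat{X}_{-1}}\Delta\hat{y}}{\hat{\omega}_{uu}^{1/2}\left( \hat{y}_{-1}'\bar{P}_{\Delta\hat{Z}_{-},\hat{X}_{-1}}\hat{y}_{-1} \right)^{1/2}} \qquad (24)$$
where $\hat{\omega}_{uu}$ is defined in the line after (21), $\Delta\hat{y} \equiv P_{\iota,\tau}\Delta y$, $\hat{y}_{-1} \equiv P_{\iota,\tau}y_{-1}$, $y_{-1} \equiv (y_0, \ldots, y_{T-1})'$, $\hat{X}_{-1} \equiv P_{\iota,\tau}X_{-1}$, $X_{-1} \equiv (x_0, \ldots, x_{T-1})'$, $\Delta\hat{Z}_{-} \equiv P_{\iota,\tau}\Delta Z_{-}$, $P_{\iota,\tau} \equiv P_{\iota} - P_{\iota}\tau_T(\tau_T'P_{\iota}\tau_T)^{-1}\tau_T'P_{\iota}$, $\bar{P}_{\Delta\hat{Z}_{-},\hat{X}_{-1}} \equiv \bar{P}_{\Delta\hat{Z}_{-}} - \bar{P}_{\Delta\hat{Z}_{-}}\hat{X}_{-1}(\hat{X}_{-1}'\bar{P}_{\Delta\hat{Z}_{-}}\hat{X}_{-1})^{-1}\hat{X}_{-1}'\bar{P}_{\Delta\hat{Z}_{-}}$ and $\bar{P}_{\Delta\hat{Z}_{-}} \equiv I_T - \Delta\hat{Z}_{-}(\Delta\hat{Z}_{-}'\Delta\hat{Z}_{-})^{-1}\Delta\hat{Z}_{-}'$.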
11 The critical values of the Wald and F-statistics in the general case (not reported here) may be computed via stochastic simulations with different combinations of values for $k$ and $0 \le r \le k$.
12 The critical values for the Wald version of the bounds test are given by $k + 1$ times the critical values of the F-test in Cases I, III and V, and $k + 2$ times in Cases II and IV.
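The conversion in footnote 12 is a one-line scaling. A minimal Python sketch follows; the function name `wald_critical_value` is hypothetical, and cases are coded as integers 1 to 5 for Cases I-V.

```python
def wald_critical_value(f_critical, k, case):
    """Convert an F-statistic critical value into the corresponding
    Wald critical value, following the scaling stated in footnote 12."""
    if case in (1, 3, 5):
        multiplier = k + 1   # k + 1 level terms are excluded
    elif case in (2, 4):
        multiplier = k + 2   # the restricted intercept or trend adds one
    else:
        raise ValueError("case must be an integer between 1 and 5")
    return multiplier * f_critical


# Case III, k = 1, 5% lower bound 4.94 from Table CI(iii)
w_lower = wald_critical_value(4.94, k=1, case=3)
```

The extra term in Cases II and IV reflects the additional restricted deterministic coefficient that is tested jointly with the level coefficients.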
Table CI. Asymptotic critical value bounds for the F-statistic. Testing for the existence of a levels relationship^a
Table CI(i) Case I: No intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 3.00 3.00 4.20 4.20 5.47 5.47 7.17 7.17 1.16 1.16 2.32 2.32
1 2.44 3.28 3.15 4.11 3.88 4.92 4.81 6.02 1.08 1.54 1.08 1.73
2 2.17 3.19 2.72 3.83 3.22 4.50 3.88 5.30 1.05 1.69 0.70 1.27
3 2.01 3.10 2.45 3.63 2.87 4.16 3.42 4.84 1.04 1.77 0.52 0.99
4 1.90 3.01 2.26 3.48 2.62 3.90 3.07 4.44 1.03 1.81 0.41 0.80
5 1.81 2.93 2.14 3.34 2.44 3.71 2.82 4.21 1.02 1.84 0.34 0.67
6 1.75 2.87 2.04 3.24 2.32 3.59 2.66 4.05 1.02 1.86 0.29 0.58
7 1.70 2.83 1.97 3.18 2.22 3.49 2.54 3.91 1.02 1.88 0.26 0.51
8 1.66 2.79 1.91 3.11 2.15 3.40 2.45 3.79 1.02 1.89 0.23 0.46
9 1.63 2.75 1.86 3.05 2.08 3.33 2.34 3.68 1.02 1.90 0.20 0.41
10 1.60 2.72 1.82 2.99 2.02 3.27 2.26 3.60 1.02 1.91 0.19 0.37
Table CI(ii) Case II: Restricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 3.80 3.80 4.60 4.60 5.39 5.39 6.44 6.44 2.03 2.03 1.77 1.77
1 3.02 3.51 3.62 4.16 4.18 4.79 4.94 5.58 1.69 2.02 1.01 1.25
2 2.63 3.35 3.10 3.87 3.55 4.38 4.13 5.00 1.52 2.02 0.69 0.96
3 2.37 3.20 2.79 3.67 3.15 4.08 3.65 4.66 1.41 2.02 0.52 0.78
4 2.20 3.09 2.56 3.49 2.88 3.87 3.29 4.37 1.34 2.01 0.42 0.65
5 2.08 3.00 2.39 3.38 2.70 3.73 3.06 4.15 1.29 2.00 0.35 0.56
6 1.99 2.94 2.27 3.28 2.55 3.61 2.88 3.99 1.26 2.00 0.30 0.49
7 1.92 2.89 2.17 3.21 2.43 3.51 2.73 3.90 1.23 2.01 0.26 0.44
8 1.85 2.85 2.11 3.15 2.33 3.42 2.62 3.77 1.21 2.01 0.23 0.40
9 1.80 2.80 2.04 3.08 2.24 3.35 2.50 3.68 1.19 2.01 0.21 0.36
10 1.76 2.77 1.98 3.04 2.18 3.28 2.41 3.61 1.17 2.00 0.19 0.33
Table CI(iii) Case III: Unrestricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 6.58 6.58 8.21 8.21 9.80 9.80 11.79 11.79 3.05 3.05 7.07 7.07
1 4.04 4.78 4.94 5.73 5.77 6.68 6.84 7.84 2.03 2.52 2.28 2.89
2 3.17 4.14 3.79 4.85 4.41 5.52 5.15 6.36 1.69 2.35 1.23 1.77
3 2.72 3.77 3.23 4.35 3.69 4.89 4.29 5.61 1.51 2.26 0.82 1.27
4 2.45 3.52 2.86 4.01 3.25 4.49 3.74 5.06 1.41 2.21 0.60 0.98
5 2.26 3.35 2.62 3.79 2.96 4.18 3.41 4.68 1.34 2.17 0.48 0.79
6 2.12 3.23 2.45 3.61 2.75 3.99 3.15 4.43 1.29 2.14 0.39 0.66
7 2.03 3.13 2.32 3.50 2.60 3.84 2.96 4.26 1.26 2.13 0.33 0.58
8 1.95 3.06 2.22 3.39 2.48 3.70 2.79 4.10 1.23 2.12 0.29 0.51
9 1.88 2.99 2.14 3.30 2.37 3.60 2.65 3.97 1.21 2.10 0.25 0.45
10 1.83 2.94 2.06 3.24 2.28 3.50 2.54 3.86 1.19 2.09 0.23 0.41
Table CI(iv) Case IV: Unrestricted intercept and restricted trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 5.37 5.37 6.29 6.29 7.14 7.14 8.26 8.26 3.17 3.17 2.68 2.68
1 4.05 4.49 4.68 5.15 5.30 5.83 6.10 6.73 2.45 2.77 1.41 1.65
2 3.38 4.02 3.88 4.61 4.37 5.16 4.99 5.85 2.09 2.57 0.92 1.20
3 2.97 3.74 3.38 4.23 3.80 4.68 4.30 5.23 1.87 2.45 0.67 0.93
4 2.68 3.53 3.05 3.97 3.40 4.36 3.81 4.92 1.72 2.37 0.51 0.76
5 2.49 3.38 2.81 3.76 3.11 4.13 3.50 4.63 1.62 2.31 0.42 0.64
6 2.33 3.25 2.63 3.62 2.90 3.94 3.27 4.39 1.54 2.27 0.35 0.55
7 2.22 3.17 2.50 3.50 2.76 3.81 3.07 4.23 1.48 2.24 0.31 0.49
8 2.13 3.09 2.38 3.41 2.62 3.70 2.93 4.06 1.44 2.22 0.27 0.44
9 2.05 3.02 2.30 3.33 2.52 3.60 2.79 3.93 1.40 2.20 0.24 0.40
10 1.98 2.97 2.21 3.25 2.42 3.52 2.68 3.84 1.36 2.18 0.22 0.36
Table CI(v) Case V: Unrestricted intercept and unrestricted trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 9.81 9.81 11.64 11.64 13.36 13.36 15.73 15.73 5.33 5.33 11.35 11.35
1 5.59 6.26 6.56 7.30 7.46 8.27 8.74 9.63 3.17 3.64 3.33 3.91
2 4.19 5.06 4.87 5.85 5.49 6.59 6.34 7.52 2.44 3.09 1.70 2.23
3 3.47 4.45 4.01 5.07 4.52 5.62 5.17 6.36 2.08 2.81 1.08 1.51
4 3.03 4.06 3.47 4.57 3.89 5.07 4.40 5.72 1.86 2.64 0.77 1.14
5 2.75 3.79 3.12 4.25 3.47 4.67 3.93 5.23 1.72 2.53 0.59 0.91
6 2.53 3.59 2.87 4.00 3.19 4.38 3.60 4.90 1.62 2.45 0.48 0.75
7 2.38 3.45 2.69 3.83 2.98 4.16 3.34 4.63 1.54 2.39 0.40 0.64
8 2.26 3.34 2.55 3.68 2.82 4.02 3.15 4.43 1.48 2.35 0.34 0.56
9 2.16 3.24 2.43 3.56 2.67 3.87 2.97 4.24 1.43 2.31 0.30 0.49
10 2.07 3.16 2.33 3.46 2.56 3.76 2.84 4.10 1.40 2.28 0.26 0.44
a The critical values are computed via stochastic simulati

for testing q = 0 in the regression: Ayt = -'z-_l + a
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = 0$ Case I
$z_{t-1} = (y_{t-1}, x_{t-1}', 1)'$, $w_t = 0$ Case II
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = 1$ Case III
$z_{t-1} = (y_{t-1}, x_{t-1}', t)'$, $w_t = 1$ Case IV
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = (1, t)'$ Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = Px_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k+1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to the squares of the critical values of Dickey and Fuller's (1979) unit root t-statistics for Cases I, III and V, while they match those for Dickey and Fuller's (1981) unit root F-statistics for Cases II and IV. The columns headed 'I(0)' refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns headed 'I(1)' refer to the upper bound obtained when $x_t$ is purely $I(1)$.
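The simulation design in the table notes can be sketched in pure Python for Case III. The sketch below is illustrative only: the sample size and replication count are kept deliberately small and are not those used for Table CI, the choice $P = 0$ for the purely $I(0)$ design is an assumption, and all function names (`ols_rss`, `simulate_f_case3`, `simulated_bound`) are hypothetical. The F-statistic is computed from restricted and unrestricted residual sums of squares.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares from regressing y on the columns of X
    (X is a list of rows), solving the normal equations by elimination."""
    m = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    c = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    M = [A[i] + [c[i]] for i in range(m)]          # augmented matrix [X'X | X'y]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]            # partial pivoting
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for j in range(col, m + 1):
                M[r][j] -= f * M[col][j]
    b = [0.0] * m
    for i in reversed(range(m)):                   # back substitution
        b[i] = (M[i][m] - sum(M[i][j] * b[j] for j in range(i + 1, m))) / M[i][i]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))


def simulate_f_case3(T, k, x_is_i1, rng):
    """One draw of the Case III F-statistic for excluding (y_{t-1}, x_{t-1})."""
    y_prev, x_prev = 0.0, [0.0] * k
    X, dy = [], []
    for _ in range(T):
        e_y = rng.gauss(0.0, 1.0)
        e_x = [rng.gauss(0.0, 1.0) for _ in range(k)]
        X.append([y_prev] + x_prev + [1.0])        # levels plus intercept
        dy.append(e_y)                             # y_t = y_{t-1} + e_1t
        y_prev += e_y
        # P = I_k for I(1) regressors; P = 0 (assumed) for I(0) regressors
        x_prev = [(xp if x_is_i1 else 0.0) + e for xp, e in zip(x_prev, e_x)]
    rss_u = ols_rss(X, dy)
    rss_r = ols_rss([[1.0]] * T, dy)               # intercept only under H0
    return ((rss_r - rss_u) / (k + 1)) / (rss_u / (T - k - 2))


def simulated_bound(prob, T, k, x_is_i1, reps, seed=0):
    """Empirical quantile of the simulated F-statistics."""
    rng = random.Random(seed)
    draws = sorted(simulate_f_case3(T, k, x_is_i1, rng) for _ in range(reps))
    return draws[min(reps - 1, int(prob * reps))]
```

Calling `simulated_bound(0.95, T=200, k=1, x_is_i1=False, reps=2000)` and the same with `x_is_i1=True` gives rough finite-sample analogues of a lower and an upper bound; close agreement with the tabulated asymptotic values should not be expected at such small settings.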
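Theorem 3.2 (Limiting distribution of $t_{\pi_{yy}}$). If Assumptions 1-4 and 5a hold and $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, then under $H_0$: $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$\int_0^1 dW_u(a)F_{k-r}(a) \left( \int_0^1 F_{k-r}(a)^2\,da \right)^{-1/2} \qquad (25)$$
where
$$F_{k-r}(a) = \begin{cases} W_u(a) - \int_0^1 W_u(a)W_{k-r}(a)'\,da \left( \int_0^1 W_{k-r}(a)W_{k-r}(a)'\,da \right)^{-1} W_{k-r}(a) & \text{Case I} \\ \tilde{W}_u(a) - \int_0^1 \tilde{W}_u(a)\tilde{W}_{k-r}(a)'\,da \left( \int_0^1 \tilde{W}_{k-r}(a)\tilde{W}_{k-r}(a)'\,da \right)^{-1} \tilde{W}_{k-r}(a) & \text{Case III} \\ \hat{W}_u(a) - \int_0^1 \hat{W}_u(a)\hat{W}_{k-r}(a)'\,da \left( \int_0^1 \hat{W}_{k-r}(a)\hat{W}_{k-r}(a)'\,da \right)^{-1} \hat{W}_{k-r}(a) & \text{Case V} \end{cases}$$
$r = 0, \ldots, k$, and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.
The form of the asymptotic representation (25) is similar to that of a Dickey-Fuller test for a unit root except that the standard Brownian motion $W_u(a)$ is replaced by the residual from an asymptotic regression of $W_u(a)$ on the independent $(k-r)$-vector standard Brownian motion $W_{k-r}(a)$ (or their de-meaned and de-meaned and de-trended counterparts).
Similarly to the analysis following Theorem 3.1, we detail the limiting distribution of the t-statistic $t_{\pi_{yy}}$ in the two polar cases in which the forcing variables $\{x_t\}$ are purely integrated of order zero and one respectively.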
Corollary 3.3 (Limiting distribution of t,y if {xt) - I(

and r = k, that is, {xt} - I(0), then under Ho : 7ryy = 0 an
asymptotic distribution of the t-statistic t,,y, of (24) has the
p\ / \ \-1/2
od dWu(a)F(a)
\Jo( ) F(a2
where
where W, (a) Case
F(a) = Wu(a) Case III
I W,(a) Case V )
and Cases I, III and V are defined in (12), (14) and (16), a e [0, 1].
Corollary 3.4 (Limiting distribution of t,y: if {Xt} - 1(1)). If Assumptions 1-4

Yxy = 0, where rI = (Yxy, ,), and r = O, that is, {xt} - I(1), then under Ho
T -> oo, the asymptotic distribution of the t-statistic t,,y of (24) has the representa
r1 / /ll -1/2
j dW,(a)Fk(a) ( Fk(a)2
where Fk(a) is defined in Theorem 3.2 f
As above, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.3 ($r = k$ and $\{x_t\}$ is purely $I(0)$) and 3.4 ($r = 0$ and $\{x_t\}$ is purely $I(1)$) provide lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.2. Hence, a bounds procedure for testing $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on these two polar cases may be implemented as described above based on the t-statistic $t_{\pi_{yy}}$ for the exclusion of $y_{t-1}$ in the conditional ECMs (12), (14) and (16) without prior knowledge of the cointegrating rank $r$. These asymptotic critical value bounds are given in Tables CII(i), CII(iii) and CII(v) for Cases I, III and V for sizes 0.100, 0.050, 0.025 and 0.010.
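The same decision logic applies to the t-statistic, with the direction reversed because the critical values are negative. The Python sketch below uses 5% bounds for Case III copied from Table CII(iii) and assumes the usual left-tail rejection region of a Dickey-Fuller-type t-test; the function name `t_bounds_decision` and the returned strings are hypothetical.

```python
# 5% asymptotic critical value bounds for the t-statistic, Case III,
# copied from Table CII(iii): k -> (I(0) bound, I(1) bound).
T_BOUNDS_CASE3_5PCT = {
    1: (-2.86, -3.22),
    2: (-2.86, -3.53),
    3: (-2.86, -3.78),
}


def t_bounds_decision(t_stat, k, bounds=T_BOUNDS_CASE3_5PCT):
    """Apply the bounds procedure to the t-statistic on y_{t-1}."""
    i0_bound, i1_bound = bounds[k]
    if t_stat < i1_bound:
        # More negative than the I(1) bound: reject pi_yy = 0.
        return "reject H0"
    if t_stat > i0_bound:
        # Less negative than the I(0) bound: pi_yy = 0 is not rejected.
        return "do not reject H0"
    return "inconclusive"


decision = t_bounds_decision(-3.60, k=2)
```

Here a statistic of -3.60 with two forcing variables is below the $I(1)$ bound of -3.53, so the exclusion of $y_{t-1}$ is rejected regardless of the cointegration rank of $x_t$.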
As is emphasized in the Proof of Theorem 3.2 given in Appendix A, if the asym
for the t-statistic t,Y of (24) is conducted under HoY : yy = 0 only, the resultant
for t,: depends on the nuisance parameter w - 0 in addition to the cointegrating r
under Assumption 5a, ayx - 0'acx = 0'. Moreover, if Ayt is allowed to Granger-cau
Yxy,i a 0 for somee i = p - , then the limit distribution also is dependent o
parameter yAy/(yyy - 'Yxy); see Appendix A. Consequently, in general, where w Z
Table CII. Asymptotic critical value bounds of the t-statistic. Testing for the existence of a levels relationship^a
Table CII(i) Case I: No intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -1.62 -1.62 -1.95 -1.95 -2.24 -2.24 -2.58 -2.58 -0.42 -0.42 0.98 0.98
1 -1.62 -2.28 -1.95 -2.60 -2.24 -2.90 -2.58 -3.22 -0.42 -0.98 0.98 1.12
2 -1.62 -2.68 -1.95 -3.02 -2.24 -3.31 -2.58 -3.66 -0.42 -1.39 0.98 1.12
3 -1.62 -3.00 -1.95 -3.33 -2.24 -3.64 -2.58 -3.97 -0.42 -1.71 0.98 1.09
4 -1.62 -3.26 -1.95 -3.60 -2.24 -3.89 -2.58 -4.23 -0.42 -1.98 0.98 1.07
5 -1.62 -3.49 -1.95 -3.83 -2.24 -4.12 -2.58 -4.44 -0.42 -2.22 0.98 1.05
6 -1.62 -3.70 -1.95 -4.04 -2.24 -4.34 -2.58 -4.67 -0.42 -2.43 0.98 1.04
7 -1.62 -3.90 -1.95 -4.23 -2.24 -4.54 -2.58 -4.88 -0.42 -2.63 0.98 1.04
8 -1.62 -4.09 -1.95 -4.43 -2.24 -4.72 -2.58 -5.07 -0.42 -2.81 0.98 1.04
9 -1.62 -4.26 -1.95 -4.61 -2.24 -4.89 -2.58 -5.25 -0.42 -2.98 0.98 1.04
10 -1.62 -4.42 -1.95 -4.76 -2.24 -5.06 -2.58 -5.44 -0.42 -3.15 0.98 1.03
Table CII(iii) Case III: Unrestricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -2.57 -2.57 -2.86 -2.86 -3.13 -3.13 -3.43 -3.43 -1.53 -1.53 0.72 0.71
1 -2.57 -2.91 -2.86 -3.22 -3.13 -3.50 -3.43 -3.82 -1.53 -1.80 0.72 0.81
2 -2.57 -3.21 -2.86 -3.53 -3.13 -3.80 -3.43 -4.10 -1.53 -2.04 0.72 0.86
3 -2.57 -3.46 -2.86 -3.78 -3.13 -4.05 -3.43 -4.37 -1.53 -2.26 0.72 0.89
4 -2.57 -3.66 -2.86 -3.99 -3.13 -4.26 -3.43 -4.60 -1.53 -2.47 0.72 0.91
5 -2.57 -3.86 -2.86 -4.19 -3.13 -4.46 -3.43 -4.79 -1.53 -2.65 0.72 0.92
6 -2.57 -4.04 -2.86 -4.38 -3.13 -4.66 -3.43 -4.99 -1.53 -2.83 0.72 0.93
7 -2.57 -4.23 -2.86 -4.57 -3.13 -4.85 -3.43 -5.19 -1.53 -3.00 0.72 0.94
8 -2.57 -4.40 -2.86 -4.72 -3.13 -5.02 -3.43 -5.37 -1.53 -3.16 0.72 0.96
9 -2.57 -4.56 -2.86 -4.88 -3.13 -5.18 -3.42 -5.54 -1.53 -3.31 0.72 0.96
10 -2.57 -4.69 -2.86 -5.03 -3.13 -5.34 -3.43 -5.68 -1.53 -3.46 0.72 0.96
13 Although Corollary 3.3 does not require Yyy = 0 and HO x : TAx

of Corollary 3.4, the simulation critical value bounds result req
Table CII(v) Case V: Unrestricted intercept and unrestricted trend
0.100 0.050 0.025 0.010 Mean Variance
k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -3.13 -3.13 -3.41 -3.41 -3.65 -3.66 -3.96 -3.97 -2.18 -2.18 0.57 0.57
1 -3.13 -3.40 -3.41 -3.69 -3.65 -3.96 -3.96 -4.26 -2.18 -2.37 0.57 0.67
2 -3.13 -3.63 -3.41 -3.95 -3.65 -4.20 -3.96 -4.53 -2.18 -2.55 0.57 0.74
3 -3.13 -3.84 -3.41 -4.16 -3.65 -4.42 -3.96 -4.73 -2.18 -2.72 0.57 0.79
4 -3.13 -4.04 -3.41 -4.36 -3.65 -4.62 -3.96 -4.96 -2.18 -2.89 0.57 0.82
5 -3.13 -4.21 -3.41 -4.52 -3.65 -4.79 -3.96 -5.13 -2.18 -3.04 0.57 0.85
6 -3.13 -4.37 -3.41 -4.69 -3.65 -4.96 -3.96 -5.31 -2.18 -3.20 0.57 0.87
7 -3.13 -4.53 -3.41 -4.85 -3.65 -5.14 -3.96 -5.49 -2.18 -3.34 0.57 0.88
8 -3.13 -4.68 -3.41 -5.01 -3.65 -5.30 -3.96 -5.65 -2.18 -3.49 0.57 0.90
9 -3.13 -4.82 -3.41 -5.15 -3.65 -5.44 -3.96 -5.79 -2.18 -3.62 0.57 0.91
10 -3.13 -4.96 -3.41 -5.29 -3.65 -5.59 -3.96 -5.94 -2.18 -3.75 0.57 0.92
a The critical values are computed via stochastic simulations usin

testing 0 = 0 in the regression: Ayt = Pyt- + 8'xt- + a'wt + -t
wt = 0 Case I
wt = 1 Case III
wt =(1, t)' Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = Px_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k+1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to those of Dickey and Fuller's (1979) unit root t-statistics. The columns headed 'I(0)' refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns headed 'I(1)' refer to the upper bound obtained when $x_t$ is purely $I(1)$.
although the t-statistic t, has a well-defined limiting distribution under H ' = 0, the above
IT - ~yy - 0, the above
bounds testing procedure for Hor : r,,,,
Consequently, in the light of the co
Section 4, see Theorems 4.1, 4.2 and 4.
the existence of a level relationship betw
based on the Wald or F-statistic of (21
proceed no further; (b) if Ho is rejecte
the t-statistic t,, of (24) from Coroll
t,!! should result, at least asymptotically
Yt and xt, which, however, may be deg
4. THE ASYMPTOTIC POWER OF THE BOUNDS PROCEDURE
This section first demonstrates that the proposed bounds testing procedu
statistic of (21) described in Section 3 is consistent. Second, it derives the asym
14 In principle, the asymptotic distribution of $t_{\pi_{yy}}$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ may be simulated from t
given in the Proof of Theorem 3.2 of Appendix A after substitution of consistent estimators for 0
Ho' y = 0, where yY,x Y /y - /Xy. Although such estimators may be obtained straightfo
they necessitate the use of parameter estimators from the marginal ECM (7) for $\{x_t\}_{t=1}^{\infty}$
Copyright © 2001 John Wiley & Sons, Ltd. J. Appl. Econ. 16: 289-326 (200
of the Wald statistic of (21) under a sequence of local alternatives. Finally,

bounds procedure based on the t-statistic of (24) is consistent.
In the discussion of the consistency of the bounds test procedure based on
of (21), because the rank of the long-run multiplier matrix $\Pi$ may be either r or
alternative hypothesis $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18) where $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ and H
necessary to deal with these two possibilities. First, under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, the rank
Assumption 5b applies; in particular, $\alpha_{yy} \neq 0$. Second, under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the r
Assumption 5a applies; in this case, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ holds and, in particular,
Theorem 4.1 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yy}}$)
1-4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ of (18) the Wald statistic W (21) is con
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ in Cases I-V defined in (12)-(16).
Theorem 4.2 (Consistency of the Wald statistic bounds test procedure und
Assumptions 1-4 and 5a hold, then under $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ of (18) and H
Wald statistic W (21) is consistent against $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ in Cases I-V defi
Hence, combining Theorems 4.1 and 4.2, the bounds procedure of Section 3 base
statistic W (21) defines a consistent test of $H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}}$ of (17) against
of (18). This result holds irrespective of whether the forcing variables {xt} are p
I(1) or mutually cointegrated.
We now turn to consider the asymptotic distribution of the Wald statistic (21)
specified sequence of local alternatives. Recall that under Assumption 5b, t7rV,v[
(ayytyy, ayalfiy + (aoty -W - w/a)/5x). Consequently, we define the sequence of
H1T 7: y.xT[= (7ryy, T r.xT)] = (T-l yyaS)y, T-1 , + 1/2(

Hence, under Assumption 3, defining
niT ( ('y,T ZyT-T
and recalling $\Pi = \alpha\beta'$, where $(1, -w')\alpha = \alpha_{yx} - w'\alpha_{xx} = 0'$, we have
II - T-r= lT S'a,, + T-/2 )1v (27)

In order to detail the limit distribution of the Wald statistic under the sequence of l
tives $H_{1T}$ of (26), it is necessary to define the (k - r + 1)-dimensional Ornstein-Uhl
cess $J^*_{k-r+1}(a) = (J^*_u(a), J^*_{k-r}(a)')'$ which obeys the stochastic integral and differentia
$J^*_{k-r+1}(a) = W_{k-r+1}(a) + ab' \int_0^a J^*_{k-r+1}(r)\,dr$ and $dJ^*_{k-r+1}(a) = dW_{k-r+1}(a) + ab' J^*_{k-r+1}$
where $W_{k-r+1}(a)$ is a (k - r + 1)-dimensional standard Brownian motion, a = [(ay
a1)]-1/2(ol, oL)'a)y, b = [(a- , a')'(a', a1)]l/2[(-L l)Tr(a I, a1)]-1 I(S, 1 )ty,,
k, ), (
with the de-me
and J_,+l (a) =
(1995, Chapter
Theorem 4.3 (Limiting distribution of W under $H_{1T}$). If Assumptions 1-4 and 5a hold, then unde
$H_{1T}: \pi_{y.x} = T^{-1}\alpha_{yy}\beta_y' + T^{-1/2}(\delta_{yx} - w'\delta_{xx})\beta'$ of (26), as $T \to \infty$, the asymptotic distribution
the Wald statistic W of (21) has the representation
$$W = z_r' z_r + \int_0^1 dJ_u^*(a)\, F_{k-r+1}(a)' \left( \int_0^1 F_{k-r+1}(a) F_{k-r+1}(a)'\, da \right)^{-1} \int_0^1 F_{k-r+1}(a)\, dJ_u^*(a) \quad (28)$$
where z, N (Q1/
xx)', is distributed in
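$$F_{k-r+1}(a) = \begin{cases} J^*_{k-r+1}(a) & \text{Case I} \\ (J^*_{k-r+1}(a)', 1)' & \text{Case II} \\ \tilde{J}^*_{k-r+1}(a) & \text{Case III} \\ (\tilde{J}^*_{k-r+1}(a)', a - \tfrac{1}{2})' & \text{Case IV} \\ \hat{J}^*_{k-r+1}(a) & \text{Case V} \end{cases}$$
r = 0, ..., k, and Cases I-V are defined in (12)-(16), $a \in [0, 1]$.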
The first component of (28), $z_r' z_r$, is non-central chi-square distribut

freedom and non-centrality parameter $\eta' Q \eta$ and corresponds to the
$\pi_{yx.xT} = T^{-1/2}(\delta_{yx} - w'\delta_{xx})\beta_{xx}'$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$. The second term in
Dickey-Fuller unit-root distribution under the local alternative $\pi_{yyT} = T^{-1}\alpha_{yy}\beta_{yy}$ an
$\delta_{yx} - w'\delta_{xx} = 0'$. Note that under $H_0$ of (17), that is, $\alpha_{yy} = 0$ and $\delta_{yx} - w'\delta_{xx} = 0'$, the limiti
representation (28) reduces to (22) as should be expected.
The proof for the consistency of the bounds test procedure based on the t-statistic of (
requires that the rank of the long-run multiplier matrix $\Pi$ is r + 1 under the alternative hypothe
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$. Hence, Assumption 5b applies; in particular, $\alpha_{yy} \neq 0$.
Theorem 4.4 (Consistency of the t-statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptio
1-4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ of (18) the t-statistic $t_{\pi_{yy}}$ (24) is consistent again
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ in Cases I, III and V defined in (12), (14) and (16).
As noted at the end of Section 3, Theorem 4.4 suggests the possibility of using $t_{\pi_{yy}}$ to
discriminate between $H_0^{\pi_{yy}}: \pi_{yy} = 0$ and $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, although, if $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is fal
the bounds procedure given via Corollaries 3.3 and 3.4 is not asymptotically similar.
5. AN APPLICATION: UK EARNINGS EQUATION
Following the modelling approach described earlier, this section provides a re-examination of the
earnings equation included in the UK Treasury macroeconometric model described in Chan, Savage
and Whittaker (1995), CSW hereafter. The theoretical basis of the Treasury's earnings equation
is the bargaining model advanced in Nickell and Andrews (1983) and reviewed, for example, in
Layard et al. (1991, Chapter 2). Its theoretical derivation is based on a Nash bargaining framework
where firms and unions set wages to maximize a weighted average of firms' profits and unions'
utility. Following Darby and Wren-Lewis (1993), the theoretical real wage
the Treasury's earnings equation is given by
$$w_t = \frac{Prod_t}{1 + f(UR_t)(1 - RR_t)/Union_t} \quad (29)$$

where wt is the real wage, Prodt is labour productivity, RRt is the replacement ratio defined a
the ratio of unemployment benefit to the wage rate, Uniont is a measure of 'union power',
f(URt) is the probability of a union member becoming unemployed, which is assumed to b
increasing function of the unemployment rate URt. The econometric specification is based
log-linearized version of (29) after allowing for a wedge effect that takes account of the differe
between the 'real product wage' which is the focus of the firms' decision, and the 'real consump
wage' which concerns the union.15 The theoretical arguments for a possible long-run wedge eff
on real wages is mixed and, as emphasized by CSW, whether such long-run effects are pre
is an empirical matter. The change in the unemployment rate (ΔURt) is also included in t
Treasury's wage equation. CSW cite two different theoretical rationales for the inclusion of A U
in the wage equation: the differential moderating effects of long- and short-term unemplo
on real wages, and the 'insider-outsider' theories which argue that only rising unemploym
will be effective in significantly moderating wage demands. See Blanchard and Summers (1
and Lindbeck and Snower (1989). The ARDL model and its associated unrestricted equilibriu
correction formulation used here automatically allow for such lagged effects.
We begin our empirical analysis from the maintained assumption that the time series propert
of the key variables in the Treasury's earnings equation can be well approximated by a log-linea
VAR(p) model, augmented with appropriate deterministics such as intercepts and time tren
To ensure comparability of our results with those of the Treasury, the replacement ratio is not
included in the analysis. CSW, p. 50, report that '... it has not proved possible to identif
significant effect from the replacement ratio, and this had to be omitted from our specification'.
Also, as in CSW, we include two dummy variables to account for the effects of incomes policies
on average earnings. These dummy variables are defined by
D7475t = 1, over the period 1974q1-1975q4, 0 elsewhere
D7579t = 1, over the period 1975q1-1979q4, 0 elsewhere
off' dummy variables.17 Let zt = (wt, Prodt, URt, Wedget, Uniont)' = (wt, x')'. Then, using
analysis of Section 2, the conditional ECM of interest can be written as
$$\Delta w_t = c_0 + c_1 t + c_2 D7475_t + c_3 D7579_t + \pi_{ww} w_{t-1} + \pi_{wx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \delta' \Delta x_t + u_t \quad (30)$$
15 The wedge effect is further decomposed into a tax wedge and an import price wedge in the Treasury model, but this
decomposition is not pursued here.
16 It is important, however, that, at a future date, a fresh investigation of the possible effects of the replacement ratio on
real wages should be undertaken.
17 However, both the asymptotic theory and associated critical values must be modified if the fraction of periods in which
the dummy variables are non-zero does not tend to zero with the sample size T. In the present application, both dummy
variables included in the earning equation are zero after 1979, and the fractions of observations where D7475t and D7579t
are non-zero are only 7.6% and 19.2% respectively.
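The two incomes-policy dummies and the fractions quoted in footnote 17 can be checked with a few lines of code. The sketch below assumes the estimation sample 1972q1-1997q4 used later in this section; the helper name `quarters` is hypothetical.

```python
def quarters(start, end):
    """List of (year, quarter) pairs from start to end, inclusive."""
    (year, q), out = start, []
    while (year, q) <= end:
        out.append((year, q))
        year, q = (year, q + 1) if q < 4 else (year + 1, 1)
    return out

# Estimation sample (assumed here to be 1972q1-1997q4)
sample = quarters((1972, 1), (1997, 4))
T = len(sample)

# D7475 = 1 over 1974q1-1975q4, D7579 = 1 over 1975q1-1979q4, 0 elsewhere
d7475 = [1 if (1974, 1) <= t <= (1975, 4) else 0 for t in sample]
d7579 = [1 if (1975, 1) <= t <= (1979, 4) else 0 for t in sample]

share_7475 = sum(d7475) / T
share_7579 = sum(d7579) / T
print(T, sum(d7475), sum(d7579))
print(f"{100 * share_7475:.2f}%  {100 * share_7579:.2f}%")
```

The dummies are non-zero in 8 and 20 of the 104 quarters, that is, in roughly 7.7% and 19.2% of the sample, in line with the fractions given in footnote 17 up to rounding.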
Under the assumption that lagged real wages, $w_{t-1}$, do not enter the sub-VAR model for x
the above real wage equation is identified and can be estimated consistently by LS.18 Notic
however, that this assumption does not rule out the inclusion of lagged changes in real wages in
the unemployment or productivity equations, for example. The exclusion of the level of real wag
from these equations is an identification requirement for the bargaining theory of wages which
permits it to be distinguished from other alternatives, such as the efficiency wage theory which
postulates that labour productivity is partly determined by the level of real wages.19 It is cl
that, in our framework, the bargaining theory and the efficiency wage theory cannot be entertain
simultaneously, at least not in the long run.
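To make the LS estimation of a conditional ECM concrete, the following sketch computes the F-statistic for excluding the lagged levels in a deliberately simplified version of (30). It is an illustration under stated assumptions only: a single regressor, lag order p = 1 (so no lagged differences), simulated data, and a hypothetical function name `levels_f_stat`. The statistic would have to be compared with the critical value bounds, not with standard F tables.

```python
import numpy as np

def levels_f_stat(y, x, trend=False):
    """F-statistic for excluding y_{t-1} and x_{t-1} from a conditional ECM
    with p = 1: dy_t = c0 (+ c1 t) + pi_yy y_{t-1} + pi_yx x_{t-1} + d' dx_t + u_t.
    Hypothetical helper for illustration."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    dy, dx = np.diff(y), np.diff(x, axis=0)
    n = len(dy)
    deterministics = [np.ones((n, 1))]
    if trend:
        deterministics.append(np.arange(1.0, n + 1)[:, None])
    short_run = np.hstack(deterministics + [dx])   # kept under the null
    levels = np.hstack([y[:-1, None], x[:-1]])     # excluded under the null

    def ssr(X):
        b = np.linalg.lstsq(X, dy, rcond=None)[0]
        r = dy - X @ b
        return r @ r

    ssr_u = ssr(np.hstack([levels, short_run]))
    ssr_r = ssr(short_run)
    q = levels.shape[1]
    dof = n - q - short_run.shape[1]
    return ((ssr_r - ssr_u) / q) / (ssr_u / dof)

# Simulated example (all numbers are illustrative assumptions)
rng = np.random.default_rng(1)
T = 400
x = np.cumsum(rng.standard_normal(T))              # I(1) forcing variable
u = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    # equilibrium correction towards the level relationship y = x
    y[t] = y[t - 1] - 0.5 * (y[t - 1] - x[t - 1]) + 0.5 * (x[t] - x[t - 1]) + u[t]

f_level = levels_f_stat(y, x)                      # level relationship present
f_none = levels_f_stat(np.cumsum(u), x)            # independent random walk
print(f_level, f_none)
```

In the simulated design with a level relationship the statistic is large, whereas for the independent random walk it is typically small; in applied work the lagged differences $\Delta z_{t-i}$ would be added to the short-run block.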
The above specification is also based on the assumption that the disturbances ut are seriall
uncorrelated. It is therefore important that the lag order p of the underlying VAR is select
appropriately. There is a delicate balance between choosing p sufficiently large to mitigate t
residual serial correlation problem and, at the same time, sufficiently small so that the condition
ECM (30) is not unduly over-parameterized, particularly in view of the limited time series d
which are available.
Finally, a decision must be made concerning the time trend in (30) and whether its coefficien
should be restricted.20 This issue can only be settled in light of the particular sample period un
consideration. The time series data used are quarterly, cover the period 1970q1-1997q4, and
seasonally adjusted (where relevant).21 To ensure comparability of results for different choices
p, all estimations use the same sample period, 1972q1-1997q4 (T = 104), with the first ei
observations reserved for the construction of lagged variables.
The five variables in the earnings equation were constructed from primary sources in the
following manner: wt = ln(ERPRt/PYNONGt), Wedget = ln(1 + TEt) + ln(1 - TDt) - ln(RPIXt/PYNONGt),
URt = ln(100 × ILOUt/(ILOUt + WFEMPt)), Prodt = ln((YPROMt + 278.29 × YMFt)/(EMFt + ENMFt)),
and Uniont = ln(UDENt), where ERPRt is average private sector earnings per employee (£),
PYNONGt is the non-oil non-government GDP deflator, YPROMt is output in the private, non-oil,
non-manufacturing, and public traded sectors at constant factor cost (£ million, 1990), YMFt is
the manufacturing output index adjusted for stock changes (1990 = 100), EMFt and ENMFt are
respectively employment in UK manufacturing and non-manufacturing sectors (thousands), ILOUt
is the International Labour Office (ILO) measure of unemployment (thousands), WFEMPt is total
employment (thousands), TEt is the average employers' National Insurance contribution rate,
TDt is the average direct tax rate on employment incomes, RPIXt is the Retail Price Index
excluding mortgage payments, and UDENt is union density (used to proxy 'union power') measured
by union membership as a percentage of employment.22 The time series plots of the five variables
included in the VAR model are given in Figures 1-3.
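The variable definitions above translate directly into code. The sketch below is illustrative only: the function name `earnings_variables` and all input numbers are hypothetical, and the wedge is assumed to use the ratio of RPIX to the GDP deflator.

```python
import math

def earnings_variables(ERPR, PYNONG, TE, TD, RPIX, ILOU, WFEMP,
                       YPROM, YMF, EMF, ENMF, UDEN):
    """Construct the five log variables from the primary series for one quarter.
    Hypothetical helper; argument names follow the series names in the text."""
    return {
        "w": math.log(ERPR / PYNONG),
        # assumes the wedge uses ln(RPIX / PYNONG)
        "Wedge": math.log(1 + TE) + math.log(1 - TD) - math.log(RPIX / PYNONG),
        "UR": math.log(100 * ILOU / (ILOU + WFEMP)),
        "Prod": math.log((YPROM + 278.29 * YMF) / (EMF + ENMF)),
        "Union": math.log(UDEN),
    }

# Made-up inputs, chosen only so that the results are easy to verify by hand
v = earnings_variables(ERPR=120.0, PYNONG=100.0, TE=0.10, TD=0.20, RPIX=100.0,
                       ILOU=2000.0, WFEMP=23000.0, YPROM=70000.0, YMF=100.0,
                       EMF=4000.0, ENMF=16000.0, UDEN=0.5)
print(v)
```

With these inputs the unemployment variable equals ln(8), since 100 × 2000/25000 = 8, and the wedge equals ln(1.1) + ln(0.8) because the two price indices coincide.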
18 See Assumption 3 and the following discussion. By construction, the contemporaneous effects Axt are uncorrelated
with the disturbance term ut and instrumental variable estimation which has been particularly popular in the empirical
wage equation literature is not necessary. Indeed, given the unrestricted nature of the lag distribution of the conditional
ECM (30), it is difficult to find suitable instruments: namely, variables that are not already included in the model, which
are uncorrelated with ut and also have a reasonable degree of correlation with the included variables in (30).
19 For a discussion of the issues that surround the identification of wage equations, see Manning (1993).
20 See, for example, PSS and the discussion in Section 2.
21 We are grateful to Andrew Gurney and Rod Whittaker for providing us with the data. For further details about the
sources and the descriptions of the variables, see CSW, pp. 46-51 and p. 11 of the Annex.
22 The data series for UDEN assumes a constant rate of unionization from 1980q4 onwards.
[Figure 1 plot omitted: panels (a) and (b), each showing the series labelled Real Wages and Productivity against Quarters, 1972Q1-1997Q1.]
Figure 1. (a) Real wages and labour productivity. (b) Rate of change of real wages and labour productivity
It is clear from Figure 1 that real wages (average earnings) and productivity show steadily risin
trends with real wages growing at a faster rate than productivity.23 This suggests, at least initially,
that a linear trend should be included in the real wage equation (30). Also the application of unit
root tests to the five variables, perhaps not surprisingly, yields mixed results with strong eviden
in favour of the unit root hypothesis only in the cases of real wages and productivity. This does
not necessarily preclude the other three variables (UR, Wedge, and Union) having levels impac
on real wages. Following the methodology developed in this paper, it is possible to test for th
existence of a real wage equation involving the levels of these five variables irrespective of wheth
they are purely I(0), purely I(1), or mutually cointegrated.
23 Over the period 1972q1-97q4, real wages grew by 2.14% per annum as compared to labour productivity that increase
by an annual average rate of 1.54% over the same period.
[Figure 2 plot omitted: the series labelled UNION and WEDGE against Quarters, 1972Q1-1997Q1.]
Figure 2. The wedge and the unionization variables
[Figure 3 plot omitted: the series labelled UR against Quarters, 1972Q1-1997Q1.]
Figure 3. The unemployment rate
To determine the appropriate lag length p and whether a deterministic linear trend is required
in addition to the productivity variable, we estimated the conditional model (30) by LS, with
and without a linear time trend, for p = 1, 2,..., 7. As pointed out earlier, all regressions were
computed over the same period 1972q1-1997q4. We found that lagged changes of the productivity
variable, $\Delta Prod_{t-1}, \Delta Prod_{t-2}, \ldots$, were insignificant (either singly or jointly) in all regressions.
Therefore, for the sake of parsimony and to avoid unnecessary over-parameterization, we decided
to re-estimate the regressions without these lagged variables, but including lagged changes of
all other variables. Table I gives Akaike's and Schwarz's Bayesian Information Criteria, denoted
respectively by AIC and SBC, and Lagrange multiplier (LM) statistics for testin
of no residual serial correlation against orders 1 and 4 denoted by $\chi^2_{SC}(1)$ and $\chi^2$
As might be expected, the lag order selected by AIC, $\hat{p}_{aic} = 6$, irrespecti
deterministic trend term is included or not, is much larger than that selected by
criterion gives estimates $\hat{p}_{sbc} = 1$ if a trend is included and $\hat{p}_{sbc} = 4$ if not. The $\chi^2$
suggest using a relatively high lag order: 4 or more. In view of the importance of
of serially uncorrelated errors for the validity of the bounds tests, it seems prud
be either 5 or 6.24 Nevertheless, for completeness, in what follows we report test
and 5, as well as for our preferred choice, namely p = 6. The results in Tab
that there is little to choose between the conditional ECM with or without a linear deterministic
trend.
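The lag orders quoted above can be reproduced from the AIC and SBC columns of Table I by picking the maximizing value of p. The sketch below transcribes those columns and, as a consistency check, backs out the implied number of estimated coefficients. That check assumes the conventions AIC = LL - s and SBC = LL - (s/2) ln T, which are not spelled out in the visible part of the table notes; the function names are hypothetical.

```python
import math

# (AIC, SBC) by lag order p, transcribed from Table I
with_trend = {1: (319.33, 302.14), 2: (324.25, 301.77), 3: (321.51, 293.74),
              4: (334.37, 301.31), 5: (335.84, 297.50), 6: (337.06, 293.42),
              7: (336.96, 288.04)}
without_trend = {1: (317.51, 301.64), 2: (323.77, 302.62), 3: (320.87, 294.43),
                 4: (335.37, 303.63), 5: (336.49, 299.47), 6: (337.03, 294.72),
                 7: (336.85, 289.25)}

def select_lag(table):
    """Return the lag orders that maximize AIC and SBC respectively."""
    p_aic = max(table, key=lambda p: table[p][0])
    p_sbc = max(table, key=lambda p: table[p][1])
    return p_aic, p_sbc

def implied_parameters(aic, sbc, T=104):
    """Number of coefficients implied by AIC - SBC = s (ln(T)/2 - 1).
    Relies on the assumed definitions AIC = LL - s, SBC = LL - (s/2) ln T."""
    return (aic - sbc) / (0.5 * math.log(T) - 1.0)

choice_trend = select_lag(with_trend)
choice_no_trend = select_lag(without_trend)
s_p1 = implied_parameters(*with_trend[1])
print(choice_trend, choice_no_trend, round(s_p1, 2))
```

Under the assumed definitions, the implied coefficient count for p = 1 with a trend is very close to 13, which matches four deterministic terms, five lagged levels and four contemporaneous differences in (30).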
Table II gives the values of the F- and t-statistics for testing the existence of a level earnings
equation under three different scenarios for the deterministics, Cases III, IV and V of (14), (15)
and (16) respectively; see Sections 2 and 3 for detailed discussions.
The various statistics in Table II should be compared with the critical value bounds provided
in Tables CI and CII. First, consider the bounds F-statistic. As argued in PSS, the statistic $F_{IV}$,
which sets the trend coefficient to zero under the null hypothesis of no level relationship, Case
IV of (15), is more appropriate than $F_V$, Case V of (16), which ignores this constraint. Note that,
if the trend coefficient $c_1$ is not subject to this restriction, (30) implies a quadratic trend in the
level of real wages under the null hypothesis of $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$, which is empirically
implausible. The critical value bounds for the statistics $F_{IV}$ and $F_V$ are given in Tables CI(iv) and
CI(v). Since k = 4, the 0.05 critical value bounds are (3.05, 3.97) and (3.47, 4.57) for $F_{IV}$ and
$F_V$, respectively.25 The test outcome depends on the choice of the lag order p. For p = 4, the
Table I. Statistics for selecting the lag order of the earnings equation

        With deterministic trends                        Without deterministic trends
p    AIC      SBC      $\chi^2_{SC}(1)$  $\chi^2_{SC}(4)$    AIC      SBC      $\chi^2_{SC}(1)$  $\chi^2_{SC}(4)$
1    319.33   302.14   16.86*            35.89*              317.51   301.64   18.38*            34.88*
2    324.25   301.77    2.16             19.71*              323.77   302.62    1.98             21.52*
3    321.51   293.74    0.52             17.07*              320.87   294.43    1.56             19.35*
4    334.37   301.31    3.48***           7.79***            335.37   303.63    3.41***           7.13
5    335.84   297.50    0.03              2.50               336.49   299.47    0.03              2.15
6    337.06   293.42    0.85              3.58               337.03   294.72    0.99              3.99
7    336.96   288.04    0.17              2.20               336.85   289.25    0.09              0.64
Notes: p is the lag order of the underlying

coefficients of lagged changes in the produ
Akaike's and Schwarz's Bayesian Information C
value of the model, sp is the number of freel
statistics for testing no residual serial correlat
at 0.01, 0.05 and 0.10 levels, respectively.
24 In the Treasury model, different lag orders a

to the log of the price deflator and the wedge
model is 1971q1-1994q3.
25 Following a suggestion from one of the ref
T = 104. For k = 4, the 5% critical value boun
(3.61,4.76), respectively, which are only margin
Table II. F- and t-statistics for testing the existence of a levels earnings equation

     With deterministic trends          Without deterministic trends
p    $F_{IV}$   $F_V$    $t_V$          $F_{III}$   $t_{III}$
4    2.99a      2.34a    -2.26a         3.63b       -3.02b
5    4.42c      3.96b    -2.83a         5.23c       -4.00c
6    4.78c      3.59b    -2.44a         5.42c       -3.48b
Notes: See the notes to Table I. $F_{IV}$ is the F-stati

$\pi_{ww} = 0$, $\pi_{wx.x} = 0'$ and $c_1 = 0$ in (30). $F_V$ is th
testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30). $F_{III}$ is th
testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30) with $c_1$ s
and $t_{III}$ are the t-ratios for testing $\pi_{ww} = 0$ in (30)
a deterministic linear trend. a indicates that the statistic lies below
the 0.05 lower bound, b that it falls within the 0.05 bounds, and c
that it lies above the 0.05 upper bound.
hypothesis that there exists no level earnings equation is not rejected at the 0.05 level, irrespective
of whether the regressors are purely I(0), purely I(1) or mutually cointegrated. For p = 5, the
bounds test is inconclusive. For p = 6 (selected by AIC), the statistic $F_V$ is still inconclusive, but
$F_{IV} = 4.78$ lies outside the 0.05 critical value bounds and rejects the null hypothesis that there
exists no level earnings equation, irrespective of whether the regressors are purely I(0), purely
I(1) or mutually cointegrated.26 This finding is even more conclusive when the bounds F-test is
applied to the earnings equations without a linear trend. The relevant test statistic is $F_{III}$ and the
associated 0.05 critical value bounds are (2.86, 4.01).27 For p = 4, $F_{III} = 3.63$, and the test result
is inconclusive. However, for p = 5 and 6, the values of $F_{III}$ are 5.23 and 5.42 respectively and
the hypothesis of no levels earnings equation is conclusively rejected.
The results from the application of the bounds t-test to the earnings equations are less clear-cut
and do not allow the imposition of the trend restrictions discussed above. The 0.05 critical value
bounds for $t_{III}$ and $t_V$, when k = 4, are (-2.86, -3.99) and (-3.41, -4.36).28 Therefore, if a
linear trend is included, the bounds t-test does not reject the null even if p = 5 or 6. However,
when the trend term is excluded, the null is rejected for p = 5. Overall, these test results support
the existence of a levels earnings equation when a sufficiently high lag order is selected and
when the statistically insignificant deterministic trend term is excluded from the conditional ECM
(30). Such a specification is in accord with the evidence on the performance of the alternative
conditional ECMs set out in Table I.
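The bounds decision rule applied in the last two paragraphs is mechanical and can be written down compactly. The sketch below uses the statistics of Table II and the 0.05 critical value bounds for k = 4 quoted in the text, and returns the same markers as the table: 'a' (below the lower bound, null not rejected), 'b' (within the bounds, inconclusive) and 'c' (beyond the upper bound, null rejected). The function names are hypothetical.

```python
def classify_f(stat, lower, upper):
    """Bounds F-test: large values are evidence against the null."""
    if stat < lower:
        return "a"
    if stat > upper:
        return "c"
    return "b"

def classify_t(stat, lower, upper):
    """Bounds t-test: large negative values are evidence against the null.
    lower is the I(0) bound, upper the (more negative) I(1) bound."""
    if stat > lower:
        return "a"
    if stat < upper:
        return "c"
    return "b"

# 0.05 critical value bounds for k = 4 and statistics for p = 4, 5, 6
f_tests = {"F_IV": ((3.05, 3.97), [2.99, 4.42, 4.78]),
           "F_V": ((3.47, 4.57), [2.34, 3.96, 3.59]),
           "F_III": ((2.86, 4.01), [3.63, 5.23, 5.42])}
t_tests = {"t_V": ((-3.41, -4.36), [-2.26, -2.83, -2.44]),
           "t_III": ((-2.86, -3.99), [-3.02, -4.00, -3.48])}

outcome = {name: [classify_f(s, *b) for s in stats]
           for name, (b, stats) in f_tests.items()}
outcome.update({name: [classify_t(s, *b) for s in stats]
                for name, (b, stats) in t_tests.items()})
print(outcome)
```

The labels produced this way coincide with the superscripts attached to the entries of Table II.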
In testing the null hypothesis that there are no level effects in (30), namely ($\pi_{ww} = 0$, $\pi_{wx.x}$
it is important that the coefficients of lagged changes remain unrestricted, otherwise these
could be subject to a pre-testing problem. However, for the subsequent estimation of levels e
and short-run dynamics of real wage adjustments, the use of a more parsimonious specificat
seems advisable. To this end we adopt the ARDL approach to the estimation of the level relat
26 The same conclusion is also reached for p = 7.

27 See Table CI(iii).
28 See Tables CII(iii) and CII(v).
discussed in Pesaran and Shin (1999).29 First, the (estimated) orders of an ARDL(p
model in the five variables (wt, Prodt, URt, Wedget, Uniont) were selected b
the $7^5 = 16{,}807$ ARDL models, spanned by p = 0, 1, ..., 6, and $p_i$ = 0, 1
using the AIC criterion.30 This resulted in the choice of an ARDL(6, 0, 5, 4, 5) sp
estimates of the levels relationship given by
$$w_t = \underset{(0.050)}{1.063}\, Prod_t - \underset{(0.034)}{0.105}\, UR_t - \underset{(0.265)}{0.943}\, Wedge_t + \underset{(0.311)}{1.481}\, Union_t + \underset{(0.242)}{2.701} + v_t \quad (31)$$
where vt is the equilibrium correction term, and the sta

All levels estimates are highly significant and have the e
productivity and the wedge variables are insignificantly dif
earnings equation, the levels coefficient of the productivity
above estimates can be viewed as providing empirical sup
levels estimates of the effects of the unemployment rate
namely -0.105 and 1.481, are also in line with the Treas
The main difference between the two sets of estimates concerns the levels coefficient of the
wedge variable. We obtain a much larger estimate, almost twice that obtained by the T
Setting the levels coefficients of the Prodt and Wedget variables to unity provides the alte
interpretation that the share of wages (net of taxes and computed using RPIX rather
implicit GDP deflator) has varied negatively with the rate of unemployment and positively
union strength.32
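The statements about significance and about unit coefficients can be illustrated with simple ratios computed from (31). The sketch below assumes that the figures in parentheses are standard errors and that the ratios can be compared with standard normal critical values in large samples; the helper name `t_ratio` is hypothetical.

```python
# (estimate, standard error) taken from the levels relationship (31)
estimates = {"Prod": (1.063, 0.050), "UR": (-0.105, 0.034),
             "Wedge": (-0.943, 0.265), "Union": (1.481, 0.311),
             "Intercept": (2.701, 0.242)}

def t_ratio(coef, se, null=0.0):
    """Ratio of (estimate - hypothesized value) to its standard error."""
    return (coef - null) / se

# Significance of each levels coefficient (null value of zero)
t_zero = {name: t_ratio(c, s) for name, (c, s) in estimates.items()}

# Unit-coefficient restrictions: +1 on productivity, -1 on the wedge
t_prod_unit = t_ratio(*estimates["Prod"], null=1.0)
t_wedge_unit = t_ratio(*estimates["Wedge"], null=-1.0)

print({k: round(val, 2) for k, val in t_zero.items()})
print(round(t_prod_unit, 2), round(t_wedge_unit, 2))
```

All five ratios against zero exceed 3 in absolute value, while the ratios for the unit restrictions are about 1.26 and 0.22, so neither restriction would be rejected at conventional levels under the stated assumptions.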
The conditional ECM regression associated with the above level relationship is giv
Table III.33 These estimates provide further direct evidence on the complicated dynamics t
to exist between real wage movements and their main determinants.34 All five lagged chan
real wages are statistically significant, further justifying the choice of p = 6. The equ
correction coefficient is estimated as -0.229 (0.0586) which is reasonably large and h
significant.35 The auxiliary equation of the autoregressive part of the estimated conditiona
has real roots 0.9231 and -0.9095 and two pairs of complex roots with moduli 0.7589 an
which suggests an initially cyclical real wage process that slowly converges towards the equ
described by (31).36 The regression fits reasonably well and passes the diagnostic tests agai
normal errors and heteroscedasticity. However, it fails the functional form misspecificatio
29 Note that the ARDL approach advanced in Pesaran and Shin (1999) is applicable irrespective of whether the r
are purely I(0), purely I(1) or mutually cointegrated.
30 For further details, see Section 18.19 and Lesson 16.5 in Pesaran and Pesaran (1997).
31 CSW do not report standard errors for the levels estimates of the Treasury earnings equation.
32 We are grateful to a referee for drawing our attention to this point.
33 Clearly, it is possible to simplify the model further, but this would go beyond the remit of this section which
test for the existence of a level relationship using an unrestricted ARDL specification and, second, if we are sat
such a levels relationship exists, to select a parsimonious specification.
34 The standard errors of the estimates reported in Table III allow for the uncertainty associated with the estimati
levels coefficients. This is important in the present application where it is not known with certainty whether the
are purely I(0), purely I(1) or mutually cointegrated. It is only in the case when it is known for certain that al
are I(1) that it would be reasonable in large samples to treat these estimates as known because of their super-c
35 The equilibrium correction coefficient in the Treasury's earnings equation is estimated to be -0.1848 (0.052
is smaller than our estimate; see p. 11 in Annex of CSW. This seems to be because of the shorter lag lengths
Treasury's specification rather than the shorter time period 1971q1-1994q3. Note also that the t-ratio reporte
coefficient does not have the standard t-distribution; see Theorem 3.2.
36 The complex roots are $0.34293 \pm 0.67703i$ and $-0.17307 \pm 0.61386i$, where $i = \sqrt{-1}$.
the 0.05 level which may be linked to the presence of some non-linear effects or asymmetries in
the adjustment of the real wage process that our linear specification is incapable of taking into
account.37 Recursive estimation of the conditional ECM and the associated cumulative sum and
cumulative sum of squares plots also suggest that the regression coefficients are generally stable
over the sample period. However, these tests are known to have low power and, thus, may have
missed important breaks. Overall, the conditional ECM earnings equation presented in Table III
has a number of desirable features and provides a sound basis for further research.
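The reported roots of the autoregressive part can be cross-checked against the equilibrium correction coefficient. In an ARDL model with autoregressive polynomial $\phi(L)$, the equilibrium correction coefficient equals $-\phi(1)$, and $\phi(1)$ is the product of one minus each root of the auxiliary equation. The sketch below assumes that the complex roots in footnote 36 come in conjugate pairs and uses the rounded values as printed.

```python
import numpy as np

# Roots of the auxiliary equation as reported in the text and footnote 36
# (conjugate pairs are assumed for the complex roots)
roots = np.array([0.9231, -0.9095,
                  0.34293 + 0.67703j, 0.34293 - 0.67703j,
                  -0.17307 + 0.61386j, -0.17307 - 0.61386j])

moduli = np.abs(roots)

# Coefficients of lambda^6 - phi_1 lambda^5 - ... - phi_6, highest power first
poly = np.real(np.poly(roots))
phi = -poly[1:]                      # implied AR coefficients phi_1, ..., phi_6

# Evaluating the polynomial at 1 gives phi(1) = 1 - sum(phi) = prod(1 - root)
phi_at_one = float(poly.sum())

print(np.round(moduli, 4))
print(round(phi_at_one, 4))
```

All moduli are below one, so the process is stable, and the implied value of $\phi(1)$ is approximately 0.229, which agrees with the estimated equilibrium correction coefficient of -0.229 up to rounding of the reported roots.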
Table III. Equilibrium correction form of the ARDL(6, 0, 5, 4, 5) earnings equation

Regressor                 Coefficient   Standard error   p-value
$v_{t-1}$                   -0.229        0.0586           N/A
$\Delta w_{t-1}$            -0.418        0.0974           0.000
$\Delta w_{t-2}$            -0.328        0.1089           0.004
$\Delta w_{t-3}$            -0.523        0.1043           0.000
$\Delta w_{t-4}$            -0.133        0.0892           0.140
$\Delta w_{t-5}$            -0.197        0.0807           0.017
$\Delta Prod_t$              0.315        0.0954           0.001
$\Delta UR_t$                0.003        0.0083           0.683
$\Delta UR_{t-1}$            0.016        0.0119           0.196
$\Delta UR_{t-2}$            0.003        0.0118           0.797
$\Delta UR_{t-3}$            0.028        0.0113           0.014
$\Delta UR_{t-4}$            0.027        0.0122           0.031
$\Delta Wedge_t$            -0.297        0.0534           0.000
$\Delta Wedge_{t-1}$        -0.048        0.0592           0.417
$\Delta Wedge_{t-2}$        -0.093        0.0569           0.105
$\Delta Wedge_{t-3}$        -0.188        0.0560           0.001
$\Delta Union_t$            -0.969        0.8169           0.239
$\Delta Union_{t-1}$        -2.915        0.8395           0.001
$\Delta Union_{t-2}$        -0.021        0.9023           0.981
$\Delta Union_{t-3}$        -0.101        0.7805           0.897
$\Delta Union_{t-4}$        -1.995        0.7135           0.007
Intercept                    0.619        0.1554           0.000
$D7475_t$                    0.029        0.0063           0.000
$D7579_t$                    0.017        0.0063           0.009

$\bar{R}^2 = 0.5589$, $\hat{\sigma} = 0.0083$, AIC = 339.57, S
$\chi^2_{SC}(4) = 8.74\,[0.068]$, $\chi^2_{FF}(1) = 4.86\,[0.02
$\chi^2_N(2) = 0.01\,[0.993]$, $\chi^2_H(1) = 0.66\,[0.415]$
Notes: The regression is based on the con

using an ARDL(6, 0, 5, 4, 5) specification w
estimated over 1972q1-1997q4, and the eq
$v_{t-1}$ is given in (31). $\bar{R}^2$ is the adjusted s
coefficient, $\hat{\sigma}$ is the standard error of the
Akaike's and Schwarz's Bayesian Informat
$\chi^2_N(2)$, and $\chi^2_H(1)$ denote chi-squared stati
serial correlation, no functional form mis-sp
homoscedasticity respectively with p-valu
these diagnostic tests see Pesaran and Pes
37 The conditional ECM regression in Table III also

was specified to deal with this problem, it should not
6. CONCLUSIONS
Empirical analysis of level relationships has been an integral part of time

and pre-dates the recent literature on unit roots and cointegration.38 However, t
earlier literature was on the estimation of level relationships rather than testing
otherwise). Cointegration analysis attempts to fill this vacuum, but, typically,
restrictive assumption that the regressors, xt, entering the determination of the
interest, yt, are all integrated of order 1 or more. This paper demonstrates that t
for the existence of a level relationship between yt and xt is non-standard even
under consideration are I(0) because, under the null hypothesis of no level relat
and xt, the process describing the yt process is I(1), irrespective of whether th
purely I(0), purely I(1) or mutually cointegrated. The asymptotic theory dev
provides a simple univariate framework for testing the existence of a sing
between yt and xt when it is not known with certainty whether the regressor
purely I(1) or mutually cointegrated.39 Moreover, it is unnecessary that the or
of the underlying regressors be ascertained prior to testing the existence o
between yt and xt. Therefore, unlike typical applications of cointegration analy
not subject to this particular kind of pre-testing problem. The application of t
testing procedure to the UK earnings equation highlights this point, where one
priori position as to whether, for example, the rate of unemployment or the u
are I(1) or I(0).
The analysis of this paper is based on a single-equation approach. Consequentl
ate in situations where there may be more than one level relationship involving
this paper and those of HJNR and PSS to deal with such cases is part of our cu
the consequent theoretical developments will require the computation of furth
values.
APPENDIX A: PROOFS FOR SECTION 3
We confine the main proof of Theorem 3.1 to that for Case IV and briefly detail t
necessary for the other cases. Under Assumptions 1-4 and 5a, the process {zt}l
moving-average representation,
$$z_t = \mu + \gamma t + C s_t + C^*(L)\varepsilon_t \quad (A1)$$
where the partial sum st = E1, 4(z)C(z) = C(z)4

4iz', C(z) Ik+i + E Cizl = C + (1 - z)C*(z), t = 1, 2...; see
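Note that $C = (\beta_y^{\perp}, \beta^{\perp})[(\alpha_y^{\perp}, \alpha^{\perp})'\Gamma(\beta_y^{\perp}, \beta^{\perp})]^{-1}(\alpha_y^{\perp}, \alpha^{\perp})'$; see Johansen (1991, (4.5), p. 1559).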
Define the (k + 2, r) and (k + 2, k - r + 1) matrices ,* and 8 by
p =f-IB + ) ( P and 8 I+ ) (l, P)
38 For an excellent review of this early literature, see Hendry et al. (1984).
39 Of course, the system approach developed by Johansen (1991, 1995) can also be applied to a set of variables containin
possibly a mixture of 1(0) and 1(1) regressors.
where (gB, fBI) is a (k + 1, k- r + 1) matrix whose columns are a basis for the orthogonal
complement of f. Hence, (f, , B) is a basis for Zk+l. Let 4 be the (k + 2)-unit vector (1, 0')'
Then, (P,,, , 8) is a basis for Rk+2. It therefore follows that
T-1/28/Z* T/2Z/2(L ( )/, = T-1/2(p l, _ _)/'CS[Ta] + (f2, )' T-1/2C*(L)E[Ta]
= (P', PI)/CBk+l (a)
where zt* = (t, z),) Bk+l (a) is a (k + 1)-vector Brownian motion with variance matrix Q and [T
denotes the integer part of Ta, a e [0, 1]; see Phillips and Solo (1992, Theorem 3.15, p. 983). Also,
T-l'zt* = T-lt = a. Similarly, noting that B'C = 0, we have that ft*,zt = Pt'/ + ItC*(L)Et =
Op(l). Hence, from Phillips and Solo (1992, Theorem 3.16, p. 983), defining V Z- P,Z* and
AZ_ - P,AZ_, it follows that
T-l ' = Z1
Op(l),fT-lZ*AZ
, Z_ = Op(l), T-1'Z_AZ = Op(l)
T-lBZ*-1 -,B* = Op(1), T-BTZ*lAZ_ = Op(l) (A2)
where BT (8, T-1/2k). Similarly, defining ui P,u,
T- /2p' Z*' = Op(l), T-1/2'Z i = Op(l) (A3)
Cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3
The next result follows from Phillips and Solo (1992, Theorem 3.15, p
(1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146) and P
(1986).
Lemma A.1 Let BT (8, T-1/25) and define G(a) = (G (a)', G2(a))', whe
CBk+ (a), Bk+1(a)[= (B1(a)', Bk(a)')'] = Bk+1(a) - f0 Bk+1(a)da, and G2
Then
T- 2B'* Z* 1Br =T J G(a)G(a)'da, T-'1BTZ* G(a)dB*(a

Jo Jo
where B*(a) - 1
Proof of Theore
(tuW ~-uW
= - -1 up _ -1 1P_ PT_fi = U
= i'P
Z Z1A
A T (AZ'
TZ_P-1
Z*TA)
T A Z*'1P^Z
- _
where AT = T-1/2 (f, T-1/2BT). Consider

and Lemma A. 1 that
ATZP -1 n 1 T 0 -2B T' -,.T.

B_ -_
1_iT()
Next, consider A,,Z*' P_ ii. From (A3) and Lemma A. 1,
A' (T-1/2fl/
P fi u+'*'pl--i (A5)
- \ T lBZ* + op() (A5)
Finally, the estimatorfor the error variance cou, (defined in the line afte
~/ -_ i'P
wuu = (T - m)-1 [u'f -- *Z1AT(AZP^
A-* --Z* -1ZA/,~.*/ - ]
-1iA) -AZP^
= (T - m)-ii + op(1) = ( o+ + op(1) (A6)
From (A4)-(A6) and Lemma A.I,
W = T-1i'P- Z*-1fi (T 1Z 1P z*
+ T P~-_ _T
-2UZi'B BT
[rT
-1 T--I'Br
BZ.Z.
T ~ --z B ' B(A7)
BZ*UiB _i/)oIu +
We consider each of the te
to state
(T-l/z2*p
T - 1-PT z*
-_ fi1)
- 1/*-1/2
T - _T /2 ii/o1/
-PT-N ,_ 2 = : Zr N(O, I,)
Hence, the first term in (A7) converges in distribution to $\mathbf{z}_r'\mathbf{z}_r$, a chi-square random variable with $r$ degrees of freedom; that is,
T- 1 'P Z* (1 i(T- Z 1Z iP Z* ) Z IPAz_P U/C),,, = Z .Z, - X2(r) (A8

From Lemma A.], the second term in (A7) weakly converges to
dB*(a)G(a)' ( G(a)G(a)'dr) Gk+l(a)dB,*(a)/w,,,
which, as C = (g, jBl)[(a, a' )'(P, )]-1 (, a')', may be expressed as
dB*
a-U( )da)(a) 1 1
2O1a a 1 a-
\Jo a 2/
X,/ (( 1ay B al
a I
Now, noting that under Ho of (17) we may express a^ =

ax'a = O, we define the (k - r + 1)-vector of independen
motions,
[(~et e )~-1/2 2_ ]Rk+l(a)

Wk_r++l(a)[= (W a(a), Wk_r(a)')] [(a
-( Ct)o-1/2B1, (a)
where B*(a) = B1(a) - w'Bk(a) is independent of Bk(a) and Bk+1(a) = (B1(a), Bk(a)')' is
titioned according to z, = (Yt, x)', a e [0, 1]. Hence, the second term in (A7) has the follo
asymptotic representation:
dW,(a 1 (a I 1 da
jd lW
aJ-a- (a) (Wk
- a- - ± (
x /l ( + (a-2 ) dl,(a) (A9)
Note that dWu(a) in (A9) may be replaced by dW, (a), a e [0, 1].
the result of Theorem 3.1.
For the remaining cases, we need only make minor modificati
In Case I, 8 = (fyir, 'il) with (P, fyi, ) a basis for Rk+1 and
-1 = (tr, Z 1)', we have
Ik+
and, consequently, we define ~ as in Case IV
8= Ik+l (fyl(, B) and B

Case III is similar to Case I as is Case V.
Proof of Corollary 3.1 Follows immediately from Theorem 3.1 by setting $r = k$.
Proof of Corollary 3.2 Follows immediately from Theorem 3.1 by setting $r = 0$.
Proof of Theorem 3.2 We provide a prooffor Case V which may be simply adapted for
and III. To emphasize the potential dependence of the limit distribution on nuisance paramete
the proof is initially conducted under Assumptions 1-4 together with Assumption 5a which i
Ho'' tyy = 0 but not necessarily HO ' =rV.xX 0'; in particular, note that we may write a
(1, -o')' for some k-vector 0. The t-statistic for Ho'"! ' y = 0 may be expressed as the
root of
Ay P _, Z-_A1AT ATZ/P- Z_iA -A ^Z/'P Ay/8; (A 10)

where AT T-1/2(, T-1/2BT) and BT = (yL, pl). Note that only the diagonal element of the
inverse in (A10) corresponding to yL is relevant, which implies that we only need to consider
the blocks T2B'ZP 1P Z1BT and T-1B'Z' P Ay in (A10). Therefore, using (A2) and
(A3), (A10) is asymptotically equivalent to
Tr-lP_iZiBr
T--UPlB( 2BT T (T ZI ZiBr)1
--B'' T -TZ
21B B T) T-1BT ^P i/r- (All)
' A0111
where Px : '- ITX- 1I - x xx -1 x )X-fiX) iX' Now,

1 I_ a I._I I I
T fi- X X[Ta] (, flX fiL)[(ayi a.Lr r , k._ 1 (a)
= rxg
o- (Ix PAx)[a , (r. -Wi
© [2x0 Jhxx y,xy
- )fia
yS..x] pxx
a -1k B
(2)(a)
where, for convenience, but without loss of generality, we have set y = (p1
(0, y), ) y = Yxy /Yy.x, Yy. Yyy - Yx y y. Yyx - 0 x and (a) B (a)
B,(a) = B1(a) - 0/Bk(a), a e [0, 1]. Hence, (All) weakly converges to
,{ B
- /
x axx u \UIUk xx
x al
Under
012W1 (a) and ax'B(a)[= a Bk (a)] (a), a e [0, 1].
Proof of Corollary 3.3 Follows immediately from Theorem 3.2 by setting $r = k$.
Proof of Corollary 3.4 Follows immediately from Theorem 3.2 by setting $r = 0$.
APPENDIX B: PROOFS FOR SECTION 4
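Proof of Theorem 4.1 Again, we consider Case IV; the remaining Cases I-III and V may be dealt with similarly. Under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, Assumption 5b holds and, thus, $\boldsymbol{\Pi} = \boldsymbol{\alpha}_y\boldsymbol{\beta}_y' + \boldsymbol{\alpha}\boldsymbol{\beta}'$ where $\boldsymbol{\alpha}_y = (\alpha_{yy}, \mathbf{0}')'$ and $\boldsymbol{\beta}_y = (\beta_{yy}, \boldsymbol{\beta}_{yx}')'$; see above Assumption 5b. Under Assumptions 1-4 and 5b, the process $\{\mathbf{z}_t\}_{t=1}^{\infty}$ has the infinite moving-average representation, $\mathbf{z}_t = \boldsymbol{\mu} + \boldsymbol{\gamma}t + \mathbf{C}\mathbf{s}_t + \mathbf{C}^*(L)\boldsymbol{\varepsilon}_t$, where now $\mathbf{C} \equiv \boldsymbol{\beta}^{\perp}[\boldsymbol{\alpha}^{\perp\prime}\boldsymbol{\Gamma}\boldsymbol{\beta}^{\perp}]^{-1}\boldsymbol{\alpha}^{\perp\prime}$. We redefine $\boldsymbol{\beta}_*$ and $\boldsymbol{\delta}$ as the $(k+2, r+1)$ and $(k+2, k-r)$ matrices,
\[ \boldsymbol{\beta}_* \equiv \begin{pmatrix} -\boldsymbol{\gamma}' \\ \mathbf{I}_{k+1} \end{pmatrix} (\boldsymbol{\beta}_y, \boldsymbol{\beta}) \quad \text{and} \quad \boldsymbol{\delta} \equiv \begin{pmatrix} -\boldsymbol{\gamma}' \\ \mathbf{I}_{k+1} \end{pmatrix} \boldsymbol{\beta}^{\perp} \]
where $\boldsymbol{\beta}^{\perp}$ is a $(k+1, k-r)$ matrix whose columns are a basis for the orthogonal complement of $(\boldsymbol{\beta}_y, \boldsymbol{\beta})$. Hence, $(\boldsymbol{\beta}_y, \boldsymbol{\beta}, \boldsymbol{\beta}^{\perp})$ is a basis for $\mathbb{R}^{k+1}$ and, thus, $(\boldsymbol{\beta}_*, \boldsymbol{\xi}, \boldsymbol{\delta})$ a basis for $\mathbb{R}^{k+2}$, where again $\boldsymbol{\xi}$ is the $(k+2)$-unit vector $(1, \mathbf{0}')'$. It therefore follows that
\[ T^{-1/2}\boldsymbol{\delta}'\mathbf{z}^*_{[Ta]} = T^{-1/2}\boldsymbol{\beta}^{\perp\prime}\boldsymbol{\mu} + \boldsymbol{\beta}^{\perp\prime}\mathbf{C}\,T^{-1/2}\mathbf{s}_{[Ta]} + \boldsymbol{\beta}^{\perp\prime}\,T^{-1/2}\mathbf{C}^*(L)\boldsymbol{\varepsilon}_{[Ta]} \Rightarrow \boldsymbol{\beta}^{\perp\prime}\mathbf{C}\mathbf{B}_{k+1}(a) \]
Also, as above, $T^{-1}\boldsymbol{\xi}'\mathbf{z}^*_{[Ta]} = T^{-1}[Ta] \Rightarrow a$ and $\boldsymbol{\beta}_*'\mathbf{z}_t^* = (\boldsymbol{\beta}_y, \boldsymbol{\beta})'\boldsymbol{\mu} + (\boldsymbol{\beta}_y, \boldsymbol{\beta})'\mathbf{C}^*(L)\boldsymbol{\varepsilon}_t = O_P(1)$.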
The Wald statistic (21) multiplied by eui may be written as
(B1)
(B 1)
where , (, a)'(1, -w')', AT - T-1/2(, T-1/2B) and Br - (, T-1/2k). Note that (A6)
continues to hold under H!'' 71 T y k O. A similar argument to that in the Proof of Theorem 3.1
demonstrates that the first term in (Bl) divided by w,,c has the limiting representation
z. l + j dWI(a)Fk -r(a)' ( Fk_,(a)Fk ,(a)'da) Fk,r(a)dW,(a) (B2)
where z,.+ 1 N(O, I,+i), Fk_,(a) = (Wk_,(a)', a - ) and Wk_ (a) (a 'Qa )- 1/
is a (k - r)-vector of de-meaned independent standard Brownian motions independen
standard Brownian motion W,(a), a e [0, 1]; cf (22). Now, fo Fk_,(a)dW,(a) is mixed
with conditional variance matrix fo Fk_,-(a)Fk_,(a)'da. Therefore, the second term i
unconditionally distributed as a X2(k - r) random variable and is independent of the first
(A4). Hence, the first term in (Bl) divided by Iw,i has a limiting X2(k + 1) distribution.
The second term in (Bl) may be written as
2(1, -w')(ay, a)pZ P u = 2T1/2(, -w')(a a) (Tr- / Z Z*' P ) = O(T1

and the third term as
(1, -w')(ay, a)p'* IPz_ Z* _(al,a), a)'(1,-w')'
=T(1, -w')(ay, a) (T-1B Z* P Z*fi) (a,, a)'(1, -w')' = Op(T) (B4)

as T- ' Z*P- Z-* 1 f converges in probability to a positive definite matrix. Moreover, as
(1, -w')(ay, a) k 0' under H1 : tyy 0, the Theorem is proved.
Proof of Theorem 4.2 A similar decomposition to (Bl) for the Wald statistic (21) holds under
HIV n Ho-'- except that f, and 8 are now as defined in the Proof of Theorem 3.1. Although
H 7ryy = 0 holds, we have Hl' ' Jryx, O'. Therefore, as in Theorem 3.2, note that we may
write al = (1, -')' for some k-vector 0 = w. Consequently, the first term divided by w,, may be
written as
T-1i/- 2* ~Ifi T-1 ~ *P * 1 P */ l/

T-pz_ U Z1 ,- ,V -1P^Z zz_1fi* - 1PÂ_fi/O)
+T-2UiZ*
-1 1BT
T [T-2B/rTZ*
T -1 lZ- iB B'Z- i/,,1 + o
cf (A7). As in the Pro
where z,. ~ N(0, I,.); cf
/ ( Bf(a) /' Bf(a) I Bf(a) ' -1

Jo a1i Xk(a) jo\ aX 1 /
[I 7(/% Bk(a) \
x ( cx^Bk(a)
dB(a)/w,,,
= Op(1)
JO \ a--
where B (a) Bl(a) - 'Bk(a), a e [0, 1]; cf. Proof of Theorem 3.2. The seco
becomes
2(1, -w')a Z* lP--_u = 2T12(1, -w')a (Tr-/2' Pz ) = Op(T )

and the third term
(1,-w')iZ*_iP _Z* ,B a' (1, -w') = T(1,-w')a
x (T-ljB,Z*iP - Zi ) a'(1, -uw') = Op(T)
The Theoremfollows as (1, -w')a , 0' under H :r = O and Hr r : O .
Proof of Theorem 4.3 We concentrate on Case IV; the remaining Cases I-III and V are
proved by a similar argument. Let {ztr}lI denote the process under HIT of (26). Hence,
41(L)(ztT - L -yt) = StT, where tT - (fT - n)[z(t-_)T - - y(t - 1)] + Et and nT - n is
given in (27). Therefore, A(ztT - / - yt) = C tT+ C*(L)AtT, C(z) = C + (1 - z)C*(z) and
C = (fi-, B')[(a, a')'F(fl, al')] -(a, a1)', and thus,
[Ik+l - (Ik+1 + T- Ca,fy'y)L](ztT - - t) = CetT + C*(L)A tT (B6)

where
etT =T-1 (2 f i [Z(t)r - - y(t - 1)] + t, t = 1, ..., T, T = 1, 2, ...
Inverting (B6) yields
s-1
ZtT = (Ik+1 + Tr Cai)(ZST - - ys) + -t + Yt + (I+l + T- Ca,

i=0
x[Cs(t-i)T + C*(L)AU(t-i)T]
Note that ArT = (tri - n)A[z(t-_ ) - t - y(t - 1)] + Ast. It thereforefollows tha
( (g,, B )'CJk+l(a), where 8 is defined above Lemma A.1 and zt* = (t, z'T', Jk+
{ayfi,C(a - r)dBk+l (r) is an Ornstein-Uhlenbeck process and Bk+l (a) is a (k + 1)
nian motion with variance matrix Q, a E [0, 1]; cf Johansen (1995, Theorem 14.1
Similarly to (A4),
T ',Z*<-P1T * Ifi o0,

A'Z_' P Z iA = (T T IBA_*B
r -1_ ) +op(1)
-- 1 rT
Therefore, expression (Bl) for the Wald statistic (21) multiplied by ,,,, is
<y,,,,W T-2 !i --,-! )1

c T- oaWyP Z*- A1 * (T- 1 P/Z* Pjz Z- 1 i) *
+T z BT [T BT*1*BT BZ*PZ y+op(1) (B7) I
- r-2a'yP"
AP-IIZ
~ B-
TT,
-1B'Z-1
^,Br]
1PB_
rZ!P _ Ay + op(1) (B
The first term in (B7) may be written as
rfiP--1 ~ T ,- ~ z -1 ,
+ 2T'i'Pz Z2* f(T l-' /Z* P^ Z-1 ) p Z* P Z*<
+wT
yT -1rT, 1z(laP^z
-1 yT_ 2* ( (T
where T T- 1auTyyfi' + T1/2( - w
T 2pl/2 /*1P'
rT-/2* P*'-z2*i<_-T1/2Z/
Z*- 1T, = T 2*'/2
p Z* 22 Z* (, _lPay
lP--
= T-lf*Z*lPAzZ* + op(l) (B9)
where we have made use of T- 112jy*z[Ta]T =^ ' CJk+l

re-expressed as
[(T //*ZpI f) + Qr] Q1 [(T Z Pz u) + Q

(B9)
12,-1 B* -- Z* * '
where Q plimT,, (T-1'Z*iP_ _ Z* I) and z, - N(Q1/2, I,).
As PAZ_ y = PA (Z_-I 7r + u), T-1 Bf1PAZ = T BZ PAZ (ZiX1 ,+ Ui).
Consider the second term in (B7), in particular, T- 1B/Z*' P * T which after substitution
for 7*y becomes
B/ZlT-2B'P ZL-Y*ay
=_Z+ T-32BT*
T7-1 ' , P_cif,P~_Z*_,y
Z- = '* ,T-B + o1 +
) op(l))
1 ( fy, fi )CJk+l (a) JT+fl ( La) ayyda
Therefore,
T- iBTZ* 1PXZ_ Ay I )C (a) ( dWi, (a)+ Jk+1 (a)'C' yl)yyda)

Consider
;k-,-+l(a)[= (J(a), JK,-) [(aO, )'Q(aa, )]-12c k a I ) (a)

=( cC- 1/2 J((a) \
~T- r__P~ y=
U\L~~t/
k *I
f01 /2' -a.x'Jk(a)
-'1 .ab)-1/2
(axxQ
1 J u\U/
where Ju(a) = J1(a) - W'Jk(a) is independentf k(a) and o k+l(a) (Ji(a), Jk(a)')', a e [0, 1].
Now , Jk r+l(a) satisfies the stochastic integral and differential equations, Jk*-,+l (a) = Wk-r+
(a) + ab' Jo Jk- + (r) dr and dJk.+ 1 (a) = dWk-,.+1 (a) + ab'Jl .+1 (a) da, where a = [(a, a)'
Ž(af, al)]-1/2(f, a)'a and b = [(aL, a)'Q)(al, a')]1/2 x [(f, Bly)'r(al, a)]-( Y )
fy; cf. Johansen (1995, Theorem 14.4, p. 207). Note that the first element of Jk*-_+l (a) satisfies
J(a) = Wu(a) + wouu-/2ayb' f i Jk- ( (r) dr and dJ( = dW(a) = d +(a)C + 1/2 ' 1 (a) da.
Therefore,
T-1BT-*'IP-_AY
B lB Z*P j ((IY,
:=o CJk+l
d1*( (a) ) )1/2 d(a)
T -J I \ a I ^
Hence, the second term in (B7) we
W, I dJ) (a) 'Fk_, +

where Fk-r+l (a) = (Jk-r+ (a)',
Combining (B9) and (BO1) gives
H1T of (26) and noting dJ*(a) ma
Proof of Theorem 4.4 We consid

similarly. Under
__ A
H y: tyy
A_Z
# 0
V-1 = (O, V1,
v'lP AZ-.X l-1 1 AZY- 1X
'~1P~'~_, - _ V- 1·
As in Appendix A,
T-1/2PX[Ta] = T-1/2 x1fx + T-1 2x YXt + T- (x

+ (0, ,B ')T- 1/2C*(L)e[Ta]
and noting that = 0, -8tXx = T-l/2f' + T-1/24xxy xt +
A i P/fxX _ 1 -0P " _^ tT 01 x
where AX - T- 1/2 (BX, T- /2 )

Now, because T- 1 Bx ^v/ = Op(l), T- BlXX' AZ = Op(l), T-'AZ_AZ_
T-1 AZ_v-i = Op(l), hence T- BXX' 1P vi
xX-P-_ = Op(l ). Also
v_t-^-I because T-l1 -
= O-M
and T-1 '^/XZ' 1A = Op(l), hence T
T- 1 X' P X - rlxx = Op(l) and T-2ix' P^ X-i = O,(l),
where 1P X_ A , P -P X 1 1 1frx 1 X 1 and

i (P^X_i ' l /X' P^- X i) 1X'1 . Therefore, as T- v' iv_1 =
T- yi 1P^ x_Y- = Op(l) (Bll)
The numerator of tf_ of (24) may be written as Ty' _P^

2_1i, where X (Py, B)(ay, a)'(1, -w')'. Because T -'/2B X
Op(l), T- /2 Xi 1P^ u = Op(l), and, as T- 1A 'X' uI = Op(l), T- 'X' P u = Op (1).

Therefore,
T-1/2vlPZXu = ^T-1/2v/ -lP - T-/2/ -P fli + Op(1)

-l/2v_P^ , u + op(l) = Op(l)
noting T-1/2v u = Op(l). Similarly, as (1 - )(, a) 7 0', T-/Z'' 1AZ_ = Op(1), T-'1Z'2
X- iPxx_ = Op(l) and T-1''J IX_ i-l = Op(l). Therefore,
T-' 1i"' P Z 1=T 1 P, x 1X- T P_ X X_ + op(1)

T'B',P 1PAZ„X„Z-^ I T-1PAZ„X„A,i_lI -A1Z V-1
= T-'vip' 1P_' z-_ i + op(1)= op(l)
_ 1 AZ_,-_ip..r
noting T- ' 1Z _1 = Op(l). Thus,
T-1/2vP' 1 Z_1- = Op(T12). (B12)

Because i),, - w,, = op(1), combining (Bll) and (B12) yields the desired result.
ACKNOWLEDGEMENTS
We are grateful to the Editor (David Hendry) and three anonymous refere
comments on an earlier version of this paper. Our thanks are also owed to Mic
Burridge, Clive Granger, Brian Henry, Joon-Yong Park, Ron Smith, Rod Whit
participants at the University of Birmingham. Partial financial support from
R000233608 and R000237334) and the Isaac Newton Trust of Trinity Coll
gratefully acknowledged. Previous versions of this paper appeared as DAE Wor
Nos. 9622 and 9907, University of Cambridge.
REFERENCES
Banerjee A, Dolado J, Galbraith JW, Hendry DF. 1993. Co-Integration, Error Correction,
metric Analysis of Non-Stationary Data. Oxford University Press: Oxford.
Banerjee A, Dolado J, Mestre R. 1998. Error-correction mechanism tests for cointegration in
framework. Journal of Time Series Analysis 19: 267-283.
Banerjee A, Galbraith JW, Hendry DF, Smith GW. 1986. Exploring equilibrium relationships
rics through static models: some Monte Carlo Evidence. Oxford Bulletin of Economics and
253-277.
Blanchard OJ, Summers L. 1986. Hysteresis and the European Unemployment Problem. In N
conomics Annual 15-78.
Boswijk P. 1992. Cointegration, Identification and Exogeneity: Inference in Structural Error C
Models. Tinbergen Institute Research Series.
Boswijk HP. 1994. Testing for an unstable root in conditional and structural error correction mode
of Econometrics 63: 37-70.
Boswijk HP. 1995. Efficient inference on cointegration parameters in structural error correctio
Journal of Econometrics 69: 133-158.
Cavanagh CL, Elliott G, Stock JH. 1995. Inference in models with nearly integrated regressors. Ec
Theory 11: 1131-1147.
Chan A, Savage D, Whittaker R. 1995. The new treasury model. Government Econo
Paper No. 128, (Treasury Working Paper No. 70).
Darby J, Wren-Lewis S. 1993. Is there a cointegrating vector for UK wages? Journal
20: 87-115.
Dickey DA, Fuller WA. 1979. Distribution of the estimators for autoregressive time series wi
Journal of the American Statistical Association 74: 427-431.
Dickey DA, Fuller WA. 1981. Likelihood ratio statistics for autoregressive time series wit
Econometrica 49: 1057-1072.
Engle RF, Granger CWJ. 1987. Cointegration and error correction representation: estimation a
Econometrica 55: 251-276.
Granger CWJ, Lin J-L. 1995. Causality in the long run. Econometric Theory 11: 530-536.
Hansen BE. 1995. Rethinking the univariate approach to unit root testing: using covariates to incre
Econometric Theory 11: 1148-1171.
Harbo I, Johansen S, Nielsen B, Rahbek A. 1998. Asymptotic inference on cointegrating rank
systems. Journal of Business Economics and Statistics 16: 388-399.
Hendry DF, Pagan AR, Sargan JD. 1984. Dynamic specification. In Handbook of Econometri
Griliches Z, Intriligator MD (eds). Elsevier: Amsterdam.
Johansen S. 1991. Estimation and hypothesis testing of cointegrating vectors in Gaussian vector au
sive models. Econometrica 59: 1551-1580.
Johansen S. 1992. Cointegration in partial systems and the efficiency of single-equation analysis.
Econometrics 52: 389-402.
Johansen S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxf
versity Press: Oxford.
Kremers JJM, Ericsson NR, Dolado JJ. 1992. The power of cointegration tests. Oxford Bulletin of E
and Statistics 54: 325-348.
Layard R, Nickell S, Jackman R. 1991. Unemployment. Macroeconomic Performance and the
Market. Oxford University Press: Oxford.
Lindbeck A, Snower D. 1989. The Insider Outsider Theory of Employment and Unemployment, MIT
Cambridge, MA.
Manning A. 1993. Wage bargaining and the Phillips curve: the identification and specification of ag
wage equations. Economic Journal 103: 98-118.
Nickell S, Andrews M. 1983. Real wages and employment in Britain. Oxford Economic Papers 35:
Nielsen B, Rahbek A. 1998. Similarity issues in cointegration analysis. Preprint No. 7, Departm
Theoretical Statistics, University of Copenhagen.
Park JY. 1990. Testing for unit roots by variable addition. In Advances in Econometrics: Cointe
Spurious Regressions and Unit Roots, Fomby TB, Rhodes RF (eds). JAI Press: Greenwich, CT.
Pesaran MH, Pesaran B. 1997. Working with Microfit 4.0: Interactive Econometric Analysis, Oxford
sity Press: Oxford.
Pesaran MH, Shin Y. 1999. An autoregressive distributed lag modelling approach to cointegration an
Chapter 11 in Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Cen
Symposium, Strom S (ed.). Cambridge University Press: Cambridge.
Pesaran MH, Shin Y, Smith RJ. 2000. Structural analysis of vector error correction models with ex
I(1) variables. Journal of Econometrics 97: 293-343.
Phillips AW. 1958. The relationship between unemployment and the rate of change of money wage
the United Kingdom, 1861-1957. Economica 25: 283-299.
Phillips PCB, Durlauf S. 1986. Multiple time series with integrated variables. Review of Economic
53: 473-496.
Phillips PCB, Ouliaris S. 1990. Asymptotic properties of residual based tests for cointegration
58: 165-193.
Phillips PCB, Solo V. 1992. Asymptotics for linear processes. Annals of Statistics 20: 971-1
Rahbek A, Mosconi R. 1999. Cointegration rank inference with stationary regressors in VA
Econometrics Journal 2: 76-91.
Sargan JD. 1964. Real wages and prices in the U.K. Econometric Analysis of National Economic Pla
Hart PE Mills G, Whittaker JK (eds). Macmillan: New York. Reprinted in Hendry DF, Wallis KF
Econometrics and Quantitative Economics. Basil Blackwell: Oxford; 275-314.
Shin Y. 1994. A residual-based test of the null of cointegration against the alternative of no cointegration.
Econometric Theory 10: 91-115.
Stock J, Watson MW. 1988. Testing for common trends. Journal of the American Statistical Association 83:
1097-1107.
Urbain JP. 1992. On weak exogeneity in error correction models. Oxford Bulletin of Economics
52: 187-202.
J. Appl. Econ. 16: 289– 326 (2001)
DOI: 10.1002/jae.616

M. HASHEM PESARAN,a * YONGCHEOL SHINb AND RICHARD J. SMITHc

SUMMARY
This paper develops a new approach to the problem of testing the existence of a level relationship between
a dependent variable and a set of regressors, when it is not known with certainty whether the underlying
regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics
used to test the significance of the lagged levels of the variables in a univariate equilibrium correction
mechanism. The asymptotic distributions of these statistics are non-standard under the null hypothesis that
there exists no level relationship, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic
critical values are provided: one when all regressors are purely I(1) and the other if they are all purely
I(0). These two sets of critical values provide a band covering all possible classifications of the regressors
into purely I(0), purely I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are
proposed. It is shown that the proposed tests are consistent, and their asymptotic distributions under the null
and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is
demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric
model. Copyright © 2001 John Wiley & Sons, Ltd.
1. INTRODUCTION
Over the past decade considerable attention has been paid in empirical economics to testing for
the existence of relationships in levels between variables. In the main, this analysis has been
based on the use of cointegration techniques. Two principal approaches have been adopted: the
two-step residual-based procedure for testing the null of no-cointegration (see Engle and Granger,
1987; Phillips and Ouliaris, 1990) and the system-based reduced rank regression approach due to
Johansen (1991, 1995). In addition, other procedures such as the variable addition approach of Park
(1990), the residual-based procedure for testing the null of cointegration by Shin (1994), and the
stochastic common trends (system) approach of Stock and Watson (1988) have been considered.
All of these methods concentrate on cases in which the underlying variables are integrated of order
one. This inevitably involves a certain degree of pre-testing, thus introducing a further degree of
uncertainty into the analysis of levels relationships. (See, for example, Cavanagh, Elliott and Stock,
1995.)
* Correspondence to: M. H. Pesaran, Faculty of Economics and Politics, University of Cambridge, Sidgwick Avenue,
Copyright © 2001 John Wiley & Sons, Ltd. Received 16 February 1999
I(0), purely I(1) or mutually cointegrated. The statistic underlying our procedure is the familiar
Wald or F-statistic in a generalized Dickey–Fuller type regression used to test the significance
of lagged levels of the variables under consideration in a conditional unrestricted equilibrium
correction model (ECM). It is shown that the asymptotic distributions of both statistics are
non-standard under the null hypothesis that there exists no relationship in levels between the
included variables, irrespective of whether the regressors are purely I(0), purely I(1) or mutually
cointegrated. We establish that the proposed test is consistent and derive its asymptotic distribution
under the null and suitably defined local alternatives, again for a set of regressors which are a
mixture of I(0)/I(1) variables.
Two sets of asymptotic critical values are provided for the two polar cases which assume that all
the regressors are, on the one hand, purely I(1) and, on the other, purely I(0). Since these two sets
of critical values provide critical value bounds for all classifications of the regressors into purely
I(1), purely I(0) or mutually cointegrated, we propose a bounds testing procedure. If the computed
ratio (unemployment benefit–wage ratio) and the wedge between the ‘real product wage’ and the
‘real consumption wage’ that typically enter the earnings equation. There is another consideration
in the choice of this application. Under the influence of the seminal contributions of Phillips (1958)
the development of time series econometrics in the UK. Sargan’s work is particularly noteworthy
The relationship in levels underlying the UK Treasury’s earning equation relates real average
et al. (1991). In order to identify our model as corresponding to the bargaining theory of wage
vice versa; see Manning (1993). This assumption, of course, does not preclude the rate of change
A number of conditional ECMs in these five variables were estimated and we found that, if a
sufficiently high order is selected for the lag lengths of the included variables, the hypothesis that
there exists no relationship in levels between these variables is rejected, irrespective of whether
they are purely I(0), purely I(1) or mutually cointegrated. Given a level relationship between these
variables, the autoregressive distributed lag (ARDL) modelling approach (Pesaran and Shin, 1999)
The plan of the paper is as follows. The vector autoregressive (VAR) model which underpins
the analysis of this and later sections is set out in Section 2. This section also addresses the
issues involved in testing for the existence of relationships in levels between variables. Section 3
considers the Wald statistic (or the F-statistic) for testing the hypothesis that there exists no
level relationship between the variables under consideration and derives the associated asymptotic
theory together with that for the t-statistic of Banerjee et al. (1998). Section 4 discusses the power
properties of these tests. Section 5 describes the empirical application. Section 6 provides some
concluding remarks. The Appendices detail proofs of results given in Sections 3 and 4.
The following notation is used. The symbol $\Rightarrow$ signifies ‘weak convergence in probability
measure’, $\mathbf{I}_m$ ‘an identity matrix of order $m$’, $I(d)$ ‘integrated of order $d$’, $O_P(K)$ ‘of the same
order as $K$ in probability’ and $o_P(K)$ ‘of smaller order than $K$ in probability’.

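Let $\{\mathbf{z}_t\}_{t=1}^{\infty}$ denote a $(k+1)$-vector random process. The data-generating process for $\{\mathbf{z}_t\}_{t=1}^{\infty}$ is the VAR($p$) model
\[ \boldsymbol{\Phi}(L)(\mathbf{z}_t - \boldsymbol{\mu} - \boldsymbol{\gamma}t) = \boldsymbol{\varepsilon}_t, \quad t = 1, 2, \ldots \tag{1} \]
where $L$ is the lag operator, $\boldsymbol{\mu}$ and $\boldsymbol{\gamma}$ are unknown $(k+1)$-vectors of intercept and trend coefficients,
the $(k+1, k+1)$ matrix lag polynomial $\boldsymbol{\Phi}(L) = \mathbf{I}_{k+1} - \sum_{i=1}^{p} \boldsymbol{\Phi}_i L^i$ with $\{\boldsymbol{\Phi}_i\}_{i=1}^{p}$ $(k+1, k+1)$
matrices of unknown coefficients; see Harbo et al. (1998) and Pesaran, Shin and Smith (2000),
henceforth HJNR and PSS respectively. The properties of the $(k+1)$-vector error process $\{\boldsymbol{\varepsilon}_t\}_{t=1}^{\infty}$
are given in Assumption 2 below. All the analysis of this paper is conducted given the initial
observations $\mathbf{Z}_0 \equiv (\mathbf{z}_{1-p}, \ldots, \mathbf{z}_0)$. We assume: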
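Assumption 1. The roots of $|\mathbf{I}_{k+1} - \sum_{i=1}^{p} \boldsymbol{\Phi}_i z^i| = 0$ are either outside the unit circle $|z| = 1$ or
satisfy $z = 1$.
Assumption 2. The vector error process $\{\boldsymbol{\varepsilon}_t\}_{t=1}^{\infty}$ is $IN(\mathbf{0}, \boldsymbol{\Omega})$, $\boldsymbol{\Omega}$ positive definite.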
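As an illustration of what Assumption 1 admits and excludes, the following sketch computes the roots of $|\mathbf{I}_2 - \boldsymbol{\Phi}_1 z| = 0$ for a bivariate VAR(1) and checks each root against the condition. It is not part of the paper's procedure: the function names `var1_roots` and `satisfies_assumption_1`, the tolerance, and the three coefficient matrices are hypothetical choices made only for this example.

```python
import cmath

def var1_roots(phi):
    """Roots z of det(I_2 - phi * z) = 0 for a 2x2 VAR(1) coefficient matrix phi."""
    trace = phi[0][0] + phi[1][1]
    det = phi[0][0] * phi[1][1] - phi[0][1] * phi[1][0]
    # det(I - phi z) = 1 - trace * z + det * z**2
    if det == 0:
        # The polynomial is linear (or constant) in z.
        return [] if trace == 0 else [complex(1.0 / trace)]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return [(trace + disc) / (2.0 * det), (trace - disc) / (2.0 * det)]

def satisfies_assumption_1(roots, tol=1e-9):
    """True if every root lies outside the unit circle or equals 1 (up to tol)."""
    return all(abs(z) > 1.0 + tol or abs(z - 1.0) <= tol for z in roots)

# Hypothetical coefficient matrices (illustrative values only).
mixed = [[1.0, 0.0], [0.0, 0.5]]       # roots z = 2 and z = 1: admitted
explosive = [[1.2, 0.0], [0.0, 0.5]]   # one root inside the unit circle: excluded
seasonal = [[-1.0, 0.0], [0.0, 0.5]]   # root at z = -1: excluded

ok_mixed = satisfies_assumption_1(var1_roots(mixed))
ok_explosive = satisfies_assumption_1(var1_roots(explosive))
ok_seasonal = satisfies_assumption_1(var1_roots(seasonal))
```

The first matrix combines an I(1) and an I(0) component and passes; the other two fail because they imply an explosive root and a unit root at $z = -1$ respectively.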
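Assumption 1 permits the elements of $\mathbf{z}_t$ to be purely I(1), purely I(0) or cointegrated but excludes
the possibility of seasonal unit roots and explosive roots.1 Assumption 2 may be relaxed somewhat
to permit $\{\boldsymbol{\varepsilon}_t\}_{t=1}^{\infty}$ to be a conditionally mean zero and homoscedastic process; see, for example,
We may re-express the lag polynomial $\boldsymbol{\Phi}(L)$ in vector equilibrium correction model (ECM)
form; i.e. $\boldsymbol{\Phi}(L) \equiv -\boldsymbol{\Pi}L + \boldsymbol{\Gamma}(L)(1 - L)$ in which the long-run multiplier matrix is defined by
$\boldsymbol{\Pi} \equiv -(\mathbf{I}_{k+1} - \sum_{i=1}^{p} \boldsymbol{\Phi}_i)$, and the short-run response matrix lag polynomial $\boldsymbol{\Gamma}(L) \equiv \mathbf{I}_{k+1} - \sum_{i=1}^{p-1} \boldsymbol{\Gamma}_i L^i$,
$\boldsymbol{\Gamma}_i = -\sum_{j=i+1}^{p} \boldsymbol{\Phi}_j$, $i = 1, \ldots, p-1$. Hence, the VAR($p$) model (1) may be rewritten in vector
ECM form as
\[ \Delta\mathbf{z}_t = \mathbf{a}_0 + \mathbf{a}_1 t + \boldsymbol{\Pi}\mathbf{z}_{t-1} + \sum_{i=1}^{p-1} \boldsymbol{\Gamma}_i \Delta\mathbf{z}_{t-i} + \boldsymbol{\varepsilon}_t, \quad t = 1, 2, \ldots \tag{2} \]
where $\Delta \equiv 1 - L$ is the difference operator,
\[ \mathbf{a}_0 \equiv -\boldsymbol{\Pi}\boldsymbol{\mu} + (\boldsymbol{\Gamma} + \boldsymbol{\Pi})\boldsymbol{\gamma}, \quad \mathbf{a}_1 \equiv -\boldsymbol{\Pi}\boldsymbol{\gamma} \tag{3} \]
and the sum of the short-run coefficient matrices $\boldsymbol{\Gamma} \equiv \mathbf{I}_{k+1} - \sum_{i=1}^{p-1} \boldsymbol{\Gamma}_i = -\boldsymbol{\Pi} + \sum_{i=1}^{p} i\boldsymbol{\Phi}_i$. As
detailed in PSS, Section 2, if $\boldsymbol{\gamma} \neq \mathbf{0}$, the resultant constraints (3) on the trend coefficients $\mathbf{a}_1$
in (2) ensure that the deterministic trending behaviour of the level process $\{\mathbf{z}_t\}_{t=1}^{\infty}$ is invariant to
the (cointegrating) rank of $\boldsymbol{\Pi}$; a similar result holds for the intercept of $\{\mathbf{z}_t\}_{t=1}^{\infty}$ if $\boldsymbol{\mu} \neq \mathbf{0}$ and $\boldsymbol{\gamma} = \mathbf{0}$.
1 Assumptions 5a and 5b below further restrict the maximal order of integration of $\{\mathbf{z}_t\}_{t=1}^{\infty}$ to unity.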
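The mapping from the VAR($p$) coefficients to the vector ECM quantities can be sketched in a few lines of code. The fragment below is a minimal illustration only: the helper names and the two $2 \times 2$ coefficient matrices are hypothetical, and plain nested lists stand in for matrices. It builds $\boldsymbol{\Pi} = -(\mathbf{I} - \sum_i \boldsymbol{\Phi}_i)$ and $\boldsymbol{\Gamma}_i = -\sum_{j=i+1}^{p} \boldsymbol{\Phi}_j$, and then checks numerically that $\mathbf{I} - \sum_i \boldsymbol{\Gamma}_i$ equals $-\boldsymbol{\Pi} + \sum_i i\boldsymbol{\Phi}_i$.

```python
def mat_add(a, b):
    """Element-wise sum of two equally sized matrices (nested lists)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0.0] * n for _ in range(n)]

def var_to_vecm(phis):
    """Return (Pi, [Gamma_1, ..., Gamma_{p-1}]) for Phi(L) = I - sum_i Phi_i L^i."""
    n, p = len(phis[0]), len(phis)
    phi_sum = zeros(n)
    for phi in phis:
        phi_sum = mat_add(phi_sum, phi)
    pi = mat_add(phi_sum, mat_scale(-1.0, identity(n)))   # Pi = -(I - sum Phi_i)
    gammas = []
    for i in range(1, p):
        tail = zeros(n)
        for phi in phis[i:]:                               # Phi_{i+1}, ..., Phi_p
            tail = mat_add(tail, phi)
        gammas.append(mat_scale(-1.0, tail))               # Gamma_i = -sum_{j>i} Phi_j
    return pi, gammas

# Hypothetical VAR(2) coefficients (illustrative values only).
phis = [[[0.5, 0.1], [0.0, 0.7]], [[0.3, 0.0], [0.1, 0.3]]]
pi, gammas = var_to_vecm(phis)

# Left-hand side: Gamma = I - sum_i Gamma_i.
gamma = identity(2)
for g in gammas:
    gamma = mat_add(gamma, mat_scale(-1.0, g))

# Right-hand side: -Pi + sum_i i * Phi_i.
rhs = mat_scale(-1.0, pi)
for i, phi in enumerate(phis, start=1):
    rhs = mat_add(rhs, mat_scale(float(i), phi))
```

With these values both sides agree up to floating-point rounding, which is a quick sanity check on the sign conventions used in (2) and (3).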
Consequently, critical regions defined in terms of the Wald and F-statistics suggested below are
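The focus of this paper is on the conditional modelling of the scalar variable $y_t$ given the $k$-vector $\mathbf{x}_t$ and the past values $\{\mathbf{z}_{t-i}\}_{i=1}^{t-1}$ and $\mathbf{Z}_0$, where we have partitioned $\mathbf{z}_t = (y_t, \mathbf{x}_t')'$. Partitioning
the error term $\boldsymbol{\varepsilon}_t$ conformably with $\mathbf{z}_t = (y_t, \mathbf{x}_t')'$ as $\boldsymbol{\varepsilon}_t = (\varepsilon_{yt}, \boldsymbol{\varepsilon}_{xt}')'$ and its variance matrix as
\[ \boldsymbol{\Omega} = \begin{pmatrix} \omega_{yy} & \boldsymbol{\omega}_{yx} \\ \boldsymbol{\omega}_{xy} & \boldsymbol{\Omega}_{xx} \end{pmatrix} \]
we may express $\varepsilon_{yt}$ conditionally in terms of $\boldsymbol{\varepsilon}_{xt}$ as
\[ \varepsilon_{yt} = \boldsymbol{\omega}_{yx}\boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\varepsilon}_{xt} + u_t \tag{4} \]
where $u_t \sim IN(0, \omega_{uu})$, $\omega_{uu} \equiv \omega_{yy} - \boldsymbol{\omega}_{yx}\boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\omega}_{xy}$ and $u_t$ is independent of $\boldsymbol{\varepsilon}_{xt}$. Substitution of (4)
into (2) together with a similar partitioning of $\mathbf{a}_0 = (a_{y0}, \mathbf{a}_{x0}')'$, $\mathbf{a}_1 = (a_{y1}, \mathbf{a}_{x1}')'$, $\boldsymbol{\Pi} = (\boldsymbol{\pi}_y', \boldsymbol{\Pi}_x')'$,
$\boldsymbol{\Gamma} = (\boldsymbol{\gamma}_y', \boldsymbol{\Gamma}_x')'$, $\boldsymbol{\Gamma}_i = (\boldsymbol{\gamma}_{yi}', \boldsymbol{\Gamma}_{xi}')'$, $i = 1, \ldots, p-1$, provides a conditional model for $\Delta y_t$ in terms of
$\mathbf{z}_{t-1}$, $\Delta\mathbf{x}_t$, $\Delta\mathbf{z}_{t-1}, \ldots$; i.e. the conditional ECM
\[ \Delta y_t = c_0 + c_1 t + \boldsymbol{\pi}_{y.x}\mathbf{z}_{t-1} + \sum_{i=1}^{p-1} \boldsymbol{\psi}_i'\Delta\mathbf{z}_{t-i} + \mathbf{w}'\Delta\mathbf{x}_t + u_t, \quad t = 1, 2, \ldots \tag{5} \]
where $\mathbf{w} \equiv \boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\omega}_{xy}$, $c_0 \equiv a_{y0} - \mathbf{w}'\mathbf{a}_{x0}$, $c_1 \equiv a_{y1} - \mathbf{w}'\mathbf{a}_{x1}$, $\boldsymbol{\psi}_i' \equiv \boldsymbol{\gamma}_{yi} - \mathbf{w}'\boldsymbol{\Gamma}_{xi}$, $i = 1, \ldots, p-1$, and
$\boldsymbol{\pi}_{y.x} \equiv \boldsymbol{\pi}_y - \mathbf{w}'\boldsymbol{\Pi}_x$. The deterministic relations (3) are modified to
\[ c_0 = -\boldsymbol{\pi}_{y.x}\boldsymbol{\mu} + (\boldsymbol{\gamma}_{y.x} + \boldsymbol{\pi}_{y.x})\boldsymbol{\gamma}, \quad c_1 = -\boldsymbol{\pi}_{y.x}\boldsymbol{\gamma} \tag{6} \]
where $\boldsymbol{\gamma}_{y.x} \equiv \boldsymbol{\gamma}_y - \mathbf{w}'\boldsymbol{\Gamma}_x$.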
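The conditioning step behind (4) can be illustrated numerically. The sketch below assumes $k = 2$, so that $\boldsymbol{\Omega}_{xx}$ is $2 \times 2$ and can be inverted in closed form; the function name `conditional_error_params` and the covariance values are hypothetical and serve only to show how $\mathbf{w} = \boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\omega}_{xy}$ and $\omega_{uu} = \omega_{yy} - \boldsymbol{\omega}_{yx}\boldsymbol{\Omega}_{xx}^{-1}\boldsymbol{\omega}_{xy}$ are obtained from a partitioned variance matrix.

```python
def conditional_error_params(omega):
    """Given a 3x3 covariance matrix ordered as (y, x1, x2), return (w, omega_uu).

    w        = Omega_xx^{-1} omega_xy   (coefficients on the x-innovations)
    omega_uu = omega_yy - omega_yx Omega_xx^{-1} omega_xy   (conditional variance)
    """
    omega_yy = omega[0][0]
    omega_xy = [omega[1][0], omega[2][0]]
    a, b = omega[1][1], omega[1][2]
    c, d = omega[2][1], omega[2][2]
    det = a * d - b * c
    if det == 0:
        raise ValueError("Omega_xx must be nonsingular")
    # Closed-form inverse of the 2x2 block Omega_xx.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    w = [inv[0][0] * omega_xy[0] + inv[0][1] * omega_xy[1],
         inv[1][0] * omega_xy[0] + inv[1][1] * omega_xy[1]]
    omega_uu = omega_yy - (omega_xy[0] * w[0] + omega_xy[1] * w[1])
    return w, omega_uu

# Hypothetical positive definite covariance matrix (illustrative values only).
omega = [[2.0, 0.5, 0.3],
         [0.5, 1.0, 0.2],
         [0.3, 0.2, 1.0]]
w, omega_uu = conditional_error_params(omega)
```

Because $u_t$ is the part of $\varepsilon_{yt}$ left after projecting on $\boldsymbol{\varepsilon}_{xt}$, the conditional variance $\omega_{uu}$ cannot exceed $\omega_{yy}$, which the example values reflect.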
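We now partition the long-run multiplier matrix $\boldsymbol{\Pi}$ conformably with $\mathbf{z}_t = (y_t, \mathbf{x}_t')'$ as
\[ \boldsymbol{\Pi} = \begin{pmatrix} \pi_{yy} & \boldsymbol{\pi}_{yx} \\ \boldsymbol{\pi}_{xy} & \boldsymbol{\Pi}_{xx} \end{pmatrix} \]
Assumption 3. The $k$-vector $\boldsymbol{\pi}_{xy} = \mathbf{0}$.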
In the application of Section 6, Assumption 3 is an identifying assumption for the bargaining


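\[ \Delta\mathbf{x}_t = \mathbf{a}_{x0} + \mathbf{a}_{x1}t + \boldsymbol{\Pi}_{xx}\mathbf{x}_{t-1} + \sum_{i=1}^{p-1} \boldsymbol{\Gamma}_{xi}\Delta\mathbf{z}_{t-i} + \boldsymbol{\varepsilon}_{xt}, \quad t = 1, 2, \ldots \tag{7} \]
Thus, we may regard the process $\{\mathbf{x}_t\}_{t=1}^{\infty}$ as long-run forcing for $\{y_t\}_{t=1}^{\infty}$ as there is no feedback
from the level of $y_t$ in (7); see Granger and Lin (1995).3 Assumption 3 restricts consideration to
cases in which there exists at most one conditional level relationship between $y_t$ and $\mathbf{x}_t$, irrespective
of the level of integration of the process $\{\mathbf{x}_t\}_{t=1}^{\infty}$; see (10) below.4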
p1
yt D c0 C c1 t C !yy yt1 C pyx.x xt1 C y0i zti C w0 xt C ut 8
iD1
t D 1, 2, . . ., where
c0 D !yy , pyx.x m C [gy.x C !yy , pyx.x ]g, c1 D !yy , pyx.x g 9

0
and pyx.x pyx w 5xx .5
the system.
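Assumption 4. The matrix $\boldsymbol{\Pi}_{xx}$ has rank $r$, $0 \le r \le k$.
Under Assumption 4, from (7), we may express $\boldsymbol{\Pi}_{xx}$ as $\boldsymbol{\Pi}_{xx} = \boldsymbol{\alpha}_{xx}\boldsymbol{\beta}_{xx}'$, where $\boldsymbol{\alpha}_{xx}$ and $\boldsymbol{\beta}_{xx}$ are both
$(k, r)$ matrices of full column rank; see, for example, Engle and Granger (1987) and Johansen
1, 3 and 4, the process $\{\mathbf{x}_t\}_{t=1}^{\infty}$ is mutually cointegrated of order $r$, $0 \le r \le k$. However, in
concentrate on the case $r = 0$, we do not wish to impose an a priori specification of $r$.6 When
$\boldsymbol{\pi}_{xy} = \mathbf{0}$ and $\boldsymbol{\Pi}_{xx} = \mathbf{0}$, then $\mathbf{x}_t$ is weakly exogenous for $\pi_{yy}$ and $\boldsymbol{\pi}_{yx.x} = \boldsymbol{\pi}_{yx}$ in (8); see, for example,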
3 Note that this restriction does not preclude $\{y_t\}_{t=1}^{\infty}$ being Granger-causal for $\{\mathbf{x}_t\}_{t=1}^{\infty}$ in the short run.
4 Assumption 3 may be straightforwardly assessed via a test for the exclusion of the lagged level $y_{t-1}$ in (7). The
asymptotic properties of such a test are the subject of current research.
5 PSS and HJNR consider a similar model but where $\mathbf{x}_t$ is purely I(1); that is, under the additional assumption $\boldsymbol{\Pi}_{xx} = \mathbf{0}$.
If current and lagged values of a weakly exogenous purely I(0) vector $\mathbf{w}_t$ are included as additional explanatory variables
in (8), the lagged level vector $\mathbf{x}_{t-1}$ should be augmented to include the cumulated sum $\sum_{s=1}^{t-1} \mathbf{w}_s$ in order to preserve the
asymptotic similarity of the statistics discussed below. See PSS, sub-section 4.3, and Rahbek and Mosconi (1999).
6 BDM, pp. 277–278, also briefly discuss the case when $0 < r \le k$. However, in this circumstance, as will become clear
below, the validity of the limiting distributional results for their procedure requires the imposition of further implicit and
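Johansen (1995, Theorem 8.1, p. 122). In the more general case where $\boldsymbol{\Pi}_{xx}$ is non-zero, as $\pi_{yy}$ and
$\boldsymbol{\pi}_{yx.x} = \boldsymbol{\pi}_{yx} - \mathbf{w}'\boldsymbol{\Pi}_{xx}$ are variation-free from the parameters in (7), $\mathbf{x}_t$ is also weakly exogenous for
matrix $\boldsymbol{\Pi}$ for the system (8) and (7) is $r + 1$ and the minimal cointegrating rank of $\boldsymbol{\Pi}$ is $r$. The
and (7) to be unity. First, we consider the requisite conditions for the case in which $\mathrm{rank}(\boldsymbol{\Pi}) = r$.
In this case, under Assumptions 1, 3 and 4, $\pi_{yy} = 0$ and $\boldsymbol{\pi}_{yx} - \boldsymbol{\phi}'\boldsymbol{\Pi}_{xx} = \mathbf{0}'$ for some $k$-vector $\boldsymbol{\phi}$.
Note that $\boldsymbol{\pi}_{yx.x} = \mathbf{0}'$ implies the latter condition. Thus, under Assumptions 1, 3 and 4, $\boldsymbol{\Pi}$ has rank
$r$ and is given by
\[ \boldsymbol{\Pi} = \begin{pmatrix} 0 & \boldsymbol{\pi}_{yx} \\ \mathbf{0} & \boldsymbol{\Pi}_{xx} \end{pmatrix} \]
Hence, we may express $\boldsymbol{\Pi} = \boldsymbol{\alpha}\boldsymbol{\beta}'$ where $\boldsymbol{\alpha} = (\boldsymbol{\alpha}_{yx}', \boldsymbol{\alpha}_{xx}')'$ and $\boldsymbol{\beta} = (\mathbf{0}, \boldsymbol{\beta}_{xx}')'$ are $(k+1, r)$ matrices of
full column rank; cf. HJNR, p. 390. Let the columns of the $(k+1, k-r+1)$ matrices $(\boldsymbol{\alpha}_y^{\perp}, \boldsymbol{\alpha}^{\perp})$
and $(\boldsymbol{\beta}_y^{\perp}, \boldsymbol{\beta}^{\perp})$, where $\boldsymbol{\alpha}_y^{\perp}$, $\boldsymbol{\beta}_y^{\perp}$ and $\boldsymbol{\alpha}^{\perp}$, $\boldsymbol{\beta}^{\perp}$ are respectively $(k+1)$-vectors and $(k+1, k-r)$
matrices, denote bases for the orthogonal complements of respectively $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$; in particular,
$(\boldsymbol{\alpha}_y^{\perp}, \boldsymbol{\alpha}^{\perp})'\boldsymbol{\alpha} = \mathbf{0}$ and $(\boldsymbol{\beta}_y^{\perp}, \boldsymbol{\beta}^{\perp})'\boldsymbol{\beta} = \mathbf{0}$.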
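Assumption 5a. If $\mathrm{rank}(\boldsymbol{\Pi}) = r$, the matrix $(\boldsymbol{\alpha}_y^{\perp}, \boldsymbol{\alpha}^{\perp})'\boldsymbol{\Gamma}(\boldsymbol{\beta}_y^{\perp}, \boldsymbol{\beta}^{\perp})$ is full rank $k - r + 1$, $0 \le r \le k$.

Second, if the long-run multiplier matrix $\boldsymbol{\Pi}$ has rank $r + 1$, then under Assumptions 1, 3 and 4,
$\pi_{yy} \neq 0$ and $\boldsymbol{\Pi}$ may be expressed as $\boldsymbol{\Pi} = \boldsymbol{\alpha}_y\boldsymbol{\beta}_y' + \boldsymbol{\alpha}\boldsymbol{\beta}'$, where $\boldsymbol{\alpha}_y = (\alpha_{yy}, \mathbf{0}')'$ and $\boldsymbol{\beta}_y = (\beta_{yy}, \boldsymbol{\beta}_{yx}')'$
are $(k+1)$-vectors, the former of which preserves Assumption 3. For this case, the columns of $\boldsymbol{\alpha}^{\perp}$
and $\boldsymbol{\beta}^{\perp}$ form respective bases for the orthogonal complements of $(\boldsymbol{\alpha}_y, \boldsymbol{\alpha})$ and $(\boldsymbol{\beta}_y, \boldsymbol{\beta})$; in particular,
$\boldsymbol{\alpha}^{\perp\prime}(\boldsymbol{\alpha}_y, \boldsymbol{\alpha}) = \mathbf{0}$ and $\boldsymbol{\beta}^{\perp\prime}(\boldsymbol{\beta}_y, \boldsymbol{\beta}) = \mathbf{0}$.
Assumption 5b. If $\mathrm{rank}(\boldsymbol{\Pi}) = r + 1$, the matrix $\boldsymbol{\alpha}^{\perp\prime}\boldsymbol{\Gamma}\boldsymbol{\beta}^{\perp}$ is full rank $k - r$, $0 \le r \le k$.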
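Assumptions 1, 3, 4 and 5a and 5b permit the two polar cases for $\{\mathbf{x}_t\}_{t=1}^{\infty}$. First, if $\{\mathbf{x}_t\}_{t=1}^{\infty}$ is a
purely I(0) vector process, then $\boldsymbol{\Pi}_{xx}$, and, hence, $\boldsymbol{\alpha}_{xx}$ and $\boldsymbol{\beta}_{xx}$, are nonsingular. Second, if $\{\mathbf{x}_t\}_{t=1}^{\infty}$
is purely I(1), then $\boldsymbol{\Pi}_{xx} = \mathbf{0}$, and, hence, $\boldsymbol{\alpha}_{xx}$ and $\boldsymbol{\beta}_{xx}$ are also null matrices.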
Using (A.1) in Appendix A, it is easily seen that $\pi_{y.x}(z_t - \mu - \gamma t) = \pi_{y.x}C^{*}(L)\varepsilon_t$, where $\{C^{*}(L)\varepsilon_t\}$ is a mean zero stationary process. Therefore, under Assumptions 1, 3, 4 and 5b, that is, $\pi_{yy} \neq 0$, it immediately follows that there exists a conditional level relationship between $y_t$ and $x_t$ defined by
$$y_t = \theta_0 + \theta_1 t + \theta x_t + v_t, \quad t = 1, 2, \ldots, \tag{10}$$
where $\theta_0 \equiv \pi_{y.x}\mu/\pi_{yy}$, $\theta_1 \equiv \pi_{y.x}\gamma/\pi_{yy}$, $\theta \equiv -\pi_{yx.x}/\pi_{yy}$ and $v_t = \pi_{y.x}C^{*}(L)\varepsilon_t/\pi_{yy}$, also a zero mean stationary process. If $\pi_{yx.x} = \alpha_{yy}\beta_{yx}' + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' \neq 0'$, the level relationship between $y_t$ and $x_t$ is non-degenerate. Hence, from (10), $y_t \sim I(0)$ if $\mathrm{rank}(\beta_{yx}, \beta_{xx}) = r$ and $y_t \sim I(1)$ if $\mathrm{rank}(\beta_{yx}, \beta_{xx}) = r + 1$. In the former case, $\theta$ is the vector of conditional long-run multipliers and, in this sense, (10) may be interpreted as a conditional long-run level relationship between $y_t$ and $x_t$, whereas, in the latter, because the processes $\{y_t\}_{t=1}^{\infty}$ and $\{x_t\}_{t=1}^{\infty}$ are cointegrated, (10) represents the conditional long-run level relationship between $y_t$ and $x_t$. Two degenerate cases arise. First, if $\pi_{yy} \neq 0$ and $\pi_{yx.x} = 0'$, clearly, from (10), $y_t$ is (trend) stationary or $y_t \sim I(0)$ whatever the value of $r$. Consequently, the differenced variable $\Delta y_t$ depends only on its own lagged level $y_{t-1}$ in the conditional ECM (8) and not on the lagged levels $x_{t-1}$ of the forcing variables. Second, if $\pi_{yy} = 0$, that is, Assumption 5a holds, and $\pi_{yx.x} = (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' \neq 0'$, as $\mathrm{rank}(\Pi) = r$, $\pi_{yx.x} = (\phi - \omega)'\alpha_{xx}\beta_{xx}'$ which, from the above, yields $\pi_{yx.x}(x_t - \mu_x - \gamma_x t) = \pi_{y.x}C^{*}(L)\varepsilon_t$, $t = 1, 2, \ldots$, where $\mu = (\mu_y, \mu_x')'$ and $\gamma = (\gamma_y, \gamma_x')'$ are partitioned conformably with $z_t = (y_t, x_t')'$. Thus, in (8), $\Delta y_t$ depends only on the lagged level $x_{t-1}$ through the linear combination $(\phi - \omega)'\alpha_{xx}$ of the lagged mutually cointegrating relations $\beta_{xx}'x_{t-1}$ for the process $\{x_t\}_{t=1}^{\infty}$. Consequently, $y_t \sim I(1)$ whatever the value of $r$. Finally, if both $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$, there are no level effects in the conditional ECM (8) with no possibility of any level relationship between $y_t$ and $x_t$, degenerate or otherwise, and, again, $y_t \sim I(1)$ whatever the value of $r$.
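The taxonomy above can be summarised in a few lines of code. The following is only an illustrative sketch, not part of the paper: the function name `classify_level_relationship`, the tolerance `tol` and the numerical inputs are hypothetical, and the inputs are treated as known population coefficients on the lagged levels of $y_t$ and $x_t$ in the conditional ECM (8) rather than as estimates.

```python
import numpy as np

def classify_level_relationship(pi_yy, pi_yx_x, tol=1e-12):
    """Classify the level relationship implied by the ECM level coefficients.

    pi_yy   : scalar coefficient on y_{t-1} in the conditional ECM
    pi_yx_x : length-k vector of coefficients on x_{t-1}
    Returns (label, theta), where theta = -pi_yx_x / pi_yy is the vector of
    level coefficients in (10) when pi_yy is non-zero, and None otherwise.
    """
    pi_yx_x = np.asarray(pi_yx_x, dtype=float)
    y_enters = abs(pi_yy) > tol
    x_enters = bool(np.any(np.abs(pi_yx_x) > tol))
    theta = -pi_yx_x / pi_yy if y_enters else None
    if y_enters and x_enters:
        label = "non-degenerate level relationship"
    elif y_enters:
        label = "degenerate: only the lagged level of y enters"
    elif x_enters:
        label = "degenerate: only the lagged levels of x enter"
    else:
        label = "no level relationship"
    return label, theta

# Hypothetical coefficients, chosen only to illustrate the mapping.
label, theta = classify_level_relationship(-0.5, [0.2, -0.1])
```

With sample estimates, exact zero checks are not meaningful; the question of whether the level coefficients are zero is what the tests developed below address.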
Therefore, in order to test for the absence of level effects in the conditional ECM (8) and, more crucially, the absence of a level relationship between $y_t$ and $x_t$, the emphasis in this paper is a test of the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ in (8).$^{7,8}$ In contradistinction, the approach of BDM may be described in terms of (8) written as
$$\Delta y_t = c_0 + c_1 t + \alpha_{yy}(\beta_{yy}y_{t-1} + \beta_{yx}'x_{t-1}) + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}'x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t. \tag{11}$$
BDM test for the exclusion of $y_{t-1}$ in (11) when $r = 0$, that is, $\beta_{xx} = 0$ in (11) or $\Pi_{xx} = 0$ in (7) and, thus, $\{x_t\}$ is purely $I(1)$; cf. HJNR and PSS.$^{9}$ Therefore, BDM consider the hypothesis $\alpha_{yy} = 0$ (or $\pi_{yy} = 0$).$^{10}$ More generally, when $0 < r \le k$, BDM require the imposition of the untested subsidiary hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distribution of the BDM test is obtained under the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = 0$ in (8).
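In the following sections of the paper, we focus on (8) and differentiate between five cases of interest, delineated according to how the deterministic components are specified:
• Case I (no intercepts; no trends) $c_0 = 0$ and $c_1 = 0$. That is, $\mu = 0$ and $\gamma = 0$. Hence, the ECM (8) becomes
$$\Delta y_t = \pi_{yy}y_{t-1} + \pi_{yx.x}x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t. \tag{12}$$
• Case II (restricted intercepts; no trends) $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ and $c_1 = 0$. Here, $\gamma = 0$. The ECM is
$$\Delta y_t = \pi_{yy}(y_{t-1} - \mu_y) + \pi_{yx.x}(x_{t-1} - \mu_x) + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t. \tag{13}$$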
7 This joint hypothesis may be justified by the application of Roy's union-intersection principle to tests of $\pi_{yy} = 0$ in (8) given $\pi_{yx.x}$. Let $W_{\pi_{yy}}(\pi_{yx.x})$ be the Wald statistic for testing $\pi_{yy} = 0$ for a given value of $\pi_{yx.x}$. The test $\max_{\pi_{yx.x}} W_{\pi_{yy}}(\pi_{yx.x})$ is identical to the Wald test of $\pi_{yy} = 0$ and $\pi_{yx.x} = 0$ in (8).
8 A related approach to that of this paper is Hansen's (1995) test for a unit root in a univariate time series which, in our context, would require the imposition of the subsidiary hypothesis $\pi_{yx.x} = 0'$.
9 The BDM test is based on earlier contributions of Kremers et al. (1992), Banerjee et al. (1993), and Boswijk (1994).
10 Partitioning $\Gamma_{xi} = (\gamma_{xy,i}, \Gamma_{xx,i})$, $i = 1, \ldots, p-1$, conformably with $z_t = (y_t, x_t')'$, BDM also set $\gamma_{xy,i} = 0$, $i = 1, \ldots, p-1$, which implies $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$; that is, $\Delta y_t$ does not Granger cause $\Delta x_t$.
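• Case III (unrestricted intercepts; no trends) $c_0 \neq 0$ and $c_1 = 0$. Again, $\gamma = 0$. Now, the intercept restriction $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ is ignored and the ECM is
$$\Delta y_t = c_0 + \pi_{yy}y_{t-1} + \pi_{yx.x}x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t. \tag{14}$$
• Case IV (unrestricted intercepts; restricted trends) $c_0 \neq 0$ and $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$.
$$\Delta y_t = c_0 + \pi_{yy}(y_{t-1} - \gamma_y t) + \pi_{yx.x}(x_{t-1} - \gamma_x t) + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t. \tag{15}$$
• Case V (unrestricted intercepts; unrestricted trends) $c_0 \neq 0$ and $c_1 \neq 0$. Here, the deterministic trend restriction $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$ is ignored and the ECM is
$$\Delta y_t = c_0 + c_1 t + \pi_{yy}y_{t-1} + \pi_{yx.x}x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t. \tag{16}$$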
It should be emphasized that the DGPs for Cases II and III are treated as identical, as are those for Cases IV and V. However, as in the test of Dickey and Fuller (1979) compared with that of Dickey and Fuller (1981) for univariate models, estimation and hypothesis testing in Cases III and V proceed ignoring the constraints linking respectively the intercept and trend coefficient, $c_0$ and $c_1$, to the parameter vector $(\pi_{yy}, \pi_{yx.x})$, whereas Cases II and IV fully incorporate these restrictions.

In this section we develop bounds procedures for testing for the existence of a level relationship between $y_t$ and $x_t$ using (12)–(16); see (10). The main approach taken here, cf. Engle and Granger (1987) and BDM, is to test for the absence of any level relationship between $y_t$ and $x_t$ via the exclusion of the lagged level variables $y_{t-1}$ and $x_{t-1}$ in (12)–(16). Consequently, we define the constituent null hypotheses $H_0^{\pi_{yy}}: \pi_{yy} = 0$, $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$, and alternative hypotheses $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$. Hence, the joint null hypothesis of interest in (12)–(16) is given by:
$$H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}} \tag{17}$$
and the alternative hypothesis is correspondingly stated as:
$$H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}} \tag{18}$$
However, as indicated in Section 2, not only does the alternative hypothesis $H_1$ of (18) cover the case of interest in which $\pi_{yy} \neq 0$ and $\pi_{yx.x} \neq 0'$ but also permits $\pi_{yy} \neq 0$, $\pi_{yx.x} = 0'$ and $\pi_{yy} = 0$ and $\pi_{yx.x} \neq 0'$; cf. (8). That is, the possibility of degenerate level relationships between $y_t$ and $x_t$ is admitted under $H_1$ of (18). We comment further on these alternatives at the end of this section.
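For ease of exposition, we consider Case IV and rewrite (15) in matrix notation as
$$\Delta y = \iota_T c_0 + Z_{-1}^{*}\pi_{y.x}^{*} + \Delta Z_{-}\psi + u, \tag{19}$$
where $\iota_T$ is a $T$-vector of ones, $\Delta y \equiv (\Delta y_1, \ldots, \Delta y_T)'$, $\Delta X \equiv (\Delta x_1, \ldots, \Delta x_T)'$, $\Delta Z_{-i} \equiv (\Delta z_{1-i}, \ldots, \Delta z_{T-i})'$, $i = 1, \ldots, p-1$, $\psi \equiv (\omega', \psi_1', \ldots, \psi_{p-1}')'$, $\Delta Z_{-} \equiv (\Delta X, \Delta Z_{-1}, \ldots, \Delta Z_{1-p})$, $Z_{-1}^{*} \equiv (\tau_T, Z_{-1})$, $\tau_T \equiv (1, \ldots, T)'$, $Z_{-1} \equiv (z_0, \ldots, z_{T-1})'$, $u \equiv (u_1, \ldots, u_T)'$ and
$$\pi_{y.x}^{*} = \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\begin{pmatrix} \pi_{yy} \\ \pi_{yx.x}' \end{pmatrix}.$$
The least squares (LS) estimator of $\pi_{y.x}^{*}$ is given by:
$$\hat\pi_{y.x}^{*} \equiv \left(\tilde Z_{-1}^{*\prime}\bar P_{\Delta Z_{-}}\tilde Z_{-1}^{*}\right)^{-1}\tilde Z_{-1}^{*\prime}\bar P_{\Delta Z_{-}}\Delta\tilde y, \tag{20}$$
where $\tilde Z_{-1}^{*} \equiv P_{\iota}Z_{-1}^{*}$, $\Delta\tilde Z_{-} \equiv P_{\iota}\Delta Z_{-}$, $\Delta\tilde y \equiv P_{\iota}\Delta y$, $P_{\iota} \equiv I_T - \iota_T(\iota_T'\iota_T)^{-1}\iota_T'$ and $\bar P_{\Delta Z_{-}} \equiv I_T - \Delta\tilde Z_{-}(\Delta\tilde Z_{-}'\Delta\tilde Z_{-})^{-1}\Delta\tilde Z_{-}'$. The Wald and the F-statistics for testing the null hypothesis $H_0$ of (17) against the alternative hypothesis $H_1$ of (18) are respectively:
$$W \equiv \hat\pi_{y.x}^{*\prime}\tilde Z_{-1}^{*\prime}\bar P_{\Delta Z_{-}}\tilde Z_{-1}^{*}\hat\pi_{y.x}^{*}/\hat\omega_{uu}, \qquad F \equiv \frac{W}{k+2}, \tag{21}$$
where $\hat\omega_{uu} \equiv (T - m)^{-1}\sum_{t=1}^{T}\tilde u_t^2$, $m \equiv (k+1)(p+1)+1$ is the number of estimated coefficients and $\tilde u_t$, $t = 1, 2, \ldots, T$, are the least squares (LS) residuals from (19).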
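As a numerical illustration of (19)–(21), the sketch below computes the Wald and F-statistics for the exclusion of the lagged-level block in a Case IV regression. It is a minimal sketch under stated assumptions rather than the authors' code: the data are simulated random walks, $k = 1$ and $p = 1$ (so there are no lagged differences of $z_t$), and the helper names `rss` and `wald_and_f` are hypothetical. It relies on the standard least squares identity that the quadratic form in (21) equals the fall in the residual sum of squares when the block $(t, y_{t-1}, x_{t-1}')$ is added to the regression.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an LS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def wald_and_f(dy, levels, others):
    """Wald and F-statistics for excluding the columns of `levels`.

    dy     : (T,) dependent variable (first difference of y)
    levels : (T, q) block under test; in Case IV, (t, y_{t-1}, x_{t-1}')
    others : (T, m - q) remaining regressors (intercept, differences)
    """
    T = dy.shape[0]
    q = levels.shape[1]
    X = np.column_stack([levels, others])
    m = X.shape[1]                      # number of estimated coefficients
    rss_u = rss(dy, X)                  # unrestricted regression
    rss_r = rss(dy, others)             # regression with the block excluded
    omega_uu = rss_u / (T - m)          # error variance estimate
    wald = (rss_r - rss_u) / omega_uu
    return wald, wald / q

# Illustrative data (assumption): independent random walks, k = 1, p = 1.
rng = np.random.default_rng(0)
T, k = 200, 1
y = np.cumsum(rng.standard_normal(T + 1))
x = np.cumsum(rng.standard_normal((T + 1, k)), axis=0)
dy, dx = np.diff(y), np.diff(x, axis=0)
trend = np.arange(1, T + 1, dtype=float)
levels = np.column_stack([trend, y[:-1], x[:-1]])   # q = k + 2 columns
others = np.column_stack([np.ones(T), dx])          # intercept and dx_t
wald, f_stat = wald_and_f(dy, levels, others)
```

Here `levels` has $k + 2$ columns, so the F-statistic is the Wald statistic divided by $k + 2$, and the total number of regressors is $(k+1)(p+1)+1 = 5$.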
The next theorem presents the asymptotic null distribution of the Wald statistic; the limit behaviour of the F-statistic is a simple corollary and is not presented here or subsequently. Let $W_{k-r+1}(a) \equiv (W_u(a), W_{k-r}(a)')'$ denote a $(k-r+1)$-dimensional standard Brownian motion partitioned into the scalar and $(k-r)$-dimensional sub-vector independent standard Brownian motions $W_u(a)$ and $W_{k-r}(a)$, $a \in [0, 1]$. We will also require the corresponding de-meaned $(k-r+1)$-vector standard Brownian motion $\tilde W_{k-r+1}(a) \equiv W_{k-r+1}(a) - \int_0^1 W_{k-r+1}(a)\,da$, and de-meaned and de-trended $(k-r+1)$-vector standard Brownian motion $\hat W_{k-r+1}(a) \equiv \tilde W_{k-r+1}(a) - 12\left(a - \tfrac{1}{2}\right)\int_0^1\left(a - \tfrac{1}{2}\right)\tilde W_{k-r+1}(a)\,da$, and their respective partitioned counterparts $\tilde W_{k-r+1}(a) = (\tilde W_u(a), \tilde W_{k-r}(a)')'$, and $\hat W_{k-r+1}(a) = (\hat W_u(a), \hat W_{k-r}(a)')'$, $a \in [0, 1]$.
Theorem 3.1 (Limiting distribution of W) If Assumptions 1–4 and 5a hold, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of (21) has the representation
$$W \Rightarrow z_r'z_r + \int_0^1 dW_u(a)F_{k-r+1}(a)'\left(\int_0^1 F_{k-r+1}(a)F_{k-r+1}(a)'\,da\right)^{-1}\int_0^1 F_{k-r+1}(a)\,dW_u(a), \tag{22}$$
where $z_r \sim N(0, I_r)$ is distributed independently of the second term in (22) and
$$F_{k-r+1}(a) = \begin{cases} W_{k-r+1}(a) & \text{Case I} \\ (W_{k-r+1}(a)', 1)' & \text{Case II} \\ \tilde W_{k-r+1}(a) & \text{Case III} \\ (\tilde W_{k-r+1}(a)', a - \tfrac{1}{2})' & \text{Case IV} \\ \hat W_{k-r+1}(a) & \text{Case V} \end{cases}$$
$r = 0, \ldots, k$, and Cases I–V are defined in (12)–(16), $a \in [0, 1]$.
The asymptotic distribution of the Wald statistic $W$ of (21) depends on the dimension and cointegration rank of the forcing variables $\{x_t\}$, $k$ and $r$ respectively. In Case IV, referring to (11), the first component in (22), $z_r'z_r \sim \chi^2(r)$, corresponds to testing for the exclusion of the $r$-dimensional stationary vector $\beta_{xx}'x_{t-1}$, that is, the hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$, whereas the second term in (22), which is a non-standard Dickey–Fuller unit-root distribution, corresponds to testing for the exclusion of the $(k-r+1)$-dimensional $I(1)$ vector $(\beta_y^{\perp}, \beta^{\perp})'z_{t-1}$ and, in Cases II and IV, the intercept and time-trend respectively or, equivalently, $\alpha_{yy} = 0$.
We specialize Theorem 3.1 to the two polar cases in which, first, the process for the forcing variables $\{x_t\}$ is purely integrated of order zero, that is, $r = k$ and $\Pi_{xx}$ is of full rank, and, second, the $\{x_t\}$ process is not mutually cointegrated, $r = 0$, and, hence, the $\{x_t\}$ process is purely integrated of order one.
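Corollary 3.1 (Limiting distribution of W if $\{x_t\} \sim I(0)$). If Assumptions 1–4 and 5a hold and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of (21) has the representation
$$W \Rightarrow z_k'z_k + \frac{\left(\int_0^1 F(a)\,dW_u(a)\right)^2}{\int_0^1 F(a)^2\,da}, \tag{23}$$
where $z_k \sim N(0, I_k)$ is distributed independently of the second term in (23) and
$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ (W_u(a), 1)' & \text{Case II} \\ \tilde W_u(a) & \text{Case III} \\ (\tilde W_u(a), a - \tfrac{1}{2})' & \text{Case IV} \\ \hat W_u(a) & \text{Case V} \end{cases}$$
$r = 0, \ldots, k$, where Cases I–V are defined in (12)–(16), $a \in [0, 1]$.
Corollary 3.2 (Limiting distribution of W if $\{x_t\} \sim I(1)$). If Assumptions 1–4 and 5a hold and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of (21) has the representation
$$W \Rightarrow \int_0^1 dW_u(a)F_{k+1}(a)'\left(\int_0^1 F_{k+1}(a)F_{k+1}(a)'\,da\right)^{-1}\int_0^1 F_{k+1}(a)\,dW_u(a),$$
where $F_{k+1}(a)$ is defined in Theorem 3.1 for Cases I–V, $a \in [0, 1]$.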
In practice, however, it is unlikely that one would possess a priori knowledge of the rank $r$ of $\Pi_{xx}$; that is, the cointegration rank of the forcing variables $\{x_t\}$ or, more particularly, whether $\{x_t\} \sim I(0)$ or $\{x_t\} \sim I(1)$. Long-run analysis of (12)–(16) predicated on a prior determination of the cointegration rank $r$ in (7) is prone to the possibility of a pre-test specification error; see, for example, Cavanagh et al. (1995). However, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.1 ($r = k$ and $\{x_t\} \sim I(0)$) and 3.2 ($r = 0$ and $\{x_t\} \sim I(1)$) provide lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.1 when the cointegration rank of the forcing variables $\{x_t\}$ process is $0 \le r \le k$.$^{11}$ Hence, these two sets of critical values provide critical value bounds covering all possible classifications of $\{x_t\}$ into $I(0)$, $I(1)$ and mutually cointegrated processes. Asymptotic critical value bounds for the F-statistics covering Cases I–V are set out in Tables CI(i)–CI(v) for sizes 0.100, 0.050, 0.025 and 0.010; the lower bound values assume that the forcing variables $\{x_t\}$ are purely $I(0)$, and the upper bound values assume that $\{x_t\}$ are purely $I(1)$.$^{12}$
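Hence, we suggest a bounds procedure to test $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17) within the conditional ECMs (12)–(16). If the computed Wald or F-statistic falls outside the critical value bounds, a conclusive decision results without needing to know the cointegration rank $r$ of the $\{x_t\}$ process. If, however, the Wald or F-statistic falls within these bounds, inference would be inconclusive. In such circumstances, knowledge of the cointegration rank $r$ of the forcing variables $\{x_t\}$ is required to proceed further.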
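The decision rule of the bounds procedure can be written down directly. The sketch below is illustrative only; the function name `bounds_decision` and the three test-statistic values are hypothetical, while the critical value bounds are the 0.050 entries for $k = 2$ in Table CI(iii).

```python
def bounds_decision(stat, lower, upper):
    """Classify a Wald or F-statistic against critical value bounds.

    lower : bound obtained when the forcing variables are purely I(0)
    upper : bound obtained when the forcing variables are purely I(1)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if stat > upper:
        return "reject H0"          # conclusive whatever the rank r
    if stat < lower:
        return "do not reject H0"   # conclusive whatever the rank r
    return "inconclusive"           # knowledge of r would be needed

# 0.050 bounds for the F-statistic, Case III, k = 2 (Table CI(iii)).
LOWER, UPPER = 3.79, 4.85
results = [bounds_decision(s, LOWER, UPPER) for s in (2.5, 4.2, 6.1)]
```

The same function applies to the Wald statistic provided both bounds are scaled as described in the footnote on Wald critical values.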
The conditional ECMs (12)–(16), derived from the underlying VAR(p) model (2), may also be
interpreted as an autoregressive distributed lag model of orders (p, p, . . . , p) (ARDL(p, . . . , p)).
However, one could also allow for differential lag lengths on the lagged variables yti and
xti in (2) to arrive at, for example, an ARDL(p, p1 , . . . , pk ) model without affecting the
one can use a flexible choice for the dynamic lag structure in (12)–(16) as well as allowing
for short-run feedbacks from the lagged dependent variables, yti , i D 1, . . . , p, to xt in
(1992, 1995), PSS, and Urbain (1992), where it is assumed in addition that 5xx D 0 or xt is purely
I1 in (7).
the deterministics given by (12), (14) and (16). Note that the restrictions on the deterministics’
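but do not test the implicit hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distributional results given below are also obtained under the joint hypothesis $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17). BDM test $\alpha_{yy} = 0$ (or $H_0^{\pi_{yy}}: \pi_{yy} = 0$) via the exclusion of $y_{t-1}$ in Cases I, III and V. For example, in Case V, the t-statistic is
$$t_{\pi_{yy}} = \frac{\hat y_{-1}'\bar P_{\Delta\hat Z_{-},\hat X_{-1}}\Delta\hat y}{\hat\omega_{uu}^{1/2}\left(\hat y_{-1}'\bar P_{\Delta\hat Z_{-},\hat X_{-1}}\hat y_{-1}\right)^{1/2}}, \tag{24}$$
where $\hat\omega_{uu}$ is defined in the line after (21), $\Delta\hat y \equiv P_{\iota_T,\tau_T}\Delta y$, $\hat y_{-1} \equiv P_{\iota_T,\tau_T}y_{-1}$, $y_{-1} \equiv (y_0, \ldots, y_{T-1})'$, $\hat X_{-1} \equiv P_{\iota_T,\tau_T}X_{-1}$, $X_{-1} \equiv (x_0, \ldots, x_{T-1})'$, $\Delta\hat Z_{-} \equiv P_{\iota_T,\tau_T}\Delta Z_{-}$, $P_{\iota_T,\tau_T} \equiv P_{\iota_T} - P_{\iota_T}\tau_T(\tau_T'P_{\iota_T}\tau_T)^{-1}\tau_T'P_{\iota_T}$, $\bar P_{\Delta\hat Z_{-},\hat X_{-1}} = \bar P_{\Delta\hat Z_{-}} - \bar P_{\Delta\hat Z_{-}}\hat X_{-1}(\hat X_{-1}'\bar P_{\Delta\hat Z_{-}}\hat X_{-1})^{-1}\hat X_{-1}'\bar P_{\Delta\hat Z_{-}}$ and $\bar P_{\Delta\hat Z_{-}} \equiv I_T - \Delta\hat Z_{-}(\Delta\hat Z_{-}'\Delta\hat Z_{-})^{-1}\Delta\hat Z_{-}'$.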

simulations with different combinations of values for $k$ and $0 \le r \le k$.
12 The critical values for the Wald version of the bounds test are given by $k + 1$ times the critical values of the F-test in Cases I, III and V, and $k + 2$ times in Cases II and IV.
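Table CI. Asymptotic critical value bounds for the F-statistic. Testing for the existence of a levels relationship^a
Table CI(i) Case I: No intercept and no trend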
0.100 0.050 0.025 0.010 Mean Variance

k I0 I1 I0 I1 I0 I1 I0 I1 I0 I1 I0 I1
0 3.00 3.00 4.20 4.20 5.47 5.47 7.17 7.17 1.16 1.16 2.32 2.32
1 2.44 3.28 3.15 4.11 3.88 4.92 4.81 6.02 1.08 1.54 1.08 1.73
2 2.17 3.19 2.72 3.83 3.22 4.50 3.88 5.30 1.05 1.69 0.70 1.27
3 2.01 3.10 2.45 3.63 2.87 4.16 3.42 4.84 1.04 1.77 0.52 0.99
4 1.90 3.01 2.26 3.48 2.62 3.90 3.07 4.44 1.03 1.81 0.41 0.80
5 1.81 2.93 2.14 3.34 2.44 3.71 2.82 4.21 1.02 1.84 0.34 0.67
6 1.75 2.87 2.04 3.24 2.32 3.59 2.66 4.05 1.02 1.86 0.29 0.58
7 1.70 2.83 1.97 3.18 2.22 3.49 2.54 3.91 1.02 1.88 0.26 0.51
8 1.66 2.79 1.91 3.11 2.15 3.40 2.45 3.79 1.02 1.89 0.23 0.46
9 1.63 2.75 1.86 3.05 2.08 3.33 2.34 3.68 1.02 1.90 0.20 0.41
10 1.60 2.72 1.82 2.99 2.02 3.27 2.26 3.60 1.02 1.91 0.19 0.37
Table CI(ii) Case II: Restricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I0 I1 I0 I1 I0 I1 I0 I1 I0 I1 I0 I1
0 3.80 3.80 4.60 4.60 5.39 5.39 6.44 6.44 2.03 2.03 1.77 1.77
1 3.02 3.51 3.62 4.16 4.18 4.79 4.94 5.58 1.69 2.02 1.01 1.25
2 2.63 3.35 3.10 3.87 3.55 4.38 4.13 5.00 1.52 2.02 0.69 0.96
3 2.37 3.20 2.79 3.67 3.15 4.08 3.65 4.66 1.41 2.02 0.52 0.78
4 2.20 3.09 2.56 3.49 2.88 3.87 3.29 4.37 1.34 2.01 0.42 0.65
5 2.08 3.00 2.39 3.38 2.70 3.73 3.06 4.15 1.29 2.00 0.35 0.56
6 1.99 2.94 2.27 3.28 2.55 3.61 2.88 3.99 1.26 2.00 0.30 0.49
7 1.92 2.89 2.17 3.21 2.43 3.51 2.73 3.90 1.23 2.01 0.26 0.44
8 1.85 2.85 2.11 3.15 2.33 3.42 2.62 3.77 1.21 2.01 0.23 0.40
9 1.80 2.80 2.04 3.08 2.24 3.35 2.50 3.68 1.19 2.01 0.21 0.36
10 1.76 2.77 1.98 3.04 2.18 3.28 2.41 3.61 1.17 2.00 0.19 0.33
Table CI(iii) Case III: Unrestricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I0 I1 I0 I1 I0 I1 I0 I1 I0 I1 I0 I1
0 6.58 6.58 8.21 8.21 9.80 9.80 11.79 11.79 3.05 3.05 7.07 7.07
1 4.04 4.78 4.94 5.73 5.77 6.68 6.84 7.84 2.03 2.52 2.28 2.89
2 3.17 4.14 3.79 4.85 4.41 5.52 5.15 6.36 1.69 2.35 1.23 1.77
3 2.72 3.77 3.23 4.35 3.69 4.89 4.29 5.61 1.51 2.26 0.82 1.27
4 2.45 3.52 2.86 4.01 3.25 4.49 3.74 5.06 1.41 2.21 0.60 0.98
5 2.26 3.35 2.62 3.79 2.96 4.18 3.41 4.68 1.34 2.17 0.48 0.79
6 2.12 3.23 2.45 3.61 2.75 3.99 3.15 4.43 1.29 2.14 0.39 0.66
7 2.03 3.13 2.32 3.50 2.60 3.84 2.96 4.26 1.26 2.13 0.33 0.58
8 1.95 3.06 2.22 3.39 2.48 3.70 2.79 4.10 1.23 2.12 0.29 0.51
9 1.88 2.99 2.14 3.30 2.37 3.60 2.65 3.97 1.21 2.10 0.25 0.45
10 1.83 2.94 2.06 3.24 2.28 3.50 2.54 3.86 1.19 2.09 0.23 0.41
(Continued overleaf )
Table CI. (Continued )
Table CI(iv) Case IV: Unrestricted intercept and restricted trend
0.100 0.050 0.025 0.010 Mean Variance

k I0 I1 I0 I1 I0 I1 I0 I1 I0 I1 I0 I1
0 5.37 5.37 6.29 6.29 7.14 7.14 8.26 8.26 3.17 3.17 2.68 2.68
1 4.05 4.49 4.68 5.15 5.30 5.83 6.10 6.73 2.45 2.77 1.41 1.65
2 3.38 4.02 3.88 4.61 4.37 5.16 4.99 5.85 2.09 2.57 0.92 1.20
3 2.97 3.74 3.38 4.23 3.80 4.68 4.30 5.23 1.87 2.45 0.67 0.93
4 2.68 3.53 3.05 3.97 3.40 4.36 3.81 4.92 1.72 2.37 0.51 0.76
5 2.49 3.38 2.81 3.76 3.11 4.13 3.50 4.63 1.62 2.31 0.42 0.64
6 2.33 3.25 2.63 3.62 2.90 3.94 3.27 4.39 1.54 2.27 0.35 0.55
7 2.22 3.17 2.50 3.50 2.76 3.81 3.07 4.23 1.48 2.24 0.31 0.49
8 2.13 3.09 2.38 3.41 2.62 3.70 2.93 4.06 1.44 2.22 0.27 0.44
9 2.05 3.02 2.30 3.33 2.52 3.60 2.79 3.93 1.40 2.20 0.24 0.40
10 1.98 2.97 2.21 3.25 2.42 3.52 2.68 3.84 1.36 2.18 0.22 0.36
Table CI(v) Case V: Unrestricted intercept and unrestricted trend
0.100 0.050 0.025 0.010 Mean Variance

k I0 I1 I0 I1 I0 I1 I0 I1 I0 I1 I0 I1
0 9.81 9.81 11.64 11.64 13.36 13.36 15.73 15.73 5.33 5.33 11.35 11.35
1 5.59 6.26 6.56 7.30 7.46 8.27 8.74 9.63 3.17 3.64 3.33 3.91
2 4.19 5.06 4.87 5.85 5.49 6.59 6.34 7.52 2.44 3.09 1.70 2.23
3 3.47 4.45 4.01 5.07 4.52 5.62 5.17 6.36 2.08 2.81 1.08 1.51
4 3.03 4.06 3.47 4.57 3.89 5.07 4.40 5.72 1.86 2.64 0.77 1.14
5 2.75 3.79 3.12 4.25 3.47 4.67 3.93 5.23 1.72 2.53 0.59 0.91
6 2.53 3.59 2.87 4.00 3.19 4.38 3.60 4.90 1.62 2.45 0.48 0.75
7 2.38 3.45 2.69 3.83 2.98 4.16 3.34 4.63 1.54 2.39 0.40 0.64
8 2.26 3.34 2.55 3.68 2.82 4.02 3.15 4.43 1.48 2.35 0.34 0.56
9 2.16 3.24 2.43 3.56 2.67 3.87 2.97 4.24 1.43 2.31 0.30 0.49
10 2.07 3.16 2.33 3.46 2.56 3.76 2.84 4.10 1.40 2.28 0.26 0.44
a The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the F-statistic for testing $\phi = 0$ in the regression: $\Delta y_t = \phi'z_{t-1} + a'w_t + \xi_t$, $t = 1, \ldots, T$, where $x_t = (x_{1t}, \ldots, x_{kt})'$ and
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = 0$ Case I
$z_{t-1} = (y_{t-1}, x_{t-1}', 1)'$, $w_t = 0$ Case II
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = 1$ Case III
$z_{t-1} = (y_{t-1}, x_{t-1}', t)'$, $w_t = 1$ Case IV
$z_{t-1} = (y_{t-1}, x_{t-1}')'$, $w_t = (1, t)'$ Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = Px_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k+1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to the squares of the critical values of Dickey and Fuller's (1979) unit root t-statistics for Cases I, III and V, while they match those for Dickey and Fuller's (1981) unit root F-statistics for Cases II and IV. The columns headed 'I(0)' refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns headed 'I(1)' refer to the upper bound obtained when $x_t$ is purely $I(1)$.
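The simulation design described in the table note can be sketched as follows for Case III. This is an illustrative sketch, not a reproduction of the tabulated values: to keep the run time short it assumes a much smaller sample size and number of replications ($T = 200$, 2,000 draws) than the $T = 1000$ and 40,000 replications used for the tables, so its output only approximates the tabulated bounds. The function names `rss` and `simulate_f_quantile` and the seed are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an LS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def simulate_f_quantile(k, x_is_i1, T=200, reps=2000, prob=0.95, seed=0):
    """Simulated quantile of the Case III F-statistic for phi = 0.

    Regression: dy_t = phi' (y_{t-1}, x_{t-1}')' + a * 1 + error,
    with y a random walk and x either a random walk (P = I_k) or
    white noise (P = 0), both started from zero.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    w = np.ones((T, 1))                       # unrestricted intercept
    q = k + 1                                 # number of restrictions
    for i in range(reps):
        eps = rng.standard_normal((T, k + 1))
        dy = eps[:, 0]                        # dy_t = eps_1t
        y = np.cumsum(dy)
        x = np.cumsum(eps[:, 1:], axis=0) if x_is_i1 else eps[:, 1:]
        y_lag = np.concatenate([[0.0], y[:-1]])          # y_0 = 0
        x_lag = np.vstack([np.zeros((1, k)), x[:-1]])    # x_0 = 0
        z_lag = np.column_stack([y_lag, x_lag])
        rss_u = rss(dy, np.column_stack([z_lag, w]))
        rss_r = rss(dy, w)
        stats[i] = ((rss_r - rss_u) / q) / (rss_u / (T - q - 1))
    return float(np.quantile(stats, prob))

# 0.050-level bounds for k = 2: lower from I(0) regressors, upper from I(1).
lower = simulate_f_quantile(k=2, x_is_i1=False)
upper = simulate_f_quantile(k=2, x_is_i1=True)
```

The two quantiles can be set against the $k = 2$, 0.050 entries of Table CI(iii); agreement is only approximate because of the reduced simulation size.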
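Theorem 3.2 (Limiting distribution of $t_{\pi_{yy}}$). If Assumptions 1–4 and 5a hold and $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$t_{\pi_{yy}} \Rightarrow \int_0^1 dW_u(a)F_{k-r}(a)\left(\int_0^1 F_{k-r}(a)^2\,da\right)^{-1/2}, \tag{25}$$
where
$$F_{k-r}(a) = \begin{cases} W_u(a) - \int_0^1 W_u(a)W_{k-r}(a)'\,da\left(\int_0^1 W_{k-r}(a)W_{k-r}(a)'\,da\right)^{-1}W_{k-r}(a) & \text{Case I} \\ \tilde W_u(a) - \int_0^1 \tilde W_u(a)\tilde W_{k-r}(a)'\,da\left(\int_0^1 \tilde W_{k-r}(a)\tilde W_{k-r}(a)'\,da\right)^{-1}\tilde W_{k-r}(a) & \text{Case III} \\ \hat W_u(a) - \int_0^1 \hat W_u(a)\hat W_{k-r}(a)'\,da\left(\int_0^1 \hat W_{k-r}(a)\hat W_{k-r}(a)'\,da\right)^{-1}\hat W_{k-r}(a) & \text{Case V} \end{cases}$$
$r = 0, \ldots, k$, and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.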
The form of the asymptotic representation (25) is similar to that of a Dickey–Fuller test for a unit root except that the standard Brownian motion $W_u(a)$ is replaced by the residual from an asymptotic regression of $W_u(a)$ on the independent $(k-r)$-vector standard Brownian motion $W_{k-r}(a)$ (or their de-meaned and de-meaned and de-trended counterparts).
Similarly to the analysis following Theorem 3.1, we detail the limiting distribution of the t-statistic $t_{\pi_{yy}}$ in the two polar cases in which the forcing variables $\{x_t\}$ are purely integrated of order zero and order one respectively.
Corollary 3.3 (Limiting distribution of $t_{\pi_{yy}}$ if $\{x_t\} \sim I(0)$). If Assumptions 1–4 and 5a hold and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$t_{\pi_{yy}} \Rightarrow \int_0^1 dW_u(a)F(a)\left(\int_0^1 F(a)^2\,da\right)^{-1/2},$$
where
$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ \tilde W_u(a) & \text{Case III} \\ \hat W_u(a) & \text{Case V} \end{cases}$$
and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.
Corollary 3.4 (Limiting distribution of $t_{\pi_{yy}}$ if $\{x_t\} \sim I(1)$). If Assumptions 1–4 and 5a hold, $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation
$$t_{\pi_{yy}} \Rightarrow \int_0^1 dW_u(a)F_k(a)\left(\int_0^1 F_k(a)^2\,da\right)^{-1/2},$$
where $F_k(a)$ is defined in Theorem 3.2 for Cases I, III and V, $a \in [0, 1]$.
As above, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.3 ($r = k$ and $\{x_t\}$ is purely $I(0)$) and 3.4 ($r = 0$ and $\{x_t\}$ is purely $I(1)$) provide lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.2. Hence, a bounds procedure for testing $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on these two polar cases may be implemented as described above based on the t-statistic $t_{\pi_{yy}}$ for the exclusion of $y_{t-1}$ in the conditional ECMs (12), (14) and (16) without prior knowledge of the cointegrating rank $r$.$^{13}$ These asymptotic critical value bounds are given in Tables CII(i), CII(iii) and CII(v) for Cases I, III and V.
As is emphasized in the Proof of Theorem 3.2 given in Appendix A, if the asymptotic analysis for the t-statistic $t_{\pi_{yy}}$ of (24) is conducted under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ only, the resultant limit distribution for $t_{\pi_{yy}}$ depends on the nuisance parameter $\omega - \phi$ in addition to the cointegrating rank $r$, where, under Assumption 5a, $\alpha_{yx} - \phi'\alpha_{xx} = 0'$. Moreover, if $\Delta y_t$ is allowed to Granger-cause $\Delta x_t$, that is, $\gamma_{xy,i} \neq 0$ for some $i = 1, \ldots, p-1$, then the limit distribution also is dependent on the nuisance parameter $\gamma_{xy}/(\gamma_{yy} - \phi'\gamma_{xy})$; see Appendix A. Consequently, in general, where $\omega \neq \phi$ or $\gamma_{xy} \neq 0$,
Table CII. Asymptotic critical value bounds of the t-statistic. Testing for the existence of a levels relationship^a
Table CII(i) Case I: No intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -1.62 -1.62 -1.95 -1.95 -2.24 -2.24 -2.58 -2.58 -0.42 -0.42 0.98 0.98
1 -1.62 -2.28 -1.95 -2.60 -2.24 -2.90 -2.58 -3.22 -0.42 -0.98 0.98 1.12
2 -1.62 -2.68 -1.95 -3.02 -2.24 -3.31 -2.58 -3.66 -0.42 -1.39 0.98 1.12
3 -1.62 -3.00 -1.95 -3.33 -2.24 -3.64 -2.58 -3.97 -0.42 -1.71 0.98 1.09
4 -1.62 -3.26 -1.95 -3.60 -2.24 -3.89 -2.58 -4.23 -0.42 -1.98 0.98 1.07
5 -1.62 -3.49 -1.95 -3.83 -2.24 -4.12 -2.58 -4.44 -0.42 -2.22 0.98 1.05
6 -1.62 -3.70 -1.95 -4.04 -2.24 -4.34 -2.58 -4.67 -0.42 -2.43 0.98 1.04
7 -1.62 -3.90 -1.95 -4.23 -2.24 -4.54 -2.58 -4.88 -0.42 -2.63 0.98 1.04
8 -1.62 -4.09 -1.95 -4.43 -2.24 -4.72 -2.58 -5.07 -0.42 -2.81 0.98 1.04
9 -1.62 -4.26 -1.95 -4.61 -2.24 -4.89 -2.58 -5.25 -0.42 -2.98 0.98 1.04
10 -1.62 -4.42 -1.95 -4.76 -2.24 -5.06 -2.58 -5.44 -0.42 -3.15 0.98 1.03
Table CII(iii) Case III: Unrestricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -2.57 -2.57 -2.86 -2.86 -3.13 -3.13 -3.43 -3.43 -1.53 -1.53 0.72 0.71
1 -2.57 -2.91 -2.86 -3.22 -3.13 -3.50 -3.43 -3.82 -1.53 -1.80 0.72 0.81
2 -2.57 -3.21 -2.86 -3.53 -3.13 -3.80 -3.43 -4.10 -1.53 -2.04 0.72 0.86
3 -2.57 -3.46 -2.86 -3.78 -3.13 -4.05 -3.43 -4.37 -1.53 -2.26 0.72 0.89
4 -2.57 -3.66 -2.86 -3.99 -3.13 -4.26 -3.43 -4.60 -1.53 -2.47 0.72 0.91
5 -2.57 -3.86 -2.86 -4.19 -3.13 -4.46 -3.43 -4.79 -1.53 -2.65 0.72 0.92
6 -2.57 -4.04 -2.86 -4.38 -3.13 -4.66 -3.43 -4.99 -1.53 -2.83 0.72 0.93
7 -2.57 -4.23 -2.86 -4.57 -3.13 -4.85 -3.43 -5.19 -1.53 -3.00 0.72 0.94
8 -2.57 -4.40 -2.86 -4.72 -3.13 -5.02 -3.43 -5.37 -1.53 -3.16 0.72 0.96
9 -2.57 -4.56 -2.86 -4.88 -3.13 -5.18 -3.42 -5.54 -1.53 -3.31 0.72 0.96
10 -2.57 -4.69 -2.86 -5.03 -3.13 -5.34 -3.43 -5.68 -1.53 -3.46 0.72 0.96
13 Although Corollary 3.3 does not require $\gamma_{xy} = 0$ and $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is automatically satisfied under the conditions of Corollary 3.4, the simulation critical value bounds result requires $\gamma_{xy} = 0$ and $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ for $0 < r < k$.
Table CII. (Continued)
Table CII(v) Case V: Unrestricted intercept and unrestricted trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -3.13 -3.13 -3.41 -3.41 -3.65 -3.66 -3.96 -3.97 -2.18 -2.18 0.57 0.57
1 -3.13 -3.40 -3.41 -3.69 -3.65 -3.96 -3.96 -4.26 -2.18 -2.37 0.57 0.67
2 -3.13 -3.63 -3.41 -3.95 -3.65 -4.20 -3.96 -4.53 -2.18 -2.55 0.57 0.74
3 -3.13 -3.84 -3.41 -4.16 -3.65 -4.42 -3.96 -4.73 -2.18 -2.72 0.57 0.79
4 -3.13 -4.04 -3.41 -4.36 -3.65 -4.62 -3.96 -4.96 -2.18 -2.89 0.57 0.82
5 -3.13 -4.21 -3.41 -4.52 -3.65 -4.79 -3.96 -5.13 -2.18 -3.04 0.57 0.85
6 -3.13 -4.37 -3.41 -4.69 -3.65 -4.96 -3.96 -5.31 -2.18 -3.20 0.57 0.87
7 -3.13 -4.53 -3.41 -4.85 -3.65 -5.14 -3.96 -5.49 -2.18 -3.34 0.57 0.88
8 -3.13 -4.68 -3.41 -5.01 -3.65 -5.30 -3.96 -5.65 -2.18 -3.49 0.57 0.90
9 -3.13 -4.82 -3.41 -5.15 -3.65 -5.44 -3.96 -5.79 -2.18 -3.62 0.57 0.91
10 -3.13 -4.96 -3.41 -5.29 -3.65 -5.59 -3.96 -5.94 -2.18 -3.75 0.57 0.92
a The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the t-statistic for testing $\phi = 0$ in the regression: $\Delta y_t = \phi y_{t-1} + \delta'x_{t-1} + a'w_t + \xi_t$, $t = 1, \ldots, T$, where $x_t = (x_{1t}, \ldots, x_{kt})'$ and
$w_t = 0$ Case I
$w_t = 1$ Case III
$w_t = (1, t)'$ Case V
The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = Px_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k+1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to those of Dickey and Fuller's (1979) unit root t-statistics. The columns headed 'I(0)' refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns headed 'I(1)' refer to the upper bound obtained when $x_t$ is purely $I(1)$.
although the t-statistic $t_{\pi_{yy}}$ has a well-defined limiting distribution under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the above bounds testing procedure for $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on $t_{\pi_{yy}}$ is not asymptotically similar.$^{14}$
Consequently, in the light of the consistency results for the above statistics discussed in Section 4, see Theorems 4.1, 4.2 and 4.4, we suggest the following procedure for ascertaining the existence of a level relationship between $y_t$ and $x_t$: test $H_0$ of (17) using the bounds procedure based on the Wald or F-statistic of (21) from Corollaries 3.1 and 3.2: (a) if $H_0$ is not rejected, proceed no further; (b) if $H_0$ is rejected, test $H_0^{\pi_{yy}}: \pi_{yy} = 0$ using the bounds procedure based on the t-statistic $t_{\pi_{yy}}$ of (24) from Corollaries 3.3 and 3.4. If $H_0^{\pi_{yy}}: \pi_{yy} = 0$ is false, a large value of $t_{\pi_{yy}}$ should result, at least asymptotically, confirming the existence of a level relationship between $y_t$ and $x_t$, which, however, may be degenerate (if $\pi_{yx.x} = 0'$).
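The two-step procedure just described might be organised as below. This is a sketch under stated assumptions: the function names and the example statistics are hypothetical, the F bounds are the 0.050 entries for $k = 2$ in Table CI(iii), and the t bounds are the corresponding entries of Table CII(iii) entered with a negative sign, as is standard for Dickey–Fuller-type t-statistics, so that a "large" t-statistic means a large negative one.

```python
def f_bounds(stat, i0_bound, i1_bound):
    """Bounds decision for the F-statistic (rejects in the upper tail)."""
    if stat > i1_bound:
        return "reject"
    if stat < i0_bound:
        return "not rejected"
    return "inconclusive"

def t_bounds(stat, i0_bound, i1_bound):
    """Bounds decision for the t-statistic (rejects in the lower tail).

    Both bounds are negative, with i1_bound <= i0_bound.
    """
    if stat < i1_bound:
        return "reject"
    if stat > i0_bound:
        return "not rejected"
    return "inconclusive"

def two_step(f_stat, t_stat, f_crit, t_crit):
    """Step (a): joint F test of H0; step (b): t test on the y_{t-1} term."""
    step_a = f_bounds(f_stat, *f_crit)
    if step_a != "reject":
        # Proceed no further unless H0 is rejected.
        return {"H0": step_a, "H0_pi_yy": None}
    return {"H0": step_a, "H0_pi_yy": t_bounds(t_stat, *t_crit)}

# 0.050 bounds, Case III, k = 2: (I(0) bound, I(1) bound).
F_CRIT = (3.79, 4.85)
T_CRIT = (-2.86, -3.53)
outcome = two_step(f_stat=6.1, t_stat=-4.0, f_crit=F_CRIT, t_crit=T_CRIT)
stopped = two_step(f_stat=3.0, t_stat=-4.0, f_crit=F_CRIT, t_crit=T_CRIT)
```

A rejection at both steps supports a level relationship in which $y_{t-1}$ enters; as noted above, that relationship may still be degenerate if the coefficients on $x_{t-1}$ are zero.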

This section first demonstrates that the proposed bounds testing procedure based on the Wald
statistic of (21) described in Section 3 is consistent. Second, it derives the asymptotic distribution
!
14 In principle, the asymptotic distribution of t!yy under H0 yy : !yy D 0 may be simulated from the limiting representation
2 2
given in the Proof of Theorem 3.2 of Appendix A after substitution of consistent estimators for f and lxy gxy /*yy.x under
!yy 2 0
H0 : !yy D 0, where *yy.x *yy f *xy . Although such estimators may be obtained straightforwardly, unfortunately,
they necessitate the use of parameter estimators from the marginal ECM (7) for fxt g1 tD1 .
of the Wald statistic of (21) under a sequence of local alternatives. Finally, we show that the
In the discussion of the consistency of the bounds test procedure based on the Wald statistic of (21), because the rank of the long-run multiplier matrix $\Pi$ may be either $r$ or $r + 1$ under the alternative hypothesis $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18) where $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ and $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$, it is necessary to deal with these two possibilities. First, under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, the rank of $\Pi$ is $r + 1$ so Assumption 5b applies; in particular, $\alpha_{yy} \neq 0$. Second, under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the rank of $\Pi$ is $r$ so Assumption 5a applies; in this case, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ holds and, in particular, $\alpha_{yx} - \omega'\alpha_{xx} \neq 0'$.
Theorem 4.1 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions 1–4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ of (18) the Wald statistic $W$ (21) is consistent against $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ in Cases I–V defined in (12)–(16).
Theorem 4.2 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$). If Assumptions 1–4 and 5a hold, then under $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ of (18) and $H_0^{\pi_{yy}}: \pi_{yy} = 0$ of (17) the Wald statistic $W$ (21) is consistent against $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$ in Cases I–V defined in (12)–(16).
Hence, combining Theorems 4.1 and 4.2, the bounds procedure of Section 3 based on the Wald statistic $W$ (21) defines a consistent test of $H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}}$ of (17) against $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18). This result holds irrespective of whether the forcing variables $\{x_t\}$ are purely $I(0)$, purely $I(1)$ or mutually cointegrated.
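We now turn to consider the asymptotic distribution of the Wald statistic (21) under a suitably specified sequence of local alternatives. Recall that under Assumption 5b, $\pi_{y.x}[= (\pi_{yy}, \pi_{yx.x})] = (\alpha_{yy}\beta_{yy}, \alpha_{yy}\beta_{yx}' + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}')$. Consequently, we define the sequence of local alternatives
$$H_{1T}: \pi_{y.xT}[= (\pi_{yyT}, \pi_{yx.xT})] = \left(T^{-1}\alpha_{yy}\beta_{yy},\; T^{-1}\alpha_{yy}\beta_{yx}' + T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta_{xx}'\right). \tag{26}$$
Defining
$$\Pi_T \equiv \begin{pmatrix} \pi_{yyT} & \pi_{yxT} \\ 0 & \Pi_{xxT} \end{pmatrix}$$
and recalling $\Pi = \alpha\beta'$, where $(1, -\omega')\alpha = \alpha_{yx} - \omega'\alpha_{xx} = 0'$, we have
$$\Pi_T - \Pi = T^{-1}\alpha_y\beta_y' + T^{-1/2}\begin{pmatrix} \delta_{yx} \\ \delta_{xx} \end{pmatrix}\beta'. \tag{27}$$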
In order to detail the limit distribution of the Wald statistic under the sequence of local alterna-
tives H1T of (26), it is necessary to define the (k r C 1)-dimensional Ornstein–Uhlenbeck pro-
cess JŁkrC1 a D JŁu a, JŁkr a 0 0
which obeys the stochastic integral and differential equations,
0 a Ł
JkrC1 a D WkrC1 a C ab 0 JkrC1 r dr and dJŁkrC1 a D dWkrC1 a C ab0 JŁkrC1 a da,
Ł
where WkrC1 a is a (k r C 1)-dimensional standard Brownian motion, a D [a? ? 0 ?

y , a Zay ,
? 1/2 ? ? 0 ? ? 0 ? ? 1/2 ? ? 0 ? ? 1 ? ? 0
a ] ay , a ay , b D [ay , a Zay , a ] [by , b 0ay , a ] by , b by , together
with the de-meaned and de-meaned and de-trended counterparts J̃ŁkrC1 a D JQ Łu a, J̃Łkr a0 0
and ĴŁkrC1 a D JO Łu a, ĴŁkr a0 0 partitioned similarly, a 2 [0, 1]. See, for example, Johansen
(1995, Chapter 14, pp. 201–210).
Theorem 4.3 (Limiting distribution of W under H1T ). If Assumptions 1–4 and 5a hold, then under
H1T : !y.x D T1 ˛yy b0y C T1/2 dyx w0 dxx b0 of (26), as T ! 1, the asymptotic distribution of

1
1 1
1
W ) z0r zr C dJŁu aFkrC1 a0 FkrC1 aFkrC1 a0 da FkrC1 a dJŁu a 28
0 0 0
0
where zr ¾ NQ1/2 h, Ir , Q[D Q1/20 Q1/2 ] D p limT!1 T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ , h dyx w
0
dxx 0 , is distributed independently of the second term in (28) and

 
 JŁkrC1 a Case I 

 

 JŁkrC1 a0 , 10 Case II 

FkrC1 a D J̃ŁkrC1 a Case III

 

 J̃Ł a0 , a 1/20 Case IV 

 krC1Ł 
ĴkrC1 a Case V
The first component of (28) z0r zr is non-central chi-square distributed with r degrees of
!
freedom and non-centrality parameter h0 Qh and corresponds to the local alternative H1Tyx.x :
!
pyx.xT D T1/2 dyx w0 dxx b0xx under H0 : !yy D 0. The second term in (28) is a non-standard
yy
!
Dickey–Fuller unit-root distribution under the local alternative H1Tyy : !yyT D T1 ˛yy ˇyy and
dyx w0 dxx D 00 . Note that under H0 of (17), that is, ˛yy D 0 and dyx w0 dxx D 00 , the limiting
The proof for the consistency of the bounds test procedure based on the t-statistic of (24) requires that the rank of the long-run multiplier matrix $\Pi$ is $r + 1$ under the alternative hypothesis $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$. Hence, Assumption 5b applies; in particular, $\alpha_{yy} \neq 0$.
Theorem 4.4 (Consistency of the t-statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions 1–4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ of (18) the t-statistic $t_{\pi_{yy}}$ (24) is consistent against $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$ in Cases I, III and V defined in (12), (14) and (16).
As noted at the end of Section 3, Theorem 4.4 suggests the possibility of using $t_{\pi_{yy}}$ to discriminate between $H_0^{\pi_{yy}}: \pi_{yy} = 0$ and $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, although, if $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is false,

and Whittaker (1995), CSW hereafter. The theoretical basis of the Treasury’s earnings equation
where firms and unions set wages to maximize a weighted average of firms’ profits and unions’
utility. Following Darby and Wren-Lewis (1993), the theoretical real wage equation underlying
the Treasury’s earnings equation is given by
$$w_t = \frac{\mathrm{Prod}_t}{1 + f(UR_t)(1 - RR_t)/\mathrm{Union}_t}, \tag{29}$$
where $w_t$ is the real wage, $\mathrm{Prod}_t$ is labour productivity, $RR_t$ is the replacement ratio defined as the ratio of unemployment benefit to the wage rate, $\mathrm{Union}_t$ is a measure of 'union power', and $f(UR_t)$ is the probability of a union member becoming unemployed, which is assumed to be an increasing function of the unemployment rate $UR_t$. The econometric specification is based on a log-linearized version of (29) after allowing for a wedge effect that takes account of the difference between the 'real product wage', which is the focus of the firms' decision, and the 'real consumption wage', which concerns the union.$^{15}$ The theoretical arguments for a possible long-run wedge effect on real wages are mixed and, as emphasized by CSW, whether such long-run effects are present is an empirical matter. The change in the unemployment rate ($\Delta UR_t$) is also included in the Treasury's wage equation. CSW cite two different theoretical rationales for the inclusion of $\Delta UR_t$ in the wage equation: the differential moderating effects of long- and short-term unemployed on real wages, and the 'insider–outsider' theories which argue that only rising unemployment will be effective in significantly moderating wage demands. See Blanchard and Summers (1986)
and Lindbeck and Snower (1989). The ARDL model and its associated unrestricted equilibrium
We begin our empirical analysis from the maintained assumption that the time series properties
of the key variables in the Treasury’s earnings equation can be well approximated by a log-linear
VAR($p$) model, augmented with appropriate deterministics such as intercepts and time trends.
included in the analysis. CSW, p. 50, report that ‘... it has not proved possible to identify a
significant effect from the replacement ratio, and this had to be omitted from our specification’.16
$D7475_t = 1$ over the period 1974q1–1975q4, 0 elsewhere.
The asymptotic theory developed in the paper is not affected by the inclusion of such ‘one-off’ dummy variables.17 Let $z_t = (w_t, Prod_t, UR_t, Wedge_t, Union_t)' = (w_t, x_t')'$. Then, using the

$$\Delta w_t = c_0 + c_1 t + c_2 D7475_t + c_3 D7579_t + \pi_{ww} w_{t-1} + \pi_{wx.x} x_{t-1} + \sum_{i=1}^{p-1} \psi_i' \Delta z_{t-i} + \delta' \Delta x_t + u_t \qquad (30)$$
Under the assumption that lagged real wages, $w_{t-1}$, do not enter the sub-VAR model for $x_t$,
the above real wage equation is identified and can be estimated consistently by LS.18 Notice,
the unemployment or productivity equations, for example. The exclusion of the level of real wages
postulates that labour productivity is partly determined by the level of real wages.19 It is clear
that, in our framework, the bargaining theory and the efficiency wage theory cannot be entertained
The above specification is also based on the assumption that the disturbances $u_t$ are serially
uncorrelated. It is therefore important that the lag order p of the underlying VAR is selected
appropriately. There is a delicate balance between choosing p sufficiently large to mitigate the
residual serial correlation problem and, at the same time, sufficiently small so that the conditional
ECM (30) is not unduly over-parameterized, particularly in view of the limited time series data
Finally, a decision must be made concerning the time trend in (30) and whether its coefficient should be restricted.20 This issue can only be settled in light of the particular sample period under consideration. The time series data used are quarterly, cover the period 1970q1–1997q4, and are seasonally adjusted (where relevant).21 To ensure comparability of results for different choices of $p$, all estimations use the same sample period, 1972q1–1997q4 ($T = 104$), with the first eight
The five variables in the earnings equation were constructed from primary sources in the following manner: $w_t = \ln(ERPR_t/PYNONG_t)$, $Wedge_t = \ln(1 + TE_t) + \ln(1 - TD_t) - \ln(RPIX_t/PYNONG_t)$, $UR_t = \ln(100 \times ILOU_t/(ILOU_t + WFEMP_t))$, $Prod_t = \ln((YPROM_t + 278.29 \times YMF_t)/(EMF_t + ENMF_t))$, and $Union_t = \ln(UDEN_t)$, where $ERPR_t$ is average private sector earnings per employee (£), $PYNONG_t$ is the non-oil non-government GDP deflator, $YPROM_t$
tor cost (£ million, 1990), YMFt is the manufacturing output index adjusted for stock changes
(1990 D 100), EMFt and ENMFt are respectively employment in UK manufacturing and non-
employers’ National Insurance contribution rate, TDt is the average direct tax rate on employ-
union density (used to proxy ‘union power’) measured by union membership as a percentage of
Figures 1–3.
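The construction of the five variables can be sketched in a few lines of Python. This is only an illustration: the function name and the input numbers in the usage note are hypothetical, and the placement of the minus signs and parentheses follows our reading of the definitions of $w_t$, $Wedge_t$, $UR_t$, $Prod_t$ and $Union_t$ given above.

```python
import math

def earnings_variables(ERPR, PYNONG, TE, TD, RPIX,
                       ILOU, WFEMP, YPROM, YMF, EMF, ENMF, UDEN):
    """Return the five log variables for one quarter (hypothetical helper)."""
    # Real wage: log of earnings deflated by the non-oil non-government deflator
    w = math.log(ERPR / PYNONG)
    # Wedge: employer tax term plus direct tax term minus relative price term
    wedge = math.log(1 + TE) + math.log(1 - TD) - math.log(RPIX / PYNONG)
    # Unemployment rate in per cent, then logged
    ur = math.log(100 * ILOU / (ILOU + WFEMP))
    # Productivity: log of output per person employed
    prod = math.log((YPROM + 278.29 * YMF) / (EMF + ENMF))
    # Union power proxied by log union density
    union = math.log(UDEN)
    return {"w": w, "Wedge": wedge, "UR": ur, "Prod": prod, "Union": union}
```

For example, with the made-up inputs `ERPR=200, PYNONG=100, TE=0.1, TD=0.2, RPIX=110, ILOU=2, WFEMP=23`, the function returns `w = ln 2`, `Wedge = ln 0.8` and `UR = ln 8`.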
18 See Assumption 3 and the following discussion. By construction, the contemporaneous effects $\Delta x_t$ are uncorrelated
are uncorrelated with $u_t$ and also have a reasonable degree of correlation with the included variables in (30).
sources and the descriptions of the variables, see CSW, pp. 46–51 and p. 11 of the Annex.
[Figure 1. Panel (a): Real Wages and Productivity, log scale, by quarter, 1972Q1–1997Q1. Panel (b): Real Wage and Productivity, by quarter, 1972Q1–1997Q1.]
It is clear from Figure 1 that real wages (average earnings) and productivity show steadily rising
root tests to the five variables, perhaps not surprisingly, yields mixed results with strong evidence
not necessarily preclude the other three variables (UR, Wedge, and Union) having levels impact
on real wages. Following the methodology developed in this paper, it is possible to test for the existence of a real wage equation involving the levels of these five variables irrespective of whether they are purely $I(0)$, purely $I(1)$, or mutually cointegrated.
23 Over the period 1972q1– 97q4, real wages grew by 2.14% per annum as compared to labour productivity that increased
[Figure: UNION and WEDGE, by quarter, 1972Q1–1997Q1.]
[Figure: UR, log scale, by quarter, 1972Q1–1997Q1.]
and without a linear time trend, for $p = 1, 2, \ldots, 7$. As pointed out earlier, all regressions were computed over the same period 1972q1–1997q4. We found that lagged changes of the productivity variable, $\Delta Prod_{t-1}, \Delta Prod_{t-2}, \ldots$, were insignificant (either singly or jointly) in all regressions.
all other variables. Table I gives Akaike’s and Schwarz’s Bayesian Information Criteria, denoted respectively by AIC and SBC, and Lagrange multiplier (LM) statistics for testing the hypothesis of no residual serial correlation against orders 1 and 4 denoted by $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ respectively. As might be expected, the lag order selected by AIC, $\hat{p}_{aic} = 6$, irrespective of whether a deterministic trend term is included or not, is much larger than that selected by SBC. This latter criterion gives estimates $\hat{p}_{sbc} = 1$ if a trend is included and $\hat{p}_{sbc} = 4$ if not. The $\chi^2_{SC}$ statistics also suggest using a relatively high lag order: 4 or more. In view of the importance of the assumption of serially uncorrelated errors for the validity of the bounds tests, it seems prudent to select $p$ to be either 5 or 6.24 Nevertheless, for completeness, in what follows we report test results for $p = 4$ and 5, as well as for our preferred choice, namely $p = 6$. The results in Table I also indicate
trend.
in Tables CI and CII. First, consider the bounds F-statistic. As argued in PSS, the statistic $F_{IV}$
IV of (15), is more appropriate than $F_V$, Case V of (16), which ignores this constraint. Note that, if the trend coefficient $c_1$ is not subject to this restriction, (30) implies a quadratic trend in the level of real wages under the null hypothesis of $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$, which is empirically implausible. The critical value bounds for the statistics $F_{IV}$ and $F_V$ are given in Tables CI(iv) and CI(v). Since $k = 4$, the 0.05 critical value bounds are (3.05, 3.97) and (3.47, 4.57) for $F_{IV}$ and $F_V$, respectively.25 The test outcome depends on the choice of the lag order $p$. For $p = 4$, the

        With trend                                 Without trend
p    AIC      SBC      χ²_SC(1)   χ²_SC(4)     AIC      SBC      χ²_SC(1)   χ²_SC(4)
1    319.33   302.14   16.86*     35.89*       317.51   301.64   18.38*     34.88*
2    324.25   301.77   2.16       19.71*       323.77   302.62   1.98       21.52*
3    321.51   293.74   0.52       17.07*       320.87   294.43   1.56       19.35*
4    334.37   301.31   3.48***    7.79***      335.37   303.63   3.41***    7.13
5    335.84   297.50   0.03       2.50         336.49   299.47   0.03       2.15
6    337.06   293.42   0.85       3.58         337.03   294.72   0.99       3.99
7    336.96   288.04   0.17       2.20         336.85   289.25   0.09       0.64

Notes: $p$ is the lag order of the underlying VAR model for the conditional ECM (30), with zero restrictions on the coefficients of lagged changes in the productivity variable. $AIC_p \equiv LL_p - s_p$ and $SBC_p \equiv LL_p - (s_p/2)\ln T$ denote Akaike’s and Schwarz’s Bayesian Information Criteria for a given lag order $p$, where $LL_p$ is the maximized log-likelihood value of the model, $s_p$ is the number of freely estimated coefficients and $T$ is the sample size. $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ are LM statistics for testing no residual serial correlation against orders 1 and 4. The symbols *, **, and *** denote significance
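The lag-order choice described in the text can be reproduced from the AIC and SBC columns of Table I. The sketch below is illustrative only: the dictionary layout and function names are our own, and the assignment of the left-hand block to the "with trend" regressions is inferred from the lag orders quoted in the text. Both criteria are in the form defined in the notes, so a larger value is preferred. The second helper uses $AIC_p - SBC_p = s_p(\tfrac{1}{2}\ln T - 1)$ to back out the number of estimated coefficients implied by a table row.

```python
import math

T = 104  # sample size, 1972q1-1997q4

# p -> (AIC, SBC), copied from Table I
WITH_TREND = {1: (319.33, 302.14), 2: (324.25, 301.77), 3: (321.51, 293.74),
              4: (334.37, 301.31), 5: (335.84, 297.50), 6: (337.06, 293.42),
              7: (336.96, 288.04)}
WITHOUT_TREND = {1: (317.51, 301.64), 2: (323.77, 302.62), 3: (320.87, 294.43),
                 4: (335.37, 303.63), 5: (336.49, 299.47), 6: (337.03, 294.72),
                 7: (336.85, 289.25)}

def select_lag(table, criterion):
    """Return the lag order that maximizes the chosen criterion."""
    column = 0 if criterion == "AIC" else 1
    return max(table, key=lambda p: table[p][column])

def implied_num_coefficients(aic, sbc, sample_size):
    """Back out s_p from AIC_p - SBC_p = s_p * (ln(T)/2 - 1)."""
    return (aic - sbc) / (0.5 * math.log(sample_size) - 1.0)

p_aic_trend = select_lag(WITH_TREND, "AIC")
p_sbc_trend = select_lag(WITH_TREND, "SBC")
p_sbc_no_trend = select_lag(WITHOUT_TREND, "SBC")
s_1 = implied_num_coefficients(*WITH_TREND[1], T)
```

The implied coefficient count for $p = 1$ with a trend comes out very close to an integer, which is a useful check that the two criteria in the table are mutually consistent with $T = 104$.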
24 In the Treasury model, different lag orders are chosen for different variables. The highest lag order selected is 4 applied
to the log of the price deflator and the wedge variable. The estimation period of the earnings equation in the Treasury
model is 1971q1– 1994q3.
25 Following a suggestion from one of the referees we also computed critical value bounds for our sample size, namely
T D 104. For k D 4, the 5% critical value bounds associated with FIV and FV statistics turned out to be (3.19,4.16) and
(3.61,4.76), respectively, which are only marginally different from the asymptotic critical value bounds.
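The decision rule of the bounds procedure applied in this section can be sketched as follows. The function names are our own; the critical value bounds and statistics are those quoted in this section and in Table II, and the t-ratios and their bounds are entered as negative numbers, which is our reading of the values as printed. A statistic beyond the $I(1)$ bound rejects the null, a statistic inside the $I(0)$ bound does not, and anything in between is inconclusive.

```python
def f_bounds_decision(stat, lower, upper):
    """Bounds F-test: lower is the I(0) bound, upper the I(1) bound."""
    if stat > upper:
        return "reject"
    if stat < lower:
        return "do not reject"
    return "inconclusive"

def t_bounds_decision(stat, i0_bound, i1_bound):
    """Bounds t-test: both bounds are negative and i1_bound < i0_bound."""
    if stat < i1_bound:
        return "reject"
    if stat > i0_bound:
        return "do not reject"
    return "inconclusive"

# 0.05 bounds for k = 4
F_IV_BOUNDS = (3.05, 3.97)
F_V_BOUNDS = (3.47, 4.57)
F_III_BOUNDS = (2.86, 4.01)
T_III_BOUNDS = (-2.86, -3.99)
T_V_BOUNDS = (-3.41, -4.36)

outcome_fiv_p6 = f_bounds_decision(4.78, *F_IV_BOUNDS)
outcome_fv_p6 = f_bounds_decision(3.59, *F_V_BOUNDS)
outcome_fiii_p4 = f_bounds_decision(3.63, *F_III_BOUNDS)
outcome_tiii_p5 = t_bounds_decision(-4.00, *T_III_BOUNDS)
outcome_tv_p5 = t_bounds_decision(-2.83, *T_V_BOUNDS)
```

Keeping the three outcomes separate, rather than collapsing "inconclusive" into "do not reject", mirrors the way the results are reported in the text.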

     With                           Without
p    F_IV     F_V      t_V          F_III    t_III
4    2.99a    2.34a    −2.26a       3.63b    −3.02b
5    4.42c    3.96b    −2.83a       5.23c    −4.00c
6    4.78c    3.59b    −2.44a       5.42c    −3.48b

Notes: See the notes to Table I. $F_{IV}$ is the F-statistic for testing $\pi_{ww} = 0$, $\pi_{wx.x} = 0'$ and $c_1 = 0$ in (30). $F_V$ is the F-statistic for testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30). $F_{III}$ is the F-statistic for testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30) with $c_1$ set equal to 0. $t_V$ and $t_{III}$ are the t-ratios for testing $\pi_{ww} = 0$ in (30) with and without
of whether the regressors are purely $I(0)$, purely $I(1)$ or mutually cointegrated. For $p = 5$, the bounds test is inconclusive. For $p = 6$ (selected by AIC), the statistic $F_V$ is still inconclusive, but $F_{IV} = 4.78$ lies outside the 0.05 critical value bounds and rejects the null hypothesis that there exists no level earnings equation, irrespective of whether the regressors are purely $I(0)$, purely $I(1)$ or mutually cointegrated.26 This finding is even more conclusive when the bounds F-test is applied to the earnings equations without a linear trend. The relevant test statistic is $F_{III}$ and the associated 0.05 critical value bounds are (2.86, 4.01).27 For $p = 4$, $F_{III} = 3.63$, and the test result is inconclusive. However, for $p = 5$ and 6, the values of $F_{III}$ are 5.23 and 5.42 respectively and
bounds for $t_{III}$ and $t_V$, when $k = 4$, are (−2.86, −3.99) and (−3.41, −4.36).28 Therefore, if a linear trend is included, the bounds t-test does not reject the null even if $p = 5$ or 6. However, when the trend term is excluded, the null is rejected for $p = 5$. Overall, these test results support
In testing the null hypothesis that there are no level effects in (30), namely ($\pi_{ww} = 0$, $\pi_{wx.x} = 0'$),
it is important that the coefficients of lagged changes remain unrestricted, otherwise these tests
could be subject to a pre-testing problem. However, for the subsequent estimation of levels effects
and short-run dynamics of real wage adjustments, the use of a more parsimonious specification
seems advisable. To this end we adopt the ARDL approach to the estimation of the level relations
26 The same conclusion is also reached for p D 7.

discussed in Pesaran and Shin (1999).29 First, the (estimated) orders of an ARDL($p, p_1, p_2, p_3, p_4$) model in the five variables ($w_t$, $Prod_t$, $UR_t$, $Wedge_t$, $Union_t$) were selected by searching across the $7^5 = 16{,}807$ ARDL models, spanned by $p = 0, 1, \ldots, 6$, and $p_i = 0, 1, \ldots, 6$, $i = 1, \ldots, 4$, using the AIC criterion.30 This resulted in the choice of an ARDL(6, 0, 5, 4, 5) specification with

$$w_t = \underset{(0.050)}{1.063}\, Prod_t - \underset{(0.034)}{0.105}\, UR_t - \underset{(0.265)}{0.943}\, Wedge_t + \underset{(0.311)}{1.481}\, Union_t + \underset{(0.242)}{2.701} + \hat{v}_t \qquad (31)$$

where $\hat{v}_t$ is the equilibrium correction term, and the standard errors are given in parentheses.
All levels estimates are highly significant and have the expected signs. The coefficients of the productivity and the wedge variables are insignificantly different from unity. In the Treasury’s earnings equation, the levels coefficient of the productivity variable is imposed as unity and the above estimates can be viewed as providing empirical support for this a priori restriction. Our levels estimates of the effects of the unemployment rate and the union variable on real wages, namely −0.105 and 1.481, are also in line with the Treasury estimates of −0.09 and 1.31.31
wedge variable. We obtain a much larger estimate, almost twice that obtained by the Treasury. Setting the levels coefficients of the $Prod_t$ and $Wedge_t$ variables to unity provides the alternative interpretation that the share of wages (net of taxes and computed using RPIX rather than the implicit GDP deflator) has varied negatively with the rate of unemployment and positively with union strength.32
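Two of the statements above lend themselves to a quick numerical check: the size of the ARDL search grid, and the claim that the productivity and wedge coefficients are insignificantly different from unity. The sketch below is illustrative; the helper name is our own, and we assume that the wedge coefficient is negative (so that "unity" is understood in absolute value) and that a conventional two-sided 5% normal critical value of 1.96 is an adequate yardstick.

```python
from itertools import product

# All lag-order combinations (p, p1, p2, p3, p4) with each order in 0..6
orders = list(product(range(7), repeat=5))
n_models = len(orders)

def t_ratio(estimate, hypothesised, std_err):
    """t-ratio for testing that a coefficient equals a hypothesised value."""
    return (estimate - hypothesised) / std_err

# Levels estimates and standard errors from (31)
t_prod = t_ratio(1.063, 1.0, 0.050)     # productivity coefficient against 1
t_wedge = t_ratio(-0.943, -1.0, 0.265)  # wedge coefficient against -1 (assumed sign)
```

Both ratios are well inside ±1.96, which is consistent with the statement in the text.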
The conditional ECM regression associated with the above level relationship is given in Table III.33 These estimates provide further direct evidence on the complicated dynamics that seem to exist between real wage movements and their main determinants.34 All five lagged changes in real wages are statistically significant, further justifying the choice of $p = 6$. The equilibrium correction coefficient is estimated as −0.229 (0.0586) which is reasonably large and highly significant.35 The auxiliary equation of the autoregressive part of the estimated conditional ECM has real roots 0.9231 and −0.9095 and two pairs of complex roots with moduli 0.7589 and 0.6381, which suggests an initially cyclical real wage process that slowly converges towards the equilibrium described by (31).36 The regression fits reasonably well and passes the diagnostic tests against non-normal errors and heteroscedasticity. However, it fails the functional form misspecification test at
29 Note that the ARDL approach advanced in Pesaran and Shin (1999) is applicable irrespective of whether the regressors are purely $I(0)$, purely $I(1)$ or mutually cointegrated.
33 Clearly, it is possible to simplify the model further, but this would go beyond the remit of this section which is first to
test for the existence of a level relationship using an unrestricted ARDL specification and, second, if we are satisfied that
34 The standard errors of the estimates reported in Table III allow for the uncertainty associated with the estimation of the levels coefficients. This is important in the present application where it is not known with certainty whether the regressors are purely $I(0)$, purely $I(1)$ or mutually cointegrated. It is only in the case when it is known for certain that all regressors are $I(1)$ that it would be reasonable in large samples to treat these estimates as known because of their super-consistency.
35 The equilibrium correction coefficient in the Treasury’s earnings equation is estimated to be −0.1848 (0.0528), which is smaller in absolute value than our estimate; see p. 11 in Annex of CSW. This seems to be because of the shorter lag lengths used in the Treasury’s specification rather than the shorter time period 1971q1–1994q3. Note also that the t-ratio reported for this coefficient does not have the standard t-distribution; see Theorem 3.2.
36 The complex roots are $0.34293 \pm 0.67703i$ and $-0.17307 \pm 0.61386i$, where $i = \sqrt{-1}$.
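The roots quoted for the autoregressive part can be checked against the Table III estimates. Writing the ECM as $\Delta w_t = \rho w_{t-1} + \sum_{i=1}^{5} a_i \Delta w_{t-i} + \ldots$, the implied AR(6) in levels has coefficients $\phi_1 = 1 + \rho + a_1$, $\phi_j = a_j - a_{j-1}$ for $j = 2, \ldots, 5$, and $\phi_6 = -a_5$. The sketch below is illustrative only: it assumes that $\rho$ and all five $a_i$ are negative, that the second real root is −0.9095, and that the second complex pair has real part −0.17307. With those signs the reported roots satisfy the characteristic equation up to rounding error.

```python
# Equilibrium correction coefficient and lagged-change coefficients
# (magnitudes from Table III; negative signs are an assumption)
rho = -0.229
a = [-0.418, -0.328, -0.523, -0.133, -0.197]

def level_ar_coefficients(rho, a):
    """Map ECM coefficients to the AR coefficients of w_t in levels."""
    phi = [1.0 + rho + a[0]]
    phi += [a[j] - a[j - 1] for j in range(1, len(a))]
    phi.append(-a[-1])
    return phi

def char_poly(z, phi):
    """Evaluate z^n - phi_1 z^(n-1) - ... - phi_n at z."""
    n = len(phi)
    return z ** n - sum(c * z ** (n - 1 - j) for j, c in enumerate(phi))

phi = level_ar_coefficients(rho, a)

roots = [0.9231, -0.9095,
         complex(0.34293, 0.67703), complex(0.34293, -0.67703),
         complex(-0.17307, 0.61386), complex(-0.17307, -0.61386)]

residuals = [abs(char_poly(z, phi)) for z in roots]  # should all be near zero
moduli = [abs(z) for z in roots]                     # all below one: stable
```

Evaluating the polynomial at the reported roots, rather than solving for the roots numerically, keeps the check within the standard library.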

Table III (earnings equation)

Regressor            Coefficient   Standard error   p-value
$\hat{v}_{t-1}$      −0.229        0.0586           N/A
$\Delta w_{t-1}$     −0.418        0.0974           0.000
$\Delta w_{t-2}$     −0.328        0.1089           0.004
$\Delta w_{t-3}$     −0.523        0.1043           0.000
$\Delta w_{t-4}$     −0.133        0.0892           0.140
$\Delta w_{t-5}$     −0.197        0.0807           0.017
$\Delta Prod_t$      0.315         0.0954           0.001
$\Delta UR_t$        0.003         0.0083           0.683
$\Delta UR_{t-1}$    0.016         0.0119           0.196
$\Delta UR_{t-2}$    0.003         0.0118           0.797
$\Delta UR_{t-3}$    0.028         0.0113           0.014
$\Delta UR_{t-4}$    0.027         0.0122           0.031
$\Delta Wedge_t$     0.297         0.0534           0.000
$\Delta Wedge_{t-1}$ 0.048         0.0592           0.417
$\Delta Wedge_{t-2}$ 0.093         0.0569           0.105
$\Delta Wedge_{t-3}$ 0.188         0.0560           0.001
$\Delta Union_t$     0.969         0.8169           0.239
$\Delta Union_{t-1}$ 2.915         0.8395           0.001
$\Delta Union_{t-2}$ 0.021         0.9023           0.981
$\Delta Union_{t-3}$ 0.101         0.7805           0.897
$\Delta Union_{t-4}$ 1.995         0.7135           0.007
Intercept            0.619         0.1554           0.000
$D7475_t$            0.029         0.0063           0.000
$D7579_t$            0.017         0.0063           0.009

$\bar{R}^2 = 0.5589$, $\hat{\sigma} = 0.0083$, AIC = 339.57, SBC = 302.55,
$\chi^2_{SC}(4) = 8.74[0.068]$, $\chi^2_{FF}(1) = 4.86[0.027]$,
$\chi^2_{N}(2) = 0.01[0.993]$, $\chi^2_{H}(1) = 0.66[0.415]$.

Notes: The regression is based on the conditional ECM given by (30) using an ARDL(6, 0, 5, 4, 5) specification with dependent variable, $\Delta w_t$, estimated over 1972q1–1997q4, and the equilibrium correction term $\hat{v}_{t-1}$ is given in (31). $\bar{R}^2$ is the adjusted squared multiple correlation coefficient, $\hat{\sigma}$ is the standard error of the regression, AIC and SBC are Akaike’s and Schwarz’s Bayesian Information Criteria, $\chi^2_{SC}(4)$, $\chi^2_{FF}(1)$, $\chi^2_{N}(2)$, and $\chi^2_{H}(1)$ denote chi-squared statistics to test for no residual serial correlation, no functional form mis-specification, normal errors and homoscedasticity respectively with p-values given in [·]. For details of these diagnostic tests see Pesaran and Pesaran (1997, Ch. 18).
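The bracketed p-values of the diagnostic statistics can be reproduced, up to the rounding of the reported statistics, from the chi-squared survival function. The sketch below is a minimal standard-library implementation written for this illustration (the function name is our own); it covers one degree of freedom and even degrees of freedom, which is all that the four diagnostics require.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate (df = 1 or even df)."""
    if df == 1:
        # chi2(1) is the square of a standard normal
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        half = x / 2.0
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    raise ValueError("only df = 1 or even df are handled in this sketch")

p_sc = chi2_sf(8.74, 4)   # serial correlation, reported [0.068]
p_ff = chi2_sf(4.86, 1)   # functional form, reported [0.027]
p_n = chi2_sf(0.01, 2)    # normality, reported [0.993]
p_h = chi2_sf(0.66, 1)    # heteroscedasticity, reported [0.415]
```

At the 5% level only the functional form statistic is significant, in line with the discussion of the diagnostics in the text.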
37 The conditional ECM regression in Table III also passes the test against residual serial correlation but, as the model
was specified to deal with this problem, it should not therefore be given any extra credit!
6. CONCLUSIONS
Empirical analysis of level relationships has been an integral part of time series econometrics and pre-dates the recent literature on unit roots and cointegration.38 However, the emphasis of this earlier literature was on the estimation of level relationships rather than testing for their presence (or otherwise). Cointegration analysis attempts to fill this vacuum, but, typically, under the relatively restrictive assumption that the regressors, $x_t$, entering the determination of the dependent variable of interest, $y_t$, are all integrated of order 1 or more. This paper demonstrates that the problem of testing for the existence of a level relationship between $y_t$ and $x_t$ is non-standard even if all the regressors under consideration are $I(0)$ because, under the null hypothesis of no level relationship between $y_t$ and $x_t$, the process describing $y_t$ is $I(1)$, irrespective of whether the regressors $x_t$ are purely $I(0)$, purely $I(1)$ or mutually cointegrated. The asymptotic theory developed in this paper provides a simple univariate framework for testing the existence of a single level relationship between $y_t$ and $x_t$ when it is not known with certainty whether the regressors are purely $I(0)$, purely $I(1)$ or mutually cointegrated.39 Moreover, it is unnecessary that the order of integration of the underlying regressors be ascertained prior to testing the existence of a level relationship between $y_t$ and $x_t$. Therefore, unlike typical applications of cointegration analysis, this method is not subject to this particular kind of pre-testing problem. The application of the proposed bounds testing procedure to the UK earnings equation highlights this point, where one need not take an a priori position as to whether, for example, the rate of unemployment or the union density variable is $I(1)$ or $I(0)$.
The analysis of this paper is based on a single-equation approach. Consequently, it is inappropriate in situations where there may be more than one level relationship involving $y_t$. An extension of this paper and those of HJNR and PSS to deal with such cases is part of our current research, but the consequent theoretical developments will require the computation of further tables of critical values.

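We confine the main proof of Theorem 3.1 to that for Case IV and briefly detail the alterations necessary for the other cases. Under Assumptions 1–4 and 5a, the process $\{z_t\}_{t=1}^{\infty}$ has the infinite moving-average representation,

$$z_t = \mu + \gamma t + C s_t + C^{*}(L)\varepsilon_t \qquad (A1)$$

where the partial sum $s_t \equiv \sum_{i=1}^{t}\varepsilon_i$, $\Phi(z)C(z) = C(z)\Phi(z) = (1 - z)I_{k+1}$, $\Phi(z) \equiv I_{k+1} - \sum_{i=1}^{p}\Phi_i z^i$, $C(z) \equiv I_{k+1} + \sum_{i=1}^{\infty} C_i z^i = C + (1 - z)C^{*}(z)$, $t = 1, 2, \ldots$; see Johansen (1991) and PSS. Note that $C = (\beta_y^{\perp}, \beta^{\perp})[(\alpha_y^{\perp}, \alpha^{\perp})'\Gamma(\beta_y^{\perp}, \beta^{\perp})]^{-1}(\alpha_y^{\perp}, \alpha^{\perp})'$; see Johansen (1991, (4.5), p. 1559). Define the $(k + 2, r)$ and $(k + 2, k - r + 1)$ matrices $\beta_{*}$ and $\delta$ by

$$\beta_{*} \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\beta \quad \text{and} \quad \delta \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}(\beta_y^{\perp}, \beta^{\perp}),$$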
39 Of course, the system approach developed by Johansen (1991, 1995) can also be applied to a set of variables containing possibly a mixture of $I(0)$ and $I(1)$ regressors.
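where $(\beta_y^{\perp}, \beta^{\perp})$ is a $(k + 1, k - r + 1)$ matrix whose columns are a basis for the orthogonal complement of $\beta$. Hence, $(\beta, \beta_y^{\perp}, \beta^{\perp})$ is a basis for $R^{k+1}$. Let $\xi$ be the $(k + 2)$-unit vector $(1, 0')'$. Then, $(\beta_{*}, \xi, \delta)$ is a basis for $R^{k+2}$. It therefore follows that

$$T^{-1/2}\delta' z^{*}_{[Ta]} = T^{-1/2}(\beta_y^{\perp}, \beta^{\perp})'\mu + T^{-1/2}(\beta_y^{\perp}, \beta^{\perp})' C s_{[Ta]} + (\beta_y^{\perp}, \beta^{\perp})' T^{-1/2} C^{*}(L)\varepsilon_{[Ta]} \Rightarrow (\beta_y^{\perp}, \beta^{\perp})' C B_{k+1}(a)$$

where $z^{*}_t = (t, z_t')'$, $B_{k+1}(a)$ is a $(k + 1)$-vector Brownian motion with variance matrix $\Omega$ and $[Ta]$ denotes the integer part of $Ta$, $a \in [0, 1]$; see Phillips and Solo (1992, Theorem 3.15, p. 983). Also, $T^{-1}\xi' z^{*}_t = T^{-1}t \Rightarrow a$. Similarly, noting that $\beta' C = 0$, we have that $\beta_{*}' z^{*}_t = \beta'\mu + \beta' C^{*}(L)\varepsilon_t = O_P(1)$. Hence, from Phillips and Solo (1992, Theorem 3.16, p. 983), defining $\tilde{Z}^{*}_{-1} \equiv P_{\iota} Z^{*}_{-1}$ and $\Delta\tilde{Z}_{-} \equiv P_{\iota}\Delta Z_{-}$, it follows that

$$T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^{*}_{-1}\beta_{*} = O_P(1), \quad T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}\Delta\tilde{Z}_{-} = O_P(1), \quad T^{-1}\Delta\tilde{Z}_{-}'\Delta\tilde{Z}_{-} = O_P(1),$$
$$T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^{*}_{-1}\beta_{*} = O_P(1), \quad T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\Delta\tilde{Z}_{-} = O_P(1) \qquad (A2)$$

where $B_T \equiv (\delta, T^{-1/2}\xi)$. Similarly, defining $\tilde{u} \equiv P_{\iota}u$,

$$T^{-1/2}\beta_{*}'\tilde{Z}^{*\prime}_{-1}\tilde{u} = O_P(1), \quad T^{-1/2}\Delta\tilde{Z}_{-}'\tilde{u} = O_P(1) \qquad (A3)$$

Cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146).
The next result follows from Phillips and Solo (1992, Theorem 3.15, p. 983); cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146) and Phillips and Durlauf (1986).
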
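Lemma A.1 Let $B_T \equiv (\delta, T^{-1/2}\xi)$ and define $G(a) = (G_1(a)', G_2(a))'$, where $G_1(a) \equiv (\beta_y^{\perp}, \beta^{\perp})' C \tilde{B}_{k+1}(a)$, $\tilde{B}_{k+1}(a) [= (\tilde{B}_1(a), \tilde{B}_k(a)')'] = B_{k+1}(a) - \int_0^1 B_{k+1}(a)\,da$, and $G_2(a) \equiv a - \tfrac{1}{2}$, $a \in [0, 1]$. Then

$$T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^{*}_{-1}B_T \Rightarrow \int_0^1 G(a)G(a)'\,da, \qquad T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u} \Rightarrow \int_0^1 G(a)\,d\tilde{B}^{*}_u(a)$$

where $\tilde{B}^{*}_u(a) \equiv \tilde{B}_1(a) - \omega'\tilde{B}_k(a)$ and $\tilde{B}_{k+1}(a) = (\tilde{B}_1(a), \tilde{B}_k(a)')'$, $a \in [0, 1]$.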
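Proof of Theorem 3.1 Under $H_0$ of (17), the Wald statistic $W$ of (21) can be written as

$$\hat{\omega}_{uu}W = \tilde{u}'P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\left(\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\right)^{-1}\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}$$
$$= \tilde{u}'P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}A_T\left(A_T'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}A_T\right)^{-1}A_T'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}$$

where $A_T \equiv T^{-1/2}(\beta_{*}, T^{-1/2}B_T)$. Consider the matrix $A_T'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}A_T$. It follows from (A2) and Lemma A.1 that

$$A_T'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}A_T = \begin{pmatrix} T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*} & 0' \\ 0 & T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^{*}_{-1}B_T \end{pmatrix} + o_P(1) \qquad (A4)$$

Next, consider $A_T'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}$. From (A3) and Lemma A.1,

$$A_T'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u} = \begin{pmatrix} T^{-1/2}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u} \\ T^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u} \end{pmatrix} + o_P(1) \qquad (A5)$$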
Finally, the estimator for the error variance ωuu (defined in the line after (21)),

0 Ł0 1 0 Ł0
ωO uu D T m1 ũ0 ũ ũ0 P Ł
Ł
Z ũ
D T m1 ũ0 ũ C oP 1 D ωuu C oP 1 A6
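From (A4)–(A6) and Lemma A.1,

$$W = T^{-1}\tilde{u}'P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\left(T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\right)^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}/\omega_{uu}$$
$$\quad + T^{-2}\tilde{u}'\tilde{Z}^{*}_{-1}B_T\left(T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^{*}_{-1}B_T\right)^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u}/\omega_{uu} + o_P(1) \qquad (A7)$$

We consider each of the terms in the representation (A7) in turn. A central limit theorem allows us to state

$$\left(T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\right)^{-1/2}T^{-1/2}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}/\omega_{uu}^{1/2} \Rightarrow z_r \sim N(0, I_r)$$

Hence, the first term in (A7) converges in distribution to $z_r'z_r$, a chi-square random variable with $r$ degrees of freedom; that is,

$$T^{-1}\tilde{u}'P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\left(T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\right)^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}/\omega_{uu} \Rightarrow z_r'z_r \sim \chi^2(r) \qquad (A8)$$

From Lemma A.1, the second term in (A7) weakly converges to

$$\int_0^1 d\tilde{B}^{*}_u(a)\,G(a)'\left(\int_0^1 G(a)G(a)'\,da\right)^{-1}\int_0^1 G(a)\,d\tilde{B}^{*}_u(a)/\omega_{uu}$$

which, as $C = (\beta_y^{\perp}, \beta^{\perp})[(\alpha_y^{\perp}, \alpha^{\perp})'\Gamma(\beta_y^{\perp}, \beta^{\perp})]^{-1}(\alpha_y^{\perp}, \alpha^{\perp})'$, may be expressed as

$$\int_0^1 d\tilde{B}^{*}_u(a)\begin{pmatrix} (\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}'\left(\int_0^1 \begin{pmatrix} (\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} (\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}'da\right)^{-1}$$
$$\quad \times \int_0^1 \begin{pmatrix} (\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}d\tilde{B}^{*}_u(a)/\omega_{uu}$$

Now, noting that under $H_0$ of (17) we may express $\alpha_y^{\perp} = (1, -\omega')'$ and $\alpha^{\perp} = (0, \alpha_{xx}^{\perp\prime})'$ where $\alpha_{xx}^{\perp\prime}\alpha_{xx} = 0$, we define the $(k - r + 1)$-vector of independent de-meaned standard Brownian motions,

$$\tilde{W}_{k-r+1}(a)\,[= (\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'] \equiv [(\alpha_y^{\perp}, \alpha^{\perp})'\Omega(\alpha_y^{\perp}, \alpha^{\perp})]^{-1/2}(\alpha_y^{\perp}, \alpha^{\perp})'\tilde{B}_{k+1}(a) = \begin{pmatrix} \omega_{uu}^{-1/2}\tilde{B}^{*}_u(a) \\ (\alpha_{xx}^{\perp\prime}\Omega_{xx}\alpha_{xx}^{\perp})^{-1/2}\alpha_{xx}^{\perp\prime}\tilde{B}_k(a) \end{pmatrix}$$

where $\tilde{B}^{*}_u(a) = \tilde{B}_1(a) - \omega'\tilde{B}_k(a)$ is independent of $\tilde{B}_k(a)$ and $\tilde{B}_{k+1}(a) \equiv (\tilde{B}_1(a), \tilde{B}_k(a)')'$ is partitioned according to $z_t = (y_t, x_t')'$, $a \in [0, 1]$. Hence, the second term in (A7) has the following representation:

$$\int_0^1 d\tilde{W}_u(a)\begin{pmatrix} \tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}'\left(\int_0^1 \begin{pmatrix} \tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} \tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}'da\right)^{-1}\int_0^1 \begin{pmatrix} \tilde{W}_{k-r+1}(a) \\ a - \tfrac{1}{2} \end{pmatrix}d\tilde{W}_u(a) \qquad (A9)$$

Note that $d\tilde{W}_u(a)$ in (A9) may be replaced by $dW_u(a)$, $a \in [0, 1]$. Combining (A8) and (A9) gives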
For the remaining cases, we need only make minor modifications to the proof for Case IV. In Case I, $\delta = (\beta_y^{\perp}, \beta^{\perp})$ with $(\beta, \beta_y^{\perp}, \beta^{\perp})$ a basis for $R^{k+1}$ and $B_T = \delta$. For Case II, where $Z^{*}_{-1} = (\iota_T, Z_{-1})$, we have

$$\beta_{*} = \begin{pmatrix} -\mu' \\ I_{k+1} \end{pmatrix}\beta$$

and, consequently, we define $\xi$ as in Case IV,

$$\delta = \begin{pmatrix} -\mu' \\ I_{k+1} \end{pmatrix}(\beta_y^{\perp}, \beta^{\perp}) \quad \text{and} \quad B_T = (\delta, \xi).$$

Case III is similar to Case I as is Case V.
Proof of Corollary 3.1 Follows immediately from Theorem 3.1 by setting $r = k$.
Proof of Corollary 3.2 Follows immediately from Theorem 3.1 by setting $r = 0$.
Proof of Theorem 3.2 We provide a proof for Case V which may be simply adapted for Cases I and III. To emphasize the potential dependence of the limit distribution on nuisance parameters, the proof is initially conducted under Assumptions 1–4 together with Assumption 5a which implies $H_0^{\pi_{yy}}: \pi_{yy} = 0$ but not necessarily $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$; in particular, note that we may write $\alpha_y^{\perp} = (1, -\phi')'$ for some $k$-vector $\phi$. The t-statistic for $H_0^{\pi_{yy}}: \pi_{yy} = 0$ may be expressed as the square root of
0P
y 0 0
A0T Ẑ01 P
Z ,X̂1

Ẑ 1 A T A Ẑ P
T 1 Z Ẑ 1 A T Z ,X̂1 y/ωO uu A10
where AT T1/2 b, T1/2 BT and BT D b? ?

y , b . Note that only the diagonal element of the
?
inverse in (A10) corresponding to by is relevant, which implies that we only need to consider
the blocks T2 B0T Ẑ01 P 1 0 0
Z Ẑ1 BT and T BT Ẑ1 P Z ,X̂1 y in (A10). Therefore, using (A2) and
1 1 0 0
T1 û0 PX̂1 b?xx Ẑ1 BT T2 B0T Ẑ01 Ẑ1 BT T BT Ẑ1 PX̂1 b?xx û/ωuu A11
?0 0 ? 1 ? 0 0
where PX̂1 b?xx IT X̂1 b?
xx bxx X̂1 X̂1 bxx bxx X̂1 . Now,
?
T1/2 b? 0 ?0 ? ? ? 0 ? 1 ? ? 0
xx x̂[Ta] ) 0, bxx bxx [ay , a 0(by , b )] ay , a B̂kC1 a
f f ? 1 ? 0 f
D b? 0 ? ?0
xx bxx [axx 0xx lxy gyx.x bxx ] axx B̂k a
where, for convenience, but without loss of generality, we have set b? ? 0 0 ?

y D ˇyy , 0 and b D
?0 0 2 2 2 0 2 0 2 2 O2
0, bxx , lxy gxy /*yy.x , *yy.x *yy f gxy , gyx.x gyx f 0xx and B̂k a B̂k a lxy Bu a,
BO u2 a BO 1 a 20 B̂k a, a 2 [0, 1]. Hence, (A11) weakly converges to

1
1 1
1
OBu2 adWu a OBu2 aB̂2k a0 da a? ?0
xx axx
2 2
B̂k aB̂k a da a?
0
xx
0 0 0

1 2
1
1
2 OBu2 aB̂2k a0 da a?
ð a?
xx
0
B̂k adWu a ł BO u2 a2 da xx
0 0 0

1 1
1
2 2
ð a?
xx
0
B̂k aB̂k a0 da a?
xx a?
xx
0
B̂fk aBO u2 ada
0 0
Under the conditions of the theorem, f D w and l2xy D 0 and, therefore, BO u2 a[D BO uŁ a] D
0 2
1/2 O
ωuu Wu a and a? ?0 ?0 ? 1/2
xx B̂k a[D axx B̂k a] D axx Zxx axx Ŵkr a, a 2 [0, 1].
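Proof of Theorem 4.1 Again, we consider Case IV; the remaining Cases I–III and V may be dealt with similarly. Under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, Assumption 5b holds and, thus, $\Pi = \alpha_y\beta_y' + \alpha\beta'$ where $\alpha_y = (\alpha_{yy}, 0')'$ and $\beta_y = (\beta_{yy}, \beta_{yx}')'$; see above Assumption 5b. Under Assumptions 1–4 and 5b, the process $\{z_t\}_{t=1}^{\infty}$ has the infinite moving-average representation, $z_t = \mu + \gamma t + C s_t + C^{*}(L)\varepsilon_t$, where now $C \equiv \beta^{\perp}[\alpha^{\perp\prime}\Gamma\beta^{\perp}]^{-1}\alpha^{\perp\prime}$. We redefine $\beta_{*}$ and $\delta$ as the $(k + 2, r + 1)$ and $(k + 2, k - r)$ matrices,

$$\beta_{*} \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}(\beta_y, \beta) \quad \text{and} \quad \delta \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\beta^{\perp},$$

where $\beta^{\perp}$ is a $(k + 1, k - r)$ matrix whose columns are a basis for the orthogonal complement of $(\beta_y, \beta)$. Hence, $(\beta_y, \beta, \beta^{\perp})$ is a basis for $R^{k+1}$ and, thus, $(\beta_{*}, \xi, \delta)$ a basis for $R^{k+2}$, where again $\xi$ is the $(k + 2)$-unit vector $(1, 0')'$. It therefore follows that

$$T^{-1/2}\delta' z^{*}_{[Ta]} = T^{-1/2}\beta^{\perp\prime}\mu + T^{-1/2}\beta^{\perp\prime}C s_{[Ta]} + \beta^{\perp\prime}T^{-1/2}C^{*}(L)\varepsilon_{[Ta]} \Rightarrow \beta^{\perp\prime}C B_{k+1}(a)$$

Also, as above, $T^{-1}\xi' z^{*}_t = T^{-1}t \Rightarrow a$ and $\beta_{*}' z^{*}_t = (\beta_y, \beta)'\mu + (\beta_y, \beta)'C^{*}(L)\varepsilon_t = O_P(1)$.
The Wald statistic (21) multiplied by $\hat{\omega}_{uu}$ may be written as

$$\tilde{u}'P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\left(\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\right)^{-1}\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u} + 2\lambda_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u} + \lambda_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\lambda_{*}, \qquad (B1)$$

where $\lambda_{*} \equiv \beta_{*}(\alpha_y, \alpha)'(1, -\omega')'$, $A_T \equiv T^{-1/2}(\beta_{*}, T^{-1/2}B_T)$ and $B_T \equiv (\delta, T^{-1/2}\xi)$. Note that (A6) continues to hold under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$. A similar argument to that in the Proof of Theorem 3.1 demonstrates that the first term in (B1) divided by $\omega_{uu}$ has the limiting representation

$$z_{r+1}'z_{r+1} + \int_0^1 dW_u(a)F_{k-r}(a)'\left(\int_0^1 F_{k-r}(a)F_{k-r}(a)'\,da\right)^{-1}\int_0^1 F_{k-r}(a)\,dW_u(a) \qquad (B2)$$

where $z_{r+1} \sim N(0, I_{r+1})$, $F_{k-r}(a) = (\tilde{W}_{k-r}(a)', a - \tfrac{1}{2})'$ and $\tilde{W}_{k-r}(a) \equiv (\alpha^{\perp\prime}\Omega\alpha^{\perp})^{-1/2}\alpha^{\perp\prime}\tilde{B}_{k+1}(a)$ is a $(k - r)$-vector of de-meaned independent standard Brownian motions independent of the standard Brownian motion $W_u(a)$, $a \in [0, 1]$; cf. (22). Now, $\int_0^1 F_{k-r}(a)\,dW_u(a)$ is mixed normal with conditional variance matrix $\int_0^1 F_{k-r}(a)F_{k-r}(a)'\,da$. Therefore, the second term in (B2) is unconditionally distributed as a $\chi^2(k - r)$ random variable and is independent of the first term; cf. (A4). Hence, the first term in (B1) divided by $\omega_{uu}$ has a limiting $\chi^2(k + 1)$ distribution.
The second term in (B1) may be written as

$$2(1, -\omega')(\alpha_y, \alpha)\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u} = 2T^{1/2}(1, -\omega')(\alpha_y, \alpha)\left(T^{-1/2}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}\right) = O_P(T^{1/2}), \qquad (B3)$$

and the third term as

$$(1, -\omega')(\alpha_y, \alpha)\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}(\alpha_y, \alpha)'(1, -\omega')'$$
$$= T(1, -\omega')(\alpha_y, \alpha)\left(T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\right)(\alpha_y, \alpha)'(1, -\omega')' = O_P(T), \qquad (B4)$$

as $T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}$ converges in probability to a positive definite matrix. Moreover, as $(1, -\omega')(\alpha_y, \alpha) \neq 0'$ under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, the Theorem is proved.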
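Proof of Theorem 4.2 A similar decomposition to (B1) for the Wald statistic (21) holds under $H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$ except that $\beta_{*}$ and $\delta$ are now as defined in the Proof of Theorem 3.1. Although $H_0^{\pi_{yy}}: \pi_{yy} = 0$ holds, we have $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$. Therefore, as in Theorem 3.2, note that we may write $\alpha_y^{\perp} = (1, -\phi')'$ for some $k$-vector $\phi \neq \omega$. Consequently, the first term divided by $\omega_{uu}$ may be written as

$$T^{-1}\tilde{u}'P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\left(T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\right)^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}/\omega_{uu}$$
$$\quad + T^{-2}\tilde{u}'\tilde{Z}^{*}_{-1}B_T\left(T^{-2}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{Z}^{*}_{-1}B_T\right)^{-1}B_T'\tilde{Z}^{*\prime}_{-1}\tilde{u}/\omega_{uu} + o_P(1) \qquad (B5)$$

cf. (A7). As in the Proof of Theorem 3.1, the first term of (B5) has the limiting representation $z_r'z_r$ where $z_r \sim N(0, I_r)$; cf. (22). The second term of (B5) has the limiting representation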
Q2  1

1 Bu a 0
1 BQ u2 a Q2
Bu a 0
dBQ uŁ a a? 0
xx B̃k a
 a? 0
xx B̃k a a? 0
xx B̃k a da
1 1
0 a 2 0 a 2 a 12

1 BQ u2 a
ð a? 0
xx B̃k a dBQ uŁ a/ωuu D OP 1
0 1
a 2
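where $\tilde{B}_u^{\phi}(a) \equiv \tilde{B}_1(a) - \phi'\tilde{B}_k(a)$, $a \in [0, 1]$; cf. Proof of Theorem 3.2. The second term of (B1) becomes

$$2(1, -\omega')\alpha\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u} = 2T^{1/2}(1, -\omega')\alpha\left(T^{-1/2}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{u}\right) = O_P(T^{1/2})$$

and the third term

$$(1, -\omega')\alpha\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\alpha'(1, -\omega')' = T(1, -\omega')\alpha \times \left(T^{-1}\beta_{*}'\tilde{Z}^{*\prime}_{-1}P_{\Delta\tilde{Z}_{-}}\tilde{Z}^{*}_{-1}\beta_{*}\right)\alpha'(1, -\omega')' = O_P(T)$$

The Theorem follows as $(1, -\omega')\alpha \neq 0'$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ and $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq 0'$.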
Proof of Theorem 4.3 We concentrate on Case IV; the remaining Cases I–III and V are
proved by a similar argument. Let fztT gTtD1 denote the process under H1T of (26). Hence,
8LztT m gt D xtT , where xtT 5T 5[zt1T m gt 1] C et and 5T 5 is
given in (27). Therefore, ztT ) gt D CxtT C CŁ LxtT , Cz D C C 1 zCŁ z and
?
C D b? ? ? ? 0 ? 1 ? ? 0
y , b [ay , a 0(by , b )] ay , a , and thus,
[IkC1 IkC1 C T1 Cay b0y L]ztT m gt D CetT C CŁ LxtT B6
where

dyx
etT T1/2 b0 [zt1T m gt 1] C et , t D 1, . . . , T, T D 1, 2, . . .
dxx

s1
i
ztT D IkC1 C T1 Cay b0y s zsT m gs C m C gt C IkC1 C T1 Cay b0y
iD0
Ł
ð[CetiT C C LxtiT ]
Note that xtT D 5T 5[zt1T m gt 1] C et . It therefore follows that T1/2 d0 zŁ[Ta]T
a
) b? ? 0 Ł 0 0
y , b CJkC1 a, where d is defined above Lemma A.1 and ztT D t, ztT , JkC1 a 0 exp
0
fay by Ca rgdBkC1 r is an Ornstein-Uhlenbeck process and BkC1 a is a k C 1-vector Brow-
nian motion with variance matrix Z, a 2 [0, 1]; cf. Johansen (1995, Theorem 14.1, p. 202).
Similarly to (A4),
1 0 Ł0 Ł
T bŁ Z̃1 PZ Z̃1 bŁ 00
A0T Z̃01 P Z̃ A
Z 1 T D 0 C oP 1
Therefore, expression (B1) for the Wald statistic (21) multiplied by ωO uu is revised to
1
% 0 yP
ωO uu W D T1  Ł
T 1 0 Ł0 Ł 0
b0Ł Z̃Ł1 P
Z

Z̃ 1 b Ł b Ł Z̃ 1 P Z

Z̃ 1 b Ł Z y
1
C T2  % 0 yP Ł
T 2 0 Ł0 Ł 0
B0T Z̃Ł1 P
Z̃
Z 1 T

B B Z̃ Z̃
T 1 1 T B Z y C oP 1 B7

1
1 0 Ł0 0
T1 ũ0 P Ł
Ł
Z Z̃1 bŁ b0Ł Z̃Ł1 PZ ũ
1
1 0 Ł0 0 Ł0
C 2T1 ũ0 P Z̃
Z 1
Ł
b Ł T b Z̃ P
Ł 1  Z̃
Z 1
Ł
b Ł b0Ł Z̃Ł1 P Ł
Z Z̃1 pyT
0
1
1 0 Ł0 0 Ł0
C T1 pŁyT Z̃Ł1 P Ł
Ł
Z Z̃1 bŁ b0Ł Z̃Ł1 P Ł
Z Z̃1 pyT B8
where pŁyT T1 ˛yy b0yŁ C T1/2 dyx w0 dxx b0Ł . Defining h dyx w0 dxx 0 , consider
0 0 0
T1/2 b0Ł Z̃Ł1 P Ł Ł
Z Z̃1 pyT D T
1/2 0 Ł Ł 1
Z Z̃1 byŁ ˛yy T C bŁ hT
bŁ Z̃1 P 1/2

0
D T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ h C oP 1 B9
where we have made use of T1/2 b0yŁ zŁ[Ta]T ) b0y CJkC1 a. Therefore, (B8) divided by ωuu may be
re-expressed as
0
0
1/2 0 Ł0
T1/2 b0Ł Z̃Ł1 P
Z ũ C Qh Q 1
T b Z̃
Ł 1 P Z ũ C Qh /ωuu C oP 1 D z0r zr C oP 1
B9
1 0 Ł0 Ł 1/2
where Q p limT!1 T bŁ Z̃1 P Z Z̃1 bŁ and zr ¾ NQ h, Ir .
As Ł Ł0 0
T1 B0T Z̃Ł1 P 1 0 Ł0 Ł Ł0
P Z y D P Z Z̃1 pyT C ũ, Z0 y D T B0 T Z̃1 PZ Z̃1 pyT C ũ.
Consider the second term in (B7), in particular, T1 B0T Z̃Ł1 P Ł Ł
Z Z̃1 pyT which after substitution
Ł
for pyT becomes
0 0 0
T2 B0T Z̃Ł1 P Ł
Z Z̃1 byŁ ˛yy C T
3/2 0 Ł
BT Z̃1 P Ł 2 0 Ł
Z Z̃1 bŁ h D T BT Z̃1 P
Ł
Z Z̃1 byŁ ˛yy C oP 1

1 ? ? 0
by , b CJ̃kC1 a
) 1 J̃kC1 a0 C0 by ˛yy da
0 a 2
Therefore,

1
0
b? ? 0
y , b CJ̃kC1 a 1/2 Q
T1 B0T Z̃Ł1 P
Z y ) ωuu dWu a C J̃kC1 a0 C0 by ˛yy da

0 a 12
Consider
J̃ŁkrC1 a[D JQ Łu a, J̃Łkr a0 0 ] [a? ? 0 ?

y , a Zay , a ]
? 1/2 ?
ay , a? 0 J̃kC1 a
1/2 Q
ωuu Ju a
D
a? 0 ? 1/2 ? 0
xx xx axx axx J̃k a
where JQ u a D JQ 1 a w0 J̃k a is independent of J̃k a and J̃kC1 a JQ 1 a, J̃k a0 0 , a 2 [0, 1].
Now, J̃ŁkrC1
a satisfies the stochastic integral and differential equations, J̃ŁkrC1 a D W̃krC1
0 a Ł
a C ab 0 J̃krC1 r dr and dJ̃krC1 a D dW̃krC1 a C ab0 J̃ŁkrC1 a da, where a D [a?
Ł ? 0
y ,a
? ? 1/2 ? ? 1/2
ay , a ] ay , a ay and b D [ay , a Zay , a ] ð [by , b 0ay , a ] by , b? 0
? 0 ? ? 0 ? ? ? 0 ? ? 1 ?
by ; cf. Johansen (1995, Theorem 14.4, p. 207). Note that the first element of J̃ŁkrC1 a satisfies
QJŁu a D WQ u a C ωuu 0 a Ł
1/2
˛yy b 0 J̃krC1 r dr and dJQ Łu a D dWQ u a C ωuu
1/2
˛yy b0 JQ ŁkrC1 a da.
Therefore,

1
1 0
b? ? 0
y , b CJ̃kC1 a 1/2 Q Ł
T B0T Z̃Ł1 P
Z Y ) ωuu dJu a
0 a 12
Hence, the second term in (B7) weakly converges to

1
1 1
1
ωuu dJQ Łu aFkrC1 a0 FkrC1 aFkrC1 a da 0
FkrC1 a dJQ Łu a B10
0 0 0
where FkrC1 a D J̃ŁkrC1 a0 , a 12 0 .

Combining (B9) and (B10) gives the result stated in Theorem 4.3 as ωO uu ωuu D OP 1 under
H1T of (26) and noting dJQ Łu a may be replaced by dJŁu a.
Proof of Theorem 4.4 We consider Case V; the remaining Cases I and III may be dealt with
!
similarly. Under H1 yy : !yy 6D 0, from (10), ŷ1 D X̂1 q C v̂1 , where v̂1 P Z ,X̂1 v1 and
0 0
v1 D 0, v1 , . . . , vT1 . Therefore, ŷ1 P 0 0
Z ,X̂1 y D v̂1 P
Z ,X̂1 Y and ŷ1 P
Z ,X̂1 ŷ1 D
0
Z ,X̂1 v̂1 .
v̂1 P
As in Appendix A,
T1/2 b? 0
xx x[Ta] D T
1/2 ? 0
bxx mx C T1/2 b? 0
xx gx t C T
1/2 ? 0 ?
bxx bxx a?0 0b? 1 a?0 s[Ta]
C 0, b? 0
xx T
1/2 Ł
C Le[Ta]
and noting that b0xx b? 0

xx D 0, bxx xt D T
1/2 0
bxx mx C T1/2 b0xx gx t C 0, b0xx CŁ Let . Consequently,
1 0 0
T bxx X̂1 PZ X̂1 bxx 00
A0xT X̂01 P X̂
Z 1 xTA D 0 0 C oP 1
0 T2 b? xx X̂1 P
?
Z X̂1 bxx
where AxT T1/2 bxx , T1/2 b? xx .

D OP 1, T1 Z
Now, because T1 b0xx X̂01 v̂1 D OP 1, T1 b0xx X̂01 Z 0 Z D OP 1 and

0
v̂1 D OP 1, hence T1 b0 X̂0 P
T1 Z 1 ? 0 0
xx 1 Z v̂1 D OP 1. Also because T bxx X̂1 v̂1 D OP 1
0 0 1 ? 0 0
and T1 b? xx X̂1 Z D OP 1, hence T bxx X̂1 P Z v̂1 D OP 1; cf. (A3). Hence, noting that
T1 b0xx X̂01 P X̂ b
Z 1 xx D O P 1 and T2 ? 0 0
b X̂
xx 1 P ?
Z X̂1 bxx D OP 1,
T1 ŷ01 P
Z ŷ1 D T1 v̂01 P
Z v̂1 T1 v̂01 P
Z ? v̂1 C oP 1
,X̂1 ,X̂1 bxx ,X̂1 bxx
D T1 v̂01 P
Z v̂1 C oP 1
,X̂1 bxx
0 0 1 0 0
where P
Z ,X̂1 bxx P
Z PZ X̂1 bxx bxx X̂1 P
Z X̂1 bxx bxx X̂1 P
Z and P
Z ,X̂1 b?xx
? ?0 0 ? 1 ? 0 0 1 0
P
Z . Therefore, as T v̂1 v̂1 D OP 1,

T1 ŷ01 P
Z ŷ1 D OP 1 B11
,X̂1
The numerator of t!yy of (24) may be written as ŷ01 P û C v̂01 P D v̂0 P

y
Z Z ,X̂1
,X̂1
1 Z ,X̂1
0 0 0 1/2 0 0 1/2 0
Ẑ1 l, where l by , bay , a 1, w . Because T bxx X̂1 û D OP 1 and T Z û D
OP 1, T1/2 b0xx X̂01 P 1 ? 0 0 1 ? 0 0

Z û D OP 1, and, as T bxx X̂1 û D OP 1, T bxx X̂1 P
Z û D OP 1.
Therefore,
T1/2 v̂01 P
Z û D T1/2 v̂01 P
Z û T1/2 v̂01 P
Z ? û C oP 1
D T1/2 v̂01 P
Z û C oP 1 D OP 1
,X̂1 bxx
D OP 1, T1 l0 Ẑ0

noting T1/2 v̂01 û D OP 1. Similarly, as 1, w0 ay , a 6D 00 , T1 l0 Ẑ01 Z 1
1 0 0 ?
X̂1 bxx D OP 1 and T l Ẑ1 X̂1 bxx D OP 1. Therefore,
T1 v̂01 P
Z Ẑ1 l D T1 v̂01 P
Z Ẑ1 l T1 v̂01 P
Z ? Ẑ1 l C oP 1
D T1 v̂01 P
Z Ẑ1 l C oP 1 D OP 1
,X̂1 bxx
noting T1 v̂01 Ẑ1 l D OP 1. Thus,
T1/2 v̂01 P
Z Ẑ1 l D OP T1/2 . B12
,X̂1
Because $\hat{\omega}_{uu} - \omega_{uu} = o_P(1)$, combining (B11) and (B12) yields the desired result.
ACKNOWLEDGEMENTS
We are grateful to the Editor (David Hendry) and three anonymous referees for their helpful
comments on an earlier version of this paper. Our thanks are also owed to Michael Binder, Peter
Burridge, Clive Granger, Brian Henry, Joon-Yong Park, Ron Smith, Rod Whittaker and seminar
participants at the University of Birmingham. Partial financial support from the ESRC (grant Nos
R000233608 and R000237334) and the Isaac Newton Trust of Trinity College, Cambridge, is
gratefully acknowledged. Previous versions of this paper appeared as DAE Working Paper Series,
REFERENCES
Banerjee A, Dolado J, Galbraith JW, Hendry DF. 1993. Co-Integration, Error Correction, and the Econo-
Banerjee A, Dolado J, Mestre R. 1998. Error-correction mechanism tests for cointegration in single-equation
framework. Journal of Time Series Analysis 19: 267–283.
Banerjee A, Galbraith JW, Hendry DF, Smith GW. 1986. Exploring equilibrium relationships in economet-
rics through static models: some Monte Carlo Evidence. Oxford Bulletin of Economics and Statistics 48:
253–277.
Blanchard OJ, Summers L. 1986. Hysteresis and the European Unemployment Problem. In NBER Macroe-
conomics Annual 15–78.
Boswijk P. 1992. Cointegration, Identification and Exogeneity: Inference in Structural Error Correction
Boswijk HP. 1994. Testing for an unstable root in conditional and structural error correction models. Journal
of Econometrics 63: 37–70.
Boswijk HP. 1995. Efficient inference on cointegration parameters in structural error correction models.
Journal of Econometrics 69: 133–158.
Cavanagh CL, Elliott G, Stock JH. 1995. Inference in models with nearly integrated regressors. Econometric
Theory 11: 1131–1147.
Chan A, Savage D, Whittaker R. 1995. The new treasury model. Government Economic Series Working
Darby J, Wren-Lewis S. 1993. Is there a cointegrating vector for UK wages? Journal of Economic Studies
20: 87–115.
Dickey DA, Fuller WA. 1979. Distribution of the estimators for autoregressive time series with a unit root.
Journal of the American Statistical Association 74: 427–431.
Dickey DA, Fuller WA. 1981. Likelihood ratio statistics for autoregressive time series with a unit root.
Econometrica 49: 1057–1072.
Engle RF, Granger CWJ. 1987. Cointegration and error correction representation: estimation and testing.
Granger CWJ, Lin J-L. 1995. Causality in the long run. Econometric Theory 11: 530–536.
Hansen BE. 1995. Rethinking the univariate approach to unit root testing: using covariates to increase power.
Econometric Theory 11: 1148–1171.
Harbo I, Johansen S, Nielsen B, Rahbek A. 1998. Asymptotic inference on cointegrating rank in partial
systems. Journal of Business and Economic Statistics 16: 388–399.
Hendry DF, Pagan AR, Sargan JD. 1984. Dynamic specification. In Handbook of Econometrics (Vol. II)
Johansen S. 1991. Estimation and hypothesis testing of cointegrating vectors in Gaussian vector autoregres-
sive models. Econometrica 59: 1551–1580.
Johansen S. 1992. Cointegration in partial systems and the efficiency of single-equation analysis. Journal of
Econometrics 52: 389–402.
Johansen S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford Uni-
Kremers JJM, Ericsson NR, Dolado JJ. 1992. The power of cointegration tests. Oxford Bulletin of Economics
and Statistics 54: 325–348.
Layard R, Nickell S, Jackman R. 1991. Unemployment: Macroeconomic Performance and the Labour
Lindbeck A, Snower D. 1989. The Insider Outsider Theory of Employment and Unemployment, MIT Press:
Cambridge, MA.
Manning A. 1993. Wage bargaining and the Phillips curve: the identification and specification of aggregate
wage equations. Economic Journal 103: 98–118.
Nickell S, Andrews M. 1983. Real wages and employment in Britain. Oxford Economic Papers 35: 183–206.
Nielsen B, Rahbek A. 1998. Similarity issues in cointegration analysis. Preprint No. 7, Department of
Park JY. 1990. Testing for unit roots by variable addition. In Advances in Econometrics: Cointegration,
Pesaran MH, Pesaran B. 1997. Working with Microfit 4.0: Interactive Econometric Analysis, Oxford Univer-
sity Press: Oxford.
Pesaran MH, Shin Y. 1999. An autoregressive distributed lag modelling approach to cointegration analysis.
Chapter 11 in Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial
Pesaran MH, Shin Y, Smith RJ. 2000. Structural analysis of vector error correction models with exogenous
I(1) variables. Journal of Econometrics 97: 293–343.
Phillips AW. 1958. The relationship between unemployment and the rate of change of money wage rates in
the United Kingdom, 1861–1957. Economica 25: 283–299.
Phillips PCB, Durlauf S. 1986. Multiple time series with integrated variables. Review of Economic Studies
53: 473–496.
Phillips PCB, Ouliaris S. 1990. Asymptotic properties of residual based tests for cointegration. Econometrica
58: 165–193.
Phillips PCB, Solo V. 1992. Asymptotics for linear processes. Annals of Statistics 20: 971–1001.
Rahbek A, Mosconi R. 1999. Cointegration rank inference with stationary regressors in VAR models. The
Econometrics Journal 2: 76–91.
Sargan JD. 1964. Real wages and prices in the U.K. Econometric Analysis of National Economic Planning,
Hart PE Mills G, Whittaker JK (eds). Macmillan: New York. Reprinted in Hendry DF, Wallis KF (eds.)
Econometrics and Quantitative Economics. Basil Blackwell: Oxford; 275–314.
1097–1107.
Urbain JP. 1992. On weak exogeneity in error correction models. Oxford Bulletin of Economics and Statistics
52: 187–202.
J. Appl. Econ. 16: 289– 326 (2001)
DOI: 10.1002/jae.616

M. HASHEM PESARAN,a * YONGCHEOL SHINb AND RICHARD J. SMITHc

SUMMARY
This paper develops a new approach to the problem of testing the existence of a level relationship between
a dependent variable and a set of regressors, when it is not known with certainty whether the underlying
regressors are trend- or first-difference stationary. The proposed tests are based on standard F- and t-statistics
used to test the significance of the lagged levels of the variables in a univariate equilibrium correction
mechanism. The asymptotic distributions of these statistics are non-standard under the null hypothesis that
there exists no level relationship, irrespective of whether the regressors are I(0) or I(1). Two sets of asymptotic
critical values are provided: one when all regressors are purely I(1) and the other if they are all purely
I(0). These two sets of critical values provide a band covering all possible classifications of the regressors
into purely I(0), purely I(1) or mutually cointegrated. Accordingly, various bounds testing procedures are
proposed. It is shown that the proposed tests are consistent, and their asymptotic distributions under the null
and suitably defined local alternatives are derived. The empirical relevance of the bounds procedures is
demonstrated by a re-examination of the earnings equation included in the UK Treasury macroeconometric
model. Copyright © 2001 John Wiley & Sons, Ltd.
1. INTRODUCTION
Over the past decade considerable attention has been paid in empirical economics to testing for
the existence of relationships in levels between variables. In the main, this analysis has been
based on the use of cointegration techniques. Two principal approaches have been adopted: the
two-step residual-based procedure for testing the null of no-cointegration (see Engle and Granger,
1987; Phillips and Ouliaris, 1990) and the system-based reduced rank regression approach due to
Johansen (1991, 1995). In addition, other procedures such as the variable addition approach of Park
(1990), the residual-based procedure for testing the null of cointegration by Shin (1994), and the
stochastic common trends (system) approach of Stock and Watson (1988) have been considered.
All of these methods concentrate on cases in which the underlying variables are integrated of order
one. This inevitably involves a certain degree of pre-testing, thus introducing a further degree of
uncertainty into the analysis of levels relationships. (See, for example, Cavanagh, Elliott and Stock,
1995.)
Ł Correspondence to: M. H. Pesaran, Faculty of Economics and Politics, University of Cambridge, Sidgwick Avenue,
Copyright © 2001 John Wiley & Sons, Ltd. Received 16 February 1999
I(0), purely I(1) or mutually cointegrated. The statistic underlying our procedure is the familiar
Wald or F-statistic in a generalized Dickey–Fuller type regression used to test the significance
of lagged levels of the variables under consideration in a conditional unrestricted equilibrium
correction model (ECM). It is shown that the asymptotic distributions of both statistics are
non-standard under the null hypothesis that there exists no relationship in levels between the
included variables, irrespective of whether the regressors are purely I(0), purely I(1) or mutually
cointegrated. We establish that the proposed test is consistent and derive its asymptotic distribution
under the null and suitably defined local alternatives, again for a set of regressors which are a
mixture of I(0)/I(1) variables.
Two sets of asymptotic critical values are provided for the two polar cases which assume that all
the regressors are, on the one hand, purely I(1) and, on the other, purely I(0). Since these two sets
of critical values provide critical value bounds for all classifications of the regressors into purely
I(1), purely I(0) or mutually cointegrated, we propose a bounds testing procedure. If the computed
ratio (unemployment benefit–wage ratio) and the wedge between the ‘real product wage’ and the
‘real consumption wage’ that typically enter the earnings equation. There is another consideration
in the choice of this application. Under the influence of the seminal contributions of Phillips (1958)
the development of time series econometrics in the UK. Sargan’s work is particularly noteworthy
The relationship in levels underlying the UK Treasury’s earning equation relates real average
et al. (1991). In order to identify our model as corresponding to the bargaining theory of wage
vice versa; see Manning (1993). This assumption, of course, does not preclude the rate of change
A number of conditional ECMs in these five variables were estimated and we found that, if a
sufficiently high order is selected for the lag lengths of the included variables, the hypothesis that
there exists no relationship in levels between these variables is rejected, irrespective of whether
they are purely I(0), purely I(1) or mutually cointegrated. Given a level relationship between these
variables, the autoregressive distributed lag (ARDL) modelling approach (Pesaran and Shin, 1999)
The plan of the paper is as follows. The vector autoregressive (VAR) model which underpins
the analysis of this and later sections is set out in Section 2. This section also addresses the
issues involved in testing for the existence of relationships in levels between variables. Section 3
considers the Wald statistic (or the F-statistic) for testing the hypothesis that there exists no
level relationship between the variables under consideration and derives the associated asymptotic
theory together with that for the t-statistic of Banerjee et al. (1998). Section 4 discusses the power
properties of these tests. Section 5 describes the empirical application. Section 6 provides some
concluding remarks. The Appendices detail proofs of results given in Sections 3 and 4.
The following notation is used. The symbol $\Rightarrow$ signifies ‘weak convergence in probability
measure’, $I_m$ ‘an identity matrix of order $m$’, I($d$) ‘integrated of order $d$’, $O_P(K)$ ‘of the same
order as $K$ in probability’ and $o_P(K)$ ‘of smaller order than $K$ in probability’.

Let $\{z_t\}_{t=1}^{\infty}$ denote a $(k+1)$-vector random process. The data-generating process for $\{z_t\}_{t=1}^{\infty}$ is the VAR model of order $p$ (VAR($p$)):
$$\Phi(L)(z_t - \mu - \gamma t) = \varepsilon_t, \quad t = 1, 2, \ldots \tag{1}$$
where $L$ is the lag operator, $\mu$ and $\gamma$ are unknown $(k+1)$-vectors of intercept and trend coefficients,
the $(k+1, k+1)$ matrix lag polynomial $\Phi(L) = I_{k+1} - \sum_{i=1}^{p}\Phi_i L^i$ with $\{\Phi_i\}_{i=1}^{p}$ $(k+1, k+1)$
matrices of unknown coefficients; see Harbo et al. (1998) and Pesaran, Shin and Smith (2000),
henceforth HJNR and PSS respectively. The properties of the $(k+1)$-vector error process $\{\varepsilon_t\}_{t=1}^{\infty}$
are given in Assumption 2 below. All the analysis of this paper is conducted given the initial
observations $Z_0 \equiv (z_{1-p}, \ldots, z_0)$. We assume:
Assumption 1. The roots of $|I_{k+1} - \sum_{i=1}^{p}\Phi_i z^i| = 0$ are either outside the unit circle $|z| = 1$ or
satisfy $z = 1$.
Assumption 2. The vector error process $\{\varepsilon_t\}_{t=1}^{\infty}$ is $IN(\mathbf{0}, \Omega)$, $\Omega$ positive definite.
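As a rough numerical illustration of Assumption 1, the roots condition can be checked through the companion matrix of the VAR: each non-zero companion eigenvalue is the reciprocal of a root $z$, so the assumption requires every eigenvalue to lie strictly inside the unit circle or to equal one. The function name, the tolerance and the example coefficient matrices below are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def satisfies_assumption_1(phis, tol=1e-8):
    """Return True if every root of |I - sum_i Phi_i z^i| = 0 lies outside
    the unit circle or equals one (checked via companion-matrix eigenvalues)."""
    n = phis[0].shape[0]
    p = len(phis)
    companion = np.zeros((n * p, n * p))
    companion[:n, :] = np.hstack(phis)            # first block row: Phi_1, ..., Phi_p
    if p > 1:
        companion[n:, :-n] = np.eye(n * (p - 1))  # identity blocks below the first row
    eigenvalues = np.linalg.eigvals(companion)
    # A non-zero eigenvalue lam corresponds to the root z = 1 / lam.
    inside = np.abs(eigenvalues) < 1 - tol        # root outside the unit circle
    at_one = np.abs(eigenvalues - 1) < tol        # root at z = 1
    return bool(np.all(inside | at_one))

# Hypothetical VAR(1) coefficient matrices, chosen only for illustration.
random_walks = [np.eye(2)]        # two independent I(1) series
stationary = [0.5 * np.eye(2)]    # two independent I(0) series
explosive = [1.1 * np.eye(2)]     # roots inside the unit circle
seasonal = [-np.eye(2)]           # roots at z = -1

check_rw = satisfies_assumption_1(random_walks)
check_st = satisfies_assumption_1(stationary)
check_ex = satisfies_assumption_1(explosive)
check_se = satisfies_assumption_1(seasonal)
```

With repeated unit roots that are not diagonalisable, the eigenvalues are computed less accurately, so the tolerance may need loosening in practice.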
Assumption 1 permits the elements of $z_t$ to be purely I(1), purely I(0) or cointegrated but excludes
the possibility of seasonal unit roots and explosive roots.$^{1}$ Assumption 2 may be relaxed somewhat
to permit $\{\varepsilon_t\}_{t=1}^{\infty}$ to be a conditionally mean zero and homoscedastic process; see, for example,
We may re-express the lag polynomial $\Phi(L)$ in vector equilibrium correction model (ECM)
form; i.e. $\Phi(L) \equiv -\Pi L + \Gamma(L)(1 - L)$ in which the long-run multiplier matrix is defined by
$\Pi \equiv -(I_{k+1} - \sum_{i=1}^{p}\Phi_i)$, and the short-run response matrix lag polynomial
$\Gamma(L) \equiv I_{k+1} - \sum_{i=1}^{p-1}\Gamma_i L^i$, $\Gamma_i = -\sum_{j=i+1}^{p}\Phi_j$, $i = 1, \ldots, p-1$. Hence, the VAR($p$) model (1) may be rewritten in vector
ECM form as
$$\Delta z_t = a_0 + a_1 t + \Pi z_{t-1} + \sum_{i=1}^{p-1}\Gamma_i \Delta z_{t-i} + \varepsilon_t, \quad t = 1, 2, \ldots \tag{2}$$
where $\Delta \equiv 1 - L$ is the difference operator,
$$a_0 \equiv -\Pi\mu + (\Gamma + \Pi)\gamma, \quad a_1 \equiv -\Pi\gamma \tag{3}$$
and the sum of the short-run coefficient matrices $\Gamma \equiv I_{k+1} - \sum_{i=1}^{p-1}\Gamma_i = -\Pi + \sum_{i=1}^{p} i\Phi_i$. As
detailed in PSS, Section 2, if $\gamma \neq \mathbf{0}$, the resultant constraints (3) on the trend coefficients $a_1$
in (2) ensure that the deterministic trending behaviour of the level process $\{z_t\}_{t=1}^{\infty}$ is invariant to
the (cointegrating) rank of $\Pi$; a similar result holds for the intercept of $\{z_t\}_{t=1}^{\infty}$ if $\mu \neq \mathbf{0}$ and $\gamma = \mathbf{0}$.
$^{1}$ Assumptions 5a and 5b below further restrict the maximal order of integration of $\{z_t\}_{t=1}^{\infty}$ to unity.
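The mapping from the VAR coefficient matrices to the vector ECM parameters can be sketched in a few lines. This is a minimal illustration, assuming the matrices $\Phi_1, \ldots, \Phi_p$ are supplied as a list of NumPy arrays; the function name and the numerical values are hypothetical.

```python
import numpy as np

def var_to_vecm(phis):
    """Map VAR matrices Phi_1..Phi_p to (Pi, [Gamma_1..Gamma_{p-1}])."""
    n = phis[0].shape[0]
    p = len(phis)
    # Long-run multiplier matrix: Pi = -(I - sum_i Phi_i).
    pi = -(np.eye(n) - sum(phis))
    # Short-run matrices: Gamma_i = -sum_{j=i+1}^{p} Phi_j (phis is 0-indexed).
    gammas = [-sum(phis[j] for j in range(i, p)) for i in range(1, p)]
    return pi, gammas

# Hypothetical VAR(2) coefficients, used only to exercise the identities.
phi1 = np.array([[0.6, 0.1], [0.0, 0.9]])
phi2 = np.array([[0.2, 0.0], [0.1, 0.1]])
pi, gammas = var_to_vecm([phi1, phi2])

# Sum of short-run matrices: Gamma = I - sum_i Gamma_i = -Pi + sum_i i * Phi_i.
gamma_total = np.eye(2) - sum(gammas)
```

The last line gives a quick consistency check on the two expressions for $\Gamma$ stated in the text.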
Consequently, critical regions defined in terms of the Wald and F-statistics suggested below are
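The focus of this paper is on the conditional modelling of the scalar variable $y_t$ given the $k$-vector $x_t$
and the past values $\{z_{t-i}\}_{i=1}^{t-1}$ and $Z_0$, where we have partitioned $z_t = (y_t, x_t')'$. Partitioning
the error term $\varepsilon_t$ conformably with $z_t = (y_t, x_t')'$ as $\varepsilon_t = (\varepsilon_{yt}, \varepsilon_{xt}')'$ and its variance matrix as
$$\Omega = \begin{pmatrix} \omega_{yy} & \omega_{yx} \\ \omega_{xy} & \Omega_{xx} \end{pmatrix}$$
we may express $\varepsilon_{yt}$ conditionally in terms of $\varepsilon_{xt}$ as
$$\varepsilon_{yt} = \omega_{yx}\Omega_{xx}^{-1}\varepsilon_{xt} + u_t \tag{4}$$
where $u_t \sim IN(0, \omega_{uu})$, $\omega_{uu} \equiv \omega_{yy} - \omega_{yx}\Omega_{xx}^{-1}\omega_{xy}$ and $u_t$ is independent of $\varepsilon_{xt}$. Substitution of (4)
into (2) together with a similar partitioning of $a_0 = (a_{y0}, a_{x0}')'$, $a_1 = (a_{y1}, a_{x1}')'$, $\Pi = (\pi_y', \Pi_x')'$,
$\Gamma = (\gamma_y', \Gamma_x')'$, $\Gamma_i = (\gamma_{yi}', \Gamma_{xi}')'$, $i = 1, \ldots, p-1$, provides a conditional model for $\Delta y_t$ in terms of
$z_{t-1}$, $\Delta x_t$, $\Delta z_{t-1}, \ldots$; i.e. the conditional ECM
$$\Delta y_t = c_0 + c_1 t + \pi_{y.x} z_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t, \quad t = 1, 2, \ldots \tag{5}$$
where $\omega \equiv \Omega_{xx}^{-1}\omega_{xy}$, $c_0 \equiv a_{y0} - \omega' a_{x0}$, $c_1 \equiv a_{y1} - \omega' a_{x1}$, $\psi_i' \equiv \gamma_{yi} - \omega'\Gamma_{xi}$, $i = 1, \ldots, p-1$, and
$\pi_{y.x} \equiv \pi_y - \omega'\Pi_x$. The deterministic relations (3) are modified to
$$c_0 = -\pi_{y.x}\mu + (\gamma_{y.x} + \pi_{y.x})\gamma, \quad c_1 = -\pi_{y.x}\gamma \tag{6}$$
where $\gamma_{y.x} \equiv \gamma_y - \omega'\Gamma_x$.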
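The conditioning step can be made concrete with a small numerical sketch. It assumes the first variable in the system is $y_t$ and the remaining $k$ are $x_t$; the function name and the example matrices are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def conditional_partition(omega_mat, pi_mat):
    """Return (omega, omega_uu, pi_y_x) for a system ordered as (y, x')'."""
    w_yy = omega_mat[0, 0]
    w_xy = omega_mat[1:, 0]
    o_xx = omega_mat[1:, 1:]
    omega = np.linalg.solve(o_xx, w_xy)                # omega = Omega_xx^{-1} omega_xy
    omega_uu = w_yy - w_xy @ omega                     # variance of u_t in (4)
    pi_y_x = pi_mat[0, :] - omega @ pi_mat[1:, :]      # pi_{y.x} = pi_y - omega' Pi_x
    return omega, omega_uu, pi_y_x

# Hypothetical values with k = 1.
omega_mat = np.array([[2.0, 1.0], [1.0, 1.0]])
pi_mat = np.array([[-0.5, 0.2], [0.0, -0.1]])
omega, omega_uu, pi_y_x = conditional_partition(omega_mat, pi_mat)
```

Here the zero in the lower-left corner of `pi_mat` is deliberate: it corresponds to the restriction that the lagged level of $y_t$ does not enter the equations for $x_t$.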
We now partition the long-run multiplier matrix $\Pi$ conformably with $z_t = (y_t, x_t')'$ as
$$\Pi = \begin{pmatrix} \pi_{yy} & \pi_{yx} \\ \pi_{xy} & \Pi_{xx} \end{pmatrix}$$
Assumption 3. The $k$-vector $\pi_{xy} = \mathbf{0}$.
In the application of Section 6, Assumption 3 is an identifying assumption for the bargaining


$$\Delta x_t = a_{x0} + a_{x1} t + \Pi_{xx} x_{t-1} + \sum_{i=1}^{p-1}\Gamma_{xi}\Delta z_{t-i} + \varepsilon_{xt}, \quad t = 1, 2, \ldots \tag{7}$$
Thus, we may regard the process $\{x_t\}_{t=1}^{\infty}$ as long-run forcing for $\{y_t\}_{t=1}^{\infty}$ as there is no feedback
from the level of $y_t$ in (7); see Granger and Lin (1995).$^{3}$ Assumption 3 restricts consideration to
cases in which there exists at most one conditional level relationship between $y_t$ and $x_t$, irrespective
of the level of integration of the process $\{x_t\}_{t=1}^{\infty}$; see (10) below.$^{4}$

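$$\Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{8}$$
$t = 1, 2, \ldots$, where
$$c_0 = -(\pi_{yy}, \pi_{yx.x})\mu + [\gamma_{y.x} + (\pi_{yy}, \pi_{yx.x})]\gamma, \quad c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma \tag{9}$$
and $\pi_{yx.x} \equiv \pi_{yx} - \omega'\Pi_{xx}$.$^{5}$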
the system.
Assumption 4. The matrix $\Pi_{xx}$ has rank $r$, $0 \le r \le k$.
Under Assumption 4, from (7), we may express $\Pi_{xx}$ as $\Pi_{xx} = \alpha_{xx}\beta_{xx}'$, where $\alpha_{xx}$ and $\beta_{xx}$ are both
$(k, r)$ matrices of full column rank; see, for example, Engle and Granger (1987) and Johansen
1, 3 and 4, the process $\{x_t\}_{t=1}^{\infty}$ is mutually cointegrated of order $r$, $0 \le r \le k$. However, in
concentrate on the case $r = 0$, we do not wish to impose an a priori specification of $r$.$^{6}$ When
$\pi_{xy} = \mathbf{0}$ and $\Pi_{xx} = \mathbf{0}$, then $x_t$ is weakly exogenous for $\pi_{yy}$ and $\pi_{yx.x} = \pi_{yx}$ in (8); see, for example,
$^{3}$ Note that this restriction does not preclude $\{y_t\}_{t=1}^{\infty}$ being Granger-causal for $\{x_t\}_{t=1}^{\infty}$ in the short run.
$^{4}$ Assumption 3 may be straightforwardly assessed via a test for the exclusion of the lagged level $y_{t-1}$ in (7). The
asymptotic properties of such a test are the subject of current research.
$^{5}$ PSS and HJNR consider a similar model but where $x_t$ is purely I(1); that is, under the additional assumption $\Pi_{xx} = \mathbf{0}$.
If current and lagged values of a weakly exogenous purely I(0) vector $w_t$ are included as additional explanatory variables
in (8), the lagged level vector $x_{t-1}$ should be augmented to include the cumulated sum $\sum_{s=1}^{t-1} w_s$ in order to preserve the
asymptotic similarity of the statistics discussed below. See PSS, sub-section 4.3, and Rahbek and Mosconi (1999).
$^{6}$ BDM, pp. 277–278, also briefly discuss the case when $0 < r \le k$. However, in this circumstance, as will become clear
below, the validity of the limiting distributional results for their procedure requires the imposition of further implicit and
Johansen (1995, Theorem 8.1, p. 122). In the more general case where $\Pi_{xx}$ is non-zero, as $\pi_{yy}$ and
$\pi_{yx.x} = \pi_{yx} - \omega'\Pi_{xx}$ are variation-free from the parameters in (7), $x_t$ is also weakly exogenous for
matrix $\Pi$ for the system (8) and (7) is $r + 1$ and the minimal cointegrating rank of $\Pi$ is $r$. The
and (7) to be unity. First, we consider the requisite conditions for the case in which $\mathrm{rank}(\Pi) = r$.
In this case, under Assumptions 1, 3 and 4, $\pi_{yy} = 0$ and $\pi_{yx} - \phi'\Pi_{xx} = \mathbf{0}'$ for some $k$-vector $\phi$.
Note that $\pi_{yx.x} = \mathbf{0}'$ implies the latter condition. Thus, under Assumptions 1, 3 and 4, $\Pi$ has rank
$r$ and is given by
$$\Pi = \begin{pmatrix} 0 & \pi_{yx} \\ \mathbf{0} & \Pi_{xx} \end{pmatrix}$$
Hence, we may express $\Pi = \alpha\beta'$ where $\alpha = (\alpha_{yx}', \alpha_{xx}')'$ and $\beta = (\mathbf{0}, \beta_{xx}')'$ are $(k+1, r)$ matrices of
full column rank; cf. HJNR, p. 390. Let the columns of the $(k+1, k-r+1)$ matrices $(\alpha_y^{\perp}, \alpha^{\perp})$
and $(\beta_y^{\perp}, \beta^{\perp})$, where $\alpha_y^{\perp}$, $\beta_y^{\perp}$ and $\alpha^{\perp}$, $\beta^{\perp}$ are respectively $(k+1)$-vectors and $(k+1, k-r)$
matrices, denote bases for the orthogonal complements of respectively $\alpha$ and $\beta$; in particular,
$(\alpha_y^{\perp}, \alpha^{\perp})'\alpha = \mathbf{0}$ and $(\beta_y^{\perp}, \beta^{\perp})'\beta = \mathbf{0}$.
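Assumption 5a. If $\mathrm{rank}(\Pi) = r$, the matrix $(\alpha_y^{\perp}, \alpha^{\perp})'\Gamma(\beta_y^{\perp}, \beta^{\perp})$ is full rank $k - r + 1$, $0 \le r \le k$.

Second, if the long-run multiplier matrix $\Pi$ has rank $r + 1$, then under Assumptions 1, 3 and 4,
$\pi_{yy} \neq 0$ and $\Pi$ may be expressed as $\Pi = \alpha_y\beta_y' + \alpha\beta'$, where $\alpha_y = (\alpha_{yy}, \mathbf{0}')'$ and $\beta_y = (\beta_{yy}, \beta_{yx}')'$
are $(k+1)$-vectors, the former of which preserves Assumption 3. For this case, the columns of $\alpha^{\perp}$
and $\beta^{\perp}$ form respective bases for the orthogonal complements of $(\alpha_y, \alpha)$ and $(\beta_y, \beta)$; in particular,
$\alpha^{\perp\prime}(\alpha_y, \alpha) = \mathbf{0}$ and $\beta^{\perp\prime}(\beta_y, \beta) = \mathbf{0}$.
Assumption 5b. If $\mathrm{rank}(\Pi) = r + 1$, the matrix $\alpha^{\perp\prime}\Gamma\beta^{\perp}$ is full rank $k - r$, $0 \le r \le k$.
Assumptions 1, 3, 4 and 5a and 5b permit the two polar cases for $\{x_t\}_{t=1}^{\infty}$. First, if $\{x_t\}_{t=1}^{\infty}$ is a
purely I(0) vector process, then $\Pi_{xx}$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$, are nonsingular. Second, if $\{x_t\}_{t=1}^{\infty}$
is purely I(1), then $\Pi_{xx} = \mathbf{0}$, and, hence, $\alpha_{xx}$ and $\beta_{xx}$ are also null matrices.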
Using (A.1) in Appendix A, it is easily seen that $\pi_{y.x}(z_t - \mu - \gamma t) = \pi_{y.x}C^*(L)\varepsilon_t$, where
$\{C^*(L)\varepsilon_t\}$ is a mean zero stationary process. Therefore, under Assumptions 1, 3, 4 and 5b, that is,
$\pi_{yy} \neq 0$, it immediately follows that there exists a conditional level relationship between $y_t$ and
$x_t$ defined by
$$y_t = \theta_0 + \theta_1 t + \theta x_t + v_t, \quad t = 1, 2, \ldots \tag{10}$$
where $\theta_0 \equiv \pi_{y.x}\mu/\pi_{yy}$, $\theta_1 \equiv \pi_{y.x}\gamma/\pi_{yy}$, $\theta \equiv -\pi_{yx.x}/\pi_{yy}$ and $v_t = \pi_{y.x}C^*(L)\varepsilon_t/\pi_{yy}$, also a zero mean
stationary process. If $\pi_{yx.x} = \alpha_{yy}\beta_{yx}' + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' \neq \mathbf{0}'$, the level relationship between $y_t$
and $x_t$ is non-degenerate. Hence, from (10), $y_t \sim$ I(0) if $\mathrm{rank}(\beta_{yx}, \beta_{xx}) = r$ and $y_t \sim$ I(1) if
$\mathrm{rank}(\beta_{yx}, \beta_{xx}) = r + 1$. In the former case, $\theta$ is the vector of conditional long-run multipliers and,
in this sense, (10) may be interpreted as a conditional long-run level relationship between $y_t$ and
$x_t$, whereas, in the latter, because the processes $\{y_t\}_{t=1}^{\infty}$ and $\{x_t\}_{t=1}^{\infty}$ are cointegrated, (10) represents
the conditional long-run level relationship between $y_t$ and $x_t$. Two degenerate cases arise. First,
if $\pi_{yy} \neq 0$ and $\pi_{yx.x} = \mathbf{0}'$, clearly, from (10), $y_t$ is (trend) stationary or $y_t \sim$ I(0) whatever the
value of $r$. Consequently, the differenced variable $\Delta y_t$ depends only on its own lagged level $y_{t-1}$
in the conditional ECM (8) and not on the lagged levels $x_{t-1}$ of the forcing variables. Second, if
$\pi_{yy} = 0$, that is, Assumption 5a holds, and $\pi_{yx.x} = (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' \neq \mathbf{0}'$, as $\mathrm{rank}(\Pi) = r$, $\pi_{yx.x} =
(\phi - \omega)'\alpha_{xx}\beta_{xx}'$ which, from the above, yields $\pi_{yx.x}(x_t - \mu_x - \gamma_x t) = \pi_{y.x}C^*(L)\varepsilon_t$, $t = 1, 2, \ldots$,
where $\mu = (\mu_y, \mu_x')'$ and $\gamma = (\gamma_y, \gamma_x')'$ are partitioned conformably with $z_t = (y_t, x_t')'$. Thus, in
(8), $\Delta y_t$ depends only on the lagged level $x_{t-1}$ through the linear combination $(\phi - \omega)'\alpha_{xx}$ of the
lagged mutually cointegrating relations $\beta_{xx}'x_{t-1}$ for the process $\{x_t\}_{t=1}^{\infty}$. Consequently, $y_t \sim$ I(1)
whatever the value of $r$. Finally, if both $\pi_{yy} = 0$ and $\pi_{yx.x} = \mathbf{0}'$, there are no level effects in the
conditional ECM (8) with no possibility of any level relationship between $y_t$ and $x_t$, degenerate
or otherwise, and, again, $y_t \sim$ I(1) whatever the value of $r$.
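The case distinctions above can be summarised in a short helper that maps the two level coefficients of the conditional ECM into the level-relationship coefficients of (10). This is only a sketch: the function names, the labels returned and the numerical tolerance are assumptions made for illustration, and in practice the coefficients are estimated, so exact zeros would be replaced by the formal tests developed in the paper.

```python
import numpy as np

def long_run_multipliers(pi_yy, pi_yx_x):
    """theta = -pi_{yx.x} / pi_yy, defined only when pi_yy is non-zero."""
    if np.isclose(pi_yy, 0.0):
        raise ValueError("pi_yy = 0: no level relationship of the form (10)")
    return -np.asarray(pi_yx_x, dtype=float) / pi_yy

def classify_levels(pi_yy, pi_yx_x, tol=1e-12):
    """Label the four combinations of zero / non-zero level coefficients."""
    y_enters = abs(pi_yy) > tol
    x_enters = bool(np.any(np.abs(np.asarray(pi_yx_x, dtype=float)) > tol))
    if y_enters and x_enters:
        return "non-degenerate level relationship"
    if y_enters:
        return "degenerate: only lagged y enters"
    if x_enters:
        return "degenerate: only lagged x enters"
    return "no level effects"

# Hypothetical coefficient values.
theta = long_run_multipliers(-0.25, [0.5, -0.1])
label_full = classify_levels(-0.25, [0.5, -0.1])
label_none = classify_levels(0.0, [0.0, 0.0])
```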
Therefore, in order to test for the absence of level effects in the conditional ECM (8) and, more
crucially, the absence of a level relationship between $y_t$ and $x_t$, the emphasis in this paper is a
test of the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = \mathbf{0}'$ in (8).$^{7,8}$ In contradistinction, the approach of
$$\Delta y_t = c_0 + c_1 t + \alpha_{yy}(\beta_{yy} y_{t-1} + \beta_{yx}' x_{t-1}) + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}' x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{11}$$
BDM test for the exclusion of $y_{t-1}$ in (11) when $r = 0$, that is, $\beta_{xx} = \mathbf{0}$ in (11) or $\Pi_{xx} = \mathbf{0}$ in
(7) and, thus, $\{x_t\}$ is purely I(1); cf. HJNR and PSS.$^{9}$ Therefore, BDM consider the hypothesis
$\alpha_{yy} = 0$ (or $\pi_{yy} = 0$).$^{10}$ More generally, when $0 < r \le k$, BDM require the imposition of the
untested subsidiary hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = \mathbf{0}'$; that is, the limiting distribution of the BDM test
is obtained under the joint hypothesis $\pi_{yy} = 0$ and $\pi_{yx.x} = \mathbf{0}'$ in (8).
In the following sections of the paper, we focus on (8) and differentiate between five cases of
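• Case I (no intercepts; no trends) $c_0 = 0$ and $c_1 = 0$. That is, $\mu = \mathbf{0}$ and $\gamma = \mathbf{0}$. Hence, the
ECM (8) becomes
$$\Delta y_t = \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{12}$$
• Case II (restricted intercepts; no trends) $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ and $c_1 = 0$. Here, $\gamma = \mathbf{0}$. The
ECM is
$$\Delta y_t = \pi_{yy}(y_{t-1} - \mu_y) + \pi_{yx.x}(x_{t-1} - \mu_x) + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{13}$$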
$^{7}$ This joint hypothesis may be justified by the application of Roy’s union-intersection principle to tests of $\pi_{yy} = 0$
in (8) given $\pi_{yx.x}$. Let $W_{\pi_{yy}}(\pi_{yx.x})$ be the Wald statistic for testing $\pi_{yy} = 0$ for a given value of $\pi_{yx.x}$. The test
$\max_{\pi_{yx.x}} W_{\pi_{yy}}(\pi_{yx.x})$ is identical to the Wald test of $\pi_{yy} = 0$ and $\pi_{yx.x} = \mathbf{0}'$ in (8).
$^{8}$ A related approach to that of this paper is Hansen’s (1995) test for a unit root in a univariate time series which, in our
context, would require the imposition of the subsidiary hypothesis $\pi_{yx.x} = \mathbf{0}'$.
$^{9}$ The BDM test is based on earlier contributions of Kremers et al. (1992), Banerjee et al. (1993), and Boswijk (1994).
$^{10}$ Partitioning $\Gamma_{xi} = (\gamma_{xy,i}, \Gamma_{xx,i})$, $i = 1, \ldots, p-1$, conformably with $z_t = (y_t, x_t')'$, BDM also set $\gamma_{xy,i} = \mathbf{0}$, $i =
1, \ldots, p-1$, which implies $\gamma_{xy} = \mathbf{0}$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$; that is, $\Delta y_t$ does not Granger cause $\Delta x_t$.
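• Case III (unrestricted intercepts; no trends) $c_0 \neq 0$ and $c_1 = 0$. Again, $\gamma = \mathbf{0}$. Now, the
intercept restriction $c_0 = -(\pi_{yy}, \pi_{yx.x})\mu$ is ignored and the ECM is
$$\Delta y_t = c_0 + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{14}$$
• Case IV (unrestricted intercepts; restricted trends) $c_0 \neq 0$ and $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$.
$$\Delta y_t = c_0 + \pi_{yy}(y_{t-1} - \gamma_y t) + \pi_{yx.x}(x_{t-1} - \gamma_x t) + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{15}$$
• Case V (unrestricted intercepts; unrestricted trends) $c_0 \neq 0$ and $c_1 \neq 0$. Here, the deterministic
trend restriction $c_1 = -(\pi_{yy}, \pi_{yx.x})\gamma$ is ignored and the ECM is
$$\Delta y_t = c_0 + c_1 t + \pi_{yy} y_{t-1} + \pi_{yx.x} x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \omega'\Delta x_t + u_t \tag{16}$$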
It should be emphasized that the DGPs for Cases II and III are treated as identical as are those
compared with that of Dickey and Fuller (1981) for univariate models, estimation and hypothesis
testing in Cases III and V proceed ignoring the constraints linking respectively the intercept and
trend coefficient, $c_0$ and $c_1$, to the parameter vector $(\pi_{yy}, \pi_{yx.x})$ whereas Cases II and IV fully

In this section we develop bounds procedures for testing for the existence of a level relationship
between $y_t$ and $x_t$ using (12)–(16); see (10). The main approach taken here, cf. Engle and
Granger (1987) and BDM, is to test for the absence of any level relationship between $y_t$ and
$x_t$ via the exclusion of the lagged level variables $y_{t-1}$ and $x_{t-1}$ in (12)–(16). Consequently, we
define the constituent null hypotheses $H_0^{\pi_{yy}}: \pi_{yy} = 0$, $H_0^{\pi_{yx.x}}: \pi_{yx.x} = \mathbf{0}'$, and alternative hypotheses
$H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \neq \mathbf{0}'$. Hence, the joint null hypothesis of interest in (12)–(16) is
given by:
$$H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}} \tag{17}$$
and the alternative hypothesis is correspondingly stated as:
$$H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}} \tag{18}$$
However, as indicated in Section 2, not only does the alternative hypothesis $H_1$ of (18) cover the
case of interest in which $\pi_{yy} \neq 0$ and $\pi_{yx.x} \neq \mathbf{0}'$ but it also permits $\pi_{yy} \neq 0$, $\pi_{yx.x} = \mathbf{0}'$ and $\pi_{yy} = 0$
and $\pi_{yx.x} \neq \mathbf{0}'$; cf. (8). That is, the possibility of degenerate level relationships between $y_t$ and $x_t$
is admitted under $H_1$ of (18). We comment further on these alternatives at the end of this section.
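For ease of exposition, we consider Case IV and rewrite (15) in matrix notation as
$$\Delta y = \iota_T c_0 + Z^*_{-1}\pi^*_{y.x} + \Delta Z_{-}\psi + u \tag{19}$$
where $\iota_T$ is a $T$-vector of ones, $\Delta y \equiv (\Delta y_1, \ldots, \Delta y_T)'$, $\Delta X \equiv (\Delta x_1, \ldots, \Delta x_T)'$, $\Delta Z_{-i} \equiv
(\Delta z_{1-i}, \ldots, \Delta z_{T-i})'$, $i = 1, \ldots, p-1$, $\psi \equiv (\omega', \psi_1', \ldots, \psi_{p-1}')'$, $\Delta Z_{-} \equiv (\Delta X, \Delta Z_{-1}, \ldots,
\Delta Z_{1-p})$, $Z^*_{-1} \equiv (\tau_T, Z_{-1})$, $\tau_T \equiv (1, \ldots, T)'$, $Z_{-1} \equiv (z_0, \ldots, z_{T-1})'$, $u \equiv (u_1, \ldots, u_T)'$ and
$$\pi^*_{y.x} = \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\begin{pmatrix} \pi_{yy} \\ \pi_{yx.x}' \end{pmatrix}$$
The least squares (LS) estimator of $\pi^*_{y.x}$ is given by:
$$\hat{\pi}^*_{y.x} \equiv (\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_{-}}\tilde{Z}^*_{-1})^{-1}\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_{-}}\Delta\tilde{y} \tag{20}$$
where $\tilde{Z}^*_{-1} \equiv P_{\iota}Z^*_{-1}$, $\Delta\tilde{Z}_{-} \equiv P_{\iota}\Delta Z_{-}$, $\Delta\tilde{y} \equiv P_{\iota}\Delta y$, $P_{\iota} \equiv I_T - \iota_T(\iota_T'\iota_T)^{-1}\iota_T'$ and $\bar{P}_{\Delta\tilde{Z}_{-}} \equiv I_T -
\Delta\tilde{Z}_{-}(\Delta\tilde{Z}_{-}'\Delta\tilde{Z}_{-})^{-1}\Delta\tilde{Z}_{-}'$. The Wald and the F-statistics for testing the null hypothesis $H_0$ of
(17) against the alternative hypothesis $H_1$ of (18) are respectively:
$$W \equiv \hat{\pi}^{*\prime}_{y.x}\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_{-}}\tilde{Z}^*_{-1}\hat{\pi}^*_{y.x}/\hat{\omega}_{uu}, \quad F \equiv \frac{W}{k+2} \tag{21}$$
where $\hat{\omega}_{uu} \equiv (T - m)^{-1}\sum_{t=1}^{T}\tilde{u}_t^2$, $m \equiv (k+1)(p+1) + 1$ is the number of estimated coefficients
and $\tilde{u}_t$, $t = 1, 2, \ldots, T$, are the least squares (LS) residuals from (19).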
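The construction of the Wald and F-statistics can be sketched with ordinary least squares. The example below is a simplified illustration rather than the paper's own computation: it uses Case III with $p = 1$ (an intercept, the lagged levels and the current $\Delta x_t$), partials out the non-level regressors, and divides the Wald statistic by the number of level regressors supplied. The function names and the simulated data are assumptions.

```python
import numpy as np

def _residualise(a, b):
    """Residuals from the LS projection of a (vector or matrix) on the columns of b."""
    coef, *_ = np.linalg.lstsq(b, a, rcond=None)
    return a - b @ coef

def levels_wald_f(dy, levels, others):
    """Wald and F statistics for excluding `levels` from a regression of dy
    on (levels, others); `others` must already contain any intercept."""
    n_obs = dy.shape[0]
    z_tilde = _residualise(levels, others)      # lagged levels with `others` partialled out
    dy_tilde = _residualise(dy, others)
    pi_hat, *_ = np.linalg.lstsq(z_tilde, dy_tilde, rcond=None)
    resid = dy_tilde - z_tilde @ pi_hat         # same residuals as the full regression
    m = levels.shape[1] + others.shape[1]       # number of estimated coefficients
    omega_uu_hat = float(resid @ resid) / (n_obs - m)
    wald = float(pi_hat @ (z_tilde.T @ z_tilde) @ pi_hat) / omega_uu_hat
    return wald, wald / levels.shape[1]

# Illustrative data only: two independent random walks, so no level relationship.
rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(201))
x = np.cumsum(rng.standard_normal(201))
dy, dx = np.diff(y), np.diff(x)
levels = np.column_stack([y[:-1], x[:-1]])      # y_{t-1}, x_{t-1}
others = np.column_stack([np.ones(200), dx])    # intercept and current first difference of x
wald, f_stat = levels_wald_f(dy, levels, others)
```

For Case IV the time trend would be placed among the level regressors, so that the divisor becomes $k + 2$ as in (21). The resulting statistic must be compared with the non-standard critical value bounds, not with standard F tables.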
The next theorem presents the asymptotic null distribution of the Wald statistic; the limit
behaviour of the F-statistic is a simple corollary and is not presented here or subsequently.
Let $W_{k-r+1}(a) \equiv (W_u(a), W_{k-r}(a)')'$ denote a $(k-r+1)$-dimensional standard Brownian motion
partitioned into the scalar and $(k-r)$-dimensional sub-vector independent standard Brownian
motions $W_u(a)$ and $W_{k-r}(a)$, $a \in [0, 1]$. We will also require the corresponding de-meaned
$(k-r+1)$-vector standard Brownian motion $\tilde{W}_{k-r+1}(a) \equiv W_{k-r+1}(a) - \int_0^1 W_{k-r+1}(a)\,da$, and de-meaned
and de-trended $(k-r+1)$-vector standard Brownian motion $\hat{W}_{k-r+1}(a) \equiv \tilde{W}_{k-r+1}(a) -
12(a - \tfrac{1}{2})\int_0^1 (a - \tfrac{1}{2})\tilde{W}_{k-r+1}(a)\,da$, and their respective partitioned counterparts $\tilde{W}_{k-r+1}(a) =
(\tilde{W}_u(a), \tilde{W}_{k-r}(a)')'$, and $\hat{W}_{k-r+1}(a) = (\hat{W}_u(a), \hat{W}_{k-r}(a)')'$, $a \in [0, 1]$.
Theorem 3.1 (Limiting distribution of W) If Assumptions 1–4 and 5a hold, then under $H_0$:
$\pi_{yy} = 0$ and $\pi_{yx.x} = \mathbf{0}'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic $W$ of
(21) has the representation
$$W \Rightarrow z_r'z_r + \int_0^1 dW_u(a)F_{k-r+1}(a)'\left(\int_0^1 F_{k-r+1}(a)F_{k-r+1}(a)'\,da\right)^{-1}\int_0^1 F_{k-r+1}(a)\,dW_u(a) \tag{22}$$
where $z_r \sim N(\mathbf{0}, I_r)$ is distributed independently of the second term in (22) and
$$F_{k-r+1}(a) = \begin{cases} W_{k-r+1}(a) & \text{Case I} \\ (W_{k-r+1}(a)', 1)' & \text{Case II} \\ \tilde{W}_{k-r+1}(a) & \text{Case III} \\ (\tilde{W}_{k-r+1}(a)', a - \tfrac{1}{2})' & \text{Case IV} \\ \hat{W}_{k-r+1}(a) & \text{Case V} \end{cases}$$
The asymptotic distribution of the Wald statistic W of (21) depends on the dimension and cointegration rank of the forcing variables $\{x_t\}$, $k$ and $r$ respectively. In Case IV, referring to (11), the first component in (22), $z_r'z_r \sim \chi^2(r)$, corresponds to testing for the exclusion of the $r$-dimensional stationary vector $\beta_{xx}'x_{t-1}$, that is, the hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$, whereas the second term in (22), which is a non-standard Dickey–Fuller unit-root distribution, corresponds to testing for the exclusion of the $(k-r+1)$-dimensional $I(1)$ vector $(\beta_y^\perp, \beta^\perp)'z_{t-1}$ and, in Cases II and IV, the intercept and time-trend respectively or, equivalently, $\alpha_{yy} = 0$.
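We specialize Theorem 3.1 to the two polar cases in which, first, the process for the forcing variables $\{x_t\}$ is purely integrated of order zero, that is, $r = k$ and $\Pi_{xx}$ is of full rank, and, second, the $\{x_t\}$ process is not mutually cointegrated, $r = 0$, and, hence, the $\{x_t\}$ process is purely integrated of order one.

Corollary 3.1 (Limiting distribution of W if $\{x_t\} \sim I(0)$). If Assumptions 1–4 and 5a hold and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic W of (21) has the representation

$$W \Rightarrow z_k'z_k + \frac{\left(\int_0^1 F(a)\,dW_u(a)\right)^2}{\int_0^1 F(a)^2\,da} \qquad (23)$$

where $z_k \sim N(0, I_k)$ is distributed independently of the second term in (23) and

$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ (W_u(a), 1)' & \text{Case II} \\ \tilde{W}_u(a) & \text{Case III} \\ (\tilde{W}_u(a), a - \tfrac{1}{2})' & \text{Case IV} \\ \hat{W}_u(a) & \text{Case V} \end{cases}$$

$r = 0, \ldots, k$, where Cases I–V are defined in (12)–(16), $a \in [0, 1]$.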
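Corollary 3.2 (Limiting distribution of W if $\{x_t\} \sim I(1)$). If Assumptions 1–4 and 5a hold and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the Wald statistic W of (21) has the representation

$$W \Rightarrow \int_0^1 dW_u(a)F_{k+1}(a)'\left(\int_0^1 F_{k+1}(a)F_{k+1}(a)'\,da\right)^{-1}\int_0^1 F_{k+1}(a)\,dW_u(a)$$

where $F_{k+1}(a)$ is defined in Theorem 3.1 for Cases I–V, $a \in [0, 1]$.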
In practice, however, it is unlikely that one would possess a priori knowledge of the rank $r$ of $\Pi_{xx}$; that is, the cointegration rank of the forcing variables $\{x_t\}$ or, more particularly, whether $\{x_t\} \sim I(0)$ or $\{x_t\} \sim I(1)$. Long-run analysis of (12)–(16) predicated on a prior determination of the cointegration rank $r$ in (7) is prone to the possibility of a pre-test specification error; see, for example, Cavanagh et al. (1995). However, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.1 ($r = k$ and $\{x_t\} \sim I(0)$) and 3.2 ($r = 0$ and $\{x_t\} \sim I(1)$) provide lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.1 when the cointegration rank of the forcing variables $\{x_t\}$ process is $0 \le r \le k$.¹¹ Hence, these two sets of critical values provide critical value bounds covering all possible classifications of $\{x_t\}$ into $I(0)$, $I(1)$ and mutually cointegrated processes. Asymptotic critical value bounds for the F-statistics covering Cases I–V are set out in Tables CI(i)–CI(v) for sizes 0.100, 0.050, 0.025 and 0.010; the lower bound values assume that the forcing variables $\{x_t\}$ are purely $I(0)$, and the upper bound values assume that $\{x_t\}$ are purely $I(1)$.¹²
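Hence, we suggest a bounds procedure to test $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17) within the conditional ECMs (12)–(16). If the computed Wald or F-statistic falls outside the critical value bounds, a conclusive decision results without needing to know the cointegration rank $r$ of the $\{x_t\}$ process. If, however, the Wald or F-statistic falls within these bounds, inference would be inconclusive. In such circumstances, knowledge of the cointegration rank $r$ of the forcing variables $\{x_t\}$ is required to proceed further.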
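The decision rule of the bounds procedure can be sketched in a few lines. This is an illustration only: the function name `bounds_decision` and the wording of the returned labels are hypothetical, and the numbers in the usage note are the 0.05 bounds for k = 4 from Table CI(iii) and the F_III statistics reported in Table II.

```python
def bounds_decision(stat, lower, upper):
    """Classify a Wald or F statistic against a pair of critical value bounds.

    lower : bound obtained when the forcing variables are purely I(0)
    upper : bound obtained when the forcing variables are purely I(1)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if stat > upper:
        return "reject H0"          # conclusive, whatever the rank r
    if stat < lower:
        return "do not reject H0"   # conclusive, whatever the rank r
    return "inconclusive"           # the rank r would be needed


# F_III statistics for p = 4, 5, 6 against the bounds (2.86, 4.01)
for p, f_iii in [(4, 3.63), (5, 5.23), (6, 5.42)]:
    print(p, bounds_decision(f_iii, 2.86, 4.01))
```

With these inputs the statistic for p = 4 lies between the bounds, while those for p = 5 and 6 exceed the upper bound.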
The conditional ECMs (12)–(16), derived from the underlying VAR($p$) model (2), may also be interpreted as an autoregressive distributed lag model of orders $(p, p, \ldots, p)$ (ARDL$(p, \ldots, p)$). However, one could also allow for differential lag lengths on the lagged variables $y_{t-i}$ and $x_{t-i}$ in (2) to arrive at, for example, an ARDL$(p, p_1, \ldots, p_k)$ model without affecting the
one can use a flexible choice for the dynamic lag structure in (12)–(16) as well as allowing for short-run feedbacks from the lagged dependent variables, $\Delta y_{t-i}$, $i = 1, \ldots, p$, to $\Delta x_t$ in
(1992, 1995), PSS, and Urbain (1992), where it is assumed in addition that $\Pi_{xx} = 0$ or $x_t$ is purely $I(1)$ in (7).
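the deterministics given by (12), (14) and (16). Note that the restrictions on the deterministics’
but do not test the implicit hypothesis $\alpha_{yx} - \omega'\alpha_{xx} = 0'$; that is, the limiting distributional results given below are also obtained under the joint hypothesis $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17). BDM test $\alpha_{yy} = 0$ (or $H_0^{\pi_{yy}}: \pi_{yy} = 0$) via the exclusion of $y_{t-1}$ in Cases I, III and V. For example, in Case V, the t-statistic is

$$t_{\pi_{yy}} \equiv \frac{\Delta\hat{y}'\bar{P}_{\Delta\hat{Z}_-,\hat{X}_{-1}}\hat{y}_{-1}}{\hat{\omega}_{uu}^{1/2}\left(\hat{y}_{-1}'\bar{P}_{\Delta\hat{Z}_-,\hat{X}_{-1}}\hat{y}_{-1}\right)^{1/2}} \qquad (24)$$

where $\hat{\omega}_{uu}$ is defined in the line after (21), $\Delta\hat{y} \equiv P_{\iota,\tau}\Delta y$, $\hat{y}_{-1} \equiv P_{\iota,\tau}y_{-1}$, $y_{-1} \equiv (y_0, \ldots, y_{T-1})'$, $\hat{X}_{-1} \equiv P_{\iota,\tau}X_{-1}$, $X_{-1} \equiv (x_0, \ldots, x_{T-1})'$, $\Delta\hat{Z}_- \equiv P_{\iota,\tau}\Delta Z_-$, $P_{\iota,\tau} \equiv P_\iota - P_\iota\tau_T(\tau_T'P_\iota\tau_T)^{-1}\tau_T'P_\iota$, $\bar{P}_{\Delta\hat{Z}_-,\hat{X}_{-1}} = \bar{P}_{\Delta\hat{Z}_-} - \bar{P}_{\Delta\hat{Z}_-}\hat{X}_{-1}(\hat{X}_{-1}'\bar{P}_{\Delta\hat{Z}_-}\hat{X}_{-1})^{-1}\hat{X}_{-1}'\bar{P}_{\Delta\hat{Z}_-}$ and $\bar{P}_{\Delta\hat{Z}_-} \equiv I_T - \Delta\hat{Z}_-(\Delta\hat{Z}_-'\Delta\hat{Z}_-)^{-1}\Delta\hat{Z}_-'$.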

simulations with different combinations of values for $k$ and $0 \le r \le k$.
12 The critical values for the Wald version of the bounds test are given by $k + 1$ times the critical values of the F-test in Cases I, III and V, and $k + 2$ times in Cases II and IV.
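The scaling in footnote 12 can be written out as a small helper. This is a sketch under the stated rule only; the function name `wald_critical_value` and the string labels for the cases are hypothetical conveniences.

```python
def wald_critical_value(f_critical, k, case):
    """Convert an F-test critical value into the Wald-test critical value.

    The multiplier is k + 1 in Cases I, III and V and k + 2 in Cases II
    and IV, where the restricted intercept or trend adds one restriction.
    """
    if case not in {"I", "II", "III", "IV", "V"}:
        raise ValueError("case must be one of I, II, III, IV, V")
    restrictions = k + 2 if case in {"II", "IV"} else k + 1
    return restrictions * f_critical


# Example: the 0.05 lower bounds for k = 4 in Cases III and IV
print(wald_critical_value(2.86, 4, "III"))
print(wald_critical_value(3.05, 4, "IV"))
```

For k = 4 the Case III F bound of 2.86 is multiplied by 5 and the Case IV F bound of 3.05 by 6.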
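Table CI. Asymptotic critical value bounds for the F-statistic. Testing for the existence of a levels relationshipᵃ
Table CI(i) Case I: No intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)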
0 3.00 3.00 4.20 4.20 5.47 5.47 7.17 7.17 1.16 1.16 2.32 2.32
1 2.44 3.28 3.15 4.11 3.88 4.92 4.81 6.02 1.08 1.54 1.08 1.73
2 2.17 3.19 2.72 3.83 3.22 4.50 3.88 5.30 1.05 1.69 0.70 1.27
3 2.01 3.10 2.45 3.63 2.87 4.16 3.42 4.84 1.04 1.77 0.52 0.99
4 1.90 3.01 2.26 3.48 2.62 3.90 3.07 4.44 1.03 1.81 0.41 0.80
5 1.81 2.93 2.14 3.34 2.44 3.71 2.82 4.21 1.02 1.84 0.34 0.67
6 1.75 2.87 2.04 3.24 2.32 3.59 2.66 4.05 1.02 1.86 0.29 0.58
7 1.70 2.83 1.97 3.18 2.22 3.49 2.54 3.91 1.02 1.88 0.26 0.51
8 1.66 2.79 1.91 3.11 2.15 3.40 2.45 3.79 1.02 1.89 0.23 0.46
9 1.63 2.75 1.86 3.05 2.08 3.33 2.34 3.68 1.02 1.90 0.20 0.41
10 1.60 2.72 1.82 2.99 2.02 3.27 2.26 3.60 1.02 1.91 0.19 0.37
Table CI(ii) Case II: Restricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 3.80 3.80 4.60 4.60 5.39 5.39 6.44 6.44 2.03 2.03 1.77 1.77
1 3.02 3.51 3.62 4.16 4.18 4.79 4.94 5.58 1.69 2.02 1.01 1.25
2 2.63 3.35 3.10 3.87 3.55 4.38 4.13 5.00 1.52 2.02 0.69 0.96
3 2.37 3.20 2.79 3.67 3.15 4.08 3.65 4.66 1.41 2.02 0.52 0.78
4 2.20 3.09 2.56 3.49 2.88 3.87 3.29 4.37 1.34 2.01 0.42 0.65
5 2.08 3.00 2.39 3.38 2.70 3.73 3.06 4.15 1.29 2.00 0.35 0.56
6 1.99 2.94 2.27 3.28 2.55 3.61 2.88 3.99 1.26 2.00 0.30 0.49
7 1.92 2.89 2.17 3.21 2.43 3.51 2.73 3.90 1.23 2.01 0.26 0.44
8 1.85 2.85 2.11 3.15 2.33 3.42 2.62 3.77 1.21 2.01 0.23 0.40
9 1.80 2.80 2.04 3.08 2.24 3.35 2.50 3.68 1.19 2.01 0.21 0.36
10 1.76 2.77 1.98 3.04 2.18 3.28 2.41 3.61 1.17 2.00 0.19 0.33
Table CI(iii) Case III: Unrestricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 6.58 6.58 8.21 8.21 9.80 9.80 11.79 11.79 3.05 3.05 7.07 7.07
1 4.04 4.78 4.94 5.73 5.77 6.68 6.84 7.84 2.03 2.52 2.28 2.89
2 3.17 4.14 3.79 4.85 4.41 5.52 5.15 6.36 1.69 2.35 1.23 1.77
3 2.72 3.77 3.23 4.35 3.69 4.89 4.29 5.61 1.51 2.26 0.82 1.27
4 2.45 3.52 2.86 4.01 3.25 4.49 3.74 5.06 1.41 2.21 0.60 0.98
5 2.26 3.35 2.62 3.79 2.96 4.18 3.41 4.68 1.34 2.17 0.48 0.79
6 2.12 3.23 2.45 3.61 2.75 3.99 3.15 4.43 1.29 2.14 0.39 0.66
7 2.03 3.13 2.32 3.50 2.60 3.84 2.96 4.26 1.26 2.13 0.33 0.58
8 1.95 3.06 2.22 3.39 2.48 3.70 2.79 4.10 1.23 2.12 0.29 0.51
9 1.88 2.99 2.14 3.30 2.37 3.60 2.65 3.97 1.21 2.10 0.25 0.45
10 1.83 2.94 2.06 3.24 2.28 3.50 2.54 3.86 1.19 2.09 0.23 0.41
Table CI(iv) Case IV: Unrestricted intercept and restricted trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 5.37 5.37 6.29 6.29 7.14 7.14 8.26 8.26 3.17 3.17 2.68 2.68
1 4.05 4.49 4.68 5.15 5.30 5.83 6.10 6.73 2.45 2.77 1.41 1.65
2 3.38 4.02 3.88 4.61 4.37 5.16 4.99 5.85 2.09 2.57 0.92 1.20
3 2.97 3.74 3.38 4.23 3.80 4.68 4.30 5.23 1.87 2.45 0.67 0.93
4 2.68 3.53 3.05 3.97 3.40 4.36 3.81 4.92 1.72 2.37 0.51 0.76
5 2.49 3.38 2.81 3.76 3.11 4.13 3.50 4.63 1.62 2.31 0.42 0.64
6 2.33 3.25 2.63 3.62 2.90 3.94 3.27 4.39 1.54 2.27 0.35 0.55
7 2.22 3.17 2.50 3.50 2.76 3.81 3.07 4.23 1.48 2.24 0.31 0.49
8 2.13 3.09 2.38 3.41 2.62 3.70 2.93 4.06 1.44 2.22 0.27 0.44
9 2.05 3.02 2.30 3.33 2.52 3.60 2.79 3.93 1.40 2.20 0.24 0.40
10 1.98 2.97 2.21 3.25 2.42 3.52 2.68 3.84 1.36 2.18 0.22 0.36
Table CI(v) Case V: Unrestricted intercept and unrestricted trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 9.81 9.81 11.64 11.64 13.36 13.36 15.73 15.73 5.33 5.33 11.35 11.35
1 5.59 6.26 6.56 7.30 7.46 8.27 8.74 9.63 3.17 3.64 3.33 3.91
2 4.19 5.06 4.87 5.85 5.49 6.59 6.34 7.52 2.44 3.09 1.70 2.23
3 3.47 4.45 4.01 5.07 4.52 5.62 5.17 6.36 2.08 2.81 1.08 1.51
4 3.03 4.06 3.47 4.57 3.89 5.07 4.40 5.72 1.86 2.64 0.77 1.14
5 2.75 3.79 3.12 4.25 3.47 4.67 3.93 5.23 1.72 2.53 0.59 0.91
6 2.53 3.59 2.87 4.00 3.19 4.38 3.60 4.90 1.62 2.45 0.48 0.75
7 2.38 3.45 2.69 3.83 2.98 4.16 3.34 4.63 1.54 2.39 0.40 0.64
8 2.26 3.34 2.55 3.68 2.82 4.02 3.15 4.43 1.48 2.35 0.34 0.56
9 2.16 3.24 2.43 3.56 2.67 3.87 2.97 4.24 1.43 2.31 0.30 0.49
10 2.07 3.16 2.33 3.46 2.56 3.76 2.84 4.10 1.40 2.28 0.26 0.44
ᵃ The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the F-statistic for testing $\phi = 0$ in the regression: $\Delta y_t = \phi'z_{t-1} + a'w_t + \xi_t$, $t = 1, \ldots, T$, where $x_t = (x_{1t}, \ldots, x_{kt})'$ and

$$\begin{cases} z_{t-1} = (y_{t-1}, x_{t-1}')', \; w_t = 0 & \text{Case I} \\ z_{t-1} = (y_{t-1}, x_{t-1}', 1)', \; w_t = 0 & \text{Case II} \\ z_{t-1} = (y_{t-1}, x_{t-1}')', \; w_t = 1 & \text{Case III} \\ z_{t-1} = (y_{t-1}, x_{t-1}', t)', \; w_t = 1 & \text{Case IV} \\ z_{t-1} = (y_{t-1}, x_{t-1}')', \; w_t = (1, t)' & \text{Case V} \end{cases}$$

The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = Px_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to the squares of the critical values of Dickey and Fuller’s (1979) unit root t-statistics for Cases I, III and V, while they match those for Dickey and Fuller’s (1981) unit root F-statistics for Cases II and IV. The columns headed ‘I(0)’ refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns headed ‘I(1)’ refer to the upper bound obtained when $x_t$ is purely $I(1)$.
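The simulation design described in the note can be sketched for Case III as follows. This is a scaled-down illustration, not the authors' code: the function names are hypothetical, and the defaults (T = 200, 2,000 replications, a fixed seed) are assumptions chosen so the sketch runs quickly, whereas the tables use T = 1000 and 40,000 replications.

```python
import numpy as np

def case_iii_f_stat(y, x):
    """F-statistic for excluding (y_{t-1}, x_{t-1}) from a regression of
    the first difference of y on those levels and an intercept (Case III)."""
    dy = np.diff(y)
    T = dy.shape[0]
    levels = np.column_stack([y[:-1], x[:-1]])   # z_{t-1}
    X_u = np.column_stack([levels, np.ones(T)])  # unrestricted regressors
    beta = np.linalg.lstsq(X_u, dy, rcond=None)[0]
    rss_u = np.sum((dy - X_u @ beta) ** 2)
    rss_r = np.sum((dy - dy.mean()) ** 2)        # intercept only under the null
    q = levels.shape[1]                          # k + 1 restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (T - X_u.shape[1]))

def simulate_case_iii(k, x_is_i1, T=200, reps=2000, seed=0):
    """Simulated null distribution of the Case III F-statistic.

    x_is_i1=True sets P = I_k (x purely I(1), upper bound);
    x_is_i1=False sets P = 0 (x purely I(0), lower bound).
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for i in range(reps):
        eps = rng.standard_normal((T, k + 1))
        y = np.concatenate(([0.0], np.cumsum(eps[:, 0])))       # y_0 = 0
        x_path = np.cumsum(eps[:, 1:], axis=0) if x_is_i1 else eps[:, 1:]
        x = np.vstack((np.zeros((1, k)), x_path))                # x_0 = 0
        stats[i] = case_iii_f_stat(y, x)
    return stats

lower = simulate_case_iii(k=1, x_is_i1=False)
upper = simulate_case_iii(k=1, x_is_i1=True)
print(np.quantile(lower, 0.95), np.quantile(upper, 0.95))
```

The printed quantiles are finite-sample approximations to the 0.050 column of Table CI(iii) for k = 1 and will differ from the tabulated values by simulation error.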
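Theorem 3.2 (Limiting distribution of $t_{\pi_{yy}}$). If Assumptions 1–4 and 5a hold and $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, then under $H_0: \pi_{yy} = 0$ and $\pi_{yx.x} = 0'$ of (17), as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation

$$\int_0^1 dW_u(a)F_{k-r}(a)\left(\int_0^1 F_{k-r}(a)^2\,da\right)^{-1/2} \qquad (25)$$

where

$$F_{k-r}(a) = \begin{cases} W_u(a) - \int_0^1 W_u(a)W_{k-r}(a)'\,da\left(\int_0^1 W_{k-r}(a)W_{k-r}(a)'\,da\right)^{-1}W_{k-r}(a) & \text{Case I} \\ \tilde{W}_u(a) - \int_0^1 \tilde{W}_u(a)\tilde{W}_{k-r}(a)'\,da\left(\int_0^1 \tilde{W}_{k-r}(a)\tilde{W}_{k-r}(a)'\,da\right)^{-1}\tilde{W}_{k-r}(a) & \text{Case III} \\ \hat{W}_u(a) - \int_0^1 \hat{W}_u(a)\hat{W}_{k-r}(a)'\,da\left(\int_0^1 \hat{W}_{k-r}(a)\hat{W}_{k-r}(a)'\,da\right)^{-1}\hat{W}_{k-r}(a) & \text{Case V} \end{cases}$$

$r = 0, \ldots, k$, and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.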
The form of the asymptotic representation (25) is similar to that of a Dickey–Fuller test for a unit root except that the standard Brownian motion $W_u(a)$ is replaced by the residual from an asymptotic regression of $W_u(a)$ on the independent $(k-r)$-vector standard Brownian motion $W_{k-r}(a)$ (or their de-meaned and de-meaned and de-trended counterparts).
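Similarly to the analysis following Theorem 3.1, we detail the limiting distribution of the t-statistic $t_{\pi_{yy}}$ in the two polar cases in which the forcing variables $\{x_t\}$ are purely integrated of order zero and one respectively.

Corollary 3.3 (Limiting distribution of $t_{\pi_{yy}}$ if $\{x_t\} \sim I(0)$). If Assumptions 1–4 and 5a hold and $r = k$, that is, $\{x_t\} \sim I(0)$, then under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation

$$\int_0^1 dW_u(a)F(a)\left(\int_0^1 F(a)^2\,da\right)^{-1/2}$$

where

$$F(a) = \begin{cases} W_u(a) & \text{Case I} \\ \tilde{W}_u(a) & \text{Case III} \\ \hat{W}_u(a) & \text{Case V} \end{cases}$$

and Cases I, III and V are defined in (12), (14) and (16), $a \in [0, 1]$.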
Corollary 3.4 (Limiting distribution of $t_{\pi_{yy}}$ if $\{x_t\} \sim I(1)$). If Assumptions 1–4 and 5a hold, $\gamma_{xy} = 0$, where $\Gamma_x = (\gamma_{xy}, \Gamma_{xx})$, and $r = 0$, that is, $\{x_t\} \sim I(1)$, then under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, as $T \to \infty$, the asymptotic distribution of the t-statistic $t_{\pi_{yy}}$ of (24) has the representation

$$\int_0^1 dW_u(a)F_k(a)\left(\int_0^1 F_k(a)^2\,da\right)^{-1/2}$$

where $F_k(a)$ is defined in Theorem 3.2 for Cases I, III and V, $a \in [0, 1]$.
As above, it may be shown by simulation that the asymptotic critical values obtained from Corollaries 3.3 ($r = k$ and $\{x_t\}$ is purely $I(0)$) and 3.4 ($r = 0$ and $\{x_t\}$ is purely $I(1)$) provide lower and upper bounds respectively for those corresponding to the general case considered in Theorem 3.2. Hence, a bounds procedure for testing $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on these two polar cases may be implemented as described above based on the t-statistic $t_{\pi_{yy}}$ for the exclusion of $y_{t-1}$ in the conditional ECMs (12), (14) and (16) without prior knowledge of the cointegrating rank $r$.¹³ These asymptotic critical value bounds are given in Tables CII(i), CII(iii) and CII(v) for Cases I, III and V respectively.
As is emphasized in the Proof of Theorem 3.2 given in Appendix A, if the asymptotic analysis for the t-statistic $t_{\pi_{yy}}$ of (24) is conducted under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ only, the resultant limit distribution for $t_{\pi_{yy}}$ depends on the nuisance parameter $\omega - \phi$ in addition to the cointegrating rank $r$, where, under Assumption 5a, $\alpha_{yx} - \phi'\alpha_{xx} = 0'$. Moreover, if $\Delta y_t$ is allowed to Granger-cause $\Delta x_t$, that is, $\gamma_{xy,i} \ne 0$ for some $i = 1, \ldots, p - 1$, then the limit distribution also is dependent on the nuisance parameter $\gamma_{xy}/(\gamma_{yy} - \phi'\gamma_{xy})$; see Appendix A. Consequently, in general, where $\omega \ne \phi$ or $\gamma_{xy} \ne 0$,
Table CII. Asymptotic critical value bounds of the t-statistic. Testing for the existence of a levels relationshipᵃ
Table CII(i) Case I: No intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -1.62 -1.62 -1.95 -1.95 -2.24 -2.24 -2.58 -2.58 -0.42 -0.42 0.98 0.98
1 -1.62 -2.28 -1.95 -2.60 -2.24 -2.90 -2.58 -3.22 -0.42 -0.98 0.98 1.12
2 -1.62 -2.68 -1.95 -3.02 -2.24 -3.31 -2.58 -3.66 -0.42 -1.39 0.98 1.12
3 -1.62 -3.00 -1.95 -3.33 -2.24 -3.64 -2.58 -3.97 -0.42 -1.71 0.98 1.09
4 -1.62 -3.26 -1.95 -3.60 -2.24 -3.89 -2.58 -4.23 -0.42 -1.98 0.98 1.07
5 -1.62 -3.49 -1.95 -3.83 -2.24 -4.12 -2.58 -4.44 -0.42 -2.22 0.98 1.05
6 -1.62 -3.70 -1.95 -4.04 -2.24 -4.34 -2.58 -4.67 -0.42 -2.43 0.98 1.04
7 -1.62 -3.90 -1.95 -4.23 -2.24 -4.54 -2.58 -4.88 -0.42 -2.63 0.98 1.04
8 -1.62 -4.09 -1.95 -4.43 -2.24 -4.72 -2.58 -5.07 -0.42 -2.81 0.98 1.04
9 -1.62 -4.26 -1.95 -4.61 -2.24 -4.89 -2.58 -5.25 -0.42 -2.98 0.98 1.04
10 -1.62 -4.42 -1.95 -4.76 -2.24 -5.06 -2.58 -5.44 -0.42 -3.15 0.98 1.03
Table CII(iii) Case III: Unrestricted intercept and no trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -2.57 -2.57 -2.86 -2.86 -3.13 -3.13 -3.43 -3.43 -1.53 -1.53 0.72 0.71
1 -2.57 -2.91 -2.86 -3.22 -3.13 -3.50 -3.43 -3.82 -1.53 -1.80 0.72 0.81
2 -2.57 -3.21 -2.86 -3.53 -3.13 -3.80 -3.43 -4.10 -1.53 -2.04 0.72 0.86
3 -2.57 -3.46 -2.86 -3.78 -3.13 -4.05 -3.43 -4.37 -1.53 -2.26 0.72 0.89
4 -2.57 -3.66 -2.86 -3.99 -3.13 -4.26 -3.43 -4.60 -1.53 -2.47 0.72 0.91
5 -2.57 -3.86 -2.86 -4.19 -3.13 -4.46 -3.43 -4.79 -1.53 -2.65 0.72 0.92
6 -2.57 -4.04 -2.86 -4.38 -3.13 -4.66 -3.43 -4.99 -1.53 -2.83 0.72 0.93
7 -2.57 -4.23 -2.86 -4.57 -3.13 -4.85 -3.43 -5.19 -1.53 -3.00 0.72 0.94
8 -2.57 -4.40 -2.86 -4.72 -3.13 -5.02 -3.43 -5.37 -1.53 -3.16 0.72 0.96
9 -2.57 -4.56 -2.86 -4.88 -3.13 -5.18 -3.42 -5.54 -1.53 -3.31 0.72 0.96
10 -2.57 -4.69 -2.86 -5.03 -3.13 -5.34 -3.43 -5.68 -1.53 -3.46 0.72 0.96
13 Although Corollary 3.3 does not require $\gamma_{xy} = 0$ and $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is automatically satisfied under the conditions of Corollary 3.4, the simulation critical value bounds result requires $\gamma_{xy} = 0$ and $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ for $0 < r < k$.
Table CII(v) Case V: Unrestricted intercept and unrestricted trend
0.100 0.050 0.025 0.010 Mean Variance

k I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1) I(0) I(1)
0 -3.13 -3.13 -3.41 -3.41 -3.65 -3.66 -3.96 -3.97 -2.18 -2.18 0.57 0.57
1 -3.13 -3.40 -3.41 -3.69 -3.65 -3.96 -3.96 -4.26 -2.18 -2.37 0.57 0.67
2 -3.13 -3.63 -3.41 -3.95 -3.65 -4.20 -3.96 -4.53 -2.18 -2.55 0.57 0.74
3 -3.13 -3.84 -3.41 -4.16 -3.65 -4.42 -3.96 -4.73 -2.18 -2.72 0.57 0.79
4 -3.13 -4.04 -3.41 -4.36 -3.65 -4.62 -3.96 -4.96 -2.18 -2.89 0.57 0.82
5 -3.13 -4.21 -3.41 -4.52 -3.65 -4.79 -3.96 -5.13 -2.18 -3.04 0.57 0.85
6 -3.13 -4.37 -3.41 -4.69 -3.65 -4.96 -3.96 -5.31 -2.18 -3.20 0.57 0.87
7 -3.13 -4.53 -3.41 -4.85 -3.65 -5.14 -3.96 -5.49 -2.18 -3.34 0.57 0.88
8 -3.13 -4.68 -3.41 -5.01 -3.65 -5.30 -3.96 -5.65 -2.18 -3.49 0.57 0.90
9 -3.13 -4.82 -3.41 -5.15 -3.65 -5.44 -3.96 -5.79 -2.18 -3.62 0.57 0.91
10 -3.13 -4.96 -3.41 -5.29 -3.65 -5.59 -3.96 -5.94 -2.18 -3.75 0.57 0.92
ᵃ The critical values are computed via stochastic simulations using $T = 1000$ and 40,000 replications for the t-statistic for testing $\phi = 0$ in the regression: $\Delta y_t = \phi y_{t-1} + \delta'x_{t-1} + a'w_t + \xi_t$, $t = 1, \ldots, T$, where $x_t = (x_{1t}, \ldots, x_{kt})'$ and

$$\begin{cases} w_t = 0 & \text{Case I} \\ w_t = 1 & \text{Case III} \\ w_t = (1, t)' & \text{Case V} \end{cases}$$

The variables $y_t$ and $x_t$ are generated from $y_t = y_{t-1} + \varepsilon_{1t}$ and $x_t = Px_{t-1} + \varepsilon_{2t}$, $t = 1, \ldots, T$, where $y_0 = 0$, $x_0 = 0$ and $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}')'$ is drawn as $(k + 1)$ independent standard normal variables. If $x_t$ is purely $I(1)$, $P = I_k$ whereas $P = 0$ if $x_t$ is purely $I(0)$. The critical values for $k = 0$ correspond to those of Dickey and Fuller’s (1979) unit root t-statistics. The columns headed ‘I(0)’ refer to the lower critical values bound obtained when $x_t$ is purely $I(0)$, while the columns headed ‘I(1)’ refer to the upper bound obtained when $x_t$ is purely $I(1)$.
although the t-statistic $t_{\pi_{yy}}$ has a well-defined limiting distribution under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the above bounds testing procedure for $H_0^{\pi_{yy}}: \pi_{yy} = 0$ based on $t_{\pi_{yy}}$ is not asymptotically similar.¹⁴

Consequently, in the light of the consistency results for the above statistics discussed in Section 4, see Theorems 4.1, 4.2 and 4.4, we suggest the following procedure for ascertaining the existence of a level relationship between $y_t$ and $x_t$: test $H_0$ of (17) using the bounds procedure based on the Wald or F-statistic of (21) from Corollaries 3.1 and 3.2: (a) if $H_0$ is not rejected, proceed no further; (b) if $H_0$ is rejected, test $H_0^{\pi_{yy}}: \pi_{yy} = 0$ using the bounds procedure based on the t-statistic $t_{\pi_{yy}}$ of (24) from Corollaries 3.3 and 3.4. If $H_0^{\pi_{yy}}: \pi_{yy} = 0$ is false, a large value of $t_{\pi_{yy}}$ should result, at least asymptotically, confirming the existence of a level relationship between $y_t$ and $x_t$, which, however, may be degenerate (if $\pi_{yx.x} = 0'$).
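The two-step procedure just described can be sketched as a single function. This is an illustration under stated assumptions: the function name and the returned labels are hypothetical, and the t-statistic and its bounds are entered with their negative signs, so that "large" means large in absolute value. The usage values are the Case III statistics from Table II and the k = 4, 0.05 bounds from Tables CI(iii) and CII(iii).

```python
def level_relationship_test(f_stat, f_bounds, t_stat, t_bounds):
    """Sequential bounds procedure: F (or Wald) test first, then the t-test.

    f_bounds : (lower, upper) bounds for the F-statistic
    t_bounds : (I(0) bound, I(1) bound) for the t-statistic, both negative
    """
    f_lower, f_upper = f_bounds
    t_i0, t_i1 = t_bounds
    # Step (a): joint test of H0 of (17)
    if f_stat < f_lower:
        return "H0 not rejected: proceed no further"
    if f_stat <= f_upper:
        return "F-test inconclusive"
    # Step (b): H0 rejected, so test pi_yy = 0 with the t-statistic
    if t_stat < t_i1:
        return "pi_yy = 0 rejected: level relationship supported"
    if t_stat > t_i0:
        return "pi_yy = 0 not rejected"
    return "t-test inconclusive"


f_bounds, t_bounds = (2.86, 4.01), (-2.86, -3.99)
for p, f_iii, t_iii in [(4, 3.63, -3.02), (5, 5.23, -4.00), (6, 5.42, -3.48)]:
    print(p, level_relationship_test(f_iii, f_bounds, t_iii, t_bounds))
```

Keeping the two steps in one function makes explicit that the t-test is consulted only after the joint hypothesis has been rejected.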

This section first demonstrates that the proposed bounds testing procedure based on the Wald statistic of (21) described in Section 3 is consistent. Second, it derives the asymptotic distribution of the Wald statistic of (21) under a sequence of local alternatives. Finally, we show that the bounds procedure based on the t-statistic of (24) is consistent.

14 In principle, the asymptotic distribution of $t_{\pi_{yy}}$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$ may be simulated from the limiting representation given in the Proof of Theorem 3.2 of Appendix A after substitution of consistent estimators for $\phi$ and $\lambda_{xy} \equiv \gamma_{xy}/\gamma_{yy.x}$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, where $\gamma_{yy.x} \equiv \gamma_{yy} - \phi'\gamma_{xy}$. Although such estimators may be obtained straightforwardly, unfortunately, they necessitate the use of parameter estimators from the marginal ECM (7) for $\{x_t\}_{t=1}^{\infty}$.
In the discussion of the consistency of the bounds test procedure based on the Wald statistic of (21), because the rank of the long-run multiplier matrix $\Pi$ may be either $r$ or $r + 1$ under the alternative hypothesis $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18) where $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ and $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$, it is necessary to deal with these two possibilities. First, under $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$, the rank of $\Pi$ is $r + 1$ so Assumption 5b applies; in particular, $\alpha_{yy} \ne 0$. Second, under $H_0^{\pi_{yy}}: \pi_{yy} = 0$, the rank of $\Pi$ is $r$ so Assumption 5a applies; in this case, $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$ holds and, in particular, $\alpha_{yx} - \omega'\alpha_{xx} \ne 0'$.
Theorem 4.1 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions 1–4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ of (18) the Wald statistic W (21) is consistent against $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ in Cases I–V defined in (12)–(16).

Theorem 4.2 (Consistency of the Wald statistic bounds test procedure under $H_1^{\pi_{yx.x}} \cap H_0^{\pi_{yy}}$). If Assumptions 1–4 and 5a hold, then under $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$ of (18) and $H_0^{\pi_{yy}}: \pi_{yy} = 0$ of (17) the Wald statistic W (21) is consistent against $H_1^{\pi_{yx.x}}: \pi_{yx.x} \ne 0'$ in Cases I–V defined in (12)–(16).

Hence, combining Theorems 4.1 and 4.2, the bounds procedure of Section 3 based on the Wald statistic W (21) defines a consistent test of $H_0 = H_0^{\pi_{yy}} \cap H_0^{\pi_{yx.x}}$ of (17) against $H_1 = H_1^{\pi_{yy}} \cup H_1^{\pi_{yx.x}}$ of (18). This result holds irrespective of whether the forcing variables $\{x_t\}$ are purely $I(0)$, purely $I(1)$ or mutually cointegrated.
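We now turn to consider the asymptotic distribution of the Wald statistic (21) under a suitably specified sequence of local alternatives. Recall that under Assumption 5b, $\pi_{y.x}\,[= (\pi_{yy}, \pi_{yx.x})] = (\alpha_{yy}\beta_{yy},\; \alpha_{yy}\beta_{xy}' + (\alpha_{yx} - \omega'\alpha_{xx})\beta_{xx}')$. Consequently, we define the sequence of local alternatives

$$H_{1T}: \pi_{y.xT}\,[= (\pi_{yyT}, \pi_{yx.xT})] = \left(T^{-1}\alpha_{yy}\beta_{yy},\; T^{-1}\alpha_{yy}\beta_{xy}' + T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta_{xx}'\right) \qquad (26)$$

Defining

$$\Pi_T \equiv \begin{pmatrix} \pi_{yyT} & \pi_{yxT} \\ 0 & \Pi_{xxT} \end{pmatrix}$$

and recalling $\Pi = \alpha\beta'$, where $(1, -\omega')\alpha = \alpha_{yx} - \omega'\alpha_{xx} = 0'$, we have

$$\Pi_T - \Pi = T^{-1}\alpha_y\beta_y' + T^{-1/2}\begin{pmatrix} \delta_{yx} \\ \delta_{xx} \end{pmatrix}\beta' \qquad (27)$$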
In order to detail the limit distribution of the Wald statistic under the sequence of local alternatives $H_{1T}$ of (26), it is necessary to define the $(k-r+1)$-dimensional Ornstein–Uhlenbeck process $J^*_{k-r+1}(a) = (J^*_u(a), J^*_{k-r}(a)')'$ which obeys the stochastic integral and differential equations,

$$J^*_{k-r+1}(a) = W_{k-r+1}(a) + \mathbf{a}\mathbf{b}'\int_0^a J^*_{k-r+1}(r)\,dr \quad \text{and} \quad dJ^*_{k-r+1}(a) = dW_{k-r+1}(a) + \mathbf{a}\mathbf{b}'J^*_{k-r+1}(a)\,da,$$

where $W_{k-r+1}(a)$ is a $(k-r+1)$-dimensional standard Brownian motion, $\mathbf{a} = [(\alpha_y^\perp, \alpha^\perp)'\Omega(\alpha_y^\perp, \alpha^\perp)]^{-1/2}(\alpha_y^\perp, \alpha^\perp)'\alpha_y$, $\mathbf{b} = [(\alpha_y^\perp, \alpha^\perp)'\Omega(\alpha_y^\perp, \alpha^\perp)]^{1/2}[(\beta_y^\perp, \beta^\perp)'\Gamma(\alpha_y^\perp, \alpha^\perp)]^{-1}(\beta_y^\perp, \beta^\perp)'\beta_y$, together with the de-meaned and de-meaned and de-trended counterparts $\tilde{J}^*_{k-r+1}(a) = (\tilde{J}^*_u(a), \tilde{J}^*_{k-r}(a)')'$ and $\hat{J}^*_{k-r+1}(a) = (\hat{J}^*_u(a), \hat{J}^*_{k-r}(a)')'$ partitioned similarly, $a \in [0, 1]$. See, for example, Johansen (1995, Chapter 14, pp. 201–210).
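To give a feel for the stochastic differential equation above, the following sketch simulates a scalar analogue by the Euler method. It is a simplification under stated assumptions: the matrix product multiplying the process in the drift is replaced by a single hypothetical scalar `c`, and the function name, step count and seed are illustrative choices.

```python
import numpy as np

def simulate_scalar_ou(c, n_steps=1000, seed=0):
    """Euler discretisation of dJ(a) = dW(a) + c * J(a) da on [0, 1].

    Returns the simulated path J and the driving Brownian path W,
    both of length n_steps + 1 and starting at zero.
    """
    rng = np.random.default_rng(seed)
    da = 1.0 / n_steps
    dW = rng.standard_normal(n_steps) * np.sqrt(da)  # Brownian increments
    J = np.zeros(n_steps + 1)
    for i in range(n_steps):
        J[i + 1] = J[i] + c * J[i] * da + dW[i]
    W = np.concatenate(([0.0], np.cumsum(dW)))
    return J, W

J, W = simulate_scalar_ou(c=-5.0)
print(J[-1], W[-1])
```

Setting `c = 0` returns the Brownian path itself, which mirrors the way the local-alternative process collapses to the standard Brownian motion under the null.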
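Theorem 4.3 (Limiting distribution of W under $H_{1T}$). If Assumptions 1–4 and 5a hold, then under $H_{1T}: \pi_{y.x} = T^{-1}\alpha_{yy}\beta_y' + T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta'$ of (26), as $T \to \infty$, the asymptotic distribution of the Wald statistic W of (21) has the representation

$$W \Rightarrow z_r'z_r + \int_0^1 dJ^*_u(a)F_{k-r+1}(a)'\left(\int_0^1 F_{k-r+1}(a)F_{k-r+1}(a)'\,da\right)^{-1}\int_0^1 F_{k-r+1}(a)\,dJ^*_u(a) \qquad (28)$$

where $z_r \sim N(Q^{1/2}\eta, I_r)$, $Q\,[= Q^{1/2\prime}Q^{1/2}] \equiv \operatorname{plim}_{T \to \infty}\left(T^{-1}\beta_*'\tilde{Z}^{*\prime}_{-1}\bar{P}_{\Delta\tilde{Z}_-}\tilde{Z}^*_{-1}\beta_*\right)$, $\eta \equiv (\delta_{yx} - \omega'\delta_{xx})'$, is distributed independently of the second term in (28) and

$$F_{k-r+1}(a) = \begin{cases} J^*_{k-r+1}(a) & \text{Case I} \\ (J^*_{k-r+1}(a)', 1)' & \text{Case II} \\ \tilde{J}^*_{k-r+1}(a) & \text{Case III} \\ (\tilde{J}^*_{k-r+1}(a)', a - \tfrac{1}{2})' & \text{Case IV} \\ \hat{J}^*_{k-r+1}(a) & \text{Case V} \end{cases}$$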
The first component of (28), $z_r'z_r$, is non-central chi-square distributed with $r$ degrees of freedom and non-centrality parameter $\eta'Q\eta$ and corresponds to the local alternative $H_{1T}^{\pi_{yx.x}}: \pi_{yx.xT} = T^{-1/2}(\delta_{yx} - \omega'\delta_{xx})\beta_{xx}'$ under $H_0^{\pi_{yy}}: \pi_{yy} = 0$. The second term in (28) is a non-standard Dickey–Fuller unit-root distribution under the local alternative $H_{1T}^{\pi_{yy}}: \pi_{yyT} = T^{-1}\alpha_{yy}\beta_{yy}$ and $\delta_{yx} - \omega'\delta_{xx} = 0'$. Note that under $H_0$ of (17), that is, $\alpha_{yy} = 0$ and $\delta_{yx} - \omega'\delta_{xx} = 0'$, the limiting representation (28) reduces to (22), as should be expected.
The proof for the consistency of the bounds test procedure based on the t-statistic of (24) requires that the rank of the long-run multiplier matrix $\Pi$ is $r + 1$ under the alternative hypothesis $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$. Hence, Assumption 5b applies; in particular, $\alpha_{yy} \ne 0$.

Theorem 4.4 (Consistency of the t-statistic bounds test procedure under $H_1^{\pi_{yy}}$). If Assumptions 1–4 and 5b hold, then under $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ of (18) the t-statistic $t_{\pi_{yy}}$ (24) is consistent against $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$ in Cases I, III and V defined in (12), (14) and (16).

As noted at the end of Section 3, Theorem 4.4 suggests the possibility of using $t_{\pi_{yy}}$ to discriminate between $H_0^{\pi_{yy}}: \pi_{yy} = 0$ and $H_1^{\pi_{yy}}: \pi_{yy} \ne 0$, although, if $H_0^{\pi_{yx.x}}: \pi_{yx.x} = 0'$ is false,

and Whittaker (1995), CSW hereafter. The theoretical basis of the Treasury’s earnings equation
where firms and unions set wages to maximize a weighted average of firms’ profits and unions’
utility. Following Darby and Wren-Lewis (1993), the theoretical real wage equation underlying
the Treasury’s earnings equation is given by
$$w_t = \frac{Prod_t}{1 + f(UR_t)(1 - RR_t)/Union_t} \qquad (29)$$
where $w_t$ is the real wage, $Prod_t$ is labour productivity, $RR_t$ is the replacement ratio defined as the ratio of unemployment benefit to the wage rate, $Union_t$ is a measure of ‘union power’, and $f(UR_t)$ is the probability of a union member becoming unemployed, which is assumed to be an increasing function of the unemployment rate $UR_t$. The econometric specification is based on a log-linearized version of (29) after allowing for a wedge effect that takes account of the difference between the ‘real product wage’ which is the focus of the firms’ decision, and the ‘real consumption wage’ which concerns the union.¹⁵ The theoretical arguments for a possible long-run wedge effect on real wages are mixed and, as emphasized by CSW, whether such long-run effects are present is an empirical matter. The change in the unemployment rate ($\Delta UR_t$) is also included in the Treasury’s wage equation. CSW cite two different theoretical rationales for the inclusion of $\Delta UR_t$ in the wage equation: the differential moderating effects of long- and short-term unemployed on real wages, and the ‘insider–outsider’ theories which argue that only rising unemployment will be effective in significantly moderating wage demands. See Blanchard and Summers (1986) and Lindbeck and Snower (1989). The ARDL model and its associated unrestricted equilibrium
We begin our empirical analysis from the maintained assumption that the time series properties of the key variables in the Treasury’s earnings equation can be well approximated by a log-linear VAR($p$) model, augmented with appropriate deterministics such as intercepts and time trends.
included in the analysis. CSW, p. 50, report that ‘... it has not proved possible to identify a significant effect from the replacement ratio, and this had to be omitted from our specification’.¹⁶
The asymptotic theory developed in the paper is not affected by the inclusion of such ‘one-off’ dummy variables.¹⁷ Let $z_t = (w_t, Prod_t, UR_t, Wedge_t, Union_t)' = (w_t, x_t')'$. Then, using the

$$\Delta w_t = c_0 + c_1 t + c_2 D7475_t + c_3 D7579_t + \pi_{ww}w_{t-1} + \pi_{wx.x}x_{t-1} + \sum_{i=1}^{p-1}\psi_i'\Delta z_{t-i} + \delta'\Delta x_t + u_t \qquad (30)$$
Under the assumption that lagged real wages, $w_{t-1}$, do not enter the sub-VAR model for $x_t$, the above real wage equation is identified and can be estimated consistently by LS.¹⁸ Notice,
the unemployment or productivity equations, for example. The exclusion of the level of real wages
postulates that labour productivity is partly determined by the level of real wages.¹⁹ It is clear that, in our framework, the bargaining theory and the efficiency wage theory cannot be entertained
The above specification is also based on the assumption that the disturbances $u_t$ are serially uncorrelated. It is therefore important that the lag order $p$ of the underlying VAR is selected appropriately. There is a delicate balance between choosing $p$ sufficiently large to mitigate the residual serial correlation problem and, at the same time, sufficiently small so that the conditional ECM (30) is not unduly over-parameterized, particularly in view of the limited time series data
Finally, a decision must be made concerning the time trend in (30) and whether its coefficient should be restricted.²⁰ This issue can only be settled in light of the particular sample period under consideration. The time series data used are quarterly, cover the period 1970q1–1997q4, and are seasonally adjusted (where relevant).²¹ To ensure comparability of results for different choices of $p$, all estimations use the same sample period, 1972q1–1997q4 ($T = 104$), with the first eight
The five variables in the earnings equation were constructed from primary sources in the following manner: $w_t = \ln(ERPR_t/PYNONG_t)$, $Wedge_t = \ln(1 + TE_t) + \ln(1 - TD_t) - \ln(RPIX_t/PYNONG_t)$, $UR_t = \ln(100 \times ILOU_t/(ILOU_t + WFEMP_t))$, $Prod_t = \ln((YPROM_t + 278.29 \times YMF_t)/(EMF_t + ENMF_t))$, and $Union_t = \ln(UDEN_t)$, where $ERPR_t$ is average private sector earnings per employee (£), $PYNONG_t$ is the non-oil non-government GDP deflator, $YPROM_t$
tor cost (£ million, 1990), $YMF_t$ is the manufacturing output index adjusted for stock changes (1990 = 100), $EMF_t$ and $ENMF_t$ are respectively employment in UK manufacturing and non-
employers’ National Insurance contribution rate, $TD_t$ is the average direct tax rate on employ-
union density (used to proxy ‘union power’) measured by union membership as a percentage of
Figures 1–3.
18 See Assumption 3 and the following discussion. By construction, the contemporaneous effects x are uncorrelated
t
are uncorrelated with ut and also have a reasonable degree of correlation with the included variables in (30).
sources and the descriptions of the variables, see CSW, pp. 46–51 and p. 11 of the Annex.
[Figure 1. Panel (a): Real Wages and Productivity, Log Scale, plotted against Quarters, 1972Q1–1997Q1. Panel (b): Real Wage and Productivity on a vertical scale from −0.04 to 0.04, plotted against Quarters, 1972Q1–1997Q1.]
It is clear from Figure 1 that real wages (average earnings) and productivity show steadily rising
root tests to the five variables, perhaps not surprisingly, yields mixed results with strong evidence
not necessarily preclude the other three variables (UR, Wedge, and Union) having levels impact on real wages. Following the methodology developed in this paper, it is possible to test for the existence of a real wage equation involving the levels of these five variables irrespective of whether they are purely $I(0)$, purely $I(1)$, or mutually cointegrated.
23 Over the period 1972q1–97q4, real wages grew by 2.14% per annum as compared to labour productivity that increased
[Figure: UNION and WEDGE on a vertical scale from −0.8 to −0.2, plotted against Quarters, 1972Q1–1997Q1.]
[Figure: UR, Log Scale, plotted against Quarters, 1972Q1–1997Q1.]
and without a linear time trend, for $p = 1, 2, \ldots, 7$. As pointed out earlier, all regressions were computed over the same period 1972q1–1997q4. We found that lagged changes of the productivity variable, $\Delta Prod_{t-1}, \Delta Prod_{t-2}, \ldots$, were insignificant (either singly or jointly) in all regressions.
all other variables. Table I gives Akaike’s and Schwarz’s Bayesian Information Criteria, denoted respectively by AIC and SBC, and Lagrange multiplier (LM) statistics for testing the hypothesis of no residual serial correlation against orders 1 and 4 denoted by $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ respectively. As might be expected, the lag order selected by AIC, $\hat{p}_{aic} = 6$, irrespective of whether a deterministic trend term is included or not, is much larger than that selected by SBC. This latter criterion gives estimates $\hat{p}_{sbc} = 1$ if a trend is included and $\hat{p}_{sbc} = 4$ if not. The $\chi^2_{SC}$ statistics also suggest using a relatively high lag order: 4 or more. In view of the importance of the assumption of serially uncorrelated errors for the validity of the bounds tests, it seems prudent to select $p$ to be either 5 or 6.²⁴ Nevertheless, for completeness, in what follows we report test results for $p = 4$ and 5, as well as for our preferred choice, namely $p = 6$. The results in Table I also indicate
trend.
in Tables CI and CII. First, consider the bounds F-statistic. As argued in PSS, the statistic $F_{IV}$
IV of (15), is more appropriate than $F_V$, Case V of (16), which ignores this constraint. Note that, if the trend coefficient $c_1$ is not subject to this restriction, (30) implies a quadratic trend in the level of real wages under the null hypothesis of $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$, which is empirically implausible. The critical value bounds for the statistics $F_{IV}$ and $F_V$ are given in Tables CI(iv) and CI(v). Since $k = 4$, the 0.05 critical value bounds are (3.05, 3.97) and (3.47, 4.57) for $F_{IV}$ and $F_V$, respectively.²⁵ The test outcome depends on the choice of the lag order $p$. For $p = 4$, the

     With a linear trend                          Without a linear trend
p    AIC     SBC     χ²SC(1)    χ²SC(4)           AIC     SBC     χ²SC(1)    χ²SC(4)
1    319.33  302.14  16.86*     35.89*            317.51  301.64  18.38*     34.88*
2    324.25  301.77  2.16       19.71*            323.77  302.62  1.98       21.52*
3    321.51  293.74  0.52       17.07*            320.87  294.43  1.56       19.35*
4    334.37  301.31  3.48***    7.79***           335.37  303.63  3.41***    7.13
5    335.84  297.50  0.03       2.50              336.49  299.47  0.03       2.15
6    337.06  293.42  0.85       3.58              337.03  294.72  0.99       3.99
7    336.96  288.04  0.17       2.20              336.85  289.25  0.09       0.64
Notes: $p$ is the lag order of the underlying VAR model for the conditional ECM (30), with zero restrictions on the coefficients of lagged changes in the productivity variable. $AIC_p \equiv LL_p - s_p$ and $SBC_p \equiv LL_p - (s_p/2)\ln T$ denote Akaike’s and Schwarz’s Bayesian Information Criteria for a given lag order $p$, where $LL_p$ is the maximized log-likelihood value of the model, $s_p$ is the number of freely estimated coefficients and $T$ is the sample size. $\chi^2_{SC}(1)$ and $\chi^2_{SC}(4)$ are LM statistics for testing no residual serial correlation against orders 1 and 4. The symbols *, **, and *** denote significance at the 0.01, 0.05 and 0.10 levels, respectively.
24 In the Treasury model, different lag orders are chosen for different variables. The highest lag order selected is 4 applied
to the log of the price deflator and the wedge variable. The estimation period of the earnings equation in the Treasury
model is 1971q1–1994q3.
25 Following a suggestion from one of the referees we also computed critical value bounds for our sample size, namely
$T = 104$. For $k = 4$, the 5% critical value bounds associated with $F_{IV}$ and $F_V$ statistics turned out to be (3.19, 4.16) and
(3.61,4.76), respectively, which are only marginally different from the asymptotic critical value bounds.
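
The decision rule behind the bounds F-test can be sketched in a few lines. This is only an illustration: the function name `bounds_decision` and the label strings are hypothetical, while the critical value bounds and the statistics are the ones quoted in the text and in Table II for k = 4 at the 0.05 level.

```python
def bounds_decision(stat, lower, upper):
    """Classify a bounds F-statistic against its (lower, upper) critical value bounds.

    Above the upper bound: the null of no level relationship is rejected
    whether the regressors are I(0), I(1) or mutually cointegrated.
    Below the lower bound: the null is not rejected in any of these cases.
    In between: the test is inconclusive.
    """
    if stat > upper:
        return "reject"
    if stat < lower:
        return "do not reject"
    return "inconclusive"


# Asymptotic 0.05 bounds for k = 4, as quoted in the text.
BOUNDS_F_IV = (3.05, 3.97)
BOUNDS_F_V = (3.47, 4.57)

for p, f_iv, f_v in [(4, 2.99, 2.34), (6, 4.78, 3.59)]:
    print(p, bounds_decision(f_iv, *BOUNDS_F_IV), bounds_decision(f_v, *BOUNDS_F_V))
```

The same function can be called with the finite-sample bounds of footnote 25, for example `bounds_decision(4.78, 3.19, 4.16)`.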

       With                              Without
p      F_IV      F_V       t_V           F_III     t_III
4      2.99^a    2.34^a    −2.26^a       3.63^b    −3.02^b
5      4.42^c    3.96^b    −2.83^a       5.23^c    −4.00^c
6      4.78^c    3.59^b    −2.44^a       5.42^c    −3.48^b
Notes: See the notes to Table I. $F_{IV}$ is the F-statistic for testing
$\pi_{ww} = 0$, $\pi_{wx.x} = 0'$ and $c_1 = 0$ in (30). $F_V$ is the F-statistic for
testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30). $F_{III}$ is the F-statistic for
testing $\pi_{ww} = 0$ and $\pi_{wx.x} = 0'$ in (30) with $c_1$ set equal to 0. $t_V$
and $t_{III}$ are the t-ratios for testing $\pi_{ww} = 0$ in (30) with and without
of whether the regressors are purely $I(0)$, purely $I(1)$ or mutually cointegrated. For $p = 5$, the
bounds test is inconclusive. For $p = 6$ (selected by AIC), the statistic $F_V$ is still inconclusive, but
$F_{IV} = 4.78$ lies outside the 0.05 critical value bounds and rejects the null hypothesis that there
exists no level earnings equation, irrespective of whether the regressors are purely $I(0)$, purely
$I(1)$ or mutually cointegrated.[26] This finding is even more conclusive when the bounds F-test is
applied to the earnings equations without a linear trend. The relevant test statistic is $F_{III}$ and the
associated 0.05 critical value bounds are (2.86, 4.01).[27] For $p = 4$, $F_{III} = 3.63$, and the test result
is inconclusive. However, for $p = 5$ and 6, the values of $F_{III}$ are 5.23 and 5.42 respectively and
bounds for $t_{III}$ and $t_V$, when $k = 4$, are (−2.86, −3.99) and (−3.41, −4.36).[28] Therefore, if a
linear trend is included, the bounds t-test does not reject the null even if $p = 5$ or 6. However,
when the trend term is excluded, the null is rejected for $p = 5$. Overall, these test results support
In testing the null hypothesis that there are no level effects in (30), namely ($\pi_{ww} = 0$, $\pi_{wx.x} = 0'$),
it is important that the coefficients of lagged changes remain unrestricted, otherwise these tests
could be subject to a pre-testing problem. However, for the subsequent estimation of levels effects
and short-run dynamics of real wage adjustments, the use of a more parsimonious specification
seems advisable. To this end we adopt the ARDL approach to the estimation of the level relations
26 The same conclusion is also reached for p D 7.

discussed in Pesaran and Shin (1999).[29] First, the (estimated) orders of an ARDL($p, p_1, p_2, p_3, p_4$)
model in the five variables ($w_t$, $Prod_t$, $UR_t$, $Wedge_t$, $Union_t$) were selected by searching across
the $7^5 = 16{,}807$ ARDL models, spanned by $p = 0, 1, \ldots, 6$, and $p_i = 0, 1, \ldots, 6$, $i = 1, \ldots, 4$,
using the AIC criterion.[30] This resulted in the choice of an ARDL(6, 0, 5, 4, 5) specification with

$$w_t = \underset{(0.050)}{1.063}\,Prod_t - \underset{(0.034)}{0.105}\,UR_t - \underset{(0.265)}{0.943}\,Wedge_t + \underset{(0.311)}{1.481}\,Union_t + \underset{(0.242)}{2.701} + \hat{v}_t \qquad (31)$$

where $\hat{v}_t$ is the equilibrium correction term, and the standard errors are given in parentheses.
All levels estimates are highly significant and have the expected signs. The coefficients of the
productivity and the wedge variables are insignificantly different from unity. In the Treasury’s
earnings equation, the levels coefficient of the productivity variable is imposed as unity and the
above estimates can be viewed as providing empirical support for this a priori restriction. Our
levels estimates of the effects of the unemployment rate and the union variable on real wages,
namely −0.105 and 1.481, are also in line with the Treasury estimates of −0.09 and 1.31.[31]
wedge variable. We obtain a much larger estimate, almost twice that obtained by the Treasury.
Setting the levels coefficients of the $Prod_t$ and $Wedge_t$ variables to unity provides the alternative
interpretation that the share of wages (net of taxes and computed using RPIX rather than the
implicit GDP deflator) has varied negatively with the rate of unemployment and positively with
union strength.32
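
The lag-order search described above (all combinations of $p$ and $p_i$ between 0 and 6, ranked by AIC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Microfit procedure: the helper names `lagmat`, `aic` and `select_ardl` are hypothetical, estimation is plain OLS with an intercept only, the AIC is computed in the "log-likelihood minus number of coefficients" form given in the notes to Table I, and the data are simulated with one regressor and a maximum lag of 2 to keep the example fast.

```python
import itertools
import numpy as np


def lagmat(v, lags, start):
    """Columns v[t], v[t-1], ..., v[t-lags] for t = start, ..., len(v) - 1."""
    return np.column_stack([v[start - j: len(v) - j] for j in range(lags + 1)])


def aic(y, X):
    """Gaussian log-likelihood minus number of coefficients (larger is better)."""
    T, s = X.shape
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    sigma2 = resid @ resid / T
    ll = -0.5 * T * (1.0 + np.log(2.0 * np.pi * sigma2))
    return ll - s


def select_ardl(y, xs, max_lag):
    """Search ARDL(p, p1, ..., pk) over all orders in 0..max_lag by AIC."""
    start = max_lag  # common estimation sample for every candidate model
    best = None
    for orders in itertools.product(range(max_lag + 1), repeat=1 + len(xs)):
        p, qs = orders[0], orders[1:]
        cols = [np.ones(len(y) - start)]
        if p > 0:
            cols.append(lagmat(y, p, start)[:, 1:])  # y[t-1], ..., y[t-p]
        cols += [lagmat(x, q, start) for x, q in zip(xs, qs)]
        score = aic(y[start:], np.column_stack(cols))
        if best is None or score > best[0]:
            best = (score, orders)
    return best


# Simulated data (an assumption for illustration, not the UK earnings data).
rng = np.random.default_rng(0)
T = 150
x = np.cumsum(rng.standard_normal(T))
e = 0.1 * rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + x[t] - 0.4 * x[t - 1] + e[t]

best_score, best_orders = select_ardl(y, [x], max_lag=2)
n_models_paper = 7 ** 5  # five variables, orders 0..6 each
print(best_orders, n_models_paper)
```

With four regressors and `max_lag=6` the same loop visits the 16,807 candidate models mentioned in the text; holding the estimation sample fixed across candidates keeps the AIC values comparable.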
The conditional ECM regression associated with the above level relationship is given in
Table III.33 These estimates provide further direct evidence on the complicated dynamics that seem
to exist between real wage movements and their main determinants.34 All five lagged changes in
real wages are statistically significant, further justifying the choice of $p = 6$. The equilibrium
correction coefficient is estimated as −0.229 (0.0586) which is reasonably large and highly
significant.[35] The auxiliary equation of the autoregressive part of the estimated conditional ECM
has real roots 0.9231 and 0.9095 and two pairs of complex roots with moduli 0.7589 and 0.6381,
which suggests an initially cyclical real wage process that slowly converges towards the equilibrium
described by (31).36 The regression fits reasonably well and passes the diagnostic tests against non-
normal errors and heteroscedasticity. However, it fails the functional form misspecification test at
29 Note that the ARDL approach advanced in Pesaran and Shin (1999) is applicable irrespective of whether the regressors
are purely $I(0)$, purely $I(1)$ or mutually cointegrated.
33 Clearly, it is possible to simplify the model further, but this would go beyond the remit of this section which is first to
test for the existence of a level relationship using an unrestricted ARDL specification and, second, if we are satisfied that
34 The standard errors of the estimates reported in Table III allow for the uncertainty associated with the estimation of the
levels coefficients. This is important in the present application where it is not known with certainty whether the regressors
are purely $I(0)$, purely $I(1)$ or mutually cointegrated. It is only in the case when it is known for certain that all regressors
are $I(1)$ that it would be reasonable in large samples to treat these estimates as known because of their super-consistency.
35 The equilibrium correction coefficient in the Treasury's earnings equation is estimated to be −0.1848 (0.0528), which
is smaller than our estimate; see p. 11 in Annex of CSW. This seems to be because of the shorter lag lengths used in the
Treasury's specification rather than the shorter time period 1971q1–1994q3. Note also that the t-ratio reported for this
coefficient does not have the standard t-distribution; see Theorem 3.2.
36 The complex roots are $0.34293 \pm 0.67703i$ and $0.17307 \pm 0.61386i$, where $i = \sqrt{-1}$.
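
The moduli quoted in the text can be checked directly from the complex roots in footnote 36. The short sketch below does this and also computes a rough half-life from the largest real root; the half-life is an illustrative assumption about how "slowly converges" might be quantified and is not a figure reported in the paper. The moduli do not depend on the signs of the real or imaginary parts.

```python
import math

# Complex roots from footnote 36 (one member of each conjugate pair).
complex_roots = [complex(0.34293, 0.67703), complex(0.17307, 0.61386)]
moduli = [abs(z) for z in complex_roots]

# Largest real root reported in the text.
dominant_root = 0.9231

# Quarters needed for a shock governed by the dominant root to halve.
half_life = math.log(0.5) / math.log(dominant_root)

print([round(m, 4) for m in moduli], round(half_life, 2))
```

All roots lie inside the unit circle, and the dominant real root close to one is what makes the adjustment towards (31) slow.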

earnings equation

Regressor         Coefficient   Standard error   p-value
v̂_{t−1}           −0.229        0.0586           N/A
Δw_{t−1}          0.418         0.0974           0.000
Δw_{t−2}          0.328         0.1089           0.004
Δw_{t−3}          0.523         0.1043           0.000
Δw_{t−4}          0.133         0.0892           0.140
Δw_{t−5}          0.197         0.0807           0.017
ΔProd_t           0.315         0.0954           0.001
ΔUR_t             0.003         0.0083           0.683
ΔUR_{t−1}         0.016         0.0119           0.196
ΔUR_{t−2}         0.003         0.0118           0.797
ΔUR_{t−3}         0.028         0.0113           0.014
ΔUR_{t−4}         0.027         0.0122           0.031
ΔWedge_t          0.297         0.0534           0.000
ΔWedge_{t−1}      0.048         0.0592           0.417
ΔWedge_{t−2}      0.093         0.0569           0.105
ΔWedge_{t−3}      0.188         0.0560           0.001
ΔUnion_t          0.969         0.8169           0.239
ΔUnion_{t−1}      2.915         0.8395           0.001
ΔUnion_{t−2}      0.021         0.9023           0.981
ΔUnion_{t−3}      0.101         0.7805           0.897
ΔUnion_{t−4}      1.995         0.7135           0.007
Intercept         0.619         0.1554           0.000
D7475_t           0.029         0.0063           0.000
D7579_t           0.017         0.0063           0.009

$\bar{R}^2 = 0.5589$, $\hat{\sigma} = 0.0083$, AIC = 339.57, SBC = 302.55,
$\chi^2_{SC}(4) = 8.74\,[0.068]$, $\chi^2_{FF}(1) = 4.86\,[0.027]$,
$\chi^2_{N}(2) = 0.01\,[0.993]$, $\chi^2_{H}(1) = 0.66\,[0.415]$.
Notes: The regression is based on the conditional ECM given by (30)
using an ARDL(6, 0, 5, 4, 5) specification with dependent variable, $\Delta w_t$
estimated over 1972q1–1997q4, and the equilibrium correction term
$\hat{v}_{t-1}$ is given in (31). $\bar{R}^2$ is the adjusted squared multiple correlation
coefficient, $\hat{\sigma}$ is the standard error of the regression, AIC and SBC are
Akaike's and Schwarz's Bayesian Information Criteria, $\chi^2_{SC}(4)$, $\chi^2_{FF}(1)$,
$\chi^2_{N}(2)$, and $\chi^2_{H}(1)$ denote chi-squared statistics to test for no residual
serial correlation, no functional form mis-specification, normal errors and
homoscedasticity respectively with p-values given in [·]. For details of
these diagnostic tests see Pesaran and Pesaran (1997, Ch. 18).
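
The reported AIC and SBC can be cross-checked against the definitions in the notes to Table I, $AIC = LL - s$ and $SBC = LL - (s/2)\ln T$, since their difference depends only on $s$ and $T$. The sketch below is a back-of-the-envelope check; the function names are hypothetical, $T = 104$ is the sample size stated in footnote 25, and reading the result as a parameter count is an inference rather than a figure reported in the paper.

```python
import math


def info_criteria(ll, s, T):
    """Return (AIC, SBC) with AIC = LL - s and SBC = LL - (s / 2) * ln(T)."""
    return ll - s, ll - 0.5 * s * math.log(T)


def implied_params(aic, sbc, T):
    """Solve AIC - SBC = s * (ln(T) / 2 - 1) for the number of coefficients s."""
    return (aic - sbc) / (0.5 * math.log(T) - 1.0)


T = 104  # 1972q1-1997q4
s_hat = implied_params(339.57, 302.55, T)

# Round trip: rebuild both criteria from the implied log-likelihood.
ll_hat = 339.57 + round(s_hat)
aic_check, sbc_check = info_criteria(ll_hat, round(s_hat), T)
print(round(s_hat, 2), round(aic_check, 2), round(sbc_check, 2))
```

Applying `implied_params` to the Table I entries for each lag order gives a quick consistency check of the same kind.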
37 The conditional ECM regression in Table III also passes the test against residual serial correlation but, as the model
was specified to deal with this problem, it should not therefore be given any extra credit!
6. CONCLUSIONS
Empirical analysis of level relationships has been an integral part of time series econometrics
and pre-dates the recent literature on unit roots and cointegration.[38] However, the emphasis of this
earlier literature was on the estimation of level relationships rather than testing for their presence (or
otherwise). Cointegration analysis attempts to fill this vacuum, but, typically, under the relatively
restrictive assumption that the regressors, $x_t$, entering the determination of the dependent variable of
interest, $y_t$, are all integrated of order 1 or more. This paper demonstrates that the problem of testing
for the existence of a level relationship between $y_t$ and $x_t$ is non-standard even if all the regressors
under consideration are $I(0)$ because, under the null hypothesis of no level relationship between $y_t$
and $x_t$, the process describing $y_t$ is $I(1)$, irrespective of whether the regressors $x_t$ are
purely $I(0)$, purely $I(1)$ or mutually cointegrated. The asymptotic theory developed in this paper
provides a simple univariate framework for testing the existence of a single level relationship
between $y_t$ and $x_t$ when it is not known with certainty whether the regressors are purely $I(0)$,
purely $I(1)$ or mutually cointegrated.[39] Moreover, it is unnecessary that the order of integration
of the underlying regressors be ascertained prior to testing the existence of a level relationship
between $y_t$ and $x_t$. Therefore, unlike typical applications of cointegration analysis, this method is
not subject to this particular kind of pre-testing problem. The application of the proposed bounds
testing procedure to the UK earnings equation highlights this point, where one need not take an a
priori position as to whether, for example, the rate of unemployment or the union density variable
is $I(1)$ or $I(0)$.
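
To make the testing problem concrete, the sketch below computes the F-statistic for the absence of level effects in a bivariate conditional ECM with an intercept and no trend, $\Delta y_t = c_0 + \pi_{yy} y_{t-1} + \pi_{yx} x_{t-1} + \omega \Delta x_t + u_t$, on simulated data. Everything here is an illustrative assumption: the function name `level_f_stat`, the absence of lagged differences, and the two data-generating processes. The resulting statistic has to be compared with the critical value bounds, not with standard F tables.

```python
import numpy as np


def level_f_stat(y, x):
    """F-statistic for pi_yy = pi_yx = 0 in
    dy_t = c0 + pi_yy * y_{t-1} + pi_yx * x_{t-1} + w * dx_t + u_t."""
    dy, dx = np.diff(y), np.diff(x)
    const = np.ones_like(dy)
    X_r = np.column_stack([const, dx])                  # restricted: no level terms
    X_u = np.column_stack([const, dx, y[:-1], x[:-1]])  # unrestricted

    def rss(X):
        resid = dy - X @ np.linalg.lstsq(X, dy, rcond=None)[0]
        return resid @ resid

    rss_r, rss_u = rss(X_r), rss(X_u)
    dof = len(dy) - X_u.shape[1]
    return ((rss_r - rss_u) / 2) / (rss_u / dof)


rng = np.random.default_rng(0)
T = 200
x = np.cumsum(rng.standard_normal(T))            # I(1) regressor
y_level = 2.0 * x + 0.1 * rng.standard_normal(T)  # level relationship y = 2x + noise
y_null = np.cumsum(rng.standard_normal(T))        # independent random walk

f_level = level_f_stat(y_level, x)
f_null = level_f_stat(y_null, x)
print(f_level, f_null)
```

When a level relationship exists the statistic is typically very large, whereas under the null it follows the non-standard distribution whose bounds are tabulated in the paper.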
The analysis of this paper is based on a single-equation approach. Consequently, it is inappropriate
in situations where there may be more than one level relationship involving $y_t$. An extension of
this paper and those of HJNR and PSS to deal with such cases is part of our current research, but
the consequent theoretical developments will require the computation of further tables of critical
values.

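We confine the main proof of Theorem 3.1 to that for Case IV and briefly detail the alterations
necessary for the other cases. Under Assumptions 1–4 and 5a, the process $\{z_t\}_{t=1}^{\infty}$ has the infinite
moving-average representation,

$$z_t = \mu + \gamma t + C s_t + C^*(L)\varepsilon_t \qquad (A1)$$

where the partial sum $s_t \equiv \sum_{i=1}^{t}\varepsilon_i$, $\Phi(z)C(z) = C(z)\Phi(z) = (1-z)I_{k+1}$, $\Phi(z) \equiv I_{k+1} - \sum_{i=1}^{p}\Phi_i z^i$,
$C(z) \equiv I_{k+1} + \sum_{i=1}^{\infty} C_i z^i = C + (1-z)C^*(z)$, $t = 1, 2, \ldots$; see Johansen (1991) and PSS.
Note that $C = (\beta_y^{\perp}, \beta^{\perp})[(\alpha_y^{\perp}, \alpha^{\perp})'\,\Gamma\,(\beta_y^{\perp}, \beta^{\perp})]^{-1}(\alpha_y^{\perp}, \alpha^{\perp})'$; see Johansen (1991, (4.5), p. 1559).
Define the $(k+2, r)$ and $(k+2, k-r+1)$ matrices $\beta_*$ and $\delta$ by

$$\beta_* \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\beta \quad \text{and} \quad \delta \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}(\beta_y^{\perp}, \beta^{\perp})$$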
39 Of course, the system approach developed by Johansen (1991, 1995) can also be applied to a set of variables containing
possibly a mixture of I0 and I1 regressors.
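where $(\beta_y^{\perp}, \beta^{\perp})$ is a $(k+1, k-r+1)$ matrix whose columns are a basis for the orthogonal
complement of $\beta$. Hence, $(\beta, \beta_y^{\perp}, \beta^{\perp})$ is a basis for $R^{k+1}$. Let $\xi$ be the $(k+2)$-unit vector $(1, 0')'$.
Then, $(\beta_*, \xi, \delta)$ is a basis for $R^{k+2}$. It therefore follows that

$$T^{-1/2}\delta' z^*_{[Ta]} = T^{-1/2}(\beta_y^{\perp}, \beta^{\perp})'\mu + T^{-1/2}(\beta_y^{\perp}, \beta^{\perp})' C s_{[Ta]} + (\beta_y^{\perp}, \beta^{\perp})' T^{-1/2} C^*(L)\varepsilon_{[Ta]} \Rightarrow (\beta_y^{\perp}, \beta^{\perp})' C B_{k+1}(a)$$

where $z^*_t = (t, z_t')'$, $B_{k+1}(a)$ is a $(k+1)$-vector Brownian motion with variance matrix $\Omega$ and $[Ta]$
denotes the integer part of $Ta$, $a \in [0, 1]$; see Phillips and Solo (1992, Theorem 3.15, p. 983). Also,
$T^{-1}\xi' z^*_t = T^{-1} t \Rightarrow a$. Similarly, noting that $\beta' C = 0$, we have that $\beta_*' z^*_t = \beta'\mu + \beta' C^*(L)\varepsilon_t =
O_P(1)$. Hence, from Phillips and Solo (1992, Theorem 3.16, p. 983), defining $\tilde{Z}^*_{-1} \equiv \bar{P}_{\iota} Z^*_{-1}$ and
$\Delta\tilde{Z}_{-} \equiv \bar{P}_{\iota}\Delta Z_{-}$, it follows that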
Z
0 0
T1 b0Ł Z̃Ł1 Z̃Ł1 bŁ D OP 1, T1 b0Ł Z̃Ł1 Z 0 Z
D OP 1, T1 Z D OP 1

0 0
D OP 1
T1 B0T Z̃Ł1 Z̃Ł1 bŁ D OP 1, T1 B0T Z̃Ł1 Z A2

where BT d, T1/2 x . Similarly, defining ũ Pi u,
0
0
ũ D OP 1
T1/2 b0Ł Z̃Ł1 ũ D OP 1, T1/2 Z A3

Cf. Johansen (1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146).
The next result follows from Phillips and Solo (1992, Theorem 3.15, p. 983); cf. Johansen
(1991, Lemma A.3, p. 1569) and Johansen (1995, Lemma 10.3, p. 146) and Phillips and Durlauf
(1986).

Lemma A.1 Let BT d, T1/2 x and define Ga D G1 a0 , G2 a0 , where G1 a b? ? 0
y ,b
1 1
CB̃kC1 a, B̃kC1 a[D BQ 1 a , B̃k a ] D BkC1 a 0 BkC1 ada, and G2 a a 2 , a 2 [0,1].
0 0 0
Then
1
1
0 0
T2 B0T Z̃Ł1 Z̃Ł1 BT ) GaGa0 da, T1 B0T Z̃Ł1 ũ ) GadBQ uŁ a
0 0
where BQ uŁ a BQ 1 a w0 B̃k a and B̃k a D BQ 1 a, B̃k a0 0 , a 2 [0, 1]
Proof of Theorem 3.1 Under H0 of (17), the Wald statistic W of (21) can be written as
0 1 0
ωO uu W D ũ0 P Z Z̃ Ł
1 Z̃ Ł
1 P Z

Z̃ Ł
1 Z̃Ł1 P Z ũ
1
0 Ł0 0
D ũ0 P Ł
Ł
Z ũ
0
where AT T1/2 bŁ , T1/2 BT . Consider the matrix A0T Z̃Ł1 P Ł
Z Z̃1 AT . It follows from (A2)
and Lemma A.1 that
1 0 Ł0 Ł
0 T bŁ Z̃1 P Z Z̃1 bŁ 00
A0T Z̃Ł1 P Z̃
Z 1 T
Ł
A D 0 C oP 1 A4
0
Next, consider A0T Z̃Ł1 P
Z ũ. From (A3) and Lemma A.1,

0
0 T1/2 b0Ł Z̃Ł1 P
Z ũ
A0T Z̃Ł1 P
Z ũ D 0
C oP 1 A5
T1 B0T Z̃Ł1 ũ
Finally, the estimator for the error variance ωuu (defined in the line after (21)),

0 Ł0 1 0 Ł0
ωO uu D T m1 ũ0 ũ ũ0 P Ł
Ł
Z ũ
D T m1 ũ0 ũ C oP 1 D ωuu C oP 1 A6
From (A4)–(A6) and Lemma A.1,

1
1 0 Ł0 0
W D T1 ũ0 P Z̃
Z 1
Ł
b Ł T b Z̃
Ł 1  P Z̃
Z 1
Ł
Z ũ/ωuu
0
1 0
C T2 ũ0 Z̃Ł1 BT T2 B0T Z̃Ł1 Z̃Ł1 BT B0T Z̃Ł1 ũ/ωuu C oP 1 A7
We consider each of the terms in the representation (A7) in turn. A central limit theorem allows us
to state 1/2
0 0
1/2
T1 b0Ł Z̃Ł1 P Z̃
Z 1
Ł
b Ł T1/2 b0Ł Z̃Ł1 P
Z ũ/ωuu ) zr ¾ N0, Ir

Hence, the first term in (A7) converges in distribution to z0r zr , a chi-square random variable with
r degrees of freedom; that is,
1
1 0 Ł0 0
2
T1 ũ0 P Ł
Ł
Z Z̃1 bŁ b0Ł Z̃Ł1 P 0
Z ũ/ωuu ) zr zr ¾ / r A8
From Lemma A.1, the second term in (A7) weakly converges to

1
1 1
1
dBQ uŁ aGa0 GaGa0 dr GkC1 adBQ uŁ a/ωuu
0 0 0
which, as C D b? ? ? ? 0 ? ? 1 ? ? 0
y , b [ay , a 0ˇy , b )] ay , a , may be expressed as

0
? ? 0 1
1
a? ? 0
y , a B̃kC1 a a?1 ? 0
y , a B̃kC1 a ay , a B̃kC1 a 0
dBQ uŁ a da
0 a 12 0 a 12 a 12

1 ? ? 0
ay , a B̃kC1 a
ð dBQ uŁ a/ωuu
0 a 12
Now, noting that under H0 of (17) we may express a? 0 0 ? ?0 0
y D 1, w and a D 0, axx where
a? 0
xx axxD 0, we define the k r C 1-vector of independent de-meaned standard Brownian
motions,
Q u a, W̃kr a0 0 ] [a?
W̃krC1 a[ D W ? 0 ? ? 1/2 ?
y , a Zay , a ] ay , a? 0 B̃kC1 a
1/2 Q

ωuu Bu a
D
a? 0 ? 1/2 ? 0
where BQ uŁ a D BQ 1 a w0 B̃k a is independent of B̃k a and B̃kC1 a BQ 1 a, B̃k a0 0 is par-
titioned according to zt D yt , x0t 0 , a 2 [0, 1]. Hence, the second term in (A7) has the following

1 0
1 0 1
dW Q u a W̃krC11a W̃krC1 a W̃krC1 a
da
0 a 2 0 a 12 a 12

1
W̃krC1 a Q u a
ð dW A9
0 a 12
Note that dW Q u a in (A9) may be replaced by dWu a, a 2 [0, 1]. Combining (A8) and (A9) gives
For the remaining cases, we need only make minor modifications to the proof for Case IV.
In Case I, d D b? ? ?
y , b with b, by , b
?
a basis for RkC1 and BT D d. For Case II, where
Ł 0 0
Z1 D iT , Z1 , we have
m0
bŁ D b
IkC1
and, consequently, we define x as in Case IV,

m0
dD b? ?
y , b and BT D d, x.
IkC1
Case III is similar to Case I as is Case V.
Proof of Theorem 3.2 We provide a proof for Case V which may be simply adapted for Cases I
and III. To emphasize the potential dependence of the limit distribution on nuisance parameters,
the proof is initially conducted under Assumptions 1-4 together with Assumption 5a which implies
! p
H0 yy : !yy D 0 but not necessarily H0 yx.x : pyx.x D 00 ; in particular, note that we may write a? y D
!
1, f0 0 for some k-vector f. The t-statistic for H0 yy : !yy D 0 may be expressed as the square
root of 1
0P
y 0 0
A0T Ẑ01 P
Z ,X̂1

Ẑ 1 A T A Ẑ P
T 1 Z Ẑ 1 A T Z ,X̂1 y/ωO uu A10
where AT T1/2 b, T1/2 BT and BT D b? ?

y , b . Note that only the diagonal element of the
?
inverse in (A10) corresponding to by is relevant, which implies that we only need to consider
the blocks T2 B0T Ẑ01 P 1 0 0
Z Ẑ1 BT and T BT Ẑ1 P Z ,X̂1 y in (A10). Therefore, using (A2) and
1 1 0 0
T1 û0 PX̂1 b?xx Ẑ1 BT T2 B0T Ẑ01 Ẑ1 BT T BT Ẑ1 PX̂1 b?xx û/ωuu A11
?0 0 ? 1 ? 0 0
where PX̂1 b?xx IT X̂1 b?
xx bxx X̂1 X̂1 bxx bxx X̂1 . Now,
?
T1/2 b? 0 ?0 ? ? ? 0 ? 1 ? ? 0
xx x̂[Ta] ) 0, bxx bxx [ay , a 0(by , b )] ay , a B̂kC1 a
f f ? 1 ? 0 f
D b? 0 ? ?0
xx bxx [axx 0xx lxy gyx.x bxx ] axx B̂k a
where, for convenience, but without loss of generality, we have set b? ? 0 0 ?

y D ˇyy , 0 and b D
?0 0 2 2 2 0 2 0 2 2 O2
0, bxx , lxy gxy /*yy.x , *yy.x *yy f gxy , gyx.x gyx f 0xx and B̂k a B̂k a lxy Bu a,
BO u2 a BO 1 a 20 B̂k a, a 2 [0, 1]. Hence, (A11) weakly converges to

1
1 1
1
OBu2 adWu a OBu2 aB̂2k a0 da a? ?0
xx axx
2 2
B̂k aB̂k a da a?
0
xx
0 0 0

1 2
1
1
2 OBu2 aB̂2k a0 da a?
ð a?
xx
0
B̂k adWu a ł BO u2 a2 da xx
0 0 0

1 1
1
2 2
ð a?
xx
0
B̂k aB̂k a0 da a?
xx a?
xx
0
B̂fk aBO u2 ada
0 0
Under the conditions of the theorem, f D w and l2xy D 0 and, therefore, BO u2 a[D BO uŁ a] D
0 2
1/2 O
ωuu Wu a and a? ?0 ?0 ? 1/2
xx B̂k a[D axx B̂k a] D axx Zxx axx Ŵkr a, a 2 [0, 1].
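Proof of Theorem 4.1 Again, we consider Case IV; the remaining Cases I–III and V may be
dealt with similarly. Under $H_1^{\pi_{yy}}: \pi_{yy} \neq 0$, Assumption 5b holds and, thus, $\Pi = \alpha_y\beta_y' + \alpha\beta'$ where
$\alpha_y = (\alpha_{yy}, 0')'$ and $\beta_y = (\beta_{yy}, \beta_{yx}')'$; see above Assumption 5b. Under Assumptions 1–4 and 5b,
the process $\{z_t\}_{t=1}^{\infty}$ has the infinite moving-average representation, $z_t = \mu + \gamma t + C s_t + C^*(L)\varepsilon_t$,
where now $C \equiv \beta^{\perp}[\alpha^{\perp\prime}\,\Gamma\,\beta^{\perp}]^{-1}\alpha^{\perp\prime}$. We redefine $\beta_*$ and $\delta$ as the $(k+2, r+1)$ and $(k+2, k-r)$
matrices,

$$\beta_* \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}(\beta_y, \beta) \quad \text{and} \quad \delta \equiv \begin{pmatrix} -\gamma' \\ I_{k+1} \end{pmatrix}\beta^{\perp},$$

where $\beta^{\perp}$ is a $(k+1, k-r)$ matrix whose columns are a basis for the orthogonal complement of
$(\beta_y, \beta)$. Hence, $(\beta_y, \beta, \beta^{\perp})$ is a basis for $R^{k+1}$ and, thus, $(\beta_*, \xi, \delta)$ a basis for $R^{k+2}$, where again
$\xi$ is the $(k+2)$-unit vector $(1, 0')'$. It therefore follows that

$$T^{-1/2}\delta' z^*_{[Ta]} = T^{-1/2}\beta^{\perp\prime}\mu + T^{-1/2}\beta^{\perp\prime} C s_{[Ta]} + \beta^{\perp\prime} T^{-1/2} C^*(L)\varepsilon_{[Ta]} \Rightarrow \beta^{\perp\prime} C B_{k+1}(a)$$

Also, as above, $T^{-1}\xi' z^*_t = T^{-1} t \Rightarrow a$ and $\beta_*' z^*_t = (\beta_y, \beta)'\mu + (\beta_y, \beta)' C^*(L)\varepsilon_t = O_P(1)$.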
The Wald statistic (21) multiplied by ωO uu may be written as
1
ũ P 0 Ł0 0
Ł Ł 0 Ł0 0 Ł0
Z ũ C 2lŁ Z̃1 P
Ł
Z Z̃1 lŁ ,
Z ũ C lŁ Z̃1 P
B1
where lŁ bŁ ay , a0 1, w0 0 , AT T1/2 bŁ , T1/2 BT and BT d, T1/2 x. Note that (A6)
!
continues to hold under H1 yy : !yy 6D 0. A similar argument to that in the Proof of Theorem 3.1
demonstrates that the first term in (B1) divided by ωuu has the limiting representation

1
1 1
1
z0rC1 zrC1 C dWu aFkr a0 Fkr aFkr a0 da Fkr adWu a B2
0 0 0
where zrC1 ¾ N0, IrC1 , Fkr a D W̃kr a0 , a 12 0 and W̃kr a a? 0 ? 1/2 ? 0
is a k r-vector of de-meaned independent standard Brownian 1 motions independent of the
standard Brownian motion Wu a, a 2 [0, 1]; cf. (22). Now, 0 Fkr adWu a is mixed normal
1
with conditional variance matrix 0 Fkr aFkr a0 da. Therefore, the second term in (B2) is
unconditionally distributed as a / 2 k r random variable and is independent of the first term; cf.
(A4). Hence, the first term in (B1) divided by ωuu has a limiting / 2 k C 1 distribution.
The second term in (B1) may be written as
0

1/2 1/2 0 Ł0
21, w0 ay , ab0Ł Z̃Ł1 P
Z ũ D 2T 1, w0
ay , a T b Z̃ P
Ł 1 Z ũ D OP T1/2 , B3

0
1, w0 ay , ab0Ł Z̃Ł1 P Ł 0 0 0
Z Z̃1 bŁ ay , a 1, w
0

DT1, w0 ay , a T1 b0Ł Z̃Ł1 P Ł 0 0 0
Z Z̃1 bŁ ay , a 1, w D OP T B4

0
as T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ converges in probability to a positive definite matrix. Moreover, as
!
1, w0 ay , a 6D 00 under H1 yy : !yy 6D 0, the Theorem is proved.
Proof of Theorem 4.2 A similar decomposition to (B1) for the Wald statistic (21) holds under
! !
H1 yx.x \ H0 yy except that bŁ and d are now as defined in the Proof of Theorem 3.1. Although
!yy !
H0 : !yy D 0 holds, we have H1 yx.x : pyx.x 6D 00 . Therefore, as in Theorem 3.2, note that we may
write a? 0 0
y D 1, f for some k-vector f 6D w. Consequently, the first term divided by ωuu may be
written as
1
1 0 Ł0 0
T1 ũ0 P Z̃
Z 1
Ł
b Ł T b Z̃
Ł 1 P Z̃
Z 1
Ł
Z ũ/ωuu
0
1 0
C T2 ũ0 Z̃Ł1 BT T2 B0T Z̃Ł1 Z̃Ł1 BT B0T Z̃Ł1 ũ/ωuu C oP 1 B5
cf. (A7). As in the Proof of Theorem 3.1, the first term of (B5) has the limiting representation z0r zr
where zr ¾ N0, Ir ; cf. (22). The second term of (B5) has the limiting representation
Q2  1

1 Bu a 0
1 BQ u2 a Q2
Bu a 0
dBQ uŁ a a? 0
xx B̃k a
 a? 0
xx B̃k a a? 0
xx B̃k a da
1 1
0 a 2 0 a 2 a 12

1 BQ u2 a
ð a? 0
xx B̃k a dBQ uŁ a/ωuu D OP 1
0 1
a 2
where BQ uf a BQ 1 a f0 B̃k a, a 2 [0, 1]; cf. Proof of Theorem 3.2. The second term of (B1)
becomes
0

1/2 1/2 0 Ł0
21, w0 ab0Ł Z̃Ł1 PZ ũ D 2T 1, w0
a T b Z̃ P
Ł 1 Z ũ D OP T1/2
and the third term

0
1, w0 ab0Ł Z̃Ł1 P Ł 0 0 0
Z Z̃1 bŁ a 1, w D T1, w a
0
0

ð T1 b0Ł Z̃Ł1 PZ 1Z̃ Ł
b Ł a0 1, w0 0 D OP T

! p
The Theorem follows as 1, w0 a 6D 00 under H0 yy : !yy D 0 and H1 yx.x : pyx.x 6D 00 .
Proof of Theorem 4.3 We concentrate on Case IV; the remaining Cases I–III and V are
proved by a similar argument. Let fztT gTtD1 denote the process under H1T of (26). Hence,
8LztT m gt D xtT , where xtT 5T 5[zt1T m gt 1] C et and 5T 5 is
given in (27). Therefore, ztT ) gt D CxtT C CŁ LxtT , Cz D C C 1 zCŁ z and
?
C D b? ? ? ? 0 ? 1 ? ? 0
y , b [ay , a 0(by , b )] ay , a , and thus,
[IkC1 IkC1 C T1 Cay b0y L]ztT m gt D CetT C CŁ LxtT B6
where

dyx
etT T1/2 b0 [zt1T m gt 1] C et , t D 1, . . . , T, T D 1, 2, . . .
dxx

s1
i
ztT D IkC1 C T1 Cay b0y s zsT m gs C m C gt C IkC1 C T1 Cay b0y
iD0
Ł
ð[CetiT C C LxtiT ]
Note that xtT D 5T 5[zt1T m gt 1] C et . It therefore follows that T1/2 d0 zŁ[Ta]T
a
) b? ? 0 Ł 0 0
y , b CJkC1 a, where d is defined above Lemma A.1 and ztT D t, ztT , JkC1 a 0 exp
0
fay by Ca rgdBkC1 r is an Ornstein-Uhlenbeck process and BkC1 a is a k C 1-vector Brow-
nian motion with variance matrix Z, a 2 [0, 1]; cf. Johansen (1995, Theorem 14.1, p. 202).
Similarly to (A4),
1 0 Ł0 Ł
T bŁ Z̃1 PZ Z̃1 bŁ 00
A0T Z̃01 P Z̃ A
Z 1 T D 0 C oP 1
Therefore, expression (B1) for the Wald statistic (21) multiplied by ωO uu is revised to
1
% 0 yP
ωO uu W D T1  Ł
T 1 0 Ł0 Ł 0
b0Ł Z̃Ł1 P
Z

Z̃ 1 b Ł b Ł Z̃ 1 P Z

Z̃ 1 b Ł Z y
1
C T2  % 0 yP Ł
T 2 0 Ł0 Ł 0
B0T Z̃Ł1 P
Z̃
Z 1 T

B B Z̃ Z̃
T 1 1 T B Z y C oP 1 B7

1
1 0 Ł0 0
T1 ũ0 P Ł
Ł
Z Z̃1 bŁ b0Ł Z̃Ł1 PZ ũ
1
1 0 Ł0 0 Ł0
C 2T1 ũ0 P Z̃
Z 1
Ł
b Ł T b Z̃ P
Ł 1  Z̃
Z 1
Ł
b Ł b0Ł Z̃Ł1 P Ł
Z Z̃1 pyT
0
1
1 0 Ł0 0 Ł0
C T1 pŁyT Z̃Ł1 P Ł
Ł
Z Z̃1 bŁ b0Ł Z̃Ł1 P Ł
Z Z̃1 pyT B8
where pŁyT T1 ˛yy b0yŁ C T1/2 dyx w0 dxx b0Ł . Defining h dyx w0 dxx 0 , consider
0 0 0
T1/2 b0Ł Z̃Ł1 P Ł Ł
Z Z̃1 pyT D T
1/2 0 Ł Ł 1
Z Z̃1 byŁ ˛yy T C bŁ hT
bŁ Z̃1 P 1/2

0
D T1 b0Ł Z̃Ł1 P Ł
Z Z̃1 bŁ h C oP 1 B9
where we have made use of T1/2 b0yŁ zŁ[Ta]T ) b0y CJkC1 a. Therefore, (B8) divided by ωuu may be
re-expressed as
0
0
1/2 0 Ł0
T1/2 b0Ł Z̃Ł1 P
Z ũ C Qh Q 1
T b Z̃
Ł 1 P Z ũ C Qh /ωuu C oP 1 D z0r zr C oP 1
B9
1 0 Ł0 Ł 1/2
where Q p limT!1 T bŁ Z̃1 P Z Z̃1 bŁ and zr ¾ NQ h, Ir .
As Ł Ł0 0
T1 B0T Z̃Ł1 P 1 0 Ł0 Ł Ł0
P Z y D P Z Z̃1 pyT C ũ, Z0 y D T B0 T Z̃1 PZ Z̃1 pyT C ũ.
Consider the second term in (B7), in particular, T1 B0T Z̃Ł1 P Ł Ł
Z Z̃1 pyT which after substitution
Ł
for pyT becomes
0 0 0
T2 B0T Z̃Ł1 P Ł
Z Z̃1 byŁ ˛yy C T
3/2 0 Ł
BT Z̃1 P Ł 2 0 Ł
Z Z̃1 bŁ h D T BT Z̃1 P
Ł
Z Z̃1 byŁ ˛yy C oP 1

1 ? ? 0
by , b CJ̃kC1 a
) 1 J̃kC1 a0 C0 by ˛yy da
0 a 2
Therefore,

1
0
b? ? 0
y , b CJ̃kC1 a 1/2 Q
T1 B0T Z̃Ł1 P
Z y ) ωuu dWu a C J̃kC1 a0 C0 by ˛yy da

0 a 12
Consider
J̃ŁkrC1 a[D JQ Łu a, J̃Łkr a0 0 ] [a? ? 0 ?

y , a Zay , a ]
? 1/2 ?
ay , a? 0 J̃kC1 a
1/2 Q
ωuu Ju a
D
a? 0 ? 1/2 ? 0
xx xx axx axx J̃k a
where JQ u a D JQ 1 a w0 J̃k a is independent of J̃k a and J̃kC1 a JQ 1 a, J̃k a0 0 , a 2 [0, 1].
Now, J̃ŁkrC1
a satisfies the stochastic integral and differential equations, J̃ŁkrC1 a D W̃krC1
0 a Ł
a C ab 0 J̃krC1 r dr and dJ̃krC1 a D dW̃krC1 a C ab0 J̃ŁkrC1 a da, where a D [a?
Ł ? 0
y ,a
? ? 1/2 ? ? 1/2
ay , a ] ay , a ay and b D [ay , a Zay , a ] ð [by , b 0ay , a ] by , b? 0
? 0 ? ? 0 ? ? ? 0 ? ? 1 ?
by ; cf. Johansen (1995, Theorem 14.4, p. 207). Note that the first element of J̃ŁkrC1 a satisfies
QJŁu a D WQ u a C ωuu 0 a Ł
1/2
˛yy b 0 J̃krC1 r dr and dJQ Łu a D dWQ u a C ωuu
1/2
˛yy b0 JQ ŁkrC1 a da.
Therefore,

1
1 0
b? ? 0
y , b CJ̃kC1 a 1/2 Q Ł
T B0T Z̃Ł1 P
Z Y ) ωuu dJu a
0 a 12
Hence, the second term in (B7) weakly converges to

1
1 1
1
ωuu dJQ Łu aFkrC1 a0 FkrC1 aFkrC1 a da 0
FkrC1 a dJQ Łu a B10
0 0 0
where FkrC1 a D J̃ŁkrC1 a0 , a 12 0 .

Combining (B9) and (B10) gives the result stated in Theorem 4.3 as ωO uu ωuu D OP 1 under
H1T of (26) and noting dJQ Łu a may be replaced by dJŁu a.
Proof of Theorem 4.4 We consider Case V; the remaining Cases I and III may be dealt with
!
similarly. Under H1 yy : !yy 6D 0, from (10), ŷ1 D X̂1 q C v̂1 , where v̂1 P Z ,X̂1 v1 and
0 0
v1 D 0, v1 , . . . , vT1 . Therefore, ŷ1 P 0 0
Z ,X̂1 y D v̂1 P
Z ,X̂1 Y and ŷ1 P
Z ,X̂1 ŷ1 D
0
Z ,X̂1 v̂1 .
v̂1 P
As in Appendix A,
T1/2 b? 0
xx x[Ta] D T
1/2 ? 0
bxx mx C T1/2 b? 0
xx gx t C T
1/2 ? 0 ?
bxx bxx a?0 0b? 1 a?0 s[Ta]
C 0, b? 0
xx T
1/2 Ł
C Le[Ta]
and noting that b0xx b? 0

xx D 0, bxx xt D T
1/2 0
bxx mx C T1/2 b0xx gx t C 0, b0xx CŁ Let . Consequently,
1 0 0
T bxx X̂1 PZ X̂1 bxx 00
A0xT X̂01 P X̂
Z 1 xTA D 0 0 C oP 1
0 T2 b? xx X̂1 P
?
Z X̂1 bxx
where AxT T1/2 bxx , T1/2 b? xx .

D OP 1, T1 Z
Now, because T1 b0xx X̂01 v̂1 D OP 1, T1 b0xx X̂01 Z 0 Z D OP 1 and

0
v̂1 D OP 1, hence T1 b0 X̂0 P
T1 Z 1 ? 0 0
xx 1 Z v̂1 D OP 1. Also because T bxx X̂1 v̂1 D OP 1
0 0 1 ? 0 0
and T1 b? xx X̂1 Z D OP 1, hence T bxx X̂1 P Z v̂1 D OP 1; cf. (A3). Hence, noting that
T1 b0xx X̂01 P X̂ b
Z 1 xx D O P 1 and T2 ? 0 0
b X̂
xx 1 P ?
Z X̂1 bxx D OP 1,
T1 ŷ01 P
Z ŷ1 D T1 v̂01 P
Z v̂1 T1 v̂01 P
Z ? v̂1 C oP 1
D T1 v̂01 P
Z v̂1 C oP 1
,X̂1 bxx
0 0 1 0 0
where P
Z ,X̂1 bxx P
Z PZ X̂1 bxx bxx X̂1 P
Z and P
Z ,X̂1 b?xx
? ?0 0 ? 1 ? 0 0 1 0
P
Z . Therefore, as T v̂1 v̂1 D OP 1,

T1 ŷ01 P
Z ŷ1 D OP 1 B11
,X̂1
The numerator of t!yy of (24) may be written as ŷ01 P û C v̂01 P D v̂0 P

y
Z Z ,X̂1
,X̂1
1 Z ,X̂1
0 0 0 1/2 0 0 1/2 0
Ẑ1 l, where l by , bay , a 1, w . Because T bxx X̂1 û D OP 1 and T Z û D
OP 1, T1/2 b0xx X̂01 P 1 ? 0 0 1 ? 0 0

Z û D OP 1, and, as T bxx X̂1 û D OP 1, T bxx X̂1 P
Z û D OP 1.
Therefore,
T1/2 v̂01 P
Z û D T1/2 v̂01 P
Z û T1/2 v̂01 P
Z ? û C oP 1
D T1/2 v̂01 P
Z û C oP 1 D OP 1
,X̂1 bxx
D OP 1, T1 l0 Ẑ0

noting T1/2 v̂01 û D OP 1. Similarly, as 1, w0 ay , a 6D 00 , T1 l0 Ẑ01 Z 1
1 0 0 ?
X̂1 bxx D OP 1 and T l Ẑ1 X̂1 bxx D OP 1. Therefore,
T1 v̂01 P
Z Ẑ1 l D T1 v̂01 P
Z Ẑ1 l T1 v̂01 P
Z ? Ẑ1 l C oP 1
D T1 v̂01 P
Z Ẑ1 l C oP 1 D OP 1
,X̂1 bxx
noting T1 v̂01 Ẑ1 l D OP 1. Thus,
T1/2 v̂01 P
Z Ẑ1 l D OP T1/2 . B12
,X̂1
Because ωO uu ωuu D oP 1, combining (B11) and (B12) yields the desired result.
ACKNOWLEDGEMENTS
We are grateful to the Editor (David Hendry) and three anonymous referees for their helpful
comments on an earlier version of this paper. Our thanks are also owed to Michael Binder, Peter
Burridge, Clive Granger, Brian Henry, Joon-Yong Park, Ron Smith, Rod Whittaker and seminar
participants at the University of Birmingham. Partial financial support from the ESRC (grant Nos
R000233608 and R000237334) and the Isaac Newton Trust of Trinity College, Cambridge, is
gratefully acknowledged. Previous versions of this paper appeared as DAE Working Paper Series,
REFERENCES
Banerjee A, Dolado J, Galbraith JW, Hendry DF. 1993. Co-Integration, Error Correction, and the Econo-
Banerjee A, Dolado J, Mestre R. 1998. Error-correction mechanism tests for cointegration in single-equation
framework. Journal of Time Series Analysis 19: 267–283.
Banerjee A, Galbraith JW, Hendry DF, Smith GW. 1986. Exploring equilibrium relationships in econometrics
through static models: some Monte Carlo Evidence. Oxford Bulletin of Economics and Statistics 48:
253–277.
Blanchard OJ, Summers L. 1986. Hysteresis and the European Unemployment Problem. In NBER
Macroeconomics Annual 15–78.
Boswijk P. 1992. Cointegration, Identification and Exogeneity: Inference in Structural Error Correction
Boswijk HP. 1994. Testing for an unstable root in conditional and structural error correction models. Journal
of Econometrics 63: 37–70.
Boswijk HP. 1995. Efficient inference on cointegration parameters in structural error correction models.
Journal of Econometrics 69: 133–158.
Cavanagh CL, Elliott G, Stock JH. 1995. Inference in models with nearly integrated regressors. Econometric
Theory 11: 1131–1147.
Chan A, Savage D, Whittaker R. 1995. The new treasury model. Government Economic Series Working
Darby J, Wren-Lewis S. 1993. Is there a cointegrating vector for UK wages? Journal of Economic Studies
20: 87–115.
Dickey DA, Fuller WA. 1979. Distribution of the estimators for autoregressive time series with a unit root.
Journal of the American Statistical Association 74: 427–431.
Dickey DA, Fuller WA. 1981. Likelihood ratio statistics for autoregressive time series with a unit root.
Engle RF, Granger CWJ. 1987. Cointegration and error correction representation: estimation and testing.
Granger CWJ, Lin J-L. 1995. Causality in the long run. Econometric Theory 11: 530–536.
Hansen BE. 1995. Rethinking the univariate approach to unit root testing: using covariates to increase power.
Harbo I, Johansen S, Nielsen B, Rahbek A. 1998. Asymptotic inference on cointegrating rank in partial
systems. Journal of Business Economics and Statistics 16: 388–399.
Hendry DF, Pagan AR, Sargan JD. 1984. Dynamic specification. In Handbook of Econometrics (Vol. II)
Johansen S. 1991. Estimation and hypothesis testing of cointegrating vectors in Gaussian vector
autoregressive models. Econometrica 59: 1551–1580.
Johansen S. 1992. Cointegration in partial systems and the efficiency of single-equation analysis. Journal of
Econometrics 52: 389–402.
Johansen S. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford Uni-
Kremers JJM, Ericsson NR, Dolado JJ. 1992. The power of cointegration tests. Oxford Bulletin of Economics
and Statistics 54: 325–348.
Layard R, Nickell S, Jackman R. 1991. Unemployment: Macroeconomic Performance and the Labour
Lindbeck A, Snower D. 1989. The Insider Outsider Theory of Employment and Unemployment, MIT Press:
Cambridge, MA.
Manning A. 1993. Wage bargaining and the Phillips curve: the identification and specification of aggregate
wage equations. Economic Journal 103: 98–118.
Nickell S, Andrews M. 1983. Real wages and employment in Britain. Oxford Economic Papers 35: 183–206.
Nielsen B, Rahbek A. 1998. Similarity issues in cointegration analysis. Preprint No. 7, Department of
Park JY. 1990. Testing for unit roots by variable addition. In Advances in Econometrics: Cointegration,
Pesaran MH, Pesaran B. 1997. Working with Microfit 4.0: Interactive Econometric Analysis, Oxford
University Press: Oxford.
Pesaran MH, Shin Y. 1999. An autoregressive distributed lag modelling approach to cointegration analysis.
Chapter 11 in Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial
Pesaran MH, Shin Y, Smith RJ. 2000. Structural analysis of vector error correction models with exogenous
I(1) variables. Journal of Econometrics 97: 293–343.
Phillips AW. 1958. The relationship between unemployment and the rate of change of money wage rates in
the United Kingdom, 1861–1957. Economica 25: 283–299.
Phillips PCB, Durlauf S. 1986. Multiple time series with integrated variables. Review of Economic Studies
53: 473–496.
Phillips PCB, Ouliaris S. 1990. Asymptotic properties of residual based tests for cointegration. Econometrica
58: 165–193.
Phillips PCB, Solo V. 1992. Asymptotics for linear processes. Annals of Statistics 20: 971–1001.
Rahbek A, Mosconi R. 1999. Cointegration rank inference with stationary regressors in VAR models. The
Econometrics Journal 2: 76–91.
Sargan JD. 1964. Real wages and prices in the U.K. Econometric Analysis of National Economic Planning,
Hart PE Mills G, Whittaker JK (eds). Macmillan: New York. Reprinted in Hendry DF, Wallis KF (eds.)
Econometrics and Quantitative Economics. Basil Blackwell: Oxford; 275–314.
1097–1107.
Urbain JP. 1992. On weak exogeneity in error correction models. Oxford Bulletin of Economics and Statistics
52: 187–202.
Available online at www.sciencedirect.com
ScienceDirect
The Journal of Finance and Data Science xx (2018) 1–19
http://www.keaipublishing.com/en/journals/jfds/
Selecting appropriate methodological framework for time series data analysis*
Min B. Shrestha a,*, Guna R. Bhatta b
a National Planning Commission, Government of Nepal, Singha Durbar, Kathmandu, Nepal
b Nepal Rastra Bank, Research Department, Baluwatar, Kathmandu, Nepal
Received 1 July 2017; revised 3 November 2017; accepted 6 November 2017
Available online xxx
Abstract
Economists face a method selection problem when working with time series data. Because time series data may possess
specific properties such as trends and structural breaks, common methods used to analyze other types of data may not be
appropriate for time series analysis. This paper discusses the properties of time series data, compares common data
analysis methods and presents a methodological framework for time series data analysis. The framework helps in choosing
appropriate test methods. As an example, Nepal's money–price relationship is examined. Test results obtained by
following this methodological framework are found to be more robust and reliable.
© 2018 China Science Publishing & Media Ltd. Production and hosting by Elsevier on behalf of KeAi Communications Co. Ltd.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Time series analysis; Unit root test; Methodological framework; Money–price relationship in Nepal
1. Introduction
Time series data is a sequence of observations of a defined variable recorded at uniform intervals over a period of
time, in successive order. The most common frequencies are annual, quarterly, monthly, weekly and daily. Economic time
series often possess distinctive features such as a clear trend, a high degree of persistence of shocks, volatility that
rises over time, and meandering and co-movement with other series.1 Researchers need to understand such features of
time series data properly and address them.
In time series analysis, it is important to understand the behavior of variables, their interactions and their
integration over time. If the major characteristics of time series data are understood and addressed properly, even a
simple regression analysis using such data can reveal the pattern of relationships among the variables of interest. This
paper highlights the basic econometric issues related to time series data and provides a basic methodological framework
for time series analysis. In addition, the paper analyses the relationship between money and prices in Nepal using this
framework, to provide a practical example.
* Note: Preliminary version of this paper was published as NRB Working Paper No. 36 (March 2017).
* Corresponding author.
E-mail addresses: minbshrestha@gmail.com (M.B. Shrestha), bhatta.gunaraj@gmail.com (G.R. Bhatta).
Peer review under responsibility of China Science Publishing & Media Ltd.
https://doi.org/10.1016/j.jfds.2017.11.001
2405-9188/© 2018 China Science Publishing & Media Ltd. Production and hosting by Elsevier on behalf of KeAi Communications Co. Ltd. This
is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Please cite this article in press as: Shrestha MB, Bhatta GR, Selecting appropriate methodological framework for time series data analysis,
(2018), https://doi.org/10.1016/j.jfds.2017.11.001
Fig. 1. Stationary time series.
2. Properties of time series data
2.1. Autoregressive character of time series
Time series data may be related to its own previous values. The autoregressive (AR) character of a time series model
indicates that the present value of a variable is determined by its past values and some adjustment factors, which are
estimated from the relation of the current value with past values. If the current value depends only on the immediately
preceding value, the process is termed first-order autoregressive, AR(1); if it depends on the two preceding values,
second-order autoregressive, AR(2); and so on.
A univariate linear regression model^c can be estimated as:
$$Y_t = \mu + \rho Y_{t-1} + \varepsilon_t \tag{1}$$
where $Y_t$ is the dependent variable $Y$ at period $t$, $\mu$ is a constant parameter, $Y_{t-1}$ is the first lagged
value of $Y$, $\rho$ is the coefficient of $Y_{t-1}$, and $\varepsilon_t$, termed the error, is the unexplained gap
between the actual data and the line fitted by the regression equation. Eq. (1) says that the value of $Y_t$ equals the
constant $\mu$ plus $\rho$ times its previous value plus an unknown component $\varepsilon_t$. The model to be estimated
in Eq. (1) is an AR(1) process.
Similarly,
$$Y_t = \mu + \rho_1 Y_{t-1} + \rho_2 Y_{t-2} + \varepsilon_t \tag{2}$$
The model to be estimated in Eq. (2) is an AR(2) process.

Besides the AR process, the moving average (MA) model expresses the present value of a variable in terms of the current
and previous periods' error terms.^d As with the AR process, an MA process can be of order higher than one.
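To make Eq. (1) concrete, the following minimal sketch simulates an AR(1) series and recovers $\mu$ and $\rho$ by ordinary least squares. It assumes NumPy is available; the function names `simulate_ar1` and `fit_ar1`, the parameter values and the standard normal shocks are illustrative choices, not taken from the paper.

```python
import numpy as np

def simulate_ar1(mu, rho, n, seed=0):
    """Simulate Y_t = mu + rho * Y_{t-1} + eps_t with standard normal shocks."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = mu / (1.0 - rho)  # start at the unconditional mean (needs |rho| < 1)
    for t in range(1, n):
        y[t] = mu + rho * y[t - 1] + eps[t]
    return y

def fit_ar1(y):
    """Estimate (mu, rho) by OLS of Y_t on a constant and Y_{t-1}."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef[0], coef[1]

y = simulate_ar1(mu=1.0, rho=0.7, n=5000, seed=42)
mu_hat, rho_hat = fit_ar1(y)
print(f"mu_hat={mu_hat:.3f}, rho_hat={rho_hat:.3f}")
```

With a long sample the estimates should lie close to the values used in the simulation; an AR(2) fit would simply add `y[:-2]` (suitably aligned) as a second lagged regressor.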
2.2. Stationary and non-stationary series
A time series is called stationary if its value tends to revert to its long-run average and the properties of the
series do not depend on time itself (Fig. 1).^e In contrast, a non-stationary time series does not tend to return to a
long-run average value; hence its mean, variance and covariance change over time (Fig. 2).
Most macroeconomic variables, such as the volume of gross domestic product (GDP), consumption and the consumer price
index, exhibit a strong upward or downward movement over time with no tendency to revert to a fixed mean.
c For details, see Stigler (1981).2
d Error terms are the unobserved factors of regression that may affect the dependent variable. These are residuals of actual and fitted values of a
regression. It is represented by ε or u. Wooldridge (2002) mentions that “dealing with this error term is the most important component of any
econometric analysis”.
e For details, see Verbeek (2017)3, Chapter 8.
Fig. 2. Non-stationary time series.
Hence, they are non-stationary series. If a time series is non-stationary, it is said to have a unit root. Therefore, in
econometrics, the stationarity of a time series is examined by conducting a unit root test.
Mathematically, the series $Y_t$ is stationary if:
$$E(Y_t) = E(Y_{t-s}) = \mu \quad \text{for any } s > 0 \tag{3}$$
$$\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t-s}) = \sigma_y^2 \quad \text{and} \tag{4}$$
$$\mathrm{Cov}(Y_t, Y_{t-s}) = \gamma_s \tag{5}$$
where
$E(Y_t)$ = expected value of $Y$ at period $t$
Var = variance, the variation or spread of $Y_t$ around $E(Y_t)$
Cov = covariance, the joint variation of $Y_t$ and $Y_{t-s}$, which depends only on the lag $s$ and not on $t$
$Y_{t-s}$ = value of $Y$ lagged $s$ periods
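Conditions (3) to (5) can be illustrated by simulation. The sketch below, which assumes NumPy and uses illustrative parameter values, generates many independent paths of a random walk and of a stationary AR(1) process and compares the cross-path variance at an early and a late date. It is a demonstration of the definition, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_obs = 2000, 200
shocks = rng.standard_normal((n_paths, n_obs))

# Random walk: Y_t = Y_{t-1} + eps_t (unit root, non-stationary).
random_walk = np.cumsum(shocks, axis=1)

# Stationary AR(1): Y_t = 0.5 * Y_{t-1} + eps_t.
ar1 = np.zeros((n_paths, n_obs))
for t in range(1, n_obs):
    ar1[:, t] = 0.5 * ar1[:, t - 1] + shocks[:, t]

# Variance across paths at observation 50 and observation 200.
rw_early, rw_late = random_walk[:, 49].var(), random_walk[:, 199].var()
ar_early, ar_late = ar1[:, 49].var(), ar1[:, 199].var()
print(f"random walk: {rw_early:.1f} -> {rw_late:.1f}")
print(f"AR(1):       {ar_early:.2f} -> {ar_late:.2f}")
```

The variance of the random walk grows with time, violating Eq. (4), while the variance of the AR(1) process stays roughly constant.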
2.3. Trend, cycle and seasonality in time series data
Trend is a sustained upward or downward movement in time series data over the long run (Fig. 3). Cycle is a short-run
fluctuation that recurs at a given interval, such as monthly, quarterly or annual (Fig. 4). Trends are always
non-stationary, whereas cycles may be either stationary or non-stationary.
Seasonality is a recurring pattern in high-frequency data such as quarterly, monthly, weekly or daily series. For
instance, we may observe a high volume of sales in the festive season, more currency coming into circulation during the
Dashain festival and increased government spending in the last quarter of the fiscal year in Nepal.^f
3. Determining stationarity of time series
Most modeling techniques applied in time series analysis are primarily concerned with the stationarity of the data. The
starting point is to examine the properties of the series graphically and then confirm them statistically. Graphs are a
preliminary tool for getting a rough idea of the stationarity of a series; however, statistical tests are required for
the final decision. Unit root tests provide statistical evidence on the stationarity of a given series.
f The easiest way of identifying seasonality is with seasonal graphs, which can be drawn using EViews. A seasonal graph plots the series for each
season. If the series is found to be higher or lower than the average in any particular season, say a month, we can conclude that there is
seasonality. If seasonality is detected in the series, it should be addressed while modeling. One possible solution is to generate a seasonally
adjusted series by using available methods. In EViews, Census X13, Census X12, X11 (Historical), Tramo/Seats and Moving Average Methods are
available. These methods generate an additional, seasonally adjusted variable from the original unadjusted series.
Fig. 3. Trend.
Fig. 4. Cycle.
3.1. Unit root test methods
The statistical procedure employed to determine the stationarity of a series is called a ‘unit root test’. This section
discusses the widely used stationarity test methods, namely the Augmented Dickey–Fuller, Phillips–Perron and KPSS
tests.
3.1.1. Augmented Dickey–Fuller (ADF) test

The Augmented Dickey–Fuller (ADF) test is the most common method for testing for a unit root. Suppose we have a series
$y_t$ to be tested for a unit root. The ADF test is then based on the following model:
$$\Delta y_t = \mu + \delta y_{t-1} + \sum_{i=1}^{k} \beta_i \Delta y_{t-i} + e_t \tag{6}$$
where
$\delta = \alpha - 1$
$\alpha$ = coefficient of $y_{t-1}$
$\Delta y_t$ = first difference of $y_t$, i.e. $y_t - y_{t-1}$
The null hypothesis of the ADF test is $\delta = 0$ against the alternative hypothesis $\delta < 0$. If we do not reject
the null, the series is non-stationary, whereas rejection means the series is stationary.
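The regression in Eq. (6) can be sketched directly with least squares. The code below is a minimal illustration assuming NumPy; the helper name `adf_t_stat`, the lag choice `k=2` and the simulated series are hypothetical. The t-statistic on $\delta$ does not follow the usual Student t distribution under the null, so it is compared here with the approximate large-sample 5% Dickey–Fuller critical value of −2.86 for a regression with a constant and no trend; statistical packages report exact critical values and p-values.

```python
import numpy as np

def adf_t_stat(y, k):
    """t-statistic on delta in Eq. (6), estimated by OLS with k lagged differences."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                           # dy[j] = y[j+1] - y[j]
    target = dy[k:]                           # Delta y_t
    cols = [np.ones(len(target)), y[k:-1]]    # constant (mu) and y_{t-1}
    for i in range(1, k + 1):
        cols.append(dy[k - i:len(dy) - i])    # Delta y_{t-i}
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)
random_walk = np.cumsum(shocks)               # unit root: delta = 0
stationary = np.zeros(500)
for t in range(1, 500):
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]   # delta = -0.5

CRITICAL_5PCT = -2.86  # approximate asymptotic value, constant and no trend
t_rw = adf_t_stat(random_walk, k=2)
t_st = adf_t_stat(stationary, k=2)
print(f"random walk: t = {t_rw:.2f}; stationary AR(1): t = {t_st:.2f}")
print("reject unit root for AR(1):", t_st < CRITICAL_5PCT)
```

A statistic below the critical value leads to rejection of the unit root null; the stationary series should produce a far more negative statistic than the random walk.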
3.1.2. Phillips–Perron (PP) test

The Phillips–Perron (PP) test is an alternative way to test for the presence of a unit root in a time series. The test
is based on the following form:
$$\Delta y_t = \pi y_{t-1} + \beta_i D_{t-i} + e_t \tag{7}$$
where $e_t$ is I(0) with zero mean and $D_{t-i}$ is a deterministic trend component.
The hypothesis tested is $\pi = 0$. The basic difference between the ADF and PP tests is that PP is a non-parametric
test, meaning that the form of the serial correlation of $\Delta y_t$ under the null hypothesis does not need to be
specified. Thus, the procedure for calculating the t-ratio on $\pi$ differs. Furthermore, PP corrects the statistic to
account for autocorrelation and heteroskedasticity. The hypothesis testing procedure is similar to that of the ADF test.
Although the ADF test has been reported to be more reliable than the PP test, size distortion and low power make both
tests less useful.4 For larger volumes of financial data, the PP test is also suggested.
3.1.3. Kwiatkowski, Phillips, Schmidt and Shin (KPSS) test

The classical testing framework is sometimes found to be biased towards accepting the null hypothesis (H0). Hence,
Kwiatkowski, Phillips, Schmidt and Shin (KPSS) developed another method to test stationarity. In the KPSS test, the null
hypothesis is that the series is stationary and the alternative is that it is non-stationary. The KPSS test model is as
follows:
$$Y_t = X_t + \varepsilon_t, \quad X_t = X_{t-1} + u_t \tag{8}$$
In the above model, the hypothesis is tested on $u_t$: under the null of stationarity, the variance of $u_t$ is zero,
so that $X_t$ is constant. The reported critical values of the KPSS test are derived from the Lagrange Multiplier (LM)
test statistic.
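As a rough illustration of the LM statistic, the sketch below computes the level-stationarity version of the KPSS statistic: partial sums of the demeaned series, scaled by a long-run variance estimate with Bartlett weights. It assumes NumPy; the function name `kpss_level_stat` and the choice of four lags are hypothetical, and the 5% critical value of 0.463 is the one reported for the intercept case in Table 3.

```python
import numpy as np

def kpss_level_stat(y, lags):
    """KPSS LM statistic for stationarity around a constant level."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    e = y - y.mean()                 # residuals from a regression on a constant
    s = np.cumsum(e)                 # partial sums of the residuals
    lrv = e @ e / n                  # long-run variance, lag-0 term
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)          # Bartlett kernel weight
        lrv += 2.0 * weight * (e[j:] @ e[:-j]) / n
    return (s @ s) / (n ** 2 * lrv)

rng = np.random.default_rng(5)
white_noise = rng.standard_normal(500)           # stationary under the null
random_walk = np.cumsum(rng.standard_normal(500))

CRITICAL_5PCT = 0.463
stat_wn = kpss_level_stat(white_noise, lags=4)
stat_rw = kpss_level_stat(random_walk, lags=4)
print(f"white noise: {stat_wn:.3f}; random walk: {stat_rw:.3f}")
```

Because the null is stationarity, a statistic above the critical value is evidence of non-stationarity, which is the reverse of the ADF and PP decision rule.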
3.2. Structural break in time series
A structural break is a sudden jump or fall in an economic time series that occurs due to, among other things, a change
in regime or policy direction or an external shock. A structural break may occur in the intercept, the trend or both
(Fig. 5).
Structural breaks can create difficulties in unit root testing. As shown by Perron (1989),5 in the presence of a
structural break, conventional unit root tests may indicate that a time series is non-stationary when it is in fact
stationary, because these methods make no adjustment for the break.
To address this issue, Perron (1989)5 developed a unit root test that accommodates a known structural break in the time
series. More recently, new methods have been proposed for unit root tests that allow for single and multiple structural
breaks at unknown dates.6–9,^g
4. Methods for time series analysis
4.1. Method selection framework
Applying an appropriate methodology is the most crucial part of time series analysis, as a wrongly specified model or a
wrong method provides biased and unreliable estimates. The choice of method is based primarily on the unit root test
results, which determine the stationarity of the variables. Methods commonly used to analyze stationary time series
cannot be used to analyze non-stationary series. If all the variables of interest are stationary, the methodology is
simple: ordinary least squares (OLS) or vector autoregressive (VAR) models can provide unbiased estimates. If all the
variables of interest are non-stationary, OLS or VAR models may not be appropriate for analyzing the relationship. An
additional problem arises when the variables are of mixed type, i.e., some are stationary and others are non-stationary.
Fig. 6 presents a general methodological framework for time series analysis.
The method selection criteria of Fig. 6 should be treated as the most basic approach, because there are several other
considerations in time series models.
Non-stationary variables can be made stationary by taking first differences. Similarly, non-stationary data with a
persistent long-run trend can be made stationary by either (i) including a time variable in the regression or (ii)
extracting trends and cycles from the single series using popular filtering techniques such as the Hodrick–Prescott (HP)
filter. Nevertheless, it should be noted that long-run information on the relationship among the variables may be lost
when they are modified to make them stationary, such as by differencing, de-trending or filtering.
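The basic selection rule described in this section can be written as a small function. This is only a sketch of the first decision step: the function name `select_method`, the dictionary input and the returned labels are hypothetical, and the integration orders are assumed to come from unit root tests carried out beforehand.

```python
def select_method(orders):
    """Map integration orders, e.g. {"CPI": 1, "NEER": 0}, to a first-step method."""
    values = set(orders.values())
    if not values:
        raise ValueError("at least one variable is required")
    if any(d not in (0, 1) for d in values):
        raise ValueError("this sketch only covers I(0) and I(1) variables")
    if values == {0}:
        return "OLS/VAR"          # all variables stationary
    if values == {1}:
        return "Johansen test"    # all variables non-stationary
    return "ARDL"                 # mixed orders of integration

print(select_method({"Y": 0, "X": 0}))
print(select_method({"Y": 1, "X": 1}))
print(select_method({"Y": 1, "X": 0}))
```

When the Johansen test finds cointegration among I(1) variables, an error correction model is the natural next step, as discussed in Section 4.5.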
g See Shrestha and Chowdhury (2005)10 for detailed discussion on unit root test with the structural break.
Fig. 5. Structural break in time series data, a. Structural break in intercept, b. Structural break in intercept, c. Structural break in trend, d.
Structural break in intercept and trend.
[Flowchart. Unit root tests lead to three cases: all variables stationary (OLS/VAR models); all variables non-stationary (Johansen test, with
outcomes "no cointegration" and "cointegration", the latter followed by ECM); mixed variables (ARDL models). The chart also contains a
causality test node.]
Fig. 6. Method selection for time series data. OLS: Ordinary least squares; VAR: Vector autoregressive; ARDL: Autoregressive distributed lags;
ECM: Error correction models.
4.2. Ordinary least square (OLS) method
The first step in time series analysis is to conduct unit root tests. If the results show that all variables being
analyzed are stationary, then the OLS method can be used to determine the relationship between the variables. A
bivariate linear regression model, estimated by ordinary least squares (OLS), can be written as:
$$Y_i = \beta_1 + \beta_2 X_i + e_i \tag{9}$$
which can be rearranged as:

$$e_i = Y_i - \hat{Y}_i \tag{10}$$
$$\phantom{e_i} = Y_i - \beta_1 - \beta_2 X_i \tag{11}$$
The above model shows that the residuals ($e_i$) are simply the differences between the actual ($Y_i$) and estimated
($\hat{Y}_i$) values. OLS chooses $\beta_1$ and $\beta_2$ so as to minimize the residual sum of squares.^h
As mentioned above, a non-stationary time series can be converted into a stationary series by differencing. If a time
series becomes stationary after being differenced once, the series is said to be integrated of order one, denoted I(1).
Similarly, if a time series has to be differenced twice to become stationary, it is called integrated of order two,
written I(2). As a stationary time series need not be differenced, it is denoted I(0).
Differencing the non-stationary time series and applying OLS once all the variables are stationary may seem an easy way
to analyze the relationship. However, the differences represent only the short-run changes in the time series and miss
the long-run information entirely. Hence, this method is not suggested for the analysis of non-stationary variables.
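The link between differencing and the order of integration can be checked numerically. In the sketch below (NumPy assumed, variable names illustrative), an I(1) series is built by cumulating stationary shocks once and an I(2) series by cumulating them twice; differencing once or twice, respectively, recovers the original shocks.

```python
import numpy as np

rng = np.random.default_rng(1)
shocks = rng.standard_normal(300)   # I(0): stationary by construction

i1 = np.cumsum(shocks)              # I(1): stationary after one difference
i2 = np.cumsum(i1)                  # I(2): stationary after two differences

d1 = np.diff(i1)                    # first difference of the I(1) series
d2 = np.diff(i2, n=2)               # second difference of the I(2) series

print(np.allclose(d1, shocks[1:]))  # differencing once undoes one cumulation
print(np.allclose(d2, shocks[2:]))  # differencing twice undoes two cumulations
```

The level of the cumulated series cannot be reconstructed from the differences alone, which is one way to see why long-run information is lost.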
4.3. Vector autoregressive (VAR) model
The vector autoregressive (VAR) model allows for feedback, or reverse causality, among the variables by using their own
past values. The general VAR model requires no exogenous variables, as it treats all the variables as endogenous. A
simple VAR^i for two variables X and Y with only one lag is given below:
$$Y_t = \delta_1 + \theta_{11} Y_{t-1} + \theta_{12} X_{t-1} + \varepsilon_{1t} \tag{12}$$
$$X_t = \delta_2 + \theta_{21} Y_{t-1} + \theta_{22} X_{t-1} + \varepsilon_{2t} \tag{13}$$
where $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are uncorrelated white noise disturbances or error terms.
Choosing an appropriate lag length is important in VAR modeling. The optimal number of lags can be selected by using
available lag length selection criteria. The most popular are the Akaike Information Criterion (AIC), the Schwarz
Bayesian Criterion (SBC) and the Hannan–Quinn Criterion (HQC).
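Because each equation of the VAR in Eqs. (12) and (13) has the same regressors, it can be estimated equation by equation with OLS. The sketch below, which assumes NumPy and uses made-up coefficient values, simulates a stable two-variable VAR(1) and recovers the intercepts and the coefficient matrix in a single least-squares call.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3000
delta = np.array([0.5, -0.2])                  # intercepts (delta_1, delta_2)
theta = np.array([[0.5, 0.2],                  # [theta_11, theta_12]
                  [0.1, 0.4]])                 # [theta_21, theta_22]

z = np.zeros((n, 2))                           # column 0 is Y, column 1 is X
for t in range(1, n):
    z[t] = delta + theta @ z[t - 1] + rng.standard_normal(2)

# Regress (Y_t, X_t) on a constant, Y_{t-1} and X_{t-1}.
regressors = np.column_stack([np.ones(n - 1), z[:-1]])
coef, *_ = np.linalg.lstsq(regressors, z[1:], rcond=None)  # shape (3, 2)
delta_hat = coef[0]
theta_hat = coef[1:].T                         # row = equation, column = lagged variable

print(np.round(delta_hat, 2))
print(np.round(theta_hat, 2))
```

Lag length selection would repeat this fit for several lag orders and compare an information criterion such as AIC across them.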
4.4. Cointegration test
Using ordinary least squares or similar methods on non-stationary time series may produce spurious results. In other
words, the regression may indicate a significant relationship between two variables that are in fact unrelated. This
type of regression is termed a ‘spurious regression’ and mainly occurs because of the non-stationarity of the time series
used in the regression model. On the other hand, two or more variables may form a long-term equilibrium relationship
even though they may deviate from the equilibrium in the short run. To deal with these issues, Engle and Granger
(1987)13 developed a cointegration test method to analyze the relationships among non-stationary variables.
If two or more variables are linked by an equilibrium relationship spanning the long run, these variables are said to be
cointegrated. In effect, one variable drags the other along over time and hence both share the same movement. Fig. 7
shows the movement of two cointegrated time series.
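The spurious regression problem is easy to reproduce by simulation. The sketch below (NumPy assumed; the helper `slope_t_stat`, the sample size and the number of replications are illustrative) regresses one random walk on another, independent random walk many times and counts how often the slope appears significant at the conventional 1.96 threshold, first in levels and then in first differences.

```python
import numpy as np

def slope_t_stat(y, x):
    """Conventional OLS t-statistic on the slope of y = a + b*x + e."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(3)
n_rep, n_obs = 500, 100
reject_levels = 0
reject_diffs = 0
for _ in range(n_rep):
    x = np.cumsum(rng.standard_normal(n_obs))   # two independent random walks
    y = np.cumsum(rng.standard_normal(n_obs))
    reject_levels += int(abs(slope_t_stat(y, x)) > 1.96)
    reject_diffs += int(abs(slope_t_stat(np.diff(y), np.diff(x))) > 1.96)

share_levels = reject_levels / n_rep
share_diffs = reject_diffs / n_rep
print(f"'significant' in levels: {share_levels:.2f}; in differences: {share_diffs:.2f}")
```

Although the two series are unrelated by construction, the levels regression rejects far more often than the nominal 5 percent, while the regression in differences behaves roughly as expected.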
4.4.1. Johansen cointegration test

Addressing weaknesses in the Engle–Granger methodology, Johansen (1988) and Johansen and Juselius (1990)14,15 developed
improved cointegration test models. The Johansen (1988)14 version is widely used and has been incorporated in various
econometric software packages. This test method is based on the relationship between the rank of a matrix and its
characteristic roots.
Consider a generalized model for a vector of n variables:
h See Gujarati (1995)11 for detailed discussion.
i See Sims and Sachs (1982)12 for details.
Fig. 7. Remittance and import in Nepal. [Series plotted: Remittance, Import; horizontal axis: 2001 to 2015.]
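$$x_t = A_1 x_{t-1} + \varepsilon_t, \tag{14}$$
so that
$$\Delta x_t = A_1 x_{t-1} - x_{t-1} + \varepsilon_t \tag{15}$$
$$\phantom{\Delta x_t} = (A_1 - I) x_{t-1} + \varepsilon_t,$$
which can be written as
$$\Delta x_t = \Pi x_{t-1} + \varepsilon_t \tag{16}$$
where
$x_t$ and $\varepsilon_t$ are $(n \times 1)$ vectors
$A_1$ = an $(n \times n)$ matrix of parameters
$I$ = an $(n \times n)$ identity matrix
$\Pi = A_1 - I$
We test the rank of the matrix $A_1 - I$. If the rank of $A_1 - I$, that is, the rank of $\Pi$, is 0, then the
sequences are unit root processes with no cointegration. If the rank of $\Pi$ equals $k$, the number of variables (so
that $k = n$), then the series are stationary, and if the rank of $\Pi$ is positive but less than $k$, also known as
reduced rank, then cointegration exists. Hence, the intuition is that if we have three variables in a cointegration
test, the cointegration rank must be less than three (if $k = 3$, cointegration rank < 3 and the maximum number of
cointegrating relations is two).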
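The rank conditions can be illustrated with known coefficient matrices. The sketch below assumes NumPy and uses three made-up examples of $A_1$; the function name `cointegration_case` is hypothetical. In applied work the matrix is estimated from data and its rank is inferred from Johansen's trace and maximum eigenvalue statistics, not computed exactly as here.

```python
import numpy as np

def cointegration_case(A1):
    """Classify a VAR(1) by the rank of Pi = A1 - I."""
    A1 = np.asarray(A1, dtype=float)
    n = A1.shape[0]
    rank = int(np.linalg.matrix_rank(A1 - np.eye(n)))
    if rank == 0:
        return rank, "unit roots, no cointegration"
    if rank == n:
        return rank, "stationary"
    return rank, "cointegration"

independent_walks = np.eye(2)                 # Pi = 0
stationary_var = 0.5 * np.eye(2)              # Pi = -0.5 * I, full rank
linked = np.array([[1.0, 0.0],                # x1 is a random walk
                   [0.5, 0.5]])               # x2 adjusts towards x1

for name, A1 in [("independent walks", independent_walks),
                 ("stationary VAR", stationary_var),
                 ("linked series", linked)]:
    print(name, cointegration_case(A1))
```

In the third example $\Pi$ has rank one, so the two series share a single cointegrating relation.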
4.5. Error correction models
If the variables are I(1) and a cointegration relationship exists, then an Error Correction Model (ECM) can be derived.
Consider the following bivariate relationship:
$$Y_t = \mu + \beta_1 X_t + \varepsilon_t \tag{17}$$
Based on the representation theorem of Engle and Granger (1987),13 we establish a link between cointegration and the
Error Correction Model (ECM) by transforming Eq. (17).
The cointegration equation between $Y_t$ and $X_t$ is as follows:
$$\varepsilon_t = Y_t - \mu - \beta_1 X_t \tag{18}$$
The Error Correction Models for $Y_t$ and $X_t$ are as follows:

$$\Delta Y_t = \mu_Y + \alpha_Y \varepsilon_{t-1} + \sum_{h=1}^{l} \alpha_{1h} \Delta Y_{t-h} + \sum_{h=1}^{l} \beta_{1h} \Delta X_{t-h} + u_{Yt} \tag{19}$$
$$\Delta X_t = \mu_X + \alpha_X \varepsilon_{t-1} + \sum_{h=1}^{l} \alpha_{2h} \Delta Y_{t-h} + \sum_{h=1}^{l} \beta_{2h} \Delta X_{t-h} + u_{Xt} \tag{20}$$
where $u_{Yt}$ and $u_{Xt}$ are stationary white noise processes for some number of lags $l$. The model can be extended
to the multivariate case in a similar way.
The coefficients in the cointegration equation give the estimated long-run relationship among the variables, and the
coefficients of the ECM describe how deviations from that long-run relationship affect changes in the variables in the
next period. The parameters $\alpha_Y$ and $\alpha_X$ of Eqs. (19) and (20) measure the speed of adjustment of $Y_t$ and
$X_t$, respectively, towards the long-run equilibrium.
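A two-step estimation in the spirit of Engle and Granger can be sketched as follows. The code assumes NumPy, simulates data in which only Y adjusts to the equilibrium error, and uses one lag of the differences; the data-generating values (long-run slope 1.5, adjustment speed −0.3) and the helper name `ols` are illustrative assumptions, not results from the paper.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated cointegrated pair: X is a random walk, Y corrects 30% of the gap each period.
rng = np.random.default_rng(11)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
y[0] = 2.0
for t in range(1, n):
    x[t] = x[t - 1] + rng.standard_normal()
    gap = y[t - 1] - 2.0 - 1.5 * x[t - 1]          # epsilon_{t-1} in Eq. (18)
    y[t] = y[t - 1] - 0.3 * gap + rng.standard_normal()

# Step 1: long-run relationship, Eq. (17).
mu_hat, beta_hat = ols(np.column_stack([np.ones(n), x]), y)
ec = y - mu_hat - beta_hat * x                     # estimated equilibrium error

# Step 2: ECMs of Eqs. (19) and (20) with l = 1.
dy, dx = np.diff(y), np.diff(x)
Z = np.column_stack([np.ones(n - 2), ec[1:-1], dy[:-1], dx[:-1]])
alpha_y = ols(Z, dy[1:])[1]                        # speed of adjustment of Y
alpha_x = ols(Z, dx[1:])[1]                        # speed of adjustment of X

print(f"beta_hat={beta_hat:.3f}, alpha_Y={alpha_y:.3f}, alpha_X={alpha_x:.3f}")
```

A negative and sizeable $\alpha_Y$ together with $\alpha_X$ near zero indicates that Y does the adjusting when the two series drift apart.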
4.6. ARDL models
The Johansen cointegration test cannot be applied directly if the variables of interest are of mixed order of
integration or if not all of them are non-stationary, as this method requires all the variables to be I(1). An
autoregressive distributed lag (ARDL) model is an ordinary least squares (OLS) based model that is applicable both to
non-stationary time series and to time series with mixed order of integration.16,17 This model takes a sufficient number
of lags to capture the data generating process in a general-to-specific modeling framework.
A dynamic error correction model (ECM) can be derived from the ARDL model through a simple linear transformation. The
ECM integrates the short-run dynamics with the long-run equilibrium without losing long-run information, and avoids
problems such as spurious relationships resulting from non-stationary time series data.
To illustrate the ARDL modeling approach, the following simple model can be considered:
$$y_t = \alpha + \beta x_t + \delta z_t + e_t \tag{21}$$
The error correction version of the ARDL model is given by:

$$\Delta y_t = \alpha_0 + \sum_{i=1}^{p} \beta_i \Delta y_{t-i} + \sum_{i=1}^{p} \delta_i \Delta x_{t-i} + \sum_{i=1}^{p} \varepsilon_i \Delta z_{t-i} + \lambda_1 y_{t-1} + \lambda_2 x_{t-1} + \lambda_3 z_{t-1} + u_t \tag{22}$$
The first part of the equation, with $\beta_i$, $\delta_i$ and $\varepsilon_i$, represents the short-run dynamics of
the model. The second part, with the $\lambda$s, represents the long-run relationship. The null hypothesis is
$\lambda_1 = \lambda_2 = \lambda_3 = 0$, which means that no long-run relationship exists.
4.7. Causality test
If two variables Y and X are cointegrated, then any of three relationships may exist: (a) X affects Y, (b) Y affects X,
or (c) X and Y affect each other. The first two are unidirectional relationships, while the third is bidirectional. If
the two variables are not cointegrated, then neither affects the other and they are independent. To determine the
pattern of such relationships, Granger (1969)18 developed a causality test method. If current and lagged values of X
improve the prediction of the future value of Y, then X is said to ‘Granger cause’ Y. A simple model of Granger
causality is as follows:
$$\Delta Y_t = \sum_{i=1}^{n} \alpha_i \Delta Y_{t-i} + \sum_{j=1}^{n} \beta_j \Delta X_{t-j} + u_{1t} \tag{23}$$
$$\Delta X_t = \sum_{i=1}^{n} \lambda_i \Delta X_{t-i} + \sum_{j=1}^{n} \delta_j \Delta Y_{t-j} + u_{2t} \tag{24}$$
Eq. (23) postulates that the current value of $\Delta Y$ is related to past values of itself and past values of
$\Delta X$. Similarly, Eq. (24) postulates that $\Delta X$ is related to past values of itself and of $\Delta Y$.
The null hypothesis in Eq. (23) is $\beta_j = 0$ for all $j$, which means “$\Delta X$ does not Granger cause
$\Delta Y$”. Similarly, the null hypothesis in Eq. (24) is $\delta_j = 0$ for all $j$, which states “$\Delta Y$ does not
Granger cause $\Delta X$”. The rejection or non-rejection of the null hypothesis is based on the F-statistic.
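The F-statistic for the null hypothesis in Eq. (23) compares the residual sum of squares of the regression with and without the lagged values of the other variable. The sketch below assumes NumPy, follows Eqs. (23) and (24) in omitting an intercept, and uses simulated differenced series in which only $\Delta X$ helps to predict $\Delta Y$; the helper names and the lag length of two are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid

def lag_matrix(v, n_lags):
    """Columns v_{t-1}, ..., v_{t-n_lags}, aligned with v[n_lags:]."""
    return np.column_stack([v[n_lags - i:len(v) - i] for i in range(1, n_lags + 1)])

def granger_f(effect, cause, n_lags):
    """F-statistic for H0: lags of `cause` do not help to predict `effect`."""
    target = effect[n_lags:]
    own = lag_matrix(effect, n_lags)
    other = lag_matrix(cause, n_lags)
    rss_restricted = rss(own, target)
    rss_unrestricted = rss(np.column_stack([own, other]), target)
    dof = len(target) - 2 * n_lags
    return ((rss_restricted - rss_unrestricted) / n_lags) / (rss_unrestricted / dof)

rng = np.random.default_rng(2)
dx = rng.standard_normal(500)
dy = rng.standard_normal(500)
dy[1:] += 0.5 * dx[:-1]              # last period's dX feeds into dY

f_x_to_y = granger_f(dy, dx, n_lags=2)
f_y_to_x = granger_f(dx, dy, n_lags=2)
print(f"F (dX -> dY) = {f_x_to_y:.1f}; F (dY -> dX) = {f_y_to_x:.1f}")
```

The statistic is then compared with the critical value of the F distribution with `n_lags` and `dof` degrees of freedom.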
5. Diagnostic tests of the time series model
To make the estimated model robust and unbiased, we need to assess the fit of the model by checking goodness of fit
statistics and conducting diagnostic tests.
5.1. Goodness of fit
A rough impression of the robustness of the estimated regression coefficients can be formed by examining, among other
things, how well the regression line explains the data, whether there is serial correlation in the residuals and whether
the overall model is significant. Goodness of fit statistics are displayed together with the estimated coefficients by
almost all types of software.
Common measures of goodness of fit include $R^2$, which in the bivariate case equals the squared correlation between the
two variables; a value closer to 1 is considered better. In a multivariate regression, adjusted $R^2$ is preferred to
$R^2$: $R^2$ never decreases when a variable is added, while adjusted $R^2$ increases only when the new variable improves
the predictive power. The Durbin–Watson (DW) statistic indicates whether there is autocorrelation in the residuals. If
the value of DW is close to two, the model is considered to be ‘autocorrelation free’.
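These three statistics follow directly from the regression residuals. The sketch below assumes NumPy; the function name `fit_statistics` and the small data set are made up for illustration, with errors that alternate in sign so that the Durbin–Watson statistic lies well above two, signalling negative autocorrelation.

```python
import numpy as np

def fit_statistics(y, X):
    """Return (R^2, adjusted R^2, Durbin-Watson); X must include a constant column."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, k = X.shape
    rss = resid @ resid
    tss = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    dw = (np.diff(resid) ** 2).sum() / rss
    return r2, adj_r2, dw

x = np.arange(10, dtype=float)
errors = np.array([1.0, -1.0] * 5)            # sign flips every period
y = 2.0 + 3.0 * x + errors
X = np.column_stack([np.ones(10), x])

r2, adj_r2, dw = fit_statistics(y, X)
print(f"R2={r2:.3f}, adjusted R2={adj_r2:.3f}, DW={dw:.2f}")
```

The DW statistic ranges from 0 to 4: values near 0 point to positive autocorrelation and values near 4 to negative autocorrelation.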
5.2. Diagnostics tests
Diagnostic tests tell us about the robustness of the estimated coefficients. Diagnostic test statistics are generally
not reported automatically by software and thus should be computed separately. The type of diagnostic test depends upon
the modeling technique being used. However, the most common types of diagnostic tests are lag structure, coefficient
diagnostics and residual diagnostics. Residual diagnostics are the most crucial part of diagnostic testing in economic
modeling, since regression models try to minimize the errors (or residuals). The error terms must be white noise
(independently and identically distributed, i.i.d.), and residual diagnostics examine whether they are. The Lagrange
multiplier (LM) test, the correlogram and the heteroskedasticity test are the major methods for residual diagnostics.
Stability diagnostics examine whether the parameters of the estimated model are stable across various sub-samples of the
data.
The diagnostic tests are discussed in detail in Annex 1.
6. Analyzing the money–price relationship in Nepal: an example
6.1. Theoretical background
Classical and neoclassical economists hold that an over-supply of money leads to an increase in the price level. The
well-known quantity theory of money of Fisher (1922)19 expresses the money–price relationship as follows:
$$MV = PT \tag{25}$$
where $M$ denotes the money supply, $V$ refers to the velocity of money, $P$ is the average price level and $T$
indicates the total volume of transactions of goods and services in an economy. The modern quantity theory of money
(QTM) holds that firm-specific cost increases cannot be inflationary as long as they are not related to, or accommodated
by, increases in the money supply. The relationship can be expressed as:
$$MV = PY \tag{26}$$
In the above equation, if the output of the economy, $Y$, and the velocity of money, $V$, are given, then an increase in
$M$ will increase $P$ proportionately.
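The proportionality claim can be made explicit by taking logarithms of Eq. (26) and differencing over time. This is a standard manipulation rather than a step taken from the paper; the time subscripts are added here for illustration, and log differences are used as approximate growth rates.

```latex
\begin{align*}
\ln M_t + \ln V_t &= \ln P_t + \ln Y_t \\
\Delta \ln P_t &= \Delta \ln M_t + \Delta \ln V_t - \Delta \ln Y_t \\
\Delta \ln P_t &= \Delta \ln M_t \qquad \text{when } \Delta \ln V_t = \Delta \ln Y_t = 0
\end{align*}
```

With velocity and output held fixed, the growth rate of the price level equals the growth rate of the money supply, which is why a unit elasticity of prices with respect to money is the natural benchmark in log regressions.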
In developing countries like Nepal, where supply-side bottlenecks are also a major issue, demand-side inflation may be
dominated by structural constraints. This paper empirically analyses the money–price relationship in Nepal by following
the econometric framework discussed in the preceding sections.
6.2. Data and unit root test
In line with the methodological framework discussed above, Nepal's money–price relationship is analyzed with due
consideration to the properties of time series. We include the monthly series of the Nepalese consumer price index
(CPI), the nominal effective exchange rate (NEER), broad money (M2) in Rs. million, and the Indian CPI (CPII) from
January 2000 through April 2014. The graphical plots of the series are presented in Annex 2.
From the graphs shown in Annex 2, we can detect possible non-stationarity in the Nepalese CPI, the Indian CPI and M2,
but cannot determine the nature of the non-stationarity. NEER looks like a stationary series, but we cannot be sure of
this from the graph alone.
The unit root tests on the monthly series of CPI, M2, NEER and CPII were carried out on the level data and on series
transformed by taking logs and first differences, with an intercept and with both trend and intercept, separately for
the three popular test methods discussed in Section 3.1: ADF, PP and KPSS. The unit root test results are presented
below in Table 1.
The ADF tests for stationarity show that all four variables are non-stationary in levels as well as in logs. The level
series of NEER and CPII become stationary at first difference. Nonetheless, even at first difference, CPI and M2 are
non-stationary, although M2 becomes stationary at first difference after taking logs. None of the series is trend
stationary, since all of them remain non-stationary after the inclusion of a time trend in the ADF test equation. CPI is
found to be non-stationary even at first difference, with and without logs. However, CPI is stationary at the 5 percent
level at first difference if both trend and intercept are included in the ADF test equation.
The Phillips–Perron (PP) test results also show that all the variables are non-stationary in levels (Table 2). The
results are consistent with the ADF test results.
The KPSS test for stationarity shows similar results: all the series are non-stationary in levels (with and without
logs). Nonetheless, the test results differ at first difference. Although CPI and log(CPI) were both non-stationary at
first difference (without trend) in the ADF tests, the KPSS test reports that log(CPI) is stationary at first difference
but CPI is not. Both were stationary at first difference in the PP tests. In the case of M2, both M2 and log(M2) were
non-stationary in the KPSS tests even at first difference, but log(M2) was stationary in the PP as well as the ADF tests
(at first difference). M2 at first difference was stationary only in the PP tests. Surprisingly, although log(CPII) was
stationary at first difference in the ADF and PP tests, it is not stationary in the KPSS tests (Table 3).
As shown in Table 4, although the non-stationarity property can be confirmed by any of the available test methods, the
way we make variables stationary and retest for confirmation can sometimes produce inconsistent results. We usually have
to be careful with variables that are not stationary even at first difference and with those on the borderline of the
decision points. One good approach may be to choose the property that is repeated or similar across the various test
results.
6.3. OLS model estimation
Following the methodology illustrated in Section 3, we should not estimate an OLS model, as the unit root tests show
that all the variables included in our model are non-stationary. However, for comparison purposes, we run the following
OLS regression using the log data in order to measure elasticities.
$$\log(CPI_t) = \alpha + \beta_1 \log(M2_t) + \beta_2 \log(CPII_t) + \beta_3 \log(NEER_t) + \varepsilon_t \tag{27}$$
The estimates of Eq. (27) are:

$$\log(CPI_t) = \underset{(5.75)}{0.97^{*}} + \underset{(4.04)}{0.106^{*}} \log(M2_t) + \underset{(13.28)}{0.776^{*}} \log(CPII_t) + \underset{(3.37)}{0.14^{*}} \log(NEER_t) \tag{28}$$
Adj. $R^2$ = 0.999, F-stat: 12040, DW stat: 0.686. *: significant at 5 percent or lower level.
Table 1
ADF test results.
Columns: Variable; Intercept, level (t-stat, p-value); Intercept, first difference (t-stat, p-value); Trend and intercept, level (t-stat, p-value); Trend and intercept, first difference (t-stat, p-value).
CPI 2.252 1.000 0.996 0.754 0.004 0.996 3.689 0.026
log(CPI) 1.375 0.999 1.953 0.307 2.150 0.514 2.722 0.229
M2 4.6176 1.000 0.653 0.854 0.904 0.999 6.679 0.000
log(M2) 2.147 0.999 12.160 0.000 1.399 0.858 12.576 0.000
NEER 1.456 0.553 10.736 0.000 1.162 0.914 10.805 0.000
log(NEER) 1.441 0.561 10.845 0.000 1.139 0.918 10.916 0.000
CPII 5.149 1.000 9.498 0.000 0.566 0.999 7.613 0.000
log(CPII) 3.748 1.000 10.406 0.000 1.177 0.911 7.605 0.000
Table 2
Phillips–Perron test results.
            Intercept                             Trend and intercept
            Level            First difference     Level            First difference
Variable    t-stat  p-value  t-stat   p-value     t-stat  p-value  t-stat   p-value
CPI         3.705   1.000    10.755   0.000       1.130   0.920    12.232   0.000
log(CPI)    2.541   1.000    13.249   0.000       2.535   0.311    18.626   0.000
M2          7.46    1.000    12.359   0.000       1.351   1.000    14.187   0.000
log(M2)     1.662   0.999    13.917   0.000       1.352   0.871    14.265   0.000
NEER        1.127   0.704    10.657   0.000       0.928   0.949    10.777   0.000
log(NEER)   1.127   0.704    10.819   0.000       0.944   0.948    10.876   0.000
CPII        3.892   1.000    9.603    0.000       0.587   0.978    10.422   0.000
log(CPII)   2.541   1.000    13.249   0.000       2.535   0.311    18.626   0.000
Table 3
KPSS test results (LM statistics).
            Intercept                      Trend and intercept
            (5% critical value = 0.463)    (5% critical value = 0.146)
Variable    Level    First difference      Level    First difference
CPI         1.579    0.732                 0.417    0.032
log(CPI)    1.638    0.461                 0.405    0.112
M2          1.519    1.385                 0.415    0.108
log(M2)     1.647    0.527                 0.347    0.102
NEER        0.466    0.267                 0.342    0.072
log(NEER)   0.471    0.261                 0.344    0.073
CPII        1.537    1.235                 0.412    0.051
log(CPII)   1.604    0.696                 0.411    0.061
Note: The null hypothesis of stationarity is not rejected when the LM statistic is smaller than the critical value, and is rejected otherwise.
Table 4
Comparison of results of three unit root test methods (verdict at first difference).
Variables   ADF              PP               KPSS
CPI         Non-stationary   Non-stationary   Stationary
log(CPI)    Non-stationary   Stationary       Stationary
M2          Non-stationary   Stationary       Non-stationary
log(M2)     Stationary       Stationary       Non-stationary
CPII        Stationary       Stationary       Non-stationary
log(CPII)   Stationary       Stationary       Non-stationary
If the time series properties of the data are ignored, the level-data estimates look robust, with a
high adjusted R², a significant F-statistic and all variables significant. But because the series are
non-stationary, these estimates may be spurious. This is indicated by the low
DW statistic (0.686), which is even lower than the R² value. An adjusted R² very close to 1 (0.999) is likewise
a typical symptom of a spurious relation. Further, a plot of the residuals shows that they do not behave like random
zero-mean disturbances, which violates the OLS assumptions (Annex 3, Figure A5).
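The comparison of R² with the DW statistic is the rule of thumb usually attributed to Granger and Newbold. A minimal sketch follows; the helper names and the two residual series are hypothetical and serve only to show how the statistic behaves.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over the sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def looks_spurious(r_squared, dw_stat):
    """Rule of thumb: an R-squared above the DW statistic is a warning sign."""
    return r_squared > dw_stat

# Values reported for Eq. (28): adjusted R2 = 0.999, DW = 0.686.
print(looks_spurious(0.999, 0.686))

# Hypothetical residuals, for illustration only: a persistent (trending) series
# gives a DW statistic near 0, an alternating one gives a value above 2.
print(durbin_watson([1.0, 2.0, 3.0, 4.0, 5.0]))
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

A DW value near 2 is what uncorrelated residuals would produce; the reported 0.686 points to strong positive autocorrelation.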
6.4. VAR model estimation
The VAR models of those four variables using level data with two lags can be represented as follows:
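$$
\begin{aligned}
\log(\mathrm{CPI}_t) ={}& \alpha_1 + \beta_1 \log(\mathrm{CPI}_{t-1}) + \beta_2 \log(\mathrm{CPI}_{t-2}) + \beta_3 \log(\mathrm{M2}_{t-1}) + \beta_4 \log(\mathrm{M2}_{t-2}) \\
& + \beta_5 \log(\mathrm{CPII}_{t-1}) + \beta_6 \log(\mathrm{CPII}_{t-2}) + \beta_7 \log(\mathrm{NEER}_{t-1}) + \beta_8 \log(\mathrm{NEER}_{t-2}) + \varepsilon_t \qquad (29) \\
\log(\mathrm{CPII}_t) ={}& \alpha_2 + \gamma_1 \log(\mathrm{CPI}_{t-1}) + \gamma_2 \log(\mathrm{CPI}_{t-2}) + \gamma_3 \log(\mathrm{M2}_{t-1}) + \gamma_4 \log(\mathrm{M2}_{t-2}) \\
& + \gamma_5 \log(\mathrm{CPII}_{t-1}) + \gamma_6 \log(\mathrm{CPII}_{t-2}) + \gamma_7 \log(\mathrm{NEER}_{t-1}) + \gamma_8 \log(\mathrm{NEER}_{t-2}) + \varepsilon_t \qquad (30) \\
\log(\mathrm{M2}_t) ={}& \alpha_3 + \lambda_1 \log(\mathrm{CPI}_{t-1}) + \lambda_2 \log(\mathrm{CPI}_{t-2}) + \lambda_3 \log(\mathrm{M2}_{t-1}) + \lambda_4 \log(\mathrm{M2}_{t-2}) \\
& + \lambda_5 \log(\mathrm{CPII}_{t-1}) + \lambda_6 \log(\mathrm{CPII}_{t-2}) + \lambda_7 \log(\mathrm{NEER}_{t-1}) + \lambda_8 \log(\mathrm{NEER}_{t-2}) + \varepsilon_t \qquad (31) \\
\log(\mathrm{NEER}_t) ={}& \alpha_4 + \delta_1 \log(\mathrm{CPI}_{t-1}) + \delta_2 \log(\mathrm{CPI}_{t-2}) + \delta_3 \log(\mathrm{M2}_{t-1}) + \delta_4 \log(\mathrm{M2}_{t-2}) \\
& + \delta_5 \log(\mathrm{CPII}_{t-1}) + \delta_6 \log(\mathrm{CPII}_{t-2}) + \delta_7 \log(\mathrm{NEER}_{t-1}) + \delta_8 \log(\mathrm{NEER}_{t-2}) + \varepsilon_t \qquad (32)
\end{aligned}
$$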
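For reference, the four equations can be stacked into the usual compact VAR(2) form. The vector and matrix symbols below are notation introduced here for illustration, not the paper's: the intercept vector a is 4×1, and A_1 and A_2 are 4×4 matrices holding the first-lag and second-lag coefficients of Eqs. (29) to (32).

```latex
% y_t stacks the four log series; a, A_1, A_2 and e_t are illustrative notation.
y_t = a + A_1\, y_{t-1} + A_2\, y_{t-2} + e_t,
\qquad
y_t = \begin{pmatrix}
  \log \mathrm{CPI}_t \\ \log \mathrm{CPII}_t \\ \log \mathrm{M2}_t \\ \log \mathrm{NEER}_t
\end{pmatrix}
```

Each row of the system contains one intercept and eight slope coefficients, which matches the eight lag terms written out in each of the four equations.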
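The estimates of the VAR model (Eqs. (29)–(32)) are obtained as follows:
$$
\begin{aligned}
\log(\mathrm{CPI}_t) ={}& 0.39^{*} + 0.82^{*}\log(\mathrm{CPI}_{t-1}) - 0.20^{*}\log(\mathrm{CPI}_{t-2}) + 0.14^{*}\log(\mathrm{M2}_{t-1}) - 0.09\log(\mathrm{M2}_{t-2}) \\
& + 0.65^{*}\log(\mathrm{CPII}_{t-1}) - 0.36^{*}\log(\mathrm{CPII}_{t-2}) - 0.21\log(\mathrm{NEER}_{t-1}) + 0.26^{**}\log(\mathrm{NEER}_{t-2}) \qquad (33) \\
\log(\mathrm{CPII}_t) ={}& 0.10 - 0.03\log(\mathrm{CPI}_{t-1}) + 0.01\log(\mathrm{CPI}_{t-2}) - 0.006\log(\mathrm{M2}_{t-1}) + 0.04\log(\mathrm{M2}_{t-2}) \\
& + 1.13^{*}\log(\mathrm{CPII}_{t-1}) - 0.18^{*}\log(\mathrm{CPII}_{t-2}) - 0.04\log(\mathrm{NEER}_{t-1}) + 0.05\log(\mathrm{NEER}_{t-2}) \qquad (34) \\
\log(\mathrm{M2}_t) ={}& 0.20 + 0.08\log(\mathrm{CPI}_{t-1}) + 0.09\log(\mathrm{CPI}_{t-2}) + 0.84\log(\mathrm{M2}_{t-1}) + 0.14\log(\mathrm{M2}_{t-2}) \\
& + 0.13\log(\mathrm{CPII}_{t-1}) - 0.25\log(\mathrm{CPII}_{t-2}) - 0.44\log(\mathrm{NEER}_{t-1}) + 0.49\log(\mathrm{NEER}_{t-2}) \qquad (35) \\
\log(\mathrm{NEER}_t) ={}& 0.05 - 0.05\log(\mathrm{CPI}_{t-1}) - 0.003\log(\mathrm{CPI}_{t-2}) + 0.04\log(\mathrm{M2}_{t-1}) - 0.04\log(\mathrm{M2}_{t-2}) \\
& - 0.06\log(\mathrm{CPII}_{t-1}) - 0.09\log(\mathrm{CPII}_{t-2}) + 1.18\log(\mathrm{NEER}_{t-1}) - 0.20\log(\mathrm{NEER}_{t-2}) \qquad (36)
\end{aligned}
$$
*: significant at the 5 percent level or lower; **: significant at the 10 percent level.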

Adjusted R2: 0.99 for CPI, CPII, M2 and 0.96 for NEER.
The VAR estimates on the level data also look robust, with high adjusted R² values and significant F-statistics.
However, the LM test indicates autocorrelation in the residuals (the null hypothesis of no autocorrelation is rejected
for up to two lags, with an LM statistic of 56.26 and a p-value of 0.000). Likewise, the correlogram spikes cross the benchmark line.
6.5. Cointegration test
The unit root tests show that all the series included in the money–price model are I(1). In this case the series might
be cointegrated and, if this is not addressed, the estimates may be spurious, as the test results for the
OLS and VAR models in the preceding sections have shown. Hence, we conduct the Johansen cointegration test on the
monthly series of log(CPI), log(CPII), log(M2) and log(NEER). The test results given by the EViews software are
presented in Table 5.
The software reports two types of test statistics: the trace statistic and the maximum eigenvalue statistic. The two
differ slightly in how they determine the rank of the matrix. The trace statistic tests the null hypothesis
of r cointegrating relations against the alternative of k cointegrating relations, where k is the number of endogenous
variables. The maximum eigenvalue statistic, on the other hand, tests the null hypothesis of r cointegrating relations
against the alternative of r + 1. In both methods, we proceed sequentially from r = 0 to r = k − 1 until we fail to
reject the null hypothesis. The two methods generally agree on the number of cointegrating relations. When they
conflict, the convention is to interpret the result in light of economic logic and the purpose of the study.
Following this process, the unrestricted cointegration rank tests based on the trace and maximum eigenvalue statistics both
indicate one cointegrating relationship. However, it is relatively weak, since we reject the null
hypothesis only at the 10 percent level of significance. Still, on the reasoning that Nepal's inflation may be cointegrated
with Indian inflation (as the graphical plots indicate), we can assume one cointegrating relation.
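The sequential procedure can be written down directly. The sketch below is an illustration only: the function name `cointegration_rank` is hypothetical, and the inputs are the statistics and 5 percent critical values reported in Table 5.

```python
def cointegration_rank(test_stats, critical_values):
    """Test r = 0, 1, ... in turn and stop at the first null that is not rejected.

    test_stats[r] and critical_values[r] belong to the null of at most r relations.
    """
    for r, (stat, crit) in enumerate(zip(test_stats, critical_values)):
        if stat <= crit:          # fail to reject: settle on r relations
            return r
    return len(test_stats)        # every null rejected: full rank

# Statistics and 5 percent critical values as reported in Table 5.
trace_stats = [67.036, 29.673, 10.541, 2.213]
trace_crit = [47.86, 29.80, 15.49, 3.84]
max_eig_stats = [37.36, 19.13, 8.33, 2.21]
max_eig_crit = [27.58, 21.13, 14.26, 3.84]

print(cointegration_rank(trace_stats, trace_crit))
print(cointegration_rank(max_eig_stats, max_eig_crit))
```

The trace statistic for "at most 1" (29.673) falls only just short of its critical value (29.80), so the count of relations is sensitive to the significance level chosen.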
Besides the number of cointegrating vectors, the Johansen cointegration test also jointly estimates the long-run and
short-run relationships of the variables in the model. The long-run estimates are called Beta relations,
while the short-run estimates are called Alpha relations. However, the beta coefficients are identified only when
restrictions are imposed in the VECM to normalize the relationship among the variables.
The results of the Johansen cointegration relations are presented in Table 6.
The Johansen cointegration test results indicate that all three variables have a significant positive impact on
inflation in the long run. The coefficient of Indian inflation is the largest, while those of money
supply and the exchange rate are similar in magnitude. The short-run statistics show a significant positive impact of
lagged inflation on inflation and a negative impact of money supply.
6.6. Error correction models (ECMs)
Based on the Johansen test result of one cointegrated relation, we estimate an error correction model (ECM) as
described in Section 4.5.
To identify the optimal number of lags, we can run an ordinary unrestricted VAR and check the lag-length criteria for
the series. In our case, the optimal lag length is three, as indicated by the FPE, AIC and HQ criteria (Table 7).
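The selection amounts to picking, for each criterion, the lag with the smallest value. The sketch below uses the Table 7 figures under one loudly stated assumption: AIC, SC and HQ are entered as negative numbers, on the reading that the minus signs were lost in typesetting (only then do the optima marked in Table 7 coincide with the minima). The names `criteria` and `best_lag` are hypothetical.

```python
# Criteria by lag, taken from Table 7. ASSUMPTION: AIC, SC and HQ are entered as
# negative numbers, on the reading that the minus signs were lost in typesetting.
criteria = {
    "FPE": {1: 2.96e-16, 2: 2.68e-16, 3: 2.28e-16, 4: 2.41e-16},
    "AIC": {1: -24.404, 2: -24.504, 3: -24.666, 4: -24.611},
    "SC":  {1: -24.099, 2: -23.894, 3: -23.751, 4: -23.391},
    "HQ":  {1: -24.280, 2: -24.256, 3: -24.295, 4: -24.116},
}

def best_lag(values_by_lag):
    """Return the lag with the smallest criterion value."""
    return min(values_by_lag, key=values_by_lag.get)

chosen = {name: best_lag(vals) for name, vals in criteria.items()}
print(chosen)
```

Under that assumption FPE, AIC and HQ select three lags while SC selects one, which agrees with the markers in Table 7.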
The cointegration and error correction equation of the LNCPI, LNCPII, LNM2 and LNNEER can be estimated as
given below. The VECM approach estimates the long run relationship (with cointegration equation) first and then the
short run relationships for each of the variables (error correction equations).
Cointegration Equation:
$$\varepsilon_t = \mathrm{LNCPI}_t - \phi - \beta_1\,\mathrm{LNCPII}_t - \beta_2\,\mathrm{LNM2}_t - \beta_3\,\mathrm{LNNEER}_t \qquad (37)$$
ECM for Nepal's CPI:
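$$
\begin{aligned}
\Delta \mathrm{LNCPI}_t ={}& \mu_{LNCPI} + \alpha_{LNCPI}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{1h}\,\Delta \mathrm{LNCPI}_{t-h} + \sum_{h=1}^{3} b_{1h}\,\Delta \mathrm{LNCPII}_{t-h} + \sum_{h=1}^{3} c_{1h}\,\Delta \mathrm{LNM2}_{t-h} \\
& + \sum_{h=1}^{3} d_{1h}\,\Delta \mathrm{LNNEER}_{t-h} + u_{LNCPI,t} \qquad (38)
\end{aligned}
$$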
ECM for Indian CPI:
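$$
\begin{aligned}
\Delta \mathrm{LNCPII}_t ={}& \mu_{LNCPII} + \alpha_{LNCPII}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{2h}\,\Delta \mathrm{LNCPI}_{t-h} + \sum_{h=1}^{3} b_{2h}\,\Delta \mathrm{LNCPII}_{t-h} + \sum_{h=1}^{3} c_{2h}\,\Delta \mathrm{LNM2}_{t-h} \\
& + \sum_{h=1}^{3} d_{2h}\,\Delta \mathrm{LNNEER}_{t-h} + u_{LNCPII,t} \qquad (39)
\end{aligned}
$$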
ECM for Nepal's Broad Money (M2):
Table 5
Johansen cointegration test results.
Unrestricted cointegration rank test (Trace)
Hypothesized no. of CE(s)   Eigenvalue   Trace statistic   0.05 critical value   Prob.
None*                       0.203        67.036            47.86                 0.00
At most 1                   0.109        29.673            29.80                 0.05
At most 2                   0.049        10.541            15.49                 0.24
At most 3                   0.013        2.213             3.84                  0.12
Unrestricted cointegration rank test (Maximum eigenvalue)
Hypothesized no. of CE(s)   Eigenvalue   Statistic         Critical value        Prob.
None*                       0.203        37.36             27.58                 0.002
At most 1                   0.109        19.13             21.13                 0.093
At most 2                   0.049        8.33              14.26                 0.346
At most 3                   0.013        2.21              3.84                  0.137
*: Rejection of the hypothesis at the 5 percent significance level.
Table 6
Johansen cointegration relations results.
Coefficient                   Estimate   Standard error
Long-run (Beta) relations
LNCPIIt                       0.72b      0.079
LNM2t                         0.13b      0.035
LNNEERt                       0.125b     0.056
Short-run (Alpha) relations
Δlog(CPII)t                   0.055      0.056
Δlog(NEER)t                   0.049      0.038
Δlog(M2)t                     0.16a      0.090
Δlog(CPI)t                    0.49b      0.065
a: Significant at the 10 percent level.
b: Significant at the 5 percent level or lower.
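$$
\begin{aligned}
\Delta \mathrm{LNM2}_t ={}& \mu_{LNM2} + \alpha_{LNM2}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{3h}\,\Delta \mathrm{LNCPI}_{t-h} + \sum_{h=1}^{3} b_{3h}\,\Delta \mathrm{LNCPII}_{t-h} + \sum_{h=1}^{3} c_{3h}\,\Delta \mathrm{LNM2}_{t-h} \\
& + \sum_{h=1}^{3} d_{3h}\,\Delta \mathrm{LNNEER}_{t-h} + u_{LNM2,t} \qquad (40)
\end{aligned}
$$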
ECM for Nominal Effective Exchange Rate (NEER):

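$$
\begin{aligned}
\Delta \mathrm{LNNEER}_t ={}& \mu_{LNNEER} + \alpha_{LNNEER}\,\varepsilon_{t-1} + \sum_{h=1}^{3} a_{4h}\,\Delta \mathrm{LNCPI}_{t-h} + \sum_{h=1}^{3} b_{4h}\,\Delta \mathrm{LNCPII}_{t-h} + \sum_{h=1}^{3} c_{4h}\,\Delta \mathrm{LNM2}_{t-h} \\
& + \sum_{h=1}^{3} d_{4h}\,\Delta \mathrm{LNNEER}_{t-h} + u_{LNNEER,t} \qquad (41)
\end{aligned}
$$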
The estimates of the coefficients of Eqs. (37)–(41) obtained through EViews 8 are given in Table 8 (see footnote j). The long-run
relationships indicate that a one percent change in Indian CPI is associated with a change of about 0.68 percent in Nepal's CPI, whereas broad
money supply (M2) and the nominal effective exchange rate (NEER) account for about 0.15 percent and 0.10 percent
respectively. All the coefficients of the cointegration equation are significant at the 10 percent level or lower, and those of LNCPII and LNM2 at the 5 percent level (Table 8).
The short-run adjustment coefficients of the ECM (the αs) indicate that M2 helps correct the disequilibrium of Nepal's
inflation, whereas the exchange rate and Indian inflation do not. The coefficient for LNM2 is 0.18 and significant at the 10
percent level, indicating some degree of central bank control over inflation in both the short and the long run.
The diagnostic test results show that the VECM estimates are robust. The residual plots of the regressors and the cointegration
equation show random zero-mean disturbances. Likewise, the inverse roots of the AR characteristic polynomial lie inside the unit circle.
The LM test does not reject the null hypothesis of no autocorrelation in the residuals up to three lags (Annex 4).
6.7. Autoregressive distributed lag (ARDL) model
As mentioned earlier, we estimated the determinants of Nepal's consumer price index (CPI) by including broad
money supply (M2), Indian CPI (CPII) and the nominal effective exchange rate (NEER) in the model. The Johansen test
indicated a weak cointegrating relation. On the other hand, graphical plots of CPI and CPII show a common trend,
which points to a cointegrating relation. It would therefore be unwise to take first differences and estimate models on them, since doing so
may ignore the long-run relationship. In this case, an ARDL model can capture both the long-run and short-run relations of the
cointegrated variables. Hence, the ARDL model discussed in Section 4.6 has been employed to revisit the money–price
relationship in Nepal. The model used, with the data in log (LN) form, is as follows:
$$\mathrm{LNCPI}_t = \alpha + \beta\,\mathrm{LNM2}_t + \chi\,\mathrm{LNNEER}_t + \delta\,\mathrm{LNCPII}_t + \varepsilon_t \qquad (42)$$
The error correction version of the above model is as follows:

j The order of the variables has been set in decreasing order of exogeneity: LNCPII, LNNEER, LNM2 and LNCPI.
Table 7
VAR lag order selection criteria.
Lag   LogL       LR       FPE         AIC       SC        HQ
1     1992.727   NA       2.96e-16    24.404    24.099a   24.280
2     2016.836   45.837   2.68e-16    24.504    23.894    24.256
3     2045.989   53.986   2.28e-16a   24.666a   23.751    24.295a
4     2057.528   20.799   2.41e-16    24.611    23.391    24.116
LR: Likelihood ratio; FPE: Final prediction error; AIC: Akaike information criterion; SC: Schwarz criterion; HQ: Hannan–Quinn criterion.
a: Optimal lag length.
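$$
\begin{aligned}
\Delta \mathrm{LNCPI}_t ={}& \varepsilon_0 + \sum_{i=1}^{p} \phi_i\,\Delta \mathrm{LNCPI}_{t-i} + \sum_{i=1}^{p} \varphi_i\,\Delta \mathrm{LNM2}_{t-i} + \sum_{i=1}^{p} \gamma_i\,\Delta \mathrm{LNNEER}_{t-i} + \sum_{i=1}^{p} \eta_i\,\Delta \mathrm{LNCPII}_{t-i} \\
& + \lambda_1\,\mathrm{LNCPI}_{t-1} + \lambda_2\,\mathrm{LNM2}_{t-1} + \lambda_3\,\mathrm{LNNEER}_{t-1} + \lambda_4\,\mathrm{LNCPII}_{t-1} + u_t \qquad (43)
\end{aligned}
$$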
The above models were estimated in Microfit. The ARDL (1,0,0,1) model was selected on the basis of the Akaike Information
Criterion (Table 9).
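In the standard ARDL error-correction representation, the long-run coefficients are recovered from the coefficients on the lagged level terms of Eq. (43) by setting all differences to zero and solving for LNCPI. The block below states that textbook result as a sketch; the θ symbols are introduced here for illustration and do not appear in the paper.

```latex
% theta_* are illustrative symbols; lambda_1..lambda_4 are the coefficients on the
% lagged levels of LNCPI, LNM2, LNNEER and LNCPII in Eq. (43), epsilon_0 the intercept.
\theta_{0} = -\frac{\varepsilon_0}{\lambda_1}, \qquad
\theta_{M2} = -\frac{\lambda_2}{\lambda_1}, \qquad
\theta_{NEER} = -\frac{\lambda_3}{\lambda_1}, \qquad
\theta_{CPII} = -\frac{\lambda_4}{\lambda_1}
```

A negative λ_1 is what makes the adjustment stable, so positive long-run effects correspond to positive λ_2, λ_3 and λ_4.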
The long-run ARDL estimates (Eqs. (42) and (43)) show that M2 and CPII are the determinants of inflation in
Nepal. According to the test results, a one percent change in money supply (M2) brings a change of about 0.27 percent
in Nepal's CPI, while a one percent change in Indian CPI leads to a change of about 0.43 percent in Nepal's CPI.
NEER, however, does not appear to affect inflation.
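Because the model is in logs, these coefficients are elasticities, and the "one percent leads to 0.27 percent" reading is a small-change approximation. The sketch below, with a hypothetical helper `pct_change_in_y`, shows the exact change implied by a constant-elasticity relation for the two reported long-run coefficients.

```python
import math

def pct_change_in_y(elasticity, pct_change_in_x):
    """Exact percentage change in y implied by a log-log (constant-elasticity) relation."""
    return (math.exp(elasticity * math.log(1 + pct_change_in_x / 100)) - 1) * 100

# Long-run ARDL elasticities reported in the text: 0.27 for M2 and 0.43 for CPII.
for name, elasticity in [("M2", 0.27), ("CPII", 0.43)]:
    small = pct_change_in_y(elasticity, 1)    # close to the elasticity itself
    large = pct_change_in_y(elasticity, 10)   # approximation is looser for big changes
    print(f"{name}: 1% -> {small:.3f}%, 10% -> {large:.3f}%")
```

For a one percent change the exact figure is practically the elasticity itself; for larger changes it falls slightly below ten times the elasticity.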
The diagnostic tests for the ARDL estimates indicate white-noise i.i.d. error terms with homoskedasticity and
normality: the null hypothesis of no residual serial correlation in the Lagrange multiplier test cannot be rejected, the
functional form is adequate, the error terms are normally distributed, and the null hypothesis of homoskedastic errors cannot
be rejected (Annex 5).
6.8. Granger causality tests
As described in Section 4.7, Granger causality tests reveal pairwise relationships, which may be one-way,
two-way or absent. The test serves as a complement to the models: it helps justify the inclusion of variables, validate the cointegrating relation
and establish the direction of each relationship. The summary results of the Granger
causality test are presented in Table 10.
In a nutshell, the Granger causality test confirms that all the variables included in the model (CPII, NEER and M2)
influence the CPI. These relationships are also theoretically valid, and no other problems such as endogeneity are
observed.
Table 8
Cointegration and EC estimates of Eqs. (37)–(41).
Coefficient        Estimate   t-stat   Equation no.
Long-run cointegration estimates
φ                  0.965      –        Eq. 37
β1 (LNCPIIt)       0.682a     8.20
β2 (LNM2t)         0.150a     3.997
β3 (LNNEERt)       0.097b     1.709
Short-run ECM estimates
αLNCPI             0.462a     5.968    Eq. 38
αLNCPII            0.106a     2.516    Eq. 39
αLNM2              0.18b      1.728    Eq. 40
αLNNEER            0.052      1.172    Eq. 41
a: Significant at the 5 percent level or lower.
b: Significant at the 10 percent level.
Table 9
ARDL test statistics.
Coefficient        Estimate   t-stat
Long-run estimates (Eq. (42))
α (Constant)       1.386**
β (LNM2)           0.268**    2.935
χ (LNNEER)         0.130      0.795
δ (LNCPII)         0.432**    2.106
Short-run estimates (Eq. (43))
ΔConstant          0.414*     1.80
ΔLNM2              0.80**     2.558
ΔLNNEER            0.039      4.178
ΔLNCPII            0.008      2.653
Adjusted R²        0.99
DW stat            2.15
F-stat, F(5, 162)  3316
*: Significant at the 10 percent level.
**: Significant at the 5 percent level.
Table 10
Granger causality tests.
Pair   Null hypothesis                      F-stat (p-value)
1      CPII does not Granger cause CPI      24.729 (0.000)
       CPI does not Granger cause CPII      0.8509 (0.428)
       Explanation: Only the first hypothesis is rejected. Indian inflation has a unidirectional relationship with Nepal's inflation.
2      NEER does not Granger cause CPI      3.658 (0.027)
       CPI does not Granger cause NEER      1.066 (0.346)
       Explanation: Only the first hypothesis is rejected. There is a unidirectional relationship of NEER with CPI.
3      M2 does not Granger cause CPI        6.089 (0.002)
       CPI does not Granger cause M2        1.957 (0.144)
       Explanation: Only the first hypothesis is rejected. There is a unidirectional relationship of M2 with CPI.
4      NEER does not Granger cause CPII     0.609 (0.545)
       CPII does not Granger cause NEER     0.964 (0.383)
       Explanation: Neither null hypothesis is rejected. There is no relationship between NEER and CPII.
5      M2 does not Granger cause CPII       1.1823 (0.309)
       CPII does not Granger cause M2       1.684 (0.188)
       Explanation: Neither hypothesis is rejected, so M2 and CPII can be regarded as independent of each other.
6      M2 does not Granger cause NEER       1.196 (0.304)
       NEER does not Granger cause M2       7.962 (0.000)
       Explanation: The second hypothesis is rejected. NEER affects M2, but M2 does not affect NEER.
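The reading of each pair in Table 10 follows a simple rule on the two p-values. A minimal sketch is given below; the function name `granger_direction` is hypothetical, and the 5 percent threshold is an assumption, since the table does not state the level used.

```python
def granger_direction(x, y, p_x_to_y, p_y_to_x, alpha=0.05):
    """Classify a pair from the p-values of the two 'does not Granger cause' nulls."""
    x_causes_y = p_x_to_y < alpha   # null "x does not Granger cause y" rejected
    y_causes_x = p_y_to_x < alpha   # null "y does not Granger cause x" rejected
    if x_causes_y and y_causes_x:
        return f"{x} <-> {y} (bidirectional)"
    if x_causes_y:
        return f"{x} -> {y} (unidirectional)"
    if y_causes_x:
        return f"{y} -> {x} (unidirectional)"
    return f"{x}, {y}: no Granger causality"

# p-values as reported in Table 10; alpha = 0.05 is an assumed threshold.
pairs = [
    ("CPII", "CPI", 0.000, 0.428),
    ("NEER", "CPI", 0.027, 0.346),
    ("M2", "CPI", 0.002, 0.144),
    ("NEER", "CPII", 0.545, 0.383),
    ("M2", "CPII", 0.309, 0.188),
    ("M2", "NEER", 0.304, 0.000),
]
results = [granger_direction(*pair) for pair in pairs]
for line in results:
    print(line)
```

Granger causality is a statement about predictive content, so an arrow here means that lagged values of one series help forecast the other.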
6.9. Summary results of different models
Nepal's money–price relationship has been modeled following the methodological framework described in Section
4 of this paper. Different models yield different coefficients for the relationship, as shown in Table 11.
Table 11
Summary results of estimated models.
Variables incorporated: CPI, CPII, M2, NEER (monthly, Jan 2000–Apr 2014)
Model: OLS
  Estimates: M2 = 0.106; Adjusted R² = 0.99; DW = 0.686
  Remarks: Estimates are significant but the DW stat is lower than the R² value. It shows that the model is spurious.
Model: VAR
  Estimates: M2 = 0.14; Adjusted R² = 0.99; LM stat: 58; p-value: 0.00
  Remarks: Estimates are significant but we reject the null hypothesis of no autocorrelation in residuals.
Model: Johansen cointegration
  Estimates: Long run: M2 = 0.13; Short run: M2 = 0.16
  Remarks: Shows one cointegration equation but relatively weak (we reject the null at 10%).
Model: ECM
  Estimates: Long run: M2 = 0.15; Short run: M2 = 0.18; LM stat: 13.06; p-value: 0.667
  Remarks: Estimates are significant and robust.
Model: ARDL
  Estimates: Long run: M2 = 0.27; Short run: ΔM2 = 0.80; Adjusted R² = 0.99; DW = 2.15
  Remarks: Estimates are significant and robust.
Model: Granger causality
  Remarks: A unidirectional relationship: M2 affects CPI but CPI does not affect M2.
As discussed above, the various methods report different coefficients for the impact of money on consumer price
inflation in Nepal. The OLS results suggest that a one percent change in M2 leads to a 0.11 percent change in CPI.
According to the VAR results, a one percent change in M2 brings a change of 0.14 percent in CPI. However, the model fitness
indicators show that these results are spurious, mainly because of the non-stationarity of the variables included in
the models. The Johansen cointegration test gives a long-run coefficient of 0.13 for M2, while the coefficient estimated
by the VECM is 0.15. According to the ARDL results, the coefficient of M2 is 0.27, indicating that a one
percent change in M2 leads to a 0.27 percent change in CPI.
7. Summary and conclusion
The distinctive features of time series data make method selection difficult when analyzing the
relationships among economic variables. Autoregressiveness, stationarity, trends, cycles, seasonality and structural
breaks are the most common properties of time series. These properties must be properly accommodated or addressed to
make the models robust. In particular, researchers must be alert to spurious relationships among variables. This paper
suggests a general framework for time series analysis that can help in avoiding spurious regression and obtaining
robust results.
The unit root test is the starting point for time series analysis. Methods and models should be selected
on the basis of its results. It is suggested that OLS, VAR or other similar models be used if all the
variables are stationary. However, these models may yield spurious relationships if all or some of the variables are
non-stationary. Diagnostic testing is an essential part of time series analysis for identifying spuriousness and assessing robustness.
The Johansen cointegration test is employed when all the variables included in the model are non-stationary. In
the case of mixed variables, i.e., some stationary and others non-stationary, the Johansen cointegration method
cannot be used. In such a case, ARDL models are appropriate. ARDL models can also be employed when all the variables
are non-stationary.
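The selection rule summarized in the last two paragraphs can be condensed into a small decision function. This is a sketch under stated assumptions: the name `choose_method` is hypothetical, and only I(0) and I(1) series are covered, as in the paper's framework.

```python
def choose_method(integration_orders):
    """Suggest a model family from each variable's order of integration (0 or 1)."""
    orders = set(integration_orders.values())
    if not orders or not orders <= {0, 1}:
        raise ValueError("this sketch covers only non-empty sets of I(0) and I(1) series")
    if orders == {0}:
        return "OLS or VAR in levels"
    if orders == {1}:
        return "Johansen cointegration test (VECM if cointegrated) or ARDL"
    return "ARDL"   # mixture of I(0) and I(1)

# The four series of the money-price model are treated as I(1) in Section 6.5.
print(choose_method({"LNCPI": 1, "LNCPII": 1, "LNM2": 1, "LNNEER": 1}))
# Hypothetical mixed case, for illustration only.
print(choose_method({"y": 1, "x": 0}))
```

The function only encodes the first branching step; diagnostic tests on the fitted model remain necessary whichever branch is taken.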
Nepal's money–price relationship is analyzed following the methodological framework suggested in this paper. The
framework greatly helps in choosing appropriate test methods for data analysis. Analysis of the money–price relation
with the ARDL model shows that, in the long run, a one percent change in money supply changes consumer prices by about 0.27 percent.
Based on the model fitness statistics, we can argue that this estimate is robust and reliable compared with the estimates
given by the other methods.
Appendix A. Supplementary data
Supplementary data related to this article can be found at https://doi.org/10.1016/j.jfds.2017.11.001
References
1. Enders Walter. Applied Econometric Time Series. 4th ed. USA: John Wiley & Sons; 2014.
2. Stigler Stephen M. Gauss and the invention of least squares. Ann Stat. 1981;9(3):465–474.
3. Verbeek Marno. A Guide to Modern Econometrics. 5th ed. Australia: John Wiley & Sons Ltd; 2017.
4. Maddala GS, Kim IM. Unit Roots, Cointegration, and Structural Change. Cambridge: Cambridge University Press; 2003.
5. Perron Pierre. The great crash, the oil price shock, and the unit root hypothesis. Econometrica. 1989;57(6):1361–1401.
6. Perron Pierre, Vogelsang Timothy J. Nonstationarity and level shifts with an application to purchasing power parity. J Bus Econ Stat. 1992;10(3):301–320.
7. Perron Pierre. Further evidence on breaking trend functions in macroeconomic variables. J Econom. 1997;80:355–385.
8. Lumsdaine R, Papell DH. Multiple trend breaks and the unit root hypothesis. Rev Econ Stat. 1997;79:212–218.
9. Bai Jushan, Perron Pierre. Computation and analysis of multiple structural change models. J Appl Econom. 2003;18:1–22.
10. Shrestha Min B, Chowdhury Khorshed. Sequential Procedure for Testing Unit Roots in the Presence of Structural Break in Time Series Data. Economics Working Papers. NSW, Australia: School of Economics, University of Wollongong; 2005.
11. Gujarati Damodar N. Basic Econometrics. New York: McGraw-Hill; 1995.
12. Sims C, Goldfeld S, Sachs J. Policy analysis with econometric models. Brookings Pap Econ Activ. 1982;1982(1):107–164.
13. Engle Robert F, Granger CWJ. Co-integration and error correction: representation, estimation, and testing. Econometrica. 1987;55(2):251–276.
14. Johansen S. Statistical analysis of cointegration vectors. J Econ Dynam Contr. 1988;12(2–3):231–254.
15. Johansen S, Juselius K. Maximum likelihood estimation and inference on cointegration, with applications to the demand for money. Oxf Bull Econ Stat. 1990;52:169–210.
16. Pesaran M Hashem, Pesaran Bahram. Working with Microfit 4.0: Interactive Econometric Analysis. Oxford: Oxford University Press; 1997.
17. Pesaran M Hashem, Shin Yongcheol. An autoregressive distributed lag modelling approach to cointegration analysis. In: Strom S, Holly A, Diamond P, eds. Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial Symposium. Cambridge: Cambridge University Press; 1999.
18. Granger CWJ. Investigating causal relations by econometric models and cross-spectral methods. Econometrica. 1969;37(3):424–438.
19. Fisher Irving. The Purchasing Power of Money: Its Determination and Relation to Credit, Interest, and Crises. New York: The Macmillan Co; 1922.
UNITED NATIONS
AFRICAN INSTITUTE FOR ECONOMIC DEVELOPMENT AND PLANNING
(IDEP)
ANALYSIS OF THE DETERMINANTS OF ECONOMIC GROWTH
IN A DEVELOPING COUNTRY:
The Case of Senegal
Submitted in partial fulfilment of the requirements for the Diplôme d'Etudes Approfondies (DEA) in Economic
Policy and Economic Management of the African Institute for Economic Development
and Planning (IDEP)
By
Clément YELOU
Read and approved by:
Principal Supervisor: Prof. Aly Ahmadou MBAYE
Jury Members: Prof. Fodiyé B. DOUCOURE
Prof. Mourad LABIDI
External Examiner:
Prof. Birahim Bouna NIANG
Prof. Amath NDIAYE
Director of IDEP: Dr. Samuel OCHOLA
Date: 23 May 2000
Dakar, May 2000
"HOW EVERYTHING IS POSSIBLE"
In the beginning God created the heavens and the earth. The earth was formless and empty. ... God said: Let
there be light; and there was light.
... Then God said: Let the earth bring forth vegetation, plants bearing seed, and fruit
trees bearing fruit after their kind, with their seed in them, on the earth. And it was
so.
... Today, He makes me lie down in green pastures, He leads me beside still waters....
He also answers whoever comes to Him, for He says: "Let the one who is thirsty come; let the one who
wishes take the water of life freely."

ACKNOWLEDGEMENTS
I wish to express my gratitude to the administration and all the staff of IDEP
for the good climate of collaboration that prevailed throughout this training. In particular, may
Professor P. K. QUARCOO accept my congratulations on the good organisation of the
courses. May all the professors who taught in the programme
find here the fruit of their rich teaching; we hope to put it to
good use in our professional careers. In addition, the secretary of the Training
Division, Mrs NIASSE, always made great efforts to pass on
information. Also, without the goodwill of the interpreters the programme could not have
run properly; I thank them and ask them to keep showing
the same good spirit. Thanks to the various maintenance services of the
computer centre staff, the computing facilities always met our needs
during the training; may Mrs Aby KAMARA and Mrs Hawa TRAORE receive here my
thanks and my encouragement for their love of work well done.
The initial ideas for this work were enriched in many ways throughout
the research by my research supervisor, Professor Aly Ahmadou MBAYE; I
express my full gratitude to him and thank him for his encouragement and for the example of
rigour and precision he set for me. May Professor Bouna Birahim NIANG receive
here my thanks for the example of precision and humility that characterises
working with him. The coordination and the progress schedule of this work were
well handled by Professor M. LABIDI, head of the Training Division of IDEP. May
all the staff of the IDEP library, as well as those of the
documentation centre of the World Bank resident mission in Dakar, be assured of
my gratitude for their various documentation and information services. The
director of the Enterprise Division of the Direction de la Prévision et de la Statistique
(DPS) of Senegal, Mr SAMBA BA, provided me with the data needed for the analyses;
I thank him for this and for his constant support.
May my parents find here a share of the fruit of the upbringing they gave me from
childhood; it was of a kind that encouraged me to make an effort. To my brother Emmanuel GOLOU and
his family, I offer my thanks; they surround me with great
attention and much care. May the FANTODJI and DANSOU families find in this
work the result of their ever-renewed affection for me. May my sister Fatima
Myriam VICENS be assured of my gratitude for her attention, her encouragement and
her support. To Mrs Soukheynatou KABA, who has always shown concern for
my social situation, I owe much advice and support; may she be
satisfied with this work. My brother Thimothé AMOUSSOU has always helped me in
many ways; may he find in this work a fruit of all those efforts. I also owe
much to my sisters Ablavi GOZA and SENAVOR DJIGBODI for the various kinds of
help they gave me in my many tasks during
this work.
To my friends, who have been concerned with the progress of this work ever since they learned that
I had to do it, I ask that they take all possible pride in it: Blandine, Angélique,
Stéphanie, Antoine, Aby, Olivier, Marius, Wilfrid, Kourouma, Alexis, Sabine, Lydienne,
Hervé, Fernand, Essowaza, Reine, Biaka, Calixte, David, Félicité, Stéphane, Ngaradoum,
Allasra, Symphorien and the others whose names are not written here.
May my friends Narcisse KOUTON, Appolinaire HOUENOU and Damien Fousséni CHABI-YO
take great pleasure in this work; I know well that they always think of me.
May my fellow trainees in the IDEP DEA programme be assured of my
gratitude for their collaboration and their many forms of support.
Above all, this training would not have been possible without the constant support of God.
I believe it is His will and I give Him thanks; He removed every obstacle, and His faithfulness
is for me a good source of hope and courage.
SUMMARY
Since independence, despite the efforts made to diversify
economic activities and the various economic policies applied, the Senegalese
economy has not experienced regular and sustained growth. This absence of a genuine
economic take-off suggests that the factors relevant to economic growth
have not been well mastered in the economy. This study therefore sought to
answer the question: "What factors explain the variations in per capita GDP
in the Senegalese economy?" The method of analysis consists in successively estimating
a growth model with Solow residuals, a human
capital model, and then a model with economic policy variables as
control variables. This approach aims to endogenise the Solow residual, as suggested by
endogenous growth models. We thus assume that economic policy
variables affect the growth rate through their effect on the
Solow residual. The data cover the period 1971-1997, which takes in
all the main macroeconomic and sectoral reforms adopted
in the economy.
The results suggest that neither human capital nor physical capital has been well
exploited in the economy, for lack of an adequate working environment and motivation.
In fact, macroeconomic policies and the production framework did not allow
full exploitation of resources. Moreover, government consumption
expenditure policies induced attitudes favourable to growth. In addition,
climatic changes leading to severe drought have negative repercussions
on the economy. Finally, trade liberalisation policies prove to be
measures that encourage both local production firms and
exporting firms to seek greater competitiveness through higher
productivity and better product quality; they thereby favour
growth.
These analyses suggest that government policy should define a framework
within which firms can operate at full capacity and at the highest
productivity. The various actors in the economy should mobilise to make the best use of
the production opportunities offered by the available resources and by
the national or international socio-economic environment.
ABSTRACT
Since independence, despite the efforts made to diversify economic
activities and the various economic policies applied, the Senegalese economy has not known constant
and regular growth. This absence of a real economic take-off suggests that the factors pertinent
to economic growth have not been mastered in the economy. This study therefore
tries to answer the question: "What are the explanatory factors of the variations in per capita GDP
in the Senegalese economy?" The method of analysis consists in successively estimating
a growth model with the Solow residual, a model with human capital, and a model
with economic policy variables treated as control variables. This procedure aims at
endogenizing the Solow residual, as suggested by endogenous growth models. We thus
suppose that economic policy variables influence the growth rate through their impact
on the Solow residual. The data relate to the period 1971-1997, which includes
all the main sectoral and macroeconomic reforms adopted in the economy.
The results suggest that neither human capital nor physical capital has been well
exploited, because of the lack of an adequate working environment and motivation. In fact,
macroeconomic policies and the production framework did not allow full exploitation of
resources. Besides, public consumption expenditure policy induced attitudes favourable to
growth. Furthermore, climatic changes causing severe drought have negative effects on the
economy. Finally, trade liberalization policies are measures that incite both
local production enterprises and exporting enterprises to look for better
competitiveness through a rise in productivity and an improvement in product
quality; they thus promote growth.
These analyses suggest that government policy should define a framework in which enterprises
can work fully for better productivity. The different actors of the economy should do
their utmost to better exploit the opportunities for production.
CONTENTS
ACKNOWLEDGEMENTS
SUMMARY
GENERAL INTRODUCTION
1. Introduction
2. Central Problem of the Study
3. Justification of the Study
4. Objectives of the Study
5. Research Hypotheses
6. Organization of the Study
CHAPTER ONE: RECENT DEVELOPMENT OF THE SENEGALESE ECONOMY
1.1. GENERAL CHARACTERISTICS OF THE SENEGALESE ECONOMY
1.1.1. Geographical location and economic context of Senegal
1.1.2. Structure of the Senegalese economy
1.2. MACROECONOMIC AND SECTORAL REFORM POLICIES
1.2.1. Stabilization and structural adjustment programmes (1980-1993)
1.2.2. External adjustment policy: the devaluation of the CFA franc (1994)
1.3. PROSPECTS FOR ECONOMIC GROWTH
CHAPTER TWO: FACTORS OF ECONOMIC GROWTH: A REVIEW OF THE LITERATURE
2.1. EVOLUTION OF GROWTH THEORIES
2.2. CONCEPT AND ROLE OF HUMAN CAPITAL IN ECONOMIC GROWTH
2.3. TRADE STRATEGIES AND GROWTH
2.4. MACROECONOMIC ENVIRONMENT AND GROWTH
CHAPTER THREE: EMPIRICAL STUDY OF GROWTH FACTORS: METHODOLOGY AND RESULTS
3.1. METHODOLOGY OF ANALYSIS
3.1.1. Specification of the growth analysis model
3.1.2. Estimation technique for the model
3.1.3. Data sources
3.2. RESULTS
3.2.1. Estimation of the growth models
3.2.2. Analysis and interpretation of the results
3.3. ECONOMIC POLICY IMPLICATIONS
3.3.1. Macroeconomic policy
3.3.2. Technological development
3.3.3. Enhancing human capital
3.3.4. Trade policy
3.3.5. Environmental policy
CONCLUSION AND RECOMMENDATIONS
CONCLUSION
RECOMMENDATIONS
1) A sound macroeconomic policy
2) Reform of public intervention centred on good governance
3) Public expenditure reforms
4) Strengthening human capital
5) Rapid expansion of exports
6) Information, Education and Communication on the environment
7) Making the reforms irreversible
BIBLIOGRAPHY
APPENDICES
GENERAL INTRODUCTION
1. INTRODUCTION
In sub-Saharan Africa, several measures were adopted after independence to secure rapid economic growth capable of leading each country towards economic autonomy. To this end, industrialization strategies based on import substitution were adopted in most countries. However, the high and permanent protection of local firms that this strategy requires generated high production costs, stagnating productivity and a lack of competitiveness. Moreover, since the quality of the technology and labour used by producing firms did not match that of developed countries, local products could not meet competitive quality standards. Despite their weak quality competitiveness, local products are often sold at higher prices than comparable imported products, which does not encourage consumers to buy them. This growth strategy therefore failed to revive the economies.
In the agricultural sector, the natural pillar of the economy in Africa, the annual growth rate of production averaged only 2% between 1965 and 1980, which is below the average population growth rate of 2.8% (VAN DER HOEVEN and VAN DER KRAAIJ, 1994). The other sectors of sub-Saharan African economies did not perform better, so that since the mid-1970s the annual growth rate has been lower than in the period 1965-1973 (BANQUE MONDIALE, 1993). Indeed, whereas this rate was 4.7% in the region over 1965-1973, it fell to 3.2% over 1974-1980, then to 1.2% between 1981 and 1985, before recovering to 2.5% between 1986 and 1990 (BANQUE MONDIALE, 1993).
This unfavourable economic situation at the end of the 1970s led most sub-Saharan African states to implement stabilization and structural adjustment policies from 1980 onwards, with the support of the International Monetary Fund (IMF) and the World Bank. An assessment of these policies shows that the macroeconomic results obtained while they were applied are mixed. In most countries, although financial stabilization was achieved, the economy merely stagnated and per capita incomes rose only very slightly. This weak economic performance seriously affected living standards: the average growth rate of per capita income in sub-Saharan Africa was negative between 1981 (when it was -1.8%) and 1992 (when it was -0.4%). Poverty worsened as a result, and living standards declined year after year (PNUD, 1998). These policies thus brought about a sharp deterioration in the social conditions of the population.
Furthermore, external adjustment policies were applied in some countries, such as those of the franc zone. The devaluation of the CFA franc adopted in 1994 in the African countries of the franc zone was expected to restore the competitiveness of their economies. Although this measure reduced consumers' purchasing power, it produced encouraging results at the macroeconomic level.
In Senegal, these various economic policy measures adopted at the level of sub-Saharan Africa or the franc zone were implemented with certain specific features, such as accompanying measures. However, they have not brought about a notable revival of economic growth since independence. Only after the devaluation of the CFA franc were fairly encouraging, though still low, economic growth rates of around 5% recorded.
Under these conditions, the authorities should take action to enable economic actors to accelerate economic growth in Senegal. Indeed, even though the concept of growth has evolved considerably and now takes into account concern for individual and collective well-being, it is clear that growth in aggregate income is a necessary precondition for qualitative growth.
This study looks for the factors that condition the growth of aggregate income in the Senegalese economy. The approach adopted emphasizes the factors that favour improvement in Total Factor Productivity: human capital, economic openness and the quality of macroeconomic policy.
2. CENTRAL PROBLEM OF THE STUDY
The search for the causes of differences between the economic results achieved by various countries over the same period, or by the same country in two different periods, is a major concern of economic theory. To this end, growth theories model production activity by defining a production function. The production function is defined from the factors used in the production process, the way they are combined and certain theoretical properties. The factors of production are generally reduced to labour and physical capital. According to the early growth models (the Solow (1956) model and the Harrod-Domar (1959) model), these factors of production are determined outside the economic sphere, so that growth cannot be self-sustaining. According to these early models, measures that stimulate saving and investment accelerate output growth only temporarily, since each new addition to the capital stock is assumed to yield a diminishing increase in output. Moreover, empirical results show that a large share of actual economic growth is not captured by these two factors. This unexplained share is Total Factor Productivity (TFP), referred to as the "Solow residual"; it is assumed to reflect the efficiency of production activity. Differences in TFP would explain a large part of the differences in results obtained, either over time or across space, when the same quantities of labour and physical capital are used (Chenery, 1991). Thus, at a similar level of development, countries that raise their TFP more record higher growth rates.
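The Solow residual described above can be illustrated with a standard growth-accounting calculation. The sketch below assumes a Cobb-Douglas production function with constant returns to scale, so that TFP growth equals output growth minus the share-weighted growth of capital and labour. The function name `solow_residual`, the capital share of 0.35 and the growth rates are hypothetical illustrations, not figures or specifications taken from this study.

```python
def solow_residual(g_output, g_capital, g_labor, capital_share):
    """Return TFP growth as the part of output growth not explained by inputs.

    Assumes Y = A * K**alpha * L**(1 - alpha), which gives
    g_A = g_Y - alpha * g_K - (1 - alpha) * g_L.
    """
    if not 0.0 <= capital_share <= 1.0:
        raise ValueError("capital_share must lie between 0 and 1")
    return g_output - capital_share * g_capital - (1.0 - capital_share) * g_labor


# Hypothetical annual growth rates (illustrative only, not the study's data).
g_output = 0.03
tfp_growth = solow_residual(g_output, g_capital=0.04, g_labor=0.02,
                            capital_share=0.35)
tfp_contribution = tfp_growth / g_output  # share of output growth due to TFP

print(f"TFP growth: {tfp_growth:.4f}")
print(f"TFP share of output growth: {tfp_contribution:.1%}")
```

With these illustrative inputs, most of output growth is attributed to capital and labour and only the remainder to TFP; a different capital share would change the split, which is why the share is exposed as a parameter.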
These shortcomings of two-factor models prompted the search for other factors capable of capturing a good part of the Solow residual. From the mid-1980s, endogenous growth theories accordingly examine the main factors that explain growth dynamics and their self-sustaining character. These new theories start from the principle either that additional investment does not yield diminishing marginal returns, or that part of the additional output is used for economic activities. They also suggest that policy reforms, for example trade liberalization, macroeconomic stabilization and the elimination of distortions, can promote economic growth by increasing incentives to save and invest on the one hand, and the profitability of investment on the other. In doing so, they break with traditional conceptions of the macroeconomic production function; growth is now seen as a long-term process. Long-term growth can arise from increasing returns to capital and from externalities such as technological innovation, knowledge or public goods (AMABLE and GUELLEC, 1992). The explanation of economic growth should therefore no longer be limited to analysing variations in the traditional factors of production alone, but should take into account other aspects such as the level of human capital, the nature of public expenditure, the existence of increasing returns, the importance of internal learning processes and endogenous technical progress.
In Senegal, the policies adopted during the 1960s and 1970s were characterized by public intervention in goods and factor markets, a lack of budgetary discipline, and trade and industrial protectionism. These policies resulted in low levels of saving and investment which, in the face of rapid population growth, led in turn to a stagnation of per capita GDP (Banque Mondiale, 1997). The adjustment policies initiated in the early 1980s partially succeeded in restoring macroeconomic balances. But in the early 1990s, economic conditions deteriorated again following a sharp decline in the terms of trade and repeated droughts. The devaluation of the CFA franc in 1994 and the structural reforms intended to improve market flexibility, develop the private sector, liberalize the economy and reduce the size of the public sector had broadly positive results. Thus, over 1994-1997, annual growth rates of around 5% were recorded, which is higher than in earlier periods. However, these growth results are still very low given the current level of the economy.
Moreover, in the Senegalese economy, Total Factor Productivity (TFP) has not evolved steadily since 1960. Berthélémy et al. (1996) show that TFP grew regularly and positively only over 1960-1966. Between 1967 and 1990, TFP fluctuated considerably, even recording negative annual growth rates. These authors find that, despite this weak trend, TFP contributed nearly 22% of GDP growth between 1960 and 1990. It is understandable that, in Senegal, this non-negligible contribution of TFP to growth, together with its strong fluctuations, may have produced the weak growth results recorded since independence in 1960. Since then, internal and external reform and adjustment policies have been implemented and the international economic environment has changed considerably. Efforts have also been made to increase human skills and abilities. In this regard, health and security services have improved in quality and extended their geographical coverage. Enrolment rates at the various levels of education have risen considerably compared with 1960 (68% in 1996 against 41% in primary education; World Development Indicators CD-Rom, World Bank, 1999) and fields of training have diversified. These various economic policies and measures to strengthen human capital are likely to have influenced TFP growth. It is therefore important to investigate their actual effects on variations in per capita GDP. This study looks for the factors that best explain variations in the per capita GDP of the Senegalese economy over time. To that end, it attempts to answer the research question:
What are the explanatory factors of economic growth in Senegal?
3. JUSTIFICATION OF THE STUDY
Today, with poverty worsening and spreading in most developing countries, particularly in sub-Saharan Africa, it is necessary to seek to raise per capita income. Admittedly, this indicator is based on an average, which leaves major problems unaddressed if the distribution of national wealth is highly unequal. Nevertheless, growth in national income over a long period is a necessary preliminary to poverty reduction. Macroeconomic policies must therefore continue to focus on the objective of sustainable growth. The variables on which to act in order to accelerate growth are identified by growth theories and models. However, these models do not take into account growth factors that may be linked to the economic environment, both internal and external, in which production takes place. Yet this type of factor is important: labour and physical capital can operate only in a climate favourable to their full exploitation and renewal. The interest of this study is to show that the two traditional factors of production are important in explaining variations in growth, but that they remain insufficient. We identify factors in the economic environment that condition the efficiency of the productive system and thus constitute indirect factors of growth. Such factors are therefore endogenous to the economic sphere. We will then have new strategies for guiding macroeconomic policies with a view to achieving positive and regular growth rates in the coming years.
4. OBJECTIVES OF THE STUDY
The objective of the study is to identify the explanatory factors of economic growth in Senegal and to formulate economic policy measures favourable to continuous growth of GDP per worker. Specifically, it aims to determine:
- the nature of the factors of production that predominate in the growth of aggregate output per worker;
- the relevance of economic policy and behaviour variables to the objective of economic growth.
5. RESEARCH HYPOTHESES
On the basis of the results of various empirical studies analysing growth in several types of countries (in terms of level of development), we formulate the following hypotheses, which the study will seek to test.
H1: The accumulation of physical capital and human capital explains only a fairly small share of the variations in the growth rate of per capita gross domestic product (GDP) of the Senegalese economy.
H2: The quality of macroeconomic policy, export performance and public consumption are the main factors explaining variations in the growth rate of per capita GDP in Senegal. These factors are expected to be positively related to GDP per worker.
These hypotheses thus emphasize endogenous factors which, apart from human capital, are control variables or structural variables.
6. ORGANIZATION OF THE STUDY
The study is presented in three chapters. The first is devoted to the Senegalese economy: its history and development, the structure of its production, and the main policies and reforms implemented since 1960. The second chapter examines growth theories in the economic literature, highlighting in particular the links between economic growth and human capital, the trade regime and macroeconomic stability. In the third chapter, we define a methodological framework and then carry out the empirical analysis. From these results, we draw the economic policy implications and formulate measures to promote economic growth in Senegal.
CHAPTER ONE:
RECENT DEVELOPMENT OF THE SENEGALESE ECONOMY
1.1. GENERAL CHARACTERISTICS OF THE SENEGALESE ECONOMY
1.1.1. Geographical location and economic context of Senegal
With its far-western position in Africa, Senegal covers an area of 196,722 km² and surrounds an enclave, the Gambia, of 10,300 km². The country's relief is for the most part flat and does not rise above 130 metres; only the south-eastern region is somewhat hilly. The climate is subject to geographical influences and differs markedly between the coastal zone and the interior. In addition, atmospheric circulation, facilitated by the absence of mountain barriers, exposes the country to the maritime trade wind, the harmattan and the monsoon. There are generally two seasons, whose length varies from one region to another; rainfall decreases progressively in duration and intensity from south to north: on average, 1,500 mm per year in the Ziguinchor region (south), 800 mm in the Kaolack region (central zone) and 330 mm at Podor (north). Apart from its two rivers (the Senegal and the Gambia), Senegal has substantial groundwater resources, which make village water-supply programmes possible. Climate and soil type determine several kinds of vegetation: in the north, sparse bush dominated by thorny species; in the Sudanian zones, wooded savannah rich in fauna; and in the sub-Guinean zone, confined to Basse-Casamance, dense forest.
All these physical characteristics have strongly limited the range of possible activities in the primary sector, particularly agriculture.
As regards the historical formation of the Senegalese economy, when Senegal gained independence in 1960 it inherited a fairly well-developed physical and social infrastructure, owing to Dakar's preponderant place in relations between the metropole and its West African colonies. For reasons of profitability and of transport difficulties linked to the world conflicts, French private investment in Africa developed through the creation of light import-substituting industries. The spread of this type of industry intensified notably during the great crisis of 1929, and then during and after the Second World War. Senegal was thus favoured by the colonial authorities in the siting of industrial and administrative infrastructure in Africa. As a result, Senegal played a leading role in colonial affairs in Africa and benefited from several facilities for obtaining substantial external resources. It was in the years following independence that the national economy was oriented, structured and planned with a view to harmonious economic and social development.
During the two decades following independence (1960-1980), Senegal's economic situation was on the whole unsatisfactory, even compared with other sub-Saharan African countries. GDP grew by an average of 2.1% per year while the population grew by 2.8%, which led to a fall in real per capita income. Indeed, of all the African countries spared by war, Senegal recorded the lowest growth rate over this period. Four sub-periods, characterized by different growth trends, can be distinguished:
- Until 1966, the year in which Senegal lost the preferential treatment granted by France to its agricultural exports, the national economy was relatively soundly managed and growth reached about 3.5% per year, above the population growth rate.
- Between 1967 and 1974, the year in which the world oil price quadrupled, Senegal's GDP grew by only 1.3% per year and groundnut production fell by nearly half. During this period, the country pursued an active policy of nationalization and industrial import substitution.
- During the third period, from 1974 to 1978, average GDP growth was roughly equal to population growth, largely owing to good weather conditions and a very favourable trend in the terms of trade resulting from higher world prices for phosphate and groundnuts.
- During the fourth period, from 1978 to 1980, Senegal suffered two major droughts and a sharp fall in world groundnut prices, which led to an average GDP growth rate of 0.8% per year.
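The 1960-1980 averages quoted above (GDP growth of 2.1% per year against population growth of 2.8%) can be turned into an implied per capita figure with a short calculation. The sketch below is only an illustration: the function name `per_capita_growth` is hypothetical, and compounding the two averages at a constant rate over twenty years is a simplifying assumption rather than a result reported in the study.

```python
def per_capita_growth(gdp_growth, population_growth):
    """Exact growth rate of GDP per head implied by GDP and population growth."""
    return (1.0 + gdp_growth) / (1.0 + population_growth) - 1.0


# Average annual rates for 1960-1980 as quoted in the text.
annual = per_capita_growth(0.021, 0.028)

# Assumption: both averages hold at a constant rate in each of the 20 years.
cumulative = (1.0 + annual) ** 20 - 1.0

print(f"annual per capita growth: {annual:.2%}")
print(f"cumulative change over 20 years: {cumulative:.1%}")
```

The exact ratio is slightly smaller in absolute value than the simple difference of the two rates (2.1% minus 2.8%), which is the usual approximation for small growth rates.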
At the end of this period, the main economic indicators all showed serious financial and structural imbalances. The budget deficit and the external current account deficit reached 12.5% and 25.8% of GDP respectively. The saving rate was negative and total consumption exceeded GDP. Moreover, between 1975 and 1980, inflation accelerated to a rate of 12% while the terms of trade likewise fell by 12%. This situation points to the need for structural economic reforms to revive growth.
Before presenting the main macroeconomic management and reform policies implemented since 1980, we first give an overview of the structure of national production.
1.1.2. Structure of the Senegalese economy
The Senegalese economy operates through three main sectors: primary, secondary and tertiary; some terminologies now distinguish the informal sector from the tertiary sector.
The country's labour force is unevenly distributed among these sectors; likewise, the contribution to national output and to its growth differs between sectors. The overall change in the structure of the labour force across these three sectors is given in the following table.
Sectoral structure of the employed labour force, 1980 to 1997 (%).
Sector      1980   1983   1991   1995   1997
Primary     32.5   23.1   20.6   21.0   19.0
Secondary   27.3   20.5   19.0   19.5   19.8
Tertiary    40.2   56.4   60.4   59.5   61.2
Total       100    100    100    100    100
Source: Rapport Zone Franc, Paris - France, various issues.
Employment in Senegal is shifting increasingly towards the tertiary sector. This shift is explained by the strong expansion of informal-sector activities and by the rural exodus, which reduces the population engaged in agricultural activities. Since 1983, the secondary sector's share of employment seems to have stagnated at around 20% of total employment, which reflects the continued importance attached to industrial activities in the country. This gradual reallocation of labour seems to condition the performance of the different sectors.
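The shift in employment described above can be quantified directly from the 1980 and 1997 columns of the employment table. The sketch below is a small illustration using only those published shares; the variable names are hypothetical.

```python
# Employment shares (%) for 1980 and 1997, copied from the table above.
shares = {
    "Primary":   {1980: 32.5, 1997: 19.0},
    "Secondary": {1980: 27.3, 1997: 19.8},
    "Tertiary":  {1980: 40.2, 1997: 61.2},
}

# Change in each sector's share, in percentage points, rounded to one decimal
# to match the precision of the source figures.
changes = {
    sector: round(values[1997] - values[1980], 1)
    for sector, values in shares.items()
}

for sector, change in changes.items():
    print(f"{sector}: {change:+.1f} percentage points")
```

Because the shares sum to 100 in both years, the three changes cancel out: the gain of the tertiary sector equals the combined loss of the other two.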
Table 2: Average annual growth rate of real sectoral value added (constant 1987 prices), 1970 to 1997 (%).
Sector      1970-74  1975-79  1980-84  1985-89  1990-93  1994-97
Primary       1.66     0.89     0.42     2.38    -2.38     1.88
Secondary     3.38     4.15     5.10     4.67     0.40     7.6
Tertiary      0.34     0.90     2.34     4.17     0.61     6.50
Source: Author's calculations from the sectoral GDP figures in the annual economic database of the Direction de la Prévision et de la Statistique du Sénégal.
The secondary sector has almost always improved its performance, even though the general decline of 1990-1993 brought its average growth down to 0.4%. Growth in tertiary value added is steadily improving. This suggests an improvement in labour productivity in the secondary and tertiary sectors. By contrast, the persistently low growth rates of the primary sector may be explained by climatic fluctuations and the reduction in its labour force. Analysis of the structure of growth nevertheless shows that, in general, the primary sector contributes more to it than the secondary sector, with the tertiary sector predominating.
Table 3: Decomposition of GDP growth by sectoral contribution, 1970 to 1997 (%).
Sector 1970 1975 1980 1985 1990 1993
Primary 1.7 1.1 -4.8 1.6 2.2
Secondary 1.4 1.1 1.2 0.4 0.7
Tertiary 5.6 5.5 1.6 2.0 1.7
Total 8.8 7.6 -2.0 4.0 4.5
Source: World Bank, World Tables, 1990, 1995.
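In a decomposition of this kind, the sectoral contributions should add up to total GDP growth, apart from rounding. The sketch below checks this for the five values printed in each row of the table above. The values are taken positionally, in the order printed, without assuming which years each column covers; the variable names are hypothetical.

```python
# Sectoral contributions to GDP growth (%), in the order printed in the table.
primary = [1.7, 1.1, -4.8, 1.6, 2.2]
secondary = [1.4, 1.1, 1.2, 0.4, 0.7]
tertiary = [5.6, 5.5, 1.6, 2.0, 1.7]
total = [8.8, 7.6, -2.0, 4.0, 4.5]

# Gap between the sum of the three contributions and the printed total,
# rounded to one decimal place like the source figures.
gaps = [
    round(p + s + t - tot, 1)
    for p, s, t, tot in zip(primary, secondary, tertiary, total)
]

for position, gap in enumerate(gaps, start=1):
    print(f"column {position}: gap of {gap:+.1f} percentage points")
```

Gaps of at most one tenth of a point are consistent with rounding in the published figures.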

Une étude détaillée des activités de chaque secteur permet de mieux cerner les conditions
de déroulement de leurs activités.
14
a) Le secteur primaire
Il s'agit principalement de 1' agriculture, de 1' élevage, de la pêche et des services forestiers
de maintien et d'amélioration de la fertilité des sols. L'importance stratégique de
1' agriculture dans 1' économie et dans la subsistance de la population - notamment dans
l'optique de l'autosuffisance alimentaire - doit inciter à mieux gérer les atouts et les
contraintes qui y sont liés au Sénégal. Or, depuis deux décenniss, cette branche agricole
traverse des difficultés dues à une pluviosité défavorable, à la réduction des subventions et
aux inadéquations de la politique agricole du pays ; sa contribution au PIB est passée de
18,8% sur la période 1960-1986 à 11% sur la période 1987-1993 (MEPF, 1997).
Despite the measures taken by the authorities from 1981 onwards to reduce the food deficit, the deficit has worsened, because food supplies have been unable to keep pace with rapid population growth and accelerated urbanisation.
Livestock farming contributes on average nearly 7.3% of GDP. Development efforts have been made in this branch, but it does not yet meet the whole demand for milk (MEPF, 1997). It does, however, allow exports of hides and skins. It suffers from the fragility of pastureland, owing to the country's sparse vegetation and poor management of the natural environment.
Fishing has grown steadily and now ranks first in the national economy in terms of export earnings. The sector comprises artisanal fishing, which has adopted new techniques since the 1970s, and industrial fishing, which is mainly export-oriented.
Chapter 1: Recent developments in the Senegalese economy
b) The secondary sector
The secondary sector comprises industrial activities, construction and public works (BTP), and crafts. Although the contribution of the other components to the national economy is not negligible, industrial activities are the dominant branch of the secondary sector. Since 1994 the industrial sector has been marked by new potential linked to the devaluation of the CFA franc and to the establishment of a market-friendly climate. Industry provides most of what the country supplies to international trade. Senegal has substantial agricultural, mineral, energy, forestry and fishery raw materials, whose processing yields the manufactured products it exports. Given this variety of resources, strengthening the synergy between branches of production would be enough to secure advantages in production costs and product quality. Possibilities for diversification also remain unexploited in industry (MEPF, 1997). The main industrial branches are the food industry, the chemical industry, mining, textiles and energy. In terms of value added and job opportunities, the food industry is the most important branch: it provides on average 40% of industrial value added and nearly half of total employment in manufacturing (Latreille, T., 1996). It should be noted that, as part of industrial development policy, an import-substitution strategy was followed until 1985; this strategy aimed to protect local industry through tariff and non-tariff barriers. The heavy protection did not make industry competitive, either on foreign markets or against imported products. Since 1986, however, liberalisation policies have been adopted, and industry now has to make efforts to improve its price and quality competitiveness.
c) The tertiary sector
It is dominated by tourism, trade, telecommunications and transport services and by informal-sector activities. Transport, postal and telecommunications services are the dominant activities. Because of the high rate of adult illiteracy in Senegal, many people turn to trading or to informal-sector activities, which accounts for the large contribution of the tertiary sector to GDP. This predominance of the tertiary sector in the economy calls for better oversight, so that all the economic benefits it generates are taken into account. Tourism is favoured by the country's geographical position and the quality of its beaches and tourist sites; it ranks second in foreign-exchange earnings, after fishing (MEPF, 1997).
In telecommunications, Senegal is the most competitive country in the UEMOA zone. It also offers the widest range and the best quality of telecommunications products, including Internet access. The sector has also just been liberalised. In addition, its telecommunications infrastructure is among the most modern in West Africa; its fault rate of 39% is among the lowest in sub-Saharan Africa (Banque Mondiale, 1997).
In Senegal, fiscal and monetary policies are characterised by heavy state intervention and by constraints arising from membership of UEMOA¹. They produced macroeconomic imbalances that worsened towards the end of the 1970s. The international institutions then intervened with policies to stabilise and revive the economy.
¹ As part of its credit policy, the union imposes a statutory ceiling on advances to governments, which may not exceed 20% of the ordinary budget revenue of the previous fiscal year. This limits the financing of public deficits by the Union's central bank.
1.2. MACROECONOMIC AND SECTORAL REFORM POLICIES
1.2.1. Stabilisation and structural adjustment programmes (1980-1992)
At the end of the 1970s the Senegalese government began to recognise the weaknesses of its public-sector development plans and its ambitious nationalisation programmes. The surge in groundnut and phosphate prices in 1973-1977, together with easy access to borrowing, had encouraged expansionary domestic policies (Diagne, A., 1995). Internal and external imbalances widened, and an increasingly heavy external debt financed mainly consumption. It was in this context that the government had to call on the Bretton Woods institutions to implement stabilisation and structural adjustment programmes. Together these programmes covered the period 1980-1993 and comprised two main phases.
a) The stabilisation phase: the Medium-Term Economic and Financial Recovery Plan (PREF) (1980-1984).
In a first phase, between 1980 and 1984, the country sought to implement policies to compress domestic demand in order to reduce the budget and current-account deficits: this was the stabilisation phase. It was marked by the adoption, in November 1979, of a five-year Medium-Term Economic and Financial Recovery Plan (PREF). The measures in the plan aimed to increase investment in the productive sectors, raise public saving, liberalise trade and rationalise the parapublic sector. An extended facility arrangement (FEE) was also concluded with the IMF in August 1980, followed by a structural adjustment loan from the World Bank in December 1980.
The FEE contained fiscal measures aimed at raising revenue and cutting public expenditure. But because the Senegalese government did not implement the measures requested by the IMF, no purchase could be made under the FEE, and in September 1981 it was cancelled at the government's request.
Subsequently, a one-year stand-by arrangement for 1981-1982 was concluded with the IMF, with a mid-term review and with purchases under the arrangement conditional on satisfactory progress. The measures taken under this programme were: higher prices for basic necessities, higher indirect tax rates, a freeze on civil-service wages, cuts in capital expenditure, and limits on the growth of domestic credit.
In 1982, a further programme concluded for 1982-1983 implemented the following measures: limits on civil-service staffing, higher input prices, a reduction in staffing costs in the agricultural marketing chains and in certain public enterprises, higher import taxes, and limits on the money supply and credit.
For 1983-1984, the measures aimed at raising the prices of staple products, increasing the levy on producer prices, reducing operating expenditure and limiting the growth of civil-service staffing.
b) The Medium- and Long-Term Adjustment Programme (PAMLT) (1985-1992).
The second phase (the structural adjustment, or partial adjustment, phase) began in December 1985 with the adoption of a Medium- and Long-Term Adjustment Programme (PAMLT). The programme was based on the diagnosis set out in the World Bank's "Mémorandum de l'économie sénégalaise" of November 1984. According to this diagnosis, the stagnation of the Senegalese economy was attributable to three kinds of factors: domestic demand in excess of GDP, the weak growth potential of the primary sector, and a bloated and inefficient public sector. The measures in the PAMLT therefore emphasised reducing macroeconomic imbalances, developing private initiative, correcting price distortions, stimulating domestic saving and reviving agricultural, industrial and commercial activity. To that end, agricultural, industrial and trade policies were drawn up within the framework of the structural adjustment programmes (PAS).
The New Industrial Policy (NPI), an action plan drawn up in July 1986, was adopted to tackle the excessive protection that Senegalese industry had always enjoyed and to strengthen the competitiveness of the economy. Regarded as the industrial and trade policy component of the PAS, the NPI had the following objectives:
- improving competitiveness and productivity gains in industrial firms by liberalising domestic and foreign trade and eliminating price distortions on the domestic market;
- rationalising the system of industrial incentives in order to promote high value-added, export-oriented activities;
- easing the functioning of the labour market by eliminating all sources of rigidity in employment and wage determination.
The NPI was to be implemented over three years (1986-1988), and the following measures were planned: 1) reducing and harmonising customs tariffs; 2) gradually abolishing quantitative restrictions on imports; 3) reorganising the export subsidy system in order to promote exporting and high value-added industries; 4) revising the investment code and gradually eliminating special agreements, so as to encourage private investment.
The macroeconomic effects of these programmes were mixed. Several studies agree that numerous strategic and institutional changes took place while the programmes were being applied: successful stabilisation measures and the introduction of supply-side adjustment measures (market deregulation, removal of price controls, better management of parapublic enterprises) (Elliot Berg and associates, 1990; Ministère de l'Economie des Finances et du Plan, 1991; Banque Mondiale, 1993).
The macroeconomic results were rather mediocre. The central conclusion of the study by BERG, E. and associates (1990) is: "Senegal did not make major changes in terms of adjustment during the 1980s, a decade in which interest in structural change presumably occupied the centre of the political stage" (p. 32). According to this study, structural adjustment was postponed in Senegal. For Diagne, A. (1995), the record of structural adjustment in Senegal over 1980-1992 can be summed up in the phrase "Stabilisation, perhaps; growth, no" (p. XI). Diagne, A. (1995) finds that the reduction in the budget deficit between 1985 and 1992 was achieved by means incompatible with the growth objective. In his view, the tax burden grew heavier, and the taxes on technical inputs (water, energy, telecommunications) pushed up their prices, which reduced firms' competitiveness. In addition, resources for maintenance and investment fell while the wage bill rose. For BERG, E. and associates, the slowness of Senegal's structural adjustment efforts can be explained by several factors: 1) exogenous factors such as drought and the structure of import or export prices; 2) inaction on the part of economic agents (no action or ill-suited action, bureaucratic slowness, lack of political will or commitment); 3) the presence of donors who may have lacked the capacity to supervise complex reforms.
As for the impact of the NPI, the studies carried out show that the expected results were not achieved and that the policy even had negative effects on some aspects of the national economy. The World Bank (1992) shows that import liberalisation had negative effects on industrial output and employment. The measure turned a large number of manufacturers into importers, leading to greater informalisation of the economy, the loss of many industrial jobs and a consequent fall in industrial output. Nor was the effect of the policy on industrial exports positive: with the exception of phosphate, all exports in fact either stagnated or declined.
1.2.2. External adjustment policy: the devaluation of the CFA franc (1994)
In the early 1990s, most African countries of the franc zone saw their economic conditions deteriorate following a sharp worsening of the terms of trade and the economic recession in Europe (Banque Mondiale, 1997, p. 9). Governments sought to adapt their economies to this new environment, and a 50% devaluation of the CFA franc was adopted on 11 January 1994. The main objective was to restore the competitiveness of these economies: a devaluation raises the price of tradable goods relative to non-tradable goods, improves the trade balance and encourages a reallocation of resources towards the export sector.
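The relative-price mechanism described above can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the 50% devaluation rate comes from the text, while the function names, the degree of pass-through and the 25% rise in non-tradable prices are hypothetical values chosen for the example, not figures from the studies cited.

```python
def local_price_multiplier(devaluation_rate):
    """Factor applied to the local-currency price of a good whose
    foreign-currency price is unchanged, when the currency loses
    `devaluation_rate` of its foreign value (0.5 means 50%)."""
    if not 0.0 <= devaluation_rate < 1.0:
        raise ValueError("devaluation_rate must lie in [0, 1)")
    return 1.0 / (1.0 - devaluation_rate)


def relative_price_change(devaluation_rate, pass_through=1.0,
                          nontradable_inflation=0.0):
    """Change in the price of tradables relative to non-tradables.
    `pass_through` and `nontradable_inflation` are hypothetical inputs."""
    tradable = 1.0 + pass_through * (local_price_multiplier(devaluation_rate) - 1.0)
    nontradable = 1.0 + nontradable_inflation
    return tradable / nontradable - 1.0


# A 50% devaluation doubles the local price of a good priced abroad.
full = relative_price_change(0.5)
# If non-tradable prices rise by a hypothetical 25%, part of the gain is eroded.
eroded = relative_price_change(0.5, nontradable_inflation=0.25)
print(f"relative price of tradables: +{full:.0%} (no domestic inflation), "
      f"+{eroded:.0%} (25% non-tradable inflation)")
```

The point of the second case is that the gain in competitiveness depends on how far domestic prices and wages rise after the devaluation, which is why the devaluation was paired with a stabilisation programme.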
In Senegal, the devaluation was accompanied by a stabilisation programme centred on fiscal adjustment and by a programme of structural reforms designed to improve market flexibility, develop the private sector, liberalise the economy and reduce the size of the public sector. Since 1994 the following reform measures have been carried through (Diagne, A., 1995, p. XIII; Banque Mondiale, 1997, p. 10):
- strengthening competition within the economy: price liberalisation, and abolition or renegotiation of the special agreements that protected several private or public enterprises;
- trade liberalisation: elimination of prior import and export authorisations and of reference prices at the customs cordon, lower customs tariffs and a simplified tariff schedule, and abolition of import monopolies for all products except petroleum;
- promotion of private investment and exports;
- greater labour-market flexibility: elimination of the prior authorisations required for redundancies on economic grounds;
- a reduced role for the state in the economy, preparation of an audit of the civil service, and settlement of cross-debts among public enterprises;
- social measures: a 10% wage increase and a social safety net with a budget of 15 billion CFA francs.
With regard to the growth objective, the effects of the devaluation can be analysed in the productive sector and at the macroeconomic level.
In the productive sector, the state implemented several programmes in support of the devaluation: the PASA (Programme d'Ajustement Structurel de l'Agriculture), the private-sector support project, and others. Overall, the effects were mixed and not very encouraging.
Three studies² carried out in 1996 on how small and medium-sized enterprises (SMEs) in the industrial sector responded to the devaluation reached the following conclusions:
- SMEs in Senegal use many imported inputs, which are difficult to replace with local ones. The devaluation therefore did not lead to substitution away from imported inputs, even though their cost rose;
- The devaluation reduced foreign competition and encouraged smaller firms to enter import-competing sectors. Competition increased from micro-enterprises and the informal sector, which can produce goods at lower cost. As a result, firms operating on the local market were forced to lower both the price and the quality of their products;
- All export-oriented sectors expanded strongly, especially the fishing industry;
- The sectors sheltered from foreign competition (electricity, water, energy), which are nevertheless major suppliers of the external sector, continued to undermine the efficiency of that sector;
- Given the large amount of unused production capacity in the period before the devaluation, the improvement in demand conditions has not led to job creation. Firms prefer to raise wages and the number of temporary workers rather than hire permanent employees. Under-utilisation of production capacity has therefore fallen, which has resulted in a substantial rise in productivity.

² The references for these studies are:
1) Les petites et moyennes industries après la dévaluation du franc CFA: Conséquences, réactions et potentiels au Sénégal, by R. QUALMANN, R. FRACKMANN, T. GANSLAMYR, B. GERHARDUS and B. SCHONEWALD, Etudes et rapports d'expertise 15/1996. Institut Allemand de développement, Berlin, 1996.
2) L'offre des entreprises manufacturières deux ans après la dévaluation du franc CFA: le cas du Sénégal, by G. COLLANGE, Département des Politiques et des Etudes, Division de l'ajustement et de la macro-économie (CFD), January 1996.
3) Impact de la dévaluation sur le secteur productif. Rapport provisoire. Ministère de l'Economie, des Finances et du Plan. Unité de Politique Economique, Dakar, March 1996.
At the macroeconomic level, however, economic activity has picked up since 1994. Both the government and donors consider that the effects of the devaluation and the reforms have been positive (Harold, 1995; Banque Mondiale, 1997). Exogenous factors unrelated to this policy may nevertheless have played a part and should be taken into account. The following table shows the trend in selected macroeconomic indicators over periods before and after the devaluation.
Table 4: Selected macroeconomic indicators for Senegal, 1986 to 1997 (%)

Years                            1986-90  1991-93   1994   1995   1996   1997
GDP growth                           3.3      0.0    2.0    4.8    5.6    4.7
GDP per capita growth                0.3     -2.8   -0.6    2.2    3.0    2.1
Gross domestic investment/GDP       12.6     13.1   13.7   15.6   16.3   16.7
Private investment/GDP               8.6      8.9    9.0   10.8   11.5   11.7
Gross domestic saving/GDP            6.5      5.6    7.4   10.4   11.4   11.8
Export growth                        7.9     -3.7    5.3    9.4    4.8    0.7
Inflation rate (CPI)                 0.1     -0.8   32.1    8.1    2.8    2.5
Current-account deficit/GDP        -10.7     -9.5   -9.3   -7.9   -7.2   -6.1
Budget deficit                      -3.1     -1.9   -5.7   -3.2   -2.0   -1.3
Sources: World Bank, World Tables, 1992, 1995; African Development Indicators, 1997, 1998/1999.
Inflation was brought down from 32% in 1994 to less than 3% in 1996. Real GDP growth was positive in 1994 and rose from 4.8% in 1995 to 5.6% in 1996, before easing to 4.7% in 1997, in contrast with the stagnation of the early 1990s. The budget and current-account deficits (excluding grants) fell from 5.7% and 9.3% of GDP respectively in 1994 to 2% and 7.2% of GDP in 1996, and then to 1.3% and 6.1% in 1997. Gross domestic saving rose from 7.4% of GDP in 1994 to 10.9% in 1996. After falling by 3.7% a year over 1991-1993, total exports grew by 6.5% in volume over 1994-1996.
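Two of the figures in Table 4 can be cross-checked with simple arithmetic. The sketch below, which uses only values printed in the table, derives the population growth implied by the gap between GDP growth and GDP per capita growth, and recomputes the average export growth for 1994-1996. The function and variable names are illustrative, and the implied population growth is a derived quantity, not a figure reported in the sources.

```python
# Figures copied from Table 4 (percent per year).
gdp_growth = {"1986-90": 3.3, "1991-93": 0.0, "1994": 2.0,
              "1995": 4.8, "1996": 5.6, "1997": 4.7}
gdp_per_capita_growth = {"1986-90": 0.3, "1991-93": -2.8, "1994": -0.6,
                         "1995": 2.2, "1996": 3.0, "1997": 2.1}
export_growth = {"1994": 5.3, "1995": 9.4, "1996": 4.8}


def implied_population_growth(total, per_capita):
    """Population growth (percent) consistent with the two growth rates:
    (1 + g_total) / (1 + g_per_capita) - 1."""
    return ((1.0 + total / 100.0) / (1.0 + per_capita / 100.0) - 1.0) * 100.0


population_growth = {
    period: round(implied_population_growth(gdp_growth[period],
                                            gdp_per_capita_growth[period]), 1)
    for period in gdp_growth
}
average_export_growth = round(sum(export_growth.values()) / len(export_growth), 1)

print("implied population growth:", population_growth)
print("average export growth 1994-1996:", average_export_growth)
```

The implied population growth stays between roughly 2.5% and 3% a year in every period, and the simple average of export growth over 1994-1996 matches the 6.5% quoted in the text.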
Macroeconomic policies have thus had mixed results in terms of growth. Recent economic policy measures nevertheless point to good growth prospects.
1.3. PROSPECTS FOR ECONOMIC GROWTH
All in all, since independence the Senegalese economy has not achieved a genuine take-off in growth. Although climatic hazards may have played a part, the infrastructure advantages the country enjoyed from the outset (at independence) appear to have been poorly exploited. This lack of a genuine take-off should therefore be attributed rather to errors in the implementation of the various economic policies adopted in the country. The new direction taken by economic policy since 1995 nevertheless gives grounds for optimism. In 1995, following the third meeting of the donors' consultative group, the government of Senegal adopted a new strategy based on accelerating growth, with the following objectives: 1) balancing the state's financial operations from 1997; 2) reducing the current-account deficit from 8.3% of GDP in 1994 to 6.8% in 2000; 3) keeping inflation under control at levels similar to those before the devaluation (around 2%); 4) raising the investment rate to 19% by 2000.
To achieve these objectives, four priorities were set: 1) establishing a viable macroeconomic framework; 2) continuing the state's withdrawal from commercial activities; 3) further liberalising the economy; 4) promoting the private sector.
Within this framework, several measures were implemented that did stimulate the economy: fiscal and budgetary measures, measures to ensure healthy competition between firms, measures to improve investment and the competitiveness of firms, measures to improve the legal and regulatory framework for firms, and measures to liberalise and promote foreign trade.
Although growth has been higher since 1994, the growth target of 6% has not yet been reached (the maximum, 5.6%, was recorded in 1996). Investment outcomes are encouraging (15.2% of GDP in 1995 and 17% in 1997) but still fall short of the 19% target. It should be noted, however, that today, thanks to the commitment of the international institutions to work for growth, the political and social stability that prevails in most of the country (outside the Casamance region), and a relatively modern and competitively priced telecommunications network (at least within sub-Saharan Africa), Senegal can attract a large volume of foreign investment and improve its growth performance. For these opportunities to be properly exploited, however, appropriate policy measures are needed.
The rest of this study is therefore devoted to identifying the factors that best explain movements in the growth rate of aggregate output per capita in the Senegalese economy. In the next chapter, the question of growth factors is approached from a theoretical standpoint, through the results of a number of theoretical and empirical studies of economic growth.
CHAPTER TWO:
FACTORS OF ECONOMIC GROWTH:
A REVIEW OF THE LITERATURE
This review is organised around four topics. The first concerns the development of growth theories, from the work of SOLOW to endogenous growth theory. The second deals with the concept of human capital and its role in economic growth. After presenting the links between trade strategy and growth in a third part, we devote the fourth to the importance of macroeconomic stability for macroeconomic outcomes.
2.1. DEVELOPMENT OF GROWTH THEORIES
The concept of economic growth has evolved over time to take account, at each stage, of the new concerns of those responsible for development. According to PERROUX, F., economic growth is "the sustained increase, over one or more long periods, of an indicator of size: for a nation, aggregate output in real terms". To speak of economic growth, the quantity of material goods and services produced in the economy must increase over a long period. Moreover, if the distribution of the income created is not too unequal, this overall increase is generally accompanied by an improvement in the material well-being of the members of the economy. PERROUX's definition, however, does not capture these qualitative changes. More recent definitions of growth therefore incorporate the idea of an increase in economic well-being. Thus KUZNETS (1973), cited by TERLECKYJ (1984), considers that "modern economic growth reflects a continuing capacity to supply a growing population with an increased volume of goods and services per capita". More broadly, TERLECKYJ (1984) defines growth so as to cover cases where it is negative or positive, on the one hand, and cases where it concerns aggregate output or output per capita, on the other: "A capacity to sustain rapidly increasing numbers of people with the standard of living maintained or only slightly increased can legitimately be described as economic growth".
Economic growth can therefore be thought of as a sustained increase in the economy's real output per capita over a long period, such that it improves, however slightly, the standard of living of the members of the economy.
Economic growth so defined is a long-term process whose purpose is to raise an indicator of the economy's well-being. Studying it can contribute far more to improving living standards than all the analyses of short-term macroeconomic policy have done. Economists have accordingly turned their attention to the characteristics of the growth process. For Kaldor (1963), cited by Barro and Sala-i-Martin (1996), economic growth is characterised by six main facts: 1) output per capita grows at a relatively constant rate; 2) physical capital per capita grows over time; 3) the rate of return on capital is approximately constant; 4) the ratio of physical capital to output is approximately constant; 5) the shares of labour and physical capital in national income are approximately constant; 6) the growth rate of output per capita varies widely from one country to another.
These characteristics mainly concern various elements of the production process and say nothing about the sectoral changes that accompany economic growth. Kuznets (1981) accordingly identified other characteristics of modern economic growth. He stresses the rapid pace of structural transformation (the shift from agriculture to industry and then to services). This process involves urbanisation, the move from craft production to modern production, and a growing role for education. Modern growth also brings an expansion of foreign trade, and technical progress should reduce dependence on natural resources. He also mentions the growing importance of the role of the state.
These characteristics do not, however, explain how growth factors combine to produce a given level of growth. The history of modern growth theory can be summarised as follows, drawing on BARRO and SALA-I-MARTIN (1996) and GUELLEC, D. and RALLE, P. (1995).
The earliest ideas of growth theory go back to RAMSEY's article of 1928. The optimality conditions introduced by RAMSEY are now widely used in the study of consumption theory, price setting and business-cycle theory. Between RAMSEY and the end of the 1940s, HARROD (1939) and DOMAR (1946) tried to reconcile Keynesian analysis with elements of economic growth. To do so they used production functions with little substitutability between inputs in order to argue that the capitalist system was inherently unstable. Subsequently, SOLOW (1956) and SWAN (1956) developed a production function of neoclassical form. Their production function assumes constant returns to scale, diminishing returns to each factor of production, and a positive elasticity of substitution between factors. One important result of the SOLOW-SWAN model is the notion of conditional convergence: the lower the starting level of real GDP per capita relative to its long-run or steady-state position, the faster the growth rate. This notion, which follows from the assumption of diminishing returns to capital, helps to explain a large part of the differences in economic growth rates between certain countries or regions: economies close to their long-run position grow more slowly than those further from it. Another result of the model is that, in the absence of continuing improvements in technology, per capita growth eventually ceases, because of the diminishing marginal return to capital. This result is, however, contradicted by empirical observations of positive per capita growth rates, with no clear tendency to decline, in economies where technology has not really improved.
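The convergence property described above can be made concrete with a small numerical sketch. It assumes a Cobb-Douglas production function per worker, y = k^alpha, and a constant saving rate, which is one common textbook form of the SOLOW-SWAN model rather than a specification taken from this study. All parameter values (saving rate s, capital share alpha, population growth n, depreciation delta) and function names are hypothetical and chosen only for illustration.

```python
def capital_growth_rate(k, s=0.2, alpha=0.3, n=0.025, delta=0.05):
    """Growth rate of capital per worker when y = k**alpha (assumed form):
    s * k**(alpha - 1) - (n + delta). It falls as k rises."""
    return s * k ** (alpha - 1.0) - (n + delta)


def steady_state_capital(s=0.2, alpha=0.3, n=0.025, delta=0.05):
    """Capital per worker at which the growth rate above is zero."""
    return (s / (n + delta)) ** (1.0 / (1.0 - alpha))


def simulate(k0, periods):
    """Iterate k forward one period at a time with the default parameters."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append(k * (1.0 + capital_growth_rate(k)))
    return path


k_star = steady_state_capital()
far_from_steady_state = capital_growth_rate(0.5 * k_star)
near_steady_state = capital_growth_rate(0.9 * k_star)
final_k = simulate(0.5 * k_star, 500)[-1]

print(f"steady-state capital per worker: {k_star:.2f}")
print(f"growth far from the steady state:  {far_from_steady_state:.3%}")
print(f"growth near the steady state:      {near_steady_state:.3%}")
```

With these hypothetical parameters, the economy that starts at half its steady-state capital grows several times faster than the one that starts at 90% of it, and without technical progress the simulated path settles at the steady state, where per capita growth stops.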
Other neoclassical theorists then carried the work forward under the assumption of exogenous technical progress. Exogenous technical progress means that technical progress does not result from any economic activity and that its level cannot be determined within the economic sphere. The resulting models show that the per capita growth rate is determined by the rate of technical progress and by the population growth rate, both exogenous to the models. Under these conditions growth could not be self-sustaining and generate long-run growth of its own accord, since its determinants lie outside the economic sphere.
Shortly afterwards, the models of ARROW (1962) and SHESHINSKI (1967) treated ideas as unintended by-products of production or investment: this is the mechanism of "learning by doing". In these models, each individual's discoveries spread immediately to the whole economy through a diffusion process. Work on the effects of the diffusion of ideas in the economy, however, did not develop much further.
It was only from the mid-1980s that research on economic growth experienced a revival,
thanks to the work of Romer (1986) and Lucas (1988). These authors agree on the primacy of
long-run over short-run growth. What was needed, however, was a model in which long-run
per capita growth is no longer tied to exogenous variables, as in the neoclassical models,
but is explained by variables internal to the economic model. Their models are therefore
called endogenous growth models. In these models, growth can continue indefinitely because
the returns on investment in a broad class of capital goods (including human capital) do
not diminish as the economy develops. The diffusion of knowledge among producers and the
external benefits of human capital are part of the growth process, since they counteract
diminishing returns to capital. Returns to capital are then constant and allow growth to
be sustained. One could in fact imagine increasing returns to capital, but the economy
would then risk explosive growth, a phenomenon whose effects can be very harmful to
development. In addition, technical progress results from deliberate, intentional research
and development (R&D) activity. There is therefore no risk that ideas will run out, and
the economy's growth rate can remain positive in the long run. Government intervention in
the economy can thus influence the long-run growth rate; these models accordingly tend to
rehabilitate the role of the state in economic activity.
Within endogenous growth theory, the first model was formulated by P. ROMER in 1986; it
builds on the most recent work on growth by stressing the importance of investment in the
growth process. In this model, growth can continue indefinitely because the returns on
investment in capital goods do not necessarily diminish as the economy develops. The
second model of this new wave was developed by R. LUCAS in 1988 and emphasises the
accumulation of human capital by individuals. In the same vein, G. Becker et al. (1990)
revisit population growth, an exogenous source of economic growth in SOLOW, and argue that
economies have an interest in limiting population growth so as to secure a higher level of
human capital for the population and thereby sustain a lasting growth process. Shortly
afterwards, in 1990, P. ROMER developed a new model in which he examines the role of
technological innovation and of research and development (R&D) spending in growth. This
model considers a multisector economy in which capital is not homogeneous but consists of
successive generations of inputs. The new inputs, produced under increasing returns, raise
productivity in the final-goods sector and improve the overall efficiency of the economy.
Furthermore, in work carried out in 1990 and 1991, R. BARRO constructs endogenous growth
models that highlight the weight of public infrastructure in the growth process. In these
models, public goods raise the productivity of private agents and strengthen the growth
process.
Extensions of these models lead today to models of technological diffusion, which
postulate that, through imitation, less developed countries can use the discoveries of
more advanced countries, which is less costly than technological innovation.
In short, endogenous growth theories break with traditional growth theories in their
analysis of the sources of growth. They formalise growth as a self-sustaining process that
can be driven by constant returns and by various types of externality, such as
technological innovation, public goods, knowledge and ideas.
2.2. CONCEPT AND ROLE OF HUMAN CAPITAL IN ECONOMIC GROWTH
According to the encyclopedia of economics, human capital is the stock of economically
productive human capabilities (BEHRMAN, J.R. and TAUBMAN, P.J., 1984). These capabilities
are created or produced by combining innate abilities with investments in human beings.
This definition underlines the relevance of human capital as a factor of economic growth,
but it does not explicitly show how human capital is accumulated. In this respect, NGUYEN
and SCHWAB (1999) define human capital as "the skill level of the labour force, this level
resulting from the accumulation of experience and know-how, both in the school system and
in working life". Although these definitions emphasise experience acquired through
education in the formation of human capital, it should be noted that human capital also
results from other types of investment in human resources: improved health and nutrition,
and reduced fertility (PSACHAROPOULOS, 1988). In fact, education conditions and fosters
these other types of investment, which makes it central to human capital accumulation. The
notion of human capital was introduced into economic theory on the basis of the idea that
the volume of output produced by an economic unit depends on the state of knowledge and
technology. Indeed, the volume of production depends on the efficiency with which factors
of production are combined and transformed (BEHRMAN, J.R. and TAUBMAN, P.J., 1984). The
traditional trilogy of factors of production, namely human labour, natural resources
(land) and physical capital, was thus superseded, and economists began to look for an
economic framework for decisions about investment in human beings. It was in fact from
1960 onwards that THEODORE SCHULTZ, GARY BECKER and JACOB MINCER introduced the idea that
people invest in themselves to increase their stock of human capital (BEHRMAN, J.R. and
TAUBMAN, P.J., 1984). This idea attracted considerable interest among economists, and a
large number of studies were devoted to it before research in this field slowed in the
1970s (PSACHAROPOULOS, 1988). Only in the second half of the 1980s did work on the role of
human capital resume, alongside the revival of research on growth. The relevance and
importance of investment in human capital for promoting growth can be summed up in this
sentence by MINGAT (1984), quoted by PSACHAROPOULOS (1988): "Investment in infrastructure
and physical capital will not reach its full potential unless there has been investment in
the people who are ultimately responsible for the efficient operation of physical
capital".
Empirical results on the role of human capital in growth in various countries do not allow
a firm conclusion about the nature of its effect. The effect varies with the country's
level of development or with the economic policies implemented there.
LAU, JAMISON and LOUAT (1991) estimated a Cobb-Douglas production function using the first
differences of the logarithms of the variables, for reasons of stationarity. The study
covers a panel of 58 countries in 5 regions of the developing world and seeks to estimate
the effects of the level of education on growth. It finds that human capital has a
negative effect in Africa and the Middle East and a non-significant effect in South Asia
and Latin America. Only in East Asia does education have a positive and significant
impact. These results may reflect the fact that the effect of human capital on growth can
depend on the country's level of economic development. The World Bank's World Development
Report noted the limited importance of education in explaining growth at the macroeconomic
level in developing countries (World Bank, 1995).
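To see why first differences of logarithms are convenient, note that the log-difference of
a variable is approximately its growth rate, so a Cobb-Douglas function becomes linear in
growth rates. The sketch below is a minimal illustration on synthetic data, assuming
constant returns to scale so that output per worker depends only on capital per worker;
the data, the parameter values and the helper names are hypothetical and are not those of
LAU, JAMISON and LOUAT (1991).

```python
import math

# Hypothetical "true" parameters used only to generate the synthetic data.
ALPHA = 0.35  # elasticity of output per worker with respect to capital per worker
G = 0.02      # constant growth rate of total factor productivity


def log_diff(series):
    """First differences of the logarithms of a series (approximate growth rates)."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]


def ols_slope_intercept(x, y):
    """Ordinary least squares for a single regressor with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic capital per worker, and output per worker y_t = exp(G * t) * k_t**ALPHA.
k = [1.0, 1.1, 1.3, 1.35, 1.6, 2.0]
y = [math.exp(G * t) * kt ** ALPHA for t, kt in enumerate(k)]

# Regress the growth rate of y on the growth rate of k:
# dlog(y) = G + ALPHA * dlog(k), so the slope estimates ALPHA and the intercept G.
alpha_hat, g_hat = ols_slope_intercept(log_diff(k), log_diff(y))
print(f"estimated elasticity = {alpha_hat:.3f}, estimated TFP growth = {g_hat:.3f}")
```

Because the synthetic data contain no disturbance term, the regression recovers the two
parameters exactly; with real panel data one would add further regressors (such as an
education variable) and expect sampling error.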
BARRO (1991) regressed the per capita incomes of the countries in his sample on a set of
variables, with the enrolment rate in lower secondary education as the variable measuring
human capital. His estimates showed that the initial level of human capital is one of the
significant determinants of growth.
PYO (1995) carries out an empirical estimation using time-series data for the United
States and the Republic of Korea. Human capital is captured by total expenditure invested
in human capital formation in the form of subsidies or education spending. Although the
effect of human capital on growth is positive and significant for both countries, the
author notes that in the case of Korea, as in that of developing countries, human capital
instead plays the role of an accumulated resource that complements physical capital and
labour. It therefore does not generate externalities of the kind assumed in endogenous
growth models.
PRITCHETT (1996) analysed the sources of growth using panel data covering 91 countries.
His results show that human capital accumulation, measured using education data, has a
large, negative and significant effect on productivity growth. He offers three possible
explanations for this result: 1) education does not really create human capital; 2) the
marginal returns to education fall rapidly while the demand for skilled labour is almost
constant; 3) an unfavourable institutional environment may have prevented skilled labour
from being employed in growth-promoting activities.
BERTHÉLEMY et al. (1997) contributed to the analysis of the role of human capital in
growth using panel data for 83 countries and six 5-year periods, from 1960-1965 to
1985-1990. Their study is motivated by the observation that no econometric validation
based on panel data had yet been carried out for the hypothesis that human capital
contributes to growth. In a first estimation of the augmented Solow model, the authors
found a negative effect of human capital on growth. When an explanatory variable capturing
changes in trade policy (measured by the trade openness ratio) is then introduced, the
results improve, although the sign of the human capital variable does not change.
Referring to the analysis of Gould and Ruffin (1995), they note that the trade regime
influences an economy's capacity to mobilise its human capital for growth. This argument
appears to be confirmed by the positive coefficients on the trade regime variable and on
the interaction variable between human capital and the trade regime. In short, their study
shows that human capital can have a positive effect on growth, but that this effect
depends on the economy's capacity to channel its human resources into activities that
generate technological progress through innovation or imitation.
SACERDOTI et al. (1998) start from the results of certain studies showing that the
positive relationship between school enrolment rates and growth should not lead to the
conclusion that human capital contributes positively to growth, since school enrolment is
only very weakly correlated with human capital accumulation. The aim of their study is to
identify the factors that influence economic growth in 9 West African countries and to
compute data series on human capital accumulation for each of these countries. Using a
growth-accounting methodology, they find that an increase in physical capital, especially
private investment, contributes greatly to raising GDP per worker. The impact of human
capital accumulation, however, is not significant. This result raises the question of how
the high benefits of a higher level of education could have only a weak, or even negative,
impact on the growth of output per worker. They pursued their analysis by constructing
models that take account of country-specific factors, such as exogenous shock or economic
policy variables. They identify the terms of trade, the degree of trade openness, the
public deficit and the share of public investment in total investment as the main
components of the country-specific effects. They therefore conclude that, to have a
significant impact on growth, education should be accompanied by structural reforms that
raise its social returns. They recommend that economic policies aim to create an economic
environment favourable to private investment and to the effective use of workers' skills.
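A growth-accounting decomposition of this kind can be sketched as follows. This is a
minimal illustration, assuming output per worker follows y = A * k**a * h**(1 - a), so
that its growth rate splits into contributions from physical capital per worker, human
capital per worker and a residual; the function name, the capital share and the growth
figures are hypothetical and are not taken from SACERDOTI et al. (1998).

```python
def growth_accounting(g_output, g_capital, g_human, capital_share=0.35):
    """Split the growth of output per worker into factor contributions.

    All growth rates are per worker and expressed as decimals (0.03 means 3%).
    The capital share of 0.35 is an illustrative assumption.
    """
    contrib_capital = capital_share * g_capital
    contrib_human = (1 - capital_share) * g_human
    # Whatever the two factors do not explain is attributed to productivity.
    residual = g_output - contrib_capital - contrib_human
    return {
        "physical_capital": contrib_capital,
        "human_capital": contrib_human,
        "residual": residual,
    }


# Hypothetical economy: output per worker grows 3%, physical capital per worker 4%,
# human capital per worker 1%.
decomposition = growth_accounting(g_output=0.03, g_capital=0.04, g_human=0.01)
for source, contribution in decomposition.items():
    print(f"{source}: {contribution:.4f}")
```

By construction the three contributions add up to the growth rate of output per worker, so
a small human capital term can coexist with a large physical capital term.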
RAMON, L. et al. (1998) pointed out that no country has achieved sustained development
without investing substantially in human capital. The facts have also shown, however, that
some countries adopted good education policies without subsequently recording good
economic results. Faced with this contrast between empirical facts and theoretical results
on the role of human capital in growth, their study sought to answer the question of when
and how education can generate notable effects in the economy. The study brought out two
explanatory factors: the distribution of education and the economic policies implemented.
Using panel data for a set of 12 countries in Asia and Latin America over the period
1970-1994, they investigated the links between education, economic reforms and economic
growth. Their results are conclusive. First, a very unequal distribution of education
among workers tends to have a negative impact on per capita income in most countries. In a
model that controls for the distribution of education, average education has a positive
and significant effect, whereas if the distribution is ignored, the effect is
non-significant or even negative for some countries. The effect of reforms on the impact
of education on growth is captured in the model by the coefficient on an interaction
variable, the product of the average education variable and an economic reform dummy that
takes the value 1 for years in which reforms are implemented and 0 for years without
specific measures. Second, the results show that economic policies that suppress market
forces tend to reduce the impact of human capital on growth. Investment in human capital
can have only a weak effect on growth unless education is acquired and used in open and
competitive markets. The economic environment can also help improve the quality of the
effects of education; it can bring about a better distribution of education and thereby
raise welfare more effectively.
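The role of the interaction variable can be made explicit with a stylised specification.
The notation below is an assumption introduced for illustration, since the exact equation
is not reported here: $y_{it}$ is income per capita, $E_{it}$ average education, $R_{it}$
the reform dummy, and $X_{it}$ other controls such as the distribution of education.

```latex
\ln y_{it} = \beta_0 + \beta_1 E_{it} + \beta_2 \,(E_{it} \times R_{it})
           + \gamma' X_{it} + \varepsilon_{it},
\qquad
\frac{\partial \ln y_{it}}{\partial E_{it}} = \beta_1 + \beta_2 R_{it}.
```

In such a specification the effect of education is $\beta_1$ in years without reforms and
$\beta_1 + \beta_2$ in reform years, so a positive $\beta_2$ would indicate that reforms
strengthen the impact of education.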
VERNER (1999) used a data set on workers on the one hand and on their respective firms on
the other to estimate a production function and wage equations for Ghanaian firms. She
used data from the survey conducted in 1994 by the "Regional Program on Enterprise
Development (RPED)", which covers a sample of 215 manufacturing firms, from micro to the
largest enterprises. The model used to explain wages and productivity is given by the
following equation:

$\ln Y = \sum I\,\rho + \sum F\,\delta$

Y, the dependent variable, is a vector with two components: the wage (w) and productivity
(v).
The explanatory variables F are firm characteristics, which are labour demand factors; the
explanatory variables I are employee characteristics, which are labour supply factors.
$\rho$ and $\delta$ are coefficients measuring the marginal impacts of the explanatory
variables (I and F respectively) on wages and productivity.
This approach allowed her not only to measure the marginal impact of different
characteristics (of both workers and the firm) on wages, but also to compare how these
characteristics affect productivity across different groups of workers. The main results
can be summarised as follows:
- Women are paid less than men in firms, without this wage gap reflecting any difference
in productivity.
- The more training and education workers have, the higher their wages and the greater
their productivity.
- Productivity differences are distinguished for five levels of education. Productivity
gaps are larger than wage gaps across these levels of education, which shows that wages
are not strictly indexed to productivity.
- Training of workers outside the firm, as opposed to in-house training, leads to higher
wages without any notable impact on productivity.
She concludes that even in the short run, investment in human capital improves
productivity.
NGUYEN and SCHWAB (1999) empirically tested the role of human capital in growth by
specifying two growth models. The first explains GDP by the stock of physical capital and
the labour force (quantity of labour); the second adds to these two variables one
representing human capital, measured by the number of people in the country's labour force
who have completed first- and second-cycle secondary education. The models are specified
in log-linear form and were estimated for four newly emerging Asian countries: Indonesia,
Malaysia, the Philippines and Thailand; the analyses were also carried out for the
Republic of Korea and Singapore for comparison.
The estimation results for the first model reveal the positive role of physical capital
and the labour force in determining the level of output. Thailand is the exception,
however, with a negative but non-significant coefficient on the labour force. Adding the
human capital variable improved the coefficients of determination R². Human capital has
positive coefficients in most cases, which implies that an increase in the total number of
people with first- or second-cycle education raises output. These coefficients are not
significant, however, which calls the effect just mentioned into question. For these
authors, the failure to account for the accumulation of experience and know-how over the
course of working life could explain the non-significant coefficients obtained for human
capital in explaining growth in developing countries. They observe that learning by doing
is predominant and that estimates of the role of human capital should take this into
account. In the case of Thailand, the coefficient on human capital is negative and
significant at the 10% level. This may be explained by the fact that, in countries where
the stock of human capital is relatively low (as in Thailand), there may be high fixed
costs in producing human capital and substantial opportunity costs in acquiring education.
Moreover, in underdeveloped countries, educated workers generally work in an unfavourable
environment in which they cannot make full use of their abilities.
Overall, these studies have helped identify the conditions under which human capital can
be used to contribute effectively to economic growth. However, since the economy operates
in a global context, trade policies can also play an important role in economic growth.
2.3. TRADE STRATEGIES AND GROWTH
Since the early 1980s, a new interventionist approach to international trade has
developed. This approach departs from the traditional one by examining the cases of
economies of scale and of protection with externalities (Krugman, 1990). In the case of
economies of scale, international trade no longer takes place under conditions of perfect
competition. The world market allows only a very small number of firms to coexist, and
these earn profits in excess of the opportunity cost of capital because they have
sufficient power to set prices. A relatively large country can then secure a substantial
share of these profits by subsidising exports or by protecting the industry concerned for
a certain period to allow it to achieve economies of scale (infant industry theory)
(Krugman, P., 1996). This argument seems hardly realistic for developing countries. The
second case, more relevant for developing countries, holds that certain sectors of
activity generate more learning effects and accumulation of technical know-how than
others. Such external effects cannot arise through the free play of market mechanisms
alone in a local market exposed to international competition. Without state intervention,
local production of technical know-how will be suboptimal. Under these conditions, some
tariff protection and export subsidies become beneficial measures for the nation's
welfare. In fact, this second argument suggests that activities generating large
externalities, as well as human capital formation, should receive state support in the
form of subsidies or temporary protection.
But for developing countries, what type of trade strategy should be followed to accelerate
growth? With the rapid advance of the unavoidable phenomenon of economic globalisation, it
seems clear that no economy can any longer survive and develop by adopting protectionist
policies, and liberalisation is therefore imperative. What, then, can be expected from the
liberalisation of an economy in terms of its growth, and how can this be made possible?
RODRIK, D. (1992) frames the question of trade strategy in terms of its effects on
productivity. He points out that total factor productivity growth is a much more important
source of growth in developed countries than in developing countries, accounting on
average for half of growth in the former group against less than a third in the latter.
The role of productivity in explaining growth is thus smaller for developing countries.
With the question of trade strategy thus clearly posed, though without the author offering
any answers, LAHOUEL (1996) examined, somewhat later, the relationship between trade
strategy and productivity. In his view, there are at least four arguments in favour of
trade liberalisation as a means of improving productivity:
1) First, foreign competition tends to strip local producers, especially when the local
market is small, of the oligopolistic or even monopolistic position that gives them no
incentive to adopt means of reducing costs and improving productivity. In the absence of
competition, firms are under no pressure to devote sufficient resources to reducing their
costs. In this case, import liberalisation can induce them to improve their productivity
in order to keep their market share, since their products then compete with foreign ones.
2) The second argument concerns macroeconomic instability, which is said to be specific to
economies whose policy is repressive towards foreign trade. Such a policy involves
rationing imports of intermediate products and capital goods. This tends to reduce
utilisation of installed production capacity, especially when the local market is very
limited, and thus to perpetuate inefficiency. Liberalising the trade regime would, under
these conditions, be a way of avoiding these bottlenecks and promoting the productivity of
local factors. This argument seems quite tenable, even though for some authors such as
RODRIK (1992) the bottlenecks in question result from poor macroeconomic management
leading to excess demand for imported goods and are therefore not the result of the trade
regime.
3) The third argument considers the economies of scale that will accrue to firms producing
goods in which the country has comparative advantages, which liberalisation will allow it
to exploit. This argument holds especially in countries where the local market is fairly
small.
4) The fourth argument concerns the circulation of ideas, goods and new management methods
that takes place through openness to foreign markets. This argument, that new ideas are
likely to accompany foreign trade, was spelled out by LEWIS, A. in 1955, ARROW (1962) and
SHESHINSKI (1967), well before endogenous growth theory was formulated. For LEWIS, A.,
"New ideas will be accepted more readily in societies where people are accustomed to
change... A country that is isolated is, by contrast, less likely to absorb new ideas
quickly . . ." (LAHOUEL, 1996). In fact, beyond flows of goods, the assimilation of
technology and of better management methods would be facilitated by the contacts that
accompany trade.
In addition, among empirical studies of the importance of trade liberalisation for growth,
BROCHART (1984) examined the relationship between the growth rate of GDP and that of
exports. He studied the specific case of the African countries of the franc zone using a
cross-sectional analysis and then a time-series analysis over the period 1962-1979. The
results proved disappointing, since a significant positive correlation is observed in only
three cases out of twelve: Côte d'Ivoire, Senegal and Cameroon, which are the most
developed countries in the zone. The author then makes three remarks related to the
characteristics of the economies studied. First, the export growth rate needed to achieve
a given rate of economic growth is lower the higher the country's export ratio, that is,
its degree of openness to the outside world. Second, the effects of export growth on
economic growth seem to depend on the structure of exports; these effects are larger the
more concentrated the structure of exports (highly homogeneous exports). Third, the effect
of export growth is stronger the more favourable the trend in the terms of trade.
In the same vein, TYBOUT (1990) used firm-level microeconomic data to analyse the
relationship between a firm's trade orientation (foreign market versus local market) and
its productivity growth. He concluded that outward-oriented firms achieve larger
productivity gains than those producing for the local market. In his study, the trade
regime is captured by the export growth rate.
Drawing on the endogenous growth literature, DE MELO and ROBINSON (1990) built a model
that incorporates the externalities of liberalisation into computable general equilibrium
models. Their study focused on South Korea. They considered two types of externality, one
associated with export expansion, the other with the expansion of capital goods imports.
They find that orientation towards foreign markets raises factor productivity in the light
and heavy manufacturing sectors, but does not directly affect the other two sectors,
agriculture and services. It can thus be understood that the beneficial effects of
technology transfer, captured by capital goods imports, translate into improved
productivity or efficiency of capital. The authors stress that in this model, optimal
growth can be achieved in the presence of these externalities only if the state subsidises
exports and imports of capital goods. It should be noted that in the De Melo and Robinson
model, the mechanisms through which foreign trade liberalisation generates the
externalities are not clearly spelled out.
FOSU, A.K. (1990) focused on the specific case of African countries because, in his view,
these countries have more in common with one another than with other developing countries,
both culturally and in terms of economic structure. He analysed the influence of exports
on the average annual GDP growth rate of 28 African countries over the period 1960-1980.
He used a production function comprising labour, capital and exports. His results showed
that export growth has a positive and significant impact on economic growth.
For DODARO, S. (1991), the impact of export promotion policy on growth depends both on
countries' level of development and on the structure of their exports. Using a
cross-sectional analysis of all the least developed countries for which data are
available, he found that economic growth is strongly correlated with the share of
manufactured products in total exports, and then with the degree of processing of primary
products.
GUILLAUMONT, P. (1994) introduced trade openness policy into an explanatory model of
growth. The model was tested on a sample of 40 non-oil-exporting developing countries over
the periods 1970-1981 and 1973-1985. The explanatory factors of GDP growth retained are:
labour force growth, the investment rate, export growth, and export instability weighted
by the export ratio. The model assumes that factor productivity is influenced both by
export growth and by export instability, whose effects are assumed to depend on the policy
of external openness. Estimates of the model confirmed the hypothesis of a positive effect
of export growth and a negative effect of export instability. The model explains 63 to 73%
of the differences in growth among the developing countries in the sample. For the author,
"Openness policy appears to exert a favourable influence on growth in three main ways: it
acts through investment rates (although this effect is barely perceptible in the sample
considered); next, it raises the growth of exports, which generate external economies;
finally, and above all, it improves responses to export instability".
KELLER (1997) analysed the mechanism through which economies benefit from research and
development carried out abroad. He sought to determine the extent to which a given
country benefits from imports of intermediate goods embodying new technologies, this
variable being used as a proxy for foreign investment in R&D. He looked for the links
between technological change, trade and productivity growth, using data on industrial
activity in eight OECD countries (Sweden and the G7 countries) between 1970 and 1991 to
estimate a model linking trade and growth. His results show several externalities arising
from R&D carried out abroad. First, the effects of foreign R&D on productivity vary
substantially depending on the country that carried out the R&D; in other words, the
quality of new technology varies from one country to another. Second, the effect of a
country's domestic R&D is larger than that of the R&D carried out by an average foreign
country, which is explained by the fact that local technological achievements are better
suited to local needs and skills. The author nevertheless cautions that alternative
mechanisms such as foreign direct investment should be taken into account when estimating
the effects of international trade on the international diffusion of technology. Third,
the structure of a country's imports does not significantly affect the extent to which it
benefits from foreign R&D. Among other reasons, the author argues that this is primarily
due to a general externality effect arising from foreign R&D investment. This effect
would not be linked to international trade but transmitted through other mechanisms such
as foreign direct investment. His results indeed show that international trade accounts
for only about 20% of the total effect of foreign R&D investment.
Overall, with the new theories of endogenous growth and international trade, the emphasis
is no longer on the static effects of liberalization, which are effects of resource
reallocation between factors. Priority is given instead to the dynamic external effects
of learning, imitation, assimilation or mastery of technology and, more generally, to
effects linked to productivity gains.
2.4. MACROECONOMIC ENVIRONMENT AND GROWTH
A stable macroeconomic environment is characterized by a low and steady level of
inflation, a sustainable budget deficit, an appropriate exchange rate and a sustainable
debt burden. Stability signals to the private sector that economic policy is being
properly conducted, that its objectives are sound, and that the political authorities are
credible in managing the economy effectively. By facilitating investment and planning
decisions, macroeconomic stability thus encourages saving and private capital
accumulation. Indeed, an appropriate exchange rate and an adequate structure of relative
prices are necessary conditions for a stable environment. In empirical studies, the
quality of the macroeconomic environment is captured using the inflation rate, variations
in the real exchange rate, and the ratios of the budget deficit and of external debt to
gross domestic product. According to FISCHER (1993), inflation is a good indicator of a
government's credibility and of its capacity to manage the national economy properly.
The impact of inflation on growth is often poorly captured in macroeconomic models.
According to the Tobin-Mundell effect, an anticipated rate of inflation helps lower the
real interest rate and thus triggers a portfolio adjustment away from real money balances
towards physical capital. This leads to higher investment and hence higher growth.
However, in developing countries where the financial market is underdeveloped or
non-existent, the portfolio adjustment runs from money to foreign securities or foreign
currency. Consequently, high anticipated inflation should reduce private investment and
therefore weigh on economic growth.
HARROLD, JAYAWICKRAMA and BHATTASALI (1996) start from the observation that between 1961
and 1993 Africa recorded very low growth of GDP per capita (0.3%) compared with East Asia
(7.4%) or South Asia (4.3%). They therefore looked for the causes of Africa's slow
economic growth by comparing Asian and African economies. To control for initial
conditions, they selected three groups of countries from Southeast Asia and sub-Saharan
Africa with similar characteristics: Nigeria and Indonesia in the first group, Côte
d'Ivoire and Malaysia in the second, and Ghana, Tanzania and Thailand in the third.
Comparing Nigeria and Indonesia, they noted that Indonesia was able to limit the adverse
effects of Dutch disease by adopting a restrictive fiscal policy and a prudent monetary
policy. In addition, rapid exchange rate adjustments in response to changes in the oil
price made it possible to avoid a crisis in the external sector. By contrast, faced with
the oil boom, Nigeria waited until devaluation was forced on it by a later crisis. These
differences in the response to economic shocks led Indonesia, which was poorer than
Nigeria in 1960, to overtake Nigeria by the early 1980s in terms of GDP per capita and
export structure.
Côte d'Ivoire and Malaysia, for their part, are both endowed with rich agricultural land
and minerals. Between 1961 and 1970, Côte d'Ivoire's GDP grew at about 12.4% per year, a
rate well above that of the newly industrialized Asian economies. Between 1965 and the
late 1970s, GDP per capita grew at the same rate in the two countries. Since the late
1970s, however, while Malaysia has continued to grow, GDP per capita has fallen in Côte
d'Ivoire. This fall was caused by fiscal indiscipline and a rigid exchange rate system
(an effect of membership of a monetary union). A real exchange rate appreciation in
Malaysia in the early 1980s caused a temporary decline that was corrected within a few
years. Malaysia also encouraged investment and exports through taxes and subsidies.
In the third group, the policies adopted in Thailand created an economic environment that
allowed the private sector to focus on investment. In Ghana and Tanzania, by contrast,
private investment was discouraged by direct controls. Thailand maintained a stable
exchange rate policy, with only a gradual depreciation of its currency against the dollar
of about 15% towards the end of the 1980s. Ghana and Tanzania, however, experienced sharp
swings in their exchange rates: in the first half of the 1980s their currencies
appreciated by more than 100%, and by the end of the same decade exchange rates had
fallen to less than 10% of their earlier peak levels.
These results show that a stable macroeconomic policy is an essential condition for
growth. In this respect, maintaining low inflation by restraining public expenditure and
adopting a prudent monetary policy is important. In addition, exchange rate policy must
allow some flexibility in order to preserve and promote export industries.
SACHS and WARNER (1996) estimated a convergence model in which the growth rate is
affected by the gap between the equilibrium level of income and its current level. It
emerges that the lower current income is relative to equilibrium income, the higher the
growth rate. Moreover, a country's equilibrium income level is influenced by economic
policy variables and by structural variables. The policy variables include trade
openness, market efficiency and the national saving rate. The structural variables
include the initial level of income and the abundance of natural resources. The results
also confirm the convergence hypothesis once these two types of variables are controlled
for. In addition, an analysis of the contribution of each factor to Africa's slow growth
identifies low trade openness and the low saving rate as the main causes.
For EASTERLY, W. and LEVINE, R. (1997), explaining growth differentials between countries
requires, beyond an understanding of the link between growth and government policies,
knowledge of the reasons why those policies are chosen. For them, low economic growth in
Africa is linked to low levels of education, political instability, an underdeveloped
financial system, distortions in foreign trade, large budget deficits and inadequate
infrastructure. These characteristics are largely determined by ethnic division.
BURNSIDE, C. and DOLLAR, D. (1997) examined the relationship between foreign aid, the
policies adopted by governments and the growth of GNP per capita. Their study covered a
sample of 56 developing countries observed over six four-year periods (1970-1993). The
results show that the policies with the greatest influence on economic growth are those
concerning fiscal policy, inflation and external openness. Aid has a positive impact on
growth in developing countries that pursue good fiscal, monetary and trade policies.
However, aid has no significant influence on these policies.
TAKATOSHI ITO (1997) stresses that the experiences of Japan and other Asian countries in
terms of economic performance share many common features. For him, economic growth is a
dynamic process with several stages. In a first stage, when conditions are normal, the
economy begins to take off from a stagnant state and then accelerates until growth
reaches a double-digit rate. This acceleration is generally accompanied by structural
changes such as the shift from a predominantly agricultural economy to one with strong
manufacturing activity, simple at first and then sophisticated. In a second stage, the
growth rate slows, the share of manufactured products in GDP having reached its plateau
and technology being at the frontier. The author points out that convergence models
capture the second stage well and that more attention should be paid to the first. From
an analysis of the Asian miracle (take-off, growth, structural change, productivity), he
emphasizes that securing fundamental rights and establishing social and political
stability appear to be particularly important conditions for the first stage.
For ALBERTO ALESINA (1997), institutional quality, measured by bureaucratic efficiency,
the absence of corruption, the protection of property rights and the rule of law, is
important for growth. Political stability and civil and economic liberties are therefore
essential. In countries where institutions are weak, public consumption does not
generally generate growth and even proves harmful. Moreover, in these countries public
consumption does not improve social indicators and does not contribute to reducing
poverty or income inequality. Furthermore, since foreign aid serves essentially to
increase public consumption, the author recommends that international organizations
require a plan of technical and financial assistance for countries that do not meet
minimum conditions of institutional quality. Cutting off assistance can, under certain
conditions, improve growth and foster social development in the medium term by creating
an incentive to build the structures needed for institutional development.
Lessons from the literature review
In sum, the work discussed in this overview of the literature shows that the concept of
growth has evolved over time and now takes account of concern for individual well-being.
Human capital, defined as the stock of economically productive knowledge embodied in
individuals, exerts externalities on certain productive sectors of the economy, which
allows it to influence the growth rate. In addition, the effect of human capital on
growth depends on the country's trade policy (VAROUDAKIS et al., 1997; SACERDOTI et al.,
1998). In this sense, contact with foreign technology and with development ideas devised
abroad fosters the emergence of better productive structures in developing countries
through imitation. Moreover, the distribution of education among individuals is another
factor which, where inequality is high, can inhibit the role of human capital (RAMON et
al., 1998). Finally, it should be stressed that macroeconomic and political stability is
necessary to encourage the private sector to invest in physical capital and thus to
accelerate growth.
Theoretical models include only a very limited number of factors in the macroeconomic
production function (generally two) and attribute the part not explained by them to
technical progress A. This severely limits the scope for action to promote economic
growth. The framework defined by theoretical models should therefore be broadened so as
to provide enough variables that can be manipulated to promote growth. Moreover, an
approach based solely on the direct factors of production does not make it possible to
derive policies for increasing those factors; it only indicates how the available factors
are used. Yet increased factor accumulation is necessary for long-run economic growth.
Furthermore, although the empirical work discussed highlights several relevant
institutional variables, these are not the same across authors. To capture the effect of
an institutional variable as cleanly as possible, it is necessary to control for all the
others that are likely to influence growth. For example, the effect of trade distortions
on growth cannot be measured properly without controlling for the quality of the economic
policy implemented in the country. A growth model that included only some of the main
variables that are a priori relevant would yield a biased measure of their effect.
Since the purpose of this study is to identify, in the economic context of Senegal, the
factors relevant to explaining variations in economic growth over time, we can now build
a methodological framework that makes it possible to identify the relevant variables and
to measure their net effect as accurately as possible. Our model will seek to take into
account, within a production function, the effects of human capital and of macroeconomic
policy variables.
CHAPTER THREE
EMPIRICAL STUDY OF GROWTH FACTORS:
METHODOLOGY AND RESULTS
Drawing on the lessons of the literature review on growth, we can define an analytical
framework for identifying the factors of economic growth in Senegal. We first specify a
model explaining variations in the annual growth rate of gross domestic product (GDP) per
capita; we then present the econometric technique used to estimate the model.
3.1. METHODOLOGY
3.1.1. Specification of the growth model
To build the econometric model used to identify the main factors that explain variations
in the GDP growth rate of the Senegalese economy over time, we adopt the theoretical
framework of a macroeconomic production function.
3.1.1.1. Theoretical framework for the analysis of growth
We start from the production function of the Solow model. We then take into account human
capital, followed by other factors that influence the growth rate through total factor
productivity.
a) Model with the Solow residual
Consider a standard neoclassical production function: $Y_t = A_t F(K_t, L_t)$, where
$A_t$ is an index of the level of technology and represents Total Factor Productivity
(TFP). This form of the function assumes that technology augments output (Hicks-neutral
technical progress). Taking logarithms of both sides and differentiating with respect to
time yields a decomposition of the growth rate of aggregate output:

$$\frac{\dot{Y}}{Y} = \frac{\dot{A}}{A} + \left(\frac{A F_K}{Y}\right)\dot{K} + \left(\frac{A F_L}{Y}\right)\dot{L}$$

Multiplying and dividing the first term in parentheses by $K$ and the second term in
parentheses by $L$, we obtain:

$$\frac{\dot{Y}}{Y} = \frac{\dot{A}}{A} + \left(\frac{A F_K K}{Y}\right)\frac{\dot{K}}{K} + \left(\frac{A F_L L}{Y}\right)\frac{\dot{L}}{L}$$

This relation is easy to interpret when factor markets are competitive. In that case, the
marginal product of each factor equals its price, so $A F_K$ equals the price of capital
$r$ and $A F_L$ equals the wage rate $w$. It follows that the term $(A F_K K / Y)$
represents the share of national income paid to invested capital and $(A F_L L / Y)$ the
share of national income going to wages.
This equation assumes that variations in the growth rate of aggregate output are
explained by those of the growth rates of the two factors of production. The part of
output growth that is due neither to the growth of physical capital nor to that of the
labour force corresponds to TFP growth. From this equation we obtain the following
econometric model:

$$g_t = c + a\,k_t + b\,l_t + u_t \qquad (1)$$

where $g$ is the GDP growth rate, $c$ the constant term, $k$ and $l$ the growth rates of
physical capital and the labour force respectively, and $u$ the residual term of the
model. Once this model has been estimated, the TFP growth rate $(\Delta A / A)_t$ can be
approximated by $c + u_t$. In this model, TFP is assumed to be exogenous in the sense of
Solow. However, to take account of its endogenous character, highlighted by endogenous
growth theories, we consider that TFP is determined by various factors, including human
capital. A model that isolates the effect of human capital is therefore a better approach
to identifying growth factors; such a model reduces the elements contained in the
residual of model (1) (the Solow residual).
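As an illustration of model (1), the following minimal sketch fits the regression by ordinary least squares and approximates TFP growth by $c + u_t$. The function name `solow_residual` and the sample growth rates are hypothetical; they are not taken from the study's data.

```python
import numpy as np


def solow_residual(g, k, l):
    """Fit g_t = c + a*k_t + b*l_t + u_t by OLS.

    Returns (c, a, b, tfp_growth), where tfp_growth_t = c + u_t.
    """
    g, k, l = (np.asarray(x, dtype=float) for x in (g, k, l))
    # Design matrix: constant, capital growth, labour growth.
    X = np.column_stack([np.ones_like(g), k, l])
    coef, *_ = np.linalg.lstsq(X, g, rcond=None)
    c, a, b = coef
    u = g - X @ coef          # OLS residuals u_t
    return c, a, b, c + u     # TFP growth approximated by c + u_t


# Hypothetical annual growth rates (illustration only, not Senegalese data).
k = np.array([0.03, 0.05, 0.02, 0.04, 0.06, 0.01])
l = np.array([0.025, 0.027, 0.026, 0.028, 0.024, 0.029])
g = np.array([0.030, 0.041, 0.022, 0.038, 0.043, 0.024])

c, a, b, tfp_growth = solow_residual(g, k, l)
```

By construction, `tfp_growth` equals the part of `g` left unexplained by `a*k + b*l`, which is how the text reads the Solow residual.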
b) Human capital model
We consider the human capital production function of Romer, D. (1997). We modify this
model to take account of factors that operate through TFP, following Ramon (1998).
The production function is given by:

$$Y = V^{a} K^{\alpha} H^{\beta} (AL)^{1-\alpha-\beta-a}, \qquad \alpha > 0,\ \beta > 0,\ \alpha + \beta < 1,\ a \text{ arbitrary}.$$

$Y$ is total output obtained from physical capital ($K$), human capital ($H$) and
effective labour ($AL$); $V$ is a vector of variables that influence Total Factor
Productivity apart from human capital.
Let $k = K/AL$, $h = H/AL$, $y = Y/AL$ and $v = V/AL$.
The production function per unit of effective labour is then written: $y = v^{a} k^{\alpha} h^{\beta}$.
Applying the total differential rule, we find:

$$\frac{\Delta y}{y} = a\,\frac{\Delta v}{v} + \alpha\,\frac{\Delta k}{k} + \beta\,\frac{\Delta h}{h} \qquad (2)$$
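The step from the production function to equation (2) can be spelled out as follows. This is a sketch that assumes the exponents on $V$, $K$ and $H$ are written $a$, $\alpha$ and $\beta$, and that $\Delta x / x$ stands for a small relative change in $x$.

```latex
\begin{aligned}
y = \frac{Y}{AL}
  &= V^{a} K^{\alpha} H^{\beta} (AL)^{-a-\alpha-\beta}
   = \left(\frac{V}{AL}\right)^{a}
     \left(\frac{K}{AL}\right)^{\alpha}
     \left(\frac{H}{AL}\right)^{\beta}
   = v^{a} k^{\alpha} h^{\beta}, \\
\ln y &= a \ln v + \alpha \ln k + \beta \ln h, \\
\frac{dy}{y} &= a\,\frac{dv}{v} + \alpha\,\frac{dk}{k} + \beta\,\frac{dh}{h}.
\end{aligned}
```

Replacing the differentials by annual changes gives the growth-rate form used in the econometric models.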
Equation (1), which provides a decomposition of GDP per worker, is modified to obtain an
econometric growth model whose equation is as follows:
$a_0$ is the constant term of the model and $u_t$ the residual term;
TXPIBT is the annual growth rate of Gross Domestic Product per worker;
TXCAPITALT is the annual growth rate of the physical capital stock per worker;
TXCAPHUM is the annual growth rate of human capital.
V is a vector of factors that may influence TFP apart from human capital.
Although model (2) is a better approach than the Solow model, it does not take into
account the influence of the economic policy implemented in the country on the use of the
factors of production. An unstable macroeconomic environment can lead to inefficient use
of factors or to under-utilization of production capacity. It is therefore relevant to
introduce economic policy and climatic environment variables into the growth model.
c) Taking the macroeconomic environment into account in the model
This is the human capital model (model 2) augmented with variables related to the quality
of macroeconomic policy, public consumption, external openness (export performance, to
capture trade distortions) and climatic hazards. These variables are introduced because
of the importance attributed to them both in endogenous growth theories and in the
empirical work examined in the literature review. We assume that these variables enter
the model to represent the variable V of equation (2).

$$\mathrm{TXPIBT}_t = a_0 + a_1 \mathrm{TXCAPITALT}_t + a_2 \mathrm{TXCAPHUM}_{it} + a_3 \mathrm{EXPORT}_t + a_4 \mathrm{QUALMACRO}_t + a_5 \mathrm{TXCONSG}_t + a_6 \mathrm{SECHER}_t + z_t \qquad (3)$$
EXPORT is the variable representing external openness;
QUALMACRO is the variable representing the quality of macroeconomic policy;
TXCONSG is the growth rate of public consumption;
SECHER is a dummy variable intended to capture the possible effect of climate on growth
in years marked by severe drought. It should be noted that introducing rainfall data did
not yield conclusive results, which led us to prefer this dummy variable; it proves
significant in our analyses and captures the effect of severe droughts on growth.
3.1.1.2. Description of the model variables
The dependent variable: the growth rate of GDP per worker
The dependent variable of the model is the growth rate of GDP per worker. It is equal to
the annual relative change in the ratio of GDP to the active population. GDP is measured
in real terms at constant 1987 prices.
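The definition of the dependent variable can be sketched as below. The function name and the figures are hypothetical and serve only to show the calculation of the annual relative change in GDP per worker.

```python
def growth_rate_per_worker(gdp, active_population):
    """Annual relative change of the ratio GDP / active population."""
    per_worker = [y / n for y, n in zip(gdp, active_population)]
    # Growth between consecutive years: ratio_t / ratio_{t-1} - 1.
    return [cur / prev - 1.0 for prev, cur in zip(per_worker, per_worker[1:])]


# Hypothetical constant-price GDP and active population (illustration only).
gdp = [1000.0, 1050.0, 1071.0]
active = [400.0, 410.0, 420.0]
rates = growth_rate_per_worker(gdp, active)
```

A series of n annual levels yields n - 1 growth rates, so the first year of the sample has no growth observation.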
The labour factor
The labour factor is represented by the total active population of Senegal. The series of
annual growth rates of the active population is used to estimate model (1).
Physical capital
It represents a valuation of all the productive buildings and equipment available in the
economy at a given time. All expenditure incurred during a period in order to increase it
constitutes the investment expenditure for that period. Investment may be carried out by
the public sector or by the private sector.
We compute a physical capital series using the perpetual inventory method, which consists
in cumulating gross fixed investment figures and correcting the result by an estimate of
the depreciation of the existing stock. This method assumes that, for two dates $t$ and
$t-1$, investment ($I_t$) and physical capital ($K_t$) are linked by a relation of the
type:

$$K_t = K_{t-1} + I_t - \delta K_{t-1}$$

This means that physical capital at date $t$ ($K_t$) equals physical capital at date
$t-1$, plus the investment $I_t$ made between the two dates, minus the depreciation of
the initial physical capital over period $t$ ($\delta K_{t-1}$). $\delta$ is the
depreciation coefficient of physical capital.
To generate the physical capital series from this iterative relation, we take
$\delta = 5\%$³ and use the following initial value for the capital coefficient:⁴
The analysis of the ratio of physical capital per head (capital intensity, $K/L$) will
make it possible to check whether, over time, each worker has more and more material
capacity with which to contribute to production.
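The perpetual inventory recursion described above can be sketched as follows. The function name, the investment figures and the two initial stocks are hypothetical; only the recursion $K_t = (1 - \delta) K_{t-1} + I_t$ and $\delta = 5\%$ come from the text.

```python
def perpetual_inventory(investment, k0, delta=0.05):
    """Build a capital stock series with K_t = (1 - delta) * K_{t-1} + I_t.

    investment: gross investment I_t for t = 1..T (constant prices)
    k0: initial capital stock K_0 (an assumption supplied by the user)
    """
    stocks = []
    k_prev = k0
    for i_t in investment:
        k_t = (1.0 - delta) * k_prev + i_t
        stocks.append(k_t)
        k_prev = k_t
    return stocks


# Hypothetical investment flows and two alternative initial stocks.
investment = [10.0, 12.0, 11.0]
series = perpetual_inventory(investment, k0=100.0)
series_alt = perpetual_inventory(investment, k0=120.0)

# The gap between the two series shrinks by a factor (1 - delta) each period.
gap = series_alt[-1] - series[-1]
```

Because the gap between two starting values decays geometrically, the choice of the initial stock matters less the further back the recursion starts, which is the rationale given in footnote 4 for starting in 1960.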
3 This value of the depreciation coefficient has already been used by Sacerdoti et al. (1998) and by Berthelemy et al.
(1996).
4 Sacerdoti et al. (1998) have already used this initial relation for the year 1970 for Senegal and other West
African countries. We go back to 1960 because the iterative relation could already be stationary from 1970
onwards (fairly rapid convergence of the iterative relation), which makes the results obtained for the period
1970-1997, the period of interest for our study, more robust.
Human capital
Defined as the set of skills and qualifications held by the workers of an economy, human
capital appears as a factor that improves the productivity of the labour force. It can
therefore explain part of the effects traditionally attributed to TFP. The
multidimensional nature of the notion of human capital should lead us to take into
account several variables related to the development of human resources, in particular
variables reflecting the efforts made in health and education. However, the lack of
long-period data on health expenditure or health care coverage forces us to retain only
education variables. We have taken three stock variables into account. Although flow
variables (for example the primary or total enrolment rate) capture the effect of the
efforts made each year to develop human resources, the data are not available over a long
period. Stock variables show the effect of all the efforts made up to that point. These
efforts have made it possible to accumulate the human capital whose stock is used in
economic activity. The variables are:
The average number of years of schooling per individual in the active population (in the
primary cycle and in the education system as a whole);
An index of the qualification of the modern labour force, calculated by indexing civil
service wages on the level of education. The calculation of this index assumes that wages
are a good estimate of the productivity of the human capital held by workers.
Productivity is assumed to increase with human capital. This index was calculated for
Senegal by Sacerdoti et al. (1998), who used the wages paid in the civil service.
The external openness variable
External openness is captured by the importance of exports in the national economy. We
have approximated it by the contribution of exports to GDP growth. For a given year, this
contribution is defined as the product of the annual growth rate of exports and the share
of exports in GDP in the previous year. Exports and GDP are valued at constant 1987
prices. Export activities are expected to have a positive relationship with economic
growth through the incentive to international competitiveness that they create. The
development of export activities leads to a better allocation of resources according to
comparative advantage, allows fuller use of production capacity and prompts constant
improvement of technology in the face of foreign competition. This tends to improve
productivity and hence to increase total output per worker.
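The definition of the export contribution can be illustrated with a short sketch. The function name and the figures are hypothetical; the formula (export growth rate in year t multiplied by the export share of GDP in year t-1) is the one stated above.

```python
def export_contribution(exports, gdp):
    """Contribution of exports to GDP growth, year by year.

    contribution_t = (X_t / X_{t-1} - 1) * (X_{t-1} / GDP_{t-1}),
    with exports X and GDP both at constant prices.
    """
    contributions = []
    for t in range(1, len(exports)):
        export_growth = exports[t] / exports[t - 1] - 1.0
        lagged_share = exports[t - 1] / gdp[t - 1]
        contributions.append(export_growth * lagged_share)
    return contributions


# Hypothetical constant-price series (illustration only).
exports = [200.0, 220.0, 209.0]
gdp = [1000.0, 1050.0, 1100.0]
contrib = export_contribution(exports, gdp)
```

Algebraically, the product simplifies to the change in exports divided by the previous year's GDP, so the variable measures how many points of GDP growth are accounted for by exports.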
Export performance also reflects the extent of trade distortions. Strong protection gives
protected sectors no incentive to improve their productivity, because of the absence of
competition. The result is low competitiveness both on the domestic market and on foreign
markets.
Macroeconomic management quality index
A macroeconomic policy that generates the fewest macroeconomic imbalances favours growth.
The quality of macroeconomic policy is estimated using three ratios related respectively
to fiscal policy, monetary policy and external debt: public deficit including grants /
GDP, money supply M2 / GDP net of the growth rate of the economy, and total debt / GDP.
For each year, the index gives a measure of the quality of economic policy compared with
that implemented in the other years of the period 1970-1997. A good-quality macroeconomic
policy is characterized by very low values of these ratios.
For each of the three ratios, we ranked the 28 years of the study period (1970-1997)
according to the size of its values. After ranking, the year with the lowest rank (rank
1) is the one that recorded the highest value of the ratio considered; it is therefore
the year in which the economic policy underlying the ratio was the worst over the whole
period. Conversely, this policy was most favourable in the year with the highest rank
(rank 28). The rank variable derived from each ratio thus increases as the economic
policy related to the ratio improves. This yields the rank variables Rdéficit, Rmonnaie
and Rdette.
The macroeconomic quality index (QUALMACRO) is then defined, for each year, as the simple
arithmetic mean of that year's values for the three rank variables Rdéficit, Rmonnaie and
Rdette⁵. By construction, this index increases with the quality of macroeconomic policy.
The aggregates used in this estimation are measured at constant 1987 prices.
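The construction of QUALMACRO can be sketched as follows. The function names and the four-year sample are hypothetical, and ties are broken by order of appearance, which is an assumption the text does not address.

```python
def rank_worst_first(values):
    """Rank years so that rank 1 is the largest ratio (worst policy year)
    and the highest rank is the smallest ratio (best policy year)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, year_index in enumerate(order, start=1):
        ranks[year_index] = rank
    return ranks


def qualmacro(deficit_ratio, money_ratio, debt_ratio):
    """Simple arithmetic mean of the three rank variables for each year."""
    r_deficit = rank_worst_first(deficit_ratio)
    r_money = rank_worst_first(money_ratio)
    r_debt = rank_worst_first(debt_ratio)
    return [(d + m + b) / 3.0 for d, m, b in zip(r_deficit, r_money, r_debt)]


# Hypothetical ratios for four years (illustration only).
deficit = [0.05, 0.02, 0.08, 0.03]
money = [0.30, 0.25, 0.35, 0.20]
debt = [0.60, 0.40, 0.80, 0.50]
index = qualmacro(deficit, money, debt)
```

In this sample the third year has the largest value of all three ratios and therefore receives the lowest index, while years with small ratios score higher, matching the statement that the index increases with policy quality.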
Public consumption
The data on public consumption include the State's current expenditure and operating
expenditure: wages, expenditure on education, health and other social services, and
maintenance expenditure. They exclude military expenditure and capital expenditure.
Public consumption is measured at constant 1987 prices.
5 Wacziarg (1998) used such an approach to construct a macroeconomic policy quality variable;
he classified the years into decile classes for the three ratios and then defined the quality variable
as the mean of each year's ranks for the three ratios. It should be noted, however, that it is
preferable to use the inflation rate instead of the M2/GDP ratio, and that the public deficit excluding
grants gives a better assessment of the State's expenditure policy.
The climatic hazard variable
This is a dummy variable that takes the value 1 for years marked by severe drought and 0 for years with no major climatic event.
Information on drought years was obtained from the agricultural studies unit of the Direction de la Prévision et de la Statistique of Senegal.
We preferred a dummy variable to a measure of rainfall because we want to capture the effect of the exogenous shocks that droughts represent. In addition, introducing rainfall data did not give conclusive results.
3.1.1.3. Models to be estimated
The analysis of the determinants of growth proceeds in several stages. At each stage, we introduce a new variable in order to test the relevance of various hypotheses from growth theory for the case of Senegal. The following models are estimated:
$$g_t = c + a\,k_t + b\,l_t + u_t \qquad \text{(I)}$$
where $g$ is the growth rate of GDP, and $k$ and $l$ are the growth rates of physical capital and of the labour force, respectively. This estimation is intended to check the relevance of the neoclassical production function for the Senegalese economy.
$$\mathrm{TXPIBT}_t = a_0 + a_1\,\mathrm{TXCAPITALT}_t + a_2\,\mathrm{TXCAPHUM}_{it} + v_t \qquad \text{(II)}$$
This is a simple model with human capital.
$$\mathrm{TXPIBT}_t = a_0 + a_1\,\mathrm{TXCAPITALT}_t + a_2\,\mathrm{TXCAPHUM}_{it} + a_3\,\mathrm{EXPORT}_t + a_4\,\mathrm{QUALMACRO}_t + a_5\,\mathrm{TXCONSG}_t + a_6\,\mathrm{SECHER}_t + z_t \qquad \text{(III)}$$
This model looks for the factors that influence total factor productivity (TFP): human capital, trade openness, the quality of macroeconomic management, public expenditure policy and climatic hazards.
In these models, we assume that:
$$a_4 > 0, \quad a_5 > 0, \quad a_6 < 0$$
where:
PIBT: gross domestic product per worker.
TXCAPITALT: growth rate of physical capital per worker.
EXPORT: growth rate of exports, weighted by their share in GDP.
QUALMACRO: index of the quality of macroeconomic policy.
CAPHUM: the human capital variable; it is represented successively by the following three variables: HUMPRI, the average number of years spent by a worker in primary school; HUMTOT, the average number of years spent by a worker in the education system as a whole; HUMREV, an index of worker qualification estimated from the evolution of civil service wages.
TXCAPHUM: annual growth rate of the variable CAPHUM.
TXCONSG: growth rate of public consumption.
SECHER: the drought dummy variable.
3.1.2. Estimation technique

The model is estimated by ordinary least squares (OLS) over the period 1971-1997. Before estimating the models, we check the basic assumptions required for OLS estimation. In particular, we carry out the Ramsey RESET specification test and the unit root test.
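As an illustration of the OLS step, the sketch below solves the normal equations for a regression with a constant and two regressors, in the spirit of model (I). It is a pure-Python sketch under stated assumptions: the helper names `solve` and `ols` and the six data points are hypothetical, and the study itself used E-views rather than this code.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, columns):
    """Regress y on a constant and the given columns; return (coefficients, R squared)."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical growth rates (illustration only): g = 1 + 2 k - 0.5 l exactly.
k_growth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
l_growth = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
g_growth = [1 + 2 * k - 0.5 * l for k, l in zip(k_growth, l_growth)]

coefficients, r_squared = ols(g_growth, [k_growth, l_growth])
```

Because the sample data are generated without noise, the estimated coefficients recover the constant and the two slopes, and the R² is 1 up to rounding error.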
3.1.2.1. Ramsey RESET specification test
The Ramsey RESET specification test checks whether there are variables that we should have included in the model. The idea of the test is that if the model suffers from the omission of a relevant variable, then it is possible to add one (or more) auxiliary variables to the model; such a variable is the one that, apart from the model's explanatory variables, best explains the dependent variable. The test then consists in checking the significance of the auxiliary variable: if it is not significant, we conclude that the specification is complete and takes into account all the relevant variables that explain the dependent variable. If, on the other hand, the auxiliary variable has a significant effect, we will look for other macroeconomic aggregates likely to influence variations in the growth rate of GDP per worker.
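In the usual form of the RESET test, the auxiliary variables are powers of the fitted values from the original regression, and their joint significance is assessed with an F statistic. The sketch below follows that form. The helper names and the ten data points are hypothetical, and the resulting statistic would still have to be compared with an F critical value, which is not computed here.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, rows):
    """OLS of y on the regressor rows; return (coefficients, fitted values, SSR)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    ssr = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return beta, fitted, ssr


def reset_f(y, rows, max_power=2):
    """F statistic for adding powers 2..max_power of the fitted values."""
    n, k = len(y), len(rows[0])
    _, fitted, ssr_restricted = ols(y, rows)
    augmented = [r + [f ** p for p in range(2, max_power + 1)]
                 for r, f in zip(rows, fitted)]
    _, _, ssr_unrestricted = ols(y, augmented)
    q = max_power - 1  # number of added auxiliary variables
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k - q))


# Hypothetical data: the true relation is quadratic, but the fitted model is linear.
x = [float(v) for v in range(1, 11)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
y = [v ** 2 + e for v, e in zip(x, noise)]
rows = [[1.0, v] for v in x]

f_stat = reset_f(y, rows)
```

Here the linear specification omits the squared term, so the squared fitted values absorb most of the residual variation and the F statistic is very large, which is the signal of misspecification the test looks for.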
3.1.2.2. Stationarity and the unit root test
Another condition required for OLS estimation of a model using time series is that each variable in the model be stationary. A stationary time series is one for which:
The mean is constant and independent of time;
The variance (the amplitude of the variations) is finite and independent of time;
The covariance between its values at two dates t and t+k does not depend on t, but only on the length k of the interval separating these two dates.
It follows that a stationary series has no trend, no seasonality, and no factor that evolves with time. The importance of the stationarity condition lies in the fact that a shock to a stationary series has only a transitory effect: the series returns to its mean. By contrast, in a series with a trend or a time-dependent component (a non-stationary series), and in particular in a series with a unit root, the effect of a shock does not die out. Under these conditions, it is difficult to identify clearly the effect of another series on the variations of a non-stationary series. This is what leads to spurious regressions in models that include non-stationary variables.
To check the stationarity of a time series, one can use the plot of the simple and partial autocorrelation functions (the correlogram). There are also hypothesis tests that allow more precise conclusions: the unit root tests. These tests were first developed by Dickey and Fuller and were later refined to give the Augmented Dickey-Fuller (ADF) tests. The Phillips-Perron test is a further improvement on the ADF test.
For a given series $X_t$, unit root tests test the hypothesis $H_0: \phi = 1$ in any one of the following forms; failing to reject $H_0$ means that the series is non-stationary:
$$X_t = \phi X_{t-1} + \varepsilon_t \qquad \text{(1): autoregressive model of order 1}$$
$$X_t = \phi X_{t-1} + c + \varepsilon_t \qquad \text{(2): autoregressive model with constant}$$
$$X_t = \phi X_{t-1} + bt + c + \varepsilon_t \qquad \text{(3): autoregressive model with trend}$$
Setting $\rho = \phi - 1$, the unit root test consists in testing the significance of $\rho$ in one of the following three models:
$$\Delta X_t = \rho X_{t-1} - \sum_{j=2}^{p} \phi_j \,\Delta X_{t-j+1} + \varepsilon_t$$
$$\Delta X_t = \rho X_{t-1} - \sum_{j=2}^{p} \phi_j \,\Delta X_{t-j+1} + c + \varepsilon_t$$
$$\Delta X_t = \rho X_{t-1} - \sum_{j=2}^{p} \phi_j \,\Delta X_{t-j+1} + bt + c + \varepsilon_t$$
(we take a lag order of $p = 4$)
If $\rho$ is not significant, we do not reject $H_0: \rho = \phi - 1 = 0$, and hence $\phi = 1$; the series $X_t$ is then non-stationary. This is the case when the t-statistic associated with the coefficient $\rho$ is greater than the critical value of the test at the chosen threshold. We use the Phillips-Perron unit root test, which takes into account possible heteroskedasticity in the errors of the test regressions. In practice, the E-views software reports the empirical value of the test statistic together with the theoretical values corresponding to the 5% and 1% thresholds. If the absolute value of the empirical statistic is greater than the absolute value of the theoretical value for a given threshold α, we conclude that the series is stationary, with a risk of error equal to α; otherwise, the series is non-stationary and the test is repeated on its first difference, then on its second difference, and so on, until a stationary series is obtained.
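The decision rule can be illustrated with the simplest Dickey-Fuller regression, the model with a constant and no lagged differences. This is a deliberately reduced sketch and not the Phillips-Perron procedure used in the study: the function names are hypothetical, the simulated series are illustrative, and the critical value must be supplied by the user (it depends on the model form and the sample size).

```python
import random


def dickey_fuller_t(x):
    """t-statistic of rho in: dx_t = c + rho * x_{t-1} + e_t (no lagged differences)."""
    lagged = x[:-1]
    dx = [b - a for a, b in zip(x[:-1], x[1:])]
    n = len(dx)
    mean_l = sum(lagged) / n
    mean_d = sum(dx) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (d - mean_d) for l, d in zip(lagged, dx))
    rho = sxy / sxx
    c = mean_d - rho * mean_l
    ssr = sum((d - c - rho * l) ** 2 for l, d in zip(lagged, dx))
    standard_error = (ssr / (n - 2) / sxx) ** 0.5
    return rho / standard_error


def is_stationary(x, critical_value):
    """Reject the unit root when the statistic lies below the (negative) critical value."""
    return dickey_fuller_t(x) < critical_value


# Simulated series (illustration only): white noise and its cumulative sum.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
walk = []
level = 0.0
for shock in noise:
    level += shock
    walk.append(level)

t_noise = dickey_fuller_t(noise)
t_walk = dickey_fuller_t(walk)
```

The white-noise series yields a strongly negative statistic, far below any conventional critical value, whereas the random walk built from the same shocks yields a statistic much closer to zero.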
When the variables of a model are not stationary, one has to find the order of differencing that makes each of them stationary. If a variable X must be differenced d times to become stationary, X is said to be integrated of order d. In that case, one checks whether the variables have the same order of integration and seeks to specify an error correction model (a short-run model); the model is then estimated by the cointegration method.
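The notion of order of integration can be made concrete with a small differencing helper. The sketch below is illustrative only: `integration_order` takes any stationarity check as an argument, and the predicate used in the example (the series is constant) is a placeholder standing in for a real unit root test.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series[:-1], series[1:])]
    return series


def integration_order(series, is_stationary, max_d=2):
    """Smallest d such that the d-th difference passes the stationarity check."""
    for d in range(max_d + 1):
        if is_stationary(difference(series, d)):
            return d
    return None  # not stationary after max_d differences


# Placeholder check for illustration: a constant series is treated as stationary.
def constant_series(values):
    return len(set(values)) == 1


squares = [1, 4, 9, 16, 25]
first = difference(squares)        # [3, 5, 7, 9]
second = difference(squares, 2)    # [2, 2, 2]
order = integration_order(squares, constant_series)
```

With this placeholder, the series of squares needs two differences before it becomes constant, so the helper returns 2.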
However, the unit root tests showed that all the variables of our model are stationary in levels; the regression can therefore be run without risk of error arising from the stochastic properties of the series used.
3.1.3. Data sources

The data on macroeconomic aggregates come from the macroeconomic database of the Direction de la Prévision et de la Statistique of Senegal. The data on the three stock variables representing human capital come from the series calculated by Sacerdoti et al. The data on the GDP of France and of the United States, on the exchange rate and on Senegal's annual money supply are taken from the 1998 edition of the International Monetary Fund's "International Financial Statistics" yearbooks. Information on the public deficit and public debt is taken from the World Bank's "World Tables, 1995" and "African Development Indicators, 1998/1999".
The reliability of these data is taken as given, since these sources have long been used in economic studies that produced conclusive results. We can therefore proceed to the successive stages of the empirical analysis.
3.2. RESULTS
This section presents the results of the study and analyses them from an economic standpoint.
3.2.1. Estimation of the growth models
3.2.1.1. Analysis of the stationarity of the variables
The Phillips-Perron (PP) test applied to the time series corresponding to the variables used in our models showed that they are all stationary in levels. They are therefore integrated of order 0. Consequently, they can be used to estimate the models by ordinary least squares without risk of spurious regression. Table 3.1 below shows that, for each variable, the absolute value of the PP test statistic is greater than the absolute value of the MacKinnon critical value.
Table 3.1: Results of the unit root tests on the variables of the study (PP test).
Variable      PP test statistic   Critical value (5%)   Critical value (1%)   Order of integration
TXPIBT              -6.76               -2.97                 -3.70                    0
EXPORT              -9.69               -2.97                 -3.69                    0
QUALMACRO           -3.06               -2.97                 -3.69                    0
TXCAPITALT          -5.16               -2.97                 -3.70                    0
TXCONSG             -4.84               -2.97                 -3.70                    0
TXHPRI              -6.42               -2.97                 -3.70                    0
TXHTOT              -7.92               -2.97                 -3.70                    0
TXHREV              -8.29               -2.97                 -3.70                    0
GAPUSA              -3.82               -2.97                 -3.69                    0
GAPFRANCE           -3.90               -2.97                 -3.69                    0
SECHER              -3.85               -2.97                 -3.69                    0
Chapter 3: Empirical study of growth factors: Methodology and Results
3.2.1.2. Analysis of multicollinearity among the explanatory variables
Table 3.2 below gives the simple correlation coefficients between the variables of the study, taken two at a time. For the models to be estimable and interpretable, there must be no multicollinearity among the explanatory variables that enter the same model. The test we use consists in checking that the squares of all the simple correlation coefficients for pairs of explanatory variables in a given model are smaller than the coefficient of determination R² of that model. The correlation matrix is presented below:
Table 3.2. Matrix of simple correlations between the variables of the growth model
               TXPIBT     TXHTOT     TXHREV     TXHPRI    TXCONSG TXCAPITALT     GAPUSA  GAPFRANCE     EXPORT  QUALMACRO
TXPIBT       1.000000  -0.065094   0.284465   0.164568   0.112726  -0.376189   0.081171   0.149922   0.706372   0.156178
TXHTOT      -0.065094   1.000000  -0.143104   0.184153  -0.216263  -0.062465   0.198379   0.217770   0.034751  -0.064682
TXHREV       0.284465  -0.143104   1.000000  -0.055949   0.171867  -0.046733   0.009514   0.019254   0.008447   0.102510
TXHPRI       0.164568   0.184153  -0.055949   1.000000  -0.166263   0.102802   0.103086   0.238240   0.332503  -0.198337
TXCONSG      0.112726  -0.216263   0.171867  -0.166263   1.000000  -0.102690  -0.330252  -0.334362  -0.136968   0.148706
TXCAPITALT  -0.376189  -0.062465  -0.046733   0.102802  -0.102690   1.000000  -0.033075  -0.004805  -0.304289   0.044491
GAPUSA       0.081171   0.198379   0.009514   0.103086  -0.330252  -0.033075   1.000000   0.887827  -0.019402  -0.257994
GAPFRANCE    0.149922   0.217770   0.019254   0.238240  -0.334362  -0.004805   0.887827   1.000000   0.044510  -0.101442
EXPORT       0.706372   0.034751   0.008447   0.332503  -0.136968  -0.304289  -0.019402   0.044510   1.000000  -0.035116
QUALMACRO    0.156178  -0.064682   0.102510  -0.198337   0.148706   0.044491  -0.257994  -0.101442  -0.035116   1.000000
The table shows that the correlations between explanatory variables are fairly weak overall, the main exception being the pair GAPUSA and GAPFRANCE (0.89). After estimating each model, we nevertheless carry out the multicollinearity test using the coefficients in this table and the coefficient of determination R² obtained. If we conclude that there is no multicollinearity among the explanatory variables of a model, then its coefficients can be interpreted as "all other things being equal" effects.
3.2.1.3. Results of the estimation of the models
We estimated the four nested models specified in the previous chapter. Model (I) makes it possible both to test the Solow model and to analyse the evolution of total factor productivity (TFP). For each of the other three equations, we ran a series of three regressions that differ in the measure used for human capital. In the first regression (1), human capital is the series of the average number of years of primary schooling per worker; the second (2) represents it by the average number of years of schooling a worker has spent in the education system as a whole; in the third regression (3), it is represented by a measure of schooling indexed on the wages paid in the civil service. As stated in chapter 3, these three human capital variables were defined and calculated by Sacerdoti et al. (1998). In the tables presenting these results, we report for each variable the value of the corresponding estimated coefficient and, in parentheses, the value of the associated Student t-statistic. A coefficient is significant at the 5% threshold (resp. 10%) if the absolute value of the associated t-statistic is greater than 1.96 (resp. 1.64). For the diagnostic tests, we report the value of the corresponding empirical statistics (White, Breusch-Godfrey, Ramsey RESET and Jarque-Bera) and, in parentheses, the associated p-value. A model is judged satisfactory with respect to a given diagnostic test if the associated p-value is greater than the chosen significance threshold (5% in this study).
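The two reading rules stated above, one for t-statistics and one for diagnostic p-values, can be written down as a small sketch. The function names are hypothetical; the thresholds 1.96, 1.64 and 5% are those given in the text, and the sample values are taken from Table 3.3.

```python
def significance(t_stat):
    """Classify a coefficient using the thresholds stated in the text."""
    magnitude = abs(t_stat)
    if magnitude > 1.96:
        return "5%"
    if magnitude > 1.64:
        return "10%"
    return "not significant"


def passes_diagnostic(p_value, threshold=0.05):
    """A diagnostic test is passed when its p-value exceeds the threshold."""
    return p_value > threshold


# t-statistics of TXCAPITALT in the three regressions of model (II), Table 3.3.
capital_levels = [significance(t) for t in (-2.169, -2.024, -1.84)]

# t-statistic of TXHPRI in regression (1) of model (II), Table 3.3.
human_capital_level = significance(1.153)

# p-value of the Ramsey RESET test in regression (2) of model (II), Table 3.3.
reset_ok = passes_diagnostic(0.082)
```

Applied to these values, the rule marks physical capital as significant at 5% in regressions (1) and (2) and at 10% in regression (3), and marks primary schooling as not significant.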
a) Estimation of model (I)
The results are:
$$\mathrm{TXPIB} = \underset{(-0.143)}{-2.3} + \underset{(0.502)}{2.54}\,\mathrm{TXLABOR} - \underset{(-0.396)}{0.71}\,\mathrm{TXCAPTL}$$
R² = 2%; DW = 2.24; F = 0.31 (0.73); Jarque-Bera = 0.38 (0.826).
Correlation matrix of the variables
            TXPIB    TXLABOR   TXCAPTL
TXPIB       1
TXLABOR     0.137    1
TXCAPTL    -0.123   -0.348     1
These results suggest that neither of the two traditional factors of production significantly explains the variations in the growth rate of the economy's total output. The coefficient on the growth rate of physical capital even has a negative sign. In fact, the R² (2%) and the Fisher F statistic (0.31) show that the model has no explanatory power. Moreover, although the residuals are not autocorrelated, the square of the correlation coefficient between the two explanatory variables is greater than the R² of the model, which indicates strong collinearity between the two explanatory variables. This model therefore does not give interpretable results.
We should then eliminate one of the strongly correlated explanatory variables. However, since our model has only two explanatory variables, eliminating one of them would leave a model with a single explanatory variable, and there is no theoretical foundation for a one-factor production function. For this reason, we do not correct for multicollinearity, and our analysis focuses on the results of the other models.
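The multicollinearity check used here, sometimes called Klein's rule, compares each squared pairwise correlation with the model's R². A minimal sketch follows. The function name is hypothetical; the first call uses the values reported for model (I), and the second uses the TXCAPITALT and TXHPRI correlation from Table 3.2 together with the R² of 18.6% reported for regression (1) of model (II) in Table 3.3.

```python
def collinear_pairs(correlations, r_squared):
    """Return the pairs whose squared correlation exceeds the model's R squared."""
    return [pair for pair, r in correlations.items() if r ** 2 > r_squared]


# Model (I): correlation between the two regressors is -0.348, R squared is 2%.
model_1 = collinear_pairs({("TXLABOR", "TXCAPTL"): -0.348}, 0.02)

# Model (II), regression (1): correlation 0.102802 (Table 3.2), R squared 18.6%.
model_2 = collinear_pairs({("TXCAPITALT", "TXHPRI"): 0.102802}, 0.186)
```

For model (I) the squared correlation is about 0.12, well above the R² of 0.02, so the pair is flagged; for model (II) the squared correlation is about 0.01 and nothing is flagged.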
b) Estimation of model (II)
Table 3.3: Results of the estimation of model (II)
Dependent variable: TXPIBT, growth rate of GDP per worker

Variables/Test statistics   (1)               (2)               (3)
C                           -2.94 (-0.918)    2.246 (0.631)     0.017 (0.017)
TXCAPITALT                  -2.41 (-2.169)    -2.284 (-2.024)   -2.04 (-1.84)
TXHPRI                      1.76 (1.153)
TXHTOT                                        -0.69 (-0.471)
TXHREV                                                          1.50 (1.266)
R²                          18.6%             14%               19%
DW                          2.19              2.12              1.94
F                           2.75 (0.08)       2.10 (0.143)      2.91 (0.073)
White                       0.035 (0.997)     0.08 (0.987)      1.25 (0.317)
Breusch-Godfrey             0.569 (0.573)     0.848 (0.441)     1.03 (0.375)
Ramsey RESET                0.778 (0.387)     2.807 (0.082)     2.47 (0.107)
Jarque-Bera                 0.731 (0.693)     0.717 (0.698)     0.003 (0.998)
Theil                       0.629             0.665             0.622
Estimating model (II), which explains the growth rate of GDP per worker by two explanatory variables, the growth rate of physical capital per worker and the growth rate of the stock of human capital, shows that this model performs better than model (I). Its explanatory power (14 to 19%) is somewhat higher but remains very low, and the Fisher statistic suggests that at least one variable has a significant effect, although only at the 10% threshold and only in equations (1) and (3). The growth rate of physical capital has a negative effect that is significant at the 10% threshold, and even at the 5% threshold in equations (1) and (2). For every measure of human capital, the effect is not significantly different from zero. The Durbin-Watson (DW) statistic allows us to accept the hypothesis of no autocorrelation of the residuals. The multicollinearity test shows that the hypothesis of no multicollinearity can be accepted (R² higher than the squared correlation coefficients between explanatory variables). Moreover, the diagnostic tests are all satisfactory; the only reservation is that the model cannot serve as a basis for forecasting. Thus, when the growth rate of physical capital per worker rises by 1 percentage point while that of human capital is held constant, the growth rate of output falls by about 2.4, 2.3 or 2 percentage points, depending on whether human capital is measured by the average length of a worker's schooling in the primary cycle (equation 1), by the average length of schooling in the whole education cycle (equation 2), or by a measure based on the wages paid to civil service employees (equation 3). Human capital, by contrast, appears to have no effect on growth in any of the three equations. The three measures of human capital therefore lead to similar results, both for the non-significant effect of human capital and for the order of magnitude of the effect of physical capital. The negative relationship between the rate of economic growth and physical capital is not easy to understand and suggests that physical capital was not well exploited in the economy over the period 1970-1997. It should be noted, however, that in this model only human capital is controlled for when physical capital varies. Other variables in the economic environment may vary at the same time in a way that affects the growth rate negatively. To check the robustness of this negative relationship, we estimate models that take into account the economic policy variables and the climatic environment.
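The Fisher statistic of overall significance can be recovered from R², the number of observations and the number of regressors, which gives a quick consistency check on the reported tables. The sketch below assumes 27 annual observations (1971-1997), which the text implies but does not state; the function name is hypothetical, and small gaps with the reported values are expected because R² is reported rounded.

```python
def f_statistic(r_squared, n_obs, n_regressors):
    """F statistic for the joint significance of all regressors (constant excluded)."""
    explained = r_squared / n_regressors
    unexplained = (1.0 - r_squared) / (n_obs - n_regressors - 1)
    return explained / unexplained


N_OBS = 27  # assumption: one observation per year over 1971-1997

# Regression (1) of model (II): R squared 18.6%, two regressors; reported F = 2.75.
f_model_2 = f_statistic(0.186, N_OBS, 2)

# Regression (1) of model (III): R squared 66.8%, six regressors; reported F = 6.72.
f_model_3 = f_statistic(0.668, N_OBS, 6)
```

Under this assumption the formula gives roughly 2.74 and 6.71, close to the reported 2.75 and 6.72.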
c) Estimation of model (III)
Table 3.4: Results of the estimation of model (III)
Dependent variable: TXPIBT, growth rate of GDP per worker

Variables/Test statistics   (1)               (2)               (3)
C                           -2.72 (-0.824)    -2.95 (-0.837)    -2.73 (-1.601)
TXCAPITALT                  -0.78 (-0.922)    -0.75 (-0.896)    -0.54 (-0.725)
TXHPRI                      0.09 (0.075)
TXHTOT                                        0.17 (0.145)
TXHREV                                                          1.60 (2.069)
EXPORT                      0.83 (4.442)      0.84 (4.828)      0.84 (5.333)
QUALMACRO                   0.16 (2.256)      0.15 (2.268)      0.13 (2.203)
TXCONSG                     0.23 (1.67)       0.24 (1.494)      0.19 (1.56)
SECHER                      -2.89 (-2.19)     -2.91 (-2.20)     -3.11 (-2.59)
R²                          66.8%             66.8%             73%
DW                          2.06              2.07              1.59
F                           6.72 (0.0005)     6.73 (0.0005)     8.87 (0.0000)
White                       0.91 (0.548)      0.83 (0.610)      1.44 (0.251)
Breusch-Godfrey             0.368 (0.696)     0.48 (0.610)      0.38 (0.688)
Ramsey RESET                0.043 (0.837)     0.054 (0.818)     0.267 (0.611)
Jarque-Bera                 0.997 (0.607)     1.002 (0.605)     0.45 (0.…)
Theil                       0.316             0.316             0.282
We then estimated model (III), which contains, in addition to the variables of model (II), the export performance variable, the macroeconomic management quality index, the growth rate of public consumption and the drought dummy. The results are more conclusive and are similar for the three human capital variables. The model explains more than 66% of the variations in the growth rate of GDP per worker. The Fisher statistic shows that the model is highly significant overall. In addition, there is no problem of autocorrelation of the errors (in equation (3), it is the Breusch-Godfrey test that supports this conclusion), and the hypothesis of no multicollinearity among the explanatory variables can be accepted (R² higher than the squared correlation coefficients between pairs of explanatory variables). The estimated coefficients can therefore be interpreted. Furthermore, the Ramsey test shows that the model can be accepted as well specified. The growth rate of physical capital still has a negative effect, but it is no longer significant; its order of magnitude is the same in the three equations. The effect of human capital, now positive for all measures, is significant at 5% only for the measure indexed on the wages paid in the civil service [equation 3]. The years spent by workers in the education system (primary cycle or the whole education cycle) do not seem to have significant positive repercussions on economic activity; they would therefore not improve workers' productivity significantly. By contrast, when the wages paid to civil service workers rise, growth improves. Wage increases thus seem to translate into an improvement in workers' productivity, which has a positive effect on the growth rate. The four new variables all have the expected signs: positive for export performance, macroeconomic management quality and public consumption, and negative for the drought dummy. Export performance, macroeconomic management quality and drought are significant at the 5% threshold. The effect of export performance is the largest in all the equations. The positive effect of public consumption is significant (at the 10% threshold) in only one equation [equation 1]. In addition, for each explanatory variable, the order of magnitude of the coefficient is the same in the three equations. This model thus appears satisfactory both in its statistical properties and in the relevance of its results. However, it cannot be used for forecasting (the Theil index is fairly high).
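The text does not define the Theil index it reports. A common candidate is the Theil inequality coefficient reported by econometric packages, which lies between 0 (perfect fit) and 1; the sketch below computes that coefficient under the assumption that this is the statistic meant. The function name and the sample values are hypothetical.

```python
def theil_inequality(actual, predicted):
    """Theil inequality coefficient: RMSE scaled by the root mean squares of both series."""
    n = len(actual)
    rmse = (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5
    scale_actual = (sum(a ** 2 for a in actual) / n) ** 0.5
    scale_predicted = (sum(p ** 2 for p in predicted) / n) ** 0.5
    return rmse / (scale_actual + scale_predicted)


# Hypothetical values (illustration only).
actual = [1.0, 2.0, 3.0]
perfect = theil_inequality(actual, [1.0, 2.0, 3.0])      # predictions equal to actuals
opposite = theil_inequality(actual, [-1.0, -2.0, -3.0])  # predictions with the wrong sign
```

Identical predictions give a coefficient of 0, while predictions that mirror the actual values with the opposite sign give the upper bound of 1; values such as the 0.28 to 0.32 in Table 3.4 fall in between.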
These results help to explain the mechanisms through which the economy's total output can be improved over the long run.
3.2.2. Analysis and interpretation of the results
The preceding results show that research hypothesis 2 of our study is verified. As regards hypothesis 1, the two factors, physical capital and human capital, turn out not to explain significantly the variations in the economy's GDP per worker. The negative and non-significant effect of the growth rate of physical capital per worker is surprising, since physical capital is one of the main growth factors identified by growth theories, from the Solow model to the new theories of endogenous growth. It is also the main factor found by Sacerdoti et al. (1998) for the countries of West Africa. Even if this difference in results may be linked to differences in the specification of the growth model⁶, explanations can be sought in the following arguments.
First, the factors of the economic environment do not seem to have been favourable to a good use of the capital stock existing in the economy. This argument rests on the fact that, once the economic policy factors are controlled for, the coefficient on physical capital ceases to be significant. In model (II), the coefficient measures the effect of a change in the growth rate of physical capital on the growth rate of GDP per worker when only the growth rate of human capital is controlled for and held constant. In model (III), this effect is measured under the additional assumption that the same economic policy measures are implemented and that climatic conditions are identical. Working conditions have therefore not been favourable to a good use of the production capacity newly acquired in the economy. These conditions could, for example, have led to under-utilization of capacity or to a mismatch between skills and technology. Indeed, many industrial firms in Senegal are characterized by high rates of under-utilization of production capacity (MTOA, SONACOS, etc.) (MEFP, 1997). In addition, the employment situation in the country, which is characterized by difficulties in finding a job immediately after training, does not allow newly trained workers to be employed in large production firms. As a result, even if the increase in physical capital took the form of the acquisition of new technologies, these are not properly exploited.
A further argument relates to the economy's strong dependence on climate. Since the economy is still dominated by agricultural activities and by industries that process agricultural products, efforts in physical capital will prove unproductive if rainfall has not been favourable to agricultural activities. There is a direct effect of unfavourable rainfall on agricultural activities and a knock-on effect on the other sectors.
From these two arguments, it follows that efforts to accumulate physical capital should be accompanied by good economic policies. The analysis of the result for human capital also suggests the need for adequate economic policies. The effect of human capital is positive and significant only when it is represented by the measure indexed on the wages paid in the civil service. The other two measures of human capital have non-significant and sometimes negative effects. This result is consistent with those obtained by Sacerdoti et al. (1998). It suggests that improving workers' skills does not lead to a significant increase in the level of the economy's total output.
⁶ Sacerdoti et al. (1998) estimated a fixed-effects model and then a model with a common constant term on panel data, whereas our analysis is purely longitudinal.
Yet it is well known that a higher level of education enables workers to improve their productivity and hence to achieve higher output. The non-significant effect of the number of years of schooling therefore shows that the marginal skills acquired each year through the education system generally do not have a considerable impact on the level of economic activity. This contrast can be explained by the following arguments.
First, from a statistical point of view, the growth rate of human capital varies very little, whereas the series of the growth rate of GDP per worker fluctuates widely. It is therefore difficult to explain large fluctuations with a fairly stable series. But if the growth rate of GDP has fluctuated so strongly, this suggests that growth could not be kept under control; in particular, the efforts made to keep the growth rate of human capital stable⁷ have not had a notable impact on economic activity.
Second, we must ask why the marginal skills acquired each year fail to improve the results of economic activity. One explanation is the fairly high unemployment rate in the country. Today, it is rare for a young person leaving the education system to find work immediately and thus to become involved in the productive sector. The labour market therefore does not absorb new skills immediately. The human capital variables used in our estimations thus overestimate the skills actually used in production, because they include the new skills. Since the effect given by the econometric analysis is the one induced by the new skills, it is not surprising that it is not significant.
In addition, one can point to the strong shift of the economy towards the tertiary sector. Tertiary activities are developing rapidly and now account for the largest share of GDP. These activities, however, are generally carried out by people who are illiterate or have only a low level of education. In fact, these people engage in trading activities that generate high value added but do not require advanced intellectual qualifications. As a result, even when individuals are diverted from their schooling and suitably channelled into the trade sectors, the level of the economy improves. However, in order to supply the trade sector with good-quality local products, it is important to encourage the training of highly skilled workers. These skills will be used in the industries that process agricultural products, so as to ensure good momentum in the primary and secondary sectors. Moreover, some branches of the tertiary sector, such as telecommunications, require high technical skills in the face of rapid advances in information technology. Efforts to build human capital should therefore be continued and directed according to the needs of these sectors.
One can also point to the way political forces inhibit the growth-friendly initiatives that highly educated individuals might take. The quality of the economy's strategic orientations matters for encouraging and motivating workers to apply themselves fully. In our African countries, however, the politician dominates the economist, so much so that the latter works in the service of the former. Economic policy orientations are thus generally drawn up at the discretion of the government in power. The strategic decisions for the economy are therefore not exactly those that the economist's analyses suggest, but those desired by the government. The result is low worker productivity linked to a lack of motivation. This argument thus holds that the skills produced by education are not always used objectively, which prevents their growth from raising the level of the economy.
⁷ The stability of the growth rate of human capital implies that human capital has increased from year to year at an almost constant rate. Its evolution has therefore been continuous.
A final argument is the effect of educational disparities within the labour force. The
illiteracy rate is still high in Senegal, and measuring human capital by the average number
of years of schooling per worker carries a large inequality bias. This argument points to
the role that the distribution of education among workers plays in its economic use 8. This
role was identified by Ramon et al. (1998), who find that the negative coefficient on
education becomes positive and significant once inequality in the distribution of education
is controlled for. High inequality in the distribution of education means that most of the
skills and work aptitudes produced by the education system are held by only a few
individuals, while most individuals appear never to have attended school. Under these
conditions, a change in human capital measured by the average level of education does not
necessarily reflect an improvement in the skills of most workers. The level of output may
therefore fail to improve following an increase in the human capital variable so measured.
This argument about inequality in the distribution of education appears justified insofar as
the human capital variable based on wages paid in the civil service has a positive and
significant effect in the various models. That measure covers only people working in the
civil service; since they are all educated, inequality in education among them is smaller
than in the labour force as a whole, which includes, for example, farmers who have never
attended school.
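The point about average schooling can be illustrated with a minimal sketch. The two
workforces below are hypothetical (they are not Senegalese data), and the Gini coefficient
is used here only as one convenient summary of inequality; the text does not say which
inequality measure Ramon et al. (1998) use. The sketch shows that mean years of schooling
can rise while the share of workers with no schooling stays the same.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum of the sorted values (ranks start at 1).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def share_unschooled(values):
    """Fraction of workers with zero years of schooling."""
    return sum(1 for x in values if x == 0) / len(values)


# Hypothetical years of schooling for ten workers (illustrative only).
before = [0, 0, 0, 0, 0, 0, 0, 6, 12, 12]
# The average rises only because the already-educated gain more schooling.
after = [0, 0, 0, 0, 0, 0, 0, 6, 16, 18]

for label, data in (("before", before), ("after", after)):
    print(label, mean(data), round(gini(data), 2), share_unschooled(data))
```

In this constructed case the mean rises from 3 to 4 years while seven workers in ten remain
unschooled, so the average alone says little about the skills of most workers.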
All in all, human capital can be said to contribute positively to economic growth in
Senegal. Measures of human capital based on the average number of years of schooling are an
average assessment over a labour force characterised by wide disparities between
individuals. This is why they could not capture the real effect of changes in human capital
on economic activity.
In addition, the climatic shocks that hit the economy towards the end of the 1970s and in
the mid-1980s sharply contracted economic activity. There is a direct negative impact on
primary-sector activity. Indirectly, manufacturing firms see their activity fall as supplies
of local primary products become scarce. The fall in agricultural output may also have
triggered generalised inflation, a form of macroeconomic instability that raises uncertainty
in the economy. Risk aversion then leads agents to cut back their investment. The result is
a general decline in the level of national output.
The growth rate of public consumption expenditure has a positive effect on growth. This
effect is not very robust, however; it is not significant in all equations. This expenditure
includes spending on education, health, wages and maintenance; it excludes capital
expenditure, which is investment spending. The positive effect obtained nevertheless runs
counter to that found by Barro and Sala-i-Martin (1996) in a cross-sectional analysis. It
should be noted, though, that their public consumption variable does not include education
spending. Although public spending linked to corrupt behaviour and to other negative aspects
of the functioning of the State may have hurt economic growth, spending devoted to social
sectors such as education and health has certainly helped to raise productivity. This
positive effect of public spending on worker productivity can be explained by the
strengthening of human capabilities that it promotes. Also, given the deterioration in
social conditions brought about by the structural adjustment programmes (SAPs), this
positive effect may be explained by an inverse reaction of the population to the reduction
in State spending. People may respond by becoming more innovative and more enterprising in
order to raise their incomes. To get by in the face of the social crisis, individuals have
had to create secondary activities that provide new potential sources of income. This
argument seems to hold in Senegal, given the emergence of small businesses, including sole
proprietorships, and the strong shift of the economy towards services in recent years.
8 High inequality in the distribution of education makes it impossible to capture its role
in the economy.
Furthermore, the measures taken under the SAPs to put public finances on a sound footing
were aimed at cutting unproductive expenditure. This cut in unproductive expenditure,
together with the priority given to the social sectors since the early 1990s, has brought
public spending into line with development objectives. Both orientations improve worker
productivity and make State intervention effective. In addition, wage expenditure appears to
have been compatible with growth objectives, as reflected in the positive effects of public
consumption expenditure and of the human capital variable based on wages paid by the civil
service. This can be explained by the wage-bill reduction measures dictated by the SAPs.
These measures, implemented through the voluntary departure programme, helped to remove idle
staff from the civil service. Economic activities falling under the civil service should
consequently have become more efficient. The many privatisations carried out under the SAPs
also greatly reduced the State's wage bill while bringing better management to the
privatised national enterprises. The result has been greater productive efficiency at a time
when the productive sectors are the priority in the allocation of State resources.
In short, the arguments above support the existence of a positive relationship between
growth and public consumption expenditure. The weakness of this effect (the growth rate of
GDP per worker rises by about 0.26% for a 1% change in the growth rate of public
consumption) and its fragility (lack of robustness) are probably due to negative behaviour
among civil service staff: corruption, administrative delays, the lack of a rigorous work
schedule and the absence of oversight. These attitudes lead to poor use of production
infrastructure and equipment and to sub-optimal productivity.
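As a rough illustration of the size of this effect, the sketch below treats the reported
0.26 as a linear marginal effect, other things equal, with both growth rates expressed in
percentage points. That reading is an assumption made for the example; the function name
and the 3-point scenario are hypothetical and do not come from the study.

```python
# Coefficient reported in the text; the linear, ceteris paribus reading is assumed.
PUBLIC_CONSUMPTION_COEFF = 0.26


def implied_growth_change(delta_public_consumption_growth,
                          coeff=PUBLIC_CONSUMPTION_COEFF):
    """Change in GDP-per-worker growth (points) implied by a change in
    public consumption growth (points), holding everything else fixed."""
    return coeff * delta_public_consumption_growth


# Hypothetical scenarios: public consumption growth rises by 1 and by 3 points.
for delta in (1.0, 3.0):
    print(delta, round(implied_growth_change(delta), 2))
```

Even a 3-point acceleration in public consumption growth would, on this reading, add less
than one point to growth in GDP per worker, which is the sense in which the effect is weak.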
The positive effect of the index of the quality of macroeconomic management reflects the
importance of macroeconomic stability for growth. This result shows that the years with low
ratios to GDP of the budget deficit, the money supply and debt are those in which the growth
rate was high. These three ratios indicate how prudent the actions of the State or the
monetary authorities have been. For example, a light debt burden or a small budget deficit
implies that the State has kept its spending under control relative to its revenue. Donors
and bilateral partners would then be willing to grant it loans, and even aid, to finance
investment needs. The resources available in the economy increase as a result; if they are
put to good use, the level of output will rise. A low ratio of money supply to GDP suggests
that prices will be reasonably well contained in the country and hence that the investment
risks linked to uncertainty will be low. Firms will then have an incentive to invest, and
other agents will be able to save more. In this sense, economic agents appear to be rational
and risk-averse. This result underlines the relevance of the stabilisation measures carried
out in Senegal in the early 1980s. Given the large macroeconomic imbalances (budget, current
account) of the late 1970s, reducing them was necessary before growth could be hoped for.
The SAPs can be said to have been well conducted in their first phase, stabilisation.
Moreover, the measures to reduce the money supply that followed the devaluation of the CFA
franc, taken to bring inflation under control, proved beneficial, since the GDP growth rate
has been improving since 1994.
One might nevertheless have expected that debt used to finance productive activities would
be positively related to growth. In the long run, however, the positive effect of the
confidence that comes with a light debt burden appears to outweigh the positive effect that
debt used for productive activities can have. Because GDP growth rates have always been low,
growth policies that rely on debt to finance production would easily lead to insolvency (an
economic growth rate below the interest rate on the debt). This would result in a complete
loss of confidence among donors and, subsequently, a shortage of financial resources in the
economy. It should be noted in this respect, however, that Senegal does not suffer from an
excessive debt burden. Its debt was judged sustainable in 1998 under the HIPC initiative 9,
which suggests that Senegal's debt policy has been compatible with growth objectives.
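The insolvency condition mentioned above (growth rate below the interest rate on the debt)
can be illustrated with the standard debt-ratio accounting identity. The sketch below is
not taken from the study: the function name, the starting ratio of 60% of GDP and the
interest and growth rates are hypothetical values chosen only to show the mechanism.

```python
def debt_ratio_path(d0, interest_rate, growth_rate, primary_balance=0.0, years=10):
    """Debt-to-GDP path under d[t+1] = d[t] * (1 + r) / (1 + g) - primary_balance.

    d0 and primary_balance are fractions of GDP; rates are decimals.
    """
    path = [d0]
    for _ in range(years):
        nxt = path[-1] * (1 + interest_rate) / (1 + growth_rate) - primary_balance
        path.append(nxt)
    return path


# Hypothetical case 1: interest rate above growth, no primary surplus.
rising = debt_ratio_path(0.60, interest_rate=0.05, growth_rate=0.02)
# Hypothetical case 2: growth above the interest rate, no primary surplus.
falling = debt_ratio_path(0.60, interest_rate=0.02, growth_rate=0.05)

print([round(d, 3) for d in rising])
print([round(d, 3) for d in falling])
```

With no primary surplus, the debt ratio drifts upward whenever the interest rate exceeds the
growth rate and downward in the opposite case, which is why persistently low growth makes
debt-financed strategies fragile.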
Similarly, the idea of encouraging productive public spending makes a positive relationship
between the budget deficit and growth conceivable. In that sense, a large deficit resulting
from substantial productive spending helps to raise the level of output. In Senegal,
however, the periods when public spending policy began to define priority sectors were also
periods of expenditure cuts (the structural adjustment period). Productive spending has
therefore mainly meant spending that is limited but well allocated according to growth
priorities. This points instead to a negative relationship between growth and the budget
deficit.
The positive effect of the variable representing the contribution of exports to GDP growth
reflects the importance of trade liberalisation measures for production activities. Although
this significant relationship may result from a simultaneity effect, the variable still
reflects the quality of trade policies. Trade liberalisation policies encourage exports,
since exporting then faces less taxation and even benefits from subsidies. A high propensity
to export should thus result from policies favourable to openness to foreign trade. The
favourable effect of openness policies on production activities can be explained by
mechanisms of factor reallocation and productivity improvement. These policies expose local
firms to foreign competition, both in the domestic market and in foreign markets. Import
liberalisation allows any product made abroad to enter the domestic market. Even firms
producing for the local market thus face competition from foreign-made products. To survive,
they are forced to seek new production methods that cut production costs and improve product
quality. Measures to motivate and monitor staff will thus emerge, which improves
productivity. This improvement in worker productivity will make it possible to raise output
levels. Likewise, exporting firms are led to cut their production costs in order to be
competitive in export markets. They will, for example, look for the cheapest supplies of raw
materials.
Trade liberalisation also facilitates access to foreign technology, which should allow local
firms to produce under conditions comparable to those of developed economies.
9 Under the Heavily Indebted Poor Countries (HIPC; PPTE in French) initiative, a country's
debt is judged sustainable when the main debt ratios show that the country is able to pay
off its debt itself without any special relief measure; in that case, the country is not
eligible to benefit from the initiative.
In short, the results of estimating the growth models suggest that neither human capital nor
physical capital has been well used in the economy, for lack of an adequate working and
motivational environment. In addition, the State's consumption expenditure policies have
induced attitudes favourable to growth. Liberalisation policies emerge as measures that
force both locally oriented producers and exporters to seek greater competitiveness through
higher productivity and better product quality. These results call for new economic policy
orientations with a view to achieving steady growth performance in the coming years.
3.3. ECONOMIC POLICY IMPLICATIONS
The preceding analyses point to important actions by the State and by other economic actors
to promote economic growth in Senegal. Government policy must define a framework within
which firms can operate at full capacity and at the highest productivity. In addition,
production structures must be free to adjust flexibly and efficiently to take advantage of
the new opportunities offered by the international environment. The private sector,
non-governmental organisations and individuals must show concern for collective well-being.
They must therefore strive to make the most of the production opportunities offered by the
national or international socio-economic environment and by available resources.
3.3.1. Macroeconomic policy
The government's macroeconomic policy must encourage growth. It must avoid cuts in public
investment and must multiply incentives both to invest and to save, through appropriate tax
policies. Sustained accumulation of investment also requires control of inflation, however,
to avoid the discouraging effects of risk aversion under high uncertainty. Even though
membership of an economic and monetary zone, regional integration and the globalisation of
the economy now leave limited choices in macroeconomic policy, some options remain.
Subsidies for research and development, the financing of infrastructure development, and the
introduction of measures for administrative rigour and against fraud and corruption are
actions that the State must implement. In this sense, State policy will seek to contain the
price of investment goods in order to facilitate the acquisition of infrastructure. It must
also encourage high utilisation of productive capacity, which presupposes lower production
costs and higher incomes. It will then be possible to raise private saving and thus create
new investment opportunities. R&D subsidies will make available to firms new production
methods that improve quality or cut production costs. Administrative rigour is needed to
eliminate the slowness characteristic of the civil service. Fraud and corruption encourage
the embezzlement of public funds and thus push up production costs. These attitudes also
give agents no incentive either to exert effort or to care about the sound use of resources.
In addition, the State will continue to identify the priority sectors that it will favour
when allocating its resources. Ultimately, the aim of macroeconomic policy will be to ensure
that the level of output achieved by the economy matches its potential. If this full use of
capacity is accompanied by harmonious technological development, the economy's performance
will be substantial.
3.3.2. Technological development
Both the State and private entrepreneurs must implement management policies that attract
investment. This will strengthen firms' technological capacity. To that end, it is important
to make technological information available to users quickly and at low cost, and to
facilitate the creation of the administrative and organisational skills needed to absorb new
techniques. The introduction and development of technology industries in the country should
also be encouraged in order to multiply local technical capabilities. These capabilities can
be grouped into the following three categories: physical investment, human capital and
technological effort. If physical capital is assembled without the skills or technology
needed to use it efficiently, it will not develop adequately. And if the necessary skills
are created but are not combined with technological effort, there will be no appreciable
improvement in efficiency. Physical investment is thus a fundamental condition, but the
efficiency with which the accumulated capital is used is of crucial importance. To ensure
this synergy between the three categories of capabilities, it will be necessary to: a)
encourage the development of training establishments and institutes in order to create a
local workforce with appropriate professional and technical skills; b) create an economic
climate that rewards risk-taking and innovation, with adequate financial institutions,
regulatory procedures for testing new products, and close links between university or
vocational training establishments and firms. In this sense, the UCAD rehabilitation
programme financed by the World Bank is already a favourable step. The Library component of
this programme, which is already under way, constitutes support for R&D. In addition,
foreign investors could build links with local firms in order to help improve the efficiency
of the domestic technological network. The national economy will then be able to benefit
from assimilating both new management practices and modern technologies.
All in all, the objective of technological development is to facilitate access to new
technology and to ensure that the technology acquired is used as efficiently as possible.
Under these conditions, the enhancement of human capital deserves particular attention; it
depends on both the State and individuals.
3.3.3. Enhancement of human capital
The World Bank notes of the high-performing Asian economies that between 60 and 90% of their
output growth comes from the accumulation of physical and human capital, and that
productivity growth, although stronger than in other developing economies and important to
Asia's success, is not the dominant factor (World Bank, 1993). Our results showed that the
macroeconomic environment and policies have not favoured good use of human and physical
capital. The high inequality in the distribution of human capital within the labour force
also prevented our models from capturing its real effects on growth. It remains true,
however, given the synergy highlighted above, that an efficient combination of human and
physical capital is necessary for a sustained increase in capital and in growth. It is
therefore important to multiply and strengthen policies of "education for all" and "primary
health care for all". Workplace structures will also have to be prepared to absorb the newly
trained skills. The State, international organisations and NGOs alike should continue to
expand their social interventions among the population and to steadily increase the funds
allocated to investment in general education. Entrepreneurs, for their part, should begin to
fund on-the-job training for staff. The rapid pace of technological change, together with
growing complexity and specialisation, means that general education can provide only basic
general aptitudes. On-the-job training is thus needed to pass new techniques on to company
staff. Entrepreneurs should also have their staff take part in seminars or practical
internships organised at the international level, so as to benefit more readily from ideas
developed in other countries.
Policies to enhance human capital aim to train a skilled workforce whose skills are kept up
to date with technological change and which does not face the problem of unemployment.
3.3.4. Trade policy
The trade liberalisation measures taken in 1986 under the NPI must be continued. The public
authorities should press ahead with export promotion measures. Export taxes could be revised
downwards, especially those on manufactured products. By contrast, exports of primary
products that can enable local firms to reduce their supply costs, and hence their
production costs, should be discouraged. Imports should also benefit from liberalisation
measures, but only for products that are important and useful for national production. For
example, imports of capital goods should be encouraged in order to facilitate the
technological development of local firms. Given the advantages of regional integration and
globalisation, however, Senegal should no longer erect barriers against the entry of
particular products into the country. It is therefore up to local firms to control the
quality and price of their products so that domestic consumers do not prefer equivalent
foreign-made products. Trade policy must seek to eliminate and avoid trade distortions so
that firms, both those producing for the local market and exporters, are able to offer
competitive products in local or international markets. Taking account of a climatic
environment subject to hazards such as drought is, however, another important factor in
improving the economy's chances of performing well.
3.3.5. Environmental policy
The Senegalese economy's heavy exposure to climatic hazards, particularly poor rainfall,
makes it necessary to strengthen policies to combat drought. It would be unfortunate to see
efforts at technological development, human capital formation and the creation of a sound
economic environment come to nothing because the climate was unfavourable. As a first step,
people should be educated about the importance of protecting the fauna and flora of their
surroundings, which are already poor (a Sahelian country). The water and forestry services
should step up their measures to combat deforestation and raise awareness of it. Next,
environmental protection measures in the Sahelian countries (the CILSS programme, for
example) should be continued and strengthened. At any time, every citizen could take part in
reforestation by planting at least one tree somewhere on the national territory.
These various policies should be implemented through specific measures and actions that call
on the public authorities, civil society, international organisations and NGOs alike.
CONCLUSION AND RECOMMENDATIONS
CONCLUSION
During the two decades following Senegal's independence, the economy went through several
swings that eventually led to a deep recession at the end of the 1970s, as in most
sub-Saharan African countries. The major macroeconomic imbalances were then at their peak.
Under these conditions, the way out that was adopted was to implement structural adjustment
programmes with the support of the international institutions (the World Bank and the
International Monetary Fund). The policies contained in these programmes were, however,
drawn up and dictated by those institutions, so that they could not take account of the
socio-economic realities of the population. This led to a deterioration in social conditions
in the country. Moreover, these programmes achieved only macroeconomic stabilisation,
without any real growth effort. Only from 1994, with the devaluation of the CFA franc, has
there been a sustained recovery in GDP growth. The determinants of economic performance thus
do not seem to have been mastered in the country so as to be steered towards sustained
economic growth. This study was carried out with that in mind, as a contribution to the
search for the factors explaining variations in overall income per worker in the Senegalese
economy.
The analyses consisted of estimating in turn a simple growth model with a Solow residual, a
growth model with human capital, and a growth model that takes account of factors in the
macroeconomic environment. The data cover the period 1971-1997, which spans all the main
macroeconomic and sectoral reforms adopted in the economy.
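For readers unfamiliar with the Solow residual, the sketch below shows the usual
growth-accounting calculation in per-worker terms: the part of output growth not accounted
for by growth in capital per worker. It assumes a Cobb-Douglas technology with a capital
share of 0.35 and uses made-up growth rates; the exact specification and data used in the
study are not reproduced in this section, so every number and name here is hypothetical.

```python
def solow_residual(output_growth, capital_growth, capital_share):
    """Residual growth (points): g_y - alpha * g_k, all in per-worker terms."""
    return output_growth - capital_share * capital_growth


ALPHA = 0.35  # assumed capital share, not an estimate from the study

# Hypothetical yearly growth rates (percent): (output per worker, capital per worker).
sample_years = [(1.5, 2.0), (-0.5, 1.0), (2.0, 0.0)]

for g_y, g_k in sample_years:
    print(g_y, g_k, round(solow_residual(g_y, g_k, ALPHA), 2))
```

A residual computed this way absorbs everything other than capital deepening, which is why
the later models add human capital and macroeconomic-environment variables to explain it.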
The results of the other models show that neither human capital nor physical capital has
been well used in the economy, for lack of an adequate working and motivational environment.
Climatic changes that bring severe drought have negative repercussions on the economy. In
addition, the State's consumption expenditure policies have induced attitudes favourable to
growth. Trade liberalisation policies emerge as measures that force both locally oriented
producers and exporters to seek greater competitiveness through higher productivity and
better product quality.
These analyses point to important actions by the State and by other economic actors to
promote economic growth in Senegal. Government policy must define a framework within which
firms can operate at full capacity and at the highest productivity. In addition, production
structures must be free to adjust flexibly and efficiently to take advantage of the new
opportunities offered by the international environment. The private sector, non-governmental
organisations and economic agents must show concern for collective well-being. They must
therefore strive to make the most of the production opportunities offered by the national or
international socio-economic environment and by available resources. Awareness-raising
measures must be implemented to safeguard the environment.
It should be noted that this study has many limitations which, although they do not
compromise the validity of the results, prevented it from capturing every possible aspect of
the factors of growth. The measures of human capital did not capture its real effect; in
particular, the available information does not reflect its distribution within the labour
force. In addition, the production function used did not allow the roles of public and
private investment to be tested separately. Likewise, our analysis was macroeconomic; as
such, it allowed us neither to examine the growth factors specific to the main dominant
branches of the economy, nor to analyse the contributions to GDP of, and the constraints on
economic activity in, the country's different geographical regions. It would be desirable
for studies of economic growth in Senegal to focus much more on these micro aspects, which
would imply growth policies adapted to the social, cultural, demographic and economic
realities of each region or to the potential of each branch of the economy. It would also be
desirable to carry out studies on the economies of countries affected by social conflict in
order to identify the effect of such events on economic growth. It is within these limits
that the following recommendations, based on the results of the study, are made.
RECOMMENDATIONS
1) A sound macroeconomic policy
Policy-makers and the monetary authorities should continue to seek the best possible values
for the indicators of macroeconomic policy stability. Monetary policy should aim to keep
inflation low in the economy. Fiscal policy should continue to undergo reforms that reduce
spending while making it efficient; in doing so, it will seek to reduce budget deficits as
far as possible. The structure of production, in terms of the destination of products,
should be conducive to reducing trade deficits. Moreover, both the State and producing
enterprises (private or public) should define a debt policy such that their level of
activity does not compromise their solvency. The achievement of these macroeconomic policy
objectives depends on the production environment. That is why we place greater emphasis on
measures that promote efficiency and improve the productivity of both physical and human
capital.
2) Reform of public intervention centred on good governance
The aim of public intervention should be to move away from an inefficient public
administration towards one that emphasises better-quality delivery of the services needed
for development, and towards an incentive framework that encourages the pursuit of higher
productivity. Measures should be taken at the following three levels: institutional
adjustment, consistency between wage structures and productivity, and better economic
management.
As regards institutional adjustment, the objective should be a State that is more
legitimate, transparent and highly accountable, and that can fulfil the following three
roles:
a) Point the way toward sustainable economic growth: strike a new balance between the
respective roles of the public and private sectors, in which the State's responsibility
would focus essentially on providing physical and social infrastructure and on creating
an environment favorable to private and public productive activities. The State should
prevent monopolies, minimize political interference in economic management, and
eliminate all other unproductive practices.
b) Broaden participation in the development process: the effective participation of the
population in decision-making, and its access to information, must be increased.
c) Connect the State to civil society: the delegation and decentralization of power to
local authorities must continue. This makes it possible to create a more genuine synergy
and partnership between the institutions of central government and those of local
governments. The purpose of this connection is to ensure grassroots development capable
of reducing inequalities within the country. A legal and institutional framework should
be sought that encourages civil society to influence, in its own favor, the government
policies that affect the well-being of its members. Pressure from public opinion can
indeed lead to greater accountability and better responses from the public authorities.
On consistency between wage structures and productivity, the system of rewards and
sanctions in the civil service should reflect the quality and efficiency of the service as
perceived by its beneficiaries. The aim is to avoid inefficient spending. Establishing this
link, however, requires identifying the cost of labor across the whole productive system.
To that end, the planning, monitoring, and evaluation of individual work programs should be
instruments used within administrative units.
The third level of intervention is the improvement of economic management. The emphasis
should be on improving the efficiency of the public administration and its capacity to
carry out the most important policies effectively. The main areas of action are the
following:
a) Economic policy management: the institutional capacities for analyzing and managing
economic policy need to be developed and strengthened. This means establishing effective
coordination mechanisms that involve all segments of the population, drawn from the
public administration as well as from the private sector and non-governmental
organizations (NGOs). Such concerted action will avoid policies that benefit only a few
classes of individuals and will instead take everyone's interests into account, since all
segments will be represented in the team that designs economic policy measures.
b) Regulatory framework: the civil service must be able to perform effectively the
functions that allow it to encourage the development of economic activities, in
particular the development of the private sector. This will require improving the
environment and existing capacities and creating new modes of interaction between the
government and the other actors in the economy.
In fact, the various components of this reform aim to create a lasting and viable
institutional capacity in the public sector and to establish an environment favorable to
the activities of the public and private sectors. Even though these measures, by their
nature, take time to implement, it is desirable that those concerning the fundamental
functions of the State be carried out. In this respect, a reform of public expenditure is
important.
3) Public expenditure reforms
The reform of State expenditure should facilitate an increase in productivity and better
use of the available production capacity. The main areas of intervention are:
a) Encourage productive public investment: it is important to ensure that public
investment programs are of high quality and that projects are subjected to a number of
economic tests, because a badly designed or badly executed project can be very costly.
Priority should go to public investments that complement market-determined activities
rather than compete with them.
b) Finance the operation and maintenance of capital goods: part of current expenditure on
goods and services must be devoted to the operation (supplies and personnel) and
maintenance of investments. If operating expenditure is insufficient, efficiency levels
are likely to be low in areas such as education or health. Likewise, insufficient
maintenance expenditure is likely to cause rapid deterioration of physical investments.
c) Remedy the causes of low productivity in public administrations: low pay for qualified
staff, or insufficient pay differentials, risk discouraging effort and leading to low
productivity in the public sector.
d) A cost-effective expenditure policy: the shortage of resources makes it more pressing
to adopt an expenditure policy that is cost-effective and that makes it possible to reach
objectives such as income redistribution and self-sufficiency. For example, the
generalized granting of food price subsidies is not necessarily the most effective way to
improve the nutritional status of the poor and could advantageously be replaced by other,
well-targeted programs.
e) Limit public consumption: this means restricting its least productive components. The
public sector will then be able to contribute to national saving and will have
correspondingly less need to raise taxes.
State action in favor of a sound allocation of investment resources must be followed by a
corresponding strengthening of the labor force, so that these resources are better
exploited.
4) Strengthening human capital

The objective here is to endow the country with intellectual skills capable of exploiting
new technology so as to raise productivity and product quality.
In a first phase, it is necessary to:
- Continue policies of compulsory primary education for all, in order to reduce
inequalities in the distribution of education;
- Encourage individuals to reach high levels of education, notably university level. The
point is not to tell those with only a low level of education that they have no place in
the economic sphere, but to show them that they would be more useful if they pursued their
studies further;
- Have firms adopt training and retraining programs for their staff, whether for long or
short courses;
- Encourage private initiative so that jobs multiply in the country, which will make it
possible to use the skills that have been trained and to reduce the unemployment rate;
- Have parents, as far as their means allow, invest in the education of their children.
In doing so, they should be aware that they are investing for the future, which will have
important private and social effects.
In this initial phase, the objective is to ensure "Education for all throughout life",
the current objective of UNESCO (J. SHABANI, 1998).
In a second stage, the international institutions and the State will have to implement
strategies to reduce inequalities in access to knowledge and to remedy information
problems.
The international institutions could intervene in two ways: by providing public goods of
an international character, and by acting as intermediaries in the transfer of knowledge.
Indeed:
- Many forms of knowledge are public goods, and no country is prepared to invest alone in
creating goods of this type, which would benefit the rest of the world. This is why the
international institutions and the NGOs, which act on behalf of all, should take charge
of producing such knowledge in the country;
- The international institutions should gather the knowledge produced by R&D, analyze it,
and disseminate it to the national economy.
The international institutions should thus play an important role. But it is the action of
the State and of economic agents that will determine how effectively this knowledge is
used. Openness to knowledge existing abroad is an important aspect that must be ensured by
the State and private firms. To that end, it is desirable to take the following three
factors into account: free trade, foreign investment, and the licensed use of imported
technologies. Free trade pushes firms to be efficient and to make products that meet
international standards. They are thereby led to make greater use of new knowledge.
Moreover, through their activities in Senegal, multinational companies, which are always
at the forefront of technological progress, can be a means of transferring know-how and
technology. In addition, in Senegal it would be easier to use a foreign technology under
license than to invent a new production technology. The State should therefore steer
policies for acquiring and assimilating knowledge toward these three aspects.
Ultimately, the State and private firms will have to ensure that technologies are widely
disseminated, so that all segments of the population and all production structures can
access them. For example, care should be taken to make it simpler, even for private
individuals, to acquire a computer and to gain access to the telephone and the internet.
These measures aim to ensure the transfer of foreign knowledge and technologies into the
Senegalese economy and to see that they are used intensively throughout the productive
system.
5) Rapid expansion of exports

A strategy of rapid export expansion should consist of introducing good incentives,
creating sound financial markets, and then letting the market decide which products will
succeed in export markets. The point is therefore not to choose in advance the products to
be exported. Such a choice would fall to the government, which is known to be
institutionally incapable of defining the kind of entrepreneurial behavior needed to find
and promote products that can succeed in export markets. Sustained export growth can be
based in the long term only on broad openness to trade and foreign investment. Prices must
therefore be determined by the market.
It should also be noted that current distortions result either from market failures, such
as the monopolistic situations found in many sectors, or from discriminatory interventions,
which are essentially taxes on producers (Banque Mondiale, 1997). Firms that sell their
output on the domestic market are relatively little affected, since they are somewhat
sheltered from international competition. By contrast, the impact of these distortions is
greater for exporters, since their selling prices are set on the international market.
Although the best government strategy would be to tackle the distortions directly, it must
be recognized that this process will take time. The government can mitigate certain market
failures in the export sector and also offset certain costs imposed on firms by
administrative inefficiency. But the government must, through a firm commitment,
demonstrate its determination to integrate Senegal's economy into the world economy. To
achieve this objective, the government should continue and accelerate the reform program
initiated in 1994. In addition, a number of measures will need to be implemented,
including the following:
- Faster reform of the foreign trade regime;
- Elimination of obstacles to competition in the formal sector;
- Reduction of the role of the State in the economy through continued privatization;
- Continued efforts to obtain a tax system that is stable, transparent, and less dependent
on import taxes;
- Promotion of financial sector development, notably by making good use of the
opportunities offered by the regional securities exchange of West Africa;
- Faster import and export procedures on the part of the competent authorities;
- Elimination of all formalities associated with the foreign exchange market;
- Absolute priority given by the judicial system to resolving disputes related to
investment or export activities.
Another aspect of the measures to promote growth in Senegal is the protection of the
environment.
6) Information, Education, and Communication on the Environment
It is important to ensure the free circulation of information on environmental protection.
Environmental policy measures must therefore aim to make the population aware of the need
to multiply the growth opportunities linked to the environment, notably the conditions that
favor good rainfall. The lines of action could rest on:
a) Informing individuals and raising their awareness in order to bring about changes in
attitudes and behavior;
b) Integrating the environment into primary, secondary, and university curricula in order
to promote environmental education;
c) Designing and implementing a non-formal environmental education program aimed at the
large mass of illiterate adults;
d) Designing and implementing a policy of literacy and mass training in production methods
and techniques consistent with the objective of protecting the environment;
e) Drawing up community plans that specify, within a local development plan, the
conditions and arrangements for better organization of space, notably those concerning the
occupation and allocation of land according to its current or future suitability.
7) Making the reforms irreversible
We have seen that political and institutional factors have not always allowed the measures
planned under the various economic reforms in Senegal to be implemented. Too often,
inefficiency, favoritism, and the selective application of laws and regulations nullify
the best-intentioned policies. The success of the measures contained in our
recommendations therefore depends on the existence of means that credibly establish that
the government will not go back on its declared promises. These means must share the same
logic: putting in place mechanisms that prevent commitments already made from being
reversed. A clean break is needed with the way economic policies have often been
implemented in the past. The public authorities should submit to the restrictions that
sound policies impose on them and bear the costs they entail. To be credible, the
government can, for example, commit to:
- Promoting the transparency of rules and procedures;
- Imposing discipline on the institutions and officials responsible for enforcement and
deregulation.
Ultimately, the measures to promote growth in Senegal call on the State as well as on
international bodies, non-governmental organizations, and private agents. Although these
measures are not exhaustive, given the many limitations of this work, care should be taken
to implement them if positive and sustained growth rates are to be hoped for over the
coming years.
BIBLIOGRAPHY
ALBERTO, A., (1997), The political economy of high and low growth, Paper presented
at the Annual World Bank Conference on Development Economics, 1997, Washington D.C.
AMABLE, B.; GUELLEC, D. : (1992), Croissance endogène : les principaux
mécanismes, Economie et prévision, n° 1016, Paris, Mai.
AMVOUNA, A. M. : (1999), Existe-t-il un taux de croissance seuil au-delà duquel la
contribution du capital humain devient nécessairement positive ? Communication,
Quatrième Journées Scientifiques, Ouagadougou, Janvier.
BANQUE MONDIALE: (1993), Région de l'Afrique: données internes, Banque
Mondiale, Washington, DC.
BANQUE MONDIALE : (1995), Rapport sur le développement dans le monde 1995 :Les
travailleurs dans un monde en mutation, New York, Oxford University Press for the World
Bank.
BANQUE MONDIALE, (1993), Sénégal : Stabilisation, Ajustement partiel et Stagnation,
Rapport n°11506 - SE.
BANQUE MONDIALE, (1997), Sénégal : le défi de l'intégration internationale,
Décembre.
BARRO, R.; XAVIER Sala-I-Martin : (1996), La croissance économique,
McGraw-Hill, Ediscience, Paris.
BARRO, R. J. : (1991), Economic growth in a cross-section of countries, Quarterly Journal
of Economics, March.
BARRO, R.J. et XAVIER Sala-I-Martin : (1992), Public finance in models of economic
growth, Review of Economic Studies, n° 59.
BARROS, A.R., (1993), Some implications of new growth theory for economic
development, in Journal of International Development, vol. 5, n°5, 531-558.
BEN HAMMOUDA, H. (1998), Les théories du post-ajustement: quelques pistes de
recherche pour les économies africaines, CODESRIA-Dakar, Série Etats de la littérature,
N° 1., Dakar, Sénégal.
BERTHELEMY, J-C; DESSUS, S. et VAROUDAKIS, A.: (1997), Capital humain,
Ouverture extérieure et Croissance : estimation sur données de panel d'un modèle à
coefficients variables, OCDE-document technique, n° 121, Janvier.
BOSTON CONSULTING GROUP, (1990), République du Sénégal: Impact de la
réforme de la politique industrielle, Dakar, Sénégal.
BRASSEUL, J.: (1993), Introduction à l'économie du développement, Cursus, Paris.
BROCHART, F.: (1984), Exportation et croissance économique: application aux pays
africains de la zone franc, Revue d'Economie Politique, 95ème année, N°4, pp. 469-483.
BURNSIDE, C., DOLLAR, D.: (1997), Aid, policies and growth, World Bank Working
Paper, N° 1777, Washington, DC.
Conseil Economique et Social (CES) du Sénégal, (1995), Etude sur l'impact de la
dévaluation du franc CFA, Novembre, Dakar, Sénégal.
DE MELO, J.; ROBINSON, S. : (1990), Productivity and externalities: Models of
export-led growth, WPS, n° 387, World Bank, Mars.
DIAGNE, A. ; KANE, K. ; DAFFE, G. ; NIANG, I.C. ; SALL, S.S. ; KASSOUM, S.,
(1998), Relance et durabilité de la croissance économique au Sénégal, Dakar, Sénégal.
DIAGNE, A., (1995), Evaluation des politiques macro-économiques du Sénégal avant et
après la dévaluation du franc CFA, Document de recherche du CREA, Dakar, Sénégal.
Direction de la Prévision et de la Statistique (DPS) du Sénégal, Base de données sur les
comptes économiques du Sénégal 1960-1997, Dakar.
DODARO, S. : (1991), Comparative advantage, trade and growth: export-led growth
revisited, World Development, vol 19, n°9; pp 1153-1165.
DURUFLE, G., (1988), L'ajustement structurel en Afrique, Karthala, Paris.
EASTERLY, W.; LEVINE, R.: (1997), Africa's growth tragedy: Policies and ethnic
divisions, Quarterly Journal of Economics, N° 112, November, pp. 1203-1250.
ELLIOT, B., ALEXANDRIA, V. (1990), Ajustement ajourné: réforme de la politique
économique du Sénégal dans les années 80, Dakar, Sénégal.
FISCHER, S.: (1993), The role of macroeconomic factors in growth, Journal of
Monetary Economics, vol 32, December, pp. 485-512.
FOSU, A. K.: (1990), Exports and economic growth: the African case, World
Development, vol 18, n°6; pp 831-835.
GUNDLACH, E. : (1995), The role of human capital in economic growth : New results
and alternative interpretations, Weltwirtschaftliches Archiv, 132/2.
HAROLD, L., (1995), Impact de la dévaluation du franc CFA sur l'économie sénégalaise:
synthèse des études, Août, Dakar, Sénégal.
HARROLD, P.; JAYAWICKRAMA, M.; BHATTASALI, (1996), Practical lessons
for Africa from East Asia in industrial and trade policies, World Bank discussion paper
310, Washington D.C.
INTERNATIONAL MONETARY FUND, International Financial Statistics 1998.
KELLER, W. : (1997), How trade patterns and technology flows affect productivity
growth, The World Bank Policy Research Working Paper, n° 1831, Washington DC,
September.
KRUEGER, A. : (1978), Foreign Trade regimes and economic development:
Liberalization attempts and consequences, Ballinger Publishing Co.
KRUGMAN, P.R. : (1990), Strategic trade policy and the new international economics,
the MIT Press.
KRUGMAN, P.R.; OBSTFELD, M. : (1996), Economie Internationale, Nouveaux
Horizons, Paris.
LAHOUEL, M. : (1996), Politique commerciale stratégique, Croissance endogène et
Commerce international : Pertinence des nouvelles théories pour les PVD, CODESRIA-
Dakar, document spécial, n° 7, Novembre.
LAU, L.; JAMISON, D. T.; LOUAT, F. : (1991), Education and Productivity in
developing countries : an aggregate production function approach, World Bank report,
WPS, n° 612, Washington, March.
Ministère de l'Economie, des Finances et du Plan (MEFP) du Sénégal, (1997), Plan de
Développement Economique et Social 1996-2001, Dakar, Sénégal.
Ministère de l'Economie, des Finances et du Plan (MEFP) du Sénégal, (1991),
Observations sur le rapport de ELLIOT BERG and Associates : "Ajustement ajourné :
réforme de la politique économique du Sénégal dans les années 80", Dakar, Sénégal.
NGUYEN, T. M. D. ; SCHWAB, L. : (1999), Evolution du capital humain dans les pays
de l'Asie du Sud-Est, Communication, Quatrième Journées Scientifiques, Ouagadougou,
Janvier.
PSACHAROPOULOS, G; WOODHALL, M.: (1988), L'éducation pour le
développement: une analyse des choix d'investissement, Economica, Paris.
PYO, H. K. : (1995), A time-series test of endogenous growth model with human capital:
Growth theories in light of East Asian experience, Edited by Takatoshi Ito and Anne O.
Krueger, NBER.
RAMON, L.; VINOD, T.; YANG, W., (1998), Addressing the Education puzzle, World
Bank Policy Research Working Paper, n°2031, December.
ROMER, D., (1997), Macroéconomie Approfondie, Traduit de l'américain par Fabrice
Mazerolle, Ediscience International, Paris.
SACERDOTI, E.; BRUNSCHWIG, S. et TANG, J. : (1998), The impact of human
capital on growth: Evidence from West Africa, IMF Working Paper.
SACHS, J. D. ; ANDREW, M. ; WARNER; (1996), Sources of slow growth in African
economies, Paper presented at the Annual World Bank Conference on Development
Economics 1996, Washington D.C.
SPIEGEL, M. M.; BENHABIB, J.: (1994), The role of human capital in economic
development : evidence from aggregate cross-country data, Journal of Monetary
Economics, n° 34, pp. 143-173.
TAKATOSHI, I., (1997), What can developing countries learn from East Asian economic
growth ?, Paper presented at the Annual World Bank Conference on Development
Economics, 1997, Washington D.C.
TYBOUT, J. : (1992), Researching the trade/productivity link: new direction, World
Bank, mimeo, Washington, DC.

VAN DER KRAAIJ, F; VAN DER HOEVEN, R. : (1994), L'ajustement structurel et
au-delà en Afrique Subsaharienne, Karthala.
VERNER, D.: (1999), Wage and Productivity gaps: Evidence from Ghana, Policy
Research Working Paper, n° 2168, World Bank, January.
WACZIARG, R., (1998), Measuring the dynamic gains from Trade, World Bank Policy
Research Working Paper, n°2001, December.
WORLD BANK, World Development Indicators 1997, 1998/1999, Washington D.C.
WORLD BANK, World Tables 1990, 1995, Washington D.C.
WORLD BANK, (1993), The East Asian Miracle, Economic growth and public policy,
World Bank Policy Research Report, Washington, D.C.
APPENDICES
APPENDIX 1: DATA USED IN THE STUDY
The data for the variables used in the analyses are given in Tables 1, 2, 3, and 4. The
labels and units of measurement of these variables are specified below.
Table 1:
TXCH: US dollar to FCFA exchange rate. This is the official market exchange rate; it gives
the number of CFA francs equivalent to 1 US dollar. The data come from the "International
Financial Statistics Yearbook, 1999" published by the statistical services of the
International Monetary Fund.
PIB: Gross domestic product, measured in millions of FCFA at constant 1987 prices. The
data come from the national accounts database of the Direction de la Prévision et de la
Statistique of Senegal.
Dettesdoll: Stock of Senegal's total external debt, expressed in millions of US dollars.
The data come from the "World Bank World Tables, 1990" and then from the "World
Development Indicators CD-ROM, World Bank".
dettescfa: Stock of total external debt expressed in millions of CFA francs. This series
is obtained as the product of Dettesdoll and TXCH.
dette-ratio: Debt-to-GDP ratio, that is, dettescfa divided by PIB, expressed as a
percentage.
Classement: The values of dette-ratio sorted in decreasing order. This order is consistent
with how the quality of debt policy evolved over time. For example, the largest value of
dette-ratio corresponds to the year in which debt policy was worst over the whole
1970-1997 period; the quality of debt policy therefore takes its lowest value (1) for that
year.
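The Table 1 definitions can be illustrated with a minimal Python sketch. The function names are hypothetical and not taken from the study; the three input rows are copied from Table 1, and the ranks shown are computed among those three years only, not over the full 1970-1997 sample.

```python
def debt_ratio(debt_usd_millions, fcfa_per_usd, gdp_fcfa_millions):
    """Return (dettescfa, dette-ratio) for one year."""
    # dettescfa = Dettesdoll x TXCH, in millions of FCFA
    debt_fcfa_millions = debt_usd_millions * fcfa_per_usd
    # dette-ratio = dettescfa / PIB, as a percentage
    ratio_percent = 100.0 * debt_fcfa_millions / gdp_fcfa_millions
    return debt_fcfa_millions, ratio_percent


def quality_ranks(ratio_by_year):
    """Rank years from worst (1) to best: the highest debt ratio gets rank 1."""
    ordered = sorted(ratio_by_year, key=ratio_by_year.get, reverse=True)
    return {year: rank for rank, year in enumerate(ordered, start=1)}


# Three rows copied from Table 1: year -> (TXCH, Dettesdoll, PIB).
rows = {
    1970: (277.7, 131.0, 925600.0),
    1984: (436.96, 2060.6, 1225000.0),
    1994: (555.2, 3659.0, 1523500.0),
}
ratios = {
    year: debt_ratio(debt, rate, gdp)[1]
    for year, (rate, debt, gdp) in rows.items()
}
ranks = quality_ranks(ratios)  # ranks within this three-year subset only
```

Rounded to two decimals, the computed ratios match the dette-ratio column of Table 1 for these years.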
Table 2:
Deficit-ratio: Ratio of the budget deficit to GDP, expressed as a percentage. The data
come from both the "World Bank World Tables, 1990" and the "African Development
Indicators, 1998/1999".
M2: Broad money (money + quasi-money). The data come from the "International Financial
Statistics Yearbook, 1999" published by the statistical services of the International
Monetary Fund. It is expressed in billions of FCFA.
TXM2: Growth rate of M2.
g: Growth rate of real GDP, computed from the PIB series.
TXM2-g: Expresses the real growth rate of the money supply (TXM2 minus g).
The Classement and Année columns are obtained in the same way as in Table 1.
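As a hedged illustration of the Table 2 series, the sketch below computes TXM2 as the year-on-year percentage change of M2 and then subtracts g. The helper name `growth_rates` is hypothetical; the M2 and g values are copied from Table 2, and g is taken as given rather than recomputed from PIB.

```python
def growth_rates(series):
    """Percentage change between consecutive years of a {year: value} series."""
    years = sorted(series)
    return {
        year: 100.0 * (series[year] - series[prev]) / series[prev]
        for prev, year in zip(years, years[1:])
    }


m2 = {1970: 34.5, 1971: 35.2, 1972: 39.12}  # M2, billions of FCFA (Table 2)
g = {1971: -0.10, 1972: 6.40}               # real GDP growth, percent (Table 2)

txm2 = growth_rates(m2)                                     # TXM2
txm2_minus_g = {year: txm2[year] - g[year] for year in g}   # TXM2-g
```

Rounded to two decimals, these values agree with the TXM2 and TXM2-g columns of Table 2 for 1971 and 1972.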
Table 3:
The variables rangdette, rangsurplus budgétaire, and rangM2 give, for each year, the rank
of the value taken by dette-ratio, deficit-ratio, and TXM2-g respectively in that year,
within the ranking of the 28 values (1970-1997) by increasing quality of the policy
concerned.
The index of the quality of macroeconomic management (the qualmacro index) is the simple
arithmetic mean of the variables rangdette, rangsurplus budgétaire, and rangM2. The index
is therefore unitless.
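The index described above is a plain average of three ranks, as the following sketch shows. The function name is hypothetical. The example ranks are positions read off the Classement columns of Tables 1 and 2 under the assumption that rank 1 is the first row of each column; Table 3 itself is not reproduced here, and how the study handles tied values is not stated, so the resulting numbers are illustrative only.

```python
def qualmacro(rang_dette, rang_surplus, rang_m2):
    """Simple arithmetic mean of the three policy-quality ranks (unitless)."""
    return (rang_dette + rang_surplus + rang_m2) / 3.0


# Assumed ranks (position in the Classement columns, 1 = worst year).
# 1994: highest dette-ratio (1), second-lowest deficit-ratio (2),
#       highest TXM2-g (1).
index_1994 = qualmacro(1, 2, 1)
# 1984: 11th dette-ratio, lowest deficit-ratio (1), 12th TXM2-g.
index_1984 = qualmacro(11, 1, 12)
```

A low index flags a year in which all three policies ranked poorly; a high index flags a year in which they ranked well.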
Table 4:
PIBTFR: Per capita GDP of France, expressed in thousands of FF. It is computed from per
capita GDP in FF at constant 1990 prices.
PIBTUSA: Per capita GDP of the United States, expressed in thousands of US dollars. It was
computed from per capita GDP in US dollars at constant 1990 prices.
QUALMACRO: Variable computed previously (Tables 1 to 3).
SECHER: Dummy variable for the years marked by severe drought in Senegal. It takes the
value 1 in years of severe drought and 0 in other years.
TXCAPITALT: Annual growth rate of physical capital per capita.
TXCONSG: Annual growth rate of public consumption CONSPUB.
HUMPRI (resp. HUMTOT): Average number of years of schooling spent by a member of the labor
force in the primary cycle (resp. all cycles) of education. The unit is the year.
HUMREV: Index measuring human capital, computed from the length of schooling and indexed
on the wages paid in the civil service.
The data for the three variables HUMPRI, HUMTOT, and HUMREV come from Sacerdoti, E.;
Brunschwig, S.; Tang, J., The impact of human capital on growth: Evidence from West
Africa, November 1998.
PIBT: Per capita GDP. It is the ratio of PIB to the labor force and is expressed in
thousands of CFA francs.
CONSPUB: Public consumption, measured by the final consumption of general government. It
is measured in billions of FCFA at constant 1987 prices. The data come from the national
accounts database of the DPS, Senegal.
EXPORT: Contribution of exports to GDP growth. For each year, this contribution is
obtained by multiplying the growth rate of exports during the year by the share of exports
in GDP during the previous year. It is computed from the DPS database.
GAPFRANCE and GAPUSA: Gaps between the per capita GDP of France and of the United States,
respectively, and that of Senegal. They are expressed in millions of FCFA and computed
from the series PIBTFR, PIBTUSA, PIBT, and TXCH and from the conversion factor between
the FCFA and the FF (1 FF = 50 FCFA for years before 1994 and 1 FF = 100 FCFA from 1994
onward).
TXHPRI, TXHREV, TXHTOT, TXPIBT are, respectively, the annual growth rates of HUMPRI,
HUMREV, HUMTOT, and PIBT.
POPACT: Size of the labor force, expressed in thousands of people. The data come from the
"International Financial Statistics Yearbook, 1999" published by the statistical services
of the International Monetary Fund.
TXLABOR: Annual growth rate of the labor force, computed from POPACT.
TXPIB: Annual growth rate of GDP, computed from the variable PIB.
TXCAPTL: Annual growth rate of physical capital, computed from the series CAPITAL.
CAPITAL: Stock of physical capital computed by the perpetual inventory method. It is
expressed in billions of FCFA.
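Three of the Table 4 constructions can be sketched in Python under stated assumptions. All function names and all numeric inputs below are hypothetical illustrations, not data from the study. The gap is assumed to be the foreign value minus the Senegalese value, and the perpetual inventory recursion is the textbook form K_t = (1 - d) K_{t-1} + I_t; the depreciation rate d and the initial stock used by the author are not given in this section.

```python
def export_contribution(exports_prev, exports_now, gdp_prev):
    """EXPORT: export growth times the previous year's export share, in points."""
    growth = (exports_now - exports_prev) / exports_prev
    share_prev = exports_prev / gdp_prev
    return 100.0 * growth * share_prev


def ff_to_fcfa(year):
    """Conversion factor stated in the text: 50 before 1994, 100 from 1994."""
    return 50.0 if year < 1994 else 100.0


def gap_france(pibtfr_thousand_ff, pibt_thousand_fcfa, year):
    """GAPFRANCE in millions of FCFA (assumed sign: France minus Senegal)."""
    france_thousand_fcfa = pibtfr_thousand_ff * ff_to_fcfa(year)
    return (france_thousand_fcfa - pibt_thousand_fcfa) / 1000.0


def perpetual_inventory(initial_stock, investments, depreciation_rate):
    """Capital stocks K_0..K_T from K_t = (1 - d) * K_{t-1} + I_t."""
    stocks = [initial_stock]
    for investment in investments:
        stocks.append((1.0 - depreciation_rate) * stocks[-1] + investment)
    return stocks


# Hypothetical inputs, for illustration only.
contribution = export_contribution(100.0, 110.0, 400.0)
gap_before = gap_france(100.0, 300.0, 1990)
gap_after = gap_france(100.0, 300.0, 1994)
capital = perpetual_inventory(1000.0, [100.0, 120.0], 0.05)
```

The doubling of the conversion factor in 1994 reflects the CFA franc devaluation, which mechanically widens a gap measured in FCFA even when the underlying per capita GDP figures are unchanged.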
Computation of the macroeconomic quality index
Table 1: Ranking of the years 1970-1997 by quality of debt policy
Années TXCH Dettesdoll dettescfa PIB dette-ratio Classement Année

1970 277.7 131 36378.7 925600 3.93 133.34 1994
1971 277.1 151.3 41925.23 924300 4.54 121.07 1997
1972 252.2 160.6 40503.32 983300 4.12 119.67 1995
1973 222.7 194.1 43226.07 928400 4.66 111.27 1996
1974 240.5 260 62530 967400 6.46 99.98 1993
1975 214.32 342.3 73361.74 1040300 7.05 97.29 1992
1976 238.98 402.3 96141.65 1133100 8.48 87.57 1987
1977 245.67 597.3 146738.7 1102700 13.31 85.12 1985
1978 255.64 863.2 220668.4 1059100 20.84 84.02 1986
1979 212.72 1109 235906.5 1133200 20.82 79.68 1988
1980 211.3 1473 311244.9 1095800 28.40 73.50 1984
1981 271.73 1541.2 418790.3 1082800 38.68 72.82 1989
1982 328.61 1811.1 595145.6 1249000 47.65 68.31 1990
1983 381.06 1924.4 733311.9 1276100 57.47 67.97 1991
1984 436.96 2060.6 900399.8 1225000 73.50 57.47 1983
1985 449.26 2408.6 1082088 1271300 85.12 47.65 1982
1986 346.3 3225 1116818 1329300 84.02 38.68 1981
1987 300.5 4028 1210414 1382300 87.57 28.40 1980
1988 297.8 3886 1157251 1452400 79.68 20.84 1978
1989 319 3269 1042811 1432100 72.82 20.82 1979
1990 272.3 3732 1016224 1487700 68.31 13.31 1977
1991 282.1 3570.4 1007210 1481800 67.97 8.48 1976
1992 402 3665.7 1473611 1514600 97.29 7.05 1975
1993 389.4 3802.5 1480694 1481000 99.98 6.46 1974
1994 555.2 3659 2031477 1523500 133.34 4.66 1973
1995 499.2 3840.9 1917377 1602200 119.67 4.54 1971
1996 511.6 3664.2 1874605 1684700 111.27 4.12 1972
1997 583.7 3670.6 2142529 1769700 121.07 3.93 1970
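The columns of Table 1 appear to be linked by two simple steps: the dollar debt is converted into FCFA with the exchange rate TXCH, and the result is expressed as a percentage of PIB. The sketch below assumes those formulas and assumes that debt and GDP are in millions; the function names are hypothetical. It also ranks a few years from the highest ratio to the lowest, which matches the ordering of the "Ranking" column.

```python
def debt_ratio(debt_usd, fcfa_per_usd, gdp_fcfa):
    """Return (debt in FCFA, debt as a percentage of GDP)."""
    debt_fcfa = debt_usd * fcfa_per_usd
    return debt_fcfa, debt_fcfa / gdp_fcfa * 100.0


def rank_descending(ratios_by_year):
    """Rank years from the highest ratio (rank 1) to the lowest."""
    ordered = sorted(ratios_by_year, key=ratios_by_year.get, reverse=True)
    return {year: position + 1 for position, year in enumerate(ordered)}


# Rows for 1970, 1984 and 1994, copied from Table 1: (Dettesdoll, TXCH, PIB).
rows = {
    1970: (131.0, 277.7, 925600.0),
    1984: (2060.6, 436.96, 1225000.0),
    1994: (3659.0, 555.2, 1523500.0),
}
ratios = {year: debt_ratio(d, x, g)[1] for year, (d, x, g) in rows.items()}
ranks = rank_descending(ratios)

for year in sorted(rows):
    print(year, round(ratios[year], 2), ranks[year])
```

With only three years the ranks differ from those of the full 28-year table; the point is the ordering rule, not the rank values.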

Table 2: Ranking of the years 1970-1997 by quality of fiscal and monetary policies.
Year deficit-ratio Ranking Corresponding-year M2 TXM2 g TXM2-g Ranking Corresponding-year
1970 0.13 -10.5 1984 34.5 23.35 8.60 14.75 51.51 1994
1971 -0.13 -9.7 1994 35.2 2.03 -0.10 2.13 49.13 1974
1972 0.24 -4.35 1983 39.12 11.14 6.40 4.74 19.95 1978
1973 -0.81 -4.3 1985 44.2 12.99 -5.60 18.59 19.53 1981
1974 -0.33 -4.3 1990 67.77 53.33 4.20 49.13 18.59 1973
1975 -0.21 -4.19 1982 75.18 10.93 7.50 3.43 17.70 1977
1976 -0.59 -4.1 1993 94.89 26.22 8.90 17.32 17.32 1976
1977 -1.37 -4 1989 109.12 15.00 -2.70 17.70 17.10 1980
1978 0.14 -3.6 1986 126.53 15.95 -4.00 19.95 14.75 1970
1979 -0.38 -3.5 1995 121.21 -4.20 7.00 -11.20 12.71 1986
1980 -2.20 -3.1 1992 137.94 13.80 -3.30 17.10 8.81 1989
1981 -2.09 -2.5 1987 163.23 18.33 -1.20 19.53 5.32 1984
1982 -4.19 -2.5 1988 189 15.79 15.30 0.49 4.82 1991
1983 -4.35 -2.20 1980 189.15 0.08 2.20 -2.12 4.74 1972
1984 -10.5 -2.2 1996 191.65 1.32 -4.00 5.32 3.43 1975
1985 -4.3 -2.09 1981 193.49 0.96 3.80 -2.84 3.37 1996
1986 -3.6 -1.5 1997 226.99 17.31 4.60 12.71 2.13 1971
1987 -2.5 -1.37 1977 214.42 -5.54 4.00 -9.54 0.49 1982
1988 -2.5 -0.81 1973 214.91 0.23 5.10 -4.87 -0.24 1992
1989 -4 -0.59 1976 230.83 7.41 -1.40 8.81 -1.46 1995
1990 -4.3 -0.38 1979 204.20 -11.54 3.90 -15.44 -2.12 1983
1991 0.3 -0.33 1974 213.22 4.42 -0.40 4.82 -2.84 1985
1992 -3.1 -0.21 1975 217.39 1.96 2.20 -0.24 -4.87 1988
1993 -4.1 -0.13 1971 197.75 -9.03 -2.20 -6.83 -5.44 1997
1994 -6.1 0.13 1970 305.34 54.41 2.90 51.51 -6.83 1993
1995 -3.5 0.14 1978 316.76 3.74 5.20 -1.46 -9.54 1987
1996 -2.2 0.24 1972 343.60 8.47 5.10 3.37 -11.20 1979
1997 -1.5 0.3 1991 342.10 -0.44 5.00 -5.44 -15.44 1990
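The monetary columns of Table 2 can be read as follows: TXM2 is the percentage growth of M2, g coincides with the GDP growth rate TXPIB of Table 4, and TXM2-g is the difference between the two. The sketch below assumes that reading; the function name `excess_money_growth` is hypothetical.

```python
def excess_money_growth(m2_prev, m2_cur, gdp_growth):
    """Return (M2 growth in percent, M2 growth minus GDP growth)."""
    txm2 = (m2_cur - m2_prev) / m2_prev * 100.0
    return txm2, txm2 - gdp_growth


# M2 for 1970-1972 and g for 1971-1972, copied from Table 2.
m2 = {1970: 34.5, 1971: 35.2, 1972: 39.12}
g = {1971: -0.10, 1972: 6.40}

results = {
    year: excess_money_growth(m2[year - 1], m2[year], g[year])
    for year in (1971, 1972)
}
for year, (txm2, gap) in results.items():
    print(year, round(txm2, 2), round(gap, 2))
```

Under these assumptions the rounded values for 1971 and 1972 agree with the TXM2 and TXM2-g columns of the table.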
Table 3: Macroeconomic quality index for Senegal, 1970-1997.
Year Debt-rank Budget-surplus-rank M2-rank Macroeconomic-quality-index
1970 28 25 9 20.67
1971 26 24 17 22.33
1972 27 27 14 22.67
1973 25 19 5 16.33
1974 24 22 2 16.00
1975 23 23 15 20.33
1976 22 20 7 16.33
1977 21 18 6 15.00
1978 19 26 3 16.00
1979 20 21 27 22.67
1980 18 14 8 13.33
1981 17 16 4 12.33
1982 16 6 18 13.33
1983 15 3 21 13.00
1984 11 1 12 8.00
1985 8 4 22 11.33
1986 9 9 10 9.33
1987 7 12 26 15.00
1988 10 13 23 15.33
1989 12 8 11 10.33
1990 13 5 28 15.33
1991 14 28 13 18.33
1992 6 11 19 12.00
1993 5 7 25 12.33
1994 1 2 1 1.33
1995 3 10 20 11.00
1996 4 15 16 11.67
1997 2 17 24 14.33
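The index in the last column of Table 3 is consistent with a simple average of the three ranks. The sketch below assumes that rule; the function name `macro_quality_index` is hypothetical.

```python
def macro_quality_index(rank_debt, rank_surplus, rank_m2):
    """Unweighted mean of the three policy ranks."""
    return (rank_debt + rank_surplus + rank_m2) / 3.0


# Ranks for 1970, 1984 and 1994, copied from Table 3.
ranks = {1970: (28, 25, 9), 1984: (11, 1, 12), 1994: (1, 2, 1)}
index = {year: macro_quality_index(*r) for year, r in ranks.items()}

for year in sorted(index):
    print(year, round(index[year], 2))
```

An unweighted mean gives the three policy dimensions equal importance; any other weighting would be a different index from the one tabulated here.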
Table 4: Data used in the econometric analyses.
YEAR HUMPRI HUMREV HUMTOT PIB PIBT

1970 0.730000 0.590000 0.830000 925.6000 460.7700
1971 0.740000 0.590000 0.840000 924.3000 450.2400
1972 0.750000 0.600000 0.860000 983.3000 468.4800
1973 0.770000 0.600000 0.880000 928.4000 432.2000
1974 0.780000 0.600000 0.890000 967.4000 439.6500
1975 0.800000 0.600000 0.920000 1040.300 460.9600
1976 0.820000 0.600000 0.940000 1133.100 488.9700
1977 0.840000 0.600000 0.970000 1102.700 462.7700
1978 0.850000 0.610000 0.990000 1059.100 431.9300
1979 0.870000 0.610000 1.010000 1133.200 449.0800
1980 0.890000 0.610000 1.030000 1095.800 430.1400
1981 0.900000 0.610000 1.060000 1082.800 413.2200
1982 0.920000 0.620000 1.080000 1249.000 463.4300
1983 0.940000 0.620000 1.110000 1276.100 460.3700
1984 0.960000 0.620000 1.130000 1225.000 439.2400
1985 0.970000 0.620000 1.160000 1271.300 443.1500
1986 0.990000 0.620000 1.180000 1329.300 450.5400
1987 1.010000 0.620000 1.210000 1382.300 455.6000
1988 1.030000 0.630000 1.240000 1452.400 465.5600
1989 1.050000 0.630000 1.260000 1432.100 446.4900
1990 1.080000 0.630000 1.290000 1487.700 451.2100
1991 1.100000 0.630000 1.330000 1481.800 438.6500
1992 1.130000 0.640000 1.360000 1514.600 437.1600
1993 1.150000 0.640000 1.400000 1523.500 416.5300
1994 1.180000 0.640000 1.440000 1481.000 417.3300
1995 1.210000 0.650000 1.470000 1602.200 427.4200
1996 1.240000 0.650000 1.510000 1684.700 437.5800
1997 1.260000 0.650000 1.550000 1769.700 447.3800

Table 4: Data used in the econometric analyses (continued)
YEAR PIBTFR PIBTUSA QUALMACRO SECHER TXCAPITALT TXCONSG

1970 73.38000 16.52000 20.67000 0.000000 NA NA
1971 76.17000 16.81000 22.33000 0.000000 0.480000 4.537037
1972 78.85000 17.43000 22.67000 1.000000 0.480000 -1.240035
1973 82.46000 18.16000 16.33000 1.000000 1.180000 -0.358744
1974 84.44000 17.89000 16.00000 0.000000 0.610000 7.650765
1975 83.72000 17.57000 20.33000 0.000000 0.890000 1.672241
1976 87.08000 18.26000 16.33000 0.000000 0.780000 13.32237
1977 89.48000 18.89000 15.00000 1.000000 -0.030000 2.757620
1978 92.07000 19.59000 16.00000 1.000000 -0.084000 14.05367
1979 94.65000 19.87000 22.67000 0.000000 -1.080000 2.352941
1980 95.71000 19.53000 13.33000 0.000000 1.530000 2.056866
1981 96.29000 19.68000 12.33000 0.000000 -0.180000 0.118554
1982 98.20000 19.07000 13.33000 0.000000 -1.010000 4.499704
1983 98.43000 19.64000 13.00000 0.000000 -0.660000 4.135977
1984 99.33000 20.67000 8.000000 1.000000 1.890000 2.992383
1985 100.7900 21.13000 11.33000 0.000000 -0.800000 2.588484
1986 102.6200 21.56000 9.330000 1.000000 -0.910000 6.282183
1987 104.4200 22.02000 15.00000 1.000000 0.310000 4.457364
1988 108.5400 22.67000 15.33000 0.000000 0.530000 0.324675
1989 111.9200 23.03000 10.33000 0.000000 0.620000 4.068423
1990 114.7500 22.98000 15.33000 0.000000 0.330000 -0.266548
1991 114.9500 22.52000 18.33000 0.000000 0.470000 -5.077951
1992 115.6800 22.88000 12.00000 0.000000 0.440000 3.425622
1993 113.5900 23.14000 12.33000 0.000000 0.610000 -2.813067
1994 116.2900 23.73000 1.330000 0.000000 -0.130000 -5.648926
1995 118.2200 23.98000 11.00000 0.000000 0.150000 2.770905
1996 119.5900 24.42000 11.67000 0.000000 0.200000 0.722195
1997 121.8600 25.12000 14.33000 0.000000 1.000000 0.717017

Table 4: Data used in the econometric analyses (continued)
YEAR TXHPRI TXHREV TXHTOT TXPIBT

1970 NA NA NA NA
1971 1.369863 0.000000 1.204819 -2.290000
1972 1.351351 1.694915 2.380952 4.050000
1973 2.666667 0.000000 2.325581 -7.750000
1974 1.298701 0.000000 1.136364 1.720000
1975 2.564103 0.000000 3.370787 4.850000
1976 2.500000 0.000000 2.173913 6.080000
1977 2.439024 0.000000 3.191489 -5.360000
1978 1.190476 1.666667 2.061856 -6.660000
1979 2.352941 0.000000 2.020202 3.970000
1980 2.298851 0.000000 1.980198 -4.220000
1981 1.123596 0.000000 2.912621 -3.930000
1982 2.222222 1.639344 1.886792 12.15000
1983 2.173913 0.000000 2.777778 -0.660000
1984 2.127660 0.000000 1.801802 -4.590000
1985 1.041667 0.000000 2.654867 0.890000
1986 2.061856 0.000000 1.724138 1.670000
1987 2.020202 0.000000 2.542373 1.120000
1988 1.980198 1.612903 2.479339 2.190000
1989 1.941748 0.000000 1.612903 -4.100000
1990 2.857143 0.000000 2.380952 1.060000
1991 1.851852 0.000000 3.100775 -2.780000
1992 2.727273 1.587302 2.255639 -0.340000
1993 1.769912 0.000000 2.941176 -4.720000
1994 2.608696 0.000000 2.857143 0.190000
1995 2.542373 1.562500 2.083333 2.420000
1996 2.479339 0.000000 2.721088 2.380000
1997 1.612903 0.000000 2.649007 2.240000
Table 4: Data used in the econometric analyses (continued)
YEAR CONSPUB EXPORT GAPFRANCE GAPUSA

1970 108.0000 2.810000 3.200000 4.100000
1971 112.9000 -3.090000 3.300000 4.200000
1972 111.5000 4.030000 3.500000 3.900000
1973 111.1000 -3.360000 3.700000 3.600000
1974 119.6000 1.850000 3.800000 3.800000
1975 121.6000 3.640000 3.700000 3.300000
1976 137.8000 5.000000 3.900000 3.900000
1977 141.6000 2.300000 4.000000 4.200000
1978 161.5000 -7.910000 4.200000 4.600000
1979 165.3000 5.560000 4.300000 3.800000
1980 168.7000 -3.950000 4.300000 3.700000
1981 168.9000 -0.090000 4.400000 4.900000
1982 176.5000 8.700000 4.400000 5.800000
1983 183.8000 0.340000 4.500000 7.000000
1984 189.3000 0.810000 4.500000 8.600000
1985 194.2000 -3.550000 4.600000 9.000000
1986 206.4000 3.740000 4.700000 7.000000
1987 215.6000 0.350000 4.800000 6.200000
1988 216.3000 0.900000 4.900000 6.300000
1989 225.1000 1.380000 5.100000 6.900000
1990 224.5000 1.040000 5.300000 5.800000
1991 213.1000 0.140000 5.300000 5.900000
1992 220.4000 -1.670000 5.300000 8.700000
1993 214.2000 -0.560000 5.300000 8.800000
1994 202.1000 5.600000 11.20000 12.70000
1995 207.7000 0.860000 11.40000 11.50000
1996 209.2000 -1.360000 11.50000 12.00000
1997 210.7000 0.100000 11.70000 14.20000

Table 4: Data used in the econometric analyses (continued, final part)
(variables of Model I)
YEAR POPACT TXLABOR TXPIB TXCAPTL CAPITAL

1970 2008.800 NA 8.600000 1.290000 1327.2
1971 2052.500 2.200000 -0.100000 2.690000 1362.9
1972 2098.900 2.240000 6.400000 2.710000 1399.8
1973 2148.100 2.340000 -5.600000 3.550000 1449.5
1974 2200.400 2.430000 4.200000 3.060000 1494.0
1975 2256.800 2.560000 7.500000 3.480000 1546.0
1976 2317.300 2.680000 8.900000 3.490000 1599.0
1977 2382.800 2.830000 -2.700000 2.790000 1644.0
1978 2452.000 2.900000 -4.000000 2.040000 1678.0
1979 2523.400 2.910000 7.000000 1.800000 1708.2
1980 2547.500 2.630000 -3.300000 2.500000 1751.0
1981 2620.400 2.860000 -1.200000 2.670000 1797.8
1982 2695.100 2.850000 15.30000 1.810000 1830.3
1983 2771.900 2.850000 2.200000 2.170000 1870.0
1984 2789.000 2.750000 -4.000000 2.520000 1917.1
1985 2868.800 2.860000 3.800000 2.040000 1956.2
1986 2950.400 2.840000 4.600000 1.910000 1993.4
1987 3034.100 2.840000 4.000000 3.160000 2056.4
1988 3119.700 2.820000 5.100000 3.370000 2125.7
1989 3207.400 2.810000 -1.400000 3.450000 2199.0
1990 3297.200 2.800000 3.900000 3.140000 2268.0
1991 3378.000 2.450000 -0.400000 2.930000 2334.4
1992 3464.500 2.560000 2.200000 3.020000 2404.8
1993 3555.600 2.630000 -2.200000 3.260000 2483.1
1994 3650.500 2.670000 2.900000 2.540000 2546.1
1995 3748.500 2.680000 5.200000 2.840000 2618.5
1996 3850.000 2.710000 5.100000 2.920000 2694.8
1997 3955.700 2.750000 5.000000 3.770000 2796.4

APPENDIX 2: Results of the Stationarity Analysis of the Variables
Phillips-Perron Unit Root Test on TXPIBT
PP Test Statistic -6.765694 1% Critical Value* -3.7076
5% Critical Value -2.9798
*MacKinnon critical values for rejection of hypothesis of a unit root.
Lag truncation for Bartlett kernel: 4 (Newey-West suggests: 2)
Residual variance with no correction 18.76717
Residual variance with correction 8.387197
Phillips-Perron Test Equation
LS // Dependent Variable is D(TXPIBT)
Date: 04/12/00 Time: 09:09
Sample(adjusted): 1972 1997
Included observations: 26 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
TXPIBT(-1) -1.185125 0.200578 -5.908560 0.0000
C 0.052983 0.884526 0.059900 0.9527
R-squared 0.592606 Mean dependent var 0.174231
Adjusted R-squared 0.575632 S.D. dependent var 6.921638
S.E. of regression 4.509002 Akaike info criterion 3.085955
Sum squared resid 487.9463 Schwarz criterion 3.182732
Log likelihood -75.00982 F-statistic 34.91108
Durbin-Watson stat 2.067666 Prob(F-statistic) 0.000004
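The test equation reported above is an ordinary least squares regression of D(TXPIBT) on a constant and TXPIBT(-1). The following sketch, in plain Python, fits that auxiliary regression to the TXPIBT values of Table 4 (1971-1997). It is only an illustration under that assumption: it does not compute the Phillips-Perron statistic itself, which additionally applies a Newey-West (Bartlett kernel) correction to the t-statistic, and the variable names are hypothetical.

```python
import math

# TXPIBT for 1971-1997, copied from Table 4.
txpibt = [
    -2.29, 4.05, -7.75, 1.72, 4.85, 6.08, -5.36, -6.66, 3.97,
    -4.22, -3.93, 12.15, -0.66, -4.59, 0.89, 1.67, 1.12, 2.19,
    -4.10, 1.06, -2.78, -0.34, -4.72, 0.19, 2.42, 2.38, 2.24,
]

y_lag = txpibt[:-1]                                  # TXPIBT(-1)
dy = [cur - prev for prev, cur in zip(txpibt, txpibt[1:])]  # D(TXPIBT)
n = len(dy)                                          # usable observations

mean_x = sum(y_lag) / n
mean_dy = sum(dy) / n
sxx = sum((x - mean_x) ** 2 for x in y_lag)
sxy = sum((x - mean_x) * (d - mean_dy) for x, d in zip(y_lag, dy))

alpha = sxy / sxx                  # coefficient on TXPIBT(-1)
const = mean_dy - alpha * mean_x   # intercept C

residuals = [d - const - alpha * x for x, d in zip(y_lag, dy)]
s2 = sum(e * e for e in residuals) / (n - 2)   # residual variance
se_alpha = math.sqrt(s2 / sxx)
t_alpha = alpha / se_alpha         # uncorrected t-statistic

print(n, round(mean_dy, 6), round(alpha, 6), round(t_alpha, 6))
```

A strongly negative coefficient on the lagged level is what signals mean reversion; the unit-root null corresponds to a coefficient of zero, and the statistic must be compared with the MacKinnon critical values rather than the usual t tables.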
Phillips-Perron Unit Root Test on EXPORT
Lag truncation for Bartlett kernel: 4 (Newey-West suggests: 3)
LS // Dependent Variable is D(EXPORT)
Date: 04/12/00 Time: 09:14
Included observations: 27 after adjusting endpoints
EXPORT(-1) -1.352830 0.186002 -7.273205 0.0000
C 1.077594 0.664951 1.620562 0.1177
R-squared 0.679074 Mean dependent var -0.100370
S.E. of regression 3.351130 Akaike info criterion 2.489783
Log likelihood -69.92340 F-statistic 52.89951
Durbin-Watson stat 2.076166 Prob(F-statistic) 0.000000
Phillips-Perron Unit Root Test on QUALMACRO



LS //Dependent Variable is D(QUALMACRO)
Date: 04/12/00 Time: 09:15
QUALMACRO(-1) -0.533594 0.169384 -3.150209 0.0042

c 7.504873 2.584485 2.903817 0.0076

Phillips-Perron Unit Root Test on TXCAPITALT



LS //Dependent Variable is D(TXCAPITALT)
Date: 04/12/00 Time: 09:16
TXCAPITALT(-1) -1.067497 0.207313 -5.149210 0.0000

c 0.291637 0.159580 1.827524 0.0801

S.E. of regression 0.767957 Akaike info criterion -0.454240
Sum squared resid 14.15418 Schwarz criterion -0.357464
Phillips-Perron Unit Root Test on TXCONSG



LS //Dependent Variable is D(TXCONSG)
Date: 04/12/00 Time: 09:18
Variable Coefficient Std. Error t-Statistic Prob.
TXCONSG(-1) -0.944375 0.203759 -4.634766 0.0001
C 2.373275 1.050110 2.260026 0.0332
R-squared 0.472308 Mean dependent var -0.146924
Sum squared resid 503.9027 Schwarz criterion 3.214314
Phillips-Perron Unit Root Test on TXHPRI

Lag truncation for Bartlett kernel : 4 ( Newey-West suggests: 2 )


LS // Dependent Variable is D(TXHPRI)
Date: 04/12/00 Time: 09:18
TXHPRI(-1) -1.245045 0.193848 -6.422793 0.0000

c 2.574218 0.412361 6.242626 0.0000

S.E. of regression 0.524244 Akaike info criterion -1.217792
Log likelihood -19.06110 F-statistic 41.25227
Phillips-Perron Unit Root Test on TXHTOT



LS // Dependent Variable is D(TXHTOT)
Date: 04/12/00 Time: 09:19
TXHTOT(-1) -1.356023 0.173756 -7.804179 0.0000

C 3.215020 0.416675 7.715887 0.0000
S.E. of regression 0.502706 Akaike info criterion -1.301694
Durbin-Watson stat 2.053612 Prob(F-statistic) 0.000000
Phillips-Perron Unit Root Test on TXHREV



LS // Dependent Variable is D(TXHREV)
Date: 04/12/00 Time: 09:19
TXHREV(-1) -1.299699 0.194741 -6.673976 0.0000
C 0.488069 0.152291 3.204837 0.0038
R-squared 0.649850 Mean dependent var 0.000000
Adjusted R-squared 0.635260 S.D. dependent var 1.127842
S.E. of regression 0.681146 Akaike info criterion -0.694154
Sum squared resid 11.13503 Schwarz criterion -0.597377
Log likelihood -25.86840 F-statistic 44.54196
Durbin-Watson stat 1.998526 Prob(F-statistic) 0.000001
Phillips-Perron Unit Root Test on SECHER

Lag truncation for Bartlett kernel: 4 (Newey-West suggests: 3)
Residual variance with correction 0.142017
LS // Dependent Variable is D(SECHER)
Date: 04/12/00 Time: 09:36
SECHER(-1) -0.771429 0.194705 -3.962029 0.0005
C 0.200000 0.099139 2.017366 0.0545
S.E. of regression 0.443364 Akaike info criterion -1.555542
APPENDIX 3: Results of the Model Estimations
LS //Dependent Variable is TXPIBT
Date: 04/12/00 Time: 09:51
Variable Coefficient Std. Error t-Statistic Prob.
c -2.942868 3.205228 -0.918146 0.3677

TXCAPITALT -2.408613 1.110139 -2.169650 0.0402
TXHPRI 1.764971 1.529966 1.153601 0.2600

Log likelihood -75.21044 F-statistic 2.753246
Date: 04/12/00 Time: 09:52
Variable Coefficient Std. Error t-Statistic Prob.
c 2.246186 3.560004 0.630950 0.5340

TXCAPITALT -2.284594 1.128868 -2.023792 0.0543
TXHTOT -0.690638 1.464738 -0.471509 0.6415

LS // Dependent Variable is TXPIBT
Date: 04/12/00 Time: 09:53
c 0.017052 0.977889 0.017438 0.9862

TXCAPITALT -2.045587 1.107876 -1.846405 0.0772
TXHREV 1.505462 1.189411 1.265721 0.2178

Log likelihood -75.06664 F-statistic 2.911231
LS // Dependent Variable is TXPIBT
Date: 04/12/00 Time: 09:57
c -2.720926 3.301798 -0.824074 0.4196

TXCAPITALT -0.784794 0.850856 -0.922359 0.3673
TXHPRI 0.089882 1.193014 0.075340 0.9407
EXPORT 0.833593 0.187637 4.442589 0.0003
QUALMACRO 0.158277 0.126514 2.256057 0.0235
TXCONSG 0.228345 0.136369 1.674461 0.1096
SECHER -2.895116 1.320838 -2.191878 0.0404

Date: 04/12/00 Time: 09:59
c -2.950043 3.524302 -0.837057 0.4125

TXCAPITALT -0.748602 0.834833 -0.896708 0.3805
TXHTOT 0.172464 1.186405 0.145367 0.8859
EXPORT 0.841151 0.174200 4.828663 0.0001
QUALMACRO 0.155936 0.123880 2.268769 0.0223
TXCONSG 0.240190 0.160761 1.494087 0.1508
SECHER -2.912350 1.323657 -2.200230 0.0397

Date: 04/12/00 Time: 10:03
Included observations: 27 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
C -2.729810 1.704731 -1.601315 0.1250

TXCAPITALT -0.547632 0.755168 -0.725179 0.4767
TXHREV 1.603431 0.774809 2.069454 0.0517
EXPORT 0.840460 0.157583 5.333437 0.0000
QUALMACRO 0.135861 0.112891 1.203466 0.0243
TXCONSG 0.196569 0.124485 1.579053 0.1300
SECHER -3.115274 1.203057 -2.589465 0.0175
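In output of this kind, the t-Statistic column is the estimated coefficient divided by its standard error. A minimal sketch of that check, using the EXPORT and SECHER rows of the last model above (the helper name `t_statistic` is hypothetical):

```python
def t_statistic(coefficient, std_error):
    """Ratio of an estimated coefficient to its standard error."""
    return coefficient / std_error


# (coefficient, standard error) pairs copied from the last estimation above.
rows = {
    "EXPORT": (0.840460, 0.157583),
    "SECHER": (-3.115274, 1.203057),
}
t_values = {name: t_statistic(c, se) for name, (c, se) in rows.items()}

for name, t in t_values.items():
    print(name, round(t, 4))
```

The same ratio can be used to cross-check any other row of the estimation tables against the reported t-Statistic and Prob. columns.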

[Figures: CUSUM stability test plots for the estimated models, with 5% significance bounds]
[Figure: CUSUM of Squares stability test plot, with 5% significance bounds]


I. EQUILIBRIUM AND THE LONG-RUN


But, as should be evident, long-run relations can be formulated in a theory-


II. ESTIMATION AND INFERENCE

II. I. Case of a Single Long-Run Relation

$\Delta y_t = \alpha - (1-\phi)\, y_{t-1} + \gamma x_{t-1} + u_t$, (5)


ut= (o(ue/o(6eJ)et+ 'th

0 = ly+ (que/ee)6 (I- p)]/ (I -0) (I I)


$\Delta x_t = b - (1-\rho)\, x_{t-1} + \sum_{i} \lambda_i \Delta x_{t-i} + \epsilon_t$. (13)

The authors consider the problem of testing y = yo when it is not known

and we have 0 = y only if p = i and/or -, = o. Therefore, if we do not know

in which xt is strictly exogenous for 0. Hence standard t- and F-tests applied to


11.2. Case of Multiple Long-Run Relations

where D, andR1, i = I, 2, ..., S-I, are k x k matrices of fixed coefficients.I have


Azt = do+ (Hdl) t- zt-1 + E vi Azt-1+vt, (i8)

where vt = (u', c'), Vi are m x m matrices of fixed coefficients, m = n+k, and


P'=(A-B-C -r). (20)

III. ESTIMATION OF LONG-RUN RELATIONS USING PANELS

One of the difficulties in establishing links between economic theory and

yit = ai+0iyi,t_1+yxiXt+&zt+ui, (21)


involving common forcing variables and/or social interactions."7 In what

x = EN x and estimates of the long-run relation based on a finite-order

IV. CONCLUDING REMARKS


Bernheim, D. (1994). 'A theory of conformity.' Journal of Political Economy, vol. 102, pp. 841-77.


FIGURE 1 The ARDL-Bounds Procedure’s Comprehensive Approach to Time-Series Analysis

the testing procedure. Independent variables that are Rewritten, it becomes.

FIGURE 2 Bounds Test Statistics

In addition to the F-test, a one-sided t-test may be
+ β_j x_{t−j} + ε_t. (11)

The key component to the ARDL-bounds procedure is the 15

FIGURE 3 Proportion of Monte Carlo Simulations (Falsely) Detecting Cointegration

Bounds Test Johansen BIC Johansen Rank Engle-Granger

T = 35, 1 X T = 35, 2 X T = 35, 3 X T = 35, 4 X

T = 50, 1 X T = 50, 2 X T = 50, 3 X T = 50, 4 X

T = 80, 1 X T = 80, 2 X T = 80, 3 X T = 80, 4 X

FIGURE 4 Proportion of Monte Carlo Simulations (Correctly) Detecting

35 Obs. 50 Obs. 80 Obs.

Johansen BIC Johansen BIC Johansen BIC

Johansen BIC Johansen Rank Bounds Engle-Granger

(1) (2) (3) (4) (5)

TABLE 3 Results of the ARDL-Bounds Model (Volscho and Kelly 2012)

Dr. Mounir BELLOUMI

Key words: FDI, trade, economic growth, ARDL cointegration, Tunisia.

Ln(Y) 0 -2.40* -3.53 0 -1.75* -3.19 -2.55*** -3.53

Ln(K) 1 -3.14* -3.53 1 -2.52* -3.19 -2.01** -2.94

Ln(L) 0 -2.18 -2.94 0 -0.92* -3.19 -2.20** -2.94

Ln(F) 0 -2.79 -2.94 0 -2.73 -1.95 -2.76** -2.94

Ln(T) 1 -3.17* -3.53 1 -2.58* -3.19 -2.53*** -3.53

Ln(Y) 0 -6.21 -2.94 1 -3.07 -1.95 -6.26** -2.94

Ln(K) 0 -3.73 -2.94 0 -3.62 -1.95 -3.24* -1.95

Ln(L) 0 -5.58 -2.94 0 -5.62 -1.95 -5.59** -2.94

Ln(T) 0 -4.87 -2.94 0 -4.89 -1.95 -4.47* -1.95