Chapter 5
12/07/2022 2
In general, therefore, a time series is a sequence of numerical data in
which each value is associated with a particular instant in time.
Univariate time-series analysis: analysis of a single sequence of data,
describing the behavior of one variable in terms of its own past values.
Example: autoregressive models:
ut = ρ ut-1 + εt   (first-order autoregressive, AR(1))
yt = ρ1 yt-1 + ρ2 yt-2 + εt   (second-order autoregressive, AR(2))
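For intuition, the two autoregressive models above can be simulated directly. The sketch below is Python rather than the course's Stata, with standard-normal shocks assumed for illustration:

```python
import random

def simulate_ar1(rho, n, seed=0):
    """Simulate u_t = rho * u_{t-1} + e_t with standard-normal shocks."""
    rng = random.Random(seed)
    u = [0.0]
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0, 1))
    return u

def simulate_ar2(rho1, rho2, n, seed=0):
    """Simulate y_t = rho1 * y_{t-1} + rho2 * y_{t-2} + e_t."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n - 2):
        y.append(rho1 * y[-1] + rho2 * y[-2] + rng.gauss(0, 1))
    return y

series = simulate_ar1(0.5, 200)
```

With |ρ| < 1 the AR(1) series fluctuates around zero; as ρ approaches 1 it behaves more and more like the unit-root processes discussed later.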
Analysis of several sets of data (variables) over the same sequence of
time periods is called multivariate time-series analysis.
Example: analysis of the relationships among, say, the price level, the
money supply and GDP on the basis of annually collected data.
The main purpose of time-series analysis is to study the dynamics or
temporal structure of the data.
Stationary and non-stationary stochastic processes
From a theoretical point of view, the collection of random variables Yt
ordered in time is called a stochastic process or random process.
There are two different classes of stochastic processes:
Stationary stochastic processes, which give rise to stationary time series.
Nonstationary stochastic processes, which give rise to nonstationary time
series.
Stationary Stochastic Processes
A stochastic process is said to be stationary if its mean and variance
are constant over time (do not depend on time or change as time changes),
and if the value of the covariance between two time periods depends only
on the lag between the two periods and not on the actual time at which it
is computed.
In the time-series literature, a stochastic process that satisfies these
conditions is known as weakly stationary, or covariance stationary.
Mathematically, the conditions are expressed as:
Mean:        E(Yt) = μ
Variance:    var(Yt) = E(Yt - μ)² = σ²
Covariance:  γk = E[(Yt - μ)(Yt+k - μ)]
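These conditions can be checked informally by comparing sample moments across sub-periods. A minimal Python sketch (illustrative only, not the slides' Stata workflow) contrasts stationary white noise with a random walk built from the same shocks:

```python
import random

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

rng = random.Random(42)
noise = [rng.gauss(0, 1) for _ in range(400)]   # stationary: constant mean/variance
walk = [0.0]                                    # random walk: nonstationary
for e in noise:
    walk.append(walk[-1] + e)

# For the stationary series the two halves have similar means and variances;
# for the random walk the sub-period moments typically drift apart.
m1, m2 = sample_mean(noise[:200]), sample_mean(noise[200:])
v1, v2 = sample_var(walk[:200]), sample_var(walk[200:])
```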
If a time series is not stationary as defined above, it is called a
non-stationary time series. In other words, a non-stationary time series
will have a time-varying mean, a time-varying variance, or both.
Stationarity is important in order to generalize to other time periods
and thus to conduct reliable forecasts.
If a time series is non-stationary, we can study its behavior only for
the time period under consideration (we cannot generalize the analysis to
other time periods).
Unit Root Process
Let the model be written as:
Yt = ρ Yt-1 + ut ,   -1 ≤ ρ ≤ 1
If ρ = 1, the model becomes a random walk and we face the unit root
problem: the series is nonstationary.
Integrated Stochastic Processes
If a time series variable is made stationary by first differencing, it is
termed integrated of order 1, denoted I(1).
Similarly, if a time series has to be differenced twice (i.e., the
difference of the first differences) to make it stationary, we call such
a time series integrated of order 2, denoted I(2).
In general, if a (nonstationary) time series has to be differenced d
times to make it stationary, that time series is said to be integrated of
order d, denoted I(d). Example: Yt ~ I(d).
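As an illustration of integration order, the Python sketch below (illustrative, with an assumed data-generating process) builds a random walk, which is I(1), and verifies that its first difference recovers the stationary shocks:

```python
import random

def diff(xs):
    """First difference: dx_t = x_t - x_{t-1}."""
    return [b - a for a, b in zip(xs, xs[1:])]

rng = random.Random(1)
shocks = [rng.gauss(0, 1) for _ in range(300)]   # stationary, I(0)

# Build an I(1) series (random walk): y_t = y_{t-1} + e_t
y = [0.0]
for e in shocks:
    y.append(y[-1] + e)

# Differencing once undoes the accumulation, so dy is stationary
dy = diff(y)
```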
If a time series Yt is stationary to begin with, i.e., it is stationary
at levels (it does not require any differencing), it is said to be
integrated of order zero, I(0).
Most economic time series are generally I(1); that is, they generally
become stationary only after taking their first differences.
Spurious regression
This is the situation in which two variables that have no theoretical
reason to be correlated (r = 0) may nevertheless yield a statistically
significant coefficient.
Suppose we regress Yt on Xt. Since Yt and Xt are uncorrelated I(1)
processes, the R² from the regression of Y on X should tend to zero; that
is, there should be no relationship between the two variables.
But if you run the regression, you may find that the coefficient of X is
highly statistically significant, even though the R² value is low.
This is, in a nutshell, the phenomenon of spurious or nonsense
regression. According to Granger and Newbold, R² > d (where d is the
Durbin-Watson statistic) is a good rule of thumb for suspecting that the
estimated regression is spurious.
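The Granger-Newbold phenomenon is easy to reproduce by simulation. The Python sketch below (illustrative; all names are hypothetical) regresses one simulated random walk on an independent one and computes R² and the Durbin-Watson statistic d:

```python
import random

def ols(y, x):
    """OLS of y on a constant and x; returns (intercept, slope, R2, DW)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in resid)
    sst = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ssr / sst
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / ssr
    return a, b, r2, dw

# Two independent random walks: uncorrelated I(1) processes
rng = random.Random(7)
x, y = [0.0], [0.0]
for _ in range(500):
    x.append(x[-1] + rng.gauss(0, 1))
    y.append(y[-1] + rng.gauss(0, 1))

a, b, r2, dw = ols(y, x)
# Granger-Newbold rule of thumb: suspect a spurious regression when R2 > d
suspicious = r2 > dw
```

Because the residuals inherit the random-walk behavior, d tends to be close to zero in such regressions, which is exactly what the rule of thumb exploits.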
Tests of stationarity
By now you understand the nature of stationary and non-stationary
stochastic processes and their importance.
In practice we face two important questions:
1. How do we know that a given time series is stationary?
2. If it is not stationary, how can we make it stationary?
Although there are several tests of stationarity (non-stationarity),
researchers commonly use the following in the literature:
1. The graphical method
2. The unit root test (ADF test)
1. Graphical Analysis
Before conducting formal tests, it is always advisable to plot the time
series under study, because such a plot gives an initial clue about the
likely nature of the series.
This may suggest that the GDP series is not stationary. Such an intuitive
feeling is the starting point of more formal tests of stationarity.
The unit root test
(Augmented Dickey-Fuller (ADF) Test)
In conducting the usual DF test, it is assumed that the error term ut is
uncorrelated with its lags.
But in case the ut are correlated, Dickey and Fuller have developed a
test known as the augmented Dickey-Fuller (ADF) test.
This test is conducted by 'augmenting' the preceding equations with
lagged values of the dependent variable ΔYt.
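A minimal sketch of the ADF regression itself, assuming a constant, a trend and one lagged difference (pure Python, for intuition only; in practice one uses Stata's dfuller as shown on the next slides):

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def adf_stat(y, lags=1):
    """t-statistic on y_{t-1} in: dy_t = b1 + b2*t + d*y_{t-1} + sum a_i*dy_{t-i} + e_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    rows, targets = [], []
    for t in range(lags, len(dy)):
        # regressors: constant, trend, lagged level, lagged differences
        rows.append([1.0, float(t), y[t]] + [dy[t - j] for j in range(1, lags + 1)])
        targets.append(dy[t])
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * d for r, d in zip(rows, targets)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [d - sum(b * xi for b, xi in zip(beta, r)) for r, d in zip(rows, targets)]
    s2 = sum(e * e for e in resid) / (len(rows) - k)
    e2 = [0.0] * k
    e2[2] = 1.0                      # pick out the y_{t-1} coefficient
    se = (s2 * solve(XtX, e2)[2]) ** 0.5
    return beta[2] / se

# Apply to a simulated random walk (a unit-root process)
rng = random.Random(3)
walk = [0.0]
for _ in range(200):
    walk.append(walk[-1] + rng.gauss(0, 1))
tau = adf_stat(walk, lags=1)
```

The resulting tau statistic must be compared against Dickey-Fuller critical values, not the standard t-distribution, which is why dedicated commands like dfuller are used in practice.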
To be specific, the ADF test is based on the regression:
ΔYt = β1 + β2 t + δ Yt-1 + Σ αi ΔYt-i + εt
where the number of lagged difference terms is chosen so that the error
term is serially uncorrelated.
Stata commands for the ADF test
We use the following command to conduct the unit root test for GDP:
dfuller gdp, trend lags(1)
Or, if you want the regression to be displayed:
dfuller gdp, trend regress lags(1)
We use a similar test for consumption:
dfuller consumption, trend lags(1)
Or, if you want the regression to be displayed:
dfuller consumption, trend regress lags(1)
The command regresses the dependent variable on its own one-year lag and
first difference.
If the series is not stationary at levels, you check it at first
difference. Apply the command:
dfuller D.gdp, trend regress lags(1)
For lag selection:
varsoc GDP MONEY, maxlag(4)
Note: the asterisk (*) in the results table indicates the appropriate lag
to be selected.
Illustrative example
Use the Stata time-series data on GDP and electricity consumption for a
country during 1958-1994. The file name is 'TIME SERIES'.
First declare that your data are a time series using the command:
tsset year
Next, to visually detect whether the data are trended (non-stationary),
plot both variables with the command:
twoway (tsline gdp) (tsline consumption)
We test the null hypothesis that the variable under consideration is a
unit root process.
The graph is given below.
[Figure: time-series plots of gdp and consumption (vertical axis scale 0
to 1500).]
ADF test result for GDP (in levels)

. dfuller gdp, trend regress lags(1)

                         Interpolated Dickey-Fuller
        Test statistic   1% critical value   5% critical value   10% critical value

                  Coef.    Std. Err.      t     P>|t|    [95% Conf. Interval]
gdp
  L1.         -.1655593     .1284583   -1.29    0.207    -.4275517     .096433
  LD.         -.5189601     .1501446   -3.46    0.002     -.825182   -.2127382
_trend          5.70084     3.568377    1.60    0.120    -1.576914    12.97859
_cons          62.43344     35.29034    1.77    0.087    -9.541691    134.4086
Decision and conclusion: we can see from the table that the ADF test
statistic in absolute value is 1.289, which is less than the critical
values at the 1%, 5% and also the 10% level.
This tells us that GDP is a unit root process (i.e., it is a
nonstationary process).
Since GDP is nonstationary (a unit root process) at levels, we take the
first difference and check whether the unit root problem is resolved.
ADF test at first difference

. dfuller Dgdp, trend regress lags(1)

                         Interpolated Dickey-Fuller
        Test statistic   1% critical value   5% critical value   10% critical value

                  Coef.    Std. Err.      t     P>|t|    [95% Conf. Interval]
Dgdp
  L1.         -2.008074     .3139025   -6.40    0.000    -2.649149      -1.367
  LD.          .2564326     .1744521    1.47    0.152    -.0998461    .6127114
_trend         1.503518     .7839002    1.92    0.065    -.0974196    3.104456
_cons          28.13491     16.20987    1.74    0.093    -4.970053    61.23987
We can see from the table that at first difference the ADF test statistic
is -6.379, which is larger in absolute value than the critical values at
the 1, 5 and 10% levels.
Thus we reject the null hypothesis and conclude that GDP at first
difference is stationary. That is, first differencing removed the unit
root problem. GDP is therefore said to be integrated of order 1, i.e.,
GDP ~ I(1).
We can run a similar test for consumption.
Long-run time series analysis
6.1. Definition and concept of co-integration
6.2. Test for co-integration
6.3. Error correction models
6.4. Autoregressive distributed lag model
6.5. Application Using Stata
Cointegration and the VEC Model
Cointegration: a concept used to indicate the existence of a long-run
relationship between two variables.
A VEC model is estimated to examine the long-run relationship between the
variables and to see the short-run dynamics of error adjustment whenever
disequilibrium occurs.
To clarify the concept, let us consider the following very simple form of
the import function (with no constant):
Mt = β Yt + ut
First check for the existence of cointegration between the variables
using the following Engle-Granger two-step procedure.
For the variables to be cointegrated, the two variables m and y must be
integrated of order 1, I(1), and the error term derived from their linear
relationship (with no constant) must be integrated of order 0, i.e.,
stationary at levels, I(0).
These conditions are checked against the null hypothesis of no
cointegration.
But first look at the time plots of both variables shown below; we
observe that both series are clearly nonstationary.
First, reg Mt Yt, noconst;
Then, predict and save the residual et from the regression;
Finally, use the saved residual in the auxiliary regression shown below.
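The two steps can be sketched end-to-end on simulated data. The Python sketch below (illustrative; the data-generating process is an assumption) builds a cointegrated pair, runs the no-constant regression, and roughly checks the residuals for stationarity:

```python
import random

rng = random.Random(11)

# Assumed DGP: y is a random walk (I(1)) and m = 2*y + stationary noise,
# so m and y are cointegrated: m - 2y is I(0).
y = [0.0]
for _ in range(300):
    y.append(y[-1] + rng.gauss(0, 1))
m = [2.0 * yi + rng.gauss(0, 0.5) for yi in y]

# Step 1: regress m on y with no constant and save the residuals e_t
beta = sum(a * b for a, b in zip(m, y)) / sum(b * b for b in y)
e = [mi - beta * yi for mi, yi in zip(m, y)]

# Step 2: the residuals should be stationary. As a rough check, the AR(1)
# coefficient of e_t should be well below 1; a formal test would apply an
# ADF test to e_t, as the slides do with dfuller in Stata.
num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
den = sum(ei * ei for ei in e[:-1])
rho_hat = num / den
```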
This last formulation has a simple yet interesting economic
interpretation: it presupposes that some variable M has an equilibrium
path defined by the long-run (cointegrating) relationship between M and Y.
In other words, for simplicity, we may write the model as:
Interpretation
This is a short-run model for imports, and the model as judged by the
F-test (p-value < 0.05) is adequate.
In the short-run equation of imports, income has no significant impact,
but the error-correction term, i.e., the coefficient of RES_(-1), is
significant. The coefficient of the error-correction term is -0.27. This
tells us that about 27 percent of the short-run disequilibrium is
adjusted toward equilibrium within a year.
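A 27 percent adjustment speed implies a geometric decay of any disequilibrium; a small arithmetic sketch:

```python
# If 27% of the disequilibrium is corrected each year, the gap remaining
# after k years shrinks geometrically: gap_k = gap_0 * (1 - 0.27)**k
speed = 0.27
gap = 1.0                     # initial disequilibrium, normalized to 1
path = []
for year in range(1, 6):
    gap *= (1 - speed)
    path.append(round(gap, 3))
# path traces the remaining fraction of the gap over five years
```

So less than half of the original gap remains after three years, even though only about a quarter is corrected in any single year.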
Johansen cointegration and the VEC Model
Steps summarized
1. Specify the model correctly: specify the VAR in differences (all terms
should carry difference operators) and include the ECT with its
adjustment parameter.
2. Declare the data as time series.
3. Conduct stationarity tests: all variables must be I(1), not I(2).
4. Determine the optimal lag length.
5. Perform the Johansen cointegration test with p lags: Statistics -
Multivariate time series - Cointegrating rank of a VEC - list the
variables. You can use both the trace statistic and the
maximum-eigenvalue criterion.
6. If there is no cointegration, estimate an unrestricted VAR.
7. If there is cointegration, estimate a VECM with p lags (the model is
estimated with p-1 lags).
8. Perform diagnostics.
Data for exercise
Open the data on New South Africa and declare it as time series.
Conduct stationarity tests for the variables gdppc, gds, gcf, import and
remittance; they must be I(1).
Determine the optimal lag length using the command:
varsoc gdppc grossdosaving import gcf remittance , maxlag(4)
It will produce a lag length of 4 overall (see below).
. varsoc gdppc grossdosaving import gcf remittance , maxlag(4)

(The table reported on this slide is the Johansen cointegrating-rank
(trace) output rather than the selection-order criteria.)

Sample: 1975 - 2019                         Number of obs = 45

maximum                                        trace    5% critical
   rank    parms        LL       eigenvalue  statistic     value
      0       80    -358.60404       .        126.7954     68.52
      1       89    -334.74897    0.65362      79.0853     47.21
      2       96    -313.76001    0.60657      37.1074     29.68
      3      101    -300.89772    0.43541      11.3828*    15.41
      4      104    -295.67501    0.20715       0.9374      3.76
      5      105    -295.20633    0.02061
Interpreting the cointegration test result
Each maximum rank corresponds to a null hypothesis. For example, rank = 0
says that there are zero cointegrating equations in the model, and we
reject this null hypothesis if the trace statistic is greater than the
critical value.
Similarly, rank = 1 is the null hypothesis that there is at most 1
cointegrating equation in the model, which we reject if the trace
statistic is greater than the critical value.
In our case the trace statistic falls below the critical value at rank 3,
so we cannot reject that null hypothesis, and we conclude that there are
3 cointegrating equations, as indicated by the rank.
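The sequential decision rule can be written out explicitly, using the trace statistics and 5% critical values from the table above:

```python
# Sequential trace-test decision: starting from rank 0, reject the null of
# "at most r cointegrating equations" while the trace statistic exceeds
# the 5% critical value; stop at the first non-rejection.
# Values taken from the cointegrating-rank table in the slides.
trace_stats = [126.7954, 79.0853, 37.1074, 11.3828, 0.9374]
crit_5pct   = [68.52, 47.21, 29.68, 15.41, 3.76]

def selected_rank(trace, crit):
    for r, (t, c) in enumerate(zip(trace, crit)):
        if t < c:          # fail to reject: r cointegrating equations
            return r
    return len(trace)

rank = selected_rank(trace_stats, crit_5pct)
```

Running the rule on the table's values stops at rank 3, matching the starred row in the output.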
Estimating the VECM
Statistics - Multivariate time series - Vector error-correction model
(VECM) - enter all the variables in the dialogue box - 1 cointegrating
equation, 2 lags (Stata will estimate it with p-1 lags).
Or use the following command:
vec gdppc grossdosaving gcf import remittance, trend(constant)
The second table reports the short-run coefficients for each dependent
variable: the error-correction coefficient and the coefficients of the
other variables.
_ce1 is the speed-of-adjustment coefficient (since there are 5 variables
we will have 5 error-correction terms, one in each equation).
The last table from the VEC model (the Johansen normalization restriction
equation) is the long-run equation.

Johansen normalization restriction imposed

_ce1                  Coef.    Std. Err.      z     P>|z|    [95% Conf. Interval]
gdppc                     1          .         .      .           .           .
grossdosaving     -271.2405     95.96211   -2.83    0.005    -459.3228   -83.15821
gcf                511.9089     115.1419    4.45    0.000     286.2349    737.5829
import            -526.7451     81.35322   -6.47    0.000    -686.1945   -367.2958
remittance         17469.23     4510.614    3.87    0.000     8628.593    26309.87
_cons             -2093.906          .         .      .           .           .
In the model, a restriction of 1 is imposed on the target (dependent)
variable. The error-correction term is generated from this long-run
equation.
For interpretation of the long-run equation you must reverse the sign:
a coefficient reported with a negative sign is interpreted as a positive
effect, and vice versa.
If two coefficients have opposite signs, we say they have asymmetric
effects on the dependent variable, other things kept constant.
For example, other things constant, gcf and import have significant
asymmetric effects on gdppc.
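The sign-reversal rule can be sketched with the _ce1 coefficients from the table above (Python, for illustration):

```python
# The normalized cointegrating equation has the form
#   gdppc + c1*grossdosaving + c2*gcf + ... = constant,
# so moving the regressors to the right-hand side flips each sign.
# Coefficients below are from the _ce1 table in the slides.
normalized = {
    "grossdosaving": -271.2405,
    "gcf": 511.9089,
    "import": -526.7451,
    "remittance": 17469.23,
}

# Long-run effects on gdppc: the reported sign is reversed
long_run = {name: -coef for name, coef in normalized.items()}
```

So, for example, gcf (reported as +511.9) has a negative long-run effect on gdppc, while import (reported as -526.7) has a positive one: the asymmetric pair noted above.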
Interpret the speed of adjustment (the coefficient of ce1) for the target
variable only.
Although not statistically significant, the adjustment coefficient (on
ce1 in the second table of the model) is -0.0150375, implying that the
previous year's error (the disequilibrium) is corrected within the
current year at an average speed of 1.5%, which is very small in
magnitude and insignificant.
Postestimation tests
For residual autocorrelation:
Statistics - Multivariate time series - VEC diagnostics and tests - LM
test for residual autocorrelation - lag - active vec results - OK.
But first run:
vec gdppc grossdosaving gcf import remittance, trend(constant) lags(1)
In this case the null hypothesis is 'no autocorrelation at lag order'.
You can also conduct normality and stability tests by the same procedure.
Normality test of the residuals
Statistics - Multivariate time series - VEC diagnostics and tests - Test
for normally distributed disturbances - tick Jarque-Bera.
Null hypothesis: the residuals are normally distributed.
Alternative: not normally distributed.
For the stability test use:
vecstable
And you will get:
Eigenvalue stability condition
The VECM specification imposes 4 unit moduli.
Autoregressive Distributed Lag (ARDL)
The Autoregressive Distributed Lag (ARDL) or bounds cointegration
technique is one of the most commonly used approaches for analysing
long-run relationships.
Unlike other techniques, its application does not require pre-tests for
unit roots.
Consequently, the ARDL cointegration technique is preferable when dealing
with variables that are integrated of different orders, I(0), I(1), or a
combination of both.
The long-run relationship among the underlying variables is detected
through the F-statistic (Wald test).
The existence of a long-run (cointegrating) relationship can be tested
based on the EC representation.
A bounds testing procedure is available to draw conclusive inference
without knowing whether the variables are integrated of order zero or
one, I(0) or I(1), respectively.
As the name indicates, ARDL is the combination of the Distributed Lag
(DL) and Autoregressive (AR) models.
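The EC representation can be made explicit in the simplest case, an ARDL(1,1) model; the derivation below is a standard textbook reconstruction (the slide's own equation is not preserved):

```text
ARDL(1,1):   Yt = α + φ Yt-1 + β0 Xt + β1 Xt-1 + εt

Subtract Yt-1 from both sides and add and subtract β0 Xt-1:

ΔYt = β0 ΔXt - (1-φ) [ Yt-1 - α/(1-φ) - ((β0+β1)/(1-φ)) Xt-1 ] + εt
```

The term in square brackets is the disequilibrium error, and 1-φ is the adjustment parameter.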
It says that the change in Yt is due to the current change in Xt plus an
error-correction term:
if the 'disequilibrium error' in the square brackets is positive, then a
'go to equilibrium' mechanism generates additional negative adjustment
in Yt.
The speed of adjustment is determined by 1-φ, the adjustment parameter.
Note that the stability assumption ensures that 0 < (1-φ) < 1; therefore
only part of any disequilibrium is made up for in the current period.
ARDL Model
Commands to run ARDL (use import as the dependent variable):
ardl import gdppc grossdosaving fdi remittance, lags(2 0 0 0 0)
Check the lag length for each variable one by one:
varsoc gdppc
Applying the selected lag lengths, re-estimate:
ardl gdppc saving import remittance, lags( )
For the ECM we include ec after the command:
ardl gdppc saving import remittance, lags( ) ec
After ardl you may run a normality test:
predict myresiduals, r
sktest myresiduals
D.import              Coef.    Std. Err.      t     P>|t|    [95% Conf. Interval]
ADJ
import
  L1.            -.9409866     .1504486   -6.25    0.000    -1.245055   -.6369187
LR
gdppc              .004162     .0008492    4.90    0.000     .0024457    .0058783
grossdosaving     .2190234      .090265    2.43    0.020     .0365911    .4014558
fdi               .5942854     .3411137    1.74    0.089     -.095131    1.283702
remittance        18.62005     8.063735    2.31    0.026     2.322633    34.91747
SR
import
  LD.              .336468     .1324769    2.54    0.015     .0687223    .6042137
_cons            -9.622943     3.847053   -2.50    0.017    -17.39813   -1.847759
H0: no level relationship               F =  8.695
Case 3                                  t = -6.255

        10%              5%              1%            p-value
    I(0)   I(1)     I(0)   I(1)     I(0)   I(1)     I(0)   I(1)

Do not reject H0 if both F and t are closer to zero than the critical
values for I(0) variables (if p-values > desired level for I(0)
variables).
Reject H0 if both F and t are more extreme than the critical values for
I(1) variables (if p-values < desired level for I(1) variables).
. ardl import gdppc grossdosaving fdi remittance, lags(1 2 1 4 3) ec

ARDL(1,2,1,4,3) regression

                      Coef.    Std. Err.      t     P>|t|    [95% Conf. Interval]
ADJ
import
  L1.            -.5392366     .1601595   -3.37    0.002    -.8667997   -.2116736
LR
gdppc             .0061927     .0019726    3.14    0.004     .0021583    .0102272
grossdosaving    -.0382086      .189775   -0.20    0.842    -.4263421    .3499248
fdi               1.592295     1.457257    1.09    0.284     -1.38813    4.572719
remittance       -18.82432     25.36777   -0.74    0.464    -70.70722    33.05859
SR
gdppc
H0: no level relationship               F =  3.163
Case 3                                  t = -3.367

        10%              5%              1%            p-value
    I(0)   I(1)     I(0)   I(1)     I(0)   I(1)     I(0)   I(1)

Do not reject H0 if both F and t are closer to zero than the critical
values for I(0) variables (if p-values > desired level for I(0)
variables).
Reject H0 if both F and t are more extreme than the critical values for
I(1) variables (if p-values < desired level for I(1) variables).
Bounds test for a long-run relationship after the ECM
After running ardl with error correction, to confirm the existence of a
long-run relationship we run the bounds test as follows:
estat btest or estat ectest
where 'estat btest' and 'estat ectest' stand for the bounds test. A
long-run cointegrating relationship is possible if the F statistic is
above the critical value.
Note: if the F-statistic is greater than the upper critical value at the
5% level of significance, we reject the null hypothesis and conclude that
there is a long-run relationship; if it is less than the lower critical
value at 5%, there is no long-run relationship; if it falls in between,
the result is inconclusive.
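The three-way decision rule can be sketched as follows (Python, for illustration; the 5% bounds used here are illustrative placeholders; read the actual I(0)/I(1) bounds from the estat btest output):

```python
# Bounds-test decision rule: compare the F statistic with the lower I(0)
# and upper I(1) critical bounds.
def bounds_decision(f_stat, lower_i0, upper_i1):
    if f_stat > upper_i1:
        return "reject H0: long-run relationship exists"
    if f_stat < lower_i0:
        return "do not reject H0: no long-run relationship"
    return "inconclusive"

# F = 8.695 is the statistic reported in the slides; the bounds 2.86 and
# 4.01 are assumed 5% values for illustration only.
decision = bounds_decision(8.695, 2.86, 4.01)
```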