

Model Adequacy in Econometrics

Vijayamohanan Pillai N.
Rjumohan A.

20 September 2021


Multiple Regression Analysis

General form of the multiple linear regression model:

y_i = f(x_i2, x_i3, ..., x_ik) + u_i

Define column vectors of n observations on y and on the k − 1 explanatory variables, and a vector of k parameters:

y_i = β_1 + β_2 x_i2 + β_3 x_i3 + ... + β_k x_ik + u_i

Multiple Regression Analysis

y_i = β_1 + β_2 x_i2 + β_3 x_i3 + ... + β_k x_ik + u_i,   i = 1, …, n

Written out observation by observation:

y_1 = β_1 + β_2 x_12 + β_3 x_13 + ... + β_k x_1k + u_1
y_2 = β_1 + β_2 x_22 + β_3 x_23 + ... + β_k x_2k + u_2
y_3 = β_1 + β_2 x_32 + β_3 x_33 + ... + β_k x_3k + u_3
……………………….
y_n = β_1 + β_2 x_n2 + β_3 x_n3 + ... + β_k x_nk + u_n


Multiple Regression Analysis

To estimate the unknown parameters (the βs) using observed data on y_i and the x_ij s:

Ordinary Least Squares, or OLS.
Sample Regression and Residuals

Consider a simple (bivariate) regression for i = 1, 2, …, n:

y_i = α + β x_i + u_i

For this population relationship, the estimated regression is:

ŷ_i = α̂ + β̂ x_i

so that the estimate of the error term u_i (the residual) is:

û_i = y_i − (α̂ + β̂ x_i)


Gauss-Markov Theorem

• Of the class of linear unbiased estimators, the OLS estimators have the smallest variance.
• Least squares estimators are BLUE:
  • Best
  • Linear
  • Unbiased
  • Estimators
subject to certain assumptions.

Assumptions of OLS Regression

An acronym, NOLINE:
• Non-stochastic X
• Orthogonal X and error
• Linearity
• Independence of errors
• Normality of errors
• Equal variance (homoscedasticity)


Assumptions of OLS Regression

Assumptions on u:
• Zero mean
• The disturbances are normally distributed
• The variance parameters in the variance-covariance matrix are the same (homoscedasticity = no heteroscedasticity)
• The disturbance terms are not correlated (no autocorrelation; independence of errors)
• (Hence spherical errors)
Assumptions of OLS Regression

A (further) assumption on u: no ARCH effect.
ARCH = Auto-Regressive Conditional Heteroscedasticity → autocorrelated volatility.


Assumptions of OLS Regression

Assumptions on X:
• Non-stochastic X
• No perfect linear dependence between two or more explanatory variables (no perfect multicollinearity)

Assumptions of OLS Regression

Assumption on X and u:
• The explanatory variables and the disturbance term are not correlated
  (orthogonal X and error: no endogeneity bias)


Assumptions of OLS Regression

General assumption:
• No specification error –
  e.g. exclusion of relevant variables;
  inclusion of irrelevant variables;
  incorrect functional form;
  measurement error
Model Adequacy Tests

Residual Analysis


Residual Analysis

• The residual for observation i, û_i, is the difference between its observed and predicted value:

û_i = Y_i − Ŷ_i

• Check the assumptions of regression by examining the residuals.
• Graphical analysis of residuals.
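As a concrete illustration of the residual computation above, here is a minimal Python sketch (assuming a pandas DataFrame `df` with hypothetical columns `y`, `x2`, `x3`) that fits an OLS regression with statsmodels and extracts the residuals for graphical analysis:

```python
import statsmodels.api as sm
import matplotlib.pyplot as plt

# df is assumed to hold the observed data (hypothetical column names)
X = sm.add_constant(df[["x2", "x3"]])   # add the intercept column
model = sm.OLS(df["y"], X).fit()

resid = model.resid          # u-hat_i = Y_i - Y-hat_i
fitted = model.fittedvalues  # Y-hat_i

# Graphical analysis: residuals against fitted values
plt.scatter(fitted, resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values"); plt.ylabel("Residuals")
plt.show()
```

The same `model`, `X` and `resid` objects are reused in the later sketches.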

Model adequacy diagnosis:

An important stage before hypothesis testing in forecast modelling.

An important stage in time series (ARIMA) modelling, still followed to some extent.

Otherwise, unfortunately, most econometric practitioners ignore it, knowingly or unknowingly, giving no clues about the validity of their results – an undesirable practice that must be avoided.


A gentle reminder of model adequacy diagnosis:

[Cover image: CDS Working Paper 312 (2001)]

Model adequacy:

The fitted model is said to be adequate if it explains the data set adequately,

i.e., if the residuals do not contain (or conceal) any 'explainable non-randomness' left over from the ('explained') model;

i.e., if the residuals are purely random / white noise;

i.e., if all the OLS assumptions are satisfied.


Residual Analysis for Linearity

[Figure: residuals plotted against x – a curved, systematic pattern indicates a non-linear relationship; a patternless scatter around zero indicates linearity.]

Residual Analysis for Independence

[Figure: residuals plotted against X – a systematic (e.g. cyclical) pattern indicates the errors are not independent; a random scatter indicates independence.]


Normally distributed errors: Definition

Example where the residuals are not NID(0, σ²):

Normality Tests
Assumption   Value     Probability   Decision (5%)
Skewness     5.1766    0.000000      Rejected
Kurtosis     4.6390    0.000004      Rejected

[Figure: histogram of the residuals of rate90, markedly non-normal in shape.]

Non-normally distributed errors: Implications

If the residuals are not normally distributed, then the estimators of α and β are also not normally distributed.

Estimates are, however, still BLUE: they are unbiased and have minimum variance.

If residuals are non-normal, it is only our hypothesis tests (inference) that are affected.


Non-normally distributed errors: Causes

• Generally caused by a specification error.
• Usually an omitted variable.
• Can also result from:
  – Outliers in data.
  – Wrong functional form.

Non-Normality Tests: Residual Analysis for Normality

A normal probability plot of the residuals can be used to check for normality: if the errors are normal, the plotted points are reasonably linear.

[Figure: normal probability plot with points lying close to the comparison line.]


Non-Normality Tests: Residual Analysis for Normality

Positively skewed residual distribution: the plotted points lie above the comparison line on both tails of the distribution.

[Figure: normal probability plot for a positively skewed distribution.]

Non-Normality Tests: Residual Analysis for Normality

Heavy-tailed residual distribution: the plotted points in the upper tail lie above the comparison line and those in the lower tail below the line.

[Figure: normal probability plot for a heavy-tailed distribution.]


Non-normally distributed errors: Tests for non-normality

• Jarque-Bera test
  – This test examines both the skewness and kurtosis of a distribution to test for normality:

    JB = n [ S²/6 + (K − 3)²/24 ]

  – where S is the skewness and K is the kurtosis of the residuals.
  – JB has a χ² distribution with 2 df.
  – H0: S = 0 and K = 3 (residuals normal).
  – If the estimated JB is near zero and the p-value > 0.05, do not reject H0.
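A minimal Python sketch of the Jarque-Bera statistic as defined above, computed directly from the residual vector (here `resid`, from the earlier OLS fit, is assumed):

```python
import numpy as np
from scipy import stats

def jarque_bera(resid):
    """Compute JB = n*(S^2/6 + (K-3)^2/24) and its chi-square(2) p-value."""
    resid = np.asarray(resid)
    n = resid.size
    S = stats.skew(resid)                    # sample skewness
    K = stats.kurtosis(resid, fisher=False)  # sample kurtosis (normal = 3)
    jb = n * (S**2 / 6 + (K - 3)**2 / 24)
    p_value = stats.chi2.sf(jb, df=2)        # JB ~ chi-square with 2 df under H0
    return jb, p_value

jb, p = jarque_bera(resid)
print(f"JB = {jb:.3f}, p-value = {p:.4f}")   # p > 0.05: do not reject normality
```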

Non-normally distributed errors: Remedies

• Try to modify your theory.
• Omitted variable?
• Outlier needing specification?
• Modify your functional form by taking some variance-transforming step such as square root, exponentiation, logs, etc.


Outliers?

Can be identified using a box plot (box-and-whiskers plot).

Boxplot: a five-number summary:
the minimum, the maximum, the sample median, and the first and third quartiles.

• Minimum (Q0 or 0th percentile): the lowest data point excluding (any) outliers.
• Maximum (Q4 or 100th percentile): the largest data point excluding (any) outliers.
• Median (second quartile, Q2 or 50th percentile): the middle value of the dataset.
• First quartile (Q1 or 25th percentile): also known as the lower quartile qn(0.25), the median of the lower half of the dataset.
• Third quartile (Q3 or 75th percentile): also known as the upper quartile qn(0.75), the median of the upper half of the dataset.


Boxplot: Structure

Interquartile range (IQR): the distance between the upper (Q3) and the lower (Q1) quartile: IQR = Q3 − Q1.


Boxplot and the probability density function (pdf) of a normal N(0,1) population

[Figure: boxplot aligned with the standard normal pdf, showing Q1, the median, Q3 and the whiskers.]
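A short Python sketch of the usual boxplot arithmetic: quartiles, the IQR, and the conventional 1.5×IQR whisker fences used to flag outliers (the series `x` below is purely hypothetical data):

```python
import numpy as np

x = np.array([2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.6, 9.8])  # hypothetical data

q1, q2, q3 = np.percentile(x, [25, 50, 75])
iqr = q3 - q1                      # interquartile range
lower_fence = q1 - 1.5 * iqr       # conventional whisker limits
upper_fence = q3 + 1.5 * iqr

outliers = x[(x < lower_fence) | (x > upper_fence)]
print(f"Q1={q1:.2f}, median={q2:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
print("Outliers:", outliers)       # here 9.8 is flagged as an outlier
```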

Boxplots showing skewness

[Figure: three boxplots illustrating a symmetric (normal) distribution, positive skewness, and negative skewness.]


Boxplot from Gretl (1): Two different groups

If two boxes do not overlap each other, say, box A is completely above (or below) box B, as in this plot, then there is a difference between the two groups.

[Figure: boxplots of Class A Scores and Class B Scores with no overlap.]


Boxplot from Gretl (2): Likely to be different groups

If two boxes overlap each other and the median line of box A lies outside of box C entirely, then there is likely to be a difference between the two groups, as in this plot.

[Figure: boxplots of Class A Scores and Class C Scores, overlapping but with non-overlapping medians.]


Boxplot from Gretl (3): Variability of data

Wider ranges (whisker length, box size) indicate more variable data.

[Figure: boxplots of Class A, Class B and Class C Scores with differing spreads.]

Model Adequacy: Multicollinearity: Definition

Multicollinearity is the condition where the independent variables are related to each other. As any two (or more) variables become more and more closely correlated, the condition worsens and 'approaches singularity'.

Since the X's are supposed to be fixed, this is a sample problem.

Since multicollinearity is almost always present, it is a problem of degree, not merely of existence.


Multicollinearity: Implications

Three cases:

A) No multicollinearity
X1 ⊥ X2: orthogonal → Cov(X1, X2) = R²₁.₂ = 0.
The regression would appear to be identical to separate bivariate regressions: both coefficients and variances.

Multicollinearity: Implications

B) Perfect multicollinearity

Given X = (x1, x2, …, xk), where xi is the i-th column of X with n observations:
for example, if x1 = k·x2 (k ≠ 0), then the variables x1 and x2 are exactly linearly related.
In this case we are not able to estimate the parameters (singularity).


Multicollinearity: Implications

C) Imperfect or near multicollinearity: the more common problem

Two or more of the explanatory variables are approximately linearly related.
We are able to estimate the parameters, and the estimates are unbiased, but inefficient.
The higher the correlation r, the greater the prevalence of multicollinearity.

Multicollinearity: Implications

• If the independent variables are highly correlated, var(β̂_k) is inflated:
  → t ratios are lower
  → insignificant β̂_k
• R² tends to be high as well → significant F.
• Sign changes occur with the introduction of a new variable.
• The β̂_k are still unbiased.


Multicollinearity: Tests/Indicators

A diagnostic statistic: the variance inflation factor (VIF):

VIF_k = 1 / (1 − R²_k)

where R²_k is the R² from regressing X_k on the other explanatory variables.

"The most useful single diagnostic guide" – J. Johnston (1984)
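A minimal Python sketch of the VIF computation as defined above, running each auxiliary regression of one regressor on the others (the regressor matrix `X0`, without the constant column, is a hypothetical name for the explanatory-variable data):

```python
import numpy as np
import statsmodels.api as sm

def vif(X0):
    """Variance inflation factor for each column of X0.

    X0: 2-D array (n observations x k regressors, no constant column).
    VIF_k = 1 / (1 - R^2_k), with R^2_k from regressing column k on the rest.
    """
    X0 = np.asarray(X0, dtype=float)
    vifs = []
    for k in range(X0.shape[1]):
        y_k = X0[:, k]
        others = sm.add_constant(np.delete(X0, k, axis=1))
        r2 = sm.OLS(y_k, others).fit().rsquared
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

print(vif(X0))   # values above 10 suggest severe multicollinearity
```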

Interpreting VIFs

No multicollinearity → VIF = 1.

If the VIF is greater than 10, then multicollinearity is probably severe:
90% of the variance of X_j is explained by the other Xs.

In small samples, a VIF of about 5 may already indicate problems.


Multicollinearity: Tests/Indicators (cont.)

• Also Tolerance:

TOL_k = 1 / VIF_k = (1 − R²_k)

• If the tolerance equals 1, the variables are unrelated.
• If TOL_j = 0, then they are perfectly correlated.

Multicollinearity: Tests/Indicators

How large does a VIF value have to be to be "large enough"?

Belsley (1991) suggests:
1. Eigenvalues of X′X
2. Condition index (CI)
3. Condition number (CN)


Multicollinearity: Tests/Indicators

Given the eigenvalues λ1 > λ2 > λ3 > …,

CI_j = λ1 / λ_j,   j = 1, 2, 3, …

Stata/SPSS report the square root of CI_j,

and CN = sqrt(max eigenvalue / min eigenvalue),

or CN = sqrt(max CI_j).
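A small Python sketch of the condition-index calculation described above, using the eigenvalues of X′X (the matrix `X0` without a constant is assumed from the previous sketch; column-scaling conventions differ across packages, so treat this as one reasonable variant):

```python
import numpy as np
import statsmodels.api as sm

def condition_indices(X0):
    """Condition indices and condition number from the eigenvalues of X'X."""
    X = np.asarray(sm.add_constant(X0), dtype=float)
    eigvals = np.linalg.eigvalsh(X.T @ X)      # eigenvalues of X'X (ascending)
    eigvals = eigvals[::-1]                    # descending: lambda_1 >= ... >= lambda_k
    ci = eigvals[0] / eigvals                  # CI_j = lambda_1 / lambda_j
    cn = np.sqrt(eigvals[0] / eigvals[-1])     # CN = sqrt(max eigenvalue / min eigenvalue)
    return np.sqrt(ci), cn                     # square roots, as reported by Stata/SPSS

ci, cn = condition_indices(X0)
print("Condition indices:", ci)
print("Condition number:", cn)                 # very large values signal severe collinearity
```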

Multicollinearity: Tests/Indicators

Largest CI = 5–10 → no problem
Largest CI = 30–100 → problematic
Largest CI = 1000–3000 → severe problem

See:
1. D. A. Belsley (1991) Conditioning Diagnostics: Collinearity and Weak Data in Regression, Wiley.
2. D. A. Belsley, E. Kuh and R. E. Welsch (1980) Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, Wiley.
3. Norman R. Draper and Harry Smith (2003) Applied Regression Analysis, 3rd ed., Wiley.


Multicollinearity: Causes

• Sampling mechanism: poorly constructed design and measurement scheme, or limited range.
• Statistical model specification: adding polynomial terms or trend indicators.
• Too many variables in the model - the model is overdetermined.
• Theoretical specification is wrong: inappropriate construction of theory or even measurement.

Multicollinearity: Remedies

• Increase sample size
• Omit variables
• Scale construction/transformation
• Factor analysis
• Constrain the estimation (set the value of one coefficient relative to another)
• Ignore it: report adjusted R² and claim it warrants retention in the model


Model Specification: Definition

Specification error
• covers any mistake in the set of assumptions of a model and the associated inference procedures;
• but it has come to be used for errors in specifying the data matrix X.

Model Specification: Definition

Basically four types of misspecification:
• exclusion of a relevant variable
• inclusion of an irrelevant variable
• incorrect functional form
• measurement error and misspecified error term


Too Many or Too Few Variables

• What happens if we include variables in our specification that don't belong?
  • No effect on our parameter estimates, and OLS remains unbiased.
• What if we exclude a variable from our specification that does belong?
  • OLS will usually be biased.

1. Omission of a Relevant Variable: Exclusion/Underfitting Bias

Suppose the true model is:

y = β_0 + β_1 x_1 + β_2 x_2 + u,

but we estimate:

ỹ = β̃_0 + β̃_1 x_1 + u.

The misspecified model is biased.


2. Inclusion of an Irrelevant Variable: Inclusion/Overfitting Bias

Suppose this time the "true" model is:

ỹ = β̃_0 + β̃_1 x_1 + u,

but we estimate:

y = β_0 + β_1 x_1 + β_2 x_2 + u.

This equation is over-specified and tends to occur when researchers adopt a "kitchen sink" approach to model building.

This specification error does not lead to bias: the estimators are unbiased, but inefficient.

3. Functional Form Mis-specification

A third type of mis-specification occurs when we adopt an incorrect functional form.

For example, we estimate a linear regression model whereas the "true" regression model is log-linear.

An incorrect functional form can result in autocorrelation or heteroskedasticity.


Specification Error Test

A common test for mis-specification is Ramsey's regression specification error test (RESET) – Ramsey (1969).

The test is of H0: no specification error.

A p-value > 5% indicates no specification problem.
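A minimal Python sketch of the usual textbook RESET idea: augment the fitted regression with powers of the fitted values and F-test their joint significance (the hypothetical `df`, `X` and `model` are the same as in the earlier sketch):

```python
import numpy as np
import statsmodels.api as sm

# Base regression (restricted model)
base = sm.OLS(df["y"], X).fit()

# Augment with powers of the fitted values (yhat^2, yhat^3)
yhat = base.fittedvalues
X_aug = np.column_stack([X, yhat**2, yhat**3])
aug = sm.OLS(df["y"], X_aug).fit()

# F-test that the added terms are jointly zero (H0: no specification error)
f_stat, p_value, _ = aug.compare_f_test(base)
print(f"RESET F = {f_stat:.3f}, p-value = {p_value:.4f}")  # p > 0.05: no evidence of misspecification
```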

Spherical Error Assumption

For the model

y_i = β_1 + β_2 x_i2 + β_3 x_i3 + ... + β_k x_ik + u_i

1: No heteroscedasticity:

Var(u_i) = σ_u²


Spherical Error Assumption

2: No serial/auto correlation:

Cov(u_i, u_j) = 0, for all i ≠ j; i, j = 1, 2, …, n

[Figure: residual plots illustrating negative serial correlation and positive serial correlation.]


No Heteroscedasticity Assumption

[Figure: conditional densities f(y|x) around the regression line E(y|x) = b_0 + b_1 x with equal spread at x_1 and x_2: homoscedasticity.]

[Figure: conditional densities f(y|x) around E(y|x) = b_0 + b_1 x whose spread increases from x_1 to x_2 to x_3: heteroscedasticity.]


Heteroskedasticity: Definition

• Plots of the residuals against the dependent variable or appropriate independent variables show a characteristic fan or funnel shape.

[Figure: scatter plot of residuals fanning out as the fitted values grow.]

Residual Analysis for Equal Variance

[Figure: residuals plotted against x – a funnel shape indicates non-constant variance; an even band around zero indicates constant variance.]


Heteroskedasticity: Implications

Given our model

y_i = β_1 + β_2 x_i2 + β_3 x_i3 + ... + β_k x_ik + u_i

• The regression β̂s are unbiased/consistent.
• But they are no longer the best estimators.
• They are not BLUE (not minimum variance, hence not efficient).
  – So confidence intervals are invalid.
  – Wrong inference.

Heteroskedasticity: Implications

• Types of heteroskedasticity
  – There are a number of types of heteroskedasticity:
    • Additive
    • Multiplicative
    • ARCH (autoregressive conditional heteroskedastic): a time series problem.


Heteroskedasticity: Causes

• It may be caused by:
  – Model misspecification: omitted variable or improper functional form.
  – Learning behaviours across time.
  – Changes in data collection or definitions.
  – Outliers or breakdown in the model.
• Frequently observed in cross-sectional data sets where demographics are involved (population, GNP, etc.).

Testing for Heteroskedasticity

A number of formal tests:
• Park test
• Glejser test
• Goldfeld-Quandt test
• Breusch-Pagan test
• White test


The White Test: White's Generalized Heteroskedasticity Test

• The Breusch-Pagan test detects any linear forms of heteroskedasticity.
• The White test allows for nonlinearities by using squares and cross-products of all the xs.
• An F or LM statistic is used to test whether the x_j, x_j², and x_j·x_k terms are jointly significant.
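A minimal sketch of the Breusch-Pagan and White tests using the diagnostic helpers in statsmodels (the fitted OLS result `model` and regressor matrix `X` from the earlier sketch are assumed):

```python
from statsmodels.stats.diagnostic import het_breuschpagan, het_white

# Breusch-Pagan: regresses squared residuals on X (detects linear heteroskedasticity)
bp_lm, bp_lm_p, bp_f, bp_f_p = het_breuschpagan(model.resid, X)
print(f"Breusch-Pagan LM = {bp_lm:.3f}, p-value = {bp_lm_p:.4f}")

# White: adds squares and cross-products of the regressors (allows nonlinear forms)
w_lm, w_lm_p, w_f, w_f_p = het_white(model.resid, X)
print(f"White LM = {w_lm:.3f}, p-value = {w_lm_p:.4f}")

# In both cases H0 is homoscedasticity: a p-value above 0.05 means do not reject H0.
```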

Remedies for Heteroskedasticity

• The remedy depends on the form the heteroskedasticity takes.
• Indirect: (i) re-specify the model; or (ii) use heteroscedasticity-consistent SEs (White's robust SEs).
• Direct: Generalized Least Squares (Weighted LS) – adjust the variance-covariance matrix.


Heteroskedasticity-Consistent SEs: Robust SEs

Important to remember: robust standard errors only have an asymptotic justification –
with small sample sizes, t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct.

Generalized Least Squares (GLS)

• It is always possible to estimate robust standard errors for OLS estimates.
• But if we know something of the specific form of the heteroskedasticity, we can obtain more efficient estimates than OLS.
• The basic idea is to transform the model into one that has homoskedastic errors – hence generalized least squares (GLS).


Generalized Least Squares: Weighted Least Squares

• GLS under heteroscedasticity is a weighted least squares (WLS) procedure where each squared residual is weighted by the inverse of Var(u_i | x_i).
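A minimal sketch of both remedies, under the purely illustrative assumption that Var(u_i | x_i) is proportional to x2_i², so the WLS weights are 1/x2_i² (the hypothetical `df`, `X` and `model` are as before):

```python
import statsmodels.api as sm

# Indirect remedy: OLS with heteroskedasticity-consistent (White) standard errors
robust = model.get_robustcov_results(cov_type="HC1")
print(robust.summary())

# Direct remedy, WLS: each observation weighted by 1 / Var(u_i | x_i);
# here we assume (for illustration only) that the error variance grows with x2^2.
weights = 1.0 / (df["x2"] ** 2)
wls = sm.WLS(df["y"], X, weights=weights).fit()
print(wls.summary())
```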

Autocorrelation: Definition

• Correlation of the errors over time.

[Figure: time (t) residual plot – the residuals show a cyclic pattern, not a random one; cyclical patterns are a sign of positive autocorrelation.]

• This violates the regression assumption that errors are random and independent.


Autocorrelation: Definition

• Types of autocorrelation
  – Autoregressive (AR) processes
  – Moving average (MA) processes

Autocorrelation: Definition

• Autoregressive processes AR(p)
  – The residuals are related to their preceding values.
  – The classic 1st-order autocorrelation is the AR(1) process:

u_t = ρ u_{t−1} + ε_t


Autocorrelation: Definition

• Autoregressive processes (cont.)
  – In 2nd-order autocorrelation the residuals are related to their t−2 values as well – AR(2):

u_t = ρ_1 u_{t−1} + ρ_2 u_{t−2} + ε_t

  – Larger-order processes may occur as well – AR(p):

u_t = ρ_1 u_{t−1} + ρ_2 u_{t−2} + ... + ρ_p u_{t−p} + ε_t

Autocorrelation: Definition

• Moving average processes MA(q)
• The error term is a function of some random error and a portion of the previous random error.
• MA(1) process:

u_t = ε_t − θ ε_{t−1}


Autocorrelation: Definition

Higher-order MA(q) processes also exist:

u_t = ε_t − θ_1 ε_{t−1} − θ_2 ε_{t−2} − ... − θ_q ε_{t−q}

The error term is a function of some random error and portions of the previous random errors.


Autocorrelation: Definition

• Mixed processes ARMA(p, q):

u_t = ρ_1 u_{t−1} + ρ_2 u_{t−2} + ... + ρ_p u_{t−p} + ε_t − θ_1 ε_{t−1} − θ_2 ε_{t−2} − ... − θ_q ε_{t−q}

• The error term is a complex function of both autoregressive {AR(p)} and moving average {MA(q)} processes.


Autocorrelation: Definition

– AR processes represent shocks to systems that have long-term memory.
– MA processes are quick shocks to systems, but have only short-term memory.

Autocorrelation: Implications

• Coefficient estimates are unbiased, but the estimates are not BLUE.
• The variances are often greatly underestimated (biased small).
• Hence hypothesis tests are exceptionally suspect.


Autocorrelation: Causes

• Specification error
  – Omitted variable
  – Wrong functional form
• Lagged effects
• Data transformations
  – Interpolation of missing data
  – Differencing

Autocorrelation: Tests

• The Durbin-Watson statistic is used to test for autocorrelation.
  H0: residuals are not correlated
  H1: positive autocorrelation is present

d = Σ_{t=2}^{n} (û_t − û_{t−1})² / Σ_{t=1}^{n} û_t²

d should be close to 2 if H0 is true (no autocorrelation), since d ≈ 2(1 − ρ̂).


Autocorrelation: Tests

d ≈ 2(1 − ρ̂); the possible range is 0 ≤ d ≤ 4:

• ρ̂ = 0 → no autocorrelation: d ≈ 2
• ρ̂ = +1 → positive autocorrelation: d ≈ 0 (d close to zero → positive autocorrelation)
• ρ̂ = −1 → negative autocorrelation: d ≈ 4 (d close to 4 → negative autocorrelation)
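A minimal Python sketch of the Durbin-Watson statistic computed directly from the residuals of the earlier fit, together with the implied ρ̂ (statsmodels also ships a ready-made `durbin_watson` helper, used here as a cross-check):

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

u = np.asarray(model.resid)

d = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)   # d = sum (u_t - u_{t-1})^2 / sum u_t^2
rho_hat = 1 - d / 2                            # from d ~ 2(1 - rho)

print(f"Durbin-Watson d = {d:.3f} (statsmodels: {durbin_watson(u):.3f})")
print(f"Implied rho-hat = {rho_hat:.3f}")      # d near 0: positive AC; near 4: negative AC
```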

Testing for +ve Autocorrelation

H0: positive autocorrelation does not exist
H1: positive autocorrelation is present

• Calculate the Durbin-Watson test statistic d (using Stata or SPSS).
• Find the values dL and dU from the D-W table (for sample size n and number of independent variables k).
• Decision rule: reject H0 if d < dL.

[Figure: the d scale from 0 to 2, divided into "Reject H0" (0 to dL), "Inconclusive" (dL to dU) and "Do not reject H0" (dU to 2).]


Testing for +ve Autocorrelation

H0: positive autocorrelation does not exist
H1: positive autocorrelation is present

Decision rule: reject H0 if d < dL or 4 − dL < d < 4.

[Figure: the full d scale from 0 to 4, with regions marked at dL, dU, 2, 4 − dU and 4 − dL.]

Durbin-Watson d Test: Decision Rules

Null hypothesis              Decision        If
No + autocorrelation         Reject          0 < d < dL
No + autocorrelation         No decision     dL ≤ d ≤ dU
No − autocorrelation         Reject          4 − dL < d < 4
No − autocorrelation         No decision     4 − dU ≤ d ≤ 4 − dL
No +/− autocorrelation       Do not reject   dU < d < 4 − dL


Testing for +ve Autocorrelation (continued)

• Suppose we have the following time series data:

[Figure: scatter of Sales against Time with fitted line y = 30.65 + 4.7038x, R² = 0.8976.]

• Is there autocorrelation?

Testing for +ve Autocorrelation

Example with n = 25:

[Figure: Sales against Time with fitted line y = 30.65 + 4.7038x, R² = 0.8976.]

Durbin-Watson calculations:
Sum of squared differences of residuals   3296.18
Sum of squared residuals                  3279.98
Durbin-Watson statistic                   1.00494

d = Σ_{t=2}^{T} (û_t − û_{t−1})² / Σ_{t=1}^{T} û_t² = 3296.18 / 3279.98 = 1.00494


Testing for +ve Autocorrelation

• Here, n = 25 and k = 1: one independent variable.
• Using the Durbin-Watson table: dL = 1.29 and dU = 1.45.
• d = 1.00494 < dL = 1.29.
• Therefore the given linear model is not the appropriate model to forecast sales.

Decision: reject H0, since significant +ve autocorrelation exists (d = 1.00494 < dL).

[Figure: d scale with "Reject H0" below dL = 1.29, "Inconclusive" between dL and dU = 1.45, and "Do not reject H0" above dU.]

Autocorrelation: Tests

• Durbin-Watson d (cont.)
  – Note that d is symmetric about 2.0, so that negative autocorrelation will be indicated by a d > 2.0.
  – Use the same distances above 2.0 as upper and lower bounds.


Autocorrelation: Tests

• Durbin's h
  – We cannot use the DW d if there is a lagged endogenous variable in the model; hence Durbin suggested a Lagrange Multiplier (LM) type test:

h = (1 − d/2) · sqrt( T / (1 − T·S²_{Y_{t−1}}) )

  – S²_{Y_{t−1}} is the estimated variance of the coefficient on the Y_{t−1} term.
  – h has a standard normal distribution.

Autocorrelation: Tests

• Durbin's h test for first-order AC:

h = (1 − d/2) · sqrt( T / (1 − T·S²_{Y_{t−1}}) )

h has a standard normal distribution;
if the estimated |h| > 1.96, reject the null of "no AC" at the 5% significance level.

• Breusch-Godfrey test: an LM test for higher-order autocorrelations.


Autocorrelation: Remedies

• Generalized Least Squares under autocorrelation:
• First-difference method (when ρ is not known)
  – Take 1st differences of X and Y
  – Regress ΔY on ΔX
  – Assumes that ρ = +1
• Generalized (quasi-)difference method
  (i) Requires that ρ be known.
  (ii) Estimate ρ from the DW statistic or from the residuals.

Autocorrelation: Remedies

Cochrane-Orcutt method (estimating ρ from the residuals):

(1) Estimate the model using OLS and obtain the residuals, û_t.
(2) Using the residuals, run the regression:

û_t = ρ û_{t−1} + v_t

(3) Using the ρ̂ obtained, perform the regression on the generalized differences:

(Y_t − ρ̂ Y_{t−1}) = B_1(1 − ρ̂) + B_2(X_t − ρ̂ X_{t−1}) + (u_t − ρ̂ u_{t−1})

(4) Substitute the values of B_1 and B_2 into the original regression to obtain new estimates of the residuals.
(5) Return to step 2 and repeat – until ρ̂ converges (no longer changes).
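A minimal Python sketch of the iterative Cochrane-Orcutt procedure just described, for a simple model Y_t = B1 + B2·X_t + u_t (the one-dimensional series `Y` and `X1` are hypothetical); statsmodels offers `sm.GLSAR(...).iterative_fit()` for the same purpose:

```python
import numpy as np
import statsmodels.api as sm

def cochrane_orcutt(Y, X1, max_iter=50, tol=1e-6):
    """Iterative Cochrane-Orcutt estimation of Y_t = B1 + B2*X_t + u_t with AR(1) errors."""
    Y, X1 = np.asarray(Y, float), np.asarray(X1, float)
    rho = 0.0
    for _ in range(max_iter):
        # Steps 3-4: regression on the generalized (quasi-)differences
        Y_star = Y[1:] - rho * Y[:-1]
        X_star = sm.add_constant(X1[1:] - rho * X1[:-1])
        b1_star, b2 = sm.OLS(Y_star, X_star).fit().params
        b1 = b1_star / (1 - rho)             # recover the original intercept
        u = Y - b1 - b2 * X1                 # new residuals from the original regression
        # Step 2: regress u_t on u_{t-1} (no constant) to re-estimate rho
        rho_new = sm.OLS(u[1:], u[:-1]).fit().params[0]
        if abs(rho_new - rho) < tol:         # Step 5: stop when rho-hat converges
            rho = rho_new
            break
        rho = rho_new
    return b1, b2, rho

b1, b2, rho = cochrane_orcutt(Y, X1)
print(f"B1 = {b1:.3f}, B2 = {b2:.3f}, rho-hat = {rho:.3f}")
```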

Autocorrelation: Remedies

Unfortunately, all the textbooks stop here. However, there is a simple method that we outline below:

From the generalized difference method, we can derive an autoregressive distributed lag (ARDL) model that conforms to Hendry's general-to-specific (g-t-s) method. And this solves autocorrelation and the related problems.


Autocorrelation: Remedies: From the Generalized Difference Method to the ARDL/g-t-s Model

Consider Y_t = a + b X_t + u_t, with u_t = ρ u_{t−1} + ε_t.

Then generalized differencing gives

(Y_t − ρ Y_{t−1}) = a(1 − ρ) + b(X_t − ρ X_{t−1}) + (u_t − ρ u_{t−1}),

which we rewrite as

Y_t = α + b X_t + β X_{t−1} + ρ Y_{t−1} + ε_t

where α = a(1 − ρ), β = −bρ, and ε_t = u_t − ρ u_{t−1}.

Autocorrelation: Remedies: The ARDL/g-t-s Model

Y_t = α + b X_t + β X_{t−1} + ρ Y_{t−1} + ε_t

This ARDL(1,1) equation is free from autocorrelation, ε_t being a spherical error (white noise), and can be estimated using OLS.

If u_t follows a higher-order autocorrelation, the ARDL order correspondingly increases.

Thus, to remedy autocorrelation, we need to estimate an ARDL model of an appropriate order, for which we can use the g-t-s method.

To our knowledge, no econometrics textbook has identified and illustrated this step. This is our contribution here.
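A minimal Python sketch of estimating the ARDL(1,1) equation above by OLS, i.e. regressing Y_t on X_t, X_{t−1} and Y_{t−1} (the hypothetical series `Y` and `X1` are as before; gretl and other packages do the same via lag operators):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

Y_s, X_s = pd.Series(np.asarray(Y)), pd.Series(np.asarray(X1))

# Build the lagged regressors for the ARDL(1,1) equation
ardl = pd.DataFrame({
    "Y":     Y_s,
    "X":     X_s,
    "X_lag": X_s.shift(1),
    "Y_lag": Y_s.shift(1),
}).dropna()   # the first observation is lost to lagging

rhs = sm.add_constant(ardl[["X", "X_lag", "Y_lag"]])
ardl11 = sm.OLS(ardl["Y"], rhs).fit()
print(ardl11.summary())
# The residuals of this equation should now be (close to) white noise,
# which can be verified with the diagnostic tests shown earlier.
```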

ARDL/g-t-s Method for Model Adequacy: An Illustration Using GRETL

We use 200 monthly observations on consumption (Cons), income (Inc) and inflation (Infl) from a data set generated with the GiveWin 2.3 algebra editor (statistical package).


Monthly Data on Consumption, Income and Inflation

[Table: the 200 monthly observations of Cons, Inc and Infl, listed across three slides, e.g. month 1: Cons 1300, Inc 1500, Infl 0; …; month 200: Cons 1308, Inc 1497, Infl −12.69. Full listing omitted here.]

Monthly Data on Consumption, Income and Inflation

[Figure: time series plots of Consumption, Income and Inflation over the sample period; all three series are non-stationary (see the discussion of spurious regression below).]


ARDL/g-t-s Method for Model Adequacy: An Illustration Using GRETL

First, we run a regression of Consumption on Income and Inflation, and then test it for model adequacy.
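The gretl steps that follow can be reproduced elsewhere; here is a minimal Python sketch (assuming a DataFrame `data` with columns `Cons`, `Inc`, `Infl` holding the 200 observations) that fits Model 1 and runs the same battery of adequacy tests listed on the next slide:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, acorr_breusch_godfrey, het_arch
from statsmodels.stats.stattools import jarque_bera, durbin_watson

X = sm.add_constant(data[["Inc", "Infl"]])
m1 = sm.OLS(data["Cons"], X).fit()
print(m1.summary())

# (i) Specification (RESET, via powers of the fitted values)
yhat = m1.fittedvalues
m1_aug = sm.OLS(data["Cons"], np.column_stack([X, yhat**2, yhat**3])).fit()
print("RESET p-value:", m1_aug.compare_f_test(m1)[1])

# (ii) Heteroscedasticity (Breusch-Pagan)
print("Breusch-Pagan p-value:", het_breuschpagan(m1.resid, X)[1])

# (iii) Normality of residuals (Jarque-Bera)
print("Jarque-Bera p-value:", jarque_bera(m1.resid)[1])

# (iv) Autocorrelation (Breusch-Godfrey LM test, plus the DW statistic)
print("Breusch-Godfrey p-value:", acorr_breusch_godfrey(m1, nlags=1)[1])
print("Durbin-Watson d:", durbin_watson(m1.resid))

# (v) ARCH effect
print("ARCH LM p-value:", het_arch(m1.resid)[1])
```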

Model Adequacy

The statistical tests on model adequacy consider the following null hypotheses (in the same order as given in the Gretl menu):
(i) no specification error,
(ii) no heteroscedasticity,
(iii) normally distributed residuals,
(iv) no autocorrelation, and
(v) no auto-regressive conditional heteroscedasticity (ARCH) effect.


Model Adequacy

In each of these tests, if the p-value is greater than the conventional significance level of 5%, then the estimated OLS model is adequate in that respect.

That is, our model is adequate only if all these OLS assumptions are satisfied in general; note, however, that the normality assumption is not essential for estimation, but it is essential for inference.

Model Adequacy

Another important OLS assumption is that there is no perfect multicollinearity; note that in the face of perfect multicollinearity, it is not possible to estimate the model.

No statistical test is available; an indicator, viz. the variance inflation factor (VIF), is generally used to measure the severity of (imperfect) multicollinearity. A variable with VIF > 10 is a potential source of severe multicollinearity.


Regression Results from Gretl

[Output: OLS estimates of Model 1, Consumption regressed on Income and Inflation.]

Model Adequacy Tests

[Output: Gretl model-adequacy test results for Model 1 (RESET, heteroscedasticity, normality, autocorrelation and ARCH tests).]

Adequacy of Model 1

We find that the regression model of consumption on the other two variables, income and inflation, is highly significant, the corresponding p-value being very small in every case, and the coefficients having the expected signs and magnitudes.

However…


Adequacy of Model 1

We find that the regression model of consumption on income and inflation is not adequate in respect of all the OLS assumptions, the corresponding p-value being almost zero in every case.

Note that the p-value for the RESET is reported in Gretl as 1.25356e–017 = 1.25356 × 10⁻¹⁷ ≈ 0.

Adequacy of Model 1

Also note that the Durbin-Watson statistic is very small, close to zero, and the estimate of the first-order autocorrelation coefficient (rho, ρ) is close to one in magnitude, signifying first-order (positive) autocorrelation.


Adequacy of Model 1

Another important point to note: regression Model 1 is a 'spurious regression'; note that R² (= 0.6821) > DW statistic (0.0528) (Granger and Newbold 1974), as the variables considered are all non-stationary; also see the graphs above.

Modifying Model 1

Let us now modify the model by first remedying the autocorrelation problem. For that, we use the autoregressive distributed lag (ARDL) framework with Hendry's general-to-specific (g-t-s) methodology.

We use a maximum lag of 2 for all the variables and run the ARDL(2,2) regression.


Regression Results from Gretl

[Output: OLS estimates of Model 2, the ARDL(2,2) specification.]

From Model 2 to Model 3

In Model 2, the second lags are insignificant; so we remove them and rerun the regression, this time as ARDL(1,1).


Regression Results from Gretl

[Output: OLS estimates of Model 3, the ARDL(1,1) specification.]

Model Adequacy Tests

[Output: Gretl model-adequacy test results for the ARDL(1,1) model (RESET, heteroscedasticity, normality, autocorrelation and ARCH tests).]

The Adequate Model

The final specific model with one lag for all the three variables [ARDL(1,1)] is found to be both adequate on all the five OLS assumptions and highly statistically significant.

The p-values of all the tests are greater than the conventional 5% significance level, not rejecting the corresponding nulls.


The Adequate Model

Also note that the absolute value of Durbin's h is less than 1.96, the 5% critical (absolute) value of the z-test, signifying no first-order autocorrelation. The estimate of the first-order autocorrelation coefficient (rho, ρ) is very small in magnitude.

Multicollinearity of the Adequate Model

However, the lagged variables appeared to have higher-than-10 VIF values, indicating severe multicollinearity; since removing them to remedy multicollinearity would mean disastrous model inadequacy, we just follow the final recourse in such situations, that is, ignore the multicollinearity problem.

Remember, the model satisfies the OLS assumption of no perfect multicollinearity (in the face of perfect multicollinearity, it is not possible to estimate the model).


Model Adequacy and Cointegration

See, for (the first author's) interpretation of the residual-based cointegration test as a model-adequacy diagnostic check:

Vijayamohanan Pillai N. (2010) Econometrics of Electricity Demand: Questioning the Tradition. Lambert Academic Publishing, Saarbrücken, Deutschland. 103+vii pages; ISBN 978-3-8433-7639-6.