
Addis Ababa University

School of Commerce
Department of Economics

Introduction to Econometrics

December 8, 2023
Chapter 4

Violations of the Classical Linear Regression Assumptions

Introduction to Econometrics Addis Ababa UniversitySchool of Commerce December 8, 2023 2 / 47


- The estimates derived using OLS techniques, and the inferences based on those estimates, are valid only under certain conditions.
- In general, these conditions amount to the regression model being "well-specified".
- A regression model is statistically well-specified for an estimator (say, OLS) if all the assumptions needed for the optimality of that estimator are satisfied.
- Before we relax the assumptions of the CLRM, let us recall: (i) the basic steps in a scientific inquiry and (ii) the assumptions made.



The Major Steps Followed in a Scientific Study:
1. Specifying a statistical model consistent with theory (a model representing the theoretical relationship between a set of variables). This involves at least two choices:
   a. the choice of variables to be included in the model;
   b. the choice of the functional form of the link (linear in variables, linear in logs of the variables, polynomial in regressors, etc.).
2. Selecting an estimator with certain desirable properties (provided that the regression model in question satisfies a given set of conditions).



3. Estimating the model. When can one estimate a model? (Sample size? Perfect multicollinearity?)
4. Testing for the validity of the assumptions made.
   i. If there is no evidence of misspecification, go on to conduct statistical inference.
   ii. If the tests show evidence of misspecification in one or more relevant forms, there are 2 possible courses of action:
      - If the precise form of misspecification is known, find an alternative estimator.
      - Regard statistical misspecification as an indication of a defective model. Then search for an alternative, well-specified model and start over (return to Step 1).



Recap of the CLRM Assumptions
1. $E(u_i \mid X_{1i}, X_{2i}, \dots, X_{Ki}) = E(u_i) = 0$
2. $\mathrm{cov}(X_{ji}, u_i) = E(X_{ji} u_i) = 0$
3. $\mathrm{var}(u_i \mid X_{ji}) = E(u_i^2 \mid X_{ji}) = \sigma^2 > 0 \quad \forall i$
4. $\mathrm{cov}(u_i, u_s \mid X_i, X_s) = E(u_i u_s \mid X_i, X_s) = 0$ for $i \neq s$
5. Assumptions 1-4 imply $u_i \sim IID(0, \sigma^2)$
6. Random sampling or independent random sampling
7. The number of sample observations $N$ is greater than the number of unknown parameters ($K + 1$, counting the intercept)
8. Non-stochastic and non-constant regressors
9. No perfect multicollinearity among the X's.



Violations of the basic assumptions
1. Small sample size: Problems with few data points
- Requirement for estimation: $n \geq K + 1$.
- If n is small, it may be difficult to detect violations of the assumptions.
- With small n, it is hard to detect heteroskedasticity or non-normality of the $u_i$'s even when present.
- Even though no assumption is violated, a regression with small n may not have sufficient power to reject $\beta_j = 0$, even if $\beta_j \neq 0$.
- If $(K + 1)/n > 0.4$, it will often be difficult to fit a reliable model.
- Rule of thumb: aim for n of at least 6 times the number of regressors (X's), and ideally at least 10 times.



2. Multicollinearity
- Many social research studies use a large number of predictors. Problems arise when the various predictors are highly and linearly related.
- Recall that, in a multiple linear regression (MLR), only the independent variation in a regressor is used in estimating the coefficient of that regressor.
- If $X_1$ and $X_2$ are highly correlated, the coefficients of $X_1$ and $X_2$ will be determined by the minority of cases where they do not overlap.
- Perfect multicollinearity occurs when one (or more) of the regressors in a model (e.g., $X_K$) is an exact linear function of the others ($X_i$, $i = 1, 2, \dots, K-1$).



- For instance, if $X_2 = 2X_1$, then there is perfect (exact) multicollinearity between $X_1$ and $X_2$. Suppose the PRF is $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$, with $X_2 = 2X_1$.
- The OLS technique yields 3 normal equations:
  $$n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} X_{1i} + \hat\beta_2 \sum_{i=1}^{n} X_{2i} = \sum_{i=1}^{n} Y_i$$
  $$\hat\beta_0 \sum_{i=1}^{n} X_{1i} + \hat\beta_1 \sum_{i=1}^{n} X_{1i}^2 + \hat\beta_2 \sum_{i=1}^{n} X_{1i}X_{2i} = \sum_{i=1}^{n} X_{1i}Y_i$$
  $$\hat\beta_0 \sum_{i=1}^{n} X_{2i} + \hat\beta_1 \sum_{i=1}^{n} X_{1i}X_{2i} + \hat\beta_2 \sum_{i=1}^{n} X_{2i}^2 = \sum_{i=1}^{n} X_{2i}Y_i$$
- But substituting $2X_1$ for $X_2$ in the 3rd equation yields exactly twice the 2nd equation, i.e., one of the normal equations is in fact redundant.
- Thus, we have only 2 independent equations but 3 unknowns (β's) to estimate.



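- As a result, the normal equations reduce to:
  $$\sum_{i=1}^{n} Y_i = n\hat\beta_0 + (\hat\beta_1 + 2\hat\beta_2)\sum_{i=1}^{n} X_{1i}$$
  $$\sum_{i=1}^{n} X_{1i}Y_i = \hat\beta_0 \sum_{i=1}^{n} X_{1i} + (\hat\beta_1 + 2\hat\beta_2)\sum_{i=1}^{n} X_{1i}^2$$
- The number of β's to be estimated is greater than the number of independent equations.
- So, if two or more X's are perfectly correlated, it is not possible to find estimates for all the β's.
- i.e., we cannot find $\hat\beta_1$ and $\hat\beta_2$ separately, but only the combination $\hat\beta_1 + 2\hat\beta_2$:
  $$\hat\alpha = \hat\beta_1 + 2\hat\beta_2 = \frac{\sum Y_i X_{1i} - n\bar X_1 \bar Y}{\sum X_{1i}^2 - n\bar X_1^2} \quad\text{and}\quad \hat\beta_0 = \bar Y - \hat\beta_1 \bar X_1 - \hat\beta_2 \bar X_2 = \bar Y - \hat\alpha \bar X_1$$
- Thus, it is difficult to isolate the effect of each of the highly collinear X's on Y.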
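The redundancy of the normal equations can be checked numerically. The sketch below uses made-up data in which X2 is exactly twice X1 (the values and the helper `det3` are illustrative assumptions, not part of the slides) and shows that the coefficient matrix of the three normal equations has a zero determinant, so no unique solution for the three β's exists.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hypothetical data: X2 is exactly twice X1 (perfect multicollinearity).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0 * v for v in x1]
n = len(x1)

# Coefficient matrix of the three normal equations (the X'X matrix).
s1, s2 = sum(x1), sum(x2)
s11 = sum(a * a for a in x1)
s22 = sum(b * b for b in x2)
s12 = sum(a * b for a, b in zip(x1, x2))
xtx = [[n, s1, s2],
       [s1, s11, s12],
       [s2, s12, s22]]

determinant = det3(xtx)
print(determinant)  # 0.0: the normal equations have no unique solution
```

Replacing `x2` with values that are not an exact multiple of `x1` gives a non-zero determinant, and the three coefficients become estimable again.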



Consequences of multicollinearity:
- Estimated coefficients change radically depending on the inclusion/exclusion of other predictors; the estimated β's tend to be very unstable from one sample to another.
- Standard errors of the estimated β's will be inflated; as a result, t-tests will be insignificant and CIs wide (rejecting $H_0: \beta_j = 0$ becomes very rare).
- Low t-ratios but high $R^2$ (or F): i.e., not much individual variation in the X's, but a lot of common variation.
- Yet, the OLS estimators are BLUE.
- Moreover, multicollinearity is not a problem if the principal aim is prediction, provided that the same pattern of multicollinearity persists into the forecast period.



Sources of Multicollinearity
- Multicollinearity may arise due to one or more of the following factors:
  1. Improper use of dummy variables.
  2. Including (almost) the same variable twice.
  3. The method of data collection used (e.g., sampling over a limited range of X values).
  4. Including a variable computed from other variables in the model (e.g., using family income, mother's income and father's income together).
  5. Adding many polynomial terms to a model, especially if the range of the X variable is small.
  6. Or it may just happen that the variables are highly correlated (chance!).



Detecting Multicollinearity
- The classic case of multicollinearity occurs when $R^2$ is high (and significant), but none of the X's is individually significant (some may even have the wrong sign).
- Sometimes, simple or partial correlation coefficients among the regressors are used.
- However, serious multicollinearity may exist even if these correlation coefficients are low.
- A statistic commonly used for detecting multicollinearity is the VIF (Variance Inflation Factor).
- From a simple linear regression (SLR) of Y on $X_j$ we have:
  $$\mathrm{var}(\hat\beta_j) = \frac{\sigma^2}{\sum x_{ji}^2}$$
- From the MLR of Y on all the X's:
  $$\mathrm{var}(\hat\beta_j) = \frac{\sigma^2}{\sum x_{ji}^2 \,(1 - R_j^2)} = VIF_j \cdot \frac{\sigma^2}{\sum x_{ji}^2}$$
  where $x_{ji} = X_{ji} - \bar X_j$ and $R_j^2$ is the $R^2$ from regressing $X_j$ on all the other X's.



- The difference between the variances of $\hat\beta_j$ in the 2 cases arises from the correlation between $X_j$ and the other X's, and is captured by:
  $$VIF_j = \frac{1}{1 - R_j^2}$$
- If $X_j$ is not correlated with the other X's, then $R_j^2 = 0$, $VIF_j = 1$, and the two variances will be identical.
- As $R_j^2$ increases, $VIF_j$ rises.
- If $X_j$ is perfectly correlated with the other X's, $VIF_j = \infty$. What is the implication for precision (or CIs)?
- So, a large VIF is a sign of serious or "intolerable" multicollinearity.
- There is no formal cutoff point on the VIF beyond which multicollinearity is taken as intolerable.
- A rule of thumb: $VIF \geq 10$ is a sign of severe multicollinearity.
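As an illustration of how the VIF responds to correlation among regressors, here is a minimal sketch for the special case of exactly two regressors, where $R_j^2$ is simply the squared correlation between them. The data and the function names `corr` and `vif_two_regressors` are hypothetical; with more than two regressors, $R_j^2$ would have to come from an auxiliary regression instead.

```python
def corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R_j^2); with two regressors R_j^2 equals r_12 squared."""
    r = corr(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical data
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_moderate = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]   # moderately correlated with x1
x2_near_copy = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]  # almost a copy of x1

vif_moderate = vif_two_regressors(x1, x2_moderate)
vif_near_copy = vif_two_regressors(x1, x2_near_copy)
print(vif_moderate, vif_near_copy)  # the second is far above the rule-of-thumb 10
```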



Remedies for Multicollinearity
- Solutions depend on the sources of the problem.
- The formula below is indicative of some solutions:
  $$\mathrm{var}(\hat\beta_j) = \frac{\hat\sigma^2}{\sum x_{ji}^2 \,(1 - R_j^2)} = \frac{\sum e_i^2}{(n - K - 1)\sum x_{ji}^2 \,(1 - R_j^2)}$$
- More precision (i.e., a lower $\mathrm{var}(\hat\beta_j)$) may result from:
  1. smaller RSS (less noise), ceteris paribus (cp);
  2. larger sample size (n) relative to the number of β's (K + 1), cp;
  3. greater variation in the values of each $X_j$, cp;
  4. less correlation between regressors, cp.



- Thus, serious multicollinearity may be addressed by using one or more of the following:
  1. Increasing the sample size (if possible). ???
  2. Utilizing a priori information on the parameters (from theory or prior research).
  3. Transforming variables or the functional form:
     a. Using ∆X instead of X (where the cause may be X's moving in the same direction over time).
     b. In polynomial regressions, using $X_j - \bar X$ instead of $X_j$ tends to reduce collinearity.
     c. Usually, logs are less collinear than levels.
  4. Pooling cross-sectional and time-series data.
  5. Dropping one of the collinear predictors. ??? But this may lead to omitted-variable bias.
  6. Being aware of its existence and interpreting the results cautiously.



3. Non-normality of the Error Term
- Normality is not required for the OLS estimators of the β's to be BLUE.
- The CLRM merely requires the errors to be IID.
- Normality of the errors is required only for the exact (small-sample) validity of hypothesis tests, i.e., of the t- and F-tests.
- In small samples, if the errors are not normally distributed, the estimated coefficients will not follow a normal distribution, which complicates inference.
- NB: There is no requirement that the X's be normally distributed!
- A formal test of normality is the Shapiro-Wilk test [$H_0$: errors are normally distributed].
- A large p-value means that $H_0$ cannot be rejected.



- If $H_0$ is rejected, transforming the regressand or re-specifying the functional form of the model may help.
- With large samples, thanks to the central limit theorem (CLT), hypothesis testing is (asymptotically) valid even if the distribution of the errors deviates from normality.



Non-IID Errors
- The assumption of IID errors is violated if (simple) random sampling cannot be assumed. Specifically, the assumption of IID errors fails if:
  1. errors are not identically distributed, i.e., $\mathrm{var}(u_i)$ varies across observations (heteroskedasticity);
  2. errors are not independently distributed, i.e., the $u_i$'s are correlated with each other (serial correlation);
  3. errors are both heteroskedastic and autocorrelated, which is common in panel and time-series data.



4. Heteroskedasticity
- One assumption of the CLRM is homoskedasticity, i.e., $\mathrm{var}(u_i \mid X) = \mathrm{var}(u_i) = \sigma^2$.
- This holds if the observations of the error term are drawn from identical distributions.
- Heteroskedasticity is present if $\mathrm{var}(u_i) = \sigma_i^2 \neq \sigma^2$: different variances for different segments of the population (segments defined by the values of the X's).
- e.g.: Variability of consumption rises with income, i.e., people with higher incomes display greater variability in consumption.
- Heteroskedasticity is more likely in cross-sectional than in time-series data.



- With a model that is correctly specified in every other respect but has heteroskedastic errors, OLS estimators are unbiased and consistent but inefficient.
- In addition, the usual OLS estimators of $\sigma^2$ (and thus of the standard errors of the coefficients) are biased.
- Hence, CIs based on these biased standard errors will be wrong, and the t and F tests will be invalid.
- NB: Heteroskedasticity could be a symptom of other problems (e.g., omitted variables).
- If heteroskedasticity is the result of a specification error (say, omitted variables), OLS estimators will be biased and inconsistent.



- With heteroskedasticity, OLS is not optimal: it gives equal weight to all observations, whereas observations with larger error variances ($\sigma_i^2$) contain less information than those with smaller $\sigma_i^2$.
- To correct for this, give less weight to data points with greater $\sigma_i^2$ and more weight to those with smaller $\sigma_i^2$, i.e., use GLS (WLS or FGLS).



Detecting Heteroskedasticity:
A. Graphical Method
- Run OLS and plot the squared residuals against $\hat Y$ or each X.
- The graph may show some relationship (linear, quadratic, ...), providing clues as to the nature of the problem and a possible remedy.
- e.g., suppose the plot of $\hat u^2$ (from $Y = \alpha + \beta X + u$) against X suggests that $\mathrm{var}(u_i)$ increases in proportion to $X^2$, i.e., $\mathrm{var}(u_i) = \sigma_i^2 = cX_i^2$. What is the solution?
- Transform the model by dividing throughout by X:
  $$\frac{Y}{X} = \alpha \frac{1}{X} + \beta \frac{X}{X} + \frac{u}{X} \;\Rightarrow\; y^* = \alpha x^* + \beta + u^*$$
  where $y^* = Y/X$, $x^* = 1/X$ and $u^* = u/X$.
- $u^*$ is homoskedastic: $V(u_i^*) = V(u_i/X_i) = (1/X_i^2)\,V(u_i) = (1/X_i^2)\,cX_i^2 = c$; i.e., WLS solves heteroskedasticity!
- WLS yields BLUE for the transformed model.
- If the pattern of heteroskedasticity is unknown, a log transformation of both sides may solve the problem.
- But this cannot be used with zero or negative values.
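The transformation above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data are invented, the errors are constructed as $u_i = X_i e_i$ with $|e_i| = 1$ so that their spread is exactly proportional to $X_i$, and `ols_simple` is a hypothetical helper for a one-regressor OLS fit.

```python
def ols_simple(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: u_i = X_i * e_i, so the spread of u_i grows with X_i.
X = [1.0, 2.0, 4.0, 5.0, 8.0, 10.0]
e = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # shocks with constant spread
alpha, beta = 3.0, 2.0
u = [x * s for x, s in zip(X, e)]
Y = [alpha + beta * x + ui for x, ui in zip(X, u)]

# Transformed variables: y* = Y/X, x* = 1/X, u* = u/X
y_star = [y / x for x, y in zip(X, Y)]
x_star = [1.0 / x for x in X]
u_star = [ui / x for x, ui in zip(X, u)]

# In the transformed regression the intercept estimates beta
# and the slope on x* estimates alpha.
beta_hat, alpha_hat = ols_simple(x_star, y_star)

print("spread of u :", [abs(v) for v in u])       # grows with X
print("spread of u*:", [abs(v) for v in u_star])  # constant
print("WLS estimates (alpha, beta):", alpha_hat, beta_hat)
```

Note the role reversal in the transformed model: the original slope β becomes the intercept, and the original intercept α becomes the coefficient on 1/X.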
B. A Formal Test:
- The most often used test for heteroskedasticity is the Breusch-Pagan (BP) test.
  $H_0$: homoskedasticity vs. $H_a$: heteroskedasticity
- Regress $\hat u^2$ on $\hat Y$, or $\hat u^2$ on the original X's, the $X^2$'s and, if there are enough data, the cross-products of the X's.
- $H_0$ is rejected for high values of the test statistic [$nR^2 \sim \chi^2_q$] or for low p-values. $n$ and $R^2$ are obtained from the auxiliary regression of $\hat u^2$ on $q$ predictors.
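A minimal sketch of the $nR^2$ statistic follows, restricted to the simplest case of one regressor so that the auxiliary regression has $q = 1$ predictor. The data are invented and the function names are hypothetical; the p-value uses the identity $P(\chi^2_1 > s) = \mathrm{erfc}(\sqrt{s/2})$, which holds only for one degree of freedom.

```python
import math

def ols_simple(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(x, y):
    """R-squared from a simple OLS regression of y on x."""
    a, b = ols_simple(x, y)
    my = sum(y) / len(y)
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - my) ** 2 for yi in y)
    return 1.0 - rss / tss

def bp_test_one_regressor(x, y):
    """n*R^2 from regressing squared OLS residuals on x, with its chi2(1) p-value."""
    a, b = ols_simple(x, y)
    u2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    stat = max(len(x) * r_squared(x, u2), 0.0)   # guard against tiny negative rounding
    p_value = math.erfc(math.sqrt(stat / 2.0))   # P(chi2 with 1 df > stat)
    return stat, p_value

# Hypothetical data whose error spread grows with X.
X = [float(v) for v in range(1, 11)]
signs = [1.0, -1.0] * 5
Y = [2.0 + 0.5 * x + 0.3 * x * s for x, s in zip(X, signs)]

stat, p_value = bp_test_one_regressor(X, Y)
print(stat, p_value)  # a small p-value leads to rejecting homoskedasticity
```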



Solutions to (or Estimation with) Heteroskedasticity
- If heteroskedasticity is detected, first check for some other specification error (omitted variables, wrong functional form, ...).
- If it persists even after correcting for other specification errors, use one of the following:
  1. Use a better method of estimation (WLS/FGLS);
  2. Stick to OLS but use robust (heteroskedasticity-consistent) standard errors.



5. Autocorrelation
- Error terms are autocorrelated if error terms from different (usually adjacent) time periods (or cross-sectional units) are correlated: $E(u_i u_j) \neq 0$ for $i \neq j$.
- Autocorrelation in cross-sectional data is called spatial autocorrelation (in space, not over time).
- However, spatial autocorrelation is uncommon, since cross-sectional data do not usually have a logical or economically meaningful ordering.
- Serial correlation occurs in time series when errors associated with a given time period carry over into future time periods: $u_t$ is correlated with its lagged values $u_{t-1}, u_{t-2}, \dots$



- The effects of autocorrelation are similar to those of heteroskedasticity: OLS coefficients are unbiased and consistent, but inefficient; the estimate of $\sigma^2$ is biased, and thus inferences are invalid.
Detecting Autocorrelation
- A plot of the OLS residuals against the time variable, or a formal test, could be used.
The Breusch-Godfrey Test
- A commonly used general test of autocorrelation. Steps:
  1. Regress the OLS residuals on the X's and the lagged residuals:
     $\hat u_t = f(X_{1t}, \dots, X_{Kt}, \hat u_{t-1}, \dots, \hat u_{t-j})$
  2. Test the joint hypothesis that all the coefficients on the lagged residuals are zero, using the test statistic $F_{cal}$ for the $j$ restrictions (asymptotically, $j \cdot F_{cal} \sim \chi^2_j$);
  3. Reject $H_0$: no serial correlation for high values of the test statistic or for small p-values.
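For concreteness, the auxiliary regression in step 1 is usually taken to be linear. The block below writes it out under that assumption; the coefficient names $\alpha$, $\rho$ and the error $v_t$ are notation introduced here, and the LM form of the statistic is the common textbook alternative to the F form.

```latex
\begin{align*}
\hat u_t &= \alpha_0 + \alpha_1 X_{1t} + \dots + \alpha_K X_{Kt}
          + \rho_1 \hat u_{t-1} + \dots + \rho_j \hat u_{t-j} + v_t \\
H_0 &: \rho_1 = \rho_2 = \dots = \rho_j = 0 \quad \text{(no serial correlation up to order } j\text{)} \\
LM &= (n - j)\,R^2 \;\overset{a}{\sim}\; \chi^2_j
\end{align*}
```

Here $R^2$ is taken from the auxiliary regression, and $n - j$ is the number of observations left after losing $j$ to the lags.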



Estimation in the Presence of Serial Correlation:
- Solutions depend on the sources of the problem.
- Autocorrelation may result from:
  - model misspecification (omitted variables, wrong functional form, ...);
  - misspecified dynamics (e.g., a static model estimated when the dependence is dynamic), ...
- If autocorrelation is significant, check for model specification errors and consider re-specification.
- If the revised model passes other specification tests but still fails tests of autocorrelation, consider the following key solutions:
  1. Use FGLS;
  2. Use OLS with robust (heteroskedasticity- and autocorrelation-consistent) standard errors.



6. Endogenous Regressors: $E(u_i \mid X_j) \neq 0$
- A key assumption maintained in the previous lessons is that the model, $E(Y \mid X) = X\beta$ or $E(Y \mid X) = \beta_0 + \sum_{i=1}^{k} \beta_i X_i$, was correctly specified.
- The model $Y_i = X\beta + u_i$ is correctly specified if:
  1. $u_i$ is orthogonal to the X's, enters the model with an additively separable effect on Y, and this effect equals zero on average; and
  2. $E(Y \mid X)$ is linear in stable parameters (β's).
- If the assumption $E(u_i \mid X_j) = 0$ is violated, the OLS estimators will be biased and inconsistent.
- Assuming exogenous regressors is unrealistic in many situations.



- The possible sources of endogeneity are:
  a. stochastic regressors and measurement error;
  b. specification errors: omission of relevant variables or a wrong functional form;
  c. nonlinearity in and instability of parameters; and
  d. a bidirectional link between the X's and Y.
- Recall the two versions of the exogeneity assumption:
  1. $E(u_i) = 0$ and the X's are fixed (non-stochastic);
  2. $E(u_i X_j) = 0$ or $E(u_i \mid X_j) = 0$ with stochastic X's.
- The assumption $E(u_i) = 0$ amounts to: "We do not systematically over- or under-estimate the PRF," or the overall impact of all the excluded variables is random/unpredictable.



- This assumption cannot be tested, as the residuals will always have zero mean if the model has an intercept.
- If there is no intercept, some information can be obtained by plotting the residuals.
- If $E(u_i) = \mu$ (a constant but $\neq 0$) and the X's are fixed, the estimators of all the β's except $\beta_0$ are unaffected: the intercept absorbs $\mu$.
- But can we assume non-stochastic regressors?



7. Stochastic Regressors and Measurement Error
A. Stochastic Regressors
- Many economic variables are stochastic, and it is only for convenience that we assumed fixed X's.
- For instance, the set of regressors may include:
  - a lagged dependent variable ($Y_{t-1}$), or
  - an X measured with error.
- In both cases, it is unreasonable to assume fixed X's.
- If no other assumption is violated, OLS retains its desirable properties even if the X's are stochastic.
- In general, stochastic regressors may or may not be correlated with the model error term:
  1. If $X_j$ and $u_i$ are independently distributed, so that $E(u_i \mid X_j) = 0$, OLS retains all its desirable properties.



  2. If X and $u_i$ are not independent but are either contemporaneously uncorrelated [$E(u_i \mid X_{i \pm s}) \neq 0$ for $s = 1, 2, \dots$ but $E(u_i \mid X_i) = 0$] or asymptotically uncorrelated, OLS retains its large-sample properties: the estimators are biased, but consistent and asymptotically efficient.
  3. In that case, the basis for valid statistical inference remains, but inferences must be based on large samples.
  4. If $X_j$ and $u_i$ are not independent and are correlated even asymptotically, then the OLS estimators are biased and inconsistent.
     SOLUTION: IV/2SLS REGRESSION!
- It is not whether the X's are stochastic or fixed that matters, but the nature of the correlation between the X's and $u_i$.
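To make the IV remedy concrete, here is a small sketch of the simple (one regressor, one instrument) IV estimator, $\hat\beta_{IV} = \mathrm{cov}(z, Y)/\mathrm{cov}(z, X)$, next to OLS. Everything in it is an illustrative assumption: the data are constructed by hand so that the instrument `z` is exactly uncorrelated with the error while X is correlated with it, and `cov_sum` is a hypothetical helper.

```python
def cov_sum(a, b):
    """Sum of cross-products of deviations from the means."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b))

# Hypothetical construction: z is the instrument, e is the error.
# e has mean zero and is orthogonal to z in this sample.
z = [-2.0, -1.0, 0.0, 1.0, 2.0]
e = [1.0, -2.0, 2.0, -2.0, 1.0]

# X depends on both z and e, so X is correlated with the error (endogenous).
X = [5.0 + zi + ei for zi, ei in zip(z, e)]
u = e
Y = [1.0 + 2.0 * x + ui for x, ui in zip(X, u)]   # true slope is 2

beta_ols = cov_sum(X, Y) / cov_sum(X, X)   # picks up cov(X, u): above 2
beta_iv = cov_sum(z, Y) / cov_sum(z, X)    # uses only variation due to z: equals 2
print(beta_ols, beta_iv)
```

In real samples the instrument is never exactly orthogonal to the error, so the IV estimate is consistent rather than exact; the hand-built data only isolate the mechanism.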



B. Measurement Error
- Measurement error in the regressand only does not cause bias in the OLS estimators, as long as the measurement error is not systematically related to one or more of the regressors.
- If the measurement error in Y is uncorrelated with the X's, OLS is perfectly applicable (though with less precision, i.e., higher variances).
- If there is measurement error in a regressor, and this error is correlated with the measured variable, then the OLS estimators will be biased and inconsistent.
  SOLUTION: IV/2SLS REGRESSION!
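The direction of the bias can be made explicit in the textbook "classical errors-in-variables" case. The derivation below is a sketch under assumptions not stated on the slides: a single regressor, a true value $X^*$ observed as $X = X^* + v$, and a measurement error $v$ that is uncorrelated with both $X^*$ and $u$ (which makes it correlated with the measured $X$).

```latex
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_i^* + u_i, \qquad X_i = X_i^* + v_i \\
Y_i &= \beta_0 + \beta_1 X_i + (u_i - \beta_1 v_i)
     \quad\Rightarrow\quad \mathrm{cov}(X_i,\, u_i - \beta_1 v_i) = -\beta_1 \sigma_v^2 \neq 0 \\
\operatorname{plim}\, \hat\beta_1 &= \beta_1 \cdot \frac{\sigma_{X^*}^2}{\sigma_{X^*}^2 + \sigma_v^2}
\end{align*}
```

Under these assumptions the OLS slope is pulled toward zero (attenuation), and the bias does not vanish as the sample grows.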



8. Specification Errors
- Model misspecification may result from:
  - omission of relevant variable(s),
  - using a wrong functional form, or
  - inclusion of irrelevant variable(s).
i. Omission of relevant variables: when one or more relevant variables are omitted from a model.
- Omitted-variable bias: bias in the parameter estimates when the assumed specification is incorrect in that it omits a regressor that must be in the model.
- e.g., estimating $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$ when the correct model is $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 Z + u$. Wrongly omitting a variable (Z) is equivalent to imposing $\beta_3 = 0$ when in fact $\beta_3 \neq 0$.



- If a relevant regressor (Z) is missing from a model, the OLS estimators of the β's ($\beta_0$, $\beta_1$ and $\beta_2$) will be biased, except if $\mathrm{cov}(Z, X_1) = \mathrm{cov}(Z, X_2) = 0$.
- Even if $\mathrm{cov}(Z, X_1) = \mathrm{cov}(Z, X_2) = 0$, the estimator for $\beta_0$ is biased.
- The OLS estimators for $\sigma^2$ and for the standard errors of the β's are also biased.
- Consequently, t- and F-tests will not be valid.
- Generally, OLS estimators will be biased and inconsistent, and the inferences will be invalid.
- The decision to include/exclude variables should be guided by economic theory and reasoning.
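The size of the omitted-variable bias can be seen in a stripped-down sketch with a single included regressor (a simplification of the two-regressor example above). With the true model $Y = \beta_0 + \beta_1 X_1 + \beta_3 Z$ and Z omitted, the short-regression slope equals $\beta_1 + \beta_3 \delta$, where $\delta$ is the slope from regressing Z on $X_1$. The data are hand-built and noise-free, and `slope` is a hypothetical helper.

```python
def slope(x, y):
    """OLS slope from a simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: Z is correlated with X1 (Z = 0.5*X1 + w, w orthogonal to X1).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
w = [1.0, -2.0, 2.0, -2.0, 1.0]
z = [0.5 * a + b for a, b in zip(x1, w)]

beta0, beta1, beta3 = 1.0, 2.0, 3.0
y = [beta0 + beta1 * a + beta3 * c for a, c in zip(x1, z)]  # noise-free for clarity

delta = slope(x1, z)          # slope from regressing Z on X1
short_slope = slope(x1, y)    # regression that wrongly omits Z
print(short_slope, beta1 + beta3 * delta)  # both equal 3.5, not the true 2.0
```

If `z` is built without the `0.5 * a` term, `delta` is zero and the short regression recovers the true slope, matching the exception noted above.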



ii. Error in the algebraic form of the relationship:
- A model that includes all the relevant regressors may still be misspecified due to an error in the functional form relating Y to the X's.
- e.g., using a linear functional form when the true relationship is logarithmic (log-log) or semi-logarithmic (lin-log or log-lin).
- The effects of functional form misspecification are the same as those of omitting relevant variables.



Testing for Omitted Variables (OVs) and Functional Form Misspecification
1. Examination of Residuals
- Often, the plot of residuals against fitted values is used for a quick check of problems like non-linearity.
- Ideally, we would like to see the residuals randomly scattered around zero.
- If there are errors such as OVs or an incorrect functional form, the plot exhibits distinct patterns.



2. Ramsey's Regression Equation Specification Error Test (RESET)
- It tests for misspecification due to omitted variables or a wrong functional form.
- Steps:
  1. Regress Y on the X's, and get $\hat Y$ and $\hat u$.
  2. Regress Y on the X's, $\hat Y^2$ and $\hat Y^3$.
  3. If $\hat Y^2$ and $\hat Y^3$ are jointly significant (using an F test), then reject $H_0$ and conclude that there is misspecification.
- If the model is misspecified, then try another model: look for variables that have been left out and/or try a different functional form such as log-linear (but based on some theory).
- The test (by rejecting the null) does not suggest an alternative specification.
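The F test in step 3 can be written out as follows. This is a sketch in notation introduced here ($\gamma_1$, $\gamma_2$, $v$, and $RSS_R$, $RSS_U$ for the residual sums of squares of the regressions in steps 1 and 2), assuming K regressors plus an intercept in the original model.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X_1 + \dots + \beta_K X_K + \gamma_1 \hat Y^2 + \gamma_2 \hat Y^3 + v \\
H_0 &: \gamma_1 = \gamma_2 = 0 \\
F &= \frac{(RSS_R - RSS_U)/2}{RSS_U/(n - K - 3)} \sim F_{2,\; n-K-3} \quad \text{under } H_0
\end{align*}
```

The denominator degrees of freedom reflect the $K + 3$ parameters of the augmented regression.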



iii. Inclusion of irrelevant variables:
- when one or more irrelevant variables are wrongly included in the model, e.g., estimating $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + u$ when the correct model is $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$.
- Consequence: OLS estimators remain unbiased and consistent but inefficient.
- $\sigma^2$ is correctly estimated and conventional hypothesis-testing methods are still valid, but the estimated variances of the coefficients are larger.
- As a result, our probability inferences about the parameters are less precise, i.e., precision is lost if the correct restriction $\beta_3 = 0$ is not imposed.



- To test for irrelevant variables, use F-tests (based on the restricted and unrestricted residual sums of squares, RRSS and URSS).
- Do not eliminate variables from a model based only on insignificance implied by t-tests.
- In particular, do not drop a variable with $|t| > 1$.
- Do not drop 2 or more variables at once (based on t-tests) even if each has $|t| < 1$.
- The t-statistic corresponding to one X ($X_j$) may change radically once another ($X_i$) is dropped.
- In general, model misspecification due to the inclusion of irrelevant variables is less serious than that due to the omission of relevant variable(s).



9. Stability of Parameters and Dummy Variables Regression (DVR)
a. Stability of Parameters
- So far we assumed that the intercept and all the slope coefficients ($\beta_j$'s) are the same/stable for the whole set of observations:
  $Y = X\beta + u$
- But structural shifts and/or group differences are common in the real world. It may be that:
  - the intercept differs/changes, or
  - the (partial) slope differs/changes, or
  - both differ/change across categories or time periods.
- Two methods for testing parameter stability: (i) using Chow tests, or (ii) using DVR.
A. The Chow Tests
- These use an F-test to determine whether a single regression is more efficient than two or more separate regressions on sub-samples.



- The stages in running the Chow test are:
  1. Run 2 separate regressions (say, before and after a war or policy reform) and save the RSS's: $RSS_1$ and $RSS_2$.
     $RSS_1$ has $n_1 - (K + 1)$ df and $RSS_2$ has $n_2 - (K + 1)$ df.
     $RSS_1 + RSS_2 = URSS$, with $n_1 + n_2 - 2(K + 1)$ df.
  2. Estimate the pooled model (under $H_0$: the β's are stable).
     The RSS from this model is $RRSS$, with $n - (K + 1)$ df, where $n = n_1 + n_2$.
     The test statistic (under $H_0$):
     $$F_{cal} = \frac{(RRSS - URSS)/(K + 1)}{URSS/(n - 2(K + 1))}$$
  3. Find the critical value $F_{K+1,\; n-2(K+1)}$ from the table.
  4. If $F_{cal} > F_{tab}$, reject $H_0$ of stable parameters (and favour $H_a$: there is a structural break).



- e.g.: we have the following results from the estimation of real consumption on real disposable income:
  i. For the period 1974-1991: $cons_i = \alpha_1 + \beta_1 \cdot inc_i + u_i$
     Consumption = 153.95 + 0.75*Income
     p-value:      (0.000)  (0.000)
     RSS = 4340.26114; $R^2$ = 0.9982
  ii. For the period 1992-2006: $cons_i = \alpha_2 + \beta_2 \cdot inc_i + u_i$
     Consumption = 1.95 + 0.806*Income
     p-value:      (0.975) (0.000)
     RSS = 10706.2127; $R^2$ = 0.9949
  iii. For the period 1974-2006: $cons_i = \alpha + \beta \cdot inc_i + u_i$
     Consumption = 77.64 + 0.79*Income
     t-ratio:      (4.96)  (155.56)
     RSS = 22064.6663; $R^2$ = 0.9987



  1. $URSS = RSS_1 + RSS_2 = 4340.26114 + 10706.2127 = 15046.474$
  2. $RRSS = 22064.6663$
     $K = 1$ and $K + 1 = 2$; $n_1 = 18$, $n_2 = 15$, $n = 33$.
  3. Thus, $F_{cal} = \dfrac{(22064.6663 - 15046.474)/2}{15046.474/29} = 6.763$
  4. p-value = $\mathrm{Prob}(F_{2,29} > 6.7632981) = 0.003883$
  5. Reject $H_0$ at $\alpha = 1\%$. Thus, there is a structural break.
- The pooled consumption model is an inadequate specification; we should run separate regressions.
- The above method of calculating the Chow test breaks down if either $n_1 < K + 1$ or $n_2 < K + 1$.
- Solution: use Chow's second (predictive) test!
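The arithmetic of the worked example can be reproduced with a short script. This is a sketch: the function names are hypothetical, and the p-value uses a closed form that is valid only when the numerator has exactly 2 degrees of freedom (as here, since K + 1 = 2), namely $P(F_{2,d} > f) = (1 + 2f/d)^{-d/2}$.

```python
def chow_f(rss_pooled, rss1, rss2, n1, n2, n_params):
    """Chow F statistic and its denominator df; n_params = K + 1 per regression."""
    urss = rss1 + rss2
    df2 = n1 + n2 - 2 * n_params
    f_stat = ((rss_pooled - urss) / n_params) / (urss / df2)
    return f_stat, df2

def f_upper_tail_2_num_df(f_stat, df2):
    """P(F > f_stat) for an F distribution with 2 numerator df (closed form)."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

# RSS values and sample sizes taken from the consumption example.
f_cal, df2 = chow_f(22064.6663, 4340.26114, 10706.2127, n1=18, n2=15, n_params=2)
p_value = f_upper_tail_2_num_df(f_cal, df2)
print(round(f_cal, 3), round(p_value, 6))  # 6.763 0.003883
```

For models with more than one slope the numerator df differs from 2, and the p-value would need a general F distribution routine instead of this shortcut.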



- If, for instance, $n_2 < K + 1$, then the F-statistic is altered as follows:
  $$F_{cal} = \frac{(RRSS - RSS_1)/n_2}{RSS_1/(n_1 - (K + 1))}$$
- The Chow test tells us whether the parameters differ on average, but not which parameters differ.
- Also, it requires that all groups have the same $\sigma^2$.
- This assumption is questionable: if the parameters can be different, then so can the variances.
- One way of correcting for unequal $\sigma^2$ is to use dummy variable regression with robust standard errors.
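A dummy variable regression for a single break can be written as below. This is a sketch for the one-regressor case; the dummy $D_i$ and the coefficient names $\gamma$ and $\delta$ are notation introduced here for illustration.

```latex
\begin{align*}
Y_i &= \alpha + \beta X_i + \gamma D_i + \delta\,(D_i X_i) + u_i,
\qquad D_i = \begin{cases} 0 & \text{first regime} \\ 1 & \text{second regime} \end{cases} \\
\text{first regime:}\quad E(Y_i \mid X_i) &= \alpha + \beta X_i \\
\text{second regime:}\quad E(Y_i \mid X_i) &= (\alpha + \gamma) + (\beta + \delta) X_i \\
H_0 &: \gamma = \delta = 0 \quad \text{(stable parameters)}
\end{align*}
```

Unlike the Chow test, the separate t-tests on $\gamma$ and $\delta$ indicate whether it is the intercept, the slope, or both that differ between regimes.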



************* End of Chapter Four *************

