
FUNDAMENTALS OF ECONOMETRICS
DR ABDUL WAHEED
PhD Econometrics
FUNDAMENTALS OF ECONOMETRICS
Week 3
Lecture 5
The Classical Linear Regression Model (CLRM)
The Assumptions Underlying the Method of Least Squares
The regression model was first developed by Gauss in 1821.
It has served as a norm or standard against which regression models that do not satisfy the Gaussian assumptions may be compared.
The Classical Linear Regression Model (CLRM)
The OLS method is used to estimate $\beta_1$ and $\beta_2$ and obtain $\hat{\beta}_1$ and $\hat{\beta}_2$.
How close are $\hat{\beta}_1$ and $\hat{\beta}_2$ to their counterparts in the population, and how close is $\hat{Y}_i$ to the true $E(Y \mid X_i)$?
$Y_i$ depends on both $X_i$ and $u_i$.
Thus, the assumptions made about the 𝑿𝒊 variable(s) and the error
term are extremely critical to the valid interpretation of the regression
estimates.
The Classical Linear Regression Model (CLRM)
The Gaussian, standard, or classical linear
regression model (CLRM), which is the
cornerstone of most econometric theory,
makes 7 assumptions.
CLRM-Assumptions
ASSUMPTION 1
Linear Regression Model
The regression model is linear in the parameters, though it may or may
not be linear in the variables.
That is, the regression model is as shown below:

$Y_i = \beta_1 + \beta_2 X_i + u_i$

This model can be extended to include more explanatory variables.
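As a quick illustration, the sketch below simulates data from this two-variable model with hypothetical parameter values and recovers the coefficients with the closed-form OLS formulas; all names and numbers are illustrative, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters, chosen only for illustration.
beta1, beta2 = 2.0, 0.5
n = 100

X = rng.uniform(50, 250, size=n)   # regressor values
u = rng.normal(0, 1, size=n)       # disturbances
Y = beta1 + beta2 * X + u          # model linear in the parameters

# Closed-form OLS estimators of slope and intercept.
b2_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b1_hat = Y.mean() - b2_hat * X.mean()

print(f"beta1_hat = {b1_hat:.3f}, beta2_hat = {b2_hat:.3f}")
```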


CLRM-Assumptions
ASSUMPTION 2
Regressors X are Fixed, or X Values are Independent of the Error Term
Values taken by the regressor X may be considered fixed in repeated samples, or they may be sampled along with the dependent variable Y (the case of stochastic regressors).
In the latter case, it is assumed that the X variable(s) and the error term are independent, that is, $\operatorname{cov}(X_i, u_i) = 0$.
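A small simulation (a sketch, with arbitrary distributions) shows what this assumption looks like in data: when a stochastic X is drawn independently of u, the sample covariance between them is close to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

X = rng.normal(100, 15, size=n)   # stochastic regressor
u = rng.normal(0, 2, size=n)      # disturbances drawn independently of X

# Independence implies cov(X, u) = 0; the sample covariance is near zero.
print(f"sample cov(X, u) = {np.cov(X, u)[0, 1]:.4f}")
```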
CLRM-Assumptions
[Figure: the population regression line $E(Y \mid X_i) = \beta_1 + \beta_2 X_i$, the unconditional mean $E(Y) = 121.20$, and the distribution of Y given X = $220.]
CLRM-Assumptions
ASSUMPTION 3
Zero Mean Value of Disturbance $u_i$
Given the value of $X_i$, the mean or expected value of the random disturbance term $u_i$ is zero.
Symbolically,
$E(u_i \mid X_i) = 0$
CLRM-Assumptions
[Figure: conditional distribution of the disturbances $u_i$.]
CLRM-Assumptions
The positive 𝒖𝒊 values cancel out the negative 𝒖𝒊 values so
that their average or mean effect on Y is zero.
There is no specification bias or specification error in the
model used in empirical analysis. The regression model is
correctly specified.
Specification Error:
Leaving out important explanatory variables,
Including unnecessary variables,
Choosing the wrong functional form of the relationship
between the Y and X variables.
CLRM-Assumptions
If the conditional mean of one random variable given
another random variable is zero, the covariance between
the two variables is zero and hence the two variables are
uncorrelated.
But if X and u are correlated, it is not possible to assess
their individual effects on Y.
In this situation it is quite possible that the error term
actually includes some variables that should have been
included as additional regressors in the model.
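To see why correlation between X and u is so damaging, the Monte Carlo sketch below (a hypothetical setup: an omitted variable z that drives both X and u) shows the OLS slope settling around a biased value rather than the assumed true one.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 2_000
beta1, beta2 = 1.0, 2.0   # hypothetical true values

slopes = []
for _ in range(reps):
    z = rng.normal(size=n)        # variable omitted from the model
    X = z + rng.normal(size=n)    # X is correlated with z
    u = z + rng.normal(size=n)    # z ends up in the error, so cov(X, u) != 0
    Y = beta1 + beta2 * X + u
    slopes.append(np.sum((X - X.mean()) * (Y - Y.mean()))
                  / np.sum((X - X.mean()) ** 2))

# The estimates center near beta2 + cov(X, u)/var(X) = 2 + 1/2 = 2.5, not 2.
print(f"average slope estimate: {np.mean(slopes):.3f} (true value {beta2})")
```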
CLRM-Assumptions
ASSUMPTION 4
Homoscedasticity or Constant Variance of $u_i$
The variance of the error, or disturbance, term is the same regardless of the value of X.
Symbolically,
$\operatorname{var}(u_i \mid X_i) = \sigma^2$
CLRM-Assumptions
The variance of 𝒖𝒊 for each 𝑿𝒊 (i.e., the conditional variance
of 𝒖𝒊 ) is some positive constant number equal to 𝝈𝟐 .
Technically, it is the assumption of homoscedasticity, or
equal spread or equal variance.
Stated differently, the Y populations corresponding to various X values have the same variance.
In contrast, the conditional variance of the Y population
varies with X. This situation is known appropriately as
heteroscedasticity, or unequal spread, or unequal variance.
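The contrast can be made concrete with simulated disturbances (a sketch with arbitrary numbers): one error series has the same variance at every X, the other has a standard deviation that grows with X.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
X = rng.uniform(1, 10, size=n)

u_homo = rng.normal(0, 2.0, size=n)   # var(u|X) = 4 at every X
u_hetero = rng.normal(0, 0.5 * X)     # standard deviation grows with X

# Compare the disturbance variance in the low-X and high-X halves.
low, high = X < 5.5, X >= 5.5
print("homoscedastic  :", round(u_homo[low].var(), 2), round(u_homo[high].var(), 2))
print("heteroscedastic:", round(u_hetero[low].var(), 2), round(u_hetero[high].var(), 2))
```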
CLRM-Assumptions
[Figure: homoscedastic vs. heteroscedastic disturbances.]
CLRM-Assumptions
ASSUMPTION 5
No Autocorrelation between the Disturbances
Given any two X values, $X_i$ and $X_j$ ($i \neq j$), the correlation between any two disturbances $u_i$ and $u_j$ ($i \neq j$) is zero.
In short, the observations are sampled independently.
Symbolically,
$\operatorname{cov}(u_i, u_j \mid X_i, X_j) = 0$
where 𝒊 and 𝒋 are two different observations.
CLRM-Assumptions
Simply put, the disturbances $u_i$ and $u_j$ are uncorrelated. Technically, this is the assumption of no serial correlation, or no autocorrelation. This means that, given $X_i$, the deviations of any two Y values from their mean value do not exhibit patterns.
If the disturbances (deviations) follow systematic patterns, there is auto- or serial correlation.
Autocorrelation: the error terms across observations are correlated with each other, i.e., $u_i$ is correlated with $u_j$ (the correlation between successive values of the error term).
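A common way such systematic patterns arise is a first-order autoregressive, AR(1), process; the sketch below (illustrative values, with coefficient 0.8) generates such disturbances and shows that successive errors are strongly correlated, violating the assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
n, rho = 500, 0.8   # illustrative sample size and AR(1) coefficient

# AR(1) disturbances: u_t = rho * u_{t-1} + e_t
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + e[t]

# Correlation between successive disturbances is near rho, not zero.
print(f"corr(u_t, u_(t-1)) = {np.corrcoef(u[1:], u[:-1])[0, 1]:.2f}")
```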
CLRM-Assumptions

[Figure: disturbance patterns showing positive serial correlation, negative serial correlation, and zero correlation.]
CLRM-Assumptions
ASSUMPTION 6
The Number of Observations n Must Be Greater than the Number of Parameters to Be Estimated, and No Specification Bias

With fewer observations than parameters, there is no way to estimate all the unknowns (parameters).
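In matrix terms, with fewer observations than parameters the cross-product matrix X'X cannot reach full rank, so the normal equations have no unique solution; a quick sketch with 2 observations and 3 parameters:

```python
import numpy as np

rng = np.random.default_rng(5)

# 2 observations but 3 parameters (intercept + 2 slopes): X'X is singular.
X = np.column_stack([np.ones(2), rng.normal(size=(2, 2))])
XtX = X.T @ X

# Rank 2 instead of 3: XtX has no inverse, so no unique OLS solution exists.
print(f"rank(X'X) = {np.linalg.matrix_rank(XtX)} (full rank would be 3)")
```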
CLRM-Assumptions
ASSUMPTION 7
The Nature of X Variables

1. There are no perfect linear relationships among the X variables. This is the assumption of no multicollinearity.
2. The X values in a given sample must not all be the same.
3. There can be no outliers in the values of the X variable.
CLRM-Assumptions
No multicollinearity: the requirement that there be no exact linear relationship among the explanatory variables is technically known as the assumption of no collinearity or no multicollinearity.
In short, the assumption of no multicollinearity requires that in
the PRF we include only those variables that are not exact linear
functions of one or more variables in the model.
In short, the assumption of no multicollinearity requires that in
the PRF we include only those variables that are not exact linear
functions of one or more variables in the model.
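The same rank problem seen under Assumption 6 appears under perfect multicollinearity; in the sketch below (hypothetical regressors, with x2 an exact linear function of x1), X'X again fails to reach full rank, so the individual coefficients cannot be separately estimated.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100
x1 = rng.normal(size=n)
x2 = 3 * x1 + 1   # exact linear function of x1: perfect multicollinearity

X = np.column_stack([np.ones(n), x1, x2])
# Rank 2 instead of 3: the normal equations have no unique solution,
# so separate coefficients on x1 and x2 cannot be estimated.
print(f"rank(X'X) = {np.linalg.matrix_rank(X.T @ X)} (full rank would be 3)")
```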
CLRM-Assumptions
If all the X values are identical, then $X_i = \bar{X}$ and the denominator of the OLS slope formula, $\sum (X_i - \bar{X})^2$, will be zero, making it impossible to estimate $\beta_2$ and therefore $\beta_1$.
If there is very little variation in X then it is not possible to
explain much of the variation in the Y. The variation in both Y
and X is essential to use regression analysis.
In short, the variables must vary!
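A one-line check (a sketch with an arbitrary constant X) confirms the point: with identical X values, the denominator of the slope formula collapses to zero.

```python
import numpy as np

X = np.full(20, 80.0)   # every X value identical, so X_i equals the mean
denominator = np.sum((X - X.mean()) ** 2)
print(denominator)      # 0.0: the OLS slope formula breaks down
```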
CLRM-Assumptions
The requirement that there are no outliers in the X values is to
avoid the regression results being dominated by such outliers.
If there are a few X values that are, say, 20 times the average of
the X values, the estimated regression lines with or without such
observations might be vastly different.
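The sketch below makes this concrete with made-up numbers: adding a single X value roughly 20 times the average of the rest noticeably drags the estimated slope away from the value fitted on the clean data.

```python
import numpy as np

def ols_slope(X, Y):
    """Closed-form OLS slope."""
    return np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)

rng = np.random.default_rng(7)
X = rng.uniform(5, 15, size=30)
Y = 1.0 + 0.5 * X + rng.normal(0, 1, size=30)   # hypothetical true slope 0.5

# Append one observation whose X is about 20 times the average X.
X_out = np.append(X, 200.0)
Y_out = np.append(Y, 40.0)

print(f"slope without outlier: {ols_slope(X, Y):.3f}")
print(f"slope with outlier   : {ols_slope(X_out, Y_out):.3f}")
```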
CLRM-Assumptions
ASSUMPTION 8
Normality of the distribution
Although it is not part of the CLRM, it is assumed that the error term follows the normal distribution with zero mean and (constant) variance.
Symbolically,
$u_i \sim N(0, \sigma^2)$
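As a quick empirical check (a sketch with an arbitrary sigma), normal draws for $u_i$ should show a mean near zero, a variance near $\sigma^2$, skewness near 0, and kurtosis near 3:

```python
import numpy as np

rng = np.random.default_rng(8)
sigma = 1.5
u = rng.normal(0, sigma, size=100_000)   # u_i ~ N(0, sigma^2)

print(f"mean = {u.mean():.3f}, variance = {u.var():.3f} (sigma^2 = {sigma**2})")

# Standardized third and fourth moments: near 0 and 3 for a normal variable.
z = (u - u.mean()) / u.std()
print(f"skewness = {np.mean(z**3):.3f}, kurtosis = {np.mean(z**4):.3f}")
```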
CLRM-Assumptions
Each assumption and the type of violation associated with it:
Assumption 1: Nonlinearity in parameters.
Assumption 2: Stochastic regressor(s).
Assumption 3: Nonzero mean of $u_i$ and specification bias.
Assumption 4: Heteroscedasticity.
Assumption 5: Autocorrelated disturbances.
Assumption 6: Fewer sample observations than parameters to be estimated.
Assumption 7: Multicollinearity and insufficient variability in regressors.
Assumption 8: Non-normality of disturbances.
The Gauss–Markov Theorem
Given the assumptions of the classical linear regression model,
the least-squares estimates possess some ideal or optimum
properties.
These properties are contained in the well-known Gauss–
Markov theorem.
To understand this theorem, consider the best linear
unbiasedness property of an estimator.
The Gauss–Markov Theorem
On the basis of Assumptions 1 to 7, the method of ordinary least squares (OLS) gives the best linear unbiased estimators (BLUE).
1. Estimators are linear functions of the dependent variable Y.
The estimators are linear, that is, they are linear functions of the
dependent variable Y. Linear estimators are easy to understand
and deal with compared to nonlinear estimators.
The Gauss–Markov Theorem
2. The estimators are unbiased.
The estimators are unbiased, that is, in repeated applications of
the method, on average, the estimators are equal to their true
values.
3. In the class of linear estimators, OLS estimators are best.
In the class of linear unbiased estimators, OLS estimators have
minimum variance. As a result, the true parameter values can be
estimated with least possible uncertainty; an unbiased estimator
with the least variance is called an efficient estimator.
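A Monte Carlo sketch (hypothetical parameter values, X held fixed across replications) illustrates both properties at once: OLS and a rival linear unbiased estimator, the slope through the two extreme X points, are each centered on the true value, but OLS shows the smaller variance.

```python
import numpy as np

rng = np.random.default_rng(9)
n, reps = 50, 5_000
beta1, beta2 = 3.0, 1.5             # hypothetical true values
X = rng.uniform(0, 10, size=n)      # fixed in repeated samples

ols, rival = [], []
for _ in range(reps):
    u = rng.normal(0, 2, size=n)
    Y = beta1 + beta2 * X + u
    # OLS slope.
    ols.append(np.sum((X - X.mean()) * (Y - Y.mean()))
               / np.sum((X - X.mean()) ** 2))
    # Rival linear unbiased estimator: slope through the extreme X points.
    i, j = X.argmin(), X.argmax()
    rival.append((Y[j] - Y[i]) / (X[j] - X[i]))

# Both means are near 1.5 (unbiased); OLS has the smaller variance (best).
print(f"OLS  : mean = {np.mean(ols):.3f}, variance = {np.var(ols):.5f}")
print(f"rival: mean = {np.mean(rival):.3f}, variance = {np.var(rival):.5f}")
```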
The Gauss–Markov Theorem
In short, under the assumed conditions, OLS estimators are BLUE (best linear unbiased estimators).
This is the essence of the well-known Gauss–Markov theorem.
