
CHAPTER 4

CLRM Assumptions and BLUE
Learning Objectives
• By the end of this chapter, students will
– be familiar with the classical assumptions
– know the statistical properties of the estimators under the classical assumptions
– understand the BLUE properties



Classical Assumptions
• In regression analysis, it is very important for us to check whether our estimated regression model meets certain assumptions made about the independent variables and the error term.
• There is a set of Gaussian, standard, or classical assumptions for us to follow in order for the OLS estimators to display desirable properties.
• Any interpretation of the regression estimates will be invalid if these assumptions are violated.
• A linear regression model that fulfills these assumptions is known as the Classical Linear Regression Model (CLRM).



• Why do we need the CLRM assumptions? The objective is not only to obtain estimates of the betas (both the intercept and the coefficient of X) but also to draw inferences about the true betas: how close the estimated betas are to their counterparts in the population, or how close Y is to the true E(Y|X).
• In this regard, we need to make certain assumptions about the manner in which Y is generated. If the PRF is Y = β0 + β1X1 + ε, then Y depends on X and ε, and unless we are specific about how these variables are generated, there is no way we can make any statistical inference about Y.
• Thus, the assumptions made about X and the error term are extremely important for valid interpretation of the regression estimates.
• First developed by Gauss in 1821.



Assumption 1: The regression model is linear in the parameters.
Assumption 2: Explanatory variables are nonstochastic
Assumption 3: Zero mean of error term
Assumption 4: Equal variance of error term
Assumption 5: No autocorrelation between error terms
Assumption 6: Zero correlation between explanatory variables and error
term
Assumption 7: The number of observations n must be greater than the
number of explanatory variables (k).
Assumption 8: Variability in values of explanatory variables
Assumption 9: The regression model is correctly specified
Assumption 10: No perfect multicollinearity between explanatory variables
The Classical Assumptions
– Assumption 1: Yi = α̂ + β̂Xi + ûi
• The model is linear in the parameters
– Assumption 2: Explanatory variables are nonstochastic
• Our regression is a conditional analysis, in the sense that we estimate the true value of Yi conditional on the given values of the explanatory variables.
• As such, the explanatory variables are assumed to be nonstochastic, in the sense that their values are fixed in repeated sampling.



– Assumption 3: E(ui | Xi) = 0
• Zero mean of the disturbance ui
• Given the value of X, the mean, or expected value of the
disturbance term, ui , is zero.
• So to speak, the positive ui values and the negative ui values neutralize each other, so that their average effect on Yi is zero.
• This assumption says that the factors not explicitly included in the model do not systematically affect the expected value of Yi.



[Figure: Conditional distribution of the disturbances - the line Yi = α + βXi + ui, with the conditional mean E(ui|Xi) on the line at X1, X2, X3 and positive/negative ui scattered around it.]
– Assumption 4:
• Homoscedasticity means equal (homo) spread (scedasticity), or equal variance of ui
• Given the value of X, the variance of the disturbance term ui is the same for all observations.

var(ui | Xi) = E[ui - E(ui | Xi)]²
             = E(ui² | Xi)    (uses Assumption 3)
             = σ²

• In passing, note that errors with different variances for different observations are known as heteroscedastic errors.



[Figure: Homoscedasticity - the probability density function of Yi is identical at two levels of family income, Xi (e.g. X1 = 80 and X2 = 100).]
[Figure: Heteroscedasticity - the variance of Yi increases as family income, Xi, increases.]
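To make the contrast concrete, here is a minimal simulation sketch in Python (not from the slides; the population line, the income range, and the error scales are illustrative assumptions). It draws homoscedastic and heteroscedastic errors and compares their spread in a low-income and a high-income band:

import numpy as np

# Illustrative sketch: homoscedastic vs heteroscedastic errors around
# an assumed population line Y = 2 + 0.5*X.
rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(50, 150, size=n)        # e.g. family income (assumed range)

u_homo = rng.normal(0.0, 5.0, size=n)   # var(u | X) = 25 for every X
u_hetero = rng.normal(0.0, 0.05 * X)    # standard deviation grows with income

for name, u in [("homoscedastic", u_homo), ("heteroscedastic", u_hetero)]:
    low = u[X < 80].std()               # spread among low-income observations
    high = u[X > 120].std()             # spread among high-income observations
    print(f"{name}: std(u | X < 80) = {low:.2f}, std(u | X > 120) = {high:.2f}")

Under homoscedasticity the two bands show roughly the same spread; under heteroscedasticity the high-income band is visibly noisier.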
– Assumption 5:
• No autocorrelation between the disturbances
• Given any two X values, Xi and Xj (i ≠ j), the correlation between any two ui and uj (i ≠ j) is zero.

cov(ui, uj) = E{[ui - E(ui | Xi)][uj - E(uj | Xj)]}
            = E(ui uj)    (uses Assumption 3)
            = 0    (by assumption)
• This is to say that the observations of the error term are independent of each other.
• We normally come across the serial correlation problem when modeling time series data.



[Figure: Residual scatter plots - (a) no systematic pattern; (b) pattern of positive correlation among the errors; (c) pattern of negative correlation among the errors.]
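As a rough numerical illustration (a sketch under the assumption that the errors follow an AR(1) process u_t = ρ·u_{t-1} + e_t with ρ = 0.8, which is not from the slides), the first-order sample autocorrelation is near zero for independent errors and near ρ for serially correlated ones:

import numpy as np

rng = np.random.default_rng(1)
T, rho = 500, 0.8

e = rng.normal(0.0, 1.0, size=T)        # independent errors
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + e[t]        # positively autocorrelated errors

def first_order_autocorr(x):
    # Sample correlation between x_t and x_{t-1}.
    x = x - x.mean()
    return (x[1:] * x[:-1]).sum() / (x ** 2).sum()

print("independent errors:", round(first_order_autocorr(e), 3))  # near 0
print("AR(1) errors      :", round(first_order_autocorr(u), 3))  # near rho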


• Assumption 6
– In the regression model, we assume that the explanatory variables and the error term have separate influences on the dependent variable.
– If the explanatory variables and the error term are correlated,
• OLS would mistakenly attribute the variation in Y caused by u to X instead;
• it is not possible to isolate their individual effects on the dependent variable.
Zero covariance between Xi and ui, or E(Xi ui) = 0:

cov(ui, Xi) = E{[ui - E(ui | Xi)][Xi - E(Xi)]}
            = E[ui (Xi - E(Xi))]          (uses Assumption 3)
            = E(ui Xi) - E(Xi) E(ui)      (since E(Xi) is nonstochastic)
            = E(ui Xi)                    (since E(ui) = 0)
            = 0                           (by assumption)
– Assumption 7:
• The number of observations (n) must be greater than
the number of parameters to be estimated (k) or n >
k
– Assumption 8:
• Variability in X values
• Technically var(X) must be a finite positive number
• Consider a univariate regression model. If all X values are identical, Xi = X̄, the denominator of the β̂1 formula, Σ(Xi - X̄)², will be zero, making it impossible to estimate β̂1 and therefore β̂0.

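A minimal numerical sketch of this point (the data are made up): the OLS slope denominator Σ(Xi - X̄)² vanishes when X shows no variability, so no slope can be computed.

import numpy as np

Y = np.array([3.0, 5.0, 4.0, 6.0])
X_varied = np.array([1.0, 2.0, 3.0, 4.0])
X_constant = np.array([2.0, 2.0, 2.0, 2.0])   # no variability in X

def ols_slope(X, Y):
    Sxx = ((X - X.mean()) ** 2).sum()          # denominator of the slope formula
    Sxy = ((X - X.mean()) * (Y - Y.mean())).sum()
    return Sxy / Sxx if Sxx > 0 else float("nan")

print(ols_slope(X_varied, Y))     # a finite slope estimate
print(ols_slope(X_constant, Y))   # nan: the slope cannot be estimated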


–Assumption 9:
• The regression model is correctly specified. There is no
specification error or bias in the model used for empirical
analysis
–Assumption 10:
• Perfect collinearity between two explanatory variables implies
that
• they are actually the same variable;
• one is a multiple of another; or
• a constant has been added to one of the variables.
• If more than two explanatory variables are linearly related, we have a multicollinearity problem.
• If perfect multicollinearity exists among the explanatory variables, the OLS estimation procedure will not be able to distinguish one variable from the others. As such, the regression coefficients are indeterminate.
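The indeterminacy can be seen in the normal equations. In the sketch below (simulated data; X2 = 2·X1 is an assumed example of perfect collinearity), the X'X matrix is rank-deficient and therefore cannot be inverted to give unique coefficients:

import numpy as np

rng = np.random.default_rng(2)
n = 50
x1 = rng.normal(size=n)
x2 = 2.0 * x1                               # perfect collinearity with x1
X = np.column_stack([np.ones(n), x1, x2])   # columns: intercept, X1, X2

XtX = X.T @ X
print("rank of X'X:", np.linalg.matrix_rank(XtX))   # 2, not 3
print("det of X'X :", np.linalg.det(XtX))           # ~0: not invertible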
One more assumption that is often used in
practice but is not required for least squares

• This assumption is not required for OLS estimation, but is added for the purpose of hypothesis testing, which is important in the verification of economic theory.
• In particular, the t-statistic and F-statistic are not truly
applicable unless the error term is normally distributed.
• The values of Y are normally distributed about their mean for each value of X: Y ~ N(β1 + β2X, σ²). Equivalently, u is normally distributed with mean 0 and var(u) = σ²: u ~ N(0, σ²).

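One way to probe this extra assumption in practice is a normality test on the OLS residuals. The sketch below uses simulated data and the Shapiro-Wilk test from scipy (one of several possible tests; the data-generating values are assumptions for illustration):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 200
X = rng.uniform(0, 10, size=n)
u = rng.normal(0.0, 2.0, size=n)            # errors drawn from N(0, 4)
Y = 1.0 + 2.0 * X + u

slope, intercept = np.polyfit(X, Y, 1)      # OLS fit of Y on X
resid = Y - (intercept + slope * X)

stat, pvalue = stats.shapiro(resid)         # H0: residuals are normal
print(f"Shapiro-Wilk p-value = {pvalue:.3f}")   # a large p-value: no evidence against normality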


– Chapter 4, pages 98-101, indicates the importance of having this additional assumption as part of statistical inference, especially when dealing with hypothesis testing.



Properties of OLS Estimators
• The desirable properties of estimators include:
• Linearity
• Unbiasedness
• Efficiency
• Consistency
• Gauss-Markov Theorem: Given the assumptions of the
classical linear regression model, the least squares estimators,
in the class of unbiased linear estimators, have minimum
variance, that is, they are best linear unbiased estimators
(BLUE).



Linearity
• A linear estimator refers to the estimator obtained from a
linear function of dependent variable and independent
variable(s).
Unbiasedness
• An estimator is unbiased if the mean of its sampling
distribution equals the true parameter.

• The mean of the sampling distribution of an unbiased estimator therefore equals the true parameter, b. Bias is defined as the difference between the expected value of the estimator and the true parameter; that is, bias = E(b̂) - b.



• Note that lack of bias does not mean that b̂ = b, but that in repeated random sampling we get, on average, the correct estimate (its average or expected value, E(β̂2), is equal to the true value, β2).
• EXAMPLE: Suppose the mean exam score of this EBQ2074 class of 150 students (the population) is 60 (the true parameter value).
• If we estimate the mean exam score by randomly selecting 50 students from the same class and find that the sample mean exam score (the estimator of the true parameter) is 60, then the sample mean score is an unbiased estimator of the true mean score of the population of 150 students.



Unbiased: the expected value of the estimator bₖ equals the true value of βₖ.
[Figure: Sampling distributions of b2 - if E(b2) < β2, the estimator is biased (underestimates); if E(b2) = β2, it is unbiased; if E(b2) > β2, it is biased (overestimates).]
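Because unbiasedness is a statement about repeated sampling, it is naturally illustrated by Monte Carlo simulation. In the sketch below (the population line Y = 1 + 2X and all sample settings are assumptions for illustration), the OLS slope estimate averaged over many samples is close to the true β = 2:

import numpy as np

rng = np.random.default_rng(4)
true_b0, true_b1 = 1.0, 2.0
n, reps = 50, 5000

slopes = np.empty(reps)
for r in range(reps):
    X = rng.uniform(0, 10, size=n)
    u = rng.normal(0.0, 3.0, size=n)
    Y = true_b0 + true_b1 * X + u
    slopes[r], _ = np.polyfit(X, Y, 1)   # OLS slope for this sample

# Individual estimates scatter around 2, but their average is ~2: unbiased.
print("mean of slope estimates:", slopes.mean())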
Efficiency
• The best unbiased or efficient estimator refers to the one with
the smallest variance among unbiased estimators.
• It is the unbiased estimator with the most compact or least
spread out distribution.
• This is very important because the researcher would be more
certain that the estimator is closer to the true population
parameter being estimated.
• Another way of saying this is that an efficient estimator has the smallest confidence interval and is more likely to be statistically significant than any other estimator.



Consistency
• An estimator is said to be consistent if it approaches the true value of the parameter as the sample size gets larger and larger (this is referred to as asymptotic unbiasedness).

Pr{|β̂ - β| < δ} → 1 as n goes to infinity, for small δ > 0

This implies that

var(β̂) → 0 and E(β̂) → β as n goes to infinity

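The same Monte Carlo setup illustrates consistency (again a sketch with an assumed data-generating process): as n grows, the variance of the slope estimates shrinks toward zero, so the estimates collapse onto the true value.

import numpy as np

rng = np.random.default_rng(5)

def slope_variance(n, reps=2000):
    # Variance of the OLS slope across repeated samples of size n.
    slopes = np.empty(reps)
    for r in range(reps):
        X = rng.uniform(0, 10, size=n)
        Y = 1.0 + 2.0 * X + rng.normal(0.0, 3.0, size=n)
        slopes[r], _ = np.polyfit(X, Y, 1)
    return slopes.var()

for n in (20, 100, 500):
    print(f"n = {n:4d}: var(slope estimate) = {slope_variance(n):.5f}")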


To sum up, an estimator is BLUE if
• It is linear. That is, a linear function of a random variable such as
the dependent variable in the regression model.
• It is unbiased. That is, its average or expected value, is equal to the
true value.
• It is efficient. It has a minimum variance in the class of all such
linear unbiased estimators. An unbiased estimator with the least
variance is known as an efficient estimator.
• It is consistent. As the sample size gets larger, the variance gets
smaller, and the estimate converges on the true value.

• It is normally distributed with mean β and variance σβ̂².


Goodness of Fit
• The Model: Yi = α̂ + β̂Xi + ûi

Σ(Yi - Ȳ)² = Σ(Ŷi - Ȳ)² + Σûi²    (summing over i = 1, …, n)

Total Sum of Squares (TSS) = Explained Sum of Squares (ESS) + Unexplained (Residual) Sum of Squares (RSS)



R² - Measure of “goodness of fit”

Define R² = ESS/TSS = 1 - RSS/TSS

R² = Σ(Ŷi - Ȳ)² / Σ(Yi - Ȳ)² = 1 - Σûi² / Σ(Yi - Ȳ)²,  with 0 ≤ R² ≤ 1
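The decomposition and both forms of R² can be verified numerically. A minimal sketch (with simulated data) that computes TSS, ESS, and RSS from an OLS fit:

import numpy as np

rng = np.random.default_rng(6)
n = 100
X = rng.uniform(0, 10, size=n)
Y = 1.0 + 2.0 * X + rng.normal(0.0, 3.0, size=n)

b1, b0 = np.polyfit(X, Y, 1)     # OLS slope and intercept
Y_hat = b0 + b1 * X
u_hat = Y - Y_hat                # residuals

TSS = ((Y - Y.mean()) ** 2).sum()
ESS = ((Y_hat - Y.mean()) ** 2).sum()
RSS = (u_hat ** 2).sum()

print("TSS = ESS + RSS?", np.isclose(TSS, ESS + RSS))
print("R^2 = ESS/TSS     =", ESS / TSS)
print("R^2 = 1 - RSS/TSS =", 1 - RSS / TSS)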
Goodness of Fit
• For example, if R² = 0.89, then 89% of the variation in Y is explained by the X's.
• The higher the R², the better the estimated regression model fits the sample data.
• As the number of independent variables increases (for example, from 2 to 4), R² will also increase.
• Thus, R² is not suitable for comparing two regression models with the same dependent variable but different numbers of independent variables, because it does not take the degrees of freedom into consideration.



The Coefficient of Determination
Observed Y versus estimated Y:

Yi = β̂0 + β̂1Xi + ε̂i

Explained portion of Yi: Ŷi = β̂0 + β̂1Xi

Unexplained portion of Yi: ε̂i = Yi - Ŷi = Yi - β̂0 - β̂1Xi
[Figure: Two sample regression functions - when R² = 0, the SRF explains none of the variation in Y; when R² = 1, the SRF passes through every sample point.]


• To mitigate this problem, we can use the adjusted R², R̄² (adjusted for degrees of freedom):

R̄² = 1 - (1 - R²)(n - 1)/(n - k)

• R² is always non-negative, but the adjusted R² can be negative. So when comparing two models with different numbers of independent variables, the adjusted R² is the better measure of goodness of fit.

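The formula is easy to compute directly. A minimal sketch (the R², n, and k values are made up) showing that the adjustment penalizes extra regressors and can indeed turn negative:

def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k estimated parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

print(adjusted_r2(0.89, n=50, k=3))   # a high R^2 survives the adjustment
print(adjusted_r2(0.02, n=20, k=5))   # a low R^2 turns negative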


Example

What is the value of R2 and adjusted R2 ?



R² is only a descriptive measure. R² and the adjusted R² do not measure the quality of the regression model. Focusing solely on maximizing R² or the adjusted R² is not a good idea.


End of Chapter 4

