Dougherty
Introduction to Econometrics,
5th edition
Chapter 2: Properties of the
Regression Coefficients and
Hypothesis Testing
The regression coefficients are special types of random variable. We will demonstrate this
using the simple regression model in which Y depends on X. The two equations show the
true model and the fitted regression.

$$Y = \beta_1 + \beta_2 X + u$$
$$\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X$$
RANDOM COMPONENTS, UNBIASEDNESS OF THE REGRESSION COEFFICIENTS
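$$
\begin{aligned}
\hat{\beta}_2 &= \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} \\
&= \frac{\sum (X_i - \bar{X})\left([\beta_1 + \beta_2 X_i + u_i] - [\beta_1 + \beta_2 \bar{X} + \bar{u}]\right)}{\sum (X_i - \bar{X})^2} \\
&= \frac{\sum (X_i - \bar{X})\left(\beta_2 [X_i - \bar{X}] + [u_i - \bar{u}]\right)}{\sum (X_i - \bar{X})^2}
\end{aligned}
$$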
We will investigate the properties of the ordinary least squares (OLS) estimator of the slope
coefficient, shown above.
Y has two components: a nonrandom component that depends on X and the parameters,
and the random component u. Since $\hat{\beta}_2$ depends on Y, it indirectly depends on u.
If the values of u in the sample had been different, we would have had different values of Y,
and hence a different value for $\hat{\beta}_2$. We can in theory decompose $\hat{\beta}_2$ into its
nonrandom and random components.
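
A small simulation can make this concrete. The sketch below is illustrative only: the parameter values, the X values, the normally distributed disturbances, and the helper name `ols_slope` are assumptions, not taken from the text. It holds X and the true model fixed, redraws u five times, and recomputes the slope estimate each time.

```python
import random


def ols_slope(x, y):
    """OLS slope: sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


# Hypothetical true model, chosen only for illustration: Y = 2 + 0.5 X + u
BETA1, BETA2 = 2.0, 0.5
X = [float(i) for i in range(1, 21)]  # fixed (nonstochastic) regressor values

rng = random.Random(0)
slopes = []
for _ in range(5):
    u = [rng.gauss(0.0, 1.0) for _ in X]  # a fresh set of disturbances
    y = [BETA1 + BETA2 * xi + ui for xi, ui in zip(X, u)]
    slopes.append(ols_slope(X, y))

# Five different estimates from the same X values and the same true model
print(slopes)
```

Only the disturbances change between runs, so any variation in the printed estimates comes entirely from u.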
The first step is to substitute for Y and its sample mean from the true model.
The $\beta_1$ terms in the second factor cancel. We rearrange the remaining terms.
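$$
\begin{aligned}
\hat{\beta}_2 &= \frac{\sum (X_i - \bar{X})\left(\beta_2 [X_i - \bar{X}] + [u_i - \bar{u}]\right)}{\sum (X_i - \bar{X})^2} \\
&= \frac{\beta_2 \sum (X_i - \bar{X})^2 + \sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2} \\
&= \beta_2 + \frac{\sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2}
\end{aligned}
$$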
Hence we decompose $\hat{\beta}_2$ into the true value $\beta_2$ and an error term that depends on the
values of X and u.
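
The decomposition can be checked numerically. In the sketch below, the parameter values, the X values, and the simulated disturbances are all assumptions made for illustration. Because the data are simulated, u is known, so the error term can be computed directly and compared with the slope computed from Y.

```python
import random

BETA1, BETA2 = 2.0, 0.5  # hypothetical true parameters
X = [1.0, 2.0, 4.0, 7.0, 11.0]

rng = random.Random(1)
u = [rng.gauss(0.0, 1.0) for _ in X]  # known only because we simulate
Y = [BETA1 + BETA2 * xi + ui for xi, ui in zip(X, u)]

n = len(X)
x_bar, y_bar, u_bar = sum(X) / n, sum(Y) / n, sum(u) / n
sxx = sum((xi - x_bar) ** 2 for xi in X)

# Slope computed the usual way, from X and Y
slope_from_y = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y)) / sxx

# Error term from the decomposition, computed from X and u
error_term = sum((xi - x_bar) * (ui - u_bar) for xi, ui in zip(X, u)) / sxx

# The two numbers agree up to floating-point rounding
print(slope_from_y, BETA2 + error_term)
```

In real data u is unobservable, so the error term cannot be computed. The decomposition is a theoretical device, not a calculation one can perform on a sample.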
The error term depends on the value of the disturbance term in every observation in the
sample, and thus it is a special type of random variable.
The error term is responsible for the variations of $\hat{\beta}_2$ around its fixed component $\beta_2$.
If we wish, we can express the decomposition more tidily.
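$$
\hat{\beta}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
= \beta_2 + \frac{\sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2}
$$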
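$$
\begin{aligned}
\sum (X_i - \bar{X})(u_i - \bar{u}) &= \sum (X_i - \bar{X}) u_i - \sum (X_i - \bar{X}) \bar{u} \\
&= \sum (X_i - \bar{X}) u_i - \bar{u} \sum (X_i - \bar{X}) \\
&= \sum (X_i - \bar{X}) u_i
\end{aligned}
$$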
The next step is to make a small simplification of the numerator of the error term. First, we
expand it as shown.
The mean value of u is a common factor of the second summation, so it can be taken
outside.
$$
\sum (X_i - \bar{X}) = \sum X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0,
\qquad \text{since } \bar{X} = \frac{\sum X_i}{n}
$$
The second term then vanishes because the sum of the deviations of X around its sample
mean is automatically zero.
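
A quick numerical check of this simplification, using arbitrary illustrative values for X and u (assumptions, not data from the text):

```python
X = [3.0, 5.0, 8.0, 12.0]
u = [0.4, -1.1, 0.3, 0.9]  # arbitrary illustrative disturbances

n = len(X)
x_bar = sum(X) / n
u_bar = sum(u) / n

# Sum of deviations of X around its sample mean
dev_sum = sum(xi - x_bar for xi in X)

# Numerator of the error term, before and after the simplification
full = sum((xi - x_bar) * (ui - u_bar) for xi, ui in zip(X, u))
simplified = sum((xi - x_bar) * ui for xi, ui in zip(X, u))

print(dev_sum)            # zero up to floating-point rounding
print(full, simplified)   # equal up to floating-point rounding
```

The two numerators differ only by the term in u-bar times `dev_sum`, which is why they coincide.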
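$$
\sum (X_i - \bar{X})(u_i - \bar{u}) = \sum (X_i - \bar{X}) u_i,
\qquad \Delta = \sum (X_j - \bar{X})^2
$$

$$
\begin{aligned}
\hat{\beta}_2 &= \beta_2 + \frac{\sum (X_i - \bar{X}) u_i}{\Delta}
= \beta_2 + \frac{1}{\Delta} \sum (X_i - \bar{X}) u_i \\
&= \beta_2 + \sum \frac{X_i - \bar{X}}{\Delta} u_i
= \beta_2 + \sum a_i u_i
\end{aligned}
$$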
Thus we can rewrite the decomposition as shown. For convenience, the denominator of the
error term has been denoted $\Delta$.
Another re-arrangement.
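$$
\hat{\beta}_2 = \beta_2 + \sum a_i u_i,
\qquad a_i = \frac{X_i - \bar{X}}{\Delta} = \frac{X_i - \bar{X}}{\sum (X_j - \bar{X})^2}
$$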
Thus we have shown that $\hat{\beta}_2$ is equal to the true value plus a weighted linear
combination of the values of the disturbance term in the sample, where the weights are
functions of the values of X in the observations in the sample.
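
The weights can be computed directly from X. The sketch below uses made-up X and u values and a hypothetical helper name `ols_weights`, none of which come from the text. It computes the weights, confirms that the weighted sum of the disturbances reproduces the error term, and shows that the weights sum to zero (a consequence of the deviations of X summing to zero).

```python
def ols_weights(x):
    """Weights a_i = (x_i - x_bar) / sum_j (x_j - x_bar)^2."""
    n = len(x)
    x_bar = sum(x) / n
    delta = sum((xj - x_bar) ** 2 for xj in x)
    return [(xi - x_bar) / delta for xi in x]


BETA1, BETA2 = 2.0, 0.5           # hypothetical true parameters
X = [1.0, 2.0, 3.0, 4.0, 5.0]
u = [1.0, -2.0, 0.5, 0.0, 3.0]    # arbitrary illustrative disturbances
Y = [BETA1 + BETA2 * xi + ui for xi, ui in zip(X, u)]

a = ols_weights(X)
weighted_sum = sum(ai * ui for ai, ui in zip(a, u))

# Slope computed the usual way, from X and Y
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
         / sum((xi - x_bar) ** 2 for xi in X))

print(a)                            # weights depend on X only
print(sum(a))                       # zero up to rounding
print(slope, BETA2 + weighted_sum)  # equal up to rounding
```

Observations whose X value lies far from the sample mean receive the largest weights in absolute value, so their disturbances have the greatest influence on the estimate.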
As you can see, every value of the disturbance term in the sample affects the sample value
of $\hat{\beta}_2$.
Before moving on, it may be helpful to clarify a mathematical technicality. In the summation
in the denominator of the expression for ai, the subscript has been changed to j. Why?
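$$
a_i = \frac{X_i - \bar{X}}{\Delta}
= \frac{X_i - \bar{X}}{(X_1 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}
= \frac{X_i - \bar{X}}{\sum_{j=1}^{n} (X_j - \bar{X})^2}
$$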
The denominator is the sum, from 1 to n, of the squared deviations of X from its sample
mean. This is made explicit in the version of the expression in the box at the top of the
slide.
Written this way, the meaning of the denominator is clear, but the form is clumsy.
Obviously, we should use $\Sigma$-notation to compress it.
For the $\Sigma$-notation, we need to choose an index symbol that changes as we go from the
first squared deviation to the last. We can use anything we like, EXCEPT i, because we are
already using i for a completely different purpose in the numerator.
We have used j here, but this was quite arbitrary. We could have used anything for the
summation index (except i), as long as the meaning is clear. We could have used a smiley ☻
instead (please don’t).
The error term depends on the value of the disturbance term in every observation in the
sample, and thus it is a special type of random variable.
We will show that the error term has expected value zero, and hence that the ordinary least
squares (OLS) estimator of the slope coefficient in a simple regression model is unbiased.
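$$
\hat{\beta}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
= \beta_2 + \sum a_i u_i,
\qquad a_i = \frac{X_i - \bar{X}}{\sum_{j=1}^{n} (X_j - \bar{X})^2}
$$

$$
\begin{aligned}
E(\hat{\beta}_2) &= E(\beta_2) + E\left(\sum a_i u_i\right) \\
&= \beta_2 + \sum E(a_i u_i) = \beta_2 + \sum a_i E(u_i) = \beta_2
\end{aligned}
$$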
The expected value of $\hat{\beta}_2$ is the sum of the expected value of $\beta_2$ and the expected
value of the weighted sum of the values of the disturbance term.
$$
E\left(\sum a_i u_i\right) = E(a_1 u_1 + \dots + a_n u_n)
= E(a_1 u_1) + \dots + E(a_n u_n) = \sum E(a_i u_i)
$$
Now for each i, $E(a_i u_i) = a_i E(u_i)$. This is a really important step and we can make it
only with Model A.
Under Model A, we are assuming that the values of X in the observations are nonstochastic.
It follows that each ai is nonstochastic, since it is just a combination of the values of X.
Thus it can be treated as a constant, allowing us to take it out of the expectation using the
second expected value rule (Review chapter).
Under Assumption A.3, E(ui) = 0 for all i, and so the estimator is unbiased. The proof of the
unbiasedness of the estimator of the intercept will be left as an exercise.
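
Unbiasedness is a statement about the average of the estimator over repeated samples, which a Monte Carlo experiment can illustrate. The sketch below is not a proof, and its settings are assumptions: hypothetical parameter values, X fixed across replications (as in Model A), and independent normal disturbances with mean zero.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope: sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


BETA1, BETA2, SIGMA = 2.0, 0.5, 1.0   # hypothetical values
X = [float(i) for i in range(1, 21)]  # nonstochastic: identical in every sample

rng = random.Random(42)
estimates = []
for _ in range(5000):
    # New disturbances each replication, with E(u_i) = 0
    y = [BETA1 + BETA2 * xi + rng.gauss(0.0, SIGMA) for xi in X]
    estimates.append(ols_slope(X, y))

mean_estimate = statistics.fmean(estimates)
print(mean_estimate)  # close to BETA2, though individual estimates vary
```

Individual estimates lie above or below the true slope, but their average over many replications settles close to it.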
OLS estimators of the parameters are not the only unbiased estimators. We will give an
example of another.
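$$
\begin{aligned}
\hat{\beta}_2 &= \frac{Y_n - Y_1}{X_n - X_1}
= \frac{[\beta_1 + \beta_2 X_n + u_n] - [\beta_1 + \beta_2 X_1 + u_1]}{X_n - X_1} \\
&= \frac{\beta_2 [X_n - X_1] + u_n - u_1}{X_n - X_1}
= \beta_2 + \frac{u_n - u_1}{X_n - X_1}
\end{aligned}
$$

[Figure: scatter diagram with Y on the vertical axis and X on the horizontal axis. The first
observation, $Y_1 = \beta_1 + \beta_2 X_1 + u_1$ at $X_1$, and the last observation,
$Y_n = \beta_1 + \beta_2 X_n + u_n$ at $X_n$, are marked.]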
Someone who had never heard of regression analysis, seeing a scatter diagram of a sample
of observations, might estimate the slope by joining the first and the last observations, and
dividing the increase in the height by the horizontal distance between them.
The estimator is thus (Yn–Y1) divided by (Xn–X1). We will investigate whether it is biased or
unbiased.
The $\beta_1$ terms cancel out and the rest of the expression simplifies as shown. Thus we have
decomposed this naïve estimator into two components, the true value and an error term.
This decomposition is parallel to that for the OLS estimator, but the error term is different.
$$
E(\hat{\beta}_2) = E(\beta_2) + E\left(\frac{u_n - u_1}{X_n - X_1}\right)
= \beta_2 + \frac{1}{X_n - X_1} E(u_n - u_1) = \beta_2
$$
The denominator of the error term can be taken outside because the values of X are
nonstochastic.
Given Assumption A.3, the expectations of un and u1 are zero. Therefore, despite being
naïve, this estimator is unbiased.
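
The same kind of Monte Carlo check can be applied to the naïve estimator. As before, this is a sketch under assumed settings (hypothetical parameter values, fixed X, independent normal disturbances with mean zero), and `naive_slope` is a name introduced here for illustration. It also confirms, replication by replication, that the estimate equals the true slope plus the error term derived above.

```python
import random
import statistics


def naive_slope(x, y):
    """Slope of the line joining the first and last observations."""
    return (y[-1] - y[0]) / (x[-1] - x[0])


BETA1, BETA2, SIGMA = 2.0, 0.5, 1.0   # hypothetical values
X = [float(i) for i in range(1, 21)]  # nonstochastic regressor values

rng = random.Random(7)
naive_estimates = []
max_gap = 0.0
for _ in range(5000):
    u = [rng.gauss(0.0, SIGMA) for _ in X]
    y = [BETA1 + BETA2 * xi + ui for xi, ui in zip(X, u)]
    estimate = naive_slope(X, y)
    # Decomposition: true value plus (u_n - u_1) / (X_n - X_1)
    decomposed = BETA2 + (u[-1] - u[0]) / (X[-1] - X[0])
    max_gap = max(max_gap, abs(estimate - decomposed))
    naive_estimates.append(estimate)

naive_mean = statistics.fmean(naive_estimates)
print(naive_mean)  # close to BETA2
print(max_gap)     # rounding error only
```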
It is intuitively easy to see that we would not prefer the naïve estimator to OLS. Unlike OLS,
which takes account of every observation, it employs only the first and the last and is
wasting most of the information in the sample.
The naïve estimator will be sensitive to the value of the disturbance term u in those two
observations, whereas the OLS estimator combines all the disturbance term values and
takes greater advantage of the possibility that to some extent they cancel each other out.
More rigorously, it can be shown that the population variance of the naïve estimator is
greater than that of the OLS estimator, and that the naïve estimator is therefore less
efficient.
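
The difference in efficiency can be seen in a simulation. The sketch below rests on assumptions that go beyond what this section states: the disturbances are taken to have a common variance σ² and to be uncorrelated across observations, and all numerical values are hypothetical. Under those assumptions the variance of the OLS slope estimator is σ² / Σ(X_i − X̄)², and the variance of the naïve estimator is 2σ² / (X_n − X_1)². The code compares both formulas with the variances observed across replications.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope: sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def naive_slope(x, y):
    """Slope of the line joining the first and last observations."""
    return (y[-1] - y[0]) / (x[-1] - x[0])


BETA1, BETA2, SIGMA = 2.0, 0.5, 1.0   # hypothetical values
X = [float(i) for i in range(1, 21)]

rng = random.Random(2016)
ols_estimates, naive_estimates = [], []
for _ in range(5000):
    y = [BETA1 + BETA2 * xi + rng.gauss(0.0, SIGMA) for xi in X]
    ols_estimates.append(ols_slope(X, y))      # same sample for both estimators
    naive_estimates.append(naive_slope(X, y))

var_ols = statistics.pvariance(ols_estimates)
var_naive = statistics.pvariance(naive_estimates)

# Theoretical variances under the stated assumptions
x_bar = sum(X) / len(X)
theory_ols = SIGMA ** 2 / sum((xi - x_bar) ** 2 for xi in X)
theory_naive = 2 * SIGMA ** 2 / (X[-1] - X[0]) ** 2

print(var_ols, theory_ols)
print(var_naive, theory_naive)
```

Both estimators are centred on the true slope, but the naïve estimates are spread much more widely around it.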
Copyright Christopher Dougherty 2016.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2016.04.18