
ECONOMICS 351* -- NOTE 4                                              M.G. Abbott

ECON 351* -- NOTE 4

Statistical Properties of the OLS Coefficient Estimators

1. Introduction

We derived in Note 2 the OLS (Ordinary Least Squares) estimators β̂j (j = 0, 1) of the regression coefficients βj (j = 0, 1) in the simple linear regression model given by the population regression equation, or PRE,

Yi = β0 + β1Xi + ui   (i = 1, …, N)   (1)

where ui is an iid random error term. The OLS sample regression equation
(SRE) corresponding to PRE (1) is

Yi = β̂0 + β̂1Xi + ûi   (i = 1, …, N)   (2)

where β̂0 and β̂1 are the OLS coefficient estimators given by the formulas

β̂1 = ∑i xi yi / ∑i xi²   (3)

β̂0 = Ȳ − β̂1X̄   (4)

where xi ≡ Xi − X̄, yi ≡ Yi − Ȳ, X̄ = ∑i Xi / N, and Ȳ = ∑i Yi / N.
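As a quick numerical check of formulas (3) and (4), the sketch below computes β̂1 and β̂0 directly and compares them with NumPy's built-in least-squares line fit. The data values are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x = X - X.mean()   # deviations x_i = X_i - Xbar
y = Y - Y.mean()   # deviations y_i = Y_i - Ybar

beta1_hat = np.sum(x * y) / np.sum(x**2)      # formula (3)
beta0_hat = Y.mean() - beta1_hat * X.mean()   # formula (4)

# Cross-check against NumPy's least-squares polynomial fit of degree 1.
slope, intercept = np.polyfit(X, Y, 1)
```

Both routes give the same fitted line, since `np.polyfit` with degree 1 solves the same least-squares problem.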

Why Use the OLS Coefficient Estimators?

The reason we use these OLS coefficient estimators is that, under assumptions A1-A8 of the classical linear regression model, they have several desirable statistical properties. This note examines these desirable statistical properties primarily in terms of the OLS slope coefficient estimator β̂1; the same properties apply to the intercept coefficient estimator β̂0.


2. Statistical Properties of the OLS Slope Coefficient Estimator

PROPERTY 1: Linearity of β̂1

The OLS coefficient estimator β̂1 can be written as a linear function of the
sample values of Y, the Yi (i = 1, ..., N).

Proof: Start with formula (3) for β̂1:

β̂1 = ∑i xi yi / ∑i xi²
   = ∑i xi(Yi − Ȳ) / ∑i xi²
   = ∑i xi Yi / ∑i xi² − Ȳ ∑i xi / ∑i xi²
   = ∑i xi Yi / ∑i xi²,   because ∑i xi = 0.

• Defining the observation weights ki = xi / ∑i xi² for i = 1, …, N, we can rewrite the last expression above for β̂1 as:

β̂1 = ∑i ki Yi,   where ki ≡ xi / ∑i xi²   (i = 1, ..., N)   … (P1)

• Note that formula (3) and the definition of the weights ki imply that β̂1 is also a linear function of the yi, such that

β̂1 = ∑i ki yi.

Result: The OLS slope coefficient estimator β̂1 is a linear function of the sample values Yi or yi (i = 1, …, N), where the coefficient of Yi or yi is ki.
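Property (P1) is easy to verify numerically: formula (3), the weighted sum ∑i ki Yi, and the weighted sum ∑i ki yi all produce the same slope estimate. The data below are hypothetical, for illustration only.

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical sample values
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
x = X - X.mean()
y = Y - Y.mean()

k = x / np.sum(x**2)   # the observation weights k_i

beta1_formula3 = np.sum(x * y) / np.sum(x**2)   # formula (3)
beta1_from_Y = np.sum(k * Y)                    # linear function of the Y_i  (P1)
beta1_from_y = np.sum(k * y)                    # linear function of the y_i
```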


Properties of the Weights ki

In order to establish the remaining properties of β̂1, it is necessary to know the arithmetic properties of the weights ki.

[K1] ∑i ki = 0; i.e., the weights ki sum to zero.

∑i ki = ∑i (xi / ∑i xi²) = (1 / ∑i xi²) ∑i xi = 0,   because ∑i xi = 0.

[K2] ∑i ki² = 1 / ∑i xi².

∑i ki² = ∑i (xi / ∑i xi²)² = ∑i xi² / (∑i xi²)² = 1 / ∑i xi².

[K3] ∑i ki xi = ∑i ki Xi.

∑i ki xi = ∑i ki(Xi − X̄)
        = ∑i ki Xi − X̄ ∑i ki
        = ∑i ki Xi,   since ∑i ki = 0 by [K1] above.

[K4] ∑i ki xi = 1.

∑i ki xi = ∑i (xi / ∑i xi²) xi = ∑i xi² / ∑i xi² = 1.

Implication: ∑i ki Xi = 1, by [K3] and [K4].
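Properties [K1]-[K4] hold for any sample with nonzero variation in the Xi, which a short numerical check confirms. The sample values below are arbitrary, chosen only for the sketch.

```python
import numpy as np

X = np.array([3.0, 7.0, 1.0, 5.0, 9.0, 2.0])   # arbitrary sample values
x = X - X.mean()
k = x / np.sum(x**2)   # weights k_i = x_i / sum of x_i^2

sum_k = np.sum(k)         # [K1]: equals 0
sum_k_sq = np.sum(k**2)   # [K2]: equals 1 / sum of x_i^2
sum_k_x = np.sum(k * x)   # [K4]: equals 1
sum_k_X = np.sum(k * X)   # [K3] with [K4]: also equals 1
```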


PROPERTY 2: Unbiasedness of β̂1 and β̂0

The OLS coefficient estimator β̂1 is unbiased, meaning that E(β̂1) = β1.

The OLS coefficient estimator β̂0 is unbiased, meaning that E(β̂0) = β0.

• Definition of unbiasedness: The coefficient estimator β̂1 is unbiased if and only if E(β̂1) = β1; i.e., its mean or expectation is equal to the true coefficient β1.

Proof of unbiasedness of β̂1: Start with the formula β̂1 = ∑i ki Yi.

1. Since assumption A1 states that the PRE is Yi = β0 + β1Xi + ui,

β̂1 = ∑i ki Yi
   = ∑i ki(β0 + β1Xi + ui)          since Yi = β0 + β1Xi + ui by A1
   = β0 ∑i ki + β1 ∑i ki Xi + ∑i ki ui
   = β1 + ∑i ki ui,                 since ∑i ki = 0 and ∑i ki Xi = 1.

2. Now take expectations of the above expression for β̂1, conditional on the sample values {Xi: i = 1, …, N} of the regressor X. Conditioning on the sample values of the regressor X means that the ki are treated as nonrandom, since the ki are functions only of the Xi.

E(β̂1) = E(β1) + E[∑i ki ui]
      = β1 + ∑i ki E(ui|Xi)   since β1 is a constant and the ki are nonrandom
      = β1 + ∑i ki·0          since E(ui|Xi) = 0 by assumption A2
      = β1.

Result: The OLS slope coefficient estimator β̂1 is an unbiased estimator of the slope coefficient β1: that is,

E(β̂1) = β1.   ... (P2)
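Unbiasedness can be illustrated by simulation: with the true β1 known, the average of β̂1 over many replicated samples (holding the Xi fixed) should be very close to β1. All the choices below (true parameter values, regressor design, number of replications) are arbitrary, made only for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma = 2.0, 0.5, 1.0   # true coefficients (arbitrary choices)
X = np.linspace(0.0, 10.0, 25)        # fixed regressor values
x = X - X.mean()

R = 20000                                       # number of replications
U = rng.normal(0.0, sigma, size=(R, X.size))    # iid errors with E(u|X) = 0
Y = beta0 + beta1 * X + U                       # one simulated sample per row

# beta1_hat = sum_i x_i Y_i / sum_i x_i^2  (sum_i x_i = 0 lets us use Y directly)
b1 = Y @ x / np.sum(x**2)
mean_b1 = b1.mean()   # should be close to the true beta1 = 0.5
```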


Proof of unbiasedness of β̂0: Start with the formula β̂0 = Ȳ − β̂1X̄.

1. Average the PRE Yi = β0 + β1Xi + ui across i:

∑i Yi = Nβ0 + β1 ∑i Xi + ∑i ui   (sum the PRE over the N observations)

∑i Yi / N = β0 + β1(∑i Xi / N) + ∑i ui / N   (divide by N)

Ȳ = β0 + β1X̄ + ū,   where Ȳ = ∑i Yi / N, X̄ = ∑i Xi / N, and ū = ∑i ui / N.

2. Substitute the above expression for Ȳ into the formula β̂0 = Ȳ − β̂1X̄:

β̂0 = Ȳ − β̂1X̄
   = β0 + β1X̄ + ū − β̂1X̄   since Ȳ = β0 + β1X̄ + ū
   = β0 + (β1 − β̂1)X̄ + ū.

3. Now take the expectation of β̂0 conditional on the sample values {Xi: i = 1, …, N} of the regressor X. Conditioning on the Xi means that X̄ is treated as nonrandom in taking expectations, since X̄ is a function only of the Xi.

E(β̂0) = E(β0) + E[(β1 − β̂1)X̄] + E(ū)
      = β0 + X̄ E(β1 − β̂1) + E(ū)   since β0 is a constant
      = β0 + X̄ E(β1 − β̂1)          since E(ū) = 0 by assumptions A2 and A5
      = β0 + X̄[E(β1) − E(β̂1)]
      = β0 + X̄(β1 − β1)             since E(β1) = β1 and E(β̂1) = β1
      = β0.

Result: The OLS intercept coefficient estimator β̂0 is an unbiased estimator of the intercept coefficient β0: that is,

E(β̂0) = β0.   ... (P2)


PROPERTY 3: Variance of β̂1

• Definition: The variance of the OLS slope coefficient estimator β̂1 is defined as

Var(β̂1) ≡ E{[β̂1 − E(β̂1)]²}.

• Derivation of Expression for Var(β̂1):

1. Since β̂1 is an unbiased estimator of β1, E(β̂1) = β1. The variance of β̂1 can therefore be written as

Var(β̂1) = E{[β̂1 − β1]²}.

2. From part (1) of the unbiasedness proofs above, the term [β̂1 − β1], which is called the sampling error of β̂1, is given by

[β̂1 − β1] = ∑i ki ui.

3. The square of the sampling error is therefore

[β̂1 − β1]² = (∑i ki ui)².

4. Since the square of a sum is equal to the sum of the squares plus twice the sum of the cross products,

[β̂1 − β1]² = (∑i ki ui)²
           = ∑i ki²ui² + 2 ∑∑i<s ki ks ui us.


For example, if the summation involved only three terms, the square of the sum would be

(∑i=1,2,3 ki ui)² = (k1u1 + k2u2 + k3u3)²
                 = k1²u1² + k2²u2² + k3²u3² + 2k1k2u1u2 + 2k1k3u1u3 + 2k2k3u2u3.

5. Now use assumptions A3 and A4 of the classical linear regression model (CLRM):

(A3) Var(ui|Xi) = E(ui²|Xi) = σ² > 0 for all i = 1, ..., N;

(A4) Cov(ui, us|Xi, Xs) = E(ui us|Xi, Xs) = 0 for all i ≠ s.

6. We take expectations conditional on the sample values of the regressor X:

E[(β̂1 − β1)²] = ∑i ki² E(ui²|Xi) + 2 ∑∑i<s ki ks E(ui us|Xi, Xs)
             = ∑i ki² E(ui²|Xi)   since E(ui us|Xi, Xs) = 0 for i ≠ s by (A4)
             = ∑i ki² σ²          since E(ui²|Xi) = σ² for all i by (A3)
             = σ² / ∑i xi²        since ∑i ki² = 1 / ∑i xi² by [K2].

Result: The variance of the OLS slope coefficient estimator β̂1 is

Var(β̂1) = σ² / ∑i xi² = σ² / ∑i (Xi − X̄)² = σ² / TSSX,   where TSSX = ∑i xi².   ... (P3)

The standard error of β̂1 is the square root of the variance: i.e.,

se(β̂1) = √Var(β̂1) = (σ² / ∑i xi²)^½ = σ / √(∑i xi²) = σ / √TSSX.
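Formula (P3) can also be checked by simulation: the variance of β̂1 across many replicated samples should match σ²/∑i xi². The parameter values and regressor design below are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, sigma = 2.0, 0.5, 1.5   # arbitrary true values
X = np.linspace(0.0, 10.0, 25)        # fixed regressor values
x = X - X.mean()

theory_var = sigma**2 / np.sum(x**2)   # Var(beta1_hat) from (P3)

R = 40000                                      # number of replications
U = rng.normal(0.0, sigma, size=(R, X.size))   # iid N(0, sigma^2) errors
Y = beta0 + beta1 * X + U
b1 = Y @ x / np.sum(x**2)   # beta1_hat in each replication
mc_var = b1.var()           # Monte Carlo variance of beta1_hat
```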


PROPERTY 4: Variance of β̂0 (given without proof)

Result: The variance of the OLS intercept coefficient estimator β̂0 is

Var(β̂0) = σ² ∑i Xi² / (N ∑i xi²) = σ² ∑i Xi² / (N ∑i (Xi − X̄)²).   ... (P4)

The standard error of β̂0 is the square root of the variance: i.e.,

se(β̂0) = √Var(β̂0) = (σ² ∑i Xi² / (N ∑i xi²))^½.

• Interpretation of the Coefficient Estimator Variances

Var(β̂0) and Var(β̂1) measure the statistical precision of the OLS coefficient estimators β̂0 and β̂1.

Var(β̂1) = σ² / ∑i xi²;   Var(β̂0) = σ² ∑i Xi² / (N ∑i xi²).

Determinants of Var(β̂0) and Var(β̂1)

Var(β̂0) and Var(β̂1) are smaller:

(1) the smaller is the error variance σ², i.e., the smaller the variance of the unobserved and unknown random influences on Yi;

(2) the larger is the sample variation of the Xi about their sample mean, i.e., the larger the values of xi² = (Xi − X̄)², i = 1, …, N;

(3) the larger is the size of the sample, i.e., the larger is N.
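All three determinants can be read directly off the formula Var(β̂1) = σ²/∑i xi². The helper below evaluates it for an evenly spaced regressor; the grid design and parameter values are arbitrary choices for the sketch.

```python
import numpy as np

def slope_var(sigma, N, lo=0.0, hi=10.0):
    """Var(beta1_hat) = sigma^2 / TSS_X for N evenly spaced X values on [lo, hi]."""
    X = np.linspace(lo, hi, N)
    x = X - X.mean()
    return sigma**2 / np.sum(x**2)

v_base = slope_var(sigma=1.0, N=25)
v_more_obs = slope_var(sigma=1.0, N=100)        # (3): larger N -> smaller variance
v_noisier = slope_var(sigma=2.0, N=25)          # (1): larger sigma^2 -> larger variance
v_spread = slope_var(sigma=1.0, N=25, hi=20.0)  # (2): more X variation -> smaller variance
```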


PROPERTY 5: Covariance of β̂0 and β̂1

• Definition: The covariance of the OLS coefficient estimators β̂0 and β̂1 is defined as

Cov(β̂0, β̂1) ≡ E{[β̂0 − E(β̂0)][β̂1 − E(β̂1)]}.

• Derivation of Expression for Cov(β̂0, β̂1):

1. Since β̂0 = Ȳ − β̂1X̄, the expectation of β̂0 can be written as

E(β̂0) = Ȳ − E(β̂1)X̄
      = Ȳ − β1X̄   since E(β̂1) = β1.

Therefore, the term β̂0 − E(β̂0) can be written as

β̂0 − E(β̂0) = [Ȳ − β̂1X̄] − [Ȳ − β1X̄]
           = Ȳ − β̂1X̄ − Ȳ + β1X̄
           = −β̂1X̄ + β1X̄
           = −X̄(β̂1 − β1).

2. Since E(β̂1) = β1, the term β̂1 − E(β̂1) takes the form

β̂1 − E(β̂1) = β̂1 − β1.

3. The product [β̂0 − E(β̂0)][β̂1 − E(β̂1)] thus takes the form

[β̂0 − E(β̂0)][β̂1 − E(β̂1)] = −X̄(β̂1 − β1)(β̂1 − β1)
                          = −X̄(β̂1 − β1)².


4. The expectation of the product [β̂0 − E(β̂0)][β̂1 − E(β̂1)] is therefore

E{[β̂0 − E(β̂0)][β̂1 − E(β̂1)]} = E[−X̄(β̂1 − β1)²]
                            = −X̄ E(β̂1 − β1)²     because X̄ is a constant
                            = −X̄ Var(β̂1)         because E(β̂1 − β1)² = Var(β̂1)
                            = −X̄ (σ² / ∑i xi²)    because Var(β̂1) = σ² / ∑i xi².

Result: The covariance of β̂0 and β̂1 is

Cov(β̂0, β̂1) = −X̄ (σ² / ∑i xi²).   ... (P5)

• Interpretation of the Covariance Cov(β̂0, β̂1)

Since both σ² and ∑i xi² are positive, the sign of Cov(β̂0, β̂1) is opposite to the sign of X̄.

(1) If X̄ > 0, then Cov(β̂0, β̂1) < 0: the sampling errors (β̂0 − β0) and (β̂1 − β1) are of opposite sign.

(2) If X̄ < 0, then Cov(β̂0, β̂1) > 0: the sampling errors (β̂0 − β0) and (β̂1 − β1) are of the same sign.
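A short simulation illustrates both the sign pattern and formula (P5). The two regressor designs (one with X̄ > 0, one with X̄ < 0) and the parameter values are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta1, sigma = 1.0, 0.8, 1.0   # arbitrary true values
R = 30000                             # number of replications

def mc_cov(X):
    """Monte Carlo Cov(beta0_hat, beta1_hat) and the (P5) value, for fixed X."""
    x = X - X.mean()
    U = rng.normal(0.0, sigma, size=(R, X.size))
    Y = beta0 + beta1 * X + U
    b1 = Y @ x / np.sum(x**2)            # OLS slope in each replication
    b0 = Y.mean(axis=1) - b1 * X.mean()  # OLS intercept in each replication
    theory = -X.mean() * sigma**2 / np.sum(x**2)   # formula (P5)
    return np.cov(b0, b1)[0, 1], theory

cov_pos_mean, th_pos = mc_cov(np.linspace(1.0, 9.0, 20))    # Xbar > 0
cov_neg_mean, th_neg = mc_cov(np.linspace(-9.0, -1.0, 20))  # Xbar < 0
```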


THE GAUSS-MARKOV THEOREM

• Importance of the Gauss-Markov Theorem:

1) The Gauss-Markov Theorem summarizes the statistical properties of the OLS coefficient estimators β̂j (j = 0, 1).

2) More specifically, it establishes that the OLS coefficient estimators β̂j (j = 0, 1) have several desirable statistical properties.

• Statement of the Gauss-Markov Theorem: Under assumptions A1-A8 of the CLRM, the OLS coefficient estimators β̂j (j = 0, 1) are the minimum variance estimators of the regression coefficients βj (j = 0, 1) in the class of all linear unbiased estimators of βj.

That is, under assumptions A1-A8, the OLS coefficient estimators β̂j are the BLUE of βj (j = 0, 1) in the class of all linear unbiased estimators, where

1) BLUE ≡ Best Linear Unbiased Estimator

2) “Best” means “minimum variance” or “smallest variance”.

So the Gauss-Markov Theorem says that the OLS coefficient estimators β̂j are the best of all linear unbiased estimators of βj, where “best” means “minimum variance”.


• Interpretation of the G-M Theorem:


1. Let β̃j be any other linear unbiased estimator of βj. Let β̂j be the OLS estimator of βj; it too is linear and unbiased.

2. Both estimators β̃j and β̂j are unbiased estimators of βj:

E(β̂j) = βj and E(β̃j) = βj.

3. But the OLS estimator β̂j has a variance no larger than that of β̃j:

Var(β̂j) ≤ Var(β̃j)  ⇒  β̂j is efficient relative to β̃j.

This means that the OLS estimator β̂j is at least as statistically precise as β̃j, any other linear unbiased estimator of βj.
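The efficiency claim can be illustrated by comparing OLS with one particular alternative linear unbiased estimator of the slope: the "endpoint" estimator (Y_N − Y_1)/(X_N − X_1), which is linear in the Yi and unbiased (its weights sum to zero and satisfy ∑i wi Xi = 1). This estimator is not from the notes; it is an arbitrary competitor chosen for the sketch, as are the parameter values below.

```python
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1, sigma = 2.0, 0.5, 1.0   # arbitrary true values
X = np.linspace(0.0, 10.0, 25)        # fixed regressor values
x = X - X.mean()
R = 30000                             # number of replications

U = rng.normal(0.0, sigma, size=(R, X.size))
Y = beta0 + beta1 * X + U

b1_ols = Y @ x / np.sum(x**2)                    # OLS slope estimator
b1_end = (Y[:, -1] - Y[:, 0]) / (X[-1] - X[0])   # endpoint estimator: linear, unbiased

# Both estimators have means close to the true beta1 = 0.5, but the OLS
# estimator has the smaller variance, as the Gauss-Markov theorem guarantees.
```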

