
Chapter 13

Method of Moments Estimation

13.1 Introduction
The simple linear regression model is $y = \beta_1 + \beta_2 x + e$.
The usual assumptions we make are
1. $E(e) = 0$
2. $\operatorname{var}(e) = \sigma^2$
3. $\operatorname{cov}(e_i, e_j) = 0$ for $i \neq j$
4. The variable x is not random, and it must take at least two different values.
5. (optional) $e \sim N(0, \sigma^2)$
Slide 13.1
Undergraduate Econometrics, 2nd Edition-Chapter 13

In this chapter we relax the assumption that the variable x is not random.

In an economist's non-experimental world, the values of x and y are usually revealed
at the same time, making x random in the same way y is.

The purpose of this chapter is to discuss regression models in which x is random. We will
(1) discuss the conditions under which having a random x is not a problem, and how to
test whether our data satisfy these conditions,
(2) present the simultaneous equations model, in which the randomness of x causes the
least squares estimator to fail, and
(3) provide estimators that have good properties even when x is random.


13.2 Linear Regression with Random x's


Let us modify the usual simple regression assumptions as follows:

A13.1  $y_t = \beta_1 + \beta_2 x_t + e_t$.

A13.2  The data pairs $(x_t, y_t)$, $t = 1, \ldots, T$, are obtained by random sampling.

A13.3  $E(e \mid x) = 0$. The expected value of the error term, conditional on any value of x, is zero.

A13.4  In the sample, $x_t$ must take at least two different values.

A13.5  $\operatorname{var}(e \mid x) = \sigma^2$. The variance of the error term, conditional on any x, is a constant $\sigma^2$.

A13.6  $e \mid x \sim N(0, \sigma^2)$. The distribution of the error term, conditional on x, is normal.


The interpretation of A13.3, $E(e \mid x) = 0$: this assumption implies that
(1) we have omitted no important variables,
(2) we have used the correct functional form, and
(3) there exist no factors that cause the error term to be correlated with x.
If $E(e \mid x) = 0$, then we can show that x and e are also uncorrelated, so that
$\operatorname{cov}(x, e) = 0$.
Conversely, if x and e are correlated, then $\operatorname{cov}(x, e) \neq 0$ and we can show that
$E(e \mid x) \neq 0$.


13.2.1 The Finite (Small) Sample Properties of the Least Squares Estimator

The finite sample properties of the least squares estimator when x is random can be
summarized as follows:

1. Under assumptions A13.1-A13.4 the least squares estimator is unbiased.
2. Under assumptions A13.1-A13.5 the least squares estimator is the best linear
unbiased estimator of the regression parameters, and the usual estimator of $\sigma^2$ is
unbiased.
3. Under assumptions A13.1-A13.6 the distributions of the least squares estimators,
conditional upon x, are normal, and their variances are estimated in the usual
way. Consequently the usual interval estimation and hypothesis testing
procedures are valid.



13.2.2 The Asymptotic (Large) Sample Properties of the Least Squares Estimator When x is Not Random

What happens to the probability distribution of the least squares estimators if we have a
very large sample, or when the sample size $T \to \infty$?
First, the least squares estimators are unbiased.
Second, the variances of the least squares estimators, given in equation 4.2.10,
converge to zero as $T \to \infty$. As the sample size gets increasingly large the probability
distributions of the least squares estimators collapse about the true parameters.
Estimators with this property are called consistent estimators, and consistency is a nice
large sample property of the least squares estimators.
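The collapse of the sampling distribution can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration: the parameter values, the normal errors, and the fixed design $x_t = t \bmod 10$, which is held constant across repeated samples. The function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def slope_variance(T, reps=200, beta1=1.0, beta2=2.0, sigma=1.0, seed=0):
    """Monte Carlo mean and variance of b2 for a fixed (non-random) x."""
    rng = random.Random(seed)
    x = [float(t % 10) for t in range(T)]  # assumed design, fixed in repeated samples
    slopes = []
    for _ in range(reps):
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(ols_slope(x, y))
    mean = sum(slopes) / reps
    var = sum((b - mean) ** 2 for b in slopes) / (reps - 1)
    return mean, var


for T in (20, 200, 2000):
    mean_b2, var_b2 = slope_variance(T)
    print(T, round(mean_b2, 3), round(var_b2, 6))
```

Under these assumptions the average of the simulated slopes stays near the true value 2 for every T, while their variance shrinks roughly in proportion to 1/T.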


13.2.3 The Asymptotic (Large) Sample Properties of the Least Squares Estimator When x is Random

For the purposes of a large sample analysis of the least squares estimator we can
somewhat weaken assumption A13.3. Let us replace it by

A13.3*  $E(e) = 0$ and $\operatorname{cov}(x, e) = 0$

This is a weaker assumption than A13.3 because $E(e \mid x) = 0 \Rightarrow \operatorname{cov}(e, x) = 0$, but
$\operatorname{cov}(e, x) = 0$ does not imply $E(e \mid x) = 0$.
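A constructed example may help show why zero covariance is the weaker condition. The sketch below assumes $x \sim N(0, 1)$ and defines $e = x^2 - 1$, a choice made purely for illustration (it is not from the text). Then $E(e) = 0$ and $\operatorname{cov}(x, e) = E(x^3) = 0$, yet $E(e \mid x) = x^2 - 1$ clearly depends on x.

```python
import random


def sample_cov(a, b):
    """Sample covariance with divisor T - 1."""
    abar = sum(a) / len(a)
    bbar = sum(b) / len(b)
    return sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate(T=100_000, seed=1):
    """Return the sample cov(x, e) and the average of e for small and large |x|."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(T)]
    e = [xi * xi - 1.0 for xi in x]  # assumed construction: uncorrelated with x
    small = [ei for xi, ei in zip(x, e) if abs(xi) < 0.5]
    large = [ei for xi, ei in zip(x, e) if abs(xi) > 1.5]
    return sample_cov(x, e), sum(small) / len(small), sum(large) / len(large)


cov_xe, mean_e_small_x, mean_e_large_x = simulate()
print("sample cov(x, e):      ", round(cov_xe, 4))
print("mean of e when |x|<0.5:", round(mean_e_small_x, 3))
print("mean of e when |x|>1.5:", round(mean_e_large_x, 3))
```

The sample covariance is close to zero, but the average error is clearly negative for small |x| and clearly positive for large |x|, so A13.3* holds while A13.3 fails.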


What we can say is the following:

1. Under assumption A13.3* the least squares estimators are consistent.


2. Under assumptions A13.1, A13.2, A13.3*, A13.4 and A13.5, the least squares
estimators have approximate normal distributions in large samples, whether the errors
are normally distributed or not. Furthermore our usual interval estimators and test
statistics are valid, if the sample is large.
3. If assumption A13.3* is not true, then the least squares estimators are inconsistent.
They do not converge to the true parameter values even in very large samples.
Furthermore, none of our usual hypothesis testing or interval estimation procedures are
valid.


13.2.4 Measurement Errors in Regression Equations

In the errors-in-variables problem an explanatory variable is measured with error.

When we measure an explanatory variable with error, it is correlated with the error
term, and the least squares estimator is inconsistent.
Let us assume that a worker's wages are a function of the worker's ability. Let $y_t$ =
wages of the t'th worker and let $x_t^*$ = the ability of the t'th worker:
$$y_t = \beta_1 + \beta_2 x_t^* + e_t \tag{13.2.1}$$
We have asterisked (*) the ability variable because it is difficult, if not impossible, to
observe.
Suppose that we attempt to measure ability using $x_t$ = a standardized test score. Such a
measure is sometimes called a proxy variable.

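$$x_t = x_t^* + u_t \tag{13.2.2}$$
where $u_t$ is a random disturbance with mean 0 and variance $\sigma_u^2$.

Assume that $u_t$ is independent of $e_t$, and serially uncorrelated.
Substitute $x_t^* = x_t - u_t$ into equation 13.2.1 to obtain
$$
\begin{aligned}
y_t &= \beta_1 + \beta_2 x_t^* + e_t \\
    &= \beta_1 + \beta_2 (x_t - u_t) + e_t \\
    &= \beta_1 + \beta_2 x_t + (e_t - \beta_2 u_t) \\
    &= \beta_1 + \beta_2 x_t + v_t
\end{aligned}
\tag{13.2.3}
$$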


In equation 13.2.3 the explanatory variable $x_t$ is random, from the assumption of
measurement error in equation 13.2.2.
In order to estimate equation 13.2.3 by least squares we must determine whether or not
$x_t$ is uncorrelated with the random disturbance $v_t$.
$$
\operatorname{cov}(x_t, v_t) = E(x_t v_t) = E\left[(x_t^* + u_t)(e_t - \beta_2 u_t)\right]
= E\left(-\beta_2 u_t^2\right) = -\beta_2 \sigma_u^2
\tag{13.2.5}
$$
The least squares estimator $b_2$ is an inconsistent estimator of $\beta_2$ because of the
correlation between the explanatory variable and the error term.
$b_2$ does not converge to $\beta_2$ in large samples.
Furthermore, in large or small samples, $b_2$ is not approximately normal with mean $\beta_2$
and variance $\operatorname{var}(b_2) = \sigma^2 / \sum (x_t - \bar{x})^2$.
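The effect of measurement error can be seen in a short simulation. This is a sketch under assumed values: true ability $x_t^*$, the error $e_t$ and the measurement error $u_t$ are all drawn as independent standard normals, with $\beta_1 = 1$ and $\beta_2 = 2$. Combining the covariance $-\beta_2 \sigma_u^2$ from equation 13.2.5 with $\operatorname{var}(x) = \operatorname{var}(x^*) + \sigma_u^2$, the least squares slope should settle near $\beta_2 - \beta_2 \sigma_u^2 / (\operatorname{var}(x^*) + \sigma_u^2)$ rather than near $\beta_2$.

```python
import random


def ols_slope(x, y):
    """Least squares slope b2."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def errors_in_variables(T=100_000, beta1=1.0, beta2=2.0, sigma_u=1.0, seed=2):
    """Return (b2 from regressing y on the proxy x, its large-sample limit)."""
    rng = random.Random(seed)
    x_star = [rng.gauss(0.0, 1.0) for _ in range(T)]  # unobserved ability, var = 1 (assumed)
    e = [rng.gauss(0.0, 1.0) for _ in range(T)]
    u = [rng.gauss(0.0, sigma_u) for _ in range(T)]
    y = [beta1 + beta2 * xs + ei for xs, ei in zip(x_star, e)]  # equation 13.2.1
    x = [xs + ui for xs, ui in zip(x_star, u)]                  # equation 13.2.2
    limit = beta2 - beta2 * sigma_u ** 2 / (1.0 + sigma_u ** 2)
    return ols_slope(x, y), limit


b2, limit = errors_in_variables()
print("b2 using the proxy:", round(b2, 3))
print("large-sample limit:", round(limit, 3), "(true beta2 = 2.0)")
```

With these assumed variances the slope is pulled halfway toward zero, and a larger sample does not remove the gap.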


13.2.5 The Inconsistency of the Least Squares Estimator when $\operatorname{cov}(x, e) \neq 0$

Our regression model is $y = \beta_1 + \beta_2 x + e$.

Under A13.3* $E(e) = 0$, so that $E(y) = \beta_1 + \beta_2 E(x)$.
Subtract this expectation from the original equation,
$$y - E(y) = \beta_2 \left[x - E(x)\right] + e$$
Multiply both sides by $x - E(x)$
$$\left[x - E(x)\right]\left[y - E(y)\right] = \beta_2 \left[x - E(x)\right]^2 + \left[x - E(x)\right] e$$


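Take expected values of both sides
$$E\left\{[x - E(x)][y - E(y)]\right\} = \beta_2 E\left\{[x - E(x)]^2\right\} + E\left\{[x - E(x)]\, e\right\},$$
or
$$\operatorname{cov}(x, y) = \beta_2 \operatorname{var}(x) + \operatorname{cov}(x, e)$$
Solve for $\beta_2$
$$\beta_2 = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} - \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)} \tag{13.2.6}$$
If we can assume that $\operatorname{cov}(x, e) = 0$, then
$$\beta_2 = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)}. \tag{13.2.7}$$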


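The least squares estimator, expressed in a slightly different way, is
$$
b_2 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2}
= \frac{\sum (x_t - \bar{x})(y_t - \bar{y}) / (T - 1)}{\sum (x_t - \bar{x})^2 / (T - 1)}
= \frac{\widehat{\operatorname{cov}}(x, y)}{\widehat{\operatorname{var}}(x)}
\tag{13.2.8}
$$
If $\operatorname{cov}(x, e) = 0$ then
$$b_2 = \frac{\widehat{\operatorname{cov}}(x, y)}{\widehat{\operatorname{var}}(x)} \to \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} = \beta_2$$
If x and e are correlated, then
$$\beta_2 = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} - \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)}$$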

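The least squares estimator now converges to
$$b_2 \to \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} = \beta_2 + \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)} \neq \beta_2$$
In this case $b_2$ is an inconsistent estimator of $\beta_2$.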
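The size of the inconsistency can be checked numerically. The sketch below assumes a simple data generating process chosen only for illustration: $x_t = w_t + 0.8 e_t$ with $w_t$ and $e_t$ independent standard normals, so that $\operatorname{cov}(x, e) = 0.8$ and $\operatorname{var}(x) = 1.64$, and $\beta_1 = 1$, $\beta_2 = 2$. In a simulation the errors are known, so the sample version of the decomposition in equation 13.2.6 can be computed directly.

```python
import random


def cov(a, b):
    """Sample covariance with divisor T - 1; cov(a, a) is the sample variance."""
    abar = sum(a) / len(a)
    bbar = sum(b) / len(b)
    return sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b)) / (len(a) - 1)


def endogenous_sample(T=50_000, beta1=1.0, beta2=2.0, gamma=0.8, seed=3):
    """Draw (x, y, e) with x_t = w_t + gamma * e_t (assumed form of endogeneity)."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [rng.gauss(0.0, 1.0) + gamma * ei for ei in e]
    y = [beta1 + beta2 * xi + ei for xi, ei in zip(x, e)]
    return x, y, e


x, y, e = endogenous_sample()
b2 = cov(x, y) / cov(x, x)                  # least squares slope, as in 13.2.8
decomposed = 2.0 + cov(x, e) / cov(x, x)    # beta2 plus the sample bias term
limit = 2.0 + 0.8 / (1.0 + 0.8 ** 2)        # population value of the same expression
print(round(b2, 4), round(decomposed, 4), round(limit, 4))
```

The first two numbers agree up to rounding because the decomposition is an algebraic identity in the sample, and both sit near the population limit of about 2.49 instead of the true value 2.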


13.3 Estimators Based on the Method of Moments

13.3.1 Method of Moments Estimation of the Mean and Variance of a Random Variable
The k'th moment of a random variable is the expected value of the random variable
raised to the k'th power.
$$E(Y^k) = \mu_k = \text{k'th moment of } Y \tag{13.3.1}$$


The k'th population moment in equation 13.3.1 can be estimated consistently using the
sample (of size T) analog
$$\widehat{E(Y^k)} = \hat{\mu}_k = \text{k'th sample moment of } Y = \frac{1}{T}\sum_{t=1}^{T} y_t^k \tag{13.3.2}$$
The method of moments estimation procedure equates m population moments to m
sample moments to estimate m unknown parameters.
Let Y be a random variable with mean $E(Y) = \mu$ and variance
$$\operatorname{var}(Y) = \sigma^2 = E(Y - \mu)^2 = E(Y^2) - \mu^2 \tag{13.3.3}$$


In order to estimate the two population parameters $\mu$ and $\sigma^2$ we must equate two
population moments to two sample moments.
The first two population and sample moments of Y are:
$$
\begin{aligned}
E(Y) &= \mu_1 = \mu, & \hat{\mu} &= \sum_{t=1}^{T} y_t / T \\
E(Y^2) &= \mu_2, & \hat{\mu}_2 &= \sum_{t=1}^{T} y_t^2 / T
\end{aligned}
\tag{13.3.4}
$$
Solve for the unknown mean and variance parameters.
$$\hat{\mu} = \sum_{t=1}^{T} y_t / T = \bar{y} \tag{13.3.5}$$


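Then, use equation 13.3.3, replacing the second population moment by its sample
value, and replacing the mean by equation (13.3.5)
$$
\tilde{\sigma}^2 = \hat{\mu}_2 - \hat{\mu}^2
= \frac{\sum_{t=1}^{T} y_t^2}{T} - \bar{y}^2
= \frac{\sum_{t=1}^{T} y_t^2 - T\bar{y}^2}{T}
= \frac{\sum (y_t - \bar{y})^2}{T}
\tag{13.3.6}
$$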

In general, method of moments estimators are consistent in large samples, but there is
no guarantee that they are "best" in any sense.
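The two estimators in equations 13.3.5 and 13.3.6 take only a few lines to compute. The sketch below uses a small made-up data set (the values are an assumption chosen so the arithmetic is easy to follow) and a hypothetical function name.

```python
def method_of_moments(y):
    """Method of moments estimates of the mean and variance of Y."""
    T = len(y)
    m1 = sum(y) / T                    # first sample moment
    m2 = sum(v * v for v in y) / T     # second sample moment
    mu_hat = m1                        # equation 13.3.5
    sigma2_tilde = m2 - m1 ** 2        # equation 13.3.6
    return mu_hat, sigma2_tilde


y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative data only
mu_hat, sigma2_tilde = method_of_moments(y)
unbiased = sum((v - mu_hat) ** 2 for v in y) / (len(y) - 1)
print(mu_hat, sigma2_tilde, round(unbiased, 3))
```

For these data the method of moments variance estimate divides the sum of squared deviations (32) by T = 8, while the usual unbiased estimator divides by T - 1 = 7. The two differ in small samples and coincide as T grows, which is one sense in which method of moments estimators need not be "best".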


13.3.2 Method of Moments Estimation in the Simple Linear Regression Model

The definition of a "moment" can be extended to more general situations.

$E[g(Y)]$ is a moment of $g(Y)$.
In the linear regression model $y_t = \beta_1 + \beta_2 x_t + e_t$ we usually assume that
$$E(e_t) = 0 \;\Rightarrow\; E(y_t - \beta_1 - \beta_2 x_t) = 0 \tag{13.3.7}$$
If $x_t$ is fixed, or random but not correlated with $e_t$, then
$$E(x_t e_t) = 0 \;\Rightarrow\; E\left[x_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \tag{13.3.8}$$
Equations 13.3.7-8 are moment conditions.



If we replace the two population moments by the corresponding sample moments, we
have two equations in two unknowns, which define the method of moments estimators
for $\beta_1$ and $\beta_2$.
$$
\begin{aligned}
\frac{1}{T}\sum (y_t - b_1 - b_2 x_t) &= 0 \\
\frac{1}{T}\sum x_t (y_t - b_1 - b_2 x_t) &= 0
\end{aligned}
\tag{13.3.9}
$$
These two equations are identical to the least squares "normal" equations from Chapter
3, equation 3.3.6, and their solution yields the least squares estimators
$$b_2 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2}, \qquad b_1 = \bar{y} - b_2 \bar{x}$$
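That the sample moment conditions reproduce least squares can be verified directly. The sketch below solves the two conditions in 13.3.9 as a 2-by-2 linear system and compares the result with the deviation-from-means formulas. The five data points are made up for illustration and the function names are hypothetical.

```python
def moment_estimates(x, y):
    """Solve the sample moment conditions (13.3.9) for b1 and b2.

    Written out, the conditions are the linear system
        sum(y)   = T * b1      + b2 * sum(x)
        sum(x*y) = b1 * sum(x) + b2 * sum(x^2)
    solved here by Cramer's rule.
    """
    T = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    det = T * sxx - sx * sx
    b1 = (sy * sxx - sx * sxy) / det
    b2 = (T * sxy - sx * sy) / det
    return b1, b2


def least_squares(x, y):
    """Least squares estimates in deviation-from-means form."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    b2 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b2 * xbar, b2


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]   # illustrative data only
print(moment_estimates(x, y))
print(least_squares(x, y))
```

Both calls return the same pair of estimates up to floating point rounding, with a slope of about 1.97 and an intercept of about 1.09 for these data.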


13.3.3 Method of Moments/Instrumental Variables Estimation in the Simple Linear Regression Model

Problems for least squares arise when x is random and correlated with the random
disturbance e, so that $E(x_t e_t) \neq 0$.
Suppose, however, that there is another variable, $z_t$, called an instrumental variable,
which satisfies the moment condition
$$E(z_t e_t) = 0 \;\Rightarrow\; E\left[z_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \tag{13.3.10}$$


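Then we can use the two equations 13.3.7 and 13.3.10 to obtain estimates of $\beta_1$ and $\beta_2$.
The sample moment conditions are:
$$
\begin{aligned}
\frac{1}{T}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) &= 0 \\
\frac{1}{T}\sum z_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) &= 0
\end{aligned}
\tag{13.3.11}
$$
Solving these equations leads us to method of moments estimators, which are usually
called the instrumental variable estimators,
$$
\begin{aligned}
\hat{\beta}_2 &= \frac{T\sum z_t y_t - \sum z_t \sum y_t}{T\sum z_t x_t - \sum z_t \sum x_t}
= \frac{\sum (z_t - \bar{z})(y_t - \bar{y})}{\sum (z_t - \bar{z})(x_t - \bar{x})} \\
\hat{\beta}_1 &= \bar{y} - \hat{\beta}_2 \bar{x}
\end{aligned}
\tag{13.3.12}
$$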
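The instrumental variable formulas in 13.3.12 can be tried on simulated data. The sketch below is illustrative only: it assumes $x_t = z_t + 0.8 e_t$ with $z_t$ and $e_t$ independent standard normals, so that z is correlated with x but not with e, and it sets $\beta_1 = 1$, $\beta_2 = 2$. Function names are hypothetical.

```python
import random


def cross_dev(a, b):
    """Sum of cross-products of deviations from the means."""
    abar = sum(a) / len(a)
    bbar = sum(b) / len(b)
    return sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))


def iv_estimates(y, x, z):
    """Instrumental variable estimates of (beta1, beta2), equation 13.3.12."""
    b2 = cross_dev(z, y) / cross_dev(z, x)
    b1 = sum(y) / len(y) - b2 * (sum(x) / len(x))
    return b1, b2


def ols_estimates(y, x):
    """Least squares is the special case in which x is its own instrument."""
    return iv_estimates(y, x, x)


def simulate(T=50_000, seed=4):
    """Assumed data generating process with an endogenous x and a valid z."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(T)]
    e = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [zi + 0.8 * ei for zi, ei in zip(z, e)]
    y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]
    return y, x, z


y, x, z = simulate()
print("IV :", [round(b, 3) for b in iv_estimates(y, x, z)])
print("OLS:", [round(b, 3) for b in ols_estimates(y, x)])
```

Under these assumptions the IV slope is close to 2, while the least squares slope is close to $2 + 0.8/1.64 \approx 2.49$.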


What are the properties of these new estimators?


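First, note that
$$
\hat{\beta}_2 = \frac{\sum (z_t - \bar{z})(y_t - \bar{y}) / (T - 1)}{\sum (z_t - \bar{z})(x_t - \bar{x}) / (T - 1)}
= \frac{\widehat{\operatorname{cov}}(z, y)}{\widehat{\operatorname{cov}}(z, x)}.
\tag{13.3.13}
$$
The sample covariance converges to the true covariance in large samples, so we can
say
$$\hat{\beta}_2 \to \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)} \tag{13.3.14}$$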

For an instrumental variable to be valid, it must be uncorrelated with the error term e
but correlated with the explanatory variable x.

Following the same steps that led to equation 13.2.6, we obtain
$$\beta_2 = \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)} - \frac{\operatorname{cov}(z, e)}{\operatorname{cov}(z, x)} \tag{13.3.15}$$
If we can assume that $\operatorname{cov}(z, e) = 0$, then the instrumental variables estimator in
equation (13.3.14) converges in large samples to $\beta_2$,
$$\hat{\beta}_2 \to \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)} = \beta_2. \tag{13.3.16}$$
Thus if $\operatorname{cov}(z, e) = 0$ and $\operatorname{cov}(z, x) \neq 0$, then the instrumental variable estimator of $\beta_2$
is consistent, in a situation in which the least squares estimator is not consistent due to
correlation between x and e.

In large samples the instrumental variable estimator has the approximate normal
distribution,
$$\hat{\beta}_2 \sim N\left(\beta_2,\; \frac{\sigma^2}{\sum (x_t - \bar{x})^2 \, r_{zx}^2}\right)$$
where $r_{zx}^2$ is the squared sample correlation between the instrument z and the random regressor x.
We want an instrument that is highly correlated with x to improve the efficiency of the
instrumental variable estimator.
The error variance is estimated using the estimator
$$\hat{\sigma}^2_{IV} = \frac{\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2}{T - 2} \tag{13.3.18}$$
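As a rough sketch of how the estimated error variance and the approximate variance of the IV slope might be computed together, the following uses the same kind of simulated data as before. The data generating process ($x_t = z_t + 0.8 e_t$, standard normal $z_t$ and $e_t$, $\beta_1 = 1$, $\beta_2 = 2$, so $\sigma^2 = 1$) and the function names are assumptions made for illustration.

```python
import math
import random


def iv_with_se(y, x, z):
    """IV estimates, the error variance estimate (13.3.18) and the s.e. of the slope."""
    T = len(y)
    xbar, ybar, zbar = sum(x) / T, sum(y) / T, sum(z) / T
    szx = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))
    szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    szz = sum((zi - zbar) ** 2 for zi in z)
    b2 = szy / szx
    b1 = ybar - b2 * xbar
    sigma2_iv = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y)) / (T - 2)
    r2_zx = szx ** 2 / (szz * sxx)          # squared sample correlation of z and x
    var_b2 = sigma2_iv / (sxx * r2_zx)      # approximate large-sample variance
    return b1, b2, sigma2_iv, math.sqrt(var_b2)


def simulate(T=20_000, seed=5):
    """Assumed data generating process; not taken from the text."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(T)]
    e = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [zi + 0.8 * ei for zi, ei in zip(z, e)]
    y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]
    return y, x, z


b1, b2, sigma2_iv, se_b2 = iv_with_se(*simulate())
print(round(b2, 3), round(sigma2_iv, 3), round(se_b2, 4))
```

Because $r_{zx}^2$ sits in the denominator, a weak instrument (small $r_{zx}^2$) inflates the variance of the IV estimator, which is the reason for wanting an instrument strongly correlated with x.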


13.3.4 Instrumental Variables Estimation When Surplus Instruments are Available

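Usually we have more instrumental variables at our disposal than are necessary.
For example, let w be a variable that is correlated with x but uncorrelated with e. Together
with z, this gives three sample moment conditions for only two unknown parameters:
$$E(e_t) = E(y_t - \beta_1 - \beta_2 x_t) = 0 \;\Rightarrow\; \frac{1}{T}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = m_1 = 0 \tag{13.3.19a}$$
$$E(z_t e_t) = E\left[z_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \;\Rightarrow\; \frac{1}{T}\sum z_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = m_2 = 0 \tag{13.3.19b}$$
$$E(w_t e_t) = E\left[w_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \;\Rightarrow\; \frac{1}{T}\sum w_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = m_3 = 0 \tag{13.3.19c}$$
With three equations and two unknowns, in general no pair of values satisfies all three exactly.
Using the least squares principle, choose values for $\hat{\beta}_1$ and $\hat{\beta}_2$ that minimize the sum of
squares $m_1^2 + m_2^2 + m_3^2$.
It is best to use weighted least squares, with the weights being the reciprocals of the
variances of the moments, if the moments are uncorrelated.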

That is, we should minimize
$$S(\hat{\beta}_1, \hat{\beta}_2) = \frac{m_1^2}{\operatorname{var}(m_1)} + \frac{m_2^2}{\operatorname{var}(m_2)} + \frac{m_3^2}{\operatorname{var}(m_3)} \tag{13.3.20}$$
The values of $\hat{\beta}_1$ and $\hat{\beta}_2$ that minimize this weighted sum of squares can be obtained
using a 2-step process.

(1) Regress x on a constant term, z and w, and obtain the predicted values $\hat{x}$.
(2) Use $\hat{x}$ as an instrumental variable for x.


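This leads to the two moment conditions
$$
\begin{aligned}
\frac{1}{T}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) &= 0 \\
\frac{1}{T}\sum \hat{x}_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) &= 0
\end{aligned}
$$
Solving these conditions, and using the fact that $\bar{\hat{x}} = \bar{x}$, we have
$$
\begin{aligned}
\hat{\beta}_2 &= \frac{\sum (\hat{x}_t - \bar{\hat{x}})(y_t - \bar{y})}{\sum (\hat{x}_t - \bar{\hat{x}})(x_t - \bar{x})}
= \frac{\sum (\hat{x}_t - \bar{x})(y_t - \bar{y})}{\sum (\hat{x}_t - \bar{x})(x_t - \bar{x})} \\
\hat{\beta}_1 &= \bar{y} - \hat{\beta}_2 \bar{x}
\end{aligned}
\tag{13.3.21}
$$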


These instrumental variables estimators, derived using the method of moments, are
also called two-stage least squares estimators, because they can be obtained using
two least squares regressions.
Stage 1 is the regression of x on a constant term, z and w, to obtain the predicted
values $\hat{x}$.
Stage 2 is ordinary least squares estimation of the simple linear regression
$$y_t = \beta_1 + \beta_2 \hat{x}_t + \text{error}_t \tag{13.3.22}$$
The approximate, large sample, variance of $\hat{\beta}_2$ is given by the usual formula for the
variance of the least squares estimator of equation (13.3.22),
$$\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum (\hat{x}_t - \bar{x})^2}$$

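The estimator of the error variance must be based on the residuals from the original
model, $y_t = \beta_1 + \beta_2 x_t + e_t$,
$$\hat{\sigma}^2_{IV} = \frac{\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2}{T - 2}$$
Thus the appropriate estimator of the variance of $\hat{\beta}_2$ is
$$\widehat{\operatorname{var}}(\hat{\beta}_2) = \frac{\hat{\sigma}^2_{IV}}{\sum (\hat{x}_t - \bar{x})^2}$$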

This estimated variance can be used as a basis for t-tests of significance and interval
estimation of parameters.
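The two stages can be sketched in plain Python. The example below is a minimal illustration rather than a production routine: it assumes two instruments z and w, a data generating process $x_t = 0.6 z_t + 0.6 w_t + 0.8 e_t$ with independent standard normal $z_t$, $w_t$, $e_t$, and $\beta_1 = 1$, $\beta_2 = 2$. The helper names `solve`, `ols` and `two_stage_least_squares` are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve the linear system A c = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(columns, y):
    """Least squares coefficients from the normal equations X'X c = X'y."""
    k = len(columns)
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_least_squares(y, x, z, w):
    T = len(y)
    a = ols([[1.0] * T, z, w], x)                        # stage 1: x on constant, z, w
    x_hat = [a[0] + a[1] * zi + a[2] * wi for zi, wi in zip(z, w)]
    xbar, ybar = sum(x) / T, sum(y) / T                  # mean of x_hat equals xbar
    num = sum((xh - xbar) * (yi - ybar) for xh, yi in zip(x_hat, y))
    den = sum((xh - xbar) * (xi - xbar) for xh, xi in zip(x_hat, x))
    b2 = num / den                                       # equation 13.3.21
    b1 = ybar - b2 * xbar
    # The error variance uses residuals from the original model (x, not x_hat).
    sigma2_iv = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y)) / (T - 2)
    se_b2 = math.sqrt(sigma2_iv / sum((xh - xbar) ** 2 for xh in x_hat))
    return b1, b2, sigma2_iv, se_b2


def simulate(T=20_000, seed=7):
    """Assumed data generating process; not taken from the text."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(T)]
    w = [rng.gauss(0.0, 1.0) for _ in range(T)]
    e = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [0.6 * zi + 0.6 * wi + 0.8 * ei for zi, wi, ei in zip(z, w, e)]
    y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]
    return y, x, z, w


b1, b2, sigma2_iv, se_b2 = two_stage_least_squares(*simulate())
print(round(b1, 3), round(b2, 3), round(sigma2_iv, 3), round(se_b2, 4))
```

A t-statistic for a hypothesis about $\beta_2$ would then be formed as the estimate minus the hypothesized value, divided by `se_b2`. Running the second stage as an ordinary regression and taking its reported standard error would be incorrect, since that standard error would be based on residuals computed with $\hat{x}_t$ in place of $x_t$.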


13.4 Testing for Correlation Between Explanatory Variables and the Error Term

The null hypothesis is $H_0: \operatorname{cov}(x, e) = 0$ against the alternative $H_1: \operatorname{cov}(x, e) \neq 0$.

The idea of the test is to compare the performance of the least squares estimator to an
instrumental variables estimator.
Under the null and alternative hypotheses we know the following:
If the null hypothesis is true, both the least squares estimator and the instrumental
variables estimator are consistent, so that $q = b_{ols} - \hat{\beta}_{IV} \to 0$.
If the null hypothesis is false, the least squares estimator is not consistent, and the
instrumental variables estimator is consistent, so that $q = b_{ols} - \hat{\beta}_{IV} \to c \neq 0$.


There are several forms of the test, usually called the "Hausman Test," for this null and
alternative hypothesis.
In the regression $y_t = \beta_1 + \beta_2 x_t + e_t$ we wish to know whether x is correlated with e.
Let $z_{t1}$ and $z_{t2}$ be instrumental variables for x.
At minimum you need one instrument for each variable you think might be correlated
with the error term. Then carry out the following steps:
1. Estimate the model $x_t = a_0 + a_1 z_{t1} + a_2 z_{t2} + v_t$ by least squares, and obtain the
residuals $\hat{v}_t = x_t - (\hat{a}_0 + \hat{a}_1 z_{t1} + \hat{a}_2 z_{t2})$. If more than one explanatory variable
is questionable, repeat this estimation for each one, using all available
instrumental variables in each regression.
2. Include the residuals computed in step 1 as an explanatory variable in the
regression $y_t = \beta_1 + \beta_2 x_t + \delta \hat{v}_t + e_t$. Estimate this "artificial regression" by least
squares, and employ the usual t-test for the hypothesis of significance

$H_0: \delta = 0$ (no correlation between x and e)
$H_1: \delta \neq 0$ (correlation between x and e)

If more than one variable is suspect, the test will be an F-test of joint significance of the
coefficients on the included residuals.
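The regression-based form of the test can be sketched as follows for the case of one suspect regressor and two instruments. This is an illustration under assumed conditions, not a definitive implementation: the simulated model is $x_t = 0.6 z_{t1} + 0.6 z_{t2} + \rho e_t + \eta_t$ with independent standard normal draws, $\rho$ controls whether x is correlated with e, and the names `solve`, `ols`, `hausman_t` and `simulate` are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve the linear system A c = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(columns, y):
    """Return least squares coefficients and the X'X matrix."""
    k = len(columns)
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty), xtx


def hausman_t(y, x, z1, z2):
    """t-statistic on the first-stage residuals in the artificial regression."""
    T = len(y)
    const = [1.0] * T
    a, _ = ols([const, z1, z2], x)                                   # step 1
    v_hat = [xi - (a[0] + a[1] * p + a[2] * q) for xi, p, q in zip(x, z1, z2)]
    coef, xtx = ols([const, x, v_hat], y)                            # step 2
    resid = [yi - coef[0] - coef[1] * xi - coef[2] * vi
             for yi, xi, vi in zip(y, x, v_hat)]
    sigma2 = sum(r * r for r in resid) / (T - 3)
    inv_dd = solve(xtx, [0.0, 0.0, 1.0])[2]   # element of (X'X)^{-1} for delta
    return coef[2] / math.sqrt(sigma2 * inv_dd)


def simulate(rho, T=2000, seed=6):
    """Assumed data generating process; rho = 0 makes x uncorrelated with e."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0.0, 1.0) for _ in range(T)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(T)]
    e = [rng.gauss(0.0, 1.0) for _ in range(T)]
    eta = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [0.6 * p + 0.6 * q + rho * ei + ni for p, q, ei, ni in zip(z1, z2, e, eta)]
    y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]
    return y, x, z1, z2


print("t when x and e are correlated:  ", round(hausman_t(*simulate(0.8)), 2))
print("t when x and e are uncorrelated:", round(hausman_t(*simulate(0.0)), 2))
```

In the correlated case the t-statistic should be far outside the usual critical values, leading to rejection of $H_0: \delta = 0$. In the uncorrelated case it behaves like an ordinary t-statistic under the null.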
