Undergraduate Econometrics

Chapter 13
Method of Moments Estimation
13.1 Introduction
The simple linear regression model is $y = \beta_1 + \beta_2 x + e$.
The usual assumptions we make are
1. $E(e) = 0$
2. $\operatorname{var}(e) = \sigma^2$
3. $\operatorname{cov}(e_i, e_j) = 0$
4. The variable x is not random, and it must take at least two different values.
5. (optional) $e \sim N(0, \sigma^2)$
Slide 13.1
Undergraduate Econometrics, 2nd Edition-Chapter 13
In this chapter we relax the assumption that the variable x is not random.

In an economist's non-experimental world, the values of x and y are usually revealed
at the same time, making x random in the same way that y is.
The purpose of this chapter is to discuss regression models in which x is random. We will
- discuss the conditions under which having a random x is not a problem, and how to
  test whether our data satisfy these conditions;
- present the simultaneous equations model, in which the randomness of x causes the
  least squares estimator to fail;
- provide estimators that have good properties even when x is random.
Slide 13.2
13.2 Linear Regression with Random x's

Let us modify the usual simple regression assumptions as follows:
A13.1
$y_t = \beta_1 + \beta_2 x_t + e_t$.
A13.2
The data pairs $(x_t, y_t)$, $t = 1, \ldots, T$, are obtained by random sampling.
A13.3
$E(e \mid x) = 0$. The expected value of the error term, conditional on any value of
x, is zero.
A13.4
In the sample, $x_t$ must take at least two different values.
A13.5
$\operatorname{var}(e \mid x) = \sigma^2$. The variance of the error term, conditional on any x, is a
constant $\sigma^2$.
A13.6
$e \mid x \sim N(0, \sigma^2)$. The distribution of the error term, conditional on x, is normal.
Slide 13.3
Interpretation of A13.3, $E(e \mid x) = 0$: this assumption implies that

(1) we have omitted no important variables,
(2) we have used the correct functional form, and
(3) there exist no factors that cause the error term to be correlated with x.
If $E(e \mid x) = 0$, then we can show that x and e are uncorrelated, so that
$\operatorname{cov}(x, e) = 0$.
Conversely, if x and e are correlated, then $\operatorname{cov}(x, e) \neq 0$ and we can show that
$E(e \mid x) \neq 0$.
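The first claim follows from the law of iterated expectations. A short sketch of the argument, writing $E_x$ for the expectation taken over the distribution of x (notation introduced here for illustration):

```latex
\begin{aligned}
E(e)   &= E_x\!\left[ E(e \mid x) \right] = E_x[0] = 0, \\
E(xe)  &= E_x\!\left[ x \, E(e \mid x) \right] = E_x[x \cdot 0] = 0, \\
\operatorname{cov}(x, e) &= E(xe) - E(x)\,E(e) = 0 - E(x) \cdot 0 = 0 .
\end{aligned}
```

The converse statement is the contrapositive of this result: if $\operatorname{cov}(x, e) \neq 0$, then $E(e \mid x) = 0$ cannot hold.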
Slide 13.4
13.2.1
The Finite (Small) Sample Properties of the Least Squares Estimator
The finite sample properties of the least squares estimator when x is random can be
summarized as follows:
1. Under assumptions A13.1-A13.4 the least squares estimator is unbiased.

2. Under assumptions A13.1-A13.5 the least squares estimator is the best linear
unbiased estimator of the regression parameters, and the usual estimator of 2 is
unbiased.
3. Under assumptions A13.1-A13.6 the distributions of the least squares estimators,
conditional upon x, are normal, and their variances are estimated in the usual
way. Consequently the usual interval estimation and hypothesis testing
procedures are valid.

Slide 13.5
13.2.2
The Asymptotic (Large) Sample Properties of the Least Squares Estimator When x is Not Random
What happens to the probability distribution of the least squares estimators if we have a
very large sample, or when the sample size $T \to \infty$?
First, the least squares estimators are unbiased.
Second, the variances of the least squares estimators, given in equation 4.2.10,
converge to zero as $T \to \infty$. As the sample size gets increasingly large, the probability
distributions of the least squares estimators collapse about the true parameters.
Estimators with this property are called consistent estimators, and consistency is a desirable
large sample property of the least squares estimators.
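A minimal sketch of this collapse, assuming the slope-variance formula $\operatorname{var}(b_2) = \sigma^2 / \sum (x_t - \bar{x})^2$ and a hypothetical fixed design of equally spaced x values on [0, 10] with $\sigma^2 = 4$ (the design, the value of $\sigma^2$, and the function name `var_b2` are illustrative choices, not taken from the text):

```python
import numpy as np

def var_b2(x, sigma2):
    """Variance of the least squares slope for fixed x: sigma^2 / sum((x - xbar)^2)."""
    x = np.asarray(x, dtype=float)
    return sigma2 / np.sum((x - x.mean()) ** 2)

sigma2 = 4.0  # assumed error variance (illustrative)
for T in (10, 100, 1000, 10000):
    x = np.linspace(0.0, 10.0, T)  # hypothetical fixed (non-random) design
    print(T, var_b2(x, sigma2))    # the variance shrinks as T grows
```

Because the sum of squared deviations of x grows with T in this design, the printed variance falls toward zero, which is what drives consistency here.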
Slide 13.6
13.2.3
The Asymptotic (Large) Sample Properties of the Least Squares Estimator When x is Random
For the purposes of a large sample analysis of the least squares estimator we can
somewhat weaken assumption A13.3. Let us replace it by
A13.3*
$E(e) = 0$ and $\operatorname{cov}(x, e) = 0$
This is a weaker assumption than A13.3 because $E(e \mid x) = 0 \Rightarrow \operatorname{cov}(e, x) = 0$, but
$\operatorname{cov}(e, x) = 0$ does not imply $E(e \mid x) = 0$.
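To see why zero covariance does not imply a zero conditional mean, consider an illustrative example that is not from the text: suppose $x \sim N(0, 1)$ and the error happens to be $e = x^2 - 1$.

```latex
\begin{aligned}
E(e) &= E(x^2) - 1 = 1 - 1 = 0, \\
\operatorname{cov}(x, e) &= E(xe) - E(x)\,E(e) = E(x^3) - E(x) = 0 - 0 = 0, \\
E(e \mid x) &= x^2 - 1 \neq 0 \quad \text{unless } x = \pm 1 .
\end{aligned}
```

Here A13.3* holds but A13.3 fails, because e depends on x in a nonlinear way that covariance cannot detect.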
Slide 13.7
What we can say is the following:
1. Under assumption A13.3* the least squares estimators are consistent.

2. Under assumptions A13.1, A13.2, A13.3*, A13.4 and A13.5, the least squares
estimators have approximate normal distributions in large samples, whether the errors
are normally distributed or not. Furthermore our usual interval estimators and test
statistics are valid, if the sample is large.
3. If assumption A13.3* is not true, then the least squares estimators are inconsistent.
They do not converge to the true parameter values even in very large samples.
Furthermore, none of our usual hypothesis testing or interval estimation procedures are
valid.
Slide 13.8
13.2.4
Measurement Errors in Regression Equations
In the errors-in-variables problem an explanatory variable is measured with error.

When we measure an explanatory variable with error, it is correlated with the error
term, and the least squares estimator is inconsistent.
Let us assume that a worker's wages are a function of the worker's ability. Let $y_t$ =
wages of the t'th worker and let $x_t^*$ = the ability of the t'th worker:
$$y_t = \beta_1 + \beta_2 x_t^* + e_t \qquad (13.2.1)$$
We have asterisked (*) the ability variable because it is difficult, if not impossible, to
observe.
Suppose that we attempt to measure ability using $x_t$ = a standardized test score. Such a
measure is sometimes called a proxy variable.
Slide 13.9
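$$x_t = x_t^* + u_t \qquad (13.2.2)$$
where $u_t$ is a random disturbance with mean 0 and variance $\sigma_u^2$.

Assume that $u_t$ is independent of $e_t$, and serially uncorrelated.
Substitute $x_t^* = x_t - u_t$ into equation 13.2.1 to obtain
$$\begin{aligned}
y_t &= \beta_1 + \beta_2 x_t^* + e_t \\
    &= \beta_1 + \beta_2 (x_t - u_t) + e_t \\
    &= \beta_1 + \beta_2 x_t + (e_t - \beta_2 u_t) \\
    &= \beta_1 + \beta_2 x_t + v_t
\end{aligned} \qquad (13.2.3)$$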
Slide 13.10
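In equation 13.2.3 the explanatory variable $x_t$ is random, because of the
measurement error assumed in equation 13.2.2.
In order to estimate equation 13.2.3 by least squares we must determine whether or not
$x_t$ is uncorrelated with the random disturbance $v_t = e_t - \beta_2 u_t$. Since $E(v_t) = 0$, and
taking $x_t^*$ to be uncorrelated with both $e_t$ and $u_t$,
$$\operatorname{cov}(x_t, v_t) = E(x_t v_t) = E\left[(x_t^* + u_t)(e_t - \beta_2 u_t)\right] = E(-\beta_2 u_t^2) = -\beta_2 \sigma_u^2 \qquad (13.2.5)$$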
The least squares estimator $b_2$ is an inconsistent estimator of $\beta_2$ because of the
correlation between the explanatory variable and the error term.
$b_2$ does not converge to $\beta_2$ in large samples.
Furthermore, in large or small samples, $b_2$ is not approximately normal with mean $\beta_2$
and variance $\operatorname{var}(b_2) = \sigma^2 / \sum (x_t - \bar{x})^2$.
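The errors-in-variables result can be checked with a small simulation. This is only a sketch under assumed values: $\beta_1 = 1$, $\beta_2 = 2$, true ability, equation error and measurement error all drawn as independent standard normals, and T = 200,000 (none of these numbers come from the text). With these choices the covariance between the proxy and the composite error should be close to $-\beta_2 \sigma_u^2 = -2$, and the least squares slope should settle near $\beta_2 + \operatorname{cov}(x, v)/\operatorname{var}(x) = 2 - 2/2 = 1$ rather than 2.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200_000
beta1, beta2 = 1.0, 2.0          # assumed true parameters (illustrative)
sigma_u = 1.0                    # assumed std. dev. of the measurement error

x_star = rng.normal(0.0, 1.0, T)     # unobserved ability
e = rng.normal(0.0, 1.0, T)          # equation error
u = rng.normal(0.0, sigma_u, T)      # measurement error
y = beta1 + beta2 * x_star + e       # equation 13.2.1
x = x_star + u                       # observed proxy, equation 13.2.2

v = y - beta1 - beta2 * x            # composite error e - beta2 * u
cov_xv = np.cov(x, v)[0, 1]          # sample analog of cov(x, v)
b2 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # least squares slope

print("cov(x, v):", cov_xv)
print("b2:", b2)
```

The slope is pulled toward zero because the proxy is negatively correlated with the composite error whenever $\beta_2 > 0$.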
Slide 13.11
13.2.5
The Inconsistency of the Least Squares Estimator when $\operatorname{cov}(x, e) \neq 0$
Our regression model is $y = \beta_1 + \beta_2 x + e$.

Under A13.3* $E(e) = 0$, so that $E(y) = \beta_1 + \beta_2 E(x)$.
Subtract this expectation from the original equation,
$$y - E(y) = \beta_2 \left[x - E(x)\right] + e$$
Multiply both sides by $x - E(x)$
$$\left[x - E(x)\right]\left[y - E(y)\right] = \beta_2 \left[x - E(x)\right]^2 + \left[x - E(x)\right] e$$
Slide 13.12
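Take expected values of both sides
$$E\left\{\left[x - E(x)\right]\left[y - E(y)\right]\right\} = \beta_2 E\left[x - E(x)\right]^2 + E\left\{\left[x - E(x)\right] e\right\},$$
or
$$\operatorname{cov}(x, y) = \beta_2 \operatorname{var}(x) + \operatorname{cov}(x, e)$$
Solve for $\beta_2$
$$\beta_2 = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} - \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)} \qquad (13.2.6)$$
If we can assume that $\operatorname{cov}(x, e) = 0$, then
$$\beta_2 = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} \qquad (13.2.7)$$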
Slide 13.13
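The least squares estimator, expressed in a slightly different way, is
$$b_2 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2} = \frac{\sum (x_t - \bar{x})(y_t - \bar{y}) / (T - 1)}{\sum (x_t - \bar{x})^2 / (T - 1)} = \frac{\widehat{\operatorname{cov}}(x, y)}{\widehat{\operatorname{var}}(x)} \qquad (13.2.8)$$
If $\operatorname{cov}(x, e) = 0$, then in large samples
$$b_2 = \frac{\widehat{\operatorname{cov}}(x, y)}{\widehat{\operatorname{var}}(x)} \to \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} = \beta_2$$
If x and e are correlated, then
$$\beta_2 = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} - \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)}$$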
Slide 13.14
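The least squares estimator now converges to
$$b_2 \to \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} = \beta_2 + \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)} \neq \beta_2$$
In this case $b_2$ is an inconsistent estimator of $\beta_2$.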
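The limit $\beta_2 + \operatorname{cov}(x, e)/\operatorname{var}(x)$ can be illustrated numerically. The sketch below assumes a made-up data generating process, $x = w + 0.5e$ with w and e independent standard normals, so that $\operatorname{cov}(x, e) = 0.5$ and $\operatorname{var}(x) = 1.25$; the parameter values and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200_000
beta1, beta2 = 1.0, 2.0          # assumed true parameters (illustrative)

w = rng.normal(0.0, 1.0, T)      # part of x unrelated to the error
e = rng.normal(0.0, 1.0, T)      # regression error
x = w + 0.5 * e                  # cov(x, e) = 0.5, var(x) = 1 + 0.25 = 1.25
y = beta1 + beta2 * x + e

b2 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # least squares slope
limit = beta2 + 0.5 / 1.25                    # beta2 + cov(x, e) / var(x)

print("b2:", b2)
print("predicted large-sample limit:", limit)
```

Even with a very large sample the slope stays near the predicted limit and away from the assumed $\beta_2 = 2$; more data does not remove the discrepancy.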
Slide 13.15
13.3 Estimators Based on the Method of Moments
13.3.1
Method of Moments Estimation of the Mean and Variance of a Random
Variable
The k'th moment of a random variable is the expected value of the random variable
raised to the k'th power.
$$E(Y^k) = \mu_k = \text{k'th moment of } Y \qquad (13.3.1)$$
Slide 13.16
The k'th population moment in equation 13.3.1 can be estimated consistently using the
sample (of size T) analog
$$\widehat{E(Y^k)} = \hat{\mu}_k = \text{k'th sample moment of } Y = \frac{1}{T} \sum_{t=1}^{T} y_t^k \qquad (13.3.2)$$
The method of moments estimation procedure equates m population moments to m
sample moments to estimate m unknown parameters.
Let Y be a random variable with mean $E(Y) = \mu$ and variance
$$\operatorname{var}(Y) = \sigma^2 = E(Y - \mu)^2 = E(Y^2) - \mu^2 \qquad (13.3.3)$$
Slide 13.17
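In order to estimate the two population parameters $\mu$ and $\sigma^2$ we must equate two
population moments to two sample moments.
The first two population and sample moments of Y are:
$$E(Y) = \mu_1 = \mu, \qquad \hat{\mu} = \frac{1}{T} \sum_{t=1}^{T} y_t$$
$$E(Y^2) = \mu_2, \qquad \hat{\mu}_2 = \frac{1}{T} \sum_{t=1}^{T} y_t^2 \qquad (13.3.4)$$
Solve for the unknown mean and variance parameters.

$$\hat{\mu} = \frac{1}{T} \sum_{t=1}^{T} y_t = \bar{y} \qquad (13.3.5)$$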
t =1
Slide 13.18
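Then, use equation 13.3.3, replacing the second population moment by its sample
value, and replacing the mean by its estimate from equation 13.3.5:
$$\tilde{\sigma}^2 = \hat{\mu}_2 - \hat{\mu}^2 = \frac{\sum_{t=1}^{T} y_t^2}{T} - \bar{y}^2 = \frac{\sum_{t=1}^{T} y_t^2 - T \bar{y}^2}{T} = \frac{\sum (y_t - \bar{y})^2}{T} \qquad (13.3.6)$$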
In general, method of moments estimators are consistent in large samples, but there is
no guarantee that they are "best" in any sense.
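As a small worked sketch of equations 13.3.4 to 13.3.6, the function below computes the first two sample moments and solves for the mean and variance. The function name `mom_mean_var` and the four data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def mom_mean_var(y):
    """Method of moments estimates of the mean and variance of Y."""
    y = np.asarray(y, dtype=float)
    T = y.size
    mu1_hat = np.sum(y) / T          # first sample moment (13.3.4)
    mu2_hat = np.sum(y ** 2) / T     # second sample moment (13.3.4)
    sigma2_tilde = mu2_hat - mu1_hat ** 2   # equation 13.3.6
    return mu1_hat, sigma2_tilde

y = np.array([2.0, 4.0, 4.0, 6.0])   # made-up sample
mu_hat, sigma2_tilde = mom_mean_var(y)
print(mu_hat, sigma2_tilde)

# Same variance via the deviations form, sum((y - ybar)^2) / T
print(np.sum((y - y.mean()) ** 2) / y.size)
```

Note that the divisor is T rather than T - 1, so this variance estimator is consistent but differs from the usual unbiased sample variance in small samples.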
Slide 13.19
13.3.2
Method of Moments Estimation in the Simple Linear Regression Model
The definition of a "moment" can be extended to more general situations.

$E[g(Y)]$ is a moment of $g(Y)$.
In the linear regression model $y_t = \beta_1 + \beta_2 x_t + e_t$ we usually assume that
$$E(e_t) = 0 \Rightarrow E(y_t - \beta_1 - \beta_2 x_t) = 0 \qquad (13.3.7)$$
If $x_t$ is fixed, or random but not correlated with $e_t$, then

$$E(x_t e_t) = 0 \Rightarrow E\left[x_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \qquad (13.3.8)$$
Equations 13.3.7 and 13.3.8 are moment conditions.

Slide 13.20
If we replace the two population moments by the corresponding sample moments, we
have two equations in two unknowns, which define the method of moments estimators
for $\beta_1$ and $\beta_2$.
$$\frac{1}{T} \sum (y_t - b_1 - b_2 x_t) = 0$$
$$\frac{1}{T} \sum x_t (y_t - b_1 - b_2 x_t) = 0 \qquad (13.3.9)$$
These two equations are identical to the least squares "normal" equations from Chapter
3, equation 3.3.6, and their solution yields the least squares estimators
$$b_2 = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sum (x_t - \bar{x})^2}$$
$$b_1 = \bar{y} - b_2 \bar{x}$$
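The equivalence can be checked directly. The sketch below writes the two sample moment conditions in 13.3.9 as a 2-by-2 linear system in $b_1$ and $b_2$, solves it, and compares the result with the closed-form least squares formulas. The function name `mom_regression` and the four data points are hypothetical.

```python
import numpy as np

def mom_regression(x, y):
    """Solve the sample moment conditions (13.3.9) for b1 and b2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    T = x.size
    # sum(y - b1 - b2*x) = 0      ->  T*b1      + sum(x)*b2   = sum(y)
    # sum(x*(y - b1 - b2*x)) = 0  ->  sum(x)*b1 + sum(x^2)*b2 = sum(x*y)
    A = np.array([[T, x.sum()], [x.sum(), np.sum(x ** 2)]])
    rhs = np.array([y.sum(), np.sum(x * y)])
    b1, b2 = np.linalg.solve(A, rhs)
    return b1, b2

x = np.array([1.0, 2.0, 3.0, 4.0])   # made-up data
y = np.array([2.0, 4.0, 5.0, 8.0])
b1, b2 = mom_regression(x, y)

# Closed-form least squares estimators for comparison
b2_ls = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1_ls = y.mean() - b2_ls * x.mean()
print(b1, b2, b1_ls, b2_ls)
```

Dividing both conditions by T, as in 13.3.9, would not change the solution, which is why the factor is dropped in the code.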
Slide 13.21
13.3.3
Method of Moments/Instrumental Variables Estimation in the Simple Linear
Regression Model
Problems for least squares arise when x is random and correlated with the random
disturbance e, so that $E(x_t e_t) \neq 0$.
Suppose, however, that there is another variable, $z_t$, called an instrumental variable,
which satisfies the moment condition
$$E(z_t e_t) = 0 \Rightarrow E\left[z_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \qquad (13.3.10)$$
Slide 13.22
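Then we can use the two equations 13.3.7 and 13.3.10 to obtain estimates of $\beta_1$ and $\beta_2$.
The sample moment conditions are:
$$\frac{1}{T} \sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0$$
$$\frac{1}{T} \sum z_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0 \qquad (13.3.11)$$
Solving these equations leads us to method of moments estimators, which are usually
called the instrumental variable estimators,
$$\hat{\beta}_2 = \frac{T \sum z_t y_t - \sum z_t \sum y_t}{T \sum z_t x_t - \sum z_t \sum x_t} = \frac{\sum (z_t - \bar{z})(y_t - \bar{y})}{\sum (z_t - \bar{z})(x_t - \bar{x})} \qquad (13.3.12)$$
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$$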
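A minimal sketch of the instrumental variable estimator in equation 13.3.12, applied to simulated data. The data generating process is an assumption made for illustration: z is a standard normal instrument, $x = z + 0.5e + \text{noise}$ so that x is correlated with e, and the true parameters are $\beta_1 = 1$, $\beta_2 = 2$. The function name `iv_simple` is hypothetical.

```python
import numpy as np

def iv_simple(y, x, z):
    """Instrumental variable estimates (beta1_hat, beta2_hat), equation 13.3.12."""
    y, x, z = (np.asarray(a, dtype=float) for a in (y, x, z))
    zd = z - z.mean()
    beta2 = np.sum(zd * (y - y.mean())) / np.sum(zd * (x - x.mean()))
    beta1 = y.mean() - beta2 * x.mean()
    return beta1, beta2

rng = np.random.default_rng(2)
T = 200_000
z = rng.normal(size=T)                   # instrument: independent of e
e = rng.normal(size=T)                   # regression error
x = z + 0.5 * e + rng.normal(size=T)     # x is correlated with both z and e
y = 1.0 + 2.0 * x + e                    # assumed beta1 = 1, beta2 = 2

b1_iv, b2_iv = iv_simple(y, x, z)
b2_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Both sample moment conditions in 13.3.11 hold at the IV estimates
resid = y - b1_iv - b2_iv * x
print("OLS slope:", b2_ols, " IV slope:", b2_iv)
print("moment conditions:", resid.mean(), np.mean(z * resid))
```

In this setup the least squares slope stays above 2 because $\operatorname{cov}(x, e) > 0$, while the instrumental variable slope is close to 2.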
Slide 13.23
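What are the properties of these new estimators?

First, note that
$$\hat{\beta}_2 = \frac{\sum (z_t - \bar{z})(y_t - \bar{y}) / (T - 1)}{\sum (z_t - \bar{z})(x_t - \bar{x}) / (T - 1)} = \frac{\widehat{\operatorname{cov}}(z, y)}{\widehat{\operatorname{cov}}(z, x)} \qquad (13.3.13)$$
The sample covariance converges to the true covariance in large samples, so we can
say
$$\hat{\beta}_2 \to \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)} \qquad (13.3.14)$$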
For an instrumental variable to be valid, it must be uncorrelated with the error term e
but correlated with the explanatory variable x.
Slide 13.24
Following the same steps that led to equation 13.2.6, we obtain
$$\beta_2 = \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)} - \frac{\operatorname{cov}(z, e)}{\operatorname{cov}(z, x)} \qquad (13.3.15)$$
If we can assume that $\operatorname{cov}(z, e) = 0$, then the instrumental variables estimator in
equation (13.3.14) converges in large samples to $\beta_2$,
$$\hat{\beta}_2 \to \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)} = \beta_2 \qquad (13.3.16)$$
Thus if $\operatorname{cov}(z, e) = 0$ and $\operatorname{cov}(z, x) \neq 0$, then the instrumental variable estimator of $\beta_2$
is consistent, in a situation in which the least squares estimator is not consistent due to
correlation between x and e.
Slide 13.25
In large samples the instrumental variable estimator has the approximate normal
distribution
$$\hat{\beta}_2 \sim N\left(\beta_2, \; \frac{\sigma^2}{\sum (x_t - \bar{x})^2 \, r_{zx}^2}\right)$$
where $r_{zx}^2$ is the squared sample correlation between the instrument z and the random regressor x.
We want an instrument that is highly correlated with x to improve the efficiency of the
instrumental variable estimator.
The error variance is estimated using the estimator
$$\hat{\sigma}_{IV}^2 = \frac{\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2}{T - 2} \qquad (13.3.18)$$
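The role of $r_{zx}^2$ can be seen in a sketch that computes the estimated standard error of $\hat{\beta}_2$ from the variance expression above, with $\sigma^2$ replaced by the estimator in 13.3.18. The simulated data, the "weak" instrument `z_weak` (the strong instrument plus a lot of noise), and the function name `iv_with_se` are all assumptions made for illustration.

```python
import numpy as np

def iv_with_se(y, x, z):
    """IV slope and its estimated large-sample standard error."""
    y, x, z = (np.asarray(a, dtype=float) for a in (y, x, z))
    T = y.size
    xd, yd, zd = x - x.mean(), y - y.mean(), z - z.mean()
    beta2 = np.sum(zd * yd) / np.sum(zd * xd)
    beta1 = y.mean() - beta2 * x.mean()
    # Error variance from the residuals of the original model (13.3.18)
    sigma2_iv = np.sum((y - beta1 - beta2 * x) ** 2) / (T - 2)
    r2_zx = np.corrcoef(z, x)[0, 1] ** 2      # squared sample correlation
    var_beta2 = sigma2_iv / (np.sum(xd ** 2) * r2_zx)
    return beta2, np.sqrt(var_beta2)

rng = np.random.default_rng(3)
T = 50_000
z = rng.normal(size=T)
e = rng.normal(size=T)
x = z + 0.5 * e + rng.normal(size=T)
y = 1.0 + 2.0 * x + e                     # assumed beta2 = 2
z_weak = z + 3.0 * rng.normal(size=T)     # same signal, much more noise

b2_strong, se_strong = iv_with_se(y, x, z)
b2_weak, se_weak = iv_with_se(y, x, z_weak)
print("strong instrument:", b2_strong, se_strong)
print("weak instrument:  ", b2_weak, se_weak)
```

Both instruments are valid here, but the weakly correlated one yields a noticeably larger standard error, which is the efficiency cost described above.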
Slide 13.26
13.3.4
Instrumental Variables Estimation When Surplus Instruments are Available
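Usually we have more instrumental variables at our disposal than are necessary.
For example, let w be a variable that is correlated with x but uncorrelated with e, so that
it can serve as an instrument alongside z. We then have three moment conditions:
$$E(e_t) = E(y_t - \beta_1 - \beta_2 x_t) = 0 \Rightarrow \frac{1}{T} \sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = m_1 = 0 \qquad (13.3.19a)$$
$$E(z_t e_t) = E\left[z_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \Rightarrow \frac{1}{T} \sum z_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = m_2 = 0 \qquad (13.3.19b)$$
$$E(w_t e_t) = E\left[w_t (y_t - \beta_1 - \beta_2 x_t)\right] = 0 \Rightarrow \frac{1}{T} \sum w_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = m_3 = 0 \qquad (13.3.19c)$$
With three equations and only two unknowns, there are generally no values of $\hat{\beta}_1$ and
$\hat{\beta}_2$ that satisfy all three conditions exactly.
Using the least squares principle, choose values for $\hat{\beta}_1$ and $\hat{\beta}_2$ that minimize the sum of
squares $m_1^2 + m_2^2 + m_3^2$.
It is best to use weighted least squares, with the weights being the reciprocals of the
variances of the moments, if the moments are uncorrelated.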
Slide 13.27
That is, we should minimize
$$S(\hat{\beta}_1, \hat{\beta}_2) = \frac{m_1^2}{\operatorname{var}(m_1)} + \frac{m_2^2}{\operatorname{var}(m_2)} + \frac{m_3^2}{\operatorname{var}(m_3)} \qquad (13.3.20)$$
The values of $\hat{\beta}_1$ and $\hat{\beta}_2$ that minimize this weighted sum of squares can be obtained
using a 2-step process.
(1) Regress x on a constant term, z and w, and obtain the predicted values $\hat{x}$.
(2) Use $\hat{x}$ as an instrumental variable for x.
Slide 13.28
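This leads to the two moment conditions
$$\frac{1}{T} \sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0$$
$$\frac{1}{T} \sum \hat{x}_t (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0$$
Solving these conditions, and using the fact that the sample mean of $\hat{x}_t$ equals $\bar{x}$, we have
$$\hat{\beta}_2 = \frac{\sum (\hat{x}_t - \bar{x})(y_t - \bar{y})}{\sum (\hat{x}_t - \bar{x})(x_t - \bar{x})} = \frac{\sum (\hat{x}_t - \bar{x})(y_t - \bar{y})}{\sum (\hat{x}_t - \bar{x})(\hat{x}_t - \bar{x})} \qquad (13.3.21)$$
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$$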
Slide 13.29
These instrumental variables estimators, derived using the method of moments, are
also called two-stage least squares estimators, because they can be obtained using
two least squares regressions.
Stage 1 is the regression of x on a constant term, z and w, to obtain the predicted
values $\hat{x}$.
Stage 2 is ordinary least squares estimation of the simple linear regression
$$y_t = \beta_1 + \beta_2 \hat{x}_t + \text{error}_t \qquad (13.3.22)$$
The approximate, large sample, variance of $\hat{\beta}_2$ is given by the usual formula for the
variance of the least squares estimator of equation (13.3.22),
$$\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum (\hat{x}_t - \bar{x})^2}$$
Slide 13.30
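The estimator of the error variance must be based on the residuals from the original
model, $y_t = \beta_1 + \beta_2 x_t + e_t$,
$$\hat{\sigma}_{IV}^2 = \frac{\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2}{T - 2}$$
Thus the appropriate estimator of the variance of $\hat{\beta}_2$ is
$$\widehat{\operatorname{var}}(\hat{\beta}_2) = \frac{\hat{\sigma}_{IV}^2}{\sum (\hat{x}_t - \bar{x})^2}$$
This estimated variance can be used as a basis for t-tests of significance and interval
estimation of parameters.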
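The two stages, together with the variance estimator based on the original-model residuals, can be sketched as follows. The simulated data (two instruments z and w, true $\beta_1 = 1$ and $\beta_2 = 2$) and the function names `ols` and `tsls_simple` are assumptions made for illustration, not part of the text.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients for design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def tsls_simple(y, x, instruments):
    """Two-stage least squares for y = beta1 + beta2*x + e."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    T = y.size
    # Stage 1: regress x on a constant and the instruments, keep fitted values
    Z = np.column_stack([np.ones(T)] + [np.asarray(z, dtype=float) for z in instruments])
    x_hat = Z @ ols(Z, x)
    # Stage 2: regress y on a constant and x_hat (equation 13.3.22)
    beta1, beta2 = ols(np.column_stack([np.ones(T), x_hat]), y)
    # Error variance from residuals of the ORIGINAL model (x, not x_hat)
    resid = y - beta1 - beta2 * x
    sigma2_iv = np.sum(resid ** 2) / (T - 2)
    se_beta2 = np.sqrt(sigma2_iv / np.sum((x_hat - x.mean()) ** 2))
    return beta1, beta2, se_beta2, x_hat

rng = np.random.default_rng(4)
T = 100_000
z = rng.normal(size=T)
w = rng.normal(size=T)
e = rng.normal(size=T)
x = z + w + 0.5 * e + rng.normal(size=T)   # x correlated with z, w and e
y = 1.0 + 2.0 * x + e

beta1_hat, beta2_hat, se_beta2, x_hat = tsls_simple(y, x, [z, w])
print(beta1_hat, beta2_hat, se_beta2)
```

A standard error taken directly from the stage 2 regression output would be based on residuals computed with $\hat{x}_t$ in place of $x_t$, which is why the residuals are recomputed from the original model before the variance is estimated.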
Slide 13.31
13.4 Testing for Correlation Between Explanatory Variables and the Error Term
The null hypothesis is $H_0: \operatorname{cov}(x, e) = 0$ against the alternative $H_1: \operatorname{cov}(x, e) \neq 0$.

The idea of the test is to compare the performance of the least squares estimator to an
instrumental variables estimator.
Under the null and alternative hypotheses we know the following:
- If the null hypothesis is true, both the least squares estimator and the instrumental
  variables estimator are consistent, so $q = b_{ols} - \hat{\beta}_{IV} \to 0$.
- If the null hypothesis is false, the least squares estimator is not consistent, and the
  instrumental variables estimator is consistent, so $q = b_{ols} - \hat{\beta}_{IV} \to c \neq 0$.
Slide 13.32
There are several forms of the test, usually called the "Hausman Test," for this null and
alternative hypothesis.
In the regression $y_t = \beta_1 + \beta_2 x_t + e_t$ we wish to know whether x is correlated with e.
Let $z_{t1}$ and $z_{t2}$ be instrumental variables for x.
At minimum you need one instrument for each variable you think might be correlated
with the error term. Then carry out the following steps:
1. Estimate the model $x_t = a_0 + a_1 z_{t1} + a_2 z_{t2} + v_t$ by least squares, and obtain the
residuals $\hat{v}_t = x_t - \hat{a}_0 - \hat{a}_1 z_{t1} - \hat{a}_2 z_{t2}$. If more than one explanatory variable
is questionable, repeat this estimation for each one, using all available
instrumental variables in each regression.
2. Include the residuals computed in step 1 as an explanatory variable in the
regression $y_t = \beta_1 + \beta_2 x_t + \delta \hat{v}_t + e_t$. Estimate this "artificial regression" by least
squares, and employ the usual t-test for the hypothesis of significance
Slide 13.33
$H_0: \delta = 0$ (no correlation between x and e)
$H_1: \delta \neq 0$ (correlation between x and e)
If more than one variable is suspect, the test will be an F-test of joint significance of the
coefficients on the included residuals.
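The regression-based form of the test can be sketched in a few lines. The code below runs the first-stage regression, adds its residuals to the original regression, and returns the t-statistic on their coefficient. The simulated data sets (one where x is correlated with e, one where it is not), the sample size, and the function names `ols_fit` and `hausman_t` are illustrative assumptions.

```python
import numpy as np

def ols_fit(X, y):
    """Least squares coefficients, conventional standard errors, residuals."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(s2 * np.diag(XtX_inv))
    return b, se, resid

def hausman_t(y, x, z1, z2):
    """t-statistic for the coefficient on v_hat in the artificial regression."""
    T = len(y)
    # Step 1: regress x on a constant and the instruments, keep residuals
    Z = np.column_stack([np.ones(T), z1, z2])
    _, _, v_hat = ols_fit(Z, x)
    # Step 2: add v_hat to the original regression and test its coefficient
    X_aug = np.column_stack([np.ones(T), x, v_hat])
    b, se, _ = ols_fit(X_aug, y)
    return b[2] / se[2]

rng = np.random.default_rng(5)
T = 5_000
z1, z2 = rng.normal(size=T), rng.normal(size=T)
e = rng.normal(size=T)

x_endog = z1 + z2 + 0.5 * e + rng.normal(size=T)   # correlated with e
x_exog = z1 + z2 + rng.normal(size=T)              # uncorrelated with e
t_endog = hausman_t(1.0 + 2.0 * x_endog + e, x_endog, z1, z2)
t_exog = hausman_t(1.0 + 2.0 * x_exog + e, x_exog, z1, z2)
print("t (x correlated with e):", t_endog)
print("t (x uncorrelated with e):", t_exog)
```

A large absolute t-value leads to rejecting $H_0$ and suggests that least squares is inconsistent; with several suspect variables the same augmented regression would be tested with a joint F-test instead.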
Slide 13.34
