Violation of OLS Assumption 2

4
Violations of the CLRM assumptions
• Measurement error, omitted variables and simultaneity
5
What is measurement error?
• The measurement error, e0, is the difference between the observed value and the actual value:
$e_0 = y - y^*$
where y is the observed value, and y* is the actual value.
• Example:
y*: annual family savings
y: reported annual savings
6
Consequences of measurement error for OLS estimation
• Consequences are different depending on whether measurement error is in…
1. the dependent variable
2. an explanatory variable(s)
7
Consequences of measurement error in the dependent variable
• Consider the model:
$y^* = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$ (1)
where y* is the actual value and (1) satisfies all the CLRM assumptions.
• Measurement error is:
$e_0 = y - y^*$ (measurement error = reported value − actual value)
where $e_0 \sim N(0, \sigma_{e_0}^2)$.
• Rewriting equation (1) in terms of y yields:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u + e_0$ (2)
where we assume $E(e_0 \mid x_1, \dots, x_k) = 0$ and $Cov(e_0, u) = 0$.
▫ We used $y^* = y - e_0$ to obtain equation (2).
8
Measurement error in the dependent variable – cont.
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \underbrace{u + e_0}_{v}$ (2)
• The OLS estimators from equation (2) are unbiased and consistent.
▫ Because equation (2) satisfies the first four CLRM assumptions, including the zero conditional mean assumption:
$E(v \mid x_1, x_2, \dots, x_k) = 0$
• The test statistics, such as the t, F and LM statistics, that we use to test hypotheses are valid.
• The variance of the OLS estimator (written here for the simple regression case) is:
$Var(\hat{\beta}_1) = \dfrac{\sigma_u^2 + \sigma_{e_0}^2}{\sum (x_i - \bar{x})^2} > \dfrac{\sigma_u^2}{\sum (x_i - \bar{x})^2}$
• Consequence: Larger variances of the OLS estimators.
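The two claims on this slide (no bias, but larger variance) can be checked with a small Monte Carlo experiment. The sketch below is only an illustration under assumed values that are not in the lecture: a simple regression with β0 = 1, β1 = 2, standard normal x and u, and a measurement error e0 with standard deviation 2. The helpers `ols_slope` and `simulate_slopes` are hypothetical names introduced here.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(sigma_e0, reps=500, n=100, beta0=1.0, beta1=2.0, seed=0):
    """Monte Carlo draws of the OLS slope when y is observed with error e0."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y_star = [beta0 + beta1 * xi + rng.gauss(0, 1) for xi in x]  # equation (1)
        y = [ys + rng.gauss(0, sigma_e0) for ys in y_star]           # y = y* + e0
        slopes.append(ols_slope(x, y))
    return slopes


no_error = simulate_slopes(sigma_e0=0.0)    # y measured exactly
with_error = simulate_slopes(sigma_e0=2.0)  # y measured with error (assumed sd = 2)

print("mean slope, no error:  ", statistics.mean(no_error))
print("mean slope, with error:", statistics.mean(with_error))
print("variance,   no error:  ", statistics.variance(no_error))
print("variance,   with error:", statistics.variance(with_error))
```

Both averages should sit close to the assumed β1 = 2, while the spread of the estimates is visibly larger in the run with measurement error.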

9
Measurement error in an explanatory variable
• Consider the simple model:
$y = \beta_0 + \beta_1 x^* + u$ (3)
where x* is the actual value and (3) satisfies the first four CLRM assumptions.
• Measurement error is:
$e_0 = x - x^*$ (measurement error = reported value − actual value)
where $e_0 \sim N(0, \sigma_{e_0}^2)$.
• Rewriting equation (3) using x yields:
$y = \beta_0 + \beta_1 x + (u - \beta_1 e_0)$ (4)
where $Cov(e_0, u) = 0$, $Cov(x, u) = 0$ and $Cov(x, e_0) \neq 0$.
▫ We used $x^* = x - e_0$ to obtain equation (4).
10

Measurement error in an explanatory variable – cont.
$y = \beta_0 + \beta_1 x + \underbrace{(u - \beta_1 e_0)}_{\eta}$ (4)
• The OLS estimators from equation (4) are biased and inconsistent because x and η are correlated:
$Cov(x, \eta) \neq 0$
• Let’s see why. Because x and e0 are correlated,
$Cov(x, e_0) \neq 0$
• then x and η in equation (4) will also be correlated (provided β1 ≠ 0):
$Cov(x, \eta) = Cov(x, u - \beta_1 e_0) = -\beta_1 Cov(x, e_0) \neq 0$
▫ implying that the zero conditional mean assumption $E(\eta \mid x) = 0$ is violated.

11

Measurement error in an explanatory variable – cont.
• Let’s see the amount of inconsistency in OLS when Cov(x, e0) ≠ 0.
[Diagram: effect of measurement error when β1 is positive]
• It can be shown that the probability limit of the OLS estimator $\hat{\beta}_1$ is:
$\operatorname{plim}(\hat{\beta}_1) = \beta_1 + \dfrac{Cov(x, u - \beta_1 e_0)}{Var(x)} = \beta_1 \left( \dfrac{\sigma_{x^*}^2}{\sigma_{x^*}^2 + \sigma_{e_0}^2} \right)$
where we used $Var(x) = Var(x^*) + Var(e_0)$.
• Notice that the term multiplying β1 is always less than 1…

• Causing the attenuation bias.
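The attenuation factor can be illustrated numerically. The following is a minimal sketch under assumed values that are not part of the lecture: β0 = 1, β1 = 2, Var(x*) = Var(e0) = 1, and e0 drawn independently of x* and u. With these choices the factor σ²(x*) / (σ²(x*) + σ²(e0)) equals 1/2, so the slope on the mismeasured x should settle near 1 rather than 2. `ols_slope` is a hypothetical helper.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(1)
n = 50_000
beta0, beta1 = 1.0, 2.0            # assumed true parameters
sigma_xstar, sigma_e0 = 1.0, 1.0   # assumed standard deviations of x* and e0

x_star = [rng.gauss(0, sigma_xstar) for _ in range(n)]
y = [beta0 + beta1 * xs + rng.gauss(0, 1) for xs in x_star]  # equation (3)
x = [xs + rng.gauss(0, sigma_e0) for xs in x_star]           # x = x* + e0

slope_true_x = ols_slope(x_star, y)  # infeasible regression on the actual x*
slope_observed_x = ols_slope(x, y)   # feasible regression on the mismeasured x
plim_formula = beta1 * sigma_xstar**2 / (sigma_xstar**2 + sigma_e0**2)

print("slope using x*:       ", slope_true_x)
print("slope using x:        ", slope_observed_x)
print("beta1 * attenuation:  ", plim_formula)
```

Raising `sigma_e0` relative to `sigma_xstar` pushes the estimated slope further toward zero, which is the pattern the probability limit describes.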
12
Example
• We estimate the effect of family income on CAS mark average, after controlling for A-level grade average:
$CAS = \beta_0 + \beta_1 faminc^* + \beta_2 Alevel + u$
where faminc* is actual annual family income.
• Family income, especially when reported by students, could easily be mismeasured:
$e_1 = faminc - faminc^*$ (measurement error = reported value − actual value)
• because, for example, those with high A-level grade average may report family income more accurately.
▫ Then, $Cov(x, e_1) \neq 0$, where x denotes reported family income (faminc).
13
CAS mark equation with measurement error
• Then, using reported family income instead of actual family income,
$CAS = \beta_0 + \beta_1 faminc + \beta_2 Alevel + \underbrace{(u - \beta_1 e_1)}_{\eta}$
will bias the OLS estimator of β1 toward zero.
▫ This is the attenuation bias.
• One consequence of the attenuation bias is that a test of
H0: β1 = 0
will have less chance of detecting β1 ≠ 0.

14
What is an omitted variable?

• An omitted variable is a variable that belongs in the true model but is left out of the estimated model.
▫ Example: suppose that wage is determined by educational level and ability [the true model]:
$wage = \beta_0 + \beta_1 educ + \beta_2 abil + u$
▫ Because ability is not observable, we estimate the model:
$wage = \beta_0 + \beta_1 educ + \varepsilon$
▫ Here, ability is the omitted variable.
• What happens if we omit a relevant variable?

• Answer:
15
Consequences of omitting a relevant variable
• Consider the model [true model]:
$y = \beta_0 + \beta_1 x + \beta_2 z + u$ (1)
that satisfies the first four CLRM assumptions.
• But we omit z and estimate the model:
$y = \beta_0 + \beta_1 x + \varepsilon$ (2)
• It can be shown that (the expectation of) the OLS estimator for β1 in equation (2) is:
$E(\hat{\beta}_1) = \beta_1 + \beta_2 \underbrace{\dfrac{\sum (x_i - \bar{x})(z_i - \bar{z})}{\sum (x_i - \bar{x})^2}}_{\theta}$ (3)
• (You must be able to) realise that θ is the slope from the regression of z on x.
16

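Consequences of omitting a relevant variable – cont.
$E(\hat{\beta}_1) = \beta_1 + \underbrace{\beta_2 \underbrace{\dfrac{\sum (x_i - \bar{x})(z_i - \bar{z})}{\sum (x_i - \bar{x})^2}}_{\theta}}_{\text{bias}}$ (3)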
• Note that θ is the slope from the regression of z on x.
• Thus the OLS estimator is biased, unless

1. β2 = 0
2. θ = 0
17

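Consequences of omitting a relevant variable – cont.
$E(\hat{\beta}_1) = \beta_1 + \underbrace{\beta_2 \underbrace{\dfrac{\sum (x_i - \bar{x})(z_i - \bar{z})}{\sum (x_i - \bar{x})^2}}_{\theta}}_{\text{bias}}$ (3)
• Equation (3) tells us that the sign of the bias in $\hat{\beta}_1$ depends on the signs of both β2 and θ: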
• The bias is…
1. positive if β2 > 0 and θ > 0.
2. negative if β2 > 0 and θ < 0.
3. negative if β2 < 0 and θ > 0.
4. positive if β2 < 0 and θ < 0.
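The bias formula has an exact in-sample counterpart: the slope from the short regression (y on x) equals the long-regression slope on x plus the long-regression slope on z times the slope from regressing z on x. The sketch below checks this with simulated data. All numbers are assumptions chosen for illustration (β1 = 0.5, β2 = 2, and z = 0.6x + noise, so β2 > 0 and θ > 0, which is case 1 above), and `cross` is a hypothetical helper.

```python
import random


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


rng = random.Random(2)
n = 20_000
beta0, beta1, beta2 = 1.0, 0.5, 2.0  # assumed true parameters

x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.6 * xi + rng.gauss(0, 1) for xi in x]  # z is positively related to x
y = [beta0 + beta1 * xi + beta2 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

sxx, szz, sxz = cross(x, x), cross(z, z), cross(x, z)
sxy, szy = cross(x, y), cross(z, y)

# Long regression of y on x and z (closed form for two regressors).
det = sxx * szz - sxz**2
b1_long = (szz * sxy - sxz * szy) / det
b2_long = (sxx * szy - sxz * sxy) / det

# Short regression of y on x only, and the regression of z on x.
b1_short = sxy / sxx
theta_hat = sxz / sxx

print("short-regression slope:     ", b1_short)
print("long slope + b2 * theta_hat:", b1_long + b2_long * theta_hat)
print("estimated bias term:        ", b2_long * theta_hat)
```

Flipping the sign of the 0.6 coefficient (or of `beta2`) reverses the sign of the bias term, matching the four cases listed above.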
18
Example
• Consider again the wage equation [the true model]:
$wage = \beta_0 + \beta_1 educ + \beta_2 abil + u$ (e1)
• Because ability is not observable, we estimate the model:
$wage = \beta_0 + \beta_1 educ + \varepsilon$ (e2)
• (The expectation of) the OLS estimator for β1 in equation (e2) is:
$E(\hat{\beta}_1) = \beta_1 + \underbrace{\beta_2 \underbrace{\dfrac{\sum (educ_i - \overline{educ})(abil_i - \overline{abil})}{\sum (educ_i - \overline{educ})^2}}_{\theta}}_{\text{bias}}$
• What’s the sign of the bias?
▫ The answer depends on the signs of β2 and θ.
19
Notes on the omitted variable problem
• In summary, an omitted variable can cause the OLS estimator to be biased, where the bias is:
$\text{bias} = \beta_2 \dfrac{\sum (x_i - \bar{x})(z_i - \bar{z})}{\sum (x_i - \bar{x})^2}$ [from equation (3)]
• Hypothesis tests, such as the t test, are then invalid.
• This idea can be generalised to many explanatory variables.

▫ In general, omitting relevant variables will bias ALL coefficients in
the model.
20
Including irrelevant variables

• Consider the case where one included an irrelevant variable.
• The true model is written as:
$y = \beta_0 + \beta_1 x + \varepsilon$ (1)
while the estimated model is of the form:
$y = \beta_0 + \beta_1 x + \beta_2 z + u$ (2)
and this model satisfies the first four CLRM assumptions.
• z has no effect on y after controlling for x (i.e., β2 = 0).
• What is the effect of including z in equation (2)?

• Answer:
21
Including irrelevant variables – cont.
• Does this mean we can include as many irrelevant variables as we want?
• Answer: No!
• Why? Because the variances are generally larger compared to those obtained from the true model (1).
▫ Let’s see it concretely.
22
Including irrelevant variables – cont.

• The variance of $\hat{\beta}_1$ in the false model (2) can be written as:
$Var(\hat{\beta}_1)_F = \dfrac{\sigma^2}{SST\,(1 - R^2)}$ (3)
while the variance of $\hat{\beta}_1$ in the true model (1) is:
$Var(\hat{\beta}_1)_T = \dfrac{\sigma^2}{SST}$ (4)
where SST is the total variation in x and R^2 is the R-squared from the regression of x on z.
• Comparing (3) and (4) reveals that
$Var(\hat{\beta}_1)_F > Var(\hat{\beta}_1)_T$
unless x and z are uncorrelated.
• Thus the t-ratio of β1 will be affected.
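To make the comparison of (3) and (4) concrete, the sketch below evaluates both variance formulas on simulated regressors. The data-generating choices (x = 0.8z + noise, error variance σ² = 1, n = 2000) are assumptions made only for this illustration, and `cross` is a hypothetical helper. With a single z, the R-squared from regressing x on z is the squared sample correlation between them.

```python
import random


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


rng = random.Random(3)
n = 2_000
sigma2 = 1.0  # assumed error variance

z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + rng.gauss(0, 1) for zi in z]  # x and z are correlated

sst_x = cross(x, x)                                   # total variation in x
r2_x_on_z = cross(x, z) ** 2 / (sst_x * cross(z, z))  # R-squared of x on z

var_true = sigma2 / sst_x                      # equation (4): z excluded
var_false = sigma2 / (sst_x * (1 - r2_x_on_z)) # equation (3): z included

print("R-squared of x on z:   ", r2_x_on_z)
print("Var, true model:       ", var_true)
print("Var, model including z:", var_false)
print("inflation factor:      ", var_false / var_true)
```

The inflation factor is 1 / (1 − R²), so it grows without bound as x and z become more collinear and equals 1 only when the sample correlation is exactly zero.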

23
What is simultaneity?
• Simultaneity is a form of endogeneity that arises when…
▫ one or more of the explanatory variables is jointly determined with the
dependent variable.
• Q: What happens when simultaneity arises?
• A: It generally causes the OLS estimator to be biased and inconsistent.
• Now, let’s see…
1. When does simultaneity arise?
▫ A classic example is a pair of supply and demand equations for an input to production.
2. Why is the OLS estimator biased when it arises?

24
When does simultaneity arise?

• Consider the labour supply function in the agricultural sector in country i:
$h_{si} = \alpha_1 w_i + \beta_1 z_{1i} + u_{1i}$ (1)
where hsi: the annual labour hours supplied by workers;
wi: the average hourly wage of the workers;
z1i: the average wage in the manufacturing sector.
• The labour demand function is:
$h_{di} = \alpha_2 w_i + \beta_2 z_{2i} + u_{2i}$ (2)
where hdi: the annual labour hours demanded by farmers;
z2i: agricultural land area.
• Equation (1) describes a behavioural equation for workers, while equation (2) is that for farmers.
• Each equation is a structural equation.
25
When does simultaneity arise? –cont.

• Because observed wage and hours are determined by the intersection of supply and demand, in equilibrium:
$h_{si} = h_{di}$
• Denote observed equilibrium hours for each country i by hi.
• In equilibrium, equations (1) and (2) can be expressed using hi:
$h_i = \alpha_1 w_i + \beta_1 z_{1i} + u_{1i}$ (1)’
$h_i = \alpha_2 w_i + \beta_2 z_{2i} + u_{2i}$ (2)’
• Equations (1)’ and (2)’ constitute a simultaneous equations model.
• hi, wi - called endogenous variables.
• z1i, z2i - called exogenous.
• u1i, u2i - called structural errors.
26
Consequences of simultaneity for OLS estimation
• When simultaneity arises, it causes the OLS estimator to be biased and inconsistent…
• because wi, which is determined simultaneously with the dependent variable (hi), is correlated with the error term.
• Recall that, in equilibrium, the labour supply and demand functions can be expressed as:
$h_i = \alpha_1 w_i + \beta_1 z_{1i} + u_{1i}$ (1)’
$h_i = \alpha_2 w_i + \beta_2 z_{2i} + u_{2i}$ (2)’
where z1i and z2i are exogenous, so that each is uncorrelated with u1 and u2.
• To show that wi is correlated with u1i, we solve the two equations for wi in terms of z1i, z2i, u1i and u2i.
27
Consequences of simultaneity – cont.

• Plugging (2)’ into (1)’ and solving for wi gives (i is suppressed for simplicity):
$w = \delta z_1 + \gamma z_2 + \upsilon$ (3)
where δ = -β1/(α1-α2), γ = β2/(α1-α2), and υ = (-u1+u2)/(α1-α2). Assume α1 ≠ α2.
• Equation (3) expresses the endogenous variable in terms of the exogenous variables and the error terms.
▫ Called the reduced form equation for w.

▫ Parameters δ and γ are called reduced form parameters.
 They are functions of the structural parameters, α’s and β’s.
▫ υ is called the reduced form error, and is a function of the structural
error terms.
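For reference, the algebra behind the reduced form can be written out step by step. This is a worked derivation from (1)’ and (2)’ only, with the index i suppressed and α1 ≠ α2 assumed; it introduces no new symbols.

```latex
\begin{aligned}
\alpha_1 w + \beta_1 z_1 + u_1 &= \alpha_2 w + \beta_2 z_2 + u_2
  && \text{(equate the right-hand sides of (1)' and (2)')} \\
(\alpha_1 - \alpha_2)\, w &= -\beta_1 z_1 + \beta_2 z_2 + (u_2 - u_1)
  && \text{(collect the terms in } w\text{)} \\
w &= \frac{-\beta_1}{\alpha_1 - \alpha_2}\, z_1
   + \frac{\beta_2}{\alpha_1 - \alpha_2}\, z_2
   + \frac{-u_1 + u_2}{\alpha_1 - \alpha_2}
  && \text{(divide by } \alpha_1 - \alpha_2 \neq 0\text{)}
\end{aligned}
```

Reading off the three terms gives the coefficient on z1, the coefficient on z2 and the reduced form error, respectively. The last term shows directly that w depends on u1 whenever α1 ≠ α2.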
28
Consequences of simultaneity – cont.

$w = \delta z_1 + \gamma z_2 + \upsilon$ (3)
where δ = -β1/(α1-α2), γ = β2/(α1-α2), and υ = (-u1+u2)/(α1-α2).
• Now we can see, using equation (3), that the explanatory variable w in our original structural equation,
$h = \alpha_1 w + \beta_1 z_1 + u_1$ (1)’
is correlated with the error term, u1.
• We see from (3) that w is a function of υ, which is a function of u1 (and u2).
• Therefore w and u1 are correlated in equation (1)’, leading to bias and inconsistency in OLS.
▫ Called simultaneity bias.
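Simultaneity bias can be seen in a simulated labour market. The sketch below is an illustration only: the structural values (α1 = 1, β1 = −1 for supply; α2 = −1, β2 = 1 for demand) and the independent standard normal z1, z2, u1, u2 are assumptions, and `cross` is a hypothetical helper. Wages are generated by solving the two structural equations, and the supply equation (1)’ is then estimated by OLS. Under these assumed values a population calculation gives Cov(w, u1) = −1/2 and a probability limit of 1/3 for the OLS coefficient on w, well below the true α1 = 1.

```python
import random


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


rng = random.Random(4)
n = 50_000
alpha1, beta1 = 1.0, -1.0  # assumed supply parameters
alpha2, beta2 = -1.0, 1.0  # assumed demand parameters

z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
u1 = [rng.gauss(0, 1) for _ in range(n)]
u2 = [rng.gauss(0, 1) for _ in range(n)]

# Equilibrium wage: solve (1)' = (2)' for w.
w = [(-beta1 * a + beta2 * b - c + d) / (alpha1 - alpha2)
     for a, b, c, d in zip(z1, z2, u1, u2)]
# Equilibrium hours from the supply equation (1)'.
h = [alpha1 * wi + beta1 * a + c for wi, a, c in zip(w, z1, u1)]

# OLS of h on w and z1 (closed form for two regressors).
sww, szz, swz = cross(w, w), cross(z1, z1), cross(w, z1)
swh, szh = cross(w, h), cross(z1, h)
alpha1_ols = (szz * swh - swz * szh) / (sww * szz - swz**2)

cov_w_u1 = cross(w, u1) / (n - 1)  # sample covariance between w and u1

print("true alpha1:         ", alpha1)
print("OLS estimate:        ", alpha1_ols)
print("sample Cov(w, u1):   ", cov_w_u1)
```

Increasing n does not move the OLS estimate toward 1, which is what inconsistency means in practice.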
29
How to address endogeneity?
• We have looked at 3 cases where x can be correlated with u.

▫ Measurement error, omitted variables and simultaneity.
• If x is correlated with u, then x is said to be an endogenous variable.
• When x is endogenous, the OLS estimator is biased and inconsistent.
• Can we somehow obtain an unbiased and consistent estimator?

30
What is an instrumental variable (IV)?
• An instrumental variable z (also called an instrument) for x is a variable that satisfies the two assumptions:
1. z is uncorrelated with u, i.e., Cov(z, u) = 0.

2. z is correlated with x, i.e., Cov(z, x) ≠ 0.
31
Example
• Consider the following model:
$score = \beta_0 + \beta_1 skipped + \beta_2 income + u$
where score = final exam score, skipped = the total number of lectures missed
and income = a student’s income.
• u contains other factors affecting score.
• u and skipped may be correlated. Why?
• Then the OLS estimator of β1 is biased.

• What might be a good IV for skipped? A good IV has to be…
1. uncorrelated with student ability and motivation.
2. correlated with skipped.
▫ A possible instrument: distance between living quarters and campus.
32
IV estimation
• Now let’s see how an IV can be used to consistently estimate β1 in the following equation:
$y = \beta_0 + \beta_1 x + u$ (1)
• Using (1), the covariance between z (= instrument) and y can be written as:
Cov(z,y) = β1Cov(z,x) + Cov(z,u)
• Because Cov(z,u) = 0, solving this for β1 gives:
$\beta_1 = \dfrac{Cov(z, y)}{Cov(z, x)}$ (2)
▫ Notice how this algebra fails if Cov(z,x) = 0.
• Estimate (2) using its sample analogs. After cancelling the sample sizes in the numerator and denominator:
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})}$
called the IV estimator.
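The IV formula is short enough to compute directly. The sketch below compares it with OLS on simulated data. The set-up is an assumption made for illustration: an unobserved factor `a` enters both x and the error u (so x is endogenous), the instrument z shifts x but is independent of u, and the true slope is β1 = 1. `cross` is a hypothetical helper.

```python
import random


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


rng = random.Random(5)
n = 50_000
beta0, beta1 = 1.0, 1.0  # assumed true parameters

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument
a = [rng.gauss(0, 1) for _ in range(n)]  # unobserved factor (e.g. ability)

x = [0.8 * zi + ai + rng.gauss(0, 1) for zi, ai in zip(z, a)]  # Cov(z, x) != 0
u = [ai + rng.gauss(0, 1) for ai in a]                         # Cov(x, u) != 0, Cov(z, u) = 0
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

b_ols = cross(x, y) / cross(x, x)  # inconsistent because x is endogenous
b_iv = cross(z, y) / cross(z, x)   # sample analog of Cov(z, y) / Cov(z, x)

print("true beta1:  ", beta1)
print("OLS estimate:", b_ols)
print("IV estimate: ", b_iv)
```

If the 0.8 coefficient linking z to x is shrunk toward zero, the denominator of the IV estimator approaches zero and the estimate becomes erratic, which is the practical face of the requirement Cov(z, x) ≠ 0.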
33
Two stage least squares (2SLS) estimator
• What if there are multiple instruments for an endogenous explanatory variable x?
• Consider the model:
$y = \beta_0 + \beta_1 x + u$
where x is endogenous, i.e., Cov(x,u) ≠ 0.
• Suppose now that we have two instruments, z1 and z2, for x. That is, the z’s satisfy:
1. $Cov(z_i, u) = 0$ for i = 1, 2
2. $Cov(z_i, x) \neq 0$ for i = 1, 2
• Q: Can we just use each as an IV?

• A:
34

Two stage least squares (2SLS) estimator
• Because both z1 and z2 are uncorrelated with u, any linear combination of them is also uncorrelated with u.
• Thus any linear combination of the z’s is a valid IV.
• To find the best IV, choose the linear combination that is most highly correlated with x (call it x*), given by:
$x^* = \pi_0 + \pi_1 z_1 + \pi_2 z_2$ (1)
• Problem: we don’t know x*...
• We can estimate (1) by OLS, regressing x on z1 and z2, to obtain the fitted values for x*:
$\hat{x}^* = \hat{\pi}_0 + \hat{\pi}_1 z_1 + \hat{\pi}_2 z_2$
• Once we have $\hat{x}^*$, we can use it as the IV for x.
35

Two stage least squares (2SLS) estimator – cont.
• With multiple instruments, the IV estimator using $\hat{x}^*$ as the instrument is also called the two stage least squares estimator. Why?
• It can be shown that when we use $\hat{x}^*$ as the IV for x, the IV estimates $\hat{\beta}$s are identical to the OLS estimates from the regression of y on $\hat{x}^*$.
• That is, we can obtain the 2SLS estimator in two stages:
Stage 1: regress x on z1 and z2, i.e., estimate $x^* = \pi_0 + \pi_1 z_1 + \pi_2 z_2$, and obtain $\hat{x}^*$.
Stage 2: run the regression of y on $\hat{x}^*$.
• Econometrics packages (including EViews) have special commands for 2SLS.
▫ In fact, you should avoid doing the second stage manually, because the
standard errors are invalid.
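The equivalence between the two-stage procedure and the IV estimator that uses the first-stage fitted values as the instrument can be verified numerically. The sketch below computes point estimates only (no standard errors, for the reason given above). The data-generating values are assumptions for illustration: two instruments with first-stage coefficients 0.5 each, an unobserved factor `a` that makes x endogenous, and a true slope β1 = 1. `cross` and `mean` are hypothetical helpers.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


rng = random.Random(6)
n = 50_000
beta0, beta1 = 1.0, 1.0  # assumed true parameters

z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
a = [rng.gauss(0, 1) for _ in range(n)]  # unobserved factor in both x and u

x = [0.5 * p + 0.5 * q + ai + rng.gauss(0, 1) for p, q, ai in zip(z1, z2, a)]
u = [ai + rng.gauss(0, 1) for ai in a]
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

# Stage 1: OLS of x on z1 and z2 (closed form for two regressors).
s11, s22, s12 = cross(z1, z1), cross(z2, z2), cross(z1, z2)
s1x, s2x = cross(z1, x), cross(z2, x)
det = s11 * s22 - s12**2
pi1 = (s22 * s1x - s12 * s2x) / det
pi2 = (s11 * s2x - s12 * s1x) / det
pi0 = mean(x) - pi1 * mean(z1) - pi2 * mean(z2)
x_hat = [pi0 + pi1 * p + pi2 * q for p, q in zip(z1, z2)]

# Stage 2: OLS of y on the fitted values.
b_2sls = cross(x_hat, y) / cross(x_hat, x_hat)
# IV estimator using the fitted values as the instrument for x.
b_iv_xhat = cross(x_hat, y) / cross(x_hat, x)
b_ols = cross(x, y) / cross(x, x)

print("OLS estimate:        ", b_ols)
print("2SLS (two stages):   ", b_2sls)
print("IV with x_hat as IV: ", b_iv_xhat)
```

The two IV-type numbers agree up to rounding because the first-stage residual is orthogonal to the fitted values, so the sum of cross-products of `x_hat` with `x` equals that of `x_hat` with itself.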
36
Example
• To investigate the effect of education on wage, an economist estimated model (1) using a sample of men in the US in 1976:
$\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 exper^2 + \beta_4 black + \beta_5 msa + \beta_6 south + u$ (1)
where
educ: years of education;
exper: years of experience;
black: dummy = 1 if black, and 0 otherwise;
msa: dummy = 1 if living in a metropolitan statistical area, and 0 otherwise;
south: dummy = 1 if living in the South, and 0 otherwise.
37
US wage equation
• u contains unobserved factors affecting wage,
▫ e.g., ability, how disciplined workers are, etc.
• …that are likely to be correlated with educ.

▫ Then the OLS estimator for β1 is biased and inconsistent.
• Q: What do we do?
• A: Use an instrument for educ.
• An instrument for educ: a dummy variable for whether someone grew up near a four-year college (nearc4).
1. likely to be correlated with education.
2. unlikely to be correlated with the error term, u.
38
US wage equation – cont.
• Using a sample of men in the US in 1976,
$\widehat{educ} = 16.64 + .320\, nearc4 - .413\, exper + \dots$ (2)
with standard errors (.24), (.088) and (.034), respectively, in parentheses,
where nearc4: dummy = 1 if a person grew up near a four-year college.
▫ Equation (2) is called the reduced form equation for educ, or, the first
stage equation.
• We are interested in the coefficient and t statistic on nearc4.
• Q: How do you interpret the coefficient on nearc4?
• nearc4 can be used as an IV for educ.

▫ if nearc4 is uncorrelated with u.
39
US wage equation – cont.

40
Summary
1. What are measurement error, omitted variables and simultaneity?

 cf. Wooldridge Chapter 3 for omitted variables.
2. What are the consequences for OLS estimation?
3. How to deal with them?

▫ Use the IV (2SLS) estimator.
4. In the next lecture, we will depart from cross section and time
series econometrics and go through panel data econometrics.
