
Unit 11: Endogeneity and Measurement error

C. Zulehner: Introductory Econometrics 1 / 36


Outline

1 Endogeneity
    Failure of Exogeneity
    Detecting Endogeneity

2 Omitted Variables
    The General Case
    Consequences of Omitted Variables: an example

3 Measurement Error
    “Classical” Measurement Error
    The General Case
    An Informative Special Case
    Measurement Error in the Dependent Variable



1. Endogeneity

Until now, we have mainly looked at failures of assumptions MLR.2, MLR.3, and
MLR.5
  - We argued that multicollinearity, heteroskedasticity, and correlated errors
    among clusters (with the other assumptions still holding) only result in a
    loss of efficiency
  - Standard errors are too high but point estimates are “correct”, i.e. the
    estimates are consistent
Assumption MLR.1 (linearity) also does not seem too restrictive, as we can often
linearize the model. Alternatively, we can use non-linear estimation methods
(not in this lecture)
The crucial assumption of our model is MLR.4: the orthogonality or conditional
independence assumption

\[ E(u \mid X) = 0 \]



Endogeneity

We will now address the issues related to the failure of this central assumption more
generally
1 What happens if it does not hold?
2 What, besides omitted variable bias, can cause its failure?
3 How do we know if it fails to hold?



1.1 Failure of Exogeneity

Why do we care so much about MLR.4?


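Without the orthogonality condition, our estimates are (biased and) inconsistent!
Why don’t we get consistency? [1]

\[
\operatorname{plim}\hat\beta = \operatorname{plim}\beta
+ \underbrace{\operatorname{plim}\left(\tfrac{1}{N}X'X\right)^{-1}}_{\xrightarrow{p}\; E[x_i'x_i]^{-1} = Q^{-1}}
\;\underbrace{\operatorname{plim}\left(\tfrac{1}{N}X'u\right)}_{\xrightarrow{p}\; E[x_i'u_i] \neq 0}
\neq \beta
\]

Hence if $\operatorname{plim}\left(\tfrac{1}{N}X'u\right) = E[x_i'u_i] \neq 0$ the OLS estimates are inconsistent

[1] Remember $\hat\beta = \beta + (X'X)^{-1}X'u \Rightarrow \hat\beta - \beta = (X'X)^{-1}X'u$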
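A minimal simulation sketch of this result in Python (standard library only). The data-generating process, the values beta = 1 and gamma = 0.5, and the helper names `ols_slope` and `simulate` are illustrative assumptions, not part of the lecture. Here Cov(x, u) = 0.5 and Var(x) = 1.25, so the slope should settle near 1 + 0.5/1.25 = 1.4 instead of 1 as N grows:

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n, beta=1.0, gamma=0.5, seed=0):
    """Draw data where x = z + gamma*u, so that Cov(x, u) = gamma * Var(u) != 0."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [rng.gauss(0, 1) + gamma * ui for ui in u]
    y = [beta * xi + ui for xi, ui in zip(x, u)]
    return ols_slope(x, y)


# Probability limit implied by the formula: beta + Cov(x, u) / Var(x)
plim_slope = 1.0 + 0.5 / (1.0 + 0.5 ** 2)

# More data does not help: the estimate converges to plim_slope, not to beta = 1
estimates = {n: simulate(n) for n in (100, 10_000, 200_000)}
for n, b in estimates.items():
    print(n, round(b, 3), "target of OLS:", plim_slope)
```

The point of the sketch is that a larger sample only tightens the estimate around the wrong value.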



Asymptotic Bias

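The asymptotic bias is $Q^{-1}\operatorname{plim}(X'u/N)$
The direction of the bias will in general depend on whether:

\[ Q^{-1}\operatorname{plim}\left(\tfrac{1}{N}X'u\right) \gtrless 0 \]

Since Q is positive definite, this means that, with a single regressor,
  - If X and u are positively correlated we have upward bias: $\operatorname{plim}\hat\beta > \beta$
  - If X and u are negatively correlated we have downward bias: $\operatorname{plim}\hat\beta < \beta$
Example: the omitted variable bias for $x_1$ when the true model is $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i$,
but we estimate $y_i = \alpha_0 + \alpha_1 x_{1i} + v_i$, is

\[ \operatorname{plim}\hat\alpha_1 - \beta_1 = \beta_2\,\frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)} \]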



Implications

Key: When exogeneity fails the estimator is unable to identify the true effect of X
on y
Non-identification here means the inability to deliver consistent estimates
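Intuition: we are trying to measure $\partial y/\partial X$
When u varies with X, this is equal to

\[ \frac{\partial y}{\partial X} = \beta + \frac{\partial u}{\partial X} \neq \beta \]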



Positive Bias - Intuition



Negative Bias - Intuition



Remarks

If $Q^{-1}\operatorname{plim}(X'u/N)$ is sufficiently large, this can generate large differences between
$\hat\beta$ and $\beta$
  - Depending on the sign of the correlation between X and u, even the signs of
    $\hat\beta$ and $\beta$ can differ!
If we are sure that the estimate is consistent, then even if it is inefficient (has a high
standard error) we can still focus on the size of the average effect and draw some
useful conclusions (though the high variance calls for caution)
If the estimate is inconsistent, we cannot draw any conclusion
Knowing the direction of the bias can be informative:
  - If $\hat\beta > 0$ and $\operatorname{plim}(X'u/N) < 0$, then the estimate is a lower bound of the true
    effect
  - Try to guess the sign and size of the bias whenever possible



1.2. Detecting Endogeneity

We could test/check for heteroskedasticity, correlated errors, and multicollinearity, but
unfortunately we cannot easily test whether $E(u \mid X) = 0$
  - u is unobservable! Can we use “the trick” of replacing it with û?
  - NO! OLS estimates are constructed so that $X'\hat u = 0$!
However, in the units on instrumental variable estimation, we will see that there is also
a statistical test, which might be helpful if we have additional information
In general, to “detect” endogeneity
  i) become suspicious (i.e. be a good econometrician) and
  ii) think about the theory you are testing (i.e. be a good economist!)
Think of situations under which conditional independence might fail and check
whether these factors play a role in your case
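The reason the residual “trick” cannot work can be seen in a small Python sketch (standard library only). The simulated data and the coefficient values are assumptions chosen for illustration: x is built to be correlated with the true error u, yet the OLS residuals are orthogonal to x by construction.

```python
import random

rng = random.Random(1)
n = 5000

# Endogenous regressor: x contains part of the true error u
u = [rng.gauss(0, 1) for _ in range(n)]
x = [rng.gauss(0, 1) + 0.8 * ui for ui in u]
y = [2.0 + 1.0 * xi + ui for xi, ui in zip(x, u)]

# OLS of y on a constant and x
mx, my = sum(x) / n, sum(y) / n
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Sample covariance of x with the true error (unobservable in practice) ...
cov_x_u = sum((a - mx) * ui for a, ui in zip(x, u)) / n
# ... and with the OLS residuals (zero up to rounding, whatever the truth is)
cov_x_resid = sum((a - mx) * ri for a, ri in zip(x, resid)) / n

print("Cov(x, u)     =", round(cov_x_u, 3))
print("Cov(x, u_hat) =", cov_x_resid)
```

Because the second covariance is zero for any data set, it carries no information about whether MLR.4 holds.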



Sources of Endogeneity

There are four main sources of endogeneity (i.e. $E(u \mid X) \neq 0$)
1 Omitted variables
2 Measurement errors
3 Reverse (two-way) causality
4 Selection
We will mainly focus on the first two today, and on 3) and 4) in the coming weeks.



2.1 The General Case

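The true relationship is $y = X\beta + Z\alpha + u$, where Z is the potentially omitted
variable(s)
The OLS estimator from regressing y on X alone is therefore

\[
\begin{aligned}
\hat\beta &= (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + Z\alpha + u) \\
          &= (X'X)^{-1}X'X\beta + (X'X)^{-1}X'Z\alpha + (X'X)^{-1}X'u \\
          &= \beta + (X'X)^{-1}X'Z\alpha + (X'X)^{-1}X'u
\end{aligned}
\]

Taking expectations (conditional on X and Z)

\[ E(\hat\beta) = \beta + (X'X)^{-1}X'Z\alpha \]

Hence the bias depends on the portion of $z_i$ which is “explained” by $x_i$ (the coefficients
from regressing Z on X), weighted by $\alpha$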



The General Case

To sign the bias exactly in the general case, we have to know all correlations among
the x’s and the omitted factor
Typically, we explicitly measure the bias only if we can (safely) assume that the
other x’s are uncorrelated with the omitted factor
So, why are omitted variables a cause of endogeneity?
  - If we omit $x_2$ (or Z) from our equation, it will be captured by the error term u
  - If $x_1$ and $x_2$ are correlated, then $E(u \mid x_1) \neq 0$
  - In this case, omitting $x_2$ makes assumption MLR.4 fail and the OLS
    estimates inconsistent!
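A short Python sketch (standard library only) of omitted variable bias with one included and one omitted regressor. All numbers (beta1 = 1, beta2 = 2, delta = 0.5) and the helper `ols_slope` are assumptions for illustration. In this scalar case $(X'X)^{-1}X'Z$ is simply the slope from regressing x2 on x1, so the short regression should return roughly beta1 + beta2 * delta = 2:

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(42)
n = 50_000
beta1, beta2, delta = 1.0, 2.0, 0.5   # illustrative values

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [delta * a + rng.gauss(0, 1) for a in x1]          # omitted, correlated with x1
y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

short_slope = ols_slope(x1, y)        # regression that omits x2
x2_on_x1 = ols_slope(x1, x2)          # scalar analogue of (X'X)^{-1} X'Z
predicted = beta1 + beta2 * x2_on_x1  # beta + (X'X)^{-1} X'Z alpha

print("short regression slope:", round(short_slope, 3))
print("beta1 + beta2 * slope(x2 on x1):", round(predicted, 3))
```

Setting delta to zero in this sketch removes the bias even though x2 still affects y, which is the sense in which both correlation and relevance are needed.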



2.2 Omitted Variables: an example

The use of checks in Italy


Suppose we want to estimate the determinants of the use of checks as a means of
payment in Italy [a]
Suppose we are interested in the effect of income but we have no proxy for
education, while we think that education matters for the use of checks
  - Income and education are highly and significantly correlated (0.407)
  - In principle we expect a large effect of education on the use of checks (the
    more educated are more likely to have a checking account and thus use checks)
We therefore expect a large bias if we omit education
Suppose that we also omit wealth, which is even more strongly and also significantly
correlated with income (correlation = 0.630)
Exercise: replicate all tables and calculations using italy.csv

[a] See Guiso, Sapienza, and Zingales (2004), “The Role of Social Capital in Financial Development”,
American Economic Review, 94(3), 526-556



The use of checks in Italy
Dependent variable: Use of Checks

Specification             Correct       Omit Educ+Wealth   Omit Wealth
Social capital            0.9440        0.8950             0.9425
                          (5.785)**     (4.966)**          (5.758)**
Age                       -0.0034       -0.0067            -0.0033
                          (-7.740)**    (-17.080)**        (-7.347)**
Married                   0.0589        0.0331             0.0581
                          (5.111)**     (2.550)**          (5.023)**
Male                      0.0197        0.0381             0.0197
                          (2.110)**     (3.626)**          (2.113)**
Years of education        0.0287                           0.0287
                          (15.852)**                       (15.974)**
Household wealth          0.0447
                          (1.429)
Household income          0.0057        0.0088             0.0061
                          (14.645)**    (11.514)**         (12.764)**
Judicial Efficiency       -0.0130       -0.0172            -0.0128
                          (-1.138)      (-1.380)           (-1.127)
Constant                  -0.4778       -0.0777            -0.4830
                          (-2.995)**    (-0.448)           (-3.013)**
Number of observations    32442         32442              32442
R-squared adjusted        0.277         0.227              0.277

Robust t-statistics clustered at the regional level in parentheses



Consequences of omitted variables: an example

The use of checks in Italy


Omitting education (together with wealth, second column) leads to a very large bias in the
income coefficient: it increases by more than 50%, from 0.0057 to 0.0088
  - Education is highly correlated with income
  - It has a very strong effect on the use of checks: an increase of one (four) year(s)
    of education increases the use of checks by about 3 (12)%
  - All other coefficients are also biased. This means that education is correlated
    with all other variables. The main driver of the bias, however, is the strong
    relationship between education and the use of checks
Omitting wealth alone (third column) instead generates a much smaller bias, in spite of wealth
being more highly correlated with income
  - Wealth has a small effect on the use of checks: doubling the mean of wealth
    increases the use of checks by 0.58%
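The magnitudes quoted on this slide can be checked directly against the regression table. The following Python lines are only arithmetic on the reported coefficients; the variable and function names are illustrative:

```python
# Coefficients copied from the table "The use of checks in Italy"
income = {"correct": 0.0057, "omit_educ_wealth": 0.0088, "omit_wealth": 0.0061}
educ_coef = 0.0287


def pct_change(new, base):
    """Percentage change of a coefficient relative to the correct specification."""
    return 100.0 * (new - base) / base


change_omit_both = pct_change(income["omit_educ_wealth"], income["correct"])
change_omit_wealth = pct_change(income["omit_wealth"], income["correct"])
effect_four_years = 4 * educ_coef   # effect of four more years of education

print(f"income coefficient, educ+wealth omitted: +{change_omit_both:.1f}%")
print(f"income coefficient, wealth omitted:      +{change_omit_wealth:.1f}%")
print(f"four years of education:                 {effect_four_years:.4f}")
```

The contrast between the two percentage changes is the slide's point: the variable more strongly correlated with income (wealth) produces the smaller bias because its own effect on the use of checks is small.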



Omitted Variables Bias: Wrap up

Hence, in assessing how much an omitted variable may bias the estimates,
think of both
  - How correlated it is likely to be with the included variable(s)
  - How important it is in affecting the dependent variable
In practice there may be many variables that affect y, but what matters is that we
include the critical ones



Unobserved Heterogeneity

Unobserved heterogeneity is a particular case of omitted variables
The only difference is that the omitted variable is unobservable, but if it were observed
it should be in the regression

Example 1: ability in estimates of the return to education

\[ \mathit{wage}_i = \beta_0 + \beta_1\,\mathit{Education}_i + \underbrace{\beta_2\,\mathit{Ability}_i + u_i}_{\text{New error term}} \]

Since we don’t observe Ability, it ends up in the error term
Problem: Ability is likely to be correlated with Education, as more able people perform
better ⇒ they are more likely to invest in education
A positive $\hat\beta_1$ could simply reflect the effect of unobserved ability due to the
correlation between ability and education



Unobserved Heterogeneity

Example 2: productivity in estimates of returns to capital

\[ Y_j = A_j \cdot K_j^{\beta_k} \cdot L_j^{\beta_l} \;\rightarrow\; y_j = \beta_0 + \beta_k \cdot k_j + \beta_l \cdot l_j + \epsilon_j \]

with $y_j = \ln Y_j$, $k_j = \ln K_j$, $l_j = \ln L_j$ and $\beta_0 + \epsilon_j = \ln A_j$
The error term, $\epsilon_j$, includes:
  - technology or management differences, measurement errors, variation in
    external factors (weather, machine breakdown, labor problems)
Observed inputs may be correlated with the unobserved shock, and therefore OLS will
yield biased and inconsistent estimates
  - capital and labor are chosen by the firm
  - if the firm has knowledge of $\epsilon_j$ (or some part of it) when making input
    decisions, the choices will likely be correlated with $\epsilon_j$
  - already noted by Marschak and Andrews (Econometrica, 1944)



3.1. “Classical” Measurement Error

Most (perhaps all) of our data are measured with error
Suppose our true relation is:

\[ y = X^*\beta + u \quad\text{with}\quad E(u \mid X^*) = 0 \]

Instead of the exact variables $X^*$ we observe X, which is measured with error:

\[ X = X^* + e \]

The “classical” measurement error is when $E(e \mid X^*) = 0$
We can therefore re-write our model as:

\[ y = X\beta - e\beta + u \]

  - Now X is by construction correlated with the composite error $u - e\beta$
  - This leads to inconsistency of the OLS estimates
  - We want to estimate $E(y \mid X^*)$ but we can only estimate $E(y \mid X)$
  - Which type of bias will we have and how large will it be?



Measurement Error in The SLR
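With one regressor and “classical” measurement error the plim of the slope
coefficient is:

\[
\begin{aligned}
\operatorname{plim}\hat\beta
&= \frac{\operatorname{Cov}(y, X)}{\operatorname{Var}(X)}
 = \frac{\operatorname{Cov}(\beta X - \beta e + u,\, X)}{\operatorname{Var}(X)} \\
&= \frac{1}{\operatorname{Var}(X)}\Big[\beta\underbrace{\operatorname{Cov}(X, X)}_{=\operatorname{Var}(X)}
   - \beta\operatorname{Cov}(e, X) + \underbrace{\operatorname{Cov}(u, X)}_{=0}\Big] \\
&= \beta - \beta\,\frac{\operatorname{Cov}(e, X^* + e)}{\operatorname{Var}(X)}
 = \beta - \beta\,\frac{\overbrace{\operatorname{Cov}(e, X^*)}^{=0} + \overbrace{\operatorname{Cov}(e, e)}^{=\operatorname{Var}(e)}}{\operatorname{Var}(X)} \\
&= \beta\left(1 - \frac{\operatorname{Var}(e)}{\operatorname{Var}(X^*) + \operatorname{Var}(e)}\right)
 = \beta\,\underbrace{\frac{\operatorname{Var}(X^*)}{\operatorname{Var}(X^*) + \operatorname{Var}(e)}}_{<1}
\end{aligned}
\]

1 The OLS estimate is biased towards zero. This is called attenuation bias
2 The extent of the bias depends on the importance of the measurement error: the ratio
$\operatorname{Var}(X^*)/[\operatorname{Var}(X^*) + \operatorname{Var}(e)]$ is called the signal-to-noise ratio or reliability ratio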
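The attenuation formula can be illustrated with a small Python simulation (standard library only). The design is an assumption made for this sketch: Var(X*) = 1, beta = 2, and three illustrative error variances; `ols_slope` and `attenuated_slope` are hypothetical helper names.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def attenuated_slope(n, beta, var_e, seed=0):
    """Regress y on the mismeasured X = X* + e, where y is generated from X*."""
    rng = random.Random(seed)
    sd_e = var_e ** 0.5
    x_star = [rng.gauss(0, 1) for _ in range(n)]        # true regressor, Var = 1
    x_obs = [a + rng.gauss(0, sd_e) for a in x_star]    # observed with error
    y = [beta * a + rng.gauss(0, 1) for a in x_star]
    return ols_slope(x_obs, y)


beta = 2.0
results = {}
for var_e in (0.0, 0.25, 1.0):
    reliability = 1.0 / (1.0 + var_e)   # Var(X*) / [Var(X*) + Var(e)]
    results[var_e] = (attenuated_slope(100_000, beta, var_e), beta * reliability)
    print(var_e, [round(v, 3) for v in results[var_e]])
```

Each pair holds the simulated slope and the value beta times the reliability ratio; the two should be close, and both shrink towards zero as Var(e) grows.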



Measurement Error - Intuition



3.2. The General Case

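Consider the previous model with classical measurement error but assume now that
X is multi-dimensional
Call Ω the covariance matrix of the measurement error e, i.e. $\operatorname{plim}\left(\tfrac{1}{N}e'e\right) = \Omega$. Moreover assume

\[ \operatorname{plim}\left(\tfrac{1}{N}X^{*\prime}X^*\right) = \Omega^*_{XX} \quad\text{and}\quad \operatorname{plim}\left(\tfrac{1}{N}X^{*\prime}e\right) = 0 \]

The OLS estimator is

\[ \hat\beta = (X'X)^{-1}X'y = \left[(X^*+e)'(X^*+e)\right]^{-1}(X^*+e)'(X^*\beta + u) \]

And taking the plim

\[
\begin{aligned}
\operatorname{plim}\hat\beta &= \operatorname{plim}\left[(X^*+e)'(X^*+e)\right]^{-1}(X^*+e)'(X^*\beta + u) \\
&= (\Omega^*_{XX} + \Omega)^{-1}\,\Omega^*_{XX}\,\beta
\end{aligned}
\]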



Proof

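Denominator:

\[
\operatorname{plim}\left(\tfrac{1}{N}(X^*+e)'(X^*+e)\right)
= \operatorname{plim}\Big(\tfrac{1}{N}X^{*\prime}X^* + \underbrace{\tfrac{1}{N}X^{*\prime}e}_{\operatorname{plim}=0} + \underbrace{\tfrac{1}{N}e'X^*}_{\operatorname{plim}=0} + \tfrac{1}{N}e'e\Big)
= \Omega^*_{XX} + \Omega
\]

Numerator:

\[
\operatorname{plim}\left(\tfrac{1}{N}(X^*+e)'(X^*\beta+u)\right)
= \operatorname{plim}\Big(\tfrac{1}{N}X^{*\prime}X^*\beta + \underbrace{\tfrac{1}{N}e'X^*\beta + \tfrac{1}{N}X^{*\prime}u + \tfrac{1}{N}e'u}_{\operatorname{plim}=0}\Big)
= \Omega^*_{XX}\,\beta
\]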



Remarks

The matrix equivalent of the attenuation bias is

\[ \operatorname{plim}\hat\beta = (\Omega^*_{XX} + \Omega)^{-1}\,\Omega^*_{XX}\,\beta \]

But, in this general case, it is hard to say anything about the direction of the bias on
any single coefficient. It will depend on the vector of coefficients and the covariance
matrices of the regressors and the errors
If both $\Omega^*_{XX}$ and $\Omega$ are diagonal then all coefficients are biased towards zero



3.2 An informative special case

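Consider the two-variable case. One variable, $x_1$, is measured with error; the other,
$x_2$, is measured without error. The two matrices are then

\[
\Omega = \begin{pmatrix} \sigma_e^2 & 0 \\ 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\Omega^*_{XX} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}
\]

We can therefore write:

\[
\operatorname{plim}\hat\beta_1 = \frac{\sigma_1^2\sigma_2^2 - \sigma_{12}^2}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2 + \sigma_e^2\sigma_2^2}\,\beta_1
\]

\[
\operatorname{plim}\hat\beta_2 = \frac{\sigma_e^2\sigma_{12}}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2 + \sigma_e^2\sigma_2^2}\,\beta_1 + \beta_2
\]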
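As a numerical cross-check, the two closed-form expressions can be compared with a direct evaluation of $(\Omega^*_{XX} + \Omega)^{-1}\Omega^*_{XX}\beta$. The Python sketch below uses made-up values for the variances, covariance, and coefficients, and the function names are illustrative:

```python
def plim_matrix_form(s1, s2, s12, se, b1, b2):
    """Evaluate (Omega*_XX + Omega)^{-1} Omega*_XX beta for the 2x2 case.

    s1, s2, se are variances (sigma_1^2, sigma_2^2, sigma_e^2); s12 is the covariance.
    """
    a, b, c, d = s1 + se, s12, s12, s2          # entries of Omega*_XX + Omega
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    v = [s1 * b1 + s12 * b2, s12 * b1 + s2 * b2]  # Omega*_XX beta
    return (inv[0][0] * v[0] + inv[0][1] * v[1],
            inv[1][0] * v[0] + inv[1][1] * v[1])


def plim_closed_form(s1, s2, s12, se, b1, b2):
    """Closed-form plims of beta_1-hat and beta_2-hat from the slide."""
    den = s1 * s2 - s12 ** 2 + se * s2
    return (b1 * (s1 * s2 - s12 ** 2) / den, b2 + b1 * se * s12 / den)


# Illustrative values: unit variances, covariance 0.5, error variance 0.5
params = dict(s1=1.0, s2=1.0, s12=0.5, se=0.5, b1=1.0, b2=1.0)
matrix_result = plim_matrix_form(**params)
closed_result = plim_closed_form(**params)
print(matrix_result, closed_result)
```

With these values the coefficient on the mismeasured variable is pulled below its true value of 1, while the coefficient on the correctly measured variable is pushed above 1 because the covariance is positive.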



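Why?

\[
\begin{aligned}
\operatorname{plim}\hat\beta
&= \operatorname{plim}\left[(X^*+e)'(X^*+e)\right]^{-1}(X^*+e)'(X^*\beta+u)
 = (\Omega^*_{XX} + \Omega)^{-1}\,\Omega^*_{XX}\,\beta \\
&= \begin{pmatrix} \sigma_1^2+\sigma_e^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}^{-1}
   \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}
   \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \\
&= \frac{1}{\sigma_1^2\sigma_2^2 + \sigma_e^2\sigma_2^2 - \sigma_{12}^2}
   \begin{pmatrix} \sigma_2^2 & -\sigma_{12} \\ -\sigma_{12} & \sigma_1^2+\sigma_e^2 \end{pmatrix}
   \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}
   \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \\
&= \frac{1}{\sigma_1^2\sigma_2^2 + \sigma_e^2\sigma_2^2 - \sigma_{12}^2}
   \begin{pmatrix}
     \sigma_1^2\sigma_2^2 - \sigma_{12}^2 & \underbrace{\sigma_2^2\sigma_{12} - \sigma_2^2\sigma_{12}}_{=0} \\
     \underbrace{\sigma_1^2\sigma_{12} - \sigma_1^2\sigma_{12}}_{=0} + \sigma_e^2\sigma_{12} & -\sigma_{12}^2 + \sigma_1^2\sigma_2^2 + \sigma_e^2\sigma_2^2
   \end{pmatrix}
   \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \\
&= \frac{1}{\sigma_1^2\sigma_2^2 + \sigma_e^2\sigma_2^2 - \sigma_{12}^2}
   \begin{pmatrix}
     \sigma_1^2\sigma_2^2 - \sigma_{12}^2 & 0 \\
     \sigma_e^2\sigma_{12} & -\sigma_{12}^2 + \sigma_1^2\sigma_2^2 + \sigma_e^2\sigma_2^2
   \end{pmatrix}
   \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}
\end{aligned}
\]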



Attenuation bias
The attenuation bias of the error-ridden variable worsens when other variables are included.
The plim of $\hat\beta_1$ can be written as:

\[
\operatorname{plim}\hat\beta_1 = \beta_1\,\frac{\dfrac{\sigma_1^2}{\sigma_1^2+\sigma_e^2} - \rho_{12}^2}{1 - \rho_{12}^2}
\]

where $\rho_{12}$ is the correlation between the observed (error-ridden) $x_1$ and $x_2$
When $\rho_{12} \neq 0$ this attenuation bias is worse than when $x_2$ is excluded (check!)
Intuition: $x_2$ soaks up some of the signal in $x_1$, leaving more noise in what remains
One implication of this is that putting in extra regressors may lead to worse
estimates: omitted variables bias is reduced but attenuation bias is increased
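One way to carry out the suggested check is sketched below. The symbol $\lambda$ for the reliability ratio is introduced here only for brevity, and the argument assumes $|\rho_{12}| < 1$:

```latex
\[
\lambda \equiv \frac{\sigma_1^2}{\sigma_1^2+\sigma_e^2}, \qquad
\lambda - \frac{\lambda - \rho_{12}^2}{1-\rho_{12}^2}
= \frac{\lambda(1-\rho_{12}^2) - \lambda + \rho_{12}^2}{1-\rho_{12}^2}
= \frac{\rho_{12}^2\,(1-\lambda)}{1-\rho_{12}^2} \;\ge\; 0
\]
```

The difference is strictly positive whenever $\rho_{12} \neq 0$ and $\sigma_e^2 > 0$ (so that $\lambda < 1$). The attenuation factor with $x_2$ included is therefore smaller than the factor $\lambda$ that applies when $x_2$ is excluded.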



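Why?

\[
\operatorname{plim}\hat\beta_1
= \frac{\sigma_1^2\sigma_2^2 - \sigma_{12}^2}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2 + \sigma_e^2\sigma_2^2}\,\beta_1
= \frac{\sigma_1^2\sigma_2^2 - \sigma_{12}^2}{\sigma_2^2(\sigma_1^2+\sigma_e^2) - \sigma_{12}^2}\,\beta_1
= \frac{\dfrac{\sigma_1^2}{\sigma_1^2+\sigma_e^2} - \rho_{12}^2}{1-\rho_{12}^2}\,\beta_1
\]

Compare with

\[
\operatorname{plim}\hat\beta_1
= \beta_1 + \frac{\operatorname{Cov}(x_1, u - \beta_1 e)}{\operatorname{Var}(x_1)}
= \beta_1 - \frac{\beta_1\sigma_e^2}{\sigma_1^2+\sigma_e^2}
= \beta_1\left(1 - \frac{\sigma_e^2}{\sigma_1^2+\sigma_e^2}\right)
= \beta_1\left(\frac{\sigma_1^2}{\sigma_1^2+\sigma_e^2}\right)
\]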



Errors and Consistency
The presence of a variable measured with error causes inconsistency in the coefficients
of the other variables ($\beta_2$)

\[
\operatorname{plim}\hat\beta_2 = \frac{\sigma_e^2\sigma_{12}}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2 + \sigma_e^2\sigma_2^2}\,\beta_1 + \beta_2
\]

$\operatorname{plim}\hat\beta_2 \neq \beta_2$ if $x_1$ and $x_2$ are correlated ⇒ measurement error acts like a
contagious disease!
Mirror image of the previous result: $x_2$ soaks up some of the true variation of $x_1$



An Extreme Case

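Observed $x_1$ is all noise, $\sigma_e^2 \to \infty$. Then its coefficient will be zero
Then we get:

\[
\operatorname{plim}\hat\beta_2 = \frac{\sigma_{12}}{\sigma_2^2}\,\beta_1 + \beta_2 = \beta_2 + \beta_1\,\frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_2)}
\]

This is the formula for the omitted variables bias!
If $x_1$ is only “noise” then including it is equivalent to omitting it from the regression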
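The limit can be made explicit by dividing the numerator and denominator of the two-variable formulas by $\sigma_e^2$. This is a short sketch using only the expressions given earlier for the two-variable case:

```latex
\[
\operatorname{plim}\hat\beta_2
= \beta_2 + \beta_1\,\frac{\sigma_e^2\sigma_{12}}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2 + \sigma_e^2\sigma_2^2}
= \beta_2 + \beta_1\,\frac{\sigma_{12}}{\dfrac{\sigma_1^2\sigma_2^2 - \sigma_{12}^2}{\sigma_e^2} + \sigma_2^2}
\;\xrightarrow[\;\sigma_e^2\to\infty\;]{}\;
\beta_2 + \beta_1\,\frac{\sigma_{12}}{\sigma_2^2}
\]
\[
\operatorname{plim}\hat\beta_1
= \beta_1\,\frac{\sigma_1^2\sigma_2^2 - \sigma_{12}^2}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2 + \sigma_e^2\sigma_2^2}
\;\xrightarrow[\;\sigma_e^2\to\infty\;]{}\; 0
\]
```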



Measurement Error: An Example
Estimate the use-of-checks equation in the social capital dataset
Social capital is measured by participation in referenda. These are administrative
data ⇒ very little (if any) measurement error
One can generate “measurement error” and add it to trust, our measure of social
capital
We generate a standard normal random variable in Stata
  - set seed 2565
  - gen normals = invnorm(uniform())
We also generate three different measures of social capital with error
  - gen noise2 = trust1+normals/2
  - gen noise10 = trust1+normals/10
  - gen noise100 = trust1+normals/100
The (significant) correlation between trust and noise2, noise10, and noise100 is
0.1586, 0.6271, and 0.9924 respectively



The use of checks in Italy
Dependent variable: Use of Checks

Specification             Correct       Large error   Medium error  Small error
Social capital            0.9440        0.0127        0.2641        0.9188
                          (5.785)**     (1.681)       (3.760)**     (5.764)**
Age                       -0.0034       -0.0033       -0.0033       -0.0034
                          (-7.740)**    (-7.406)**    (-7.584)**    (-7.751)**
Married                   0.0589        0.0511        0.0523        0.0583
                          (5.111)**     (3.930)**     (4.245)**     (5.077)**
Male                      0.0197        0.0192        0.0198        0.0199
                          (2.110)**     (1.814)*      (1.918)*      (2.114)**
Years of education        0.0287        0.0283        0.0284        0.0287
                          (15.852)**    (15.528)**    (15.521)**    (15.809)**
Household wealth          0.0447        0.0410        0.0424        0.0448
                          (1.429)       (1.209)       (1.272)       (1.427)
Household income          0.0057        0.0063        0.0061        0.0058
                          (14.645)**    (14.403)**    (15.047)**    (14.722)**
Judicial Efficiency       -0.0130       -0.0481       -0.0386       -0.0139
                          (-1.138)      (-3.647)**    (-3.016)**    (-1.212)
Constant                  -0.4778       0.3845        0.1529        -0.4540
                          (-2.995)**    (4.890)**     (1.401)       (-2.878)**
Number of observations    32442         32442         32442         32442
R-squared adjusted        0.277         0.263         0.267         0.277

Robust t-statistics clustered at the regional level in parentheses
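A back-of-the-envelope check links the reported correlations to the table. Under classical measurement error the squared correlation between the true and the noisy measure equals the reliability ratio Var(X*)/[Var(X*) + Var(e)]. The Python lines below apply the single-regressor attenuation formula to the reported numbers. This is only a rough benchmark, since the regressions include other covariates, and the dictionary names are illustrative.

```python
# Values reported in the lecture
corr = {"noise2": 0.1586, "noise10": 0.6271, "noise100": 0.9924}
reported = {"noise2": 0.0127, "noise10": 0.2641, "noise100": 0.9188}
beta_correct = 0.9440   # social capital coefficient in the correct specification

reliability = {}
naive = {}
for name, r in corr.items():
    reliability[name] = r ** 2                      # corr(x*, x)^2 = reliability ratio
    naive[name] = beta_correct * reliability[name]  # single-regressor prediction
    print(name, round(reliability[name], 4), round(naive[name], 4), reported[name])
```

In all three cases the reported coefficient lies below the single-regressor benchmark, which is in line with the earlier result that attenuation worsens when correlated regressors are included.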



3.3 Measurement Error in the Dependent Variable

Suppose there is classical measurement error in y: $y = y^* + e$
Assume that the error e is uncorrelated with $y^*$ and X (i.e. $E(e \mid y^*, X) = 0$)
Then, with $y^* = X\beta + u$, we have:

\[ y = X\beta + e + u \]

Since X is uncorrelated with e, OLS is consistent!
However, we lose precision, since the error term has a larger variance due
to the “low quality” of our data
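A minimal Python simulation (standard library only) of this case. The design is assumed for illustration: beta = 1, Var(u) = 1, and a measurement error in y with standard deviation 2; `ols_slope_and_se` is a hypothetical helper that returns the slope and its conventional (homoskedastic) standard error.

```python
import random


def ols_slope_and_se(x, y):
    """Slope and conventional standard error from a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rss = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return b1, se


rng = random.Random(7)
n, beta = 50_000, 1.0
x = [rng.gauss(0, 1) for _ in range(n)]
y_star = [beta * a + rng.gauss(0, 1) for a in x]     # correctly measured outcome
y_obs = [b + rng.gauss(0, 2) for b in y_star]        # y = y* + e, Var(e) = 4

slope_clean, se_clean = ols_slope_and_se(x, y_star)
slope_noisy, se_noisy = ols_slope_and_se(x, y_obs)
print("clean:", round(slope_clean, 3), round(se_clean, 4))
print("noisy:", round(slope_noisy, 3), round(se_noisy, 4))
```

Both slopes should be close to beta, while the standard error with the noisy outcome is larger by roughly the factor sqrt(Var(u) + Var(e)) / sqrt(Var(u)).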



The use of checks in Italy
Dependent variable: Use of Checks

Specification             Correct       Error in LHS
Social capital            0.9440        0.9186
                          (5.785)**     (4.583)**
Age                       -0.0034       -0.0028
                          (-7.740)**    (-5.168)**
Married                   0.0589        0.1006
                          (5.111)**     (4.477)**
Male                      0.0197        -0.0031
                          (2.110)**     (-0.148)
Years of education        0.0287        0.0286
                          (15.852)**    (15.257)**
Household wealth          0.0447        0.0252
                          (1.429)       (0.670)
Household income          0.0057        0.0062
                          (14.645)**    (9.607)**
Judicial Efficiency       -0.0130       -0.0143
                          (-1.138)      (-1.161)
Constant                  -0.4778       -0.5078
                          (-2.995)**    (-2.611)**
Number of observations    32442         32442
R-squared adjusted        0.277         0.056

Robust t-statistics clustered at the regional level in parentheses

