
BSc Intermediate Econometrics

26/03/2013
Please Do Not Distribute.
Week 12
Ken Yamada
Singapore Management University
Plan
Chapter 17: Limited Dependent Variable Models and Sample
Selection Corrections
Chapter 17
Limited Dependent Variable Models
and Sample Selection Corrections
Chapter 17
Learning Objectives
After successfully completing this chapter, you will be able to
1. Understand how to estimate models for binary data and censored data.
2. Understand how to test and correct for sample selection bias.
Limited Dependent Variable Models
A limited dependent variable (LDV) is broadly defined as a
dependent variable whose range of values is substantially restricted.
A binary dependent variable is an example of an LDV. Example?
Optimizing behavior often leads to a corner solution response for
some nontrivial fraction of the population.
Example?
17.1 Logit and Probit Models
for Binary Response
In a binary response model, interest lies in the response probability:
$$\Pr(y = 1 \mid \mathbf{x}) = \Pr(y = 1 \mid x_1, \dots, x_K)$$
To avoid the limitations of the LPM (linear probability model), consider a class of binary response models of the form
$$\Pr(y = 1 \mid \mathbf{x}) = G(\beta_0 + \beta_1 x_1 + \dots + \beta_K x_K) = G(\beta_0 + \mathbf{x}\boldsymbol{\beta}),$$
where $0 < G(z) < 1$ for all real numbers $z$.
Specifying Logit and Probit Models
In the logit model, G is the logistic function:
$$G(z) = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z)$$
In the probit model, G is the standard normal cumulative distribution function (cdf):
$$G(z) = \Phi(z) = \int_{-\infty}^{z} \phi(v)\,dv,$$
where $\phi$ is the standard normal probability density function (pdf):
$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$
[Figure: density curve divided into areas A, B, and C]
A + B + C = 1
1 - A = B + C
1 - Pr(z < 1.5) = Pr(z < -1.5)
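As a quick numerical check of the two link functions and of the symmetry identity, the following Python sketch evaluates the logistic and standard normal cdfs with the standard library only. The function names `logistic_cdf` and `normal_cdf` are illustrative and do not come from the slides.

```python
import math

def logistic_cdf(z):
    """Logit link: exp(z) / (1 + exp(z))."""
    return math.exp(z) / (1.0 + math.exp(z))

def normal_cdf(z):
    """Probit link: standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Both links map the real line into (0, 1) and equal 0.5 at z = 0.
for z in (-1.5, 0.0, 1.5):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))

# Symmetry about zero: 1 - Pr(z < 1.5) equals Pr(z < -1.5).
print(abs((1.0 - normal_cdf(1.5)) - normal_cdf(-1.5)) < 1e-12)
```

The same symmetry property is what allows 1 - G(-z) to be replaced by G(z) in the latent variable derivation.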
A Latent Variable Model
These models can be derived from a latent variable model.
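$$y^* = \beta_0 + \mathbf{x}\boldsymbol{\beta} + e, \qquad y = 1[y^* > 0],$$
where the indicator function $1[\cdot]$ takes on the value one if the event in brackets is true and zero otherwise.
The response probability is
$$\Pr(y = 1 \mid \mathbf{x}) = \Pr(y^* > 0 \mid \mathbf{x}) = \Pr(e > -(\beta_0 + \mathbf{x}\boldsymbol{\beta}) \mid \mathbf{x}) = 1 - G[-(\beta_0 + \mathbf{x}\boldsymbol{\beta})] = G(\beta_0 + \mathbf{x}\boldsymbol{\beta}),$$
where $G$ is the cdf of $e$, $e$ is independent of $\mathbf{x}$, and the last equality uses the symmetry of the distribution of $e$ about zero.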
Partial (Marginal) Effect
Denote $p(\mathbf{x}) = \Pr(y = 1 \mid \mathbf{x})$.
For a roughly continuous variable $x_j$, its partial effect is
$$\frac{\partial p(\mathbf{x})}{\partial x_j} = g(\beta_0 + \mathbf{x}\boldsymbol{\beta})\,\beta_j, \quad \text{where } g(z) = \frac{dG(z)}{dz}.$$
For a binary explanatory variable $x_1$, its partial effect is
$$G(\beta_0 + \beta_1 + \beta_2 x_2 + \dots + \beta_K x_K) - G(\beta_0 + \beta_2 x_2 + \dots + \beta_K x_K).$$
Appendix C.4 Maximum Likelihood
Let $\{Y_1, \dots, Y_N\}$ be a random sample from the population distribution $f(y; \theta)$.
The joint distribution of $\{Y_1, \dots, Y_N\}$ is the product of the densities:
$$f(Y_1; \theta) \cdots f(Y_N; \theta).$$
Define the likelihood function as
$$L(\theta; Y_1, \dots, Y_N) = f(Y_1; \theta) \cdots f(Y_N; \theta).$$
The maximum likelihood estimator of $\theta$ is the value of $\theta$ that maximizes the likelihood function.
Appendix C.4 Maximum Likelihood
The maximum likelihood principle says that, out of all the possible values for $\theta$, the value that makes the likelihood of the observed data largest should be chosen.
The log-likelihood function is obtained by taking the natural log of the likelihood function:
$$\log[L(\theta; Y_1, \dots, Y_N)] = \sum_{i=1}^{N} \log[f(Y_i; \theta)]$$
The MLE is generally the most asymptotically efficient estimator when the population model $f(y; \theta)$ is correctly specified.
Appendix 17.A Maximum Likelihood Estimation
with Explanatory Variables
Let $f(y; \mathbf{x}, \boldsymbol{\beta})$ denote the density function for a random draw $y_i$ from the population, conditional on $\mathbf{x} = \mathbf{x}_i$.
The maximum likelihood estimator of $\boldsymbol{\beta}$ maximizes the log-likelihood function:
$$\max_{\mathbf{b}} \sum_{i=1}^{N} \log f(y_i; \mathbf{x}_i, \mathbf{b})$$
The MLE is consistent and has an approximate normal distribution in large samples.
Maximum Likelihood Estimation of
Logit and Probit Models
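For the binary response case, the conditional density is determined by two values ($\beta_0$ is suppressed for notational simplicity),
$$f(1; \mathbf{x}, \boldsymbol{\beta}) = \Pr(y_i = 1 \mid \mathbf{x}_i) = G(\mathbf{x}_i\boldsymbol{\beta})$$
and
$$f(0; \mathbf{x}, \boldsymbol{\beta}) = \Pr(y_i = 0 \mid \mathbf{x}_i) = 1 - G(\mathbf{x}_i\boldsymbol{\beta})$$
The succinct way to write the density is
$$f(y \mid \mathbf{x}_i; \boldsymbol{\beta}) = [G(\mathbf{x}_i\boldsymbol{\beta})]^{y}\,[1 - G(\mathbf{x}_i\boldsymbol{\beta})]^{1 - y} \quad \text{for } y = 0, 1.$$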
Maximum Likelihood Estimation of
Logit and Probit Models
The log-likelihood function for observation i is
$$\ell_i(\boldsymbol{\beta}) = y_i \log[G(\mathbf{x}_i\boldsymbol{\beta})] + (1 - y_i)\log[1 - G(\mathbf{x}_i\boldsymbol{\beta})]$$
The log-likelihood for a sample size of N is
$$L(\boldsymbol{\beta}) = \sum_{i=1}^{N} \ell_i(\boldsymbol{\beta})$$
The MLE is the logit (probit) estimator if $G(\cdot)$ is the standard logistic (normal) cdf.
Because $0 \le G(\cdot) \le 1$, $\log[G(\cdot)] \le 0$ and $\log[1 - G(\cdot)] \le 0$. The log-likelihood is bounded above by zero.
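The binary-response log-likelihood can be evaluated directly from its definition. The sketch below is a minimal illustration with a single regressor, no intercept, and hypothetical toy data (`xs`, `ys`); the function names are illustrative and the grid of candidate values for the coefficient is arbitrary.

```python
import math

def logit_cdf(z):
    """Logistic cdf, used here as the link function G."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, xs, ys, G=logit_cdf):
    """Sum over i of y_i*log G(x_i*beta) + (1 - y_i)*log[1 - G(x_i*beta)]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = G(x * beta)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical toy data: one regressor, no intercept.
xs = [-2.0, -1.0, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1]

# Every value is at most zero; the MLE is the beta with the largest value.
for b in (0.0, 0.5, 1.0):
    print(b, log_likelihood(b, xs, ys))
```

At beta = 0 every fitted probability is 0.5, so the log-likelihood equals N log(0.5), which is a convenient sanity check for any implementation.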
Testing Multiple Hypotheses
The likelihood ratio (LR) test is based on the difference in the log-likelihood functions for the unrestricted and restricted models.
Dropping variables leads to a smaller (or at least no larger) log-likelihood.
The likelihood ratio statistic is twice the difference in the log-likelihoods:
$$LR = 2(L_{ur} - L_r)$$
LR is non-negative because $L_{ur} \ge L_r$, and it has an approximate chi-square distribution under the null.
The degrees of freedom equal the number of restrictions.
Interpreting the Logit and Probit Estimates
McFadden (1974) defines the pseudo R-squared as
$$1 - \frac{L_{ur}}{L_0},$$
where $L_0$ is the log-likelihood function in the model with only an intercept.
Recall that the log-likelihood is negative. Thus $|L_{ur}| \le |L_0|$ and
$$0 \le \frac{L_{ur}}{L_0} \le 1.$$
If the covariates have no explanatory power, $L_{ur} = L_0$ and the pseudo $R^2$ is zero.
The Random Utility Model
The discrete choice model can be derived from utility-maximizing behavior.
Consider an individual i who faces a binary choice and obtains utility $U_{ij}$ from alternative j = 1, 2.
Utility is decomposed as $U_{ij} = V_{ij} + \varepsilon_{ij}$, where $V_{ij}$ is observed and $\varepsilon_{ij}$ is unobserved.
Assume linear utility: $V_{ij} = \mathbf{x}_{ij}\boldsymbol{\beta}$
Define $\mathbf{x}_i = \mathbf{x}_{i1} - \mathbf{x}_{i2}$ and $u_i = \varepsilon_{i1} - \varepsilon_{i2}$
Let $G(\cdot)$ denote the distribution function of $u_i$
Choice Probability
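The probability that individual i chooses alternative 1 (y = 1) is
$$\begin{aligned}
\Pr(y_i = 1) &= \Pr(U_{i1} > U_{i2}) \\
&= \Pr(\mathbf{x}_{i1}\boldsymbol{\beta} + \varepsilon_{i1} > \mathbf{x}_{i2}\boldsymbol{\beta} + \varepsilon_{i2}) \\
&= \Pr(\varepsilon_{i1} - \varepsilon_{i2} > -(\mathbf{x}_{i1} - \mathbf{x}_{i2})\boldsymbol{\beta}) \\
&= \Pr(u_i > -\mathbf{x}_i\boldsymbol{\beta}) \\
&= 1 - G(-\mathbf{x}_i\boldsymbol{\beta}) \\
&= G(\mathbf{x}_i\boldsymbol{\beta})
\end{aligned}$$
The last step uses the symmetry of the distribution of $u_i$ about zero.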
Example: Transportation Mode Choice
Consider a person who can take either a car or a bus to work.
Suppose that the linear utility is specified as
$$V_{ij} = \beta_1 t_{ij} + \beta_2 m_{ij} + \beta_3 q_i \quad \text{for } j = c, b,$$
where t is time, m is cost, and q is income.
The probability that the person takes a car (y = 1) is
$$\Pr(y_i = 1) = \Pr(U_{ic} > U_{ib}) = G(\mathbf{x}_i\boldsymbol{\beta}) = G(\beta_1 (t_{ic} - t_{ib}) + \beta_2 (m_{ic} - m_{ib}))$$
The income term $\beta_3 q_i$ drops out of the difference because it does not vary across alternatives.
Example 17.1
Married Women's Labor Force Participation
Three models are used to estimate labor force participation among
married women.
regress inlf nwifeinc educ exper expersq age kidslt6 kidsge6
logit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
The magnitudes of the coefficients across models are not directly comparable.
Report the partial effects instead.
Partial (Marginal) Effects in Logit
. dlogit2 inlf nwifeinc educ exper expersq age kidslt6 kidsge6

Marginal effects from logit                  Number of obs =    753
                                             chi2(7)       = 152.02
                                             Prob > chi2   = 0.0000
Log Likelihood = -401.76515                  Pseudo R2     = 0.2197

    inlf |     Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
---------+---------------------------------------------------------------
nwifeinc | -.0051901   .0020482   -2.53   0.011   -.0092045   -.0011756
    educ |  .0537773   .0105608    5.09   0.000    .0330785    .0744761
   exper |  .0500569   .0078247    6.40   0.000    .0347209     .065393
 expersq | -.0007669   .0002477   -3.10   0.002   -.0012524   -.0002815
     age |  -.021403   .0035398   -6.05   0.000   -.0283408   -.0144652
 kidslt6 | -.3509498   .0496395   -7.07   0.000   -.4482414   -.2536583
 kidsge6 |  .0146162   .0181884    0.80   0.422   -.0210324    .0502649
   _cons |  .1034482   .2090356    0.49   0.621    -.306254    .5131505

Marginal effects evaluated at
  | nwifeinc      educ     exper   expersq       age   kidslt6   kidsge6   _cons
x | 20.12896  12.28685  10.63081  178.0385  42.53785  .2377158  1.353254       1
Partial (Marginal) Effects in Probit
. dprobit2 inlf nwifeinc educ exper expersq age kidslt6 kidsge6

Marginal effects from probit                 Number of obs =    753
                                             chi2(7)       = 177.68
                                             Prob > chi2   = 0.0000
Log Likelihood = -401.30219                  Pseudo R2     = 0.2206

    inlf |     Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
---------+---------------------------------------------------------------
nwifeinc | -.0046962   .0018903   -2.48   0.013   -.0084012   -.0009913
    educ |  .0511287   .0098592    5.19   0.000    .0318051    .0704523
   exper |  .0481771   .0073278    6.57   0.000    .0338149    .0625392
 expersq | -.0007371   .0002347   -3.14   0.002    -.001197   -.0002771
     age | -.0206432   .0033079   -6.24   0.000   -.0271265   -.0141598
 kidslt6 | -.3391514   .0463581   -7.32   0.000   -.4300117   -.2482911
 kidsge6 |  .0140628   .0169852    0.83   0.408   -.0192275    .0473531
   _cons |  .1054865   .1985227    0.53   0.595   -.2836109    .4945838

Marginal effects evaluated at
  | nwifeinc      educ     exper   expersq       age   kidslt6   kidsge6   _cons
x | 20.12896  12.28685  10.63081  178.0385  42.53785  .2377158  1.353254       1
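The probit marginal effects can be traced back to the partial effect formula for a continuous regressor, the density g evaluated at the index times the coefficient. The Python sketch below assumes that the underlying probit coefficients are those reported for the `inlf` selection equation in the Heckman output of Example 17.5, which uses the same regressors and sample, and it evaluates the index at the sample means listed under "Marginal effects evaluated at". The dictionary names are illustrative.

```python
from statistics import NormalDist

# Probit coefficients for inlf, copied from the selection equation
# of the Heckman output in Example 17.5 (assumed to be the same model).
coef = {
    "nwifeinc": -0.0120237, "educ": 0.1309047, "exper": 0.1233476,
    "expersq": -0.0018871, "age": -0.0528527, "kidslt6": -0.8683285,
    "kidsge6": 0.036005, "_cons": 0.2700768,
}
# Sample means from the "Marginal effects evaluated at" row (constant = 1).
mean = {
    "nwifeinc": 20.12896, "educ": 12.28685, "exper": 10.63081,
    "expersq": 178.0385, "age": 42.53785, "kidslt6": 0.2377158,
    "kidsge6": 1.353254, "_cons": 1.0,
}

# Index at the means, then the scale factor phi(index).
index = sum(coef[k] * mean[k] for k in coef)
scale = NormalDist().pdf(index)

# Partial effect of each variable: phi(index) * beta_j.
partial = {k: scale * coef[k] for k in coef}
for name, value in partial.items():
    print(f"{name:9s} {value: .7f}")
```

Under these assumptions the products agree with the `dprobit2` column to the printed precision for `educ` and `nwifeinc`, which suggests that the command treats every regressor, including `kidslt6`, as continuous.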
17.2 The Tobit Model
for Corner Solution Responses
The Tobit (Tobin's probit) model expresses the observed response, y, in terms of an underlying latent variable:
$$y^* = \mathbf{x}\boldsymbol{\beta} + u, \qquad u \mid \mathbf{x} \sim \text{Normal}(0, \sigma^2)$$
$$y = \max(0, y^*).$$
The latent variable $y^*$ satisfies the classical linear model assumptions.
The observational rule implies that
1. The observed variable y equals $y^*$ when $y^* \ge 0$.
2. The observed variable y equals 0 when $y^* < 0$.
Likelihood
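1. The density of y given x is the same as the density of $y^*$ given x for positive values:
$$\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y - \mathbf{x}_i\boldsymbol{\beta})^2}{2\sigma^2}\right] = \frac{1}{\sigma}\,\phi\left(\frac{y - \mathbf{x}_i\boldsymbol{\beta}}{\sigma}\right)$$
2. Further, as in the probit model,
$$\Pr(y = 0 \mid \mathbf{x}) = 1 - \Phi(\mathbf{x}\boldsymbol{\beta}/\sigma)$$
The log-likelihood function for each observation i is
$$\ell_i(\boldsymbol{\beta}, \sigma) = 1(y_i = 0)\log\left[1 - \Phi\left(\frac{\mathbf{x}_i\boldsymbol{\beta}}{\sigma}\right)\right] + 1(y_i > 0)\log\left[\frac{1}{\sigma}\,\phi\left(\frac{y_i - \mathbf{x}_i\boldsymbol{\beta}}{\sigma}\right)\right]$$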
Law of Iterated Expectation
Recall
$$E(Y) = \sum_{x} E(Y \mid X = x)\Pr(X = x)$$
Suppose X takes values in {1, 2, 3} and each value occurs with probability 1/3.
Then
$$E(Y) = \tfrac{1}{3}E(Y \mid X = 1) + \tfrac{1}{3}E(Y \mid X = 2) + \tfrac{1}{3}E(Y \mid X = 3)$$
Interpreting the Tobit Estimates
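Both the conditional expectation $E(y \mid y > 0, \mathbf{x})$ and the unconditional expectation $E(y \mid \mathbf{x})$ may be of interest.
$$E(y \mid \mathbf{x}) = \Pr(y > 0 \mid \mathbf{x})\,E(y \mid y > 0, \mathbf{x}) + \Pr(y = 0 \mid \mathbf{x})\,E(y \mid y = 0, \mathbf{x}) = \Phi(\mathbf{x}\boldsymbol{\beta}/\sigma)\,E(y \mid y > 0, \mathbf{x}),$$
since $E(y \mid y = 0, \mathbf{x}) = 0$.
$$\begin{aligned}
E(y \mid y > 0, \mathbf{x}) &= \mathbf{x}\boldsymbol{\beta} + E(u \mid u > -\mathbf{x}\boldsymbol{\beta}) \quad \text{(see Exercise 9a)} \\
&= \mathbf{x}\boldsymbol{\beta} + \sigma\left[\phi(\mathbf{x}\boldsymbol{\beta}/\sigma)/\Phi(\mathbf{x}\boldsymbol{\beta}/\sigma)\right] \\
&= \mathbf{x}\boldsymbol{\beta} + \sigma\lambda(\mathbf{x}\boldsymbol{\beta}/\sigma),
\end{aligned}$$
where $\lambda(c) = \phi(c)/\Phi(c)$ is the inverse Mills ratio.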
Partial Effects
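For a continuous variable $x_j$,
$$\frac{\partial E(y \mid \mathbf{x})}{\partial x_j} = \frac{\partial \Pr(y > 0 \mid \mathbf{x})}{\partial x_j}\,E(y \mid y > 0, \mathbf{x}) + \Pr(y > 0 \mid \mathbf{x})\,\frac{\partial E(y \mid y > 0, \mathbf{x})}{\partial x_j}$$
After some algebra,
$$\frac{\partial E(y \mid \mathbf{x})}{\partial x_j} = \beta_j\,\Phi\left(\frac{\mathbf{x}\boldsymbol{\beta}}{\sigma}\right)$$
See Exercise 9b.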
Example 17.2
Married Women's Annual Labor Supply
Of 753 married women, 428 worked for a wage outside the home
during the year; 325 worked zero hours.
The Tobit model may be suitable for the analysis of the number of
annual hours worked.
regress hours nwifeinc educ exper expersq age kidslt6 kidsge6
tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)
See Exercise 9c for another application.
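$$E(y \mid \mathbf{x}) = \Pr(y > 0 \mid \mathbf{x})\,E(y \mid y > 0, \mathbf{x}) = \Phi\left(\frac{\mathbf{x}\boldsymbol{\beta}}{\sigma}\right)\left[\mathbf{x}\boldsymbol{\beta} + \sigma\,\frac{\phi(\mathbf{x}\boldsymbol{\beta}/\sigma)}{\Phi(\mathbf{x}\boldsymbol{\beta}/\sigma)}\right] = \Phi\left(\frac{\mathbf{x}\boldsymbol{\beta}}{\sigma}\right)\mathbf{x}\boldsymbol{\beta} + \sigma\,\phi\left(\frac{\mathbf{x}\boldsymbol{\beta}}{\sigma}\right)$$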
Tobit Estimates
. dtobit2 hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)

Tobit estimates                              Number of obs =    753
                                             LR chi2(7)    = 271.59
                                             Prob > chi2   = 0.0000
Log likelihood = -3819.0946                  Pseudo R2     = 0.0343

   hours |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
---------+---------------------------------------------------------------
nwifeinc | -8.814243   4.459096   -1.98   0.048   -17.56811   -.0603724
    educ |  80.64561   21.58322    3.74   0.000    38.27453    123.0167
   exper |  131.5643   17.27938    7.61   0.000    97.64231    165.4863
 expersq | -1.864158   .5376615   -3.47   0.001   -2.919667   -.8086479
     age | -54.40501   7.418496   -7.33   0.000   -68.96862    -39.8414
 kidslt6 | -894.0217   111.8779   -7.99   0.000   -1113.655   -674.3887
 kidsge6 |   -16.218   38.64136   -0.42   0.675   -92.07675    59.64075
   _cons |  965.3053   446.4358    2.16   0.031    88.88528    1841.725
---------+---------------------------------------------------------------
     _se |  1122.022   41.57903           (Ancillary parameter)

Obs. summary:   325 left-censored observations at hours<=0
                428 uncensored observations
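The LR statistic and the pseudo R-squared in the Tobit output are linked through the definitions given earlier. As a small sketch, the Python lines below back out the intercept-only log-likelihood from the reported values, on the assumption that the reported LR chi2(7) compares the full model with the intercept-only model, and then recompute McFadden's pseudo R-squared. The variable names are illustrative.

```python
# Values copied from the Tobit output above.
loglik_ur = -3819.0946   # unrestricted log-likelihood
lr_stat = 271.59         # LR chi2(7)

# LR = 2 * (L_ur - L_0)  implies  L_0 = L_ur - LR / 2.
loglik_0 = loglik_ur - lr_stat / 2.0

# McFadden's pseudo R-squared: 1 - L_ur / L_0.
pseudo_r2 = 1.0 - loglik_ur / loglik_0

print(round(loglik_0, 4), round(pseudo_r2, 4))
```

The recomputed value rounds to the 0.0343 shown in the output, which also illustrates that a pseudo R-squared near zero can coexist with a large LR statistic.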
Partial (Marginal) Effects
Marginal Effects at Observed Censoring Rate

---------+--------------------------------------------------------
         |  Unconditional    Conditional on      Probability
    Name | Expected Value   being Uncensored     Uncensored
---------+--------------------------------------------------------
nwifeinc |   -5.0099548       -3.5489156          -.0030878
    educ |    45.838405        32.470679          .02825167
   exper |    74.780239        52.972286          .04608945
 expersq |   -1.0595743       -.75057361         -.00065305
     age |   -30.923433       -21.905318          -.0190591
 kidslt6 |   -508.15578       -359.96373         -.31319264
 kidsge6 |   -9.2181969       -6.5299199         -.00568147
---------+--------------------------------------------------------

The three columns report, respectively,
$$\frac{\partial E(y \mid \mathbf{x})}{\partial x_j}, \qquad \frac{\partial E(y \mid y > 0, \mathbf{x})}{\partial x_j}, \qquad \frac{\partial \Pr(y > 0 \mid \mathbf{x})}{\partial x_j}.$$
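The three marginal effects for `nwifeinc` can be approximately reproduced from the Tobit coefficient and the estimate of sigma. The sketch below rests on two assumptions that are not stated on the slides: that the effects are evaluated at the point c where Phi(c) equals the observed share of uncensored observations (428 of 753, as the table title suggests), and that the conditional effect uses the standard Tobit result beta_j * {1 - lambda(c) * [c + lambda(c)]}. Variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Values copied from the Tobit output: nwifeinc coefficient and _se.
beta_j = -8.814243
sigma = 1122.022

# Assumption: evaluate at the point c where Phi(c) equals the observed
# share of uncensored observations, 428 out of 753.
prob_uncensored = 428 / 753
c = std_normal.inv_cdf(prob_uncensored)
lam = std_normal.pdf(c) / prob_uncensored  # inverse Mills ratio at c

unconditional = beta_j * prob_uncensored          # dE(y|x)/dx_j
conditional = beta_j * (1.0 - lam * (c + lam))    # dE(y|y>0,x)/dx_j
probability = beta_j * std_normal.pdf(c) / sigma  # dPr(y>0|x)/dx_j

print(unconditional, conditional, probability)
```

The first line of the calculation is the slide's result that the unconditional effect is the coefficient scaled by the probability of being uncensored, so the Tobit coefficient itself overstates the effect on observed hours.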
Specification Issues in Tobit Models
If any of the assumptions of the model specification fails, then the Tobit MLE may be biased and inconsistent.
One limitation of the Tobit model is that the expected value conditional on y > 0 is tied closely to the probability that y > 0: the same coefficients govern both, so a variable cannot raise one while lowering the other.
Fixed costs of work
17.5 Sample Selection Corrections
The truncated regression model is a special case of nonrandom sample selection.
Incidental truncation occurs when we do not observe y because of the outcome of another variable.
The leading example is estimating the wage offer function.
When Is OLS on the Selected Sample Consistent?
We write the population model for a random draw as
$$y_i = \mathbf{x}_i\boldsymbol{\beta} + u_i, \qquad E(u \mid \mathbf{x}) = 0.$$
Define a selection indicator $s_i$ for each i by $s_i = 1$ if we observe all of $(y_i, \mathbf{x}_i)$ and $s_i = 0$ otherwise.
The observation will be used if $s_i = 1$ and will not be used if $s_i = 0$.
We are interested in the statistical properties of the OLS estimators using the selected sample.
Exogenous Sample Selection
We can only estimate the equation
$$s_i y_i = s_i \mathbf{x}_i\boldsymbol{\beta} + s_i u_i$$
For the OLS estimators to be consistent,
$$E(su) = 0,$$
$$E[(s x_j)(s u)] = E(s x_j u) = 0,$$
where the second equality uses $s^2 = s$.
The key condition for unbiasedness is $E(su \mid s x_1, \dots, s x_K) = 0$.
Incidental Truncation
The usual approach to incidental truncation is to add an explicit selection equation to the population model of interest.
$$y = \mathbf{x}\boldsymbol{\beta} + u, \qquad E(u \mid \mathbf{x}) = 0$$
$$s = 1[\mathbf{z}\boldsymbol{\gamma} + v \ge 0].$$
A standard assumption is that z is exogenous in the equation of primary interest (outcome equation).
$$E(u \mid \mathbf{x}, \mathbf{z}) = E(u \mid \mathbf{z}) = 0$$
Heckman (1979)
Sample Selection Bias as a Specification Error
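Suppose that x is a subset of z (exclusion restriction). Taking the conditional expectation of y given z and v yields
$$E(y \mid \mathbf{z}, v) = \mathbf{x}\boldsymbol{\beta} + E(u \mid \mathbf{z}, v) = \mathbf{x}\boldsymbol{\beta} + E(u \mid v)$$
Under the joint normality assumption, $E(u \mid v) = \rho v$. Thus,
$$E(y \mid \mathbf{z}, v) = \mathbf{x}\boldsymbol{\beta} + \rho v$$
Conditioning on s,
$$E(y \mid \mathbf{z}, s) = \mathbf{x}\boldsymbol{\beta} + \rho E(v \mid \mathbf{z}, s)$$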
Sample Selection Correction
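When s = 1,
$$E(y \mid \mathbf{z}, s = 1) = \mathbf{x}\boldsymbol{\beta} + \rho E(v \mid v \ge -\mathbf{z}\boldsymbol{\gamma}) = \mathbf{x}\boldsymbol{\beta} + \rho\lambda(\mathbf{z}\boldsymbol{\gamma}).$$
The inverse Mills ratio, $\lambda(\mathbf{z}\boldsymbol{\gamma}) = \phi(\mathbf{z}\boldsymbol{\gamma})/\Phi(\mathbf{z}\boldsymbol{\gamma})$, can be estimated from the probit model: $\Pr(s = 1 \mid \mathbf{z}) = \Phi(\mathbf{z}\boldsymbol{\gamma})$
i. Using all N observations, estimate the probit model. Compute the inverse Mills ratio $\hat{\lambda}_i$ for each i.
ii. Using the selected sample, run the regression of y on x and $\hat{\lambda}$.
There is no sample selection problem under $H_0$: $\rho = 0$.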
Example 17.5
Wage Offer Equation for Married Women
. heckman lwage educ exper expersq, sel(inlf = nwifeinc educ exper expersq age kidslt6 kidsge6) twostep

Heckman selection model -- two-step estimates   Number of obs  =    753
(regression model with sample selection)        Censored obs   =    325
                                                Uncensored obs =    428
                                                Wald chi2(3)   =  51.53
                                                Prob > chi2    = 0.0000

         |     Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
---------+---------------------------------------------------------------
lwage    |
    educ |  .1090655    .015523    7.03   0.000    .0786411      .13949
   exper |  .0438873   .0162611    2.70   0.007    .0120163    .0757584
 expersq | -.0008591   .0004389   -1.96   0.050   -.0017194    1.15e-06
   _cons | -.5781033   .3050062   -1.90   0.058   -1.175904    .0196979
---------+---------------------------------------------------------------
inlf     |
nwifeinc | -.0120237   .0048398   -2.48   0.013   -.0215096   -.0025378
    educ |  .1309047   .0252542    5.18   0.000    .0814074     .180402
   exper |  .1233476   .0187164    6.59   0.000    .0866641    .1600311
 expersq | -.0018871      .0006   -3.15   0.002    -.003063   -.0007111
     age | -.0528527   .0084772   -6.23   0.000   -.0694678   -.0362376
 kidslt6 | -.8683285   .1185223   -7.33   0.000   -1.100628    -.636029
 kidsge6 |   .036005   .0434768    0.83   0.408    -.049208    .1212179
   _cons |  .2700768    .508593    0.53   0.595   -.7267472    1.266901
---------+---------------------------------------------------------------
mills    |
  lambda |  .0322619   .1336246    0.24   0.809   -.2296376    .2941613
---------+---------------------------------------------------------------
     rho |   0.04861
   sigma | .66362876
  lambda | .03226186   .1336246
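The test for sample selection in this output is the z-test on the coefficient of the inverse Mills ratio, which Stata labels `lambda`. The Python sketch below recomputes the z statistic and its two-sided p-value from the reported estimate and standard error, checks that `lambda` is approximately `rho` times `sigma` (the reported `rho` is rounded), and defines the inverse Mills ratio itself. The function and variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

def inverse_mills(c):
    """Inverse Mills ratio: phi(c) / Phi(c)."""
    return std_normal.pdf(c) / std_normal.cdf(c)

# Values copied from the Heckman two-step output above.
lam_hat, lam_se = 0.0322619, 0.1336246
rho_hat, sigma_hat = 0.04861, 0.66362876

# z-test of H0: no sample selection (coefficient on the Mills ratio is zero).
z_stat = lam_hat / lam_se
p_value = 2.0 * (1.0 - std_normal.cdf(abs(z_stat)))
print(round(z_stat, 2), round(p_value, 3))

# The reported lambda is (up to rounding of rho) rho * sigma.
print(rho_hat * sigma_hat)

# The ratio at a zero index, as a reference value.
print(inverse_mills(0.0))
```

With a p-value this large, the data give no evidence against the null of no sample selection in the wage offer equation, so OLS on the selected sample and the two-step estimates should be similar here.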
See Exercise 9d for another application.
Gender Wage Gap and Selection into Employment
(Mulligan and Rubinstein, 2008)
There has been a large change in wage inequality within and
between genders in the United States and elsewhere.
1. Male wage inequality increased.
2. The gender wage gap decreased.
Among women, there has been a substantial increase in
1. Employment (labor supply)
2. Educational attainment (human capital)
Does the change in the composition of the female workforce affect
the change in gender wage gap?
How would the gender wage gap have changed if there were no change in the
workforce composition?
[Diagram labels: within-gender wage gap; human capital investment; high-skilled female workers; gender wage gap]
Chapter 17
Learning Objectives
After successfully completing this chapter, you will be able to
1. Understand how to estimate models for binary data and censored data.
2. Understand how to test and correct for sample selection bias.
Exercise 9 (Self-Test 9)
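a. Show $E(u \mid u > -\mathbf{x}\boldsymbol{\beta}) = \sigma\,\phi(\mathbf{x}\boldsymbol{\beta}/\sigma)/\Phi(\mathbf{x}\boldsymbol{\beta}/\sigma)$
b. (optional) Consider the Tobit model. Show
$$\frac{\partial E(y \mid \mathbf{x})}{\partial x_j} = \beta_j\,\Phi\left(\frac{\mathbf{x}\boldsymbol{\beta}}{\sigma}\right)$$
c. Problem 17.4
d. Problem 17.6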
Consultation Hours until Week 13
We have consultation hours in Weeks 12 and 13, just as usual.
a) My consultation hours are
Mondays and Thursdays 4-5pm
b) TA's consultation hours are
Wednesdays 10am-12pm at SOE GSR 4-7
Consultation Hours in Weeks 14 and 15
If you want to ask a question after Week 13, please come to the following consultation hours.
a) My consultation hours are
3-5pm on Tuesday in Week 15.
b) TA's consultation hours are
10am-12pm at SOE GSR 4-7 on Wednesday in Week 14.
Please send an email to [jayen.chua.2009@economics.smu.edu.sg] by midnight the day before.