Lecture 8
Models for Censored and Truncated Data - Tobit Model
No censoring or truncation
[Figure: scatter plots of y against x for the full, uncensored sample]
[Figure: density of St with threshold Ē — Prob(St ≤ Ē) vs. the observed part, Prob(St > Ē)]
[Figure: scatter plot of y against x under censoring]
[Figure: pdf of y*, f(y*), split at 5 into Prob(y* < 5) and Prob(y* > 5)]
• The pdf of the observable variable, y, is a mixture of a discrete distribution (a probability mass at y = 5) and a continuous distribution (over the region where y* > 5).
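This mixture is easy to see by simulation. A minimal Python sketch (the values µ = 6, σ = 2 are illustrative, not from the lecture): censoring y* from below at 5 puts a point mass at 5 of size Φ((5 − µ)/σ) and leaves a continuous density above 5.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu, sigma, c = 6.0, 2.0, 5.0            # illustrative parameters
ystar = rng.normal(mu, sigma, 100_000)  # latent variable y*
y = np.maximum(ystar, c)                # censoring from below at c = 5

mass_at_c = np.mean(y == c)             # discrete part: P(y = 5) = P(y* <= 5)
print(mass_at_c, norm.cdf((c - mu) / sigma))  # the two should be close
```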
RS – Lecture 17
[Figure: pdf of y*, split at 5 into Prob(y* < 5) and Prob(y* > 5)]
Truncated
[Figure: scatter plot of y against x for the truncated sample]
Censored Normal
• Moments:
Let y* ~ N(µ*, σ²) and α = (c – µ*)/σ. Then
- First moment
- Conditional: E[y | y* > c] = µ* + σ λ(α)
- Unconditional: E[y] = Φ(α) c + (1 – Φ(α)) [µ* + σ λ(α)]
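The unconditional-mean formula can be checked numerically. A minimal Python sketch (the values of µ*, σ, and c below are illustrative, not the lecture's):

```python
import numpy as np
from scipy.stats import norm

mu, sigma, c = 1.0, 2.0, 0.0                    # illustrative parameters
alpha = (c - mu) / sigma
lam = norm.pdf(alpha) / (1 - norm.cdf(alpha))   # inverse Mills ratio lambda(alpha)

# Closed-form unconditional mean of the censored variable y = max(y*, c)
Ey = norm.cdf(alpha) * c + (1 - norm.cdf(alpha)) * (mu + sigma * lam)

# Monte Carlo check
rng = np.random.default_rng(42)
y = np.maximum(rng.normal(mu, sigma, 1_000_000), c)
print(Ey, y.mean())                             # should agree closely
```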
Censored Normal
• To get the first moment, we use a useful result (proven later) for a truncated normal distribution. If v is a standard normal variable and the truncation is from below at c, a constant, then

E(v | v > c) = φ(c)/(1 – Φ(c)) = φ(–c)/Φ(–c)

[Figure: the inverse Mills ratio λ(c) = φ(c)/(1 – Φ(c)) plotted against c]
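This result, and the equivalence of the two expressions, can be verified by simulation. A short Python sketch (the truncation point c = 0.5 is an arbitrary choice):

```python
import numpy as np
from scipy.stats import norm

c = 0.5                                   # arbitrary truncation point
lam = norm.pdf(c) / (1 - norm.cdf(c))     # phi(c) / (1 - Phi(c))
lam2 = norm.pdf(-c) / norm.cdf(-c)        # equal by symmetry of phi and Phi

# Monte Carlo check of E(v | v > c) for standard normal v
rng = np.random.default_rng(1)
v = rng.standard_normal(2_000_000)
mc = v[v > c].mean()
print(lam, lam2, mc)
```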
Censored Normal
• Moments (continuation):
- Unconditional first moment:
E[y] = Φ(α) c + (1 – Φ(α)) [µ* + σ λ(α)]
- Second moment:

Var(y) = σ² (1 – Φ(α)) [ (1 – δ) + (α – λ)² Φ(α) ],  where δ = λ² – λα,  0 ≤ δ ≤ 1
Truncated Normal
• Moments:
Let y* ~ N(µ*, σ²) and α = (c – µ*)/σ.
- First moment:
E[y* | y* > c] = µ* + σ λ(α) <= This is the truncated regression.
=> If the truncation is from below – i.e., λ(α) > 0 – the mean of the truncated variable is greater than the original mean.
Note: For the standard normal distribution, λ(α) is the mean of the truncated distribution.
- Second moment:
Var[y* | y* > c] = σ² [1 – δ(α)], where δ(α) = λ(α) [λ(α) – α]
=> Truncation reduces the variance! This result is general; it applies to upper or lower truncation, given that 0 ≤ δ(α) ≤ 1.
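Both truncated moments can be checked against SciPy's truncated normal distribution. A small Python sketch (parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm, truncnorm

mu, sigma, c = 1.0, 2.0, 0.0                    # illustrative parameters
alpha = (c - mu) / sigma
lam = norm.pdf(alpha) / (1 - norm.cdf(alpha))   # lambda(alpha)
delta = lam * (lam - alpha)                     # delta(alpha)

mean_trunc = mu + sigma * lam                   # E[y* | y* > c]
var_trunc = sigma**2 * (1 - delta)              # Var[y* | y* > c]

# scipy's truncnorm takes standardized bounds (a, b); truncation from below at alpha
dist = truncnorm(alpha, np.inf, loc=mu, scale=sigma)
print(mean_trunc, dist.mean())
print(var_trunc, dist.var())
```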
Truncated Normal
Model: yi* = Xiβ + εi
Data: y = y* | y* > 0
[Figure: latent density f(y*|X), with mass F(0|X) below zero, vs. the truncated density f(y|y*>0,X); the mean shifts from Xiβ to Xiβ + σλ]
• Truncated regression model:
E(yi| yi* > 0,Xi) = Xiβ + σλi
Corner Solutions
- These data sets are typical of microeconomics. We think of these data sets as the result of utility maximization problems with two decisions:
(1) Do something or not (work or not, buy IBM or not, donate or not)
=> This is a binary choice problem: the participation decision (y = 0 or y > 0).
(2) If yes, how much (how many hours to work, how much to donate)
=> This is the amount decision, observed only when y > 0.
Examples:
- Consumption (1, not 2)
- Wage changes (2, not 1)
- Labor supply (1 & 2)
Remark: The presented Tobit model – also called Type I Tobit model – can be written as a combination of two models:
(1) A Probit model: it determines whether y = 0 (No) or y > 0 (Yes).
(2) A truncated regression model for y > 0.
[Figure: y* and y plotted against education — when y* is negative, actual hours worked is zero.]
- Expectations of interest: E[y|x], E[y|y>0,x], E[y*|x].
=> The error term, υ, not only has a non-zero mean, it is also heteroskedastic.
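A consequence is that OLS on the censored sample is inconsistent. A quick Python simulation of the attenuation (the true parameters below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
beta0, beta1, sigma = -1.0, 1.0, 1.0      # hypothetical true parameters
x = rng.uniform(0, 4, n)
ystar = beta0 + beta1 * x + rng.normal(0, sigma, n)
y = np.maximum(ystar, 0)                  # censoring at zero (Tobit data)

# OLS of the censored y on x: the slope is biased toward zero
X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(b_ols[1])                           # noticeably below the true beta1 = 1
```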
zi = Xi′β/σ

ai = –(1/σ²) [ zi f(zi) – f(zi)²/(1 – F(zi)) – F(zi) ]

bi = (1/(2σ³)) [ zi² f(zi) + f(zi) – zi f(zi)²/(1 – F(zi)) ]

ci = (1/(4σ⁴)) [ –zi³ f(zi) + zi f(zi) – zi² f(zi)²/(1 – F(zi)) – 2F(zi) ]
[Figure: histogram of the residuals ("Error Distribution"), range 0–5000. Jarque-Bera = 107.9888, p-value = 0.0000, rejecting normality.]
2. ∂E(y|x)/∂xk = βk Φ(x′β/σ)
• We are interested in the overall effect rather than the effect for a
specific person in the data. Two ways to do this computation:
- At the sample average: Plug the mean of x in the above formula.
- Average of partial effects: Compute the partial effect for each
individual in the data. Then, compute the average.
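Both computations can be sketched in Python (the coefficients and data below are made up for illustration; in practice they would come from the Tobit fit):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical Tobit estimates and data, purely for illustration
beta = np.array([0.5, 1.2, -0.8])          # coefficients (incl. intercept)
sigma = 1.5
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])

k = 1                                      # partial effect of the first regressor
# (a) Partial effect at the sample average of x
pea = beta[k] * norm.cdf(X.mean(axis=0) @ beta / sigma)
# (b) Average partial effect: average beta_k * Phi(x_i'beta/sigma) over individuals
ape = np.mean(beta[k] * norm.cdf(X @ beta / sigma))
print(pea, ape)   # generally close but not identical, since Phi is nonlinear
```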
              Coef.      Std. Err.     t     P>|t|    [95% Conf. Interval]
nwifeinc   -8.814243     4.459096   -1.98   0.048    -17.56811   -.0603724
educ        80.64561     21.58322    3.74   0.000     38.27453    123.0167
exper       131.5643     17.27938    7.61   0.000     97.64231    165.4863
expersq    -1.864158     .5376615   -3.47   0.001    -2.919667   -.8086479
age        -54.40501     7.418496   -7.33   0.000    -68.96862    -39.8414
kidslt6    -894.0217     111.8779   -7.99   0.000    -1113.655   -674.3887
kidsge6     -16.218      38.64136   -0.42   0.675    -92.07675    59.64075
_cons       965.3053     446.4358    2.16   0.031     88.88528    1841.725
Partial effect at average of educ for working women: computing manually.
. *****************************
. * Compute the partial effect *
. * of educ on hours for       *
. * working women, at the      *
. * average, manually          *
. *****************************
. predict xbeta, xb
. egen avxbeta = mean(xbeta)
. gen avxbsig = avxbeta/_b[/sigma]
. gen lambda = normalden(avxbsig)/normal(avxbsig)
. gen partial = _b[educ]*(1 - lambda*(avxbsig + lambda))
. su partial
Partial effect at average of education for all observations: computed automatically.
. *****************************************
. * Compute the partial effect at average *
. * of education for the entire sample,   *
. * automatically                         *
. *****************************************
. mfx, predict(ystar(0,.)) varlist(educ)

Marginal effects after tobit
  y = E(hours*|hours>0) (predict, ystar(0,.))
    = 611.57078

variable     dy/dx      Std. Err.     z     P>|z|   [   95% C.I.   ]       X
educ        48.73409    12.963      3.76    0.000    23.3263  74.1419   12.2869
• Comparison, OLS vs. Tobit (effect of educ on hours):
- OLS: β̂k,OLS = 28.76
- Tobit: βk = 80.65, Φ(x̄′β/σ) = 0.604, partial effect βk Φ(x̄′β/σ) = 48.73
=> OLS underestimates the effect of education on the labor supply (at the average of the explanatory variables).
• In the Tobit model, we naturally work with the likelihood. The log likelihood for the homoscedastic Tobit model is:

L(β0, β1, σ) = –(N1/2) [log(σ²) + log(2π)] + Σi { –Di (yi – β0 – β1xi)²/(2σ²) + (1 – Di) log[1 – Φ((β0 + β1xi)/σ)] },

where Di = 1 for uncensored observations and N1 = Σi Di is the number of uncensored observations.
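This log likelihood can be coded and maximized directly. A minimal Python sketch on simulated data (parameters are illustrative; in the lecture's examples Stata's tobit command does this work):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

# Simulate Tobit data (hypothetical parameters, censoring at zero)
rng = np.random.default_rng(11)
n = 5_000
b0, b1, sig = 0.5, 1.0, 1.0
x = rng.normal(size=n)
y = np.maximum(b0 + b1 * x + rng.normal(0, sig, n), 0.0)
d = (y > 0).astype(float)                  # D_i = 1 for uncensored observations

def neg_loglik(theta):
    beta0, beta1, log_sigma = theta
    s = np.exp(log_sigma)                  # parameterize log(sigma) to keep sigma > 0
    xb = beta0 + beta1 * x
    # Uncensored part: normal density; censored part: log Prob(y* <= 0)
    ll = d * (norm.logpdf((y - xb) / s) - np.log(s)) \
         + (1 - d) * norm.logcdf(-xb / s)
    return -ll.sum()

res = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
b0_hat, b1_hat, sig_hat = res.x[0], res.x[1], np.exp(res.x[2])
print(b0_hat, b1_hat, sig_hat)             # close to the true 0.5, 1.0, 1.0
```

Note that log[1 – Φ(xb/σ)] is computed as norm.logcdf(–xb/σ), which is the same quantity in a numerically stabler form.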
Elasticity      Homoscedastic         Heteroscedastic
                Value     S.E.        Value     S.E.
Φ(z)            0.259     0.008       0.210     0.011
E(y|y>0)        0.284     0.009       0.395     0.010
E(y)            0.544     0.017       0.606     0.020
Example: Age affects the decision to donate to charity. But it can have
a different effect on the amount donated. We may find that age has a
positive effect on the decision to donate, but given a positive donation,
younger individuals donate more than older individuals.
- Conditional expectation:
E[y|y>0, x] = xi’β + σ12 λ(xi’α)
- Unconditional expectation:
E[y|x] = Prob(y>0|x) * E[y|y>0, x] + Prob(y=0|x) * 0
       = Prob(y>0|x) * E[y|y>0, x]
       = Φ(xi’α) * [xi’β + σ12 λ(xi’α)]
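Evaluating these expectations for given parameter values is mechanical. A small Python sketch (all parameter values below are hypothetical; λ(z) = φ(z)/Φ(z) is the inverse Mills ratio that appears in the conditional expectation):

```python
import numpy as np
from scipy.stats import norm

def lam(z):
    """Inverse Mills ratio phi(z)/Phi(z), the selection correction term."""
    return norm.pdf(z) / norm.cdf(z)

# Hypothetical parameter values for illustration
x = np.array([1.0, 2.0])       # regressors (incl. constant)
alpha = np.array([0.3, 0.2])   # participation (probit) coefficients
beta = np.array([0.5, 1.0])    # outcome coefficients
sigma12 = 0.4                  # covariance between the two error terms

p_pos = norm.cdf(x @ alpha)                    # Prob(y > 0 | x)
e_cond = x @ beta + sigma12 * lam(x @ alpha)   # E[y | y > 0, x]
e_uncond = p_pos * e_cond                      # E[y | x]
print(p_pos, e_cond, e_uncond)
```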