Econometrics II S11 Solutions 2023

JEB110 Econometrics II: LDV II 13-14 Dec, 2023
Winter Semester 2023/2024

Instructor: Barbara Pertold–Gebicka
Teaching Assistant: Kseniya Bortnikova
Exercise Session 11 - Solutions
Problem
Use the data in fringe.csv for this exercise. The file includes data on pension benefits for 616
workers, based on a US survey from 1977.
(a) For what percentage of the workers in the sample is pension equal to zero? What is the range
of pension for workers with non-zero pension benefits? Why is a Tobit model appropriate for
modeling pension?
(b) Estimate a Tobit model explaining pension in terms of exper, age, tenure, educ, depends,
married, white, and male. Do whites and males have statistically significantly higher expected
pension benefits? Compare with results obtained using OLS to estimate the model.
(c) Use the results from part (b) to estimate the difference in expected pension benefits for a white
male and a nonwhite female, both of whom are 35 years old, are single with no dependents,
have 16 years of education, and have 10 years of experience. Compare with OLS.
(d) Add union to the Tobit model and comment on its significance.
(e) Apply the Tobit model from part (d) but with peratio, the pension–earnings ratio, as the
dependent variable. (Notice that this is a fraction between zero and one, but, though it often
takes on the value zero, it never gets close to being unity. Thus, a Tobit model is fine as an
approximation.) Do gender and race jointly have an effect on the pension–earnings ratio?
Solution
(a) Using R, we find that pension is equal to zero for 27.92% (172 of 616) of the workers in the
sample. Thus, pension is a continuous variable with a nontrivial fraction of zero values.
The range of positive pension benefits is quite wide: min = 7.28, max = 2,880.27.
A Tobit model is appropriate here because a substantial share of the sample has
no pension benefits, while benefits vary widely among those who do receive them.
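The summary statistics above take only a few lines of base R. The sketch below is illustrative: the helper name `censoring_summary` and the toy vector are hypothetical, and the commented-out call assumes that the pension column in fringe.csv is named `pension`.

```r
# Hypothetical helper: share of zeros and range of the positive values.
censoring_summary <- function(y) {
  positive <- y[y > 0]
  list(
    n        = length(y),
    n_zero   = sum(y == 0),
    pct_zero = 100 * mean(y == 0),
    min_pos  = min(positive),
    max_pos  = max(positive)
  )
}

# Toy data standing in for the pension column (NOT the real sample).
toy_pension <- c(0, 0, 7.28, 150.5, 2880.27)
res <- censoring_summary(toy_pension)
print(res$pct_zero)

# Share implied by the counts reported in the solution: 172 zeros out of 616.
reported_pct <- 100 * 172 / 616
print(round(reported_pct, 2))

# With the real data (assumed column name), something like:
# fringe <- read.csv("fringe.csv")
# censoring_summary(fringe$pension)
```

On the real data, the same helper should return the zero share and the minimum and maximum of the positive values in one call.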
We can write the Tobit model in the following specification:
\[
y_i^* = x_i\beta + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2), \quad
y_i = \max(0, y_i^*) =
\begin{cases}
y_i^*, & y_i^* > 0,\\
0, & y_i^* \le 0.
\end{cases}
\tag{1}
\]
Because $y_i^*$ is normally distributed, $y_i$ has a continuous distribution over strictly positive values.
Notably, the density of $y_i$ given $x_i$ is the same as the density of $y_i^*$ given $x_i$ for positive values:
\[
f(y_i \mid x_i) = f(\varepsilon_i \mid x_i) = f(y_i^* - x_i\beta \mid x_i). \tag{2}
\]
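The normal pdf with zero mean and variance $\sigma^2$ can be written as:
\[
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}
= \frac{1}{\sigma}\left[\frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2}\right]
= \frac{1}{\sigma}\,\phi\!\left(\frac{x}{\sigma}\right) \tag{3}
\]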
Then the likelihood contribution of an observation with a positive value of $y_i$ is:
\[
L_i = f(y_i^* - x_i\beta) = \frac{1}{\sigma}\,\phi\!\left(\frac{y_i^* - x_i\beta}{\sigma}\right), \tag{4}
\]
where $\phi$ is the standard normal probability density function (pdf).
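Further,
\begin{align}
P(y_i = 0 \mid x_i) &= P(y_i^* < 0 \mid x_i) = P(\varepsilon_i < -x_i\beta) = P\!\left(\frac{\varepsilon_i}{\sigma} < -\frac{x_i\beta}{\sigma}\right) \tag{5}\\
&= \Phi\!\left(-\frac{x_i\beta}{\sigma}\right) = 1 - \Phi\!\left(\frac{x_i\beta}{\sigma}\right) \tag{6}
\end{align}
and
\[
P(y_i > 0 \mid x_i) = 1 - P(y_i = 0 \mid x_i) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right), \tag{7}
\]
where $\Phi$ is the standard normal cumulative distribution function (cdf).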
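To summarize:
\[
L_i =
\begin{cases}
\dfrac{1}{\sigma}\,\phi\!\left(\dfrac{y_i^* - x_i\beta}{\sigma}\right) & \text{if } y_i^* > 0,\\[2ex]
1 - \Phi\!\left(\dfrac{x_i\beta}{\sigma}\right) & \text{if } y_i^* \le 0.
\end{cases}
\]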
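Then we can write the log-likelihood function of the Tobit model as:
\[
\ell(\beta, \sigma) = \sum_i 1[y_i = 0]\,\log\!\left[1 - \Phi\!\left(\frac{x_i\beta}{\sigma}\right)\right]
+ \sum_i 1[y_i > 0]\,\log\!\left[\frac{1}{\sigma}\,\phi\!\left(\frac{y_i^* - x_i\beta}{\sigma}\right)\right] \tag{8}
\]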
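To make the log-likelihood concrete, here is a minimal base-R sketch that codes it directly and maximises it with `optim` on simulated data. The function name `tobit_loglik`, the simulated sample and the true parameter values are all hypothetical; parameterising by log(sigma) is a choice made here to keep sigma positive, and it happens to match the logSigma row that censReg reports.

```r
# Tobit log-likelihood for left-censoring at zero, following equation (8).
# theta = c(beta, log(sigma)); using log(sigma) keeps sigma positive.
tobit_loglik <- function(theta, y, X) {
  k     <- ncol(X)
  beta  <- theta[1:k]
  sigma <- exp(theta[k + 1])
  xb    <- as.vector(X %*% beta)
  cens  <- y <= 0
  # Censored observations: log[1 - Phi(xb / sigma)] = log Phi(-xb / sigma)
  ll_cens <- pnorm(-xb[cens] / sigma, log.p = TRUE)
  # Uncensored observations: log[(1 / sigma) * phi((y - xb) / sigma)]
  ll_uncens <- dnorm((y[!cens] - xb[!cens]) / sigma, log = TRUE) - log(sigma)
  sum(ll_cens) + sum(ll_uncens)
}

# Simulated data (hypothetical, for illustration only).
set.seed(1)
n <- 500
x <- rnorm(n)
X <- cbind(1, x)
y_star <- 1 + 2 * x + rnorm(n, sd = 1.5)
y <- pmax(0, y_star)

# Maximise the log-likelihood by minimising its negative.
fit <- optim(c(0, 0, 0), function(th) -tobit_loglik(th, y, X), method = "BFGS")
beta_hat  <- fit$par[1:2]
sigma_hat <- exp(fit$par[3])
print(round(c(beta_hat, sigma_hat), 3))
```

With a sample of this size, the estimates should land close to the values used in the simulation (intercept 1, slope 2, sigma 1.5).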
(b) The output of the Tobit model:
Call:
censReg(formula = pension ~ exper + age + tenure + educ + depends + married + white + male)
Observations:
Total Left-censored Uncensored Right-censored
616 172 444 0
Coefficients:
Estimate Std. error t value Pr(> t)
(Intercept) -1.252e+03 2.191e+02 -5.717 1.09e-08 ***
exper 5.203e+00 6.010e+00 0.866 0.387
age -4.639e+00 5.711e+00 -0.812 0.417
tenure 3.602e+01 4.565e+00 7.892 2.97e-15 ***
educ 9.321e+01 1.089e+01 8.558 < 2e-16 ***
depends 3.528e+01 2.192e+01 1.610 0.107
married 5.369e+01 7.174e+01 0.748 0.454
white 1.441e+02 1.021e+02 1.412 0.158
male 3.082e+02 6.989e+01 4.409 1.04e-05 ***
logSigma 6.519e+00 3.562e-02 183.014 < 2e-16 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Newton-Raphson maximisation, 10 iterations

Return code 1: gradient close to zero
Log-likelihood: -3672.964 on 10 Df
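The coefficient on white is positive but statistically insignificant (p = 0.158), while the
coefficient on male is positive and highly significant (p < 0.001).
Hence, being male increases predicted pension benefits according to our model, whereas there is
no statistically significant evidence that whites have higher expected pension benefits.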
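The significance statements can be checked by hand from the printed estimates and standard errors. The sketch below assumes that the reported p-values are two-sided and based on the standard normal distribution, which appears consistent with the printed values; the one-sided version is added because the question asks about higher benefits.

```r
# Estimates and standard errors copied from the Tobit output above.
est <- c(white = 144.1, male = 308.2)
se  <- c(white = 102.1, male = 69.89)

# Ratio of estimate to standard error, and two-sided p-value
# under a standard normal reference distribution (an assumption).
z <- est / se
p <- 2 * pnorm(-abs(z))
print(round(z, 3))
print(signif(p, 3))

# One-sided p-values for the alternative "coefficient > 0".
p_one_sided <- pnorm(z, lower.tail = FALSE)
print(signif(p_one_sided, 3))

# censReg reports log(sigma); the implied sigma is its exponential.
sigma_hat <- exp(6.519)
print(round(sigma_hat, 1))
```

Even with a one-sided alternative, the coefficient on white does not reach the 5% level, so the conclusion in the text is unchanged.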
The output of the linear model estimated by OLS:
Call:
lm(formula = pension ~ exper + age + tenure + educ + depends + married + white + male)
Residuals:
Min 1Q Median 3Q Max
-1180.76 -378.62 -48.39 312.15 2128.01
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -735.028 159.847 -4.598 5.19e-06 ***
exper 3.824 4.319 0.885 0.376
age -2.910 4.084 -0.713 0.476
tenure 27.238 3.437 7.924 1.10e-14 ***
educ 70.546 8.080 8.731 < 2e-16 ***
depends 35.100 16.330 2.149 0.032 *
married 13.906 53.285 0.261 0.794
white 114.936 75.076 1.531 0.126
male 272.953 51.975 5.252 2.09e-07 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 530 on 607 degrees of freedom

Multiple R-squared: 0.2767, Adjusted R-squared: 0.2672
F-statistic: 29.03 on 8 and 607 DF, p-value: < 2.2e-16
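Using OLS, we reach the same conclusions about white and male as with the Tobit model: the
coefficient on white is positive but insignificant, and the coefficient on male is positive and
highly significant. The signs of all coefficients agree across the two models; the only change in
significance is depends, which is significant at the 5% level under OLS (p = 0.032) but not under
Tobit (p = 0.107). The magnitudes are not directly comparable, because Tobit coefficients measure
effects on the latent variable rather than on the expected observed pension.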
(c) To estimate the expected value of $y$ as a function of $x$, we need the “conditional
expectation” $E(y_i \mid y_i > 0, x_i)$ and the “unconditional expectation” $E(y_i \mid x_i)$. Note that both
expectations are conditional on the explanatory variables in the Tobit model.
Let’s derive $E(y_i \mid x_i)$ for the Tobit model:
\[
E(y_i \mid x_i) = P(y_i > 0 \mid x_i)\,E(y_i \mid y_i > 0, x_i) + P(y_i = 0 \mid x_i) \cdot 0
= \Phi\!\left(\frac{x_i\beta}{\sigma}\right) E(y_i \mid y_i > 0, x_i) \tag{9}
\]
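To obtain $E(y_i \mid y_i > 0, x_i)$ we use the following properties of a standard normal random
variable $z$ and of the standard normal pdf and cdf:
\[
E(z \mid z > c) = \frac{\phi(c)}{1 - \Phi(c)} \ \text{for any constant } c, \qquad
\phi(-c) = \phi(c), \qquad \Phi(-c) = 1 - \Phi(c). \tag{10}
\]
Recall also that
\[
P(y_i > 0 \mid x_i) = P(\varepsilon_i > -x_i\beta \mid x_i)
= 1 - P\!\left(\frac{\varepsilon_i}{\sigma} < -\frac{x_i\beta}{\sigma} \,\middle|\, x_i\right)
= 1 - \Phi\!\left(-\frac{x_i\beta}{\sigma}\right) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right). \tag{11}
\]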
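And for $E(y_i \mid y_i > 0, x_i)$:
\begin{align}
E(y_i \mid y_i > 0, x_i) &= E(x_i\beta + \varepsilon_i \mid y_i > 0, x_i) = x_i\beta + E(\varepsilon_i \mid y_i > 0, x_i) \tag{12}\\
&= x_i\beta + E(\varepsilon_i \mid \varepsilon_i > -x_i\beta, x_i)
= x_i\beta + \sigma \cdot E\!\left(\frac{\varepsilon_i}{\sigma} \,\middle|\, \frac{\varepsilon_i}{\sigma} > -\frac{x_i\beta}{\sigma}, x_i\right) \tag{13}\\
&= x_i\beta + \frac{\sigma \cdot \phi\!\left(-\frac{x_i\beta}{\sigma}\right)}{1 - \Phi\!\left(-\frac{x_i\beta}{\sigma}\right)}
= x_i\beta + \frac{\sigma \cdot \phi\!\left(\frac{x_i\beta}{\sigma}\right)}{\Phi\!\left(\frac{x_i\beta}{\sigma}\right)}
= x_i\beta + \sigma\lambda\!\left(\frac{x_i\beta}{\sigma}\right), \tag{14}
\end{align}
where $\lambda\!\left(\frac{x_i\beta}{\sigma}\right)$ is called the inverse Mills ratio: the ratio of the standard
normal pdf to the standard normal cdf, each evaluated at $\frac{x_i\beta}{\sigma}$.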
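Hence,
\[
E(y_i \mid x_i) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right) \cdot \left[x_i\beta + \sigma\lambda\!\left(\frac{x_i\beta}{\sigma}\right)\right] \tag{15}
\]
We are asked to calculate the following difference, with both expectations evaluated at the
characteristics stated in the problem:
\[
E(\text{pension} \mid \text{white male}) - E(\text{pension} \mid \text{nonwhite female}) = 966.6 - 582.27 = 384.33. \tag{16}
\]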
For OLS:
\[
E(y_i \mid x_i) = E(x_i\beta + \varepsilon_i \mid x_i) = x_i\beta \implies \tag{17}
\]
\[
E(\text{pension} \mid \text{white male}) - E(\text{pension} \mid \text{nonwhite female}) = 990.35 - 602.46 = 387.89. \tag{18}
\]
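The two differences can be approximately reproduced from the printed coefficients. The sketch below is a rough check rather than the original computation: the coefficients are rounded as printed, the helper `tobit_mean` is a hypothetical name, and tenure = 10 is an assumption, since the problem does not state tenure (this value appears to match the reported figures).

```r
# Coefficients copied from the Tobit and OLS outputs in part (b), in the order:
# intercept, exper, age, tenure, educ, depends, married, white, male.
b_tobit <- c(-1252, 5.203, -4.639, 36.02, 93.21, 35.28, 53.69, 144.1, 308.2)
b_ols   <- c(-735.028, 3.824, -2.910, 27.238, 70.546, 35.100, 13.906, 114.936, 272.953)
sigma   <- exp(6.519)  # censReg reports log(sigma)

# Profiles from the exercise; tenure = 10 is an ASSUMPTION (not stated in the text).
x_white_male      <- c(1, 10, 35, 10, 16, 0, 0, 1, 1)
x_nonwhite_female <- c(1, 10, 35, 10, 16, 0, 0, 0, 0)

# Unconditional Tobit expectation, equation (15):
# Phi(z) * (xb + sigma * lambda(z)) with z = xb / sigma.
tobit_mean <- function(x, b, sigma) {
  xb <- sum(x * b)
  z  <- xb / sigma
  lambda <- dnorm(z) / pnorm(z)  # inverse Mills ratio
  pnorm(z) * (xb + sigma * lambda)
}

diff_tobit <- tobit_mean(x_white_male, b_tobit, sigma) -
  tobit_mean(x_nonwhite_female, b_tobit, sigma)

# For OLS the expectation is linear, so the difference is just
# the sum of the white and male coefficients.
diff_ols <- sum(x_white_male * b_ols) - sum(x_nonwhite_female * b_ols)

print(round(c(tobit = diff_tobit, ols = diff_ols), 2))
```

Small discrepancies from the reported Tobit figures are to be expected, because the printed coefficients carry only four significant digits.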
(d) The output of the Tobit model (including union regressor):
Call:
censReg(formula = pension ~ exper + age + tenure + educ + depends + married + white + male + union)
Observations:
Total Left-censored Uncensored Right-censored
616 172 444 0
Coefficients:
Estimate Std. error t value Pr(> t)
(Intercept) -1.572e+03 2.185e+02 -7.191 6.44e-13 ***
exper 4.394e+00 5.831e+00 0.753 0.451159
age -1.654e+00 5.556e+00 -0.298 0.765987
tenure 2.878e+01 4.505e+00 6.388 1.68e-10 ***
educ 1.068e+02 1.077e+01 9.916 < 2e-16 ***
depends 4.147e+01 2.121e+01 1.955 0.050624 .
married 1.975e+01 6.950e+01 0.284 0.776329
white 1.593e+02 9.897e+01 1.610 0.107487
male 2.572e+02 6.802e+01 3.782 0.000156 ***
union 4.390e+02 6.249e+01 7.026 2.12e-12 ***
logSigma 6.481e+00 3.548e-02 182.693 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Newton-Raphson maximisation, 8 iterations

Return code 1: gradient close to zero
Log-likelihood: -3648.552 on 11 Df
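The coefficient on union is large (about 439) and highly significant (t = 7.03, p < 0.001):
holding the other regressors fixed, union membership raises the latent pension variable by about 439.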
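As a cross-check on the significance of union, the log-likelihoods printed in parts (b) and (d) allow a likelihood ratio test computed by hand. This is a sketch using only the reported numbers; both models are fitted to the same 616 observations.

```r
# Log-likelihoods copied from the censReg outputs in parts (b) and (d).
ll_restricted   <- -3672.964  # without union, 10 parameters
ll_unrestricted <- -3648.552  # with union, 11 parameters

# LR statistic and its chi-squared p-value with one restriction.
lr_stat <- 2 * (ll_unrestricted - ll_restricted)
p_value <- pchisq(lr_stat, df = 11 - 10, lower.tail = FALSE)

# Ratio of estimate to standard error from the union row, for comparison.
z_union <- 439.0 / 62.49

print(c(LR = lr_stat, z = z_union))
print(p_value)
```

Both statistics lead to the same conclusion: the null hypothesis that the coefficient on union is zero is rejected at any conventional level.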

(e) The output of the Tobit model with the pension–earnings ratio (peratio) as the dependent variable, including the union regressor:
Call:
censReg(formula = peratio ~ exper + age + tenure + educ + depends + married + white + male + union)
Observations:
Total Left-censored Uncensored Right-censored
616 172 444 0
Coefficients:
Estimate Std. error t value Pr(> t)
(Intercept) -0.0550630 0.0144896 -3.800 0.000145 ***
exper 0.0001697 0.0003861 0.440 0.660230
age -0.0002176 0.0003669 -0.593 0.553081
tenure 0.0017605 0.0003019 5.832 5.48e-09 ***
educ 0.0053478 0.0007172 7.457 8.88e-14 ***
depends 0.0008265 0.0014185 0.583 0.560140
married 0.0032941 0.0046339 0.711 0.477163
white 0.0031793 0.0065656 0.484 0.628215
male 0.0025937 0.0045309 0.572 0.567021
union 0.0300458 0.0041859 7.178 7.08e-13 ***
logSigma -3.1270435 0.0359053 -87.091 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Newton-Raphson maximisation, 11 iterations

Return code 2: successive function values within tolerance limit
Log-likelihood: 607.5962 on 11 Df
The coefficients on male and white are individually insignificant. Let us use a likelihood ratio (LR)
test to check the joint significance of these coefficients:
Likelihood ratio test
Model 1: peratio ~ exper + age + tenure + educ + depends + married + union
Model 2: peratio ~ exper + age + tenure + educ + depends + married + white + male + union
  #Df LogLik Df  Chisq Pr(>Chisq)
1   9 607.29
2  11 607.60  2 0.6049      0.739
Because the p-value is high (0.739), we cannot reject the null hypothesis that the coefficients on
white and male are jointly zero. Neither coefficient is significant, individually or jointly, hence
these variables are not useful predictors of pension benefits as a proportion of earnings. It would
thus seem that white males have larger pension benefits than nonwhite females simply because
they earn more on average.
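The p-value in the likelihood ratio table can be verified by hand from the reported statistic. This is a minimal sketch using only the printed numbers; the printed log-likelihoods are rounded to two decimals, so the statistic recomputed from them differs slightly from the reported 0.6049.

```r
# Log-likelihoods as printed in the likelihood ratio table (rounded).
ll_restricted   <- 607.29  # without white and male, 9 parameters
ll_unrestricted <- 607.60  # with white and male, 11 parameters

# Statistic from the rounded values versus the value reported in the table.
lr_rounded  <- 2 * (ll_unrestricted - ll_restricted)
lr_reported <- 0.6049

# Chi-squared p-value with two restrictions, and the 5% critical value.
p_value  <- pchisq(lr_reported, df = 11 - 9, lower.tail = FALSE)
critical <- qchisq(0.95, df = 2)

print(round(c(lr_rounded = lr_rounded, p_value = p_value, critical = critical), 3))
```

The reported statistic is far below the 5% critical value, in line with the conclusion that white and male are jointly insignificant.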
