
JEB110 Econometrics II: LDV II 13-14 Dec, 2023

Winter Semester 2023/2024


Instructor: Barbara Pertold–Gebicka
Teaching Assistant: Kseniya Bortnikova

Exercise Session 11 - Solutions

Problem

Use the data in fringe.csv for this exercise. The file includes data on pension benefits for 616
workers, based on a US survey from 1977.

(a) For what percentage of the workers in the sample is pension equal to zero? What is the range
of pension for workers with nonzero pension benefits? Why is a Tobit model appropriate for
modeling pension?
(b) Estimate a Tobit model explaining pension in terms of exper, age, tenure, educ, depends,
married, white, and male. Do whites and males have statistically significantly higher expected
pension benefits? Compare with results obtained using OLS to estimate the model.
(c) Use the results from part (b) to estimate the difference in expected pension benefits for a white
male and a nonwhite female, both of whom are 35 years old, are single with no dependents,
have 16 years of education, and have 10 years of experience. Compare with OLS.
(d) Add union to the Tobit model and comment on its significance.
(e) Apply the Tobit model from part (d) but with peratio, the pension–earnings ratio, as the
dependent variable. (Notice that this is a fraction between zero and one, but, though it often
takes on the value zero, it never gets close to being unity. Thus, a Tobit model is fine as an
approximation.) Do gender and race jointly have an effect on the pension–earnings ratio?

Solution
(a) Using R, we find that pension is equal to zero for 27.92% (172 of 616) of the workers in the
sample. Thus, pension is a continuous variable with a nontrivial fraction of zero values.
The range of positive values of pension benefits is quite large: max = 2,880.27, min = 7.28.
The Tobit model is appropriate here because a sizable share of the sample has no pension
benefits, while benefits are widely spread among those who do receive them.
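The figures in part (a) could be obtained along the following lines. This is a minimal R sketch: the helper name `summarise_censoring` and the toy vector are illustrative assumptions, and the commented-out lines assume that fringe.csv sits in the working directory and has a column named pension.

```r
# Share of zeros and range of the strictly positive values of a
# corner-solution variable. The helper name is hypothetical.
summarise_censoring <- function(y) {
  positive <- y[y > 0]
  list(
    n          = length(y),
    n_zero     = sum(y == 0),
    share_zero = mean(y == 0),
    min_pos    = min(positive),
    max_pos    = max(positive)
  )
}

# Toy data, made up for illustration (not taken from fringe.csv)
toy <- c(0, 0, 12.5, 150, 900)
res <- summarise_censoring(toy)
print(res$share_zero)  # 2 of the 5 toy values are zero

# With the course data (assumed file location and column name):
# fringe <- read.csv("fringe.csv")
# summarise_censoring(fringe$pension)
```

For the reported sample the same arithmetic gives 172 / 616 = 0.2792, that is, 27.92%.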


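We can use the Tobit model in the following specification:

$$
y_i^* = x_i\beta + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2), \qquad
y_i = \max(0, y_i^*) =
\begin{cases}
y_i^*, & y_i^* > 0, \\
0, & y_i^* \le 0.
\end{cases}
\tag{1}
$$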

Because $y_i^*$ is normally distributed, $y_i$ has a continuous distribution over strictly positive values.
Notably, the density of $y_i$ given $x_i$ is the same as the density of $y_i^*$ given $x_i$ for positive values:

$$
f(y_i \mid x_i) = f(\varepsilon_i \mid x_i) = f(y_i^* - x_i\beta \mid x_i). \tag{2}
$$

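The normal pdf (with zero mean) implies:

$$
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}
= \frac{1}{\sigma}\left[\frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2}\right]
= \frac{1}{\sigma}\,\phi\!\left(\frac{x}{\sigma}\right) \tag{3}
$$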
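The scaling identity in (3) can be checked numerically. The following is a short sketch in base R; the values chosen for σ and x are arbitrary illustrative assumptions.

```r
sigma <- 2.5                  # arbitrary illustrative value
x <- c(-3, -0.5, 0, 1.2, 4)   # arbitrary evaluation points

lhs <- dnorm(x, mean = 0, sd = sigma)  # N(0, sigma^2) density at x
rhs <- dnorm(x / sigma) / sigma        # (1 / sigma) * phi(x / sigma)

print(all.equal(lhs, rhs))
```

Here `dnorm` with its default arguments plays the role of φ, the standard normal pdf.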

Then the likelihood contribution of an observation with a positive value of $y_i$ is:

$$
L_i = f(y_i^* - x_i\beta) = \frac{1}{\sigma}\,\phi\!\left(\frac{y_i^* - x_i\beta}{\sigma}\right), \tag{4}
$$

where $\phi$ is the standard normal probability density function (pdf).

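Further,

$$
P(y_i = 0 \mid x_i) = P(y_i^* < 0 \mid x_i) = P(\varepsilon_i < -x_i\beta)
= P\!\left(\frac{\varepsilon_i}{\sigma} < -\frac{x_i\beta}{\sigma}\right) \tag{5}
$$

$$
= \Phi\!\left(-\frac{x_i\beta}{\sigma}\right) = 1 - \Phi\!\left(\frac{x_i\beta}{\sigma}\right) \tag{6}
$$

and

$$
P(y_i > 0 \mid x_i) = 1 - P(y_i = 0 \mid x_i) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right) \tag{7}
$$

where $\Phi$ is the standard normal cumulative distribution function (cdf).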

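To summarize:

$$
L_i =
\begin{cases}
\dfrac{1}{\sigma}\,\phi\!\left(\dfrac{y_i^* - x_i\beta}{\sigma}\right) & \text{if } y_i^* > 0, \\[2ex]
1 - \Phi\!\left(\dfrac{x_i\beta}{\sigma}\right) & \text{if } y_i^* \le 0.
\end{cases}
$$

Then we can write the log-likelihood function of the Tobit model as:

$$
\ell(\beta, \sigma) = \sum_i 1[y_i = 0] \log\left[1 - \Phi\!\left(\frac{x_i\beta}{\sigma}\right)\right]
+ \sum_i 1[y_i > 0] \log\left[\frac{1}{\sigma}\,\phi\!\left(\frac{y_i^* - x_i\beta}{\sigma}\right)\right] \tag{8}
$$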
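To make the log-likelihood (8) concrete, it can be coded by hand and maximised numerically. The sketch below uses base R only and simulated data. The function name `tobit_negll`, the simulated design and the starting values are illustrative assumptions, and the code is not meant to replicate what censReg does internally. Optimising over log σ keeps σ positive, which is also the parametrisation shown in the logSigma row of the censReg output.

```r
# Negative Tobit log-likelihood, eq. (8); theta = (beta, log(sigma)).
tobit_negll <- function(theta, y, X) {
  k     <- ncol(X)
  beta  <- theta[1:k]
  sigma <- exp(theta[k + 1])
  xb    <- as.vector(X %*% beta)
  ll <- ifelse(
    y > 0,
    # uncensored: log[(1 / sigma) * phi((y - xb) / sigma)]; here y = y*
    dnorm((y - xb) / sigma, log = TRUE) - log(sigma),
    # censored at zero: log[1 - Phi(xb / sigma)] = log Phi(-xb / sigma)
    pnorm(-xb / sigma, log.p = TRUE)
  )
  -sum(ll)
}

# Simulated data (illustrative only, unrelated to fringe.csv)
set.seed(1)
n      <- 500
x      <- rnorm(n)
X      <- cbind(1, x)
y_star <- 1 + 2 * x + rnorm(n, sd = 1.5)  # latent variable
y      <- pmax(0, y_star)                 # observed, censored at zero

start <- c(0, 0, 0)
fit   <- optim(start, tobit_negll, y = y, X = X, method = "BFGS")

beta_hat  <- fit$par[1:2]
sigma_hat <- exp(fit$par[3])
print(c(beta_hat, sigma = sigma_hat))
```

With a reasonably large simulated sample, the estimates should land near the values used to generate the data (intercept 1, slope 2, σ = 1.5).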


(b) The output of the Tobit model:

Call:
censReg(formula = pension ~ exper + age + tenure + educ + depends + married + white + male)

Observations:
         Total  Left-censored     Uncensored Right-censored
           616            172            444              0

Coefficients:
              Estimate Std. error t value  Pr(> t)
(Intercept) -1.252e+03  2.191e+02  -5.717 1.09e-08 ***
exper        5.203e+00  6.010e+00   0.866    0.387
age         -4.639e+00  5.711e+00  -0.812    0.417
tenure       3.602e+01  4.565e+00   7.892 2.97e-15 ***
educ         9.321e+01  1.089e+01   8.558  < 2e-16 ***
depends      3.528e+01  2.192e+01   1.610    0.107
married      5.369e+01  7.174e+01   0.748    0.454
white        1.441e+02  1.021e+02   1.412    0.158
male         3.082e+02  6.989e+01   4.409 1.04e-05 ***
logSigma     6.519e+00  3.562e-02 183.014  < 2e-16 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Newton-Raphson maximisation, 10 iterations
Return code 1: gradient close to zero
Log-likelihood: -3672.964 on 10 Df

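The coefficient on white is statistically insignificant (p = 0.158), while the coefficient on male is
positive and significant (p < 0.001). Hence, according to our model, being male increases predicted
pension benefits, whereas there is no statistically significant difference by race.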
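The test statistics behind this conclusion can be recomputed from the reported Tobit estimates and standard errors. The sketch below is in base R and uses the rounded values from the output above. Using the standard normal distribution for the p-values is an assumption about how the reported p-values were obtained. Because the question asks about higher benefits, a one-sided p-value is shown as well.

```r
# Rounded estimates and standard errors copied from the Tobit output
est <- c(white = 144.1, male = 308.2)
se  <- c(white = 102.1, male = 69.89)

z           <- est / se
p_two_sided <- 2 * pnorm(-abs(z))             # H1: coefficient != 0
p_one_sided <- pnorm(z, lower.tail = FALSE)   # H1: coefficient > 0

print(round(cbind(z, p_two_sided, p_one_sided), 4))
```

The recomputed statistics should match the t values in the table up to rounding.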

The output of the linear model estimated by OLS:

Call:
lm(formula = pension ~ exper + age + tenure + educ + depends + married + white + male)

Residuals:
     Min       1Q   Median       3Q      Max
-1180.76  -378.62   -48.39   312.15  2128.01

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -735.028    159.847  -4.598 5.19e-06 ***
exper          3.824      4.319   0.885    0.376
age           -2.910      4.084  -0.713    0.476
tenure        27.238      3.437   7.924 1.10e-14 ***
educ          70.546      8.080   8.731  < 2e-16 ***
depends       35.100     16.330   2.149    0.032 *
married       13.906     53.285   0.261    0.794
white        114.936     75.076   1.531    0.126
male         272.953     51.975   5.252 2.09e-07 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 530 on 607 degrees of freedom
Multiple R-squared: 0.2767, Adjusted R-squared: 0.2672
F-statistic: 29.03 on 8 and 607 DF, p-value: < 2.2e-16


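Using OLS, we reach the same conclusions about white and male as with the Tobit model: the
coefficient on white is insignificant, and the coefficient on male is positive and significant.
The signs of all coefficients also agree. The one difference in significance concerns depends,
which is significant at the 5% level under OLS (p = 0.032) but not under Tobit (p = 0.107).
Note that the magnitudes are not directly comparable: Tobit coefficients measure effects on the
latent variable $y_i^*$, not on observed pension.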

(c) To estimate the expected value of $y_i$ as a function of $x_i$, we need two quantities: the
expectation conditional on a positive outcome, $E(y_i \mid y_i > 0, x_i)$, and the "unconditional"
expectation, $E(y_i \mid x_i)$. Note that both expectations are conditional on the explanatory
variables of the Tobit model.
Let's derive $E(y_i \mid x_i)$ for the Tobit model:

$$
E(y_i \mid x_i) = P(y_i > 0 \mid x_i)\,E(y_i \mid y_i > 0, x_i) + P(y_i = 0 \mid x_i) \cdot 0
= \Phi\!\left(\frac{x_i\beta}{\sigma}\right) E(y_i \mid y_i > 0, x_i) \tag{9}
$$
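To obtain $E(y_i \mid y_i > 0, x_i)$, we use the following properties of a standard normal random
variable $z$ and of the standard normal pdf and cdf:

$$
E(z \mid z > c) = \frac{\phi(c)}{1 - \Phi(c)} \text{ for any constant } c, \qquad
\phi(-c) = \phi(c), \qquad \Phi(-c) = 1 - \Phi(c) \tag{10}
$$

$$
P(y_i > 0 \mid x_i) = P(\varepsilon_i > -x_i\beta \mid x_i)
= 1 - P\!\left(\frac{\varepsilon_i}{\sigma} < -\frac{x_i\beta}{\sigma} \,\Big|\, x_i\right)
= 1 - \Phi\!\left(-\frac{x_i\beta}{\sigma}\right) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right). \tag{11}
$$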
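And for $E(y_i \mid y_i > 0, x_i)$:

$$
E(y_i \mid y_i > 0, x_i) = E(x_i\beta + \varepsilon_i \mid y_i > 0, x_i)
= x_i\beta + E(\varepsilon_i \mid y_i > 0, x_i) \tag{12}
$$

$$
= x_i\beta + E(\varepsilon_i \mid \varepsilon_i > -x_i\beta, x_i)
= x_i\beta + \sigma \cdot E\!\left(\frac{\varepsilon_i}{\sigma} \,\Big|\, \frac{\varepsilon_i}{\sigma} > -\frac{x_i\beta}{\sigma}, x_i\right) \tag{13}
$$

$$
= x_i\beta + \frac{\sigma \cdot \phi\!\left(-\frac{x_i\beta}{\sigma}\right)}{1 - \Phi\!\left(-\frac{x_i\beta}{\sigma}\right)}
= x_i\beta + \frac{\sigma \cdot \phi\!\left(\frac{x_i\beta}{\sigma}\right)}{\Phi\!\left(\frac{x_i\beta}{\sigma}\right)}
= x_i\beta + \sigma\lambda\!\left(\frac{x_i\beta}{\sigma}\right), \tag{14}
$$

where $\lambda\!\left(\frac{x_i\beta}{\sigma}\right)$ is called the inverse Mills ratio: the ratio of the
standard normal pdf to the standard normal cdf, each evaluated at $\frac{x_i\beta}{\sigma}$.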

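Hence,

$$
E(y_i \mid x_i) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right) \cdot \left[x_i\beta + \sigma\lambda\!\left(\frac{x_i\beta}{\sigma}\right)\right] \tag{15}
$$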
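The closed forms (14) and (15) can be compared with a direct numerical integration of the censored variable against the normal density. The following is a small check in base R; the values of x_i β and σ are arbitrary illustrative assumptions.

```r
xb    <- 0.5   # illustrative value of x_i * beta
sigma <- 2     # illustrative value of sigma
z     <- xb / sigma

lambda      <- dnorm(z) / pnorm(z)     # inverse Mills ratio
cond_mean   <- xb + sigma * lambda     # eq. (14)
uncond_mean <- pnorm(z) * cond_mean    # eq. (15)

# Direct computation: y = xb + e when e > -xb and y = 0 otherwise, so
# E(y | x) is the integral of (xb + e) * f(e) over e > -xb.
direct_uncond <- integrate(
  function(e) (xb + e) * dnorm(e, sd = sigma),
  lower = -xb, upper = Inf, rel.tol = 1e-10
)$value
direct_cond <- direct_uncond / pnorm(z)   # divide by P(y > 0 | x)

print(c(formula = uncond_mean, integral = direct_uncond))
```

The two numbers should agree up to the integration tolerance.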

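We are asked to calculate the following difference:

$$
E(\text{pension} \mid \text{white male}) - E(\text{pension} \mid \text{nonwhite female}) = 966.6 - 582.27 = 384.33. \tag{16}
$$

For OLS:

$$
E(y_i \mid x_i) = E(x_i\beta + \varepsilon_i \mid x_i) = x_i\beta \implies \tag{17}
$$

$$
E(\text{pension} \mid \text{white male}) - E(\text{pension} \mid \text{nonwhite female}) = 990.35 - 602.46 = 387.89. \tag{18}
$$

The Tobit and OLS estimates of the difference are thus very close (about 384 versus 388).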
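The numbers in (16) and (18) can be approximately reproduced from the rounded coefficients in the part (b) outputs. This is a base R sketch. The problem statement does not fix tenure; the value tenure = 10 used below is an assumption, chosen because it reproduces the reported predictions. The helper names `profile` and `tobit_mean` are hypothetical.

```r
# Rounded estimates copied from the Tobit and OLS outputs in part (b).
# Order: intercept, exper, age, tenure, educ, depends, married, white, male
b_tobit <- c(-1252, 5.203, -4.639, 36.02, 93.21, 35.28, 53.69, 144.1, 308.2)
b_ols   <- c(-735.028, 3.824, -2.910, 27.238, 70.546, 35.100, 13.906,
             114.936, 272.953)
sigma   <- exp(6.519)   # logSigma from the Tobit output

# exper = 10, age = 35, educ = 16, single, no dependents;
# tenure = 10 is an assumption (not stated in the problem).
profile <- function(white, male) c(1, 10, 35, 10, 16, 0, 0, white, male)

# Unconditional Tobit expectation, eq. (15):
# Phi(z) * [xb + sigma * phi(z) / Phi(z)] = Phi(z) * xb + sigma * phi(z)
tobit_mean <- function(x, b, sigma) {
  xb <- sum(x * b)
  pnorm(xb / sigma) * xb + sigma * dnorm(xb / sigma)
}

wm <- profile(white = 1, male = 1)   # white male
nf <- profile(white = 0, male = 0)   # nonwhite female

tobit_wm <- tobit_mean(wm, b_tobit, sigma)
tobit_nf <- tobit_mean(nf, b_tobit, sigma)
ols_wm   <- sum(wm * b_ols)
ols_nf   <- sum(nf * b_ols)

print(c(tobit_diff = tobit_wm - tobit_nf, ols_diff = ols_wm - ols_nf))
```

Because the coefficients are rounded, the Tobit values differ slightly from the reported 966.6 and 582.27. The OLS difference is simply the sum of the coefficients on white and male, 114.936 + 272.953 = 387.889.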


(d) The output of the Tobit model (including union regressor):

Call:
censReg(formula = pension ~ exper + age + tenure + educ + depends + married + white + male + union)

Observations:
         Total  Left-censored     Uncensored Right-censored
           616            172            444              0

Coefficients:
              Estimate Std. error t value  Pr(> t)
(Intercept) -1.572e+03  2.185e+02  -7.191 6.44e-13 ***
exper        4.394e+00  5.831e+00   0.753 0.451159
age         -1.654e+00  5.556e+00  -0.298 0.765987
tenure       2.878e+01  4.505e+00   6.388 1.68e-10 ***
educ         1.068e+02  1.077e+01   9.916  < 2e-16 ***
depends      4.147e+01  2.121e+01   1.955 0.050624 .
married      1.975e+01  6.950e+01   0.284 0.776329
white        1.593e+02  9.897e+01   1.610 0.107487
male         2.572e+02  6.802e+01   3.782 0.000156 ***
union        4.390e+02  6.249e+01   7.026 2.12e-12 ***
logSigma     6.481e+00  3.548e-02 182.693  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Newton-Raphson maximisation, 8 iterations
Return code 1: gradient close to zero
Log-likelihood: -3648.552 on 11 Df

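The coefficient on union is large (about 439) and highly significant (t = 7.03, p < 0.001).
Holding the other regressors fixed, union membership is associated with a substantially higher
latent pension benefit.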
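As a cross-check on the t test, the significance of union can also be assessed with a likelihood ratio test built from the two reported log-likelihoods. Both models are fitted to the same 616 observations. This is a base R sketch using only the values printed in the outputs; it is an additional illustration, not part of the original solution.

```r
ll_restricted   <- -3672.964   # Tobit without union (10 parameters)
ll_unrestricted <- -3648.552   # Tobit with union (11 parameters)

lr_stat <- 2 * (ll_unrestricted - ll_restricted)
df      <- 11 - 10             # one restriction: coefficient on union = 0
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

print(c(LR = lr_stat, p = p_value))
```

The LR statistic of about 48.8 is close to the squared t value (7.026 squared is about 49.4), so both tests point to the same conclusion.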


(e) The output of the Tobit model with the pension–earnings ratio (peratio) as the dependent variable (including the union regressor):

Call:
censReg(formula = peratio ~ exper + age + tenure + educ + depends + married + white + male + union)

Observations:
         Total  Left-censored     Uncensored Right-censored
           616            172            444              0

Coefficients:
              Estimate Std. error t value  Pr(> t)
(Intercept) -0.0550630  0.0144896  -3.800 0.000145 ***
exper        0.0001697  0.0003861   0.440 0.660230
age         -0.0002176  0.0003669  -0.593 0.553081
tenure       0.0017605  0.0003019   5.832 5.48e-09 ***
educ         0.0053478  0.0007172   7.457 8.88e-14 ***
depends      0.0008265  0.0014185   0.583 0.560140
married      0.0032941  0.0046339   0.711 0.477163
white        0.0031793  0.0065656   0.484 0.628215
male         0.0025937  0.0045309   0.572 0.567021
union        0.0300458  0.0041859   7.178 7.08e-13 ***
logSigma    -3.1270435  0.0359053 -87.091  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Newton-Raphson maximisation, 11 iterations
Return code 2: successive function values within tolerance limit
Log-likelihood: 607.5962 on 11 Df


The coefficients on male and white are individually insignificant. Let us use a likelihood ratio
(LR) test to check the joint significance of these coefficients:

Likelihood ratio test

Model 1: peratio ~ exper + age + tenure + educ + depends + married + union
Model 2: peratio ~ exper + age + tenure + educ + depends + married + white + male + union
  #Df LogLik Df  Chisq Pr(>Chisq)
1   9 607.29
2  11 607.60  2 0.6049      0.739

Because the p-value is high (0.739), we cannot reject the null hypothesis that the coefficients on
white and male are jointly zero. Neither coefficient is significant, individually or jointly, hence
these variables are not useful predictors of pension benefits as a proportion of earnings. It would
thus seem that white males have larger pension benefits than nonwhite females simply because
they earn more on average.
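The reported p-value can be recovered from the LR statistic and the chi-squared distribution with two degrees of freedom (two restrictions). This is a base R sketch using only the numbers printed in the test output. The statistic recomputed from the rounded log-likelihoods differs slightly from the reported 0.6049 because of rounding.

```r
ll_restricted   <- 607.29   # Model 1: without white and male (rounded)
ll_unrestricted <- 607.60   # Model 2: with white and male (rounded)

lr_from_loglik <- 2 * (ll_unrestricted - ll_restricted)  # from rounded values
lr_reported    <- 0.6049                                 # from the test output

p_value   <- pchisq(lr_reported, df = 2, lower.tail = FALSE)
crit_5pct <- qchisq(0.95, df = 2)   # 5% critical value, 2 df

print(c(LR = lr_reported, p = p_value, critical = crit_5pct))
```

The statistic lies far below the 5% critical value, in line with the conclusion above.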
