
Lecture 6: Limited Dependent Variables - 1

ECMT-7302
Econometrics II
MA Eco. 2022, Fall 2023
Instructor: Sunaina Dhingra
Lectures: Wednesday (11.20 am-12.50 pm) & Thursday (9.40-11.10 am)
Lecture Meeting Mode: In person (Classroom: T4-F99)
Office Hours: Wednesday 1-2.30 pm & by appointment (FOB, Office No. 1B in south wing, 7th Floor)
Email-id: sunaina@jgu.edu.in
Lecture Material: Slides and textbooks
Credits: 4.5
Types of dependent variables and their estimation method

Dependent Variable                  Independent Variable        Estimation Method
Continuous (Quantitative)           Quantitative/Qualitative    Ordinary least squares (OLS)
Binary (Qualitative)                Quantitative/Qualitative    LPM/Logit/Probit
Categorical (Qualitative)           Quantitative/Qualitative    Multinomial Logit/Probit
Ordered Categorical (Qualitative)   Quantitative/Qualitative    Cumulative Logit/Probit
Repeated Binary                     Quantitative/Qualitative    Panel Logit/Probit
Limited Dependent Variable Models
• Limited dependent variables (LDV) are variables whose range is substantively restricted:
• Binary variables, e.g. employed/not employed
• Nonnegative variables, e.g. wages, prices, interest rates
• Nonnegative variables with excess zeros, e.g. labor supply
• Count variables, e.g. the number of arrests in a year
• Censored variables, e.g. unemployment durations
Regression with a Binary Dependent Variable

1. The Linear Probability Model
2. Probit and Logit Regression
3. Estimation and Inference in Probit and Logit
4. Application to Racial Discrimination in Mortgage Lending
Binary Dependent Variables: What’s Different?
So far the dependent variable (Y) has been continuous:
• district-wide average test score
• traffic fatality rate

What if Y is binary?
• What is the effect of a tuition subsidy on an individual’s decision to go to college?
• Y = get into college, or not; X = Tuition Subsidy, high school grades, SAT scores,
demographic variables
• What determines whether a teenager takes up smoking?
• Y = person smokes, or not; X = cigarette tax rate, income, demographic variables
• What determines whether a country receives foreign aid?
• What determines whether a mortgage applicant is successful?
• Y = mortgage application is accepted, or not; X = race, income, house characteristics, marital status
• In all these examples, the outcome of interest is binary: The student does or does not
go to college, the teenager does or does not take up smoking, a country does or does
not receive foreign aid, the applicant does or does not get an approval on mortgage
application.
• The binary dependent variable considered in this chapter is an example of a dependent
variable with a limited range; in other words, it is a limited dependent variable.
Example: Mortgage Denial and Race
The Boston Fed HMDA Dataset
• Individual applications for single-family mortgages made in 1990 in the greater
Boston area
• 2380 observations, collected under Home Mortgage Disclosure Act (HMDA)
Variables
• Dependent variable:
• Is the mortgage denied or accepted?
• Independent variables:
• income, wealth, employment status (One important piece of information is the size of the
required loan payments relative to the applicant’s income)
• other loan, property characteristics
• race of applicant
• Research Question: whether race is a factor in denying a mortgage application;
the binary dependent variable is whether a mortgage application is denied
• What does it mean to fit a line to a dependent variable that can take on only two values, 0
and 1?
• The answer is to interpret the regression function as a conditional probability. This allows us to apply the multiple regression model to a binary dependent variable. But the predicted-probability interpretation also suggests that alternative, nonlinear regression models can do a better job of modeling these probabilities.
Example: linear probability model, HMDA data
Mortgage denial v. ratio of debt payments to income (P/I ratio) in a subset of the HMDA data set (n = 127)

[Figure: scatterplot of deny (0/1) against P/I ratio with the fitted OLS line; at P/I ratio = 0.3, the predicted value of deny is 0.20.]

What, precisely, does it mean for the predicted value of the binary variable deny to be 0.20?
Binary Dependent Variables and the Linear Probability Model
• A natural starting point is the linear regression model with a single regressor:
• Yi = β0 + β1Xi + ui
• But:
• What does β1 mean when Y is binary?

• Is β1 = ΔY/ΔX?

• What does the line β0 + β1X mean when Y is binary?

• What does the predicted value Ŷ mean when Y is binary?

• For example, what does Ŷ = 0.26 mean?

• Two key things:

• 1. The population regression function is the expected value of Y given the regressors, E(Y|X1, X2,…,Xk).
• 2. If Y is a 0–1 binary variable, its expected value (or mean) is the probability that Y = 1; that is,
E(Y) = 0 × Pr(Y = 0) + 1 × Pr(Y = 1) = Pr(Y = 1).
• In the regression context the expected value is conditional on the value of the regressors, so the probability is conditional on X. Thus for a binary variable,
E(Y|X1, X2,…,Xk) = Pr(Y = 1|X1, X2,…,Xk)
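The fact that the mean of a 0–1 variable equals Pr(Y = 1) is easy to check by simulation. A minimal sketch on simulated data (nothing here comes from the HMDA set; the success probability 0.3 is made up for illustration):

```python
import numpy as np

# For a 0-1 variable, the sample mean estimates E(Y) = Pr(Y = 1).
# The success probability 0.3 below is purely illustrative.
rng = np.random.default_rng(0)
y = rng.binomial(1, 0.3, size=100_000)  # binary Y with Pr(Y = 1) = 0.3

print(y.mean())  # close to 0.3
```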
The linear probability model, ctd.

In the linear probability model, the predicted value of Y is interpreted as the predicted probability that Y = 1, and β1 is the change in that predicted probability for a unit change in X. Here's the math:

Linear probability model: Yi = β0 + β1Xi + ui


When Y is binary,
E(Y|X) = 1×Pr(Y=1|X) + 0×Pr(Y=0|X) = Pr(Y=1|X)

Under LS assumption #1, E(ui|Xi) = 0,


so
E(Yi|Xi) = E(β0 + β1Xi + ui|Xi) = β0 + β1Xi,
so
Pr(Y=1|X) = β0 + β1Xi
The linear probability model, ctd.
• When Y is binary, the linear regression model Yi = β0 + β1Xi + ui
is called the linear probability model because Pr(Y=1|X) = β0 + β1Xi

• The predicted value is a probability:

• E(Y|X=x) = Pr(Y=1|X=x) = prob. that Y = 1 given x
= Ŷ, the predicted probability that Yi = 1, given X

• β1 = change in probability that Y = 1 for a unit change in x:

β1 = [Pr(Y = 1|X = x + Δx) − Pr(Y = 1|X = x)] / Δx
• The regression coefficient β1 is the change in the probability that Y = 1
associated with a unit change in X1, holding constant the other regressors,
and so forth for β2, β3, ……βk. The regression coefficients can be estimated
by OLS, and the usual (heteroskedasticity-robust) OLS standard errors can
be used for confidence intervals and hypothesis tests.
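As a sketch of this estimation recipe, the LPM can be fit by OLS and paired with heteroskedasticity-robust (HC1) standard errors using only NumPy. The data are simulated, and the true coefficients (0.05 and 0.6) are made up for illustration, not HMDA estimates:

```python
import numpy as np

# Minimal sketch: LPM estimated by OLS, with heteroskedasticity-robust
# (HC1) standard errors, on simulated binary data.
rng = np.random.default_rng(42)
n = 5_000
x = rng.uniform(0, 1, n)                      # e.g. a P/I-like ratio
p = 0.05 + 0.6 * x                            # true Pr(Y=1|X) = b0 + b1*X (illustrative)
y = (rng.uniform(size=n) < p).astype(float)   # binary outcome

X = np.column_stack([np.ones(n), x])          # add intercept
beta = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS estimates

# Robust (HC1) covariance: (X'X)^-1 X' diag(u^2) X (X'X)^-1, scaled by n/(n-k)
u = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * u[:, None] ** 2)
cov = n / (n - X.shape[1]) * XtX_inv @ meat @ XtX_inv
se = np.sqrt(np.diag(cov))

print(beta)  # close to the true (0.05, 0.6)
print(se)
```

The robust covariance is the point of the exercise: because Var(Y|X) varies with X in the LPM, the usual homoskedastic OLS standard errors are wrong.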
Linear probability model: full HMDA data set
deny = -.080 + .604 × P/I ratio
(n = 2380) (.032) (.098)

• What is the predicted value for P/I ratio = .3?

Pr(deny = 1 | P/I ratio = .3) = -.080 + .604×.3 = .101

• Calculating "effects": increase P/I ratio from .3 to .4:

Pr(deny = 1 | P/I ratio = .4) = -.080 + .604×.4 = .161

The effect on the probability of denial of an increase in P/I ratio from .3 to .4 is to increase the probability by .061, that is, by 6.1 percentage points.
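The arithmetic above can be checked directly with the reported coefficients:

```python
# Predicted denial probabilities from the reported LPM coefficients,
# deny = -.080 + .604 * (P/I ratio).
b0, b1 = -0.080, 0.604

p_03 = b0 + b1 * 0.3   # Pr(deny = 1 | P/I ratio = .3), about .10
p_04 = b0 + b1 * 0.4   # Pr(deny = 1 | P/I ratio = .4), about .16
effect = p_04 - p_03   # about .06: roughly a 6 percentage point increase

print(p_03, p_04, effect)
```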
Linear probability model: HMDA data, ctd
Next include black as a regressor:
deny = -.091 + .559 × P/I ratio + .177 × black
(.032) (.098) (.025)
Predicted probability of denial:
• for a black applicant with P/I ratio = .3:
Pr(deny = 1) = -.091 + .559×.3 + .177×1 = .254
• for a white applicant with P/I ratio = .3:
Pr(deny = 1) = -.091 + .559×.3 + .177×0 = .077
• difference = .177 = 17.7 percentage points
• Coefficient on black is significant at the 5% level
• Still plenty of room for omitted variable bias…
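The two predicted probabilities on this slide follow the same mechanics, now with the black dummy switched on and off:

```python
# Predicted denial probabilities from the LPM with the black indicator,
# using the reported coefficients deny = -.091 + .559*(P/I) + .177*black.
b0, b_pi, b_black = -0.091, 0.559, 0.177

p_black = b0 + b_pi * 0.3 + b_black * 1   # black applicant, P/I = .3 (≈ .254)
p_white = b0 + b_pi * 0.3 + b_black * 0   # white applicant, P/I = .3 (≈ .077)

print(p_black, p_white, p_black - p_white)  # gap of ≈ .177
```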
The linear probability model: Summary

• The linear probability model models Pr(Y=1|X) as a linear function of X


• Advantages:
• simple to estimate and to interpret; coefficients are Marginal Effects
• inference is the same as for multiple regression (need heteroskedasticity-robust
standard errors)
• Disadvantages:
• The LPM says that the change in the predicted probability for a given change in X is the same for all values of X, which often doesn't make sense. Think about the HMDA example…
• LPM predicted probabilities can be < 0 or > 1, which are not valid probabilities
• Marginal effects equal the coefficients, so they are constant and do not vary with X
• R² is not a meaningful measure of fit
• The errors are heteroskedastic, because the variance is not constant:
• Var(Y|X) = Pr(Y=1|X) × [1 − Pr(Y=1|X)], which depends on X
• These disadvantages can be solved by using a nonlinear probability
model: probit and logit regression
