This chapter develops several different statistical models to handle situations for which OLS and 2SLS are generally not appropriate. With the advent of cheap computing and large microdata sets, applied use of these models has burgeoned. We will restrict our attention to cross-section applications (their most frequent use) although several of the models discussed here have also been analyzed for the time series or panel data context.
CHAPTER 13
Discrete and Limited Dependent Variable Models
In this chapter we develop several different statistical models to handle situations for
which OLS and 2SLS are generally not appropriate. Although many of the lessons
we learned from our extensive analysis of OLS models apply here as well, others
do not. Furthermore, the models we deal with in this chapter are generally nonlinear
models (i.e., nonlinear in the parameters), so that, unlike OLS, they frequently do not
maintain their desirable asymptotic properties when the errors are heteroscedastic or
nonnormal. Thus, the models appear to be less robust to misspecification in general.
With the advent of cheap computing and large microdata sets, applied use of
these models has burgeoned. We will restrict our attention to cross-section applications
(their most frequent use), although several of the models discussed here have
also been analyzed for the time series or panel data context. The texts by Amemiya
and Maddala¹ are useful points of departure for these and other more complicated
models.
13.1 TYPES OF DISCRETE CHOICE MODELS
Discrete choice models attempt to explain a discrete choice or outcome. There are at
least three basic types of discrete variables, and each generally requires a different
statistical model.
Dichotomous, binary, or dummy variables. These take on a value of one or zero
depending on which of two possible results occurs. The reader has already encountered
these types of variables in previous chapters. In this chapter we will deal
with the case when such a variable is on the left-hand side of the relationship, i.e.,
when the dummy variable is an endogenous or dependent variable. Unlike the case
when the dummy variable is exogenous, the endogenous dummy variable poses special
problems that we have not yet addressed.

¹ T. Amemiya, Advanced Econometrics, Harvard University Press, 1985; and G. S. Maddala, Limited
Dependent and Qualitative Variables in Econometrics, Cambridge University Press, 1983.

To take one example from labor economics, we may be interested in a person’s
decision to take a paying job in some reference period, say, a week. We can then
define a dummy variable y as follows:
$$y_i = \begin{cases} 1 & \text{if person } i \text{ is employed in a paying job this week} \\ 0 & \text{otherwise} \end{cases}$$
Other examples from labor economics include the decision to go to college or not, or
the decision to join a union or not.
Dummy variables are among the most frequently encountered discrete variables
in applied work, and we will analyze these types of models in detail. An example,
taken from the biometrics literature (where models for endogenous dummy vari-
ables were pioneered), is the case of evaluating an insecticide. We can imagine that
tolerance $y_i^*$ of an insect $i$ to the insecticide is normally distributed across insects,
say, $y_i^* \sim N(\mu, \sigma^2)$. If an insect’s tolerance is less than the dose $x_i$ of the insecticide,
the insect dies.
The problem is that we cannot observe the tolerance $y_i^*$ of a particular insect;
instead we only observe whether the insect lives or dies. That is, we observe $y_i$, such
that
$$y_i = \begin{cases} 1 & \text{if the insect dies} \\ 0 & \text{otherwise} \end{cases}$$
Given this setup, we can now turn to the question of interest: what is the probability
that insect $i$ dies? It is merely the probability that the insect’s tolerance is less than
the dose:
$$\operatorname{prob}(y_i = 1) = \operatorname{prob}(y_i^* < x_i) \tag{13.1}$$
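As a rough illustration of (13.1): under the normality assumption stated earlier, the probability that the tolerance falls below the dose is the normal CDF evaluated at the dose, that is, $\Phi((x_i - \mu)/\sigma)$. The Python sketch below compares that closed form with a simulation in which the latent tolerance is drawn and only the live-or-die outcome is recorded. The function names and the values of `mu`, `sigma`, and `dose` are hypothetical and chosen only for illustration; they do not come from the text.

```python
import random
from statistics import NormalDist


def death_probability(dose, mu, sigma):
    """Closed-form prob(y_i = 1) = prob(y*_i < x_i) when y*_i ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(dose)


def simulate_death_rate(dose, mu, sigma, n_insects=200_000, seed=0):
    """Draw latent tolerances and return the share of insects that die."""
    rng = random.Random(seed)
    deaths = 0
    for _ in range(n_insects):
        tolerance = rng.gauss(mu, sigma)  # latent y*_i, unobserved in practice
        if tolerance < dose:              # observed y_i = 1: the insect dies
            deaths += 1
    return deaths / n_insects


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    mu, sigma, dose = 5.0, 2.0, 6.0
    print("closed form:", death_probability(dose, mu, sigma))
    print("simulated:  ", simulate_death_rate(dose, mu, sigma))
```

With a large number of simulated insects, the simulated death rate should be close to the closed-form value, which is the sense in which the observed binary outcome carries information about the unobserved tolerance distribution.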
In this formulation, what we observe, $y_i$, is generated by the following rule:
ya={} ify