DETERMINANTS OF HEALTH CARE DECISIONS INSURANCE, UTILIZATION, AND EXPENDITURES Chan Shen

DETERMINANTS OF HEALTH CARE DECISIONS: INSURANCE, UTILIZATION, AND
EXPENDITURES
Author(s): Chan Shen
Source: The Review of Economics and Statistics, March 2013, Vol. 95, No. 1 (March 2013), pp. 142-153
Published by: The MIT Press
Stable URL: https://www.jstor.org/stable/23355656
DETERMINANTS OF HEALTH CARE DECISIONS:
INSURANCE, UTILIZATION, AND EXPENDITURES
Chan Shen*
Abstract: This paper studies three interrelated health care decisions: insurance, utilization, and expenditures. The model treats insurance as an endogenous variable with respect to both utilization and expenditures, addresses potential selection issues, and takes into account that the decisions to use health care and the level of treatment are determined by different decision makers. We employ semiparametric methods to avoid making distributional assumptions. Using the Medical Expenditure Panel Survey 2005 data, the semiparametric approach predicts insurance to increase the level of expenditures by 48%, a number in accord with an important experimental study and less than half that obtained using parametric methods.

I. Introduction

A major health care policy issue in the United States today is the growing population without insurance. The key questions are: How does health insurance coverage affect the likelihood an individual seeks medical care? and How does health insurance affect health care expenditures?

There are many empirical challenges in studying people's health care decisions. An individual's decision about whether to use health care may depend on his or her insurance coverage. The level of use likely also depends on whether the individual has insurance. However, because insurance is a choice variable for the individual, we must allow the possibility that this variable is endogenous. For example, people who have a greater need for health care have more incentive to buy health insurance. Some papers deal with this endogeneity by using instrumental variables (Vera-Hernandez, 1999; Holly, Gardiol, & Huguenin, 2002; Wooldridge, 2002); others use experimental data to avoid this problem (Manning et al., 1987; Newhouse & Insurance Experiment Group, 1993). However, instruments that are correlated with insurance coverage but not with use are difficult to find. Experimental data are scarce and often out of date. For example, the RAND Health Insurance Experiment, which remains the largest health policy study in U.S. history, started in 1971 and lasted for 15 years (RAND, 1974-1982). The structure, practice, and philosophy of medicine have changed dramatically since the 1980s, as has the insurance industry.

Another empirical challenge lies in expenditure decisions, where we observe positive expenditures only from individuals who decide to see a doctor. One standard parametric approach deals with this problem by making distributional assumptions about error terms and then using a Heckman correction for sample selection (Heckman, 1976, 1979). An alternative approach is to use a two-part model (Duan et al., 1983, 1984, 1985). Both of these may be problematic: the Heckman correction approach can be sensitive to the distributional assumptions on error terms, while the two-part model approach also makes implicit distributional assumptions (Puhani, 2000). The literature addressing health economics and economics in general does not provide a theoretical foundation or justification for these distributional assumptions. Moreover, if incorrect, they can result in incorrect inferences and policy conclusions with respect to health care decisions.

Yet another challenge is the complicated nature of the decision-making process. In health care, both the patient and the doctor are involved in making decisions. The patient decides whether to visit a doctor (or, more generally, a health care provider), and then the patient and doctor jointly decide what treatment the patient will have. These decisions are interrelated. Some papers deal with the two-part decision-making process in health care utilization (Newhouse, 1993; Mullahy, 1998), but none addresses the whole process of insurance choice, utilization, and expenditure level.

This paper contributes to the literature by taking into account the interrelated nature of health care decisions and using a semiparametric approach to address the empirical challenges. We study three health care decisions: insurance coverage, utilization, and the level of expenditures. Using the Medical Expenditure Panel Survey (MEPS) 2005 data, we formulate and estimate a model for these three health care decisions. Because there is not a strong justification for normality assumptions underlying a traditional parametric formulation, we employ a semiparametric approach in which these assumptions are not made. As an additional advantage to a semiparametric approach, since marginal effects in general will not be constant in nonlinear models, we will report the impact of changing a variable of interest at several different points in its distribution. The semiparametric approach will also allow greater flexibility in the pattern of these effects than in the parametric case. Nevertheless, as a convenient benchmark, we also estimate the model using a standard parametric approach.

While the focus of this paper is on health care decisions, the methods used would also apply to other endogenous treatment models. For example, in labor economics, a woman's fertility and marriage decisions, the decision to join the workforce, and wage level have a similar structure.

The paper is organized as follows. Section II introduces the model and explains the parametric and semiparametric approaches; section III describes the data set; section IV gives the main results; and section V provides conclusions, discussions, and future research directions.

Received for publication April 2, 2009. Revision accepted for publication May 2, 2011.
* Georgetown University.
I thank Roger Klein, Carolyn Moehling, John Landon-Lane, Francis Vella, the editor and the referees for all their helpful comments and suggestions. I also thank Louise Russell and Usha Sambamoorthi for helpful discussions. I have also benefited from comments at various seminars. All mistakes are mine.
A supplemental appendix is available online at http://www.mitpressjournals.org/doi/suppl/10.1162/REST_a_00232.
The Review of Economics and Statistics, March 2013, 95(1): 142-153

© 2013 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology

II. The Model

We study a set of three equations to examine the effects of different factors on health care decisions: health insurance, utilization, and expenditures. The first equation deals with the health insurance choice. Let $I$ be an indicator of whether an individual selects private health insurance coverage. In the model, an individual selects insurance if the net value to so doing, $V_I - \varepsilon_I$, is greater than 0. With $V_I$ determined by a set of exogenous variables $X_I$ and $1\{\cdot\}$ as an indicator function, the model is as follows:

$$I = 1\{V_I > \varepsilon_I\}, \quad \text{where } V_I = X_I \beta_I.$$

The second equation describes the decision to seek health care. Let $A$ be an indicator of whether an individual seeks access to health care from a doctor or other health care providers, and let $X_A$ be a set of exogenous variables that determine the net value of utilizing health care. Then:

$$A = 1\{V_A + I\theta_A > \varepsilon_A\}, \quad \text{where } V_A = X_A \beta_A.$$

Notice that the insurance coverage enters this utilization (access) equation. There is a vast literature about the effects of moral hazard and adverse selection (Arrow, 1963; Rothschild & Stiglitz, 1976; Chiappori & Salanie, 2001; Cardon & Hendel, 2001). On the one hand, people who have insurance are much more likely to use health care than their uninsured counterparts. On the other hand, people who have greater demand for health care (e.g., those with high comorbidity levels) may have more incentive to obtain insurance coverage. Consequently in our estimations, we will use methods that deal with this endogeneity issue.

The last equation explains the level of expenditures. Denote $Y_E$ as the log of level of expenditures and $X_E$ as a set of exogenous variables that affects expenditures for individuals who access health care services. Then the model is given as

$$Y_E = X_E \beta_E + I\theta_E + u : A = 1.$$

An individual incurs positive expenditures only if a visit is made. The patient decides whether to visit a doctor, and then a joint decision is made by both the doctor and the patient. We address this two-part decision-making process by separating the two equations and allowing them to have different explanatory variables and parameters. Again, insurance, health care, and the individual's health status are interrelated. Insurance coverage is included in this model because it may affect the patient's and doctor's joint decision about treatment plans. For example, insured people are much more likely to buy brand-name medications instead of their generic counterparts. There could also be an adverse selection problem here, because people who are less healthy might have more incentive to purchase insurance. Hence, our model will account for the interrelations between these variables and will employ estimation methods that deal with both sample selection and endogeneity issues.

To avoid making strong distributional assumptions that are hard to justify, in this paper we employ a semiparametric method to estimate the three health care equations discussed above. Indeed, we will find that standard parametric distributional assumptions (e.g., joint normality) do not hold. Nevertheless, as a convenient benchmark, we also provide the parametric formulation and results. There are many methods for estimating the parametric model. To make the role of the parametric assumptions transparent, we estimate the parametric model in a manner that parallels the semiparametric approach.

A. Parametric Model

In the parametric model, we assume that the error terms in the system of three equations follow a trivariate normal distribution. A two-step estimation method is then employed to estimate the three equations. In the first step, the insurance and utilization decisions are jointly estimated by maximum likelihood (bivariate probit).

To identify the parameters without relying on nonlinearities, we require restrictions on the model. The insurance equation will depend on only exogenous variables, $X_I$, while the access decision will depend on exogenous variables, $X_A$, and whether the individual has insurance. In this triangular system of binary equations, the insurance equation is identified, as it is essentially a reduced form. However, to identify the access equation, we impose exclusion restrictions on it (we discuss these in section III).

In addition to the parameters in the joint model for the two decisions, the likelihood depends on the correlation between the errors. A nonzero correlation between the two error terms would indicate the endogeneity of insurance with respect to the utilization decision. As will be described below, we find this correlation to be small in absolute magnitude and not statistically different from 0.

In the second step, we estimate the expenditure equation by employing a Heckman correction (Heckman, 1976; Lee, 1982) that controls for both sample selection and endogeneity. To simplify this correction, we employ a form for it that is applicable when, as was found empirically, utilization and insurance errors are not correlated.1 For individuals who use health care, recall the form of the expenditure model in the previous section. With $u$ as the error term in the log expenditure model and denoting $X_S$ as the set of all the exogenous variables in the system of three equations, for $d \in \{1,0\}$, define

$$\lambda_d G_d(V_A, V_I) = E(u \mid X_S, A = 1, I = d).$$

1 As discussed in the next section, in a semiparametric formulation, we will not need to make any assumptions on the functional form of this correction factor.

In a parametric model with jointly normal errors, the $G$ functions above are known and the $\lambda$s are parameters whose values are unknown. Typically the above expectations are not

0 and depend on the variables $X_S$, $A$, and $I$. To estimate our model, we seek to remove the dependence of the errors on these conditioning variables. To this end, for $d \in \{1,0\}$, we can rewrite the expenditure equation as

$$Y_E = X_E \beta_E + \theta_E d + \lambda_d G_d(V_A, V_I) + u_d^* : A = 1,\ I = d,\quad u_d^* = u - \lambda_d G_d(V_A, V_I).$$

By construction, the conditional expectation of the recentered error is 0: $E(u_d^* \mid X_S, A = 1, I = d) = 0$.

Provided that the above equation is identified and joint normality holds, OLS estimation provides consistent estimates. To identify it without relying on nonlinearities in the $G$ controls, we impose exclusion restrictions on the exogenous variables $X_E$ that enter this equation. (Detailed discussions about these and other restrictions are provided in section III.)

We conclude this discussion about the parametric model by emphasizing the importance of its restrictive parametric assumptions. Both the bivariate probit specification and the form of the correction term depend on the (joint) normality assumption. If this assumption is incorrectly imposed, the resulting estimator is typically inconsistent. In the next section, we propose a semiparametric approach that does not make distributional assumptions.

B. Semiparametric Model

While the semiparametric model generalizes the parametric model, it does retain a parametric (index) restriction to ensure that the estimator "works well" in moderately sized samples. To illustrate this restriction, return to the insurance model. In a commonly employed probit specification,

$$P(I = 1 \mid X) = \Phi(X_I \beta_I),$$

where the function $\Phi$ is the cumulative distribution function for the model's standard normal error component, $\varepsilon_I$. In a semiparametric formulation, this function need not be specified and indeed can be estimated from the data along with parameters of interest. In such a formulation, the model is semiparametric because it makes no parametric assumptions on the error distribution but does assume a parametric index, $V_I = X_I \beta_I$. This index, $V_I$, need not be linear, but it is important that it has a parametric form. In a more general nonparametric formulation, we might write

$$P(I = 1 \mid X) = F(X_1, X_2, \ldots, X_k) = E(I \mid X).$$

However, when the dimension of $X$ is large, it is difficult to "reliably" estimate the above probability (expectation).2 Index restrictions serve to keep the relevant dimension of the problem small and thereby improve the finite sample behavior of the estimator. In general, a single index restriction takes the form

$$E(I \mid X) = E(I \mid V_I) = F_1(V_I).$$

In this form, not only is the function $F_1$ left unspecified but the model also permits very flexible interactions between errors and the index.

2 If $X$ is continuous, then the convergence rate of the estimated expectation to the truth becomes slower as the dimension of $X$ increases. If $X$ is discrete, there may be few observations to estimate $E(Y \mid X)$ at each value of $X$.

In some problems, a single index may not adequately describe the underlying behavior of interest. Given that the access model is not linear, when insurance is endogenous with respect to access, the access probability depends not only on its own index but also on the exogenous index driving the insurance decision. In this case, a double index model would be appropriate:

$$E(Y \mid X) = E(Y \mid V_I, V_A) = F_2(V_I, V_A),$$

where $V_I$, $V_A$ are now two indices. Again, there are methods for reliably estimating the above expectation under this double index structure. As discussed below, estimators for both single and double index models will be employed here. Throughout, we use the notation $\hat{E}(Y \mid V)$ to denote an estimated conditional expectation for $Y$ conditioned on $V$, where $V$ may be a single index or a vector containing two indices. When this estimated expectation is evaluated at an estimate of $V$, as we do below, we will write $\hat{E}(Y \mid \hat{V})$.

Before continuing, it is important to discuss identification of both index parameters and marginal effects of interest. Recall that in the parametric case, the original parameters are identified under exclusion restrictions. However, as in all nonlinear models, parameters do not translate directly into marginal effects, which are of primary interest. Marginal effects are recovered by comparing estimated probabilities based on parametric distributional assumptions. In the semiparametric case, however, it is well known in the literature that the index parameters can at most be identified up to location and scale. For simplicity, we illustrate the issue for the insurance decision. As will be discussed below, the estimates are based in part on an estimate of the probability:

$$\Pr(I = 1 \mid X_I \beta_I) = \Pr(I = 1 \mid a + b(X_I \beta_I)),$$

where $a$ and $b \neq 0$ are constants.

The probability does not depend on $a$ or $b$. Therefore, only ratios of index parameters are identified. Nevertheless, the scaled parameters enable us to recover probabilities and, hence, marginal effects of interest.

Unlike the insurance decision, the utilization decision depends on the endogenous insurance decision with coefficient $\theta_A$. Although this parameter is not identified, we can recover the corresponding marginal effect by looking at an appropriate probability change. One possibility is to report the difference in access probabilities conditioned on insurance and no insurance, which are estimable semiparametrically, as we discuss below. It is easy to justify this calculation if the insurance error does not depend on the

access error. Because the estimated parametric correlation between access and insurance errors is insignificant and small in absolute magnitude, this calculation may be reasonable and is reported here. It is also possible to estimate the marginal effect of insurance when access and insurance errors are dependent. Vytlacil and Yildiz (2007) discuss identification in this context.3

3 Klein, Shen, and Vella (2012) develop an adaptive estimator for this impact.

For the expenditure equation, subject to the exclusion restrictions made here, all parameters are directly identified other than the coefficient on insurance, $\theta_E$. This parameter is a direct marginal effect of interest and is an important parameter in the model. Since there is no evidence that insurance can be treated as exogenous in the expenditure equation, we discuss an estimation method for recovering this parameter.

Turning to the estimation method, in the insurance and utilization decisions, we estimate the model by a method that is analogous to that for the parametric case. For that case, the form of the likelihood is known and the model is estimated by maximum likelihood. In contrast, here we do not make any distributional assumptions on error components, implying that the form of the likelihood is unknown. Nevertheless, it is possible to employ index assumptions above to develop an estimator for the likelihood.

We employ an estimator based on an extension of the approach in Klein and Shen (2010), where a bias correction mechanism was proposed to overcome finite sample performance issues of common semiparametric estimators in the literature. Monte Carlo studies in that paper show that this estimator dominates the others in terms of mean squared error. One component of the model below contains a triangular system of binary response equations. Klein, Shen, and Vella (2010; hereafter KSV) extend the bias-control mechanism discussed above to establish desirable large-sample properties for the estimator of this component. The estimator for these components of the model is then based on maximizing an estimated log likelihood. To define this function, for individual $i$ and $r, s \in \{0,1\}$, let

$$Y_{rs}(i) = 1\{A(i) = r,\ I(i) = s\},$$

with the corresponding probabilities:

$$P_{rs}(i) = \Pr(Y_{rs}(i) = 1 \mid V_A(i), V_I(i)).$$

Suppressing individual subscripts for notational simplicity, for $r = 1$ and $s = 1$ (other cases are analogous), notice that

$$\begin{aligned}
P_{11} &= \Pr(Y_{11} = 1 \mid V_A, V_I) = \Pr(A = 1, I = 1 \mid V_A, V_I) \\
&= \Pr(A = 1 \mid I = 1, V_A, V_I)\,\Pr(I = 1 \mid V_A, V_I) \\
&= \Pr(A = 1 \mid I = 1, V_A, V_I)\,\Pr(I = 1 \mid V_I) \\
&= E(A \mid I = 1, V_A, V_I)\,E(I \mid V_I).
\end{aligned}$$

Hence, $P_{11}$ can be estimated by estimating each of the above two expectations semiparametrically. The first expectation over $A$ has a double index form, and the second one has a single index form. The product of the above expectations (probabilities) then provides the joint probability of interest.

In general double index models, identification requires that each index contains a continuous variable that is excluded from the other. Since one component (insurance) of the model has a single index form, it is not required here. We do require, however, that the insurance equation contains a continuous variable excluded from the access equation and that the access equation depends on at least one continuous variable.4 In addition to these continuity restrictions, we require and impose the same exclusion restrictions discussed in the previous section for the parametric model.

4 In the insurance model, we treat the following variables: age, age$^2$, number of comorbidities, years of education, family size, and industry insurance rate as being approximately continuous; in the access decision, these variables are age, age$^2$, number of comorbidities, years of education, and family size.

Given the estimated probabilities, we can now proceed as in Klein and Spady (1993) to estimate the model by maximizing the following estimated log likelihood:

$$\text{Log}L = \sum_i \sum_{r,s} Y_{rs}(i)\,\text{Ln}[\hat{P}_{rs}(i)].$$

When we assume that the above probabilities [...] and have a bivariate normal structure, the estimator [...] bivariate probit. By estimating the probabilities [...] assumptions as discussed above, we avoid assuming parametric functional forms.5 Employing bias control[...] kernels, KSV show that this estimator is consistent and asymptotically distributed as normal (see the appendix).

5 For technical reasons, and as is standard in this literature, we trim out certain observations for which the probabilities are poorly estimated (see the appendix).

Turning to the expenditure equation, we again [...] correction term that will enable us to deal with the sample selection and endogeneity problems. As above, with $V_0$ referring to $(V_A, V_I)$, consider the control function,

$$G_d(V_0) = E(u \mid A = 1, I = d, X_S) = E(u \mid A = 1, I = d, V_0),$$

where $d \in \{1,0\}$. Notice that this adjustment is similar to that in the parametric case, but now we do not make any assumptions on its functional form here in the semiparametric formulation. With $c$ as a constant, let $X_E \beta_E = X_c \beta_c + c$, and rewrite the expenditure equation as

$$Y_E = X_c \beta_c + c + \theta_E d + G_d(V_0) + u_d^* : A = 1,\ I = d,\quad u_d^* = u - G_d(V_0),$$
$$E[u_d^* \mid A = 1, I = d, X_S] = E[u_d^* \mid A = 1, I = d, V_0] = 0.$$
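The estimated log likelihood defined in this section can be illustrated with a short sketch. The code below is only a minimal illustration under stated assumptions: it uses plain leave-one-out Nadaraya-Watson regressions with a Gaussian product kernel and a single fixed bandwidth `h`, not the bias-controlling kernels or the trimming rule of KSV, and the function names (`nw_mean`, `estimated_loglik`), the bandwidth, the clipping constant, and the simulated data are all hypothetical choices that do not come from the paper. It uses the factorization of the joint probability into a double index expectation for access and a single index expectation for insurance.

```python
import numpy as np

def nw_mean(y, V, h):
    """Leave-one-out Nadaraya-Watson estimate of E(y | V) at each sample point."""
    y = np.asarray(y, dtype=float)
    V = np.asarray(V, dtype=float).reshape(len(y), -1)   # (n, k) matrix of indices
    diff = (V[:, None, :] - V[None, :, :]) / h            # pairwise scaled differences
    K = np.exp(-0.5 * (diff ** 2).sum(axis=2))            # Gaussian product kernel weights
    np.fill_diagonal(K, 0.0)                              # leave each point's own observation out
    return (K @ y) / np.clip(K.sum(axis=1), 1e-12, None)

def estimated_loglik(A, I, VA, VI, h=0.3, eps=1e-6):
    """Sum over i of ln P_hat_rs(i) for the observed cell (r, s) = (A_i, I_i)."""
    A, I = np.asarray(A, dtype=float), np.asarray(I, dtype=float)
    VA, VI = np.asarray(VA, dtype=float), np.asarray(VI, dtype=float)
    p_I = nw_mean(I, VI, h)                               # E(I | V_I): single index
    p_A = np.empty_like(A)
    for s in (0.0, 1.0):                                  # E(A | I = s, V_A, V_I): double index
        m = I == s                                        # assumes both insurance groups are non-empty
        p_A[m] = nw_mean(A[m], np.column_stack([VA[m], VI[m]]), h)
    p_I = np.clip(p_I, eps, 1 - eps)                      # crude guard; not the paper's trimming
    p_A = np.clip(p_A, eps, 1 - eps)
    pr_I = np.where(I == 1, p_I, 1 - p_I)                 # Pr(I = s_i | V_I)
    pr_A = np.where(A == 1, p_A, 1 - p_A)                 # Pr(A = r_i | I = s_i, V_A, V_I)
    return float(np.sum(np.log(pr_A) + np.log(pr_I)))

# Hypothetical simulated data (independent normal errors, chosen only for illustration).
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 3))
VI = X[:, 0] + X[:, 1] + X[:, 2]                          # insurance index; X[:, 2] excluded from access
VA = X[:, 0] + X[:, 1]                                    # access index
I = (VI > rng.normal(size=n)).astype(float)
A = (VA + 2.0 * I > rng.normal(size=n)).astype(float)

ll = estimated_loglik(A, I, VA, VI)
print(f"estimated log likelihood: {ll:.2f}")
```

In an actual estimation the two indices would be functions of the index parameters, with a location and scale normalization since only ratios of index parameters are identified, and a function of this kind would be maximized over those parameters.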

Since the control functions are unknown, we extend Robinson's differencing method (Robinson, 1988) to eliminate the unknown control functions:

$$Y_E - E(Y_E \mid A = 1, I = d, V_0) = [X_c - E(X_c \mid A = 1, I = d, V_0)]\beta_c + u^*.$$

With * denoting a differenced variable, we can rewrite the above equation as

$$Y^* = X^* \beta_c + u^*.$$

Before proceeding to estimate the above differenced model, several identification issues need to be discussed. First, it is clear that the constant term and the insurance variable disappear from the model. Second, as in the parametric model, we require additional identifying restrictions. To this end, we impose the same exclusion restrictions as in the parametric model discussed above. To see that these restrictions are needed, suppose that there are no variables excluded from $X_c$ that appear in the indices $V_I$ and $V_A$. Without such restrictions, it will be possible to take linear combinations of the $X_c$ variables and reproduce one of the indices.

Replacing true expectations and index parameter values with their estimates, we first use OLS to estimate the expenditure equation and get consistent estimates and residuals. Second, employing squared residuals, in a semiparametric regression, we estimate the variance for the error conditioned on the $X$ variables through the two indices. We then employ these conditional variances in a GLS approach to obtain the final results. Notice that the GLS estimator deals with the heteroskedasticity but not the first-stage estimation uncertainty. This uncertainty comes from the fact that estimated expectations are employed in place of true expectations and estimated index parameters are substituted for the true ones. It can be shown that the estimated expectations may be taken as known and do not affect standard errors for the estimated expenditure parameters, while the uncertainty from estimated index parameters must be taken into account. In particular, as in standard parametric sample selection models, the covariance matrix for these second-stage expenditure estimates will depend on the covariance matrix for the first stage, joint-binary estimates. The reported standard errors here appropriately reflect this dependence (see the appendix).

Notice that in the above approach we cannot directly estimate the impact of insurance coverage on expenditures ($\theta_E$). Therefore, we next describe a strategy for indirectly obtaining this marginal effect. Having described an estimator for the coefficient on $X_c$ above, define residual expenditures:

$$R = Y_E - X_c \beta_c = c + \theta_E I + u.$$

Heckman (1990) developed a method for estimating the constant term in a semiparametric sample selection model, which can be applied if we did not have the endogenous insurance variable. Andrews and Schafgans (1998; hereafter A&S) subsequently established the large-sample properties of a variant of this estimator. We extend this method to estimate both the constant term and the marginal effect of the endogenous insurance.

To set the intuition for the proposed estimator, notice that if there were no selection issues, we could proceed to develop an IV estimator for these parameters. To deal with selection, with $P_A = \Pr(A = 1 \mid V_0)$, for the error in the expenditure equation,

$$E(u \mid V_0) = 0 = P_A\,E(u \mid A = 1, V_0) + (1 - P_A)\,E(u \mid A = 0, V_0).$$

For such individuals with an access probability of 1 ($P_A = 1$), there would not be a selection problem in that from above,

$$E(u \mid A = 1, V_0) = E(u \mid V_0) = 0,$$

and we could proceed with IV estimation, employing

$$E(I \mid A = 1, V_0) = \Pr(I = 1 \mid A = 1, V_0)$$

as an instrument for $I$. Two implementation issues now need to be solved. First, because the above probability is unknown, we require a semiparametric estimate of this function as a feasible instrument. Second, we need an appropriate definition of a high-access probability.

With $a > 0$, define a high-probability set as one for which $P_A > 1 - N^{-a}$. In implementing this rule, we use estimated semiparametric probabilities described in the appendix.6 In setting $a$, as in A&S, there is a bias-variance trade-off that guides its selection. If $a$ is set very high, then the bias will be very low. However, the sample size available for IV estimation on the high-probability set will then effectively be very small, resulting in a high variance. Similarly, if $a$ is set too low, the variance can be made small, but the bias will not vanish sufficiently fast. To set $a$, let $S$ be a smoothed indicator of the form in A&S that is 0 unless observations are in the high-probability set. Then that paper shows that the bias will vanish appropriately fast if

$$B = N^{1/2}\,\left| E[uAS] \big/ \sqrt{E(AS^2)} \right| \to 0.$$

6 Tail assumptions similar to A&S enable us to keep index density denominators from being too small while remaining in a high probability set. The appendix develops an appropriate trimming strategy.

The value of $a$ must be set large enough so that this bias factor converges to 0 but small enough to keep the variance of the estimator low. Letting $Z^*$ be the instrument with its mean removed, Klein, Shen, and Vella (2012) show that the following similar bound holds:

$$B = N^{1/2}\,\left| E[uASZ^*] \big/ \sqrt{E(AS^2 Z^{*2})} \right| \to 0.$$

The choice of $a$ is dictated by the same considerations as in A&S. To set $a$ in this application and in the Monte Carlo experiment described below, we employ an upper bound for

$B$ that can be estimated.7 To balance bias and variance, we then select the smallest value of $a$ such that this bound tends to 0. ($a$ is approximately .4 in this application.)

To get some sense as to how well the method described above works in practice, we conduct a small-scale Monte Carlo experiment where we find that this method performs very well. We generate data from the following design, which has the same structure as our model:

$$I = 1\{V_I > \varepsilon_I\} : V_I = X_1 + X_2 + X_3 + 1,$$
$$A = 1\{V_A + 2I > \varepsilon_A\} : V_A = X_1 + X_2,$$
$$Y_E = 4I + 2X_1 + 1 + u : A = 1,$$

where the $X$s are all distributed as normal and the errors are jointly normal with nonzero correlations between them. The sample size we use is $n = 2000$, and the number of Monte Carlo replications is 1,000. As we can see, the true $\theta_E = 4$. The average $\hat{\theta}_E$ from the Monte Carlo is 4.02, and the standard deviation is 0.14. In other words, the percentage bias is almost 0, and the variance is also small, taking into account that the truth is 4.

III. Data

The Medical Expenditure Panel Survey (MEPS) is an ongoing nationally representative survey of the U.S. civilian noninstitutionalized population started in 1996 by the U.S. Department of Health and Human Services. Surveys of households, employers, and medical providers are conducted to collect information on health care expenditures and health insurance coverage, as well as demographic and socioeconomic characteristics.8

We consider the subsample of obese adults between the ages of 22 and 64 who are employed. People who have a body mass index (BMI) greater than 30 are considered obese (Centers for Disease Control and Prevention, 1985-2007). We focus on the obese population because this is a growing population that might have different health care needs and patterns than other groups do. We also focus on individuals who are employed, because in the United States, insurance is often linked with employment. In fact, health insurance plans are often offered by employers. We exclude individuals who have public insurance, because having public insurance is not expected to be a consumer's choice for working adults between the ages of 22 and 64. The final sample consists of 2,771 individuals.9

The key endogenous variables that we seek to explain are insurance coverage, utilization of the health care system, and the level of expenditures. The insurance variable here is an indicator of whether the individual has private health insurance coverage. It would be important to take the generosity of insurance plans into account in terms of copayments and deductibles. However, such information is not available in the MEPS data used here. The expenditures are the total amount paid for health care services, including both out-of-pocket payments and payments by insurance but not including payments for over-the-counter drugs. Note that the expenditures are derived from the MEPS Household and Medical Provider Components. Since both the health care providers and the consumers are surveyed, these data are more reliable than those from typical surveys. We define utilization of the health care system as having positive health care expenditures.10

The explanatory variables are demographics, socioeconomic status, and health-related characteristics. The demographics are age, gender, race/ethnicity (white, nonwhite), marital status (married, other), family size, and region (Northeast, Midwest, South, West). Years of education, income, occupation class, and industry insurance rates are included as socioeconomic characteristics. We use an indicator for white-collar jobs (professional, management, business, and financial operations) to reflect the impact of occupation and the percentage of people having insurance in each industry in the Kaiser study as a variable to reflect the impact of industry (Kaiser Family Foundation, 2006). The health-related characteristics are number of comorbidities, presence of mental illnesses, and whether they are current smokers. All individuals were asked whether they had any of a number of conditions. The comorbidity variable then counts the following health problems: Alzheimer's disease, asthma, arthritis, cancer, emphysema, diabetes, heart disease, high blood pressure, osteoarthritis, and stroke. This variable is included to capture differences in people's physical health status and is often employed in health studies (Klabunde et al., 2000). Presence of mental illnesses is an indicator of whether an individual has depression, anxiety, or schizophrenia.

Recalling the exclusion restrictions discussed in the previous section, we use the following restrictions in this paper. The industry insurance rate and occupation are excluded from both utilization and expenditure equations, and marital status and region are excluded from the expenditure equation. As is known in the literature, occupation and industry have important effects on people's insurance (Kaiser Family Foundation, 2006). In the United States, insurance plans for

7 Assume that the error, u, has finite r absolute moments. Then, Klein et al.
working adults often come as part of the compensation pack
(2012) show that an upper bound on the bias is given by
age. Different jobs might offer varied choices of insurance
B = Ar1/2|Ara(r-1,/r£ [ASZ*] I^E(AS2Z>2)\. packages at different prices. Hence it affects the insurance
decision by affecting the cost of buying insurance. However,
In our application, we replace the expectations with sample averages.
8 We note that the semiparametric model can be less sensitive to report
ing errors than parametric models (Hausman, Abrevaya, & Scott-Morton,
1998). 10 We use this indicator instead of the self-reported health care utiliza
9 Other exclusion criteria include individuals who died during the year and tion because the self-reported utilization may suffer from recall errors,
missing values on the exogenous variables used. Various robustness checks whereas the expenditure data were collected by both sides and hence are
indicate that there are no selection issues in this sample. more reliable.

once the insurance coverage decision is made, it is plausible to assume that the industry insurance rate and occupation class would not affect the benefit or the cost of utilization and expenditures after controlling for income and education. Married people might have more incentive to obtain insurance coverage. Different regions might have different health care policies and plans, as well as different availabilities of health care services.11 Consequently region may also have an impact on insurance. However, recall that the patient makes decisions about insurance and utilization, while the doctor and patient jointly decide on the level of treatment, with the doctor being the main decision maker. Once a patient decides to visit a health care provider, we assume that the prescribed treatment does not depend on marriage or region. Hence, the level of expenditures may not depend on these variables. We recognize the difficulty in finding appropriate restrictions for the type of model that we estimate but view the exclusion restrictions discussed above as being plausible.

Some summary statistics of the data are provided in table 1. Note that the continuous variables are categorized into groups to show the distribution of those variables. However, they remain continuous in estimating the model. Of the 2,771 individuals in our data set, 488 (18%) are uninsured and 262 (10%) have no utilization. The level of expenditures for those who use health care is very skewed. About 40% of them have expenditures of less than $1,000, while 8% of them incur more than $10,000 in health care expenditures.

Table 1.—Description of Study Population

                                  N        %
All                               2,771    100.0
Insurance coverage
  Insured                         2,283    82.4
  Uninsured                       488      17.6
Utilization
  Yes                             2,509    90.5
  No                              262      9.5
Expenditures
  No expenditures                 262      9.5
  Less than $1,000                990      35.7
  $1,000-$2,000                   443      16.0
  $2,000-$5,000                   607      21.9
  $5,000-$10,000                  265      9.6
  Over $10,000                    204      7.4
Education
  Less than high school           471      17.0
  High school                     946      34.1
  College or higher               1,354    48.9
Age
  Below 40                        1,022    36.9
  40-49                           856      30.9
  50 and over                     893      32.2
Income
  Less than $20,000               781      28.2
  $20,000-$30,000                 569      20.5
  $30,000-$50,000                 794      28.7
  Over $50,000                    627      22.6
Gender
  Female                          1,460    52.7
  Male                            1,311    47.3
Race
  White                           1,551    56.0
  Nonwhite                        1,220    44.0
Number of comorbidities
  0                               1,352    48.8
  1                               881      31.8
  2 or more                       538      19.4
Mental illness
  Yes                             540      19.5
  No                              2,231    80.5
Current smoker
  Yes                             542      19.6
  No                              2,229    80.4
Marital status
  Married                         1,714    61.9
  Other                           1,057    38.1
Family size
  1-2                             1,219    44.0
  3-4                             1,069    38.6
  5 or more                       483      17.4
Region
  Northeast                       381      13.7
  Midwest                         611      22.0
  South                           1,206    43.5
  West                            573      20.7
Industry insurance rate
  Less than 75% insured           519      18.7
  75%-90% insured                 1,326    47.9
  Over 90% insured                926      33.4
Occupation
  White collar                    830      30.0
  Other                           1,941    70.0

IV. Results

Before we discuss the results, we recall that parameters are identified only up to location and scale in the semiparametric case. After estimation, we normalize the parameter of education to the corresponding parametric estimate for presentation purposes.12 We examine both parametric and semiparametric results for the three decisions. We compare not only the normalized estimates and average marginal effects but also patterns of marginal effects calculated at different levels of certain continuous variables of interest. Most of the normalized estimates and average marginal effects are close between the two approaches for insurance and utilization decisions. However, the two estimation methods yield very different estimated effects of insurance on expenditures. Furthermore, the semiparametric approach gives richer patterns of marginal effects. Detailed results are provided in tables 2 to 5.

11 No detailed information about state of residence is available in the MEPS data set.
12 The choice of variable on which to normalize does not affect estimation results (provided that the variable belongs in the model).

As shown in table 2, for the insurance decision, both the normalized estimates and the average marginal effects are similar for parametric and semiparametric approaches. The biggest marginal effect on the probability of having insurance comes from marital status, with the p-values of the coefficient on married in both approaches being less than 0.01: marriage increases the probability of having private insurance coverage by more than 7 percentage points. Region also has a significant effect on the insurance coverage, with the Northeast indicator having coefficient p-values of 0.06 and 0.02 in

Table 2.—Parametric and Semiparametric Estimation Results: Insurance Coverage

                          Parametric Estimation                                    Semiparametric Estimation
Variable                  Estimate (SE)           p-value  ME (percentage points)  Estimate (SE)           p-value  ME (percentage points)
Intercept                 -6.47 (0.57)            <.01
Age                       0.04 (0.02)             0.09     0.10                    -0.01 (0.02)            0.56     0.00
Age2                      -3.86E-04 (2.62E-04)    0.14                             1.65E-04 (2.91E-04)     0.57
Number of comorbidities   0.10 (0.04)             0.01     1.90                    0.12 (0.05)             0.01     1.84
Mental illnesses          0.14 (0.09)             0.12     2.68                    0.07 (0.09)             0.47     1.00
Female                    -0.01 (0.07)            0.83     -0.30                   0.09 (0.08)             0.25     1.42
White                     0.28 (0.07)             <.01     5.74                    0.27 (0.09)             <.01     4.32
Income                    0.29 (0.03)             <.01     0.58                    0.74 (0.13)             <.01     3.31
Current smoker            -0.07 (0.08)            0.36     -1.45                   -0.07 (0.09)            0.45     -1.03
Years of education        0.09 (0.01)             <.01     1.85                    0.09                             1.40
Married                   0.36 (0.07)             <.01     7.54                    0.45 (0.10)             <.01     7.22
Family size               0.01 (0.02)             0.72     0.16                    0.00 (0.02)             0.85     0.06
Region: Northeast         0.23 (0.12)             0.06     4.41                    0.00 (0.14)             0.02     4.83
Region: Midwest           0.12 (0.10)             0.23     2.44                    0.04 (0.10)             0.68     0.64
Region: South             -0.10 (0.08)            0.23     -2.03                   -0.03 (0.09)            0.75     -0.44
Industry insurance rate   2.61 (0.30)             <.01     2.54                    2.81 (0.56)             <.01     2.06
White collar              -0.04 (0.08)            0.61     0.88                    -0.12 (0.08)            0.13     -1.86

Estimate = parameter estimate; SE = standard error; ME (percentage points) = average marginal effect in percentage points. Expenditure and income are in $1,000 and are logged. Reference group for region = West. Marginal effects of continuous variables are calculated by moving everyone in the sample above by one unit, except income and industry insurance rate, which were moved by 10% and 5%, respectively. Marginal effects of discrete variables are calculated by moving everyone in the sample from 0 to 1.
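The note to table 2 describes the marginal effect of a discrete variable as the average change in the predicted probability when everyone in the sample is moved from 0 to 1. A minimal Python sketch of that calculation for a probit-type index follows. The simulated data, the coefficient vector `beta`, and the helper names `normal_cdf` and `ame_binary` are hypothetical and chosen only for illustration; they are not the paper's data, estimates, or code.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF, applied elementwise."""
    z = np.asarray(z, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def ame_binary(X, beta, j):
    """Average change in P(y = 1) when column j is moved from 0 to 1 for everyone."""
    X0 = X.copy()
    X1 = X.copy()
    X0[:, j] = 0.0
    X1[:, j] = 1.0
    return float(np.mean(normal_cdf(X1 @ beta) - normal_cdf(X0 @ beta)))

# Hypothetical sample: intercept, one continuous regressor, one binary regressor.
rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.integers(0, 2, size=n)])
beta = np.array([0.5, 0.3, 0.36])  # illustrative values only

effect_pp = 100.0 * ame_binary(X, beta, j=2)  # expressed in percentage points
```

Because the probability is nonlinear in the index, the effect differs across individuals; averaging over the sample is what turns it into a single reported number.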
Table 3.—Parametric and Semiparametric Estimation Results: Utilization

                          Parametric Estimation                                    Semiparametric Estimation
Variable                  Estimate (SE)           p-value  ME (percentage points)  Estimate (SE)           p-value  ME (percentage points)
Intercept                 0.03 (0.67)             0.97
Age                       -0.04 (0.03)            0.20     0.10                    0.01 (0.04)             0.86     0.12
Age2                      5.64E-04 (3.56E-04)     0.11                             1.25E-04 (4.81E-04)     0.79
Number of comorbidities   0.56 (0.07)             <.01     5.55                    0.55 (0.26)             0.03     3.39
Mental illnesses          0.31 (0.12)             0.01     3.69                    0.64 (0.33)             0.05     4.18
Female                    0.43 (0.08)             <.01     5.77                    0.51 (0.27)             0.05     3.52
White                     0.05 (0.09)             0.57     0.66                    -0.35 (0.22)            0.10     -2.27
Income                    -0.01 (0.04)            0.86     -0.01                   -0.28 (0.21)            0.19     -1.40
Current smoker            -0.10 (0.09)            0.28     -1.36                   0.13 (0.12)             0.27     0.88
Years of education        0.05 (0.02)             <.01     0.70                    0.05                             0.36
Married                   0.25 (0.09)             0.01     3.38                    0.36 (0.18)             0.05     2.40
Family size               -0.04 (0.03)            0.12     -0.55                   -0.10 (0.07)            0.15     -0.66
Region: Northeast         0.02 (0.14)             0.88     0.28                    0.00 (0.19)             0.41     -1.05
Region: Midwest           0.04 (0.12)             0.76     0.47                    0.19 (0.18)             0.27     1.31
Region: South             0.08 (0.10)             0.39     1.11                    0.21 (0.18)             0.22     1.44
Insurance coverage        0.88 (0.29)             <.01     15.50                                                    13.70
Correlation factor        -0.09 (0.17)            0.61

Estimate = parameter estimate; SE = standard error; ME (percentage points) = average marginal effect in percentage points. Expenditure and income are in $1,000 and are logged. Reference group for region = West. Marginal effects of continuous variables are calculated by moving everyone in the sample above by 1 unit, except income and industry insurance rate, which were moved by 10% and 5%, respectively. Marginal effects of discrete variables are calculated by moving everyone in the sample from 0 to 1.
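In table 3 the semiparametric coefficient on years of education equals the parametric one (0.05) and carries no standard error, which reflects the normalization described in the text: semiparametric index coefficients are identified only up to location and scale, so they are rescaled for presentation. A minimal sketch of such a rescaling is given below, assuming it amounts to multiplying all slope coefficients by one common factor. The raw coefficient values and the function name `normalize_to_reference` are hypothetical.

```python
import numpy as np

def normalize_to_reference(raw_coefs, ref_index, target):
    """Rescale all coefficients so that the reference coefficient equals `target`."""
    raw_coefs = np.asarray(raw_coefs, dtype=float)
    if raw_coefs[ref_index] == 0.0:
        raise ValueError("reference coefficient must be nonzero")
    scale = target / raw_coefs[ref_index]
    return raw_coefs * scale

# Hypothetical raw semiparametric slopes: [education, married, family size].
raw = np.array([1.0, 7.2, -2.0])
normalized = normalize_to_reference(raw, ref_index=0, target=0.05)
```

Ratios between coefficients are unchanged by this step, which is why the choice of reference variable should not matter for the substantive results as long as its coefficient is nonzero.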
Table 4.—Parametric and Semiparametric Estimation Results: Level of Expenditures

                                          Parametric Estimation                    Semiparametric Estimation
Variable                                  Estimate (SE)           p-value  ME (%)  Estimate (SE)           p-value  ME (%)
Intercept                                 5.00 (0.52)             <.01
Age                                       -1.23E-03 (0.02)        0.95     1.40    16E-04 (0.02)           0.99     1.00
Age2                                      1.72E-04 (2.23E-04)     0.45             1.15E-04 (2.36E-04)     0.63
Number of comorbidities                   0.40 (0.04)             <.01     40.40   0.35 (0.08)             <.01     34.96
Mental illnesses                          0.55 (0.07)             <.01     54.99   0.45 (0.10)             <.01     45.32
Female                                    0.15 (0.06)             0.01     15.45   0.08 (0.08)             0.33     8.06
White                                     0.25 (0.06)             <.01     25.17   0.36 (0.07)             <.01     36.37
Income                                    -0.01 (0.04)            0.78     -0.10   0.09 (0.05)             0.06     0.93
Current smoker                            -0.18 (0.07)            0.01     -17.73  -0.20 (0.07)            0.01     -19.75
Years of education                        0.03 (0.01)             0.01     3.28    0.03 (0.01)             0.02     3.23
Family size                               -0.04 (0.02)            0.05     -3.56   -0.03 (0.02)            0.13     -3.07
Insurance coverage                        1.25 (0.32)             <.01     124.85                                   47.91
Correction term with regard to visit      -0.05 (0.33)            0.89
Correction term with regard to insurance  -0.32 (0.16)            0.05

Estimate = parameter estimate; SE = standard error; ME (%) = average marginal effect in percentages. Expenditure and income are in $1,000 and are logged. Reference group for region = West. Marginal effects of continuous variables are calculated by moving everyone in the sample above by one unit, except income and industry insurance rate, which were moved by 10% and 5%, respectively. Marginal effects of discrete variables are calculated by moving everyone in the sample from 0 to 1.
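As a worked check on the insurance row of table 4, the two average marginal effects on expenditures can be compared directly. The calculation uses only the reported values; the labels $\mathrm{ME}_P$ (parametric) and $\mathrm{ME}_S$ (semiparametric) are introduced here for convenience.

```latex
\[
\frac{\mathrm{ME}_{P}}{\mathrm{ME}_{S}} = \frac{124.85}{47.91} \approx 2.61,
\qquad
\mathrm{ME}_{P} - \mathrm{ME}_{S} = 124.85 - 47.91 = 76.94 .
\]
```

The parametric estimate is thus roughly 2.6 times the semiparametric one, a gap of about 77 points in the percentage change of expenditures.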

Table 5.—Marginal Effects across the Distribution of Select Variables of Interest

                              ME on Insurance                 ME on Utilization
                              (percentage points)             (percentage points)
                              Parametric   Semiparametric     Parametric   Semiparametric
Education
  Less than high school       3.22         1.37               1.21         3.25
  High school                 2.00         1.00               0.58         2.81
  College or higher           1.05         1.11               0.38         2.96
Industry Insurance Rate
  Less than 75% insured       4.28         1.55               -            -
  75-90% insured              2.32         1.71               -            -
  More than 90% insured       1.41         1.37               -            -

ME on insurance (percentage poin[...] on utilization (percentage points) [...] effects of education are calculated [...] of industry insurance rate are calc[...]

parametric and sem[...] in the Northeast r[...] likely to have insu[...] West. White peopl[...] have insurance tha[...] Education and inco[...] impacts on insuranc[...] one of the exclusio[...] ance decision. Incr[...] percent increases th[...] than 2 percentage [...] marginally significa[...] a significant positiv[...] of comorbidities in[...] probability of havin[...] With respect to th[...] averaged marginal e[...] ric results for the utilization decision are also similar. These results are presented in table 3. One of the most important questions here is how insurance coverage affects utilization, and parametric and semiparametric estimations provide similar results. The average marginal effect of insurance coverage is 14 to 15 percentage points (the probabilities move from 78%-80% to 93%-94%), meaning that if we move everyone in the sample from uninsured to insured, the average gain in the probability of visiting a doctor is 14 to 15 percentage points, a large number. Both the number of comorbidities and the presence of mental illnesses have significant positive impacts on utilization (coefficient p-value < 0.05). For the number of comorbidities, parametric estimation gives a higher marginal effect of 6 percentage points compared to the 3 percentage points of the semiparametric approach, while for the presence of mental illnesses, both approaches give an average marginal effect of 4 percentage points. One interesting finding here is that females are much more likely to visit a doctor. Parametric and semiparametric estimations yield average marginal effects of 6 and 4 percentage points, respectively. Another interesting finding is that income does not have a significant impact on utilization once the insurance decision is fixed. Marital status, which is one of the exclusions, has a highly significant impact on utilization. Married people are 2 to 3 percentage points more likely to use health care. Region, as an additional exclusion, is marginally significant. Another important finding here is that the correlation factor in the parametric estimation is very small in absolute magnitude (-0.09), and it is statistically insignificant, with a p-value of 0.61. As discussed earlier, this finding has implications for the form of an adjustment factor in estimating the expenditure equation.

The final equation deals with the level of health care expenditures. Results are presented in table 4. Note that most of the marginal effects are the same as the coefficient estimates here. With the exception of the impact of insurance, estimates in the two approaches are similar. Both the number of comorbidities and the presence of mental illnesses have significant effects in this equation. Both have p-values of less than 0.01. Having one more physical disease can increase the level of expenditures by about 35% on average, while having a mental illness increases it by more than 45%.

The semiparametric approach estimates the marginal impact of insurance on expenditures to be 48%.13 This impact would seem to be credible as it is close to the number in a previous study by Newhouse and the Insurance Experiment Group (1993). Their study, based on the RAND Health Insurance Experiment, shows that mean predicted expenditure in the 0% coinsurance (free-care) plan is 46% higher than in the 95% coinsurance plan. We want to keep in mind that the relevance of the study may be lowered by the fact that it was done more than a decade ago and not all the insured people in the sample get free care.14 In contrast, parametric estimation gives a marginal effect of 125%. We note that there are many other parametric studies that have also found an insurance impact of this magnitude (Hadley & Holahan, 2003; Miller, Banthin, & Moeller, 2004). These studies treat insurance as exogenous and state that in so doing, the marginal impact of insurance has an upward bias. However, none of these studies has quantified the extent of this bias.

13 The 90% confidence interval for the marginal effect is approximately [.12, .84], which is based on the asymptotic distribution of the estimator as given in Klein, Shen, and Vella (2012).
14 The impact would be somewhat lower than 46% in comparison to plans with other coinsurance levels (not totally free), further supporting the finding that there is severe upward bias in the parametric estimate.

To understand the large difference between semiparametric and parametric results, we performed several checks. First, we examined the normality assumption in the insurance equation by using semiparametric methods to estimate the density of the error. In particular, we obtained the semiparametric estimate of the expectation of the insurance dummy conditioned on the index. In a traditional threshold-crossing model, this estimated expectation is the estimate of the distribution function for the error term. Taking a numerical derivative then produces its density. As shown in figure 1, the density estimator, which we recentered to have median zero, is remarkably nonnormal for the insurance error. It should be noted that

Figure 1.—Estimated Error Distribution in the Insurance Equation
[Plot not reproduced; the horizontal axis runs from -6 to 10.]
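The check behind figure 1 rests on a simple idea: in a threshold-crossing model, the expectation of the insurance dummy given the index is the distribution function of the error, and its numerical derivative is the error density. The Python sketch below illustrates this on simulated data. It substitutes a plain Nadaraya-Watson kernel regression for the paper's semiparametric estimator, draws a normal error only so that the correct answer is known, and omits the recentering to median zero; the function name `conditional_mean`, the bandwidth, and the grid are all assumptions made for illustration.

```python
import numpy as np

def conditional_mean(index, outcome, grid, bandwidth):
    """Nadaraya-Watson estimate of E[outcome | index = v] at each grid point v."""
    u = (grid[:, None] - index[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)  # Gaussian kernel; the constant cancels in the ratio
    return (weights @ outcome) / weights.sum(axis=1)

# Simulated threshold-crossing model: insured = 1{index > error}.
rng = np.random.default_rng(0)
n = 20000
index = rng.normal(scale=2.0, size=n)
error = rng.normal(size=n)  # normal here only so the truth is known
insured = (index > error).astype(float)

grid = np.linspace(-2.0, 2.0, 41)
cdf_hat = conditional_mean(index, insured, grid, bandwidth=0.3)  # estimated error CDF
density_hat = np.gradient(cdf_hat, grid)  # numerical derivative gives the density
```

With real data one would compare `density_hat` to a normal density with matching scale; a markedly different shape would point against the normality assumption.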
other components of the [...] depend on the insurance d[...] tion errors in the insuranc[...] these other components of [...]

To evaluate the implicatio[...] assumptions not holding, w[...] iment. Recall that in the pa[...] that control for selection a[...] normality. Given the fail[...] assumptions to hold, it wou[...] functions are incorrect. Ac[...] estimated these functions w[...] on their functional form [...] estimates of the expenditu[...] ferencing out the G-functi[...] However, once all of the pa[...] have been estimated, it [...] metric estimates of these [...] indicating a semiparametric [...]

Gds = Ê[Ye - Zcßs - cs - [...]

Replacing the parametric G [...] mated semiparametric func[...] the parametric expenditure [...] insurance was found to be [...] that the parametric margin[...] factor of more than 2.

Table 5 shows that para[...] approaches also give very dif[...] ent population groups. Here [...] the marginal effects of the [...] semiparametri[...] hence can pro[...] estimation, th[...] for the three [...] some college o[...] centage point[...] strong monoto[...] the largest mar[...] population, an[...] marginal effec[...] of 1.0 to 1.1 p[...] for the margi[...] metric estimat[...] points, 0.58 p[...] respectively, f[...] mation sugges[...] the three gro[...] points, and 2.9[...] it is importan[...] probably mor[...] than a high sc[...] concerns the i[...] case, the bigge[...] pared to 1.55 [...] is in the midd[...] 75% to 90%. Th[...] marginal bene[...] contrast, in th[...] monotonic.

The above results are based on the sample, including people who have only outpatient use and those with inpatient use. The RAND experimental study notes that the distribution of

medical expenditures differs for these two groups. To address the issue of whether the model differs for these groups, we reestimated the model for individuals with only outpatient use and found that the results are similar. Detailed results are available on request.

V. Conclusion

This paper studies the determinants of three health care decisions: insurance, utilization, and expenditures. We study the above interrelated health care decisions by analyzing a system of three simultaneous equations. Both parametric and semiparametric methods are employed to estimate the model. The merit of our semiparametric approach compared to a parametric approach is that it avoids distributional and functional form assumptions, which are not well justified. Indeed, although there are many similarities, parametric and semiparametric approaches yield some very different results, which can lead to different policy implications.

In summary, we find that insurance has a substantial effect on both utilization and expenditures. Both methods suggest that having private insurance coverage increases the likelihood of seeking health care by about 15 percentage points. However, the estimated magnitude of the effect on expenditure diverges. The parametric estimation predicts the level of expenditures to increase by 125% if universal insurance is given, while semiparametric estimation predicts an increase of 48%, a number close to that found in a RAND experimental study (Newhouse and Insurance Experiment Group, 1993). Because the parametric assumptions are incorrect, the parametrically estimated impact of insurance on expenditures has an upward bias on the order of 100%. The policy relevance of this finding is that the cost of extending universal health care is much lower than predicted by traditional parametric methods.

Other marginal effects are also worth noting. Education is an important factor in every health care decision, and hence improving health literacy is an important issue in the obese population. Given the pattern in the marginal effects, parametric results suggest that most, if not all, of the emphasis should be placed on improving health literacy of the low education group (below high school). In contrast, results from the semiparametric case suggest that it is important to improve health literacy among all education groups (with the low group somewhat favored). Finally, both physical and mental illnesses increase expenditures dramatically. Physical illnesses increase the level of expenditures by about 35%, and mental illnesses increase it even more (more than 45%). This suggests that the obese population with physical and mental illnesses is very challenging. More prevention and treatment of physical and mental illnesses should be provided to this population.

There are some limitations and consequently some future research directions that we want to point out. First, this study is based on the obese population. It would be interesting to investigate the magnitude of marginal effects for different BMI categories. Second, we do not have information to distinguish the type of health care encounters, for example, whether it is a preventive checkup with a physician or an acute episode of some disease. It would be useful to distinguish different types of health care use, so that we can study the effects on different types of health care. Third, since this is a cross-sectional data set, we do not know the temporal effects. It would be interesting to know, for example, how the use of preventive care in the previous periods affects inpatient care use subsequently.

REFERENCES

Andrews, Donald W. K., and Marcia M. A. Schafgans, "Semiparametric Estimation of the Intercept of a Sample Selection Model," Review of Economic Studies 65:3 (1998), 497-517.
Arrow, Kenneth J., "Uncertainty and the Welfare Economics of Medical Care," American Economic Review 53 (1963), 941-973.
Cardon, James H., and Igal Hendel, "Asymmetric Information in Health Insurance: Evidence from the National Medical Expenditure Survey," RAND Journal of Economics 32:3 (2001), 408-427.
Centers for Disease Control and Prevention, Behavioral Risk Factor Surveillance System Survey Data (Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 1985-2007).
Chiappori, Pierre-André, and Bernard Salanie, "Testing for Asymmetric Information in Insurance Markets," Journal of Political Economy 108:1 (2001), 56-78.
Duan, Naihua, Willard G. Manning, Carl N. Morris, and Joseph P. Newhouse, "A Comparison of Alternative Models for the Demand for Medical Care," Journal of Business and Economic Statistics 1 (1983), 115-126.
Duan, Naihua, Willard G. Manning, Carl N. Morris, and Joseph P. Newhouse, "Choosing Between the Sample-Selection Model and the Multi-Part Model," Journal of Business and Economic Statistics 2 (1984), 283-289.
Duan, Naihua, Willard G. Manning, Carl N. Morris, and Joseph P. Newhouse, "Comments on Selectivity Bias," Advances in Health Economics and Health Services Research 6 (1985), 19-24.
Hadley, Jack, and John Holahan, "Covering the Uninsured: How Much Would It Cost?" Health Affairs (2003), W3-250-265.
Hausman, Jerry A., Jason Abrevaya, and Fiona M. Scott-Morton, "Misclassification of the Dependent Variable in a Discrete Response Setting," Journal of Econometrics 87 (1998), 239-269.
Heckman, James J., "The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models," Annals of Economic and Social Measurement 5:4 (1976), 475-492.
Heckman, James J., "Sample Selection Bias as a Specification Error," Econometrica 47:1 (1979), 153-161.
Heckman, James J., "Varieties of Selection Bias," American Economic Review 80 (1990), 313-318.
Holly, Alberto, Lucien Gardiol, and Jacques Huguenin, "Hospital Services Utilization in Switzerland: The Role of Supplementary Insurance," Institute of Health Economics and Management, University of Lausanne, manuscript (2002).
Kaiser Family Foundation, Kaiser Fast Facts, Health Insurance Coverage in America (2006).
Klabunde, Carrie N., Arnold L. Potosky, Julie M. Legler, and Joan L. Warren, "Development of a Comorbidity Index Using Physician Claims," Journal of Clinical Epidemiology 53 (2000), 1258-1267.
Klein, Roger W., and Chan Shen, "Bias Corrections in Testing and Estimating Semiparametric, Single Index Models," Econometric Theory 26 (2010), 1683-1718.
Klein, Roger W., Chan Shen, and Francis G. Vella, "Triangular Semiparametric Models Featuring Two Dependent Endogenous Binary Outcomes," unpublished manuscript (2010).
Klein, Roger W., Chan Shen, and Francis G. Vella, "Semiparametric Selection Models with Binary Outcomes," unpublished manuscript (2012).
Klein, Roger W., and Richard H. Spady, "An Efficient Semiparametric Estimator for Binary Response Models," Econometrica 61 (1993), 387-421.

Lee, Lung-Fei, "Some Approaches to the Correction of Selectivity Bias," Review of Economic Studies 49:3 (1982), 355-372.
Manning, Willard G., Joseph P. Newhouse, Naihua Duan, Emmett B. Keeler, and Arleen Leibowitz, "Health Insurance and the Demand for Medical Care: Evidence from a Randomized Experiment," American Economic Review 77 (1987), 251-277.
Miller, Edward, Jessica S. Banthin, and John F. Moeller, "Covering the Uninsured: Estimates of the Impact on Total Health Expenditures for 2002," Agency for Healthcare Research and Quality working paper 04007 (2004).
Mullahy, John, "Much Ado about Two: Reconsidering Retransformation and Two-Part Model in Health Econometrics," Journal of Health Economics 17 (1998), 247-281.
Newhouse, Joseph P., and Insurance Experiment Group, Free for All? Lessons from the RAND Health Insurance Experiment (Cambridge, MA: Harvard University Press, 1993).
Puhani, Patrick A., "The Heckman Correction for Sample Selection and Its Critique," Journal of Economic Surveys 14:1 (2000), 53-68.
RAND Health Insurance Experiment [in Metropolitan and Non-Metropolitan Areas of the United States] (1974-1982).
Robinson, Peter M., "Root-n-consistent Semiparametric Regression," Econometrica 56 (1988), 931-954.
Rothschild, Michael, and Joseph Stiglitz, "Equilibrium in Competitive Insurance Markets: An Essay on the Economics of Imperfect Information," Quarterly Journal of Economics 90:4 (1976), 630-649.
Vera-Hernandez, Angel M., "Duplicate Coverage and Demand for Healthcare. The Case of Catalonia," Health Economics 8 (1999), 579-598.
Vytlacil, Edward, and Nese Yildiz, "Dummy Endogenous Variables in Weakly Separable Models," Econometrica 75 (2007), 757-779.
Wooldridge, Jeffrey M., Econometric Analysis of Cross Section and Panel Data (Cambridge, MA: MIT Press, 2002).
